Levenshtein Distance

Compares each string in an array of values against a base string, using the Levenshtein Distance algorithm to determine how close each one is to the base value.

More information on the Levenshtein Distance algorithm is widely available online, for example on this Wikipedia page:

https://en.wikipedia.org/wiki/Levenshtein_distance
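For readers who want to see the underlying measure, the following is a minimal sketch of the classic dynamic-programming form of Levenshtein Distance. It is an illustration only: the function name is hypothetical, and nothing here is taken from the service's own implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

A distance of 0 means the strings are identical. The ratios returned by this API are percentages rather than raw distances, and the exact conversion used is not documented on this page.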

Version 1

HTTP Request
POST /ado/v1/LevenshteinDistance

Header

Parameter | Description
Ocp-Apim-Subscription-Key | The subscription key you received when you purchased a plan.

Request Body

Mandatory

Parameter | Type | Description
baseValue | String | The string value to compare all other strings against.
comparisonValues | String[] | The list of values to compare against the base value.

Optional

Settings

Parameter | Type | Description
ratioThreshold | double | Restricts the array of values returned by filtering out those below a certain threshold. A number between 0 and 100. If left blank, the default value is 0, i.e. all results are returned.
applyRatioThresholdTo | string | Applies the threshold filter to either the maximum ratio returned in the data set or the average of all ratios included. MAX or AVG. Default value is MAX.
ratioSelectionType | string | STANDARD includes the ratios that match all values within a string. PARTIAL includes the ratios that look for partial matches in both comparison strings. ALL executes all (in this case, both) ratio types and returns all results. ALL, STANDARD or PARTIAL. Default value is ALL.
tokenSortType | string | SORTED ensures all “words” in each string are sorted prior to performing the comparison. NOT SORTED compares the values in the order provided. ALL executes all (in this case, both) sort types and returns all results. ALL, SORTED or NOT SORTED. Default value is ALL.
caseSensitive | Boolean | false will set all values to lower case prior to performing the comparison. true will compare the value as provided.
removeWhitespace | Boolean | true will remove all whitespace (i.e. spaces) prior to performing the comparison. false will compare the value as provided.
removeSpecialCharacters | Boolean | true will remove all special characters prior to performing the comparison. false will compare the value as provided.
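The three Boolean settings describe clean-up steps applied before comparison. The sketch below shows one way such steps could be combined. The function name, the order of the steps, and the definition of a "special character" (anything that is not a letter, digit, or whitespace) are all assumptions, not documented behaviour of the service.

```python
import re


def normalize(value: str, *, case_sensitive: bool,
              remove_whitespace: bool,
              remove_special_characters: bool) -> str:
    """Apply the pre-comparison clean-up described by the settings (hypothetical helper)."""
    if not case_sensitive:
        value = value.lower()  # caseSensitive = false lower-cases the value
    if remove_special_characters:
        # Assumption: "special" means not a letter, digit, or whitespace.
        value = re.sub(r"[^\w\s]|_", "", value)
    if remove_whitespace:
        value = re.sub(r"\s+", "", value)  # drop spaces, tabs, newlines
    return value


print(normalize("Tango-Zebra!", case_sensitive=False,
                remove_whitespace=False, remove_special_characters=True))
# tangozebra
```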

Ratio Selection Type

There are two types of ratio to select from: standard and partial.

A standard ratio compares the two strings in full.

A partial ratio does what its name suggests: it tries to match partial components of the longer string value with the shorter string value.

The logic breaks up the longer string and then runs the Levenshtein Distance algorithm over each partial component, comparing it against the shorter string.
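The partial matching idea can be sketched as a sliding window: take every slice of the longer string that has the same length as the shorter string, score each slice, and keep the best score. The scoring formula used here (100 × (1 − distance / longest length)) and all function names are assumptions made for illustration, so the numbers will not necessarily match those returned by the service.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic row-by-row edit distance."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def ratio(a: str, b: str) -> float:
    """Assumed similarity score: 100 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)


def partial_ratio(a: str, b: str) -> float:
    """Best score of the shorter string against any same-length slice of the longer one."""
    shorter, longer = sorted((a, b), key=len)
    width = len(shorter)
    return max(
        ratio(shorter, longer[start:start + width])
        for start in range(len(longer) - width + 1)
    )


print(partial_ratio("zebra", "Tango zebra delta"))  # 100.0
```

In this sketch the full-string ratio for the same pair is far lower, because the twelve extra characters in the longer string all count as edits.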

Sort Type

When the calling application chooses to have all tokens sorted, each string being processed is split on spaces and the resulting values are sorted in ascending order prior to being compared.

For example, the words Tango zebra delta uniform are sorted into delta Tango uniform zebra before the comparison takes place.

This occurs for both the base value and the comparison value.

As a result, the “words” in a sentence can be scrambled and the comparison will still return an exact match, because sorting was applied to both sides.
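A minimal sketch of the token sort step follows. The helper name is hypothetical, and the case-insensitive ordering is an assumption inferred from the example above, where Tango sorts between delta and uniform.

```python
def sort_tokens(value: str) -> str:
    """Split into words, sort ascending, and rejoin with single spaces."""
    tokens = value.split()  # splits on runs of whitespace
    # Assumption: ordering ignores case, as the documented example suggests.
    return " ".join(sorted(tokens, key=str.lower))


base = "Tango zebra delta uniform"
scrambled = "uniform delta zebra Tango"

print(sort_tokens(base))                            # delta Tango uniform zebra
print(sort_tokens(base) == sort_tokens(scrambled))  # True
```

Once both values have been sorted this way, scrambled word order no longer affects the comparison.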

Examples

Basic

This example shows a basic test across three comparison values with the applied settings.

Request

{
    "baseValue": "Tango zebra delta uniform",
    "comparisonValues": [
        "delta Tango uniform zebra",
        "Uniform Zebra Foxtrot Tango Delta",
        "uniform zebra foxtrot tango delta"
    ],
    "settings": {
        "ratioThreshold": 0,
        "applyRatioThresholdTo": "MAX",
        "ratioSelectionType": "All",
        "tokenSortType": "All",
        "caseSensitive": false,
        "removeWhitespace": false,
        "removeSpecialCharacters": true
    }
}
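A request like the one above could be assembled in Python as sketched below, using only the standard library. The host name and subscription key are placeholders, and the Content-Type header is an assumption, since this page documents only the path, the method, and the Ocp-Apim-Subscription-Key header.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"        # hypothetical host, replace with the real one
SUBSCRIPTION_KEY = "your-subscription-key"  # placeholder


def build_request(base_value, comparison_values, settings=None):
    """Build (but do not send) a POST request for the LevenshteinDistance endpoint."""
    body = {"baseValue": base_value, "comparisonValues": comparison_values}
    if settings:
        body["settings"] = settings  # optional block, as in the example request
    return urllib.request.Request(
        url=f"{BASE_URL}/ado/v1/LevenshteinDistance",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
            "Content-Type": "application/json",  # assumption: JSON body
        },
        method="POST",
    )


request = build_request(
    "Tango zebra delta uniform",
    ["delta Tango uniform zebra"],
    {"ratioThreshold": 0, "applyRatioThresholdTo": "MAX"},
)

# To send it and parse the JSON response:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```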

Response

{
    "BaseValue": {
        "Supplied": "Tango zebra delta uniform",
        "Actual": "Tango zebra delta uniform"
    },
    "ComparisonSettings": {
        "RatioThreshold": 0.0,
        "ApplyRatioThresholdTo": "MAX",
        "CaseSensitive": false,
        "RemoveWhitespace": false,
        "RemoveSpecialCharacters": true,
        "RatioSelectionType": "All",
        "TokenSortType": "All"
    },
    "Comparisons": [
        {
            "Comparison": {
                "Supplied": "delta Tango uniform zebra",
                "Actual": "delta Tango uniform zebra"
            },
            "Results": {
                "Ratio": 52,
                "PartialRatio": 52,
                "SortedRatio": 100,
                "SortedPartialRatio": 100,
                "MaxRatio": 100,
                "AvgRatio": 76.0
            }
        },
        {
            "Comparison": {
                "Supplied": "Uniform Zebra Foxtrot Tango Delta",
                "Actual": "Uniform Zebra Foxtrot Tango Delta"
            },
            "Results": {
                "Ratio": 41,
                "PartialRatio": 44,
                "SortedRatio": 76,
                "SortedPartialRatio": 76,
                "MaxRatio": 76,
                "AvgRatio": 59.25
            }
        },
        {
            "Comparison": {
                "Supplied": "uniform zebra foxtrot tango delta",
                "Actual": "uniform zebra foxtrot tango delta"
            },
            "Results": {
                "Ratio": 48,
                "PartialRatio": 48,
                "SortedRatio": 83,
                "SortedPartialRatio": 80,
                "MaxRatio": 83,
                "AvgRatio": 64.75
            }
        }
    ]
}
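In the response above, MaxRatio and AvgRatio are consistent with the maximum and the mean of the four individual ratios. The sketch below reproduces those figures and shows how a ratioThreshold might then be applied to either one. The helper names are hypothetical, and keeping results that exactly equal the threshold is an assumption.

```python
# The four ratios returned for each comparison in the example response.
results = [
    {"Ratio": 52, "PartialRatio": 52, "SortedRatio": 100, "SortedPartialRatio": 100},
    {"Ratio": 41, "PartialRatio": 44, "SortedRatio": 76, "SortedPartialRatio": 76},
    {"Ratio": 48, "PartialRatio": 48, "SortedRatio": 83, "SortedPartialRatio": 80},
]


def summarize(ratios):
    """Return (max, average) of the individual ratios."""
    values = list(ratios.values())
    return max(values), sum(values) / len(values)


def passes_threshold(ratios, threshold, apply_to="MAX"):
    """Keep a result when the chosen summary ratio is not below the threshold."""
    max_ratio, avg_ratio = summarize(ratios)
    score = max_ratio if apply_to == "MAX" else avg_ratio
    return score >= threshold  # assumption: equal to the threshold is kept


for ratios in results:
    print(summarize(ratios))
# (100, 76.0)
# (76, 59.25)
# (83, 64.75)

print([passes_threshold(r, 80, "MAX") for r in results])  # [True, False, True]
print([passes_threshold(r, 80, "AVG") for r in results])  # [False, False, False]
```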