In the development of many software systems, we often need to handle text matching, string search, and word edit distance issues. The edit distance between two texts refers to the minimum number of operations (such as insertion, deletion, or replacement) required to transform one text into another. The smaller the edit distance, the higher the similarity between the texts. Levenshtein's algorithm is one of the classic algorithms used to compute the edit distance between strings.
In PHP, the levenshtein() function calculates the edit distance between two strings, meaning it returns the minimum number of operations required to transform one string into another.
int levenshtein(string $str1, string $str2, int $cost_ins, int $cost_rep, int $cost_del)
Parameter Name | Parameter Type | Description |
---|---|---|
str1 | string | The first string |
str2 | string | The second string |
cost_ins | int | The cost of insertion (default is 1) |
cost_rep | int | The cost of replacement (default is 1) |
cost_del | int | The cost of deletion (default is 1) |
The function returns the minimum number of operations required to convert one string into another, which represents the edit distance between the two strings.
$str1 = "kitten"; $str2 = "sitting"; $distance = levenshtein($str1, $str2); echo "The distance between $str1 and $str2 is $distance";
Output:
<span class="fun">The distance between kitten and sitting is 3</span>
Explanation: It takes three operations to transform "kitten" into "sitting".
$str1 = "kitten"; $str2 = "sitting"; $distance = levenshtein($str1, $str2, 2, 3, 4); echo "The distance between $str1 and $str2 is $distance";
Output:
<span class="fun">The distance between kitten and sitting is 15</span>
Explanation: Transforming "kitten" into "sitting" requires a combination of insertion, replacement, and deletion, totaling 15 operations.
$str1 = "你好"; $str2 = "再见"; $distance = levenshtein($str1, $str2); echo "The distance between $str1 and $str2 is $distance";
Output:
<span class="fun">The distance between 你好 and 再见 is 4</span>
Explanation: Transforming "你好" into "再见" requires one insertion, two replacements, and one deletion, totaling 4 operations.
The Levenshtein algorithm is a classic algorithm for computing edit distances and is widely used in string processing tasks. PHP's levenshtein() function provides a convenient implementation of this algorithm, allowing us to easily calculate the similarity between strings. Whether in text processing, spell checking, or natural language processing, the levenshtein() function is a powerful tool for developers.