How to Use the Levenshtein Function for Fuzzy Search? Tips for Approximate Matching in PHP

gitbox 2025-06-09

Text search often needs more than exact matches: we also want to find strings that are “close” to the target text. PHP's built-in levenshtein function makes this kind of fuzzy search straightforward. This article explains how to use levenshtein for fuzzy search and approximate matching in PHP.

What is Levenshtein Distance?

Levenshtein distance (also called edit distance) is a metric that measures the difference between two strings. It represents the minimum number of edit operations (insertions, deletions, substitutions) required to transform one string into another. The smaller the Levenshtein distance, the more similar the two strings are.
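
As a quick illustration of the definition, the classic pair kitten and sitting is three edits apart: substitute k with s, substitute e with i, and insert g at the end. A minimal check using PHP's built-in levenshtein (the variable names are illustrative only):

```php
<?php
// Two substitutions (k -> s, e -> i) plus one insertion (g) turn "kitten" into "sitting".
$distance = levenshtein('kitten', 'sitting');
echo $distance, PHP_EOL;  // 3

// Identical strings need no edits at all.
$same = levenshtein('apple', 'apple');
echo $same, PHP_EOL;  // 0
```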

The Levenshtein Function in PHP

In PHP, we can use the levenshtein function to calculate the Levenshtein distance between two strings. Its basic syntax is as follows:

levenshtein(string $str1, string $str2, int $cost_ins = 1, int $cost_rep = 1, int $cost_del = 1): int

  • $str1 and $str2 are the two strings to compare.

  • $cost_ins is the cost of inserting a character, default is 1.

  • $cost_rep is the cost of replacing a character, default is 1.

  • $cost_del is the cost of deleting a character, default is 1.

  • The return value is the Levenshtein distance between the two strings.

  • PHP 8 names these parameters $string1, $string2, $insertion_cost, $replacement_cost and $deletion_cost; this matters only when using named arguments.

By calculating the Levenshtein distance, we can determine how similar two strings are. The smaller the distance, the more similar they are.
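
The cost parameters make the distance asymmetric, which can be useful when, say, missing characters should be penalised more heavily than extra ones. The sketch below assumes the costs apply to the edits needed to turn the first string into the second; the variable names are illustrative.

```php
<?php
// With default costs, one inserted character ("r") costs 1.
$default = levenshtein('cat', 'cart');

// Make insertions twice as expensive: cat -> cart now costs 2.
$insertHeavy = levenshtein('cat', 'cart', 2, 1, 1);

// Going the other way needs a deletion, which still costs 1.
$deleteSide = levenshtein('cart', 'cat', 2, 1, 1);

printf("%d %d %d\n", $default, $insertHeavy, $deleteSide);  // 1 2 1
```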

Performing Fuzzy Search Using the Levenshtein Function

In practical applications, we often want to offer a “fuzzy search” feature when searching for keywords. That means we want to find content that is similar to the user’s input, not just exact matches.

1. Implementing Basic Fuzzy Search

Suppose we have an array containing multiple strings, and we want to find those that are similar to a user’s input keyword. We can loop through the array, calculate the Levenshtein distance between each string and the search term, and select those with smaller distances.

<?php
$searchTerm = 'apple';  // User input search term
$items = ['apple pie', 'apple', 'banana', 'grape', 'apricot'];

$threshold = 3;  // Maximum allowed distance; smaller means stricter
$results = [];

foreach ($items as $item) {
    $distance = levenshtein($searchTerm, $item);
    if ($distance <= $threshold) {
        $results[] = $item;  // Treat it as a fuzzy match if the distance is within the threshold
    }
}

print_r($results);
?>

In this example, we calculate the Levenshtein distance between the search term apple and each element in the array. If the distance is less than or equal to the threshold (e.g., 3), the item is considered similar and added to the results array.

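With this data, only the exact match falls within the threshold, because apple pie is 4 edits away from apple (the four characters of “ pie” must be inserted) and every other item is at least as far. The output is:

Array
(
    [0] => apple
)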
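
Because levenshtein is case-sensitive and compares whole strings, a multi-word item such as apple pie sits 4 edits away from apple and misses a threshold of 3. One possible refinement, sketched below under the assumption that items are space-separated phrases, lowercases both sides and also compares the term against each individual word. The helper name fuzzyFilter is hypothetical.

```php
<?php
/**
 * Hypothetical helper: keep items where the whole string, or any single
 * word in it, is within $threshold edits of the search term.
 */
function fuzzyFilter(string $term, array $items, int $threshold): array
{
    $term = strtolower($term);
    $matches = [];

    foreach ($items as $item) {
        $normalized = strtolower($item);
        // Compare against the full string and each space-separated word.
        $candidates = array_merge([$normalized], explode(' ', $normalized));

        foreach ($candidates as $candidate) {
            if (levenshtein($term, $candidate) <= $threshold) {
                $matches[] = $item;
                break;  // One close candidate is enough for this item.
            }
        }
    }

    return $matches;
}

$items = ['Apple pie', 'apple', 'banana', 'grape', 'apricot'];
print_r(fuzzyFilter('aple', $items, 1));  // Apple pie, apple
```

Matching per word trades some precision for recall, so the threshold usually needs to be tighter than with whole-string comparison.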

2. Implementing Fuzzy Search with Sorting

Sometimes, we don’t just want to find all similar items, but also sort them by similarity to show the closest matches first. We can achieve this by recording the Levenshtein distance of each item and sorting the items by that distance.

<?php
$searchTerm = 'apple';  // User input search term
$items = ['apple pie', 'apple', 'banana', 'grape', 'apricot'];

$results = [];

foreach ($items as $item) {
    $distance = levenshtein($searchTerm, $item);
    $results[] = ['item' => $item, 'distance' => $distance];
}

// Sort results in ascending order by distance
usort($results, function ($a, $b) {
    return $a['distance'] <=> $b['distance'];
});

print_r($results);
?>

In this example, we first calculate the Levenshtein distance for each string against the search term, then store each item together with its distance as an associative array. Using usort, we sort the results by distance, with the smallest distances appearing first.

Output (on PHP 8.0 and later, where sorting is stable, items with equal distances keep their original order):

Array
(
    [0] => Array
        (
            [item] => apple
            [distance] => 0
        )

    [1] => Array
        (
            [item] => apple pie
            [distance] => 4
        )

    [2] => Array
        (
            [item] => grape
            [distance] => 4
        )

    [3] => Array
        (
            [item] => banana
            [distance] => 5
        )

    [4] => Array
        (
            [item] => apricot
            [distance] => 5
        )

)

As you can see, apple with distance 0 is listed first, followed by the items closest to the search term.
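
When only the single best candidate is needed, for example for a “did you mean” hint, a full sort is unnecessary. The following is a minimal sketch that tracks the smallest distance seen so far; closestMatch and its $maxDistance cutoff are hypothetical names introduced for illustration.

```php
<?php
/**
 * Hypothetical helper: return the item closest to $term, or null when
 * nothing is within $maxDistance edits.
 */
function closestMatch(string $term, array $items, int $maxDistance): ?string
{
    $best = null;
    $bestDistance = $maxDistance + 1;

    foreach ($items as $item) {
        $distance = levenshtein($term, $item);
        if ($distance < $bestDistance) {  // Strict "<" keeps the first item on ties.
            $best = $item;
            $bestDistance = $distance;
        }
    }

    return $best;
}

$items = ['apple pie', 'apple', 'banana', 'grape', 'apricot'];
$suggestion = closestMatch('appel', $items, 2);
echo $suggestion ?? 'no suggestion', PHP_EOL;  // apple
```

Here the transposed letters in appel count as two substitutions, so a cutoff of 2 is the smallest that still suggests apple.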

3. Using URLs in Queries

Suppose we want to associate search results with URLs. We can build a URL for each matching item by appending the URL-encoded item to a base URL. Here is an example:

<?php
$searchTerm = 'apple';  // User input search term
$items = ['apple pie', 'apple', 'banana', 'grape', 'apricot'];
$baseUrl = 'http://gitbox.net/search?query=';

$results = [];

foreach ($items as $item) {
    $distance = levenshtein($searchTerm, $item);
    if ($distance <= 3) {
        $results[] = [
            'item' => $item,
            'url'  => $baseUrl . urlencode($item)  // Append the URL-encoded item to the base URL
        ];
    }
}

print_r($results);
?>

In this case, the search results not only include the matching items but also generate a URL for each, pointing to a potential search page.

Example output (with a threshold of 3, only apple matches, since apple pie is 4 edits away):

Array
(
    [0] => Array
        (
            [item] => apple
            [url] => http://gitbox.net/search?query=apple
        )

)
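
If the search page takes more than one parameter, concatenating encoded strings by hand becomes error-prone. A possible alternative, sketched here, is PHP's http_build_query, which encodes each value and joins the pairs. The limit parameter is an assumption added purely to show a second parameter.

```php
<?php
// Same base address as above, without the hard-coded query string.
$baseUrl = 'http://gitbox.net/search';
$item = 'apple pie';

// http_build_query encodes keys and values and joins multiple parameters with "&".
// The "limit" parameter is hypothetical and only shows a second parameter.
$url = $baseUrl . '?' . http_build_query(['query' => $item, 'limit' => 10]);

echo $url, PHP_EOL;  // http://gitbox.net/search?query=apple+pie&limit=10
```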

Summary

Using the levenshtein function, we can easily implement fuzzy search functionality. Whether it’s simple matching or sorted matching, Levenshtein distance helps us determine string similarity. Depending on the actual needs, we can also associate fuzzy matches with URLs to further enhance the search experience. Hopefully, this article helps you better understand and use PHP’s levenshtein function, improving the flexibility and accuracy of your search features.