Detailed Methods to Remove Chinese Characters from Strings in PHP

gitbox 2025-08-07

What is a Chinese String?

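Before explaining how to remove Chinese characters, it helps to define what a Chinese string is: a string that contains Chinese characters. How many bytes each character occupies depends on the encoding. In UTF-8, the encoding assumed throughout this article, most common Chinese characters occupy three bytes (in GBK they occupy two). This matters when processing strings, because byte-oriented functions can split a character in the middle.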
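The difference between bytes and characters can be checked directly. The snippet below is a minimal sketch that assumes the source file is saved as UTF-8 and that the mbstring extension is loaded:

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is available.
$str = "你好";

echo strlen($str), "\n";              // byte count: 6 (three bytes per character in UTF-8)
echo mb_strlen($str, 'UTF-8'), "\n";  // character count: 2
echo bin2hex($str), "\n";             // raw bytes: e4bda0e5a5bd
```

strlen counts bytes while mb_strlen counts characters, which is why the multibyte functions and the u regex modifier are used in the rest of this article.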

Methods to Remove Chinese Characters in PHP

Removing Chinese Characters Using Regular Expressions

Regular expressions can match and delete Chinese characters in a string. Below is an example demonstrating how to remove Chinese characters using regex:

$str = "Hello, 你好!";
$str = preg_replace('/[\x{4e00}-\x{9fa5}]/u', '', $str);
echo $str; // Output: Hello, !

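Explanation: The character class [\x{4e00}-\x{9fa5}] matches the commonly used Chinese characters in the Unicode range U+4E00 to U+9FA5. It does not cover Chinese punctuation or rarer characters outside that range. The u modifier makes the regex engine treat both the pattern and the subject string as UTF-8. The preg_replace function then replaces every match with an empty string, effectively removing it.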
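If a fixed code point range is too narrow, one alternative (a swapped-in technique, not the pattern used above) is the Unicode script property \p{Han}, which PCRE supports together with the u modifier. The helper name removeHan below is hypothetical, and the sketch also checks for the null that preg_replace returns on failure:

```php
<?php
// Hypothetical helper, not a PHP built-in: removes every character in the Han script.
function removeHan(string $text): string
{
    // \p{Han} is a Unicode script property; it requires the u modifier.
    $result = preg_replace('/\p{Han}+/u', '', $text);

    // preg_replace returns null on failure, for example when $text is not valid UTF-8.
    if ($result === null) {
        throw new InvalidArgumentException('PCRE error code: ' . preg_last_error());
    }
    return $result;
}

echo removeHan("Hello, 你好!"), "\n"; // Hello, !
```

Like the range-based pattern, this leaves punctuation untouched, so Chinese punctuation marks would need their own pattern if they should be removed as well.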

Removing Chinese Characters Using mb_substr Function

The mb_substr function is used for multibyte string substring operations. While it cannot directly remove Chinese characters, certain substring techniques can simulate removal. Here's an example:

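$str = "Hello, 你好!";
// The two Chinese characters sit at character offsets 7 and 8,
// so keep the text before them and the text after them.
$str = mb_substr($str, 0, 7, 'UTF-8') . mb_substr($str, 9, null, 'UTF-8');
echo $str; // Output: Hello, !

Explanation: The first mb_substr call keeps the seven characters before the Chinese text, and the second keeps everything from character offset 9 onward. Concatenating the two pieces drops the Chinese characters in between. This only works when the positions of the Chinese characters are known in advance, so it has limited practical use, and the regex method is generally preferred in real development.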
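A substring-based approach that does not rely on known positions can be sketched by scanning the string one character at a time. The helper name removeChineseByScan is hypothetical, and the sketch assumes PHP 7.2 or later (for mb_ord) with the mbstring extension:

```php
<?php
// Hypothetical helper: walks the string character by character with mb_substr
// and keeps only the characters outside U+4E00..U+9FA5.
function removeChineseByScan(string $text): string
{
    $kept = '';
    $length = mb_strlen($text, 'UTF-8');

    for ($i = 0; $i < $length; $i++) {
        $char = mb_substr($text, $i, 1, 'UTF-8');
        $codePoint = mb_ord($char, 'UTF-8');

        // Skip code points inside the same range the regex examples use.
        if ($codePoint >= 0x4E00 && $codePoint <= 0x9FA5) {
            continue;
        }
        $kept .= $char;
    }
    return $kept;
}

echo removeChineseByScan("Hello, 你好!"), "\n"; // Hello, !
```

Because every mb_substr call has to find its offset in a variable-width string, this loop may be noticeably slower than a single preg_replace call on long input.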

Removing Chinese Characters Using preg_replace_callback Function

The preg_replace_callback function allows custom processing of matched strings, which can be used to remove Chinese characters. Example below:

$str = "Hello, 你好!";
$str = preg_replace_callback('/[\x{4e00}-\x{9fa5}]/u', function($matches) {
    return "";
}, $str);
echo $str; // Output: Hello, !

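Explanation: The callback is invoked once for each match, and its return value replaces that match. Returning an empty string removes the matched Chinese character. For plain removal the result is the same as with preg_replace; the callback form is useful when the replacement depends on what was matched.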
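One case where the callback earns its keep is recording what was removed. The following is a minimal sketch; the variable names $removed and $clean are illustrative, and the pattern adds a + quantifier so that each run of Chinese characters is handled in one call:

```php
<?php
$str = "Hello, 你好! PHP 很棒!";
$removed = []; // illustrative: collects each run of removed characters

$clean = preg_replace_callback('/[\x{4e00}-\x{9fa5}]+/u', function (array $matches) use (&$removed) {
    $removed[] = $matches[0]; // remember the matched run
    return '';                // delete it from the result
}, $str);

echo $clean, "\n";                 // Hello, ! PHP !
echo implode(',', $removed), "\n"; // 你好,很棒
```

The closure captures $removed by reference, so the array is filled as a side effect while the string is being cleaned.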

Conclusion

This article introduced three PHP methods for removing Chinese characters from strings: using regular expressions with preg_replace, substring operations with mb_substr, and callback-based replacement with preg_replace_callback. Among them, the regular expression method is simpler and more efficient, and is recommended as the first choice. Always ensure your strings are encoded in UTF-8 to avoid encoding issues.