How to Count Chinese Characters and the Difference in Character Lengths in PHP

gitbox 2025-07-27

Counting Chinese Characters in PHP

In development, we often need to count the number of characters in a string that contains Chinese text. PHP's mbstring extension provides the mb_strlen function, which counts characters rather than bytes and therefore handles Chinese text correctly. Here is a simple example:


$string = "PHP实时统计中文字数";
$length = mb_strlen($string, 'utf-8');
echo $length; // Output 11 (3 Latin letters + 8 Chinese characters)

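In the code above, we define a string $string containing Chinese characters, and use the mb_strlen function to get its length in characters. Note that mb_strlen counts every character in the string, including the three Latin letters of "PHP", not only the Chinese ones. The encoding is set to 'utf-8' so that each multibyte Chinese character is counted as a single character.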
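If only the Chinese characters themselves should be counted, one possible approach is to match them with a Unicode script pattern. The following is a minimal sketch, assuming the input is valid UTF-8; the helper name countHanCharacters is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: counts only Han (Chinese) characters in a UTF-8 string.
function countHanCharacters(string $text): int
{
    // \p{Han} matches characters in the Han script; the u modifier makes
    // PCRE treat both the pattern and the subject as UTF-8.
    $matches = preg_match_all('/\p{Han}/u', $text);

    // preg_match_all returns false on failure (for example, invalid UTF-8).
    return $matches === false ? 0 : $matches;
}

$string = "PHP实时统计中文字数";
echo countHanCharacters($string), PHP_EOL;  // 8
echo mb_strlen($string, 'UTF-8'), PHP_EOL;  // 11
```

Note that Chinese punctuation marks are not part of the Han script, so this sketch does not count them.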

Differences in Counting Chinese and English Characters

When a string mixes Chinese and English characters, strlen and mb_strlen return different results. For example, the string “PHP实时统计中文字数” contains 11 characters (3 Latin letters and 8 Chinese characters), but strlen returns 27, because it counts bytes rather than characters.


$string = "PHP实时统计中文字数";
$length = strlen($string); // Byte length, not character count
echo $length; // Output 27

As shown in the code above, strlen counts each Latin letter as one byte and each Chinese character as three bytes in UTF-8, giving 3 × 1 + 8 × 3 = 27.

Counting Chinese and English Characters in PHP

When calculating the number of Chinese and English characters, there is a difference between the strlen and mb_strlen functions. Specifically, strlen counts byte length, while mb_strlen counts the number of characters.


$string = "PHP Real-time Statistics and Differences in Chinese and English Character Counts";
$length = strlen($string); // Count byte length
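echo $length; // Output 80

As shown above, strlen returns 80, which is the byte length of the string. Because this string contains only single-byte ASCII characters, the byte length happens to equal the number of characters, and mb_strlen would return 80 as well.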
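Since the two functions agree only when every character occupies a single byte, comparing their results is one simple way to tell whether a string contains multibyte characters. The following is a small sketch, assuming UTF-8 input; the helper name hasMultibyteCharacters is hypothetical.

```php
<?php
// Hypothetical helper: true when a UTF-8 string contains at least one
// character that occupies more than one byte.
function hasMultibyteCharacters(string $text): bool
{
    // strlen counts bytes, mb_strlen counts characters; they differ only
    // when some character needs more than one byte.
    return strlen($text) !== mb_strlen($text, 'UTF-8');
}

var_dump(hasMultibyteCharacters("Character Counts"));   // bool(false)
var_dump(hasMultibyteCharacters("PHP实时统计中文字数")); // bool(true)
```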

Practical Differences in Counting Chinese and English Characters

It’s important to note that strlen and mb_strlen measure different things. In UTF-8, an English letter occupies one byte, while a Chinese character typically occupies three bytes, so strlen reports a larger number than mb_strlen whenever the string contains Chinese text. You can verify this with the following code:


$string = "PHP实时统计中文字数 and this is an English sentence.";
$length_bytes = strlen($string); // Byte length
$length_chars = mb_strlen($string, 'utf-8'); // Character count
echo "Length in bytes: " . $length_bytes . PHP_EOL; // 60
echo "Length in characters: " . $length_chars . PHP_EOL; // 44

With this code, we can clearly see that the byte length and the character count differ for the same string, because each Chinese character takes up two more bytes in UTF-8 than it contributes to the character count.
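To see where the extra bytes come from, the string can be split into individual characters and each one measured with strlen. This is an illustrative sketch, assuming PHP 7.4 or later (for mb_str_split) and UTF-8 input; the helper name byteWidths is hypothetical.

```php
<?php
// Hypothetical helper: pairs each character with the number of bytes
// it occupies in UTF-8.
function byteWidths(string $text): array
{
    $widths = [];
    // mb_str_split breaks the string into characters, not bytes.
    foreach (mb_str_split($text, 1, 'UTF-8') as $char) {
        $widths[] = [$char, strlen($char)];
    }
    return $widths;
}

foreach (byteWidths("PHP中文") as [$char, $bytes]) {
    echo $char, ": ", $bytes, PHP_EOL;
}
```

Each Latin letter is reported as 1 byte and each Chinese character as 3 bytes.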

In summary, strlen returns the byte length of a string, which matches the character count only for single-byte text such as plain English. mb_strlen returns the number of characters and gives an accurate count for Chinese and other multibyte text, provided the correct encoding is specified. Understanding the difference between these two functions is crucial when working with strings containing mixed languages.