
How to use mb_strcut to truncate multibyte strings

gitbox 2025-05-31

mb_strcut is a practical function for handling multibyte text (Chinese, Japanese, Korean, and so on) in PHP. Unlike substr, it never splits a multibyte character, which avoids garbled output. This article explains how to use mb_strcut correctly to cut multibyte strings and demonstrates it with practical examples.

1. The difference between mb_strcut and mb_substr

Before you start, let’s briefly understand the difference between mb_strcut and mb_substr :

  • mb_substr cuts by number of characters;

  • mb_strcut cuts by number of bytes, but never splits a character (a character that does not fit completely is left out rather than returned in part);

  • Both accept an optional character encoding.

This makes mb_strcut the better fit when the limit is expressed in bytes, for example when respecting a database field length or generating a summary of bounded size.
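
The difference is easiest to see when both functions receive the same arguments. The following is a minimal sketch, assuming the source file is saved as UTF-8; the sample string is chosen for illustration only.

```php
<?php
// Same string, same numeric arguments, different units.
$text = "你好，世界！"; // 6 characters, 18 bytes in UTF-8

$byChars = mb_substr($text, 0, 4, "UTF-8"); // 4 characters
$byBytes = mb_strcut($text, 0, 4, "UTF-8"); // at most 4 bytes

echo $byChars, PHP_EOL; // 你好，世
echo $byBytes, PHP_EOL; // 你 (the 4th byte would split 好, so that character is dropped)
```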

2. Syntax of mb_strcut function

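 string mb_strcut(string $string, int $start, ?int $length = null, ?string $encoding = null)

  • $string : the string to be cut;

  • $start : the start position, in bytes. If it falls inside a multibyte character, the cut starts from the first byte of that character;

  • $length : the maximum number of bytes to return. If omitted or null, the cut runs to the end of the string;

  • $encoding : the character encoding (such as UTF-8, GBK, etc.), optional. If omitted or null, the internal encoding (see mb_internal_encoding()) is used.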
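
As a rough illustration of how the parameters interact, the sketch below uses byte offsets that fall on character boundaries. It assumes PHP 8 (where null is accepted for $length) and a UTF-8 source file; the sample string is illustrative.

```php
<?php
$text = "你好，世界！"; // each character is 3 bytes in UTF-8

// Passing null for $length cuts to the end of the string.
$tail = mb_strcut($text, 9, null, "UTF-8");

// $start and $length are both byte counts: skip 6 bytes, take up to 3.
$middle = mb_strcut($text, 6, 3, "UTF-8");

echo $tail, PHP_EOL;   // 世界！
echo $middle, PHP_EOL; // ，
```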

3. Use examples

Example 1: Basic usage

 <?php
$str = "你好，世界！";
$result = mb_strcut($str, 0, 6, "UTF-8");
echo $result; // Output: 你好
?>

Explanation: Each Chinese character occupies 3 bytes under UTF-8 encoding, so 6 bytes are exactly two Chinese characters.
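
The byte arithmetic can be checked by comparing strlen (bytes) with mb_strlen (characters). This is a small sketch with an illustrative sample string, assuming a UTF-8 source file.

```php
<?php
$text = "你好，世界！";

echo strlen($text), PHP_EOL;             // 18 (bytes)
echo mb_strlen($text, "UTF-8"), PHP_EOL; // 6 (characters)
echo strlen("你"), PHP_EOL;              // 3 (one Chinese character in UTF-8)
echo strlen("A"), PHP_EOL;               // 1 (one ASCII character)
```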

Example 2: Preventing garbled output

If you cut Chinese text with substr, the result is easily garbled, because the cut can land in the middle of a character:

 <?php
$str = "你好，世界！";
echo substr($str, 0, 5); // May output garbled text: the 5th byte falls inside the second character
?>

Change to mb_strcut to avoid this problem:

 <?php
$str = "你好，世界！";
echo mb_strcut($str, 0, 5, "UTF-8"); // Output: 你
?>
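
One way to confirm the difference without relying on how a terminal renders broken bytes is mb_check_encoding. The sketch below assumes a UTF-8 source file and uses an illustrative sample string.

```php
<?php
$text = "你好，世界！";

$raw  = substr($text, 0, 5);             // 你 plus 2 of the 3 bytes of 好
$safe = mb_strcut($text, 0, 5, "UTF-8"); // 你 only

var_dump(mb_check_encoding($raw, "UTF-8"));  // bool(false): malformed UTF-8
var_dump(mb_check_encoding($safe, "UTF-8")); // bool(true)
var_dump(strlen($safe));                     // int(3)
```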

Example 3: Summaries for a database field or web page

When you need to cut an article down to a summary, you can use the following approach:

 <?php
$content = "Welcome to our official website:https://gitbox.net/blog/php-mb_strcut-use";
$summary = mb_strcut($content, 0, 60, "UTF-8");
echo $summary . "...";
?>

This displays a summary of at most 60 bytes (plus the appended ellipsis) on the page without garbled characters.
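
If the ellipsis should only appear when text was actually removed, and should count toward the byte limit, a helper along these lines is one option. The function name summarize and its parameters are hypothetical, not part of mbstring, and the sketch assumes UTF-8 input.

```php
<?php
// Hypothetical helper: cut $text to at most $maxBytes bytes, suffix included.
// Assumes $maxBytes is at least strlen($suffix).
function summarize(string $text, int $maxBytes, string $suffix = "..."): string
{
    if (strlen($text) <= $maxBytes) {
        return $text; // already fits, no suffix needed
    }
    // Reserve room for the suffix so the result never exceeds $maxBytes.
    $budget = max(0, $maxBytes - strlen($suffix));
    return mb_strcut($text, 0, $budget, "UTF-8") . $suffix;
}

echo summarize("欢迎访问我们的官方网站", 20), PHP_EOL; // 欢迎访问我...
echo summarize("short", 20), PHP_EOL;                  // short
```

Here the 17-byte budget holds five 3-byte characters (15 bytes), so the result is 18 bytes long including the suffix.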

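4. How to determine the appropriate cut length?

Because the same character occupies a different number of bytes under different encodings, choose the limit in bytes for the target encoding and let mb_strcut enforce it. If you need to inspect a string first, strlen reports its byte length and mb_strlen its character count. A small wrapper keeps the call sites consistent: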

 <?php
function safe_cut($str, $maxBytes, $encoding = "UTF-8") {
    return mb_strcut($str, 0, $maxBytes, $encoding);
}
?>

This allows you to set byte limits flexibly, such as:

 echo safe_cut("This is a PHP example of string processing", 9); // Output: This is a
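
The byte cost per character depends on the encoding, which is why the limit has to be chosen per encoding. The sketch below compares the same two characters in UTF-8 and UTF-16LE; UTF-16LE is used purely as an illustrative second encoding, and the source file is assumed to be UTF-8.

```php
<?php
$utf8  = "你好";                                           // UTF-8 source text
$utf16 = mb_convert_encoding($utf8, "UTF-16LE", "UTF-8"); // same characters, different encoding

echo strlen($utf8), PHP_EOL;                 // 6 (3 bytes per character)
echo strlen($utf16), PHP_EOL;                // 4 (2 bytes per character)
echo mb_strlen($utf8, "UTF-8"), PHP_EOL;     // 2
echo mb_strlen($utf16, "UTF-16LE"), PHP_EOL; // 2
```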

5. Things to note

  • mb_strcut operates at the byte level, so it is particularly suitable for precise length control when storing or transmitting data;

  • It does not HTML-encode or filter strings; combine it with functions such as htmlspecialchars before outputting the result to a page;

  • Make sure the server has the mbstring extension enabled; you can check this via phpinfo() .
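
The last two points can be combined in a short sketch: check for the extension at runtime with extension_loaded, cut first, then escape. The sample input is illustrative and the source file is assumed to be UTF-8.

```php
<?php
if (!extension_loaded("mbstring")) {
    exit("The mbstring extension is not enabled." . PHP_EOL);
}

$input = "<b>你好</b>，世界！";            // sample untrusted input
$cut   = mb_strcut($input, 0, 9, "UTF-8"); // "<b>你好": 3 ASCII bytes + 6 bytes

// Escape after cutting, so an entity such as &lt; is never split in half.
$escaped = htmlspecialchars($cut, ENT_QUOTES, "UTF-8");
echo $escaped, PHP_EOL; // &lt;b&gt;你好
```

Note that escaping can make the string longer than the byte limit passed to mb_strcut, so the limit here applies to the raw text, not to the escaped output.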

6. Conclusion

mb_strcut is an important tool in PHP for handling multibyte strings, especially where byte length must be controlled precisely. With a sensible start position and length and an explicit encoding, it avoids garbled output and makes the program more robust when dealing with multilingual text.