
Analysis of basic usage of mb_strcut function

gitbox 2025-05-26

In PHP, ordinary string functions such as substr operate on raw bytes, so when they are applied to multibyte text (Chinese, Japanese, Korean, and so on) they can cut a character in half and produce garbled output. PHP's mbstring extension provides mb_strcut for this case: it extracts a substring by byte position while keeping every multibyte character intact. This article walks through the basic usage of mb_strcut, with examples.


What is the mb_strcut function?

mb_strcut is part of PHP's multibyte string (mbstring) function library. It extracts a substring that starts at a given byte offset and is at most a given number of bytes long. Although offsets and lengths are counted in bytes, the function never splits a multibyte character: if the start offset falls inside a character, the cut begins at the first byte of that character, and a character that does not fit completely within the length is left out. The result is therefore always validly encoded text.

The function signature is as follows:

 mb_strcut(string $string, int $start, ?int $length = null, ?string $encoding = null): string
  • $string : The string to cut.

  • $start : The starting position, in bytes.

  • $length : The maximum length of the result, in bytes. If omitted or null, everything up to the end of the string is returned.

  • $encoding : The character encoding of the string. If omitted or null, the internal encoding is used (see mb_internal_encoding(); UTF-8 by default in current PHP versions).
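
The following is a minimal sketch of how the byte-based start and length arguments behave on a UTF-8 string. The sample string and variable names are illustrative assumptions, not taken from the original article, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Illustrative string (an assumption for this sketch):
// four Chinese characters, 3 bytes each in UTF-8, 12 bytes in total.
$text = "你好世界";

// Length omitted (null): everything from byte offset 3, the start of "好", to the end.
$tail = mb_strcut($text, 3, null, 'UTF-8');

// Byte offset 4 falls inside "好"; the cut moves back to the first byte of that character.
$adjusted = mb_strcut($text, 4, null, 'UTF-8');

// A 5-byte budget cannot hold two 3-byte characters, so only the first one is returned.
$short = mb_strcut($text, 0, 5, 'UTF-8');

echo $tail, "\n";     // 好世界
echo $adjusted, "\n"; // 好世界
echo $short, "\n";    // 你
```

Note that the returned string can be shorter than the requested length, because only whole characters are kept.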


Difference between mb_strcut and mb_substr

Although both mb_strcut and mb_substr can intercept multi-byte strings, their logic is different:

  • mb_substr cuts by number of characters (for example, 5 characters starting at the third character).

  • mb_strcut cuts by number of bytes, but adjusts the cut points to character boundaries so that no multibyte character is split.

For example, a Chinese character occupies 3 bytes in UTF-8. mb_substr($str, 0, 2) always returns two characters, however many bytes they take, whereas mb_strcut($str, 0, 4) returns as many whole characters as fit in 4 bytes, which is just one Chinese character.
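
To make the difference concrete, here is a small sketch that passes the same numeric arguments to both functions. The mixed ASCII/Chinese string is an illustrative assumption.

```php
<?php
// Illustrative string (an assumption): 3 ASCII bytes followed by
// two Chinese characters of 3 bytes each, i.e. 5 characters and 9 bytes.
$text = "PHP你好";

// mb_substr counts characters: the first 4 characters.
$byChars = mb_substr($text, 0, 4, 'UTF-8');

// mb_strcut counts bytes: at most 4 bytes, and "你" (3 bytes) does not fit after "PHP".
$byBytes = mb_strcut($text, 0, 4, 'UTF-8');

echo $byChars, " (", strlen($byChars), " bytes)\n"; // PHP你 (6 bytes)
echo $byBytes, " (", strlen($byBytes), " bytes)\n"; // PHP (3 bytes)
```

mb_substr is the natural choice when the limit is expressed in characters (for example, a display length), while mb_strcut fits limits expressed in bytes (for example, a storage size).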


Example of basic usage of mb_strcut

Here is a simple example showing how to cut a Chinese string with mb_strcut.

 <?php
$text = "你好,世界!"; // "Hello, world!" in Chinese; every character here is multibyte
// Cut by bytes: start at byte offset 0, take at most 6 bytes
$result = mb_strcut($text, 0, 6, 'UTF-8');
echo $result; // Outputs "你好"
?>

Explanation:

  • "你" and "好" each occupy 3 bytes in UTF-8, so the 6 bytes cover exactly 2 complete Chinese characters.

  • Here the 6-byte cut happens to land on a character boundary, so substr would return the same result. With a length such as 5, substr would return "你" followed by an incomplete byte sequence, which shows up as garbled text, whereas mb_strcut would return just "你".
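
The contrast with substr can be checked directly. The sketch below uses an illustrative Chinese string and mb_check_encoding to test whether each result is still valid UTF-8.

```php
<?php
// Illustrative string (an assumption for this sketch); each character is 3 bytes in UTF-8.
$text = "你好,世界!";

// substr cuts blindly after 5 bytes: "你" plus the first two bytes of "好".
$raw = substr($text, 0, 5);

// mb_strcut keeps only the whole characters that fit in 5 bytes: "你".
$safe = mb_strcut($text, 0, 5, 'UTF-8');

var_dump(mb_check_encoding($raw, 'UTF-8'));  // bool(false): ends with a partial character
var_dump(mb_check_encoding($safe, 'UTF-8')); // bool(true)
echo $safe, "\n";                            // 你
```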


Tips in practical application

  1. Avoid garbled text : When a string containing multibyte characters must be cut to a byte length, prefer mb_strcut over substr so that the result never contains a partial character.

  2. Specify the encoding : Always pass the encoding argument explicitly, usually 'UTF-8', so that the result does not depend on the configured internal encoding.

  3. Use in combination with strlen : mb_strcut works in bytes, so use strlen (which returns the byte length) to compute the budget. For example, to keep roughly the first half of a string, pass half of strlen($text) as the length; mb_strcut then rounds down to the nearest character boundary. If you need an exact number of characters instead, use mb_strlen together with mb_substr.
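
These tips can be combined into a small helper. This is only a sketch: truncateToBytes is a hypothetical function name, not part of PHP or of the original article, and the sample title is an illustrative assumption.

```php
<?php
// Hypothetical helper (not part of PHP): trims a UTF-8 string to at most
// $maxBytes bytes without splitting a multibyte character.
function truncateToBytes(string $text, int $maxBytes): string
{
    // strlen() counts bytes, which is the unit mb_strcut() works in.
    if (strlen($text) <= $maxBytes) {
        return $text;
    }
    // The encoding is passed explicitly so the result does not depend on configuration.
    return mb_strcut($text, 0, $maxBytes, 'UTF-8');
}

$title = "多字节字符串处理"; // "Multibyte string handling": 8 characters, 24 bytes in UTF-8
$half = truncateToBytes($title, intdiv(strlen($title), 2)); // 12-byte budget

echo $half, "\n"; // 多字节字
echo strlen($half), " bytes, ", mb_strlen($half, 'UTF-8'), " characters\n"; // 12 bytes, 4 characters
```

A helper like this is mainly useful when a limit is defined in bytes, for instance a fixed-size storage field; for limits defined in characters, mb_substr is the better fit.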


Example: combining with a URL

Suppose you want to truncate a multibyte label and append a URL to it. You can write it like this:

 <?php
$text = "访问我们的官方网站:"; // "Visit our official website:" in Chinese
$url = "https://gitbox.net/path/to/resource";
$result = mb_strcut($text, 0, 12, 'UTF-8'); // 12 bytes = 4 Chinese characters in UTF-8 (3 bytes each)
echo $result . $url;
?>

Output:

 访问我们https://gitbox.net/path/to/resource

Summary

  • mb_strcut cuts a multibyte string by byte position and byte length without ever splitting a character.

  • It is well suited to UTF-8-encoded Chinese, Japanese, and similar text, where a plain byte cut would produce garbled output.

  • Pass the encoding argument explicitly so that the behavior does not depend on the configured internal encoding.

  • In practice, it lets you trim text to a byte budget and then concatenate it with URLs or other content.

By mastering mb_strcut, you can handle multibyte strings more reliably and improve the robustness and user experience of your PHP programs.