
How to Properly Cut Strings Containing Spaces Using the mb_strcut Function? Detailed Steps and Considerations

gitbox 2025-06-15

What is the mb_strcut Function?

mb_strcut is part of PHP's mbstring extension and cuts multi-byte strings by byte position. It resembles substr, but where substr can split a multi-byte character in half, mb_strcut moves the cut to a character boundary, so the result always consists of whole characters.
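To make the difference from substr concrete, here is a minimal sketch. The Japanese sample string is an assumption chosen for illustration (each character takes 3 bytes in UTF-8), and the source file is assumed to be saved as UTF-8.

```php
<?php
// Three Japanese characters, 3 bytes each in UTF-8 (9 bytes total).
$text = "日本語";

// substr() cuts at byte 4 regardless of character boundaries,
// leaving a dangling lead byte from the second character.
$raw = substr($text, 0, 4);
var_dump(mb_check_encoding($raw, 'UTF-8')); // bool(false)

// mb_strcut() backs off to the last complete character within 4 bytes.
$safe = mb_strcut($text, 0, 4, 'UTF-8');
echo $safe, "\n";         // 日
echo strlen($safe), "\n"; // 3
```

The substr result is 4 bytes long but is no longer valid UTF-8, while the mb_strcut result is shorter than requested but intact.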

Basic Syntax of mb_strcut

mb_strcut(string $str, int $start, ?int $length = null, ?string $encoding = null): string  
  • $str: The string to be cut.

  • $start: The starting position of the cut (in bytes).

  • $length: The length of the substring (in bytes). If not specified, the substring will extend from $start to the end of the string.

  • $encoding: The character encoding. If omitted or null, the internal encoding is used (see mb_internal_encoding(); this is usually UTF-8).
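Because $start and $length are counted in bytes, mb_strcut behaves differently from mb_substr, which counts characters. The sketch below passes the same numbers to both functions. The sample string is an assumption for illustration, chosen so that the start offset falls on a character boundary.

```php
<?php
// 7 characters, 21 bytes in UTF-8.
$text = "日本語テキスト";

// mb_strcut(): start at byte 3, take at most 6 bytes.
$by_bytes = mb_strcut($text, 3, 6, 'UTF-8');

// mb_substr(): start at character 3, take at most 6 characters.
$by_chars = mb_substr($text, 3, 6, 'UTF-8');

echo $by_bytes, "\n"; // 本語
echo $by_chars, "\n"; // テキスト
```

mb_strcut is the natural choice when the limit is a storage size (for example a column measured in bytes), while mb_substr suits limits expressed in visible characters.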

Steps to Cut a String Using mb_strcut

1. Initialize the String and Set Encoding

First, ensure that the string you're using is encoded in UTF-8. Since mb_strcut works with multi-byte characters, the encoding of the string must be correct, and UTF-8 encoding is commonly used.

$str = "Hello, today's weather is great!"; // A sample string containing punctuation and spaces (ASCII only, so 1 byte per character)  
$encoding = "UTF-8";  
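If the origin of the input is uncertain, it can help to validate the encoding before cutting. The following is a sketch under stated assumptions: the helper name to_utf8 is hypothetical, and the ISO-8859-1 fallback is only a guess at what legacy input might use; substitute whatever encoding your data actually arrives in.

```php
<?php
// Hypothetical helper: return $input as UTF-8.
// Assumes that anything which is not valid UTF-8 is in $fallback encoding.
function to_utf8(string $input, string $fallback = 'ISO-8859-1'): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input; // Already valid UTF-8, leave untouched.
    }
    return mb_convert_encoding($input, 'UTF-8', $fallback);
}

$latin1 = "caf\xE9";            // "café" encoded as ISO-8859-1
$utf8   = to_utf8($latin1);     // now "caf\xC3\xA9"
echo mb_strcut($utf8, 0, 4, 'UTF-8'), "\n"; // caf
```

The final cut returns "caf" because the 4-byte limit falls inside the 2-byte "é", and mb_strcut stops before it.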

2. Cut the String to a Specified Length

If we want to cut the first 6 bytes of the string, we can write:

$sub_str = mb_strcut($str, 0, 6, $encoding);  
echo $sub_str;  // Output: Hello,  

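The output is "Hello," because each of these six ASCII characters occupies exactly one byte. Had the 6-byte limit fallen inside a multi-byte character, mb_strcut would have stopped before that character instead of splitting it.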
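To see how the byte limit interacts with a space and with multi-byte characters, the sketch below cuts one mixed string at several lengths. The sample string is an assumption for illustration and is not taken from the article.

```php
<?php
// Byte layout: H(1) i(1) space(1) 世(3) 界(3) = 9 bytes.
$text = "Hi 世界";

$results = [];
foreach ([3, 4, 5, 6, 9] as $length) {
    // Each call returns the longest run of whole characters that fits.
    $results[$length] = mb_strcut($text, 0, $length, 'UTF-8');
}

foreach ($results as $length => $piece) {
    printf("%d bytes -> \"%s\"\n", $length, $piece);
}
// 3 bytes -> "Hi "
// 4 bytes -> "Hi "
// 5 bytes -> "Hi "
// 6 bytes -> "Hi 世"
// 9 bytes -> "Hi 世界"
```

The space counts as one byte like any ASCII character, and limits of 4 or 5 bytes return the same result as 3 because the next character needs 3 more bytes.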

3. Cut to the End of the String

To cut through to the end of the string, omit $length (it defaults to null):

$sub_str = mb_strcut($str, 0);  
echo $sub_str;  // Output: Hello, today's weather is great!  
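One use of a non-zero $start is to split a string into pieces that each fit a byte budget. This is a sketch only: the function name split_by_bytes is hypothetical, UTF-8 input is assumed, and the budget is required to be at least 4 bytes (the longest UTF-8 character) so that every step makes progress.

```php
<?php
// Hypothetical helper: split UTF-8 text into chunks of at most $maxBytes bytes,
// never splitting a character.
function split_by_bytes(string $text, int $maxBytes): array
{
    if ($maxBytes < 4) {
        // A UTF-8 character can take up to 4 bytes; smaller budgets could stall.
        throw new InvalidArgumentException('maxBytes must be at least 4');
    }

    $chunks = [];
    $offset = 0;
    $total  = strlen($text); // total length in bytes

    while ($offset < $total) {
        $chunk    = mb_strcut($text, $offset, $maxBytes, 'UTF-8');
        $chunks[] = $chunk;
        // The chunk ends on a character boundary, so the next offset does too.
        $offset += strlen($chunk);
    }
    return $chunks;
}

print_r(split_by_bytes("Hello, 世界!", 8));
// Chunks: "Hello, " (7 bytes) and "世界!" (7 bytes)
```

Because each offset is advanced by the byte length of the previous chunk, every cut starts on a character boundary, and concatenating the chunks gives back the original string.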

How to Handle Strings Containing Spaces?

A common question is how to cut strings that contain spaces. mb_strcut gives spaces no special treatment: it counts bytes, so a byte limit can just as easily fall in the middle of a word as on a space.

When using mb_strcut, while it correctly handles multi-byte characters, you still need to keep the following in mind:

  1. Spaces as Characters: A space is an ordinary character and, in UTF-8, occupies exactly one byte. mb_strcut counts it toward $start and $length like any other byte, so include spaces when you work out those values.

  2. Ensure Words Are Not Truncated: To end on a complete word or phrase, cut to the byte limit first and then look for the last space in the result. Note that mb_strrpos returns a character index while mb_strcut expects byte offsets, so for multi-byte text use the byte-based strrpos (or convert the index) before adjusting the cut length.
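The unit of the space position matters once the text contains multi-byte characters. The sketch below compares the character index from mb_strrpos with the byte offset from strrpos; the sample string (with an ASCII space in the middle) is an assumption for illustration.

```php
<?php
// 3 characters, one ASCII space, then 4 characters.
$text = "日本語 テキスト";

$char_pos = mb_strrpos($text, ' ', 0, 'UTF-8'); // character index of the space
$byte_pos = strrpos($text, ' ');                // byte offset of the space

echo $char_pos, "\n"; // 3
echo $byte_pos, "\n"; // 9

// Byte offset passed to a byte-based function: cuts right before the space.
echo mb_strcut($text, 0, $byte_pos, 'UTF-8'), "\n"; // 日本語

// Character index misused as a byte length: only one character fits in 3 bytes.
echo mb_strcut($text, 0, $char_pos, 'UTF-8'), "\n"; // 日
```

Searching for an ASCII space with strrpos is safe in UTF-8 because ASCII byte values never occur inside a multi-byte sequence.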

Considerations

  • Encoding Issues: When calling the mb_strcut function, ensure that the string’s encoding is correct. Mismatched encodings can result in garbled text or incorrect cuts.

  • Spaces and Special Characters: mb_strcut never splits a multi-byte character, but because it counts bytes it has no notion of words. A cut can land mid-word or leave a trailing space, so check the cut position (or trim the result) when word boundaries matter.

  • Performance Considerations: For large-scale string processing, frequent use of mb_strcut can lead to performance issues. It’s recommended to optimize for performance when handling large data and avoid unnecessary string operations.

Example: How to Avoid Cutting in the Middle of a Word

Suppose we have a string containing multiple words, and we want to cut a part of the string that includes a complete word. We can find the position of the space to ensure the cut is made at a word boundary.

$str = "This is a text containing spaces, let's cut it.";  
$encoding = "UTF-8";  

// Cut at most 12 bytes; mb_strcut never splits a multi-byte character
$cut = mb_strcut($str, 0, 12, $encoding);  // "This is a te"

// Find the last space in the cut (strrpos returns a byte offset, matching mb_strcut)
$last_space_pos = strrpos($cut, ' ');

// Drop the partial word after that space, if there is one
$sub_str = $last_space_pos === false ? $cut : substr($cut, 0, $last_space_pos);
echo $sub_str;  // Output: This is a

In this example, the 12-byte cut ends partway through "text", so we back up to the last space and keep only the complete words before it.
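The same idea can be wrapped in a reusable function. This is a minimal sketch, not a definitive implementation: the name truncate_at_word is hypothetical, words are assumed to be separated by ASCII spaces, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper: return at most $maxBytes bytes of $text,
// ending on a whole character and, where possible, on a whole word.
function truncate_at_word(string $text, int $maxBytes, string $encoding = 'UTF-8'): string
{
    if (strlen($text) <= $maxBytes) {
        return $text; // Already fits, nothing to cut.
    }

    // Byte-limited cut that never splits a multi-byte character.
    $cut = mb_strcut($text, 0, $maxBytes, $encoding);

    // If the next byte in the original is a space, the cut already ends on a word.
    if (substr($text, strlen($cut), 1) === ' ') {
        return $cut;
    }

    // Otherwise drop the partial word after the last space, if any.
    $lastSpace = strrpos($cut, ' ');
    return $lastSpace === false ? $cut : substr($cut, 0, $lastSpace);
}

echo truncate_at_word("Grüße aus München heute", 14), "\n";          // Grüße aus
echo truncate_at_word("This is a text containing spaces", 14), "\n"; // This is a text
```

In the first call the 14-byte limit falls inside "ü", so mb_strcut stops after "M" and the function then backs up to the preceding space. In the second call the cut happens to end exactly at a word boundary and is returned unchanged.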