What is the correct way to extract the first N characters of a string using the PHP substr function?

gitbox 2025-05-31

1. Basic usage of the substr function

The substr function in PHP is defined as follows:

 substr(string $string, int $offset, ?int $length = null): string

$string : The input string.
$offset : The start position; 0 refers to the first character of the string.
$length : The maximum length of the returned substring. If omitted (or null), the result runs from the start position to the end of the string.
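
As a quick illustration of how the start position and length interact, here is a minimal sketch. The sample string and the variable names ($text, $head, $tail) are made up for this example and do not come from the article.

```php
<?php
// Hypothetical sample string used only for illustration.
$text = "Hello, world";

// Start at position 0 and take 5: the beginning of the string.
$head = substr($text, 0, 5);   // "Hello"

// Length omitted: everything from position 7 to the end.
$tail = substr($text, 7);      // "world"

echo $head, "\n", $tail, "\n";
```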

2. A simple way to extract the first N characters of a string

To extract the first N characters from the string $str, pass 0 as the start position and N as the length:

 $str = "This is a test string";
$n = 5;
$substring = substr($str, 0, $n);
echo $substring;

This code outputs the first 5 characters of the string: "This " (the four letters plus the trailing space).
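
Two edge cases are worth knowing: if N is larger than the string, substr simply returns the whole string, and a negative length makes substr drop characters from the end rather than return nothing. The sketch below wraps the call in a small helper that clamps negative values to zero. The name first_n is hypothetical and is not a built-in PHP function.

```php
<?php
// first_n() is a hypothetical helper, not part of PHP itself.
function first_n(string $str, int $n): string
{
    // Treat negative counts as zero; a negative length would otherwise
    // make substr() cut characters off the end of the string instead.
    return substr($str, 0, max(0, $n));
}

echo first_n("This is a test string", 5), "\n";  // "This " (with the trailing space)
echo first_n("abc", 10), "\n";                   // "abc" (N exceeds the length)
```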

3. Watch out for multi-byte character encodings

Although substr is simple to use, it can produce garbled output or cut a character in half when the string contains multi-byte characters (such as Chinese, Japanese, or Korean text). This is because substr counts bytes, not characters.

If the string is UTF-8 encoded, use the mb_substr function instead, which is multibyte-safe:

 $str = "This is a test string";
$n = 5;
$substring = mb_substr($str, 0, $n, "UTF-8");
echo $substring;

mb_substr counts characters rather than bytes, so a multi-byte character is never split in the middle.
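
The difference only shows up with a string that actually contains multi-byte characters. The following sketch compares the two functions on a sample Japanese string chosen for illustration. It assumes the mbstring extension is enabled and that the source file is saved as UTF-8.

```php
<?php
// Sample UTF-8 string chosen for illustration; each of these
// characters occupies 3 bytes in UTF-8.
$str = "日本語のテキスト";

// substr() counts bytes: 4 bytes is one whole character plus one stray byte.
$bytes = substr($str, 0, 4);

// mb_substr() counts characters: the result is "日本語の".
$chars = mb_substr($str, 0, 4, "UTF-8");

var_dump(mb_check_encoding($bytes, "UTF-8"));  // bool(false): broken UTF-8
var_dump(mb_check_encoding($chars, "UTF-8"));  // bool(true)
echo $chars, "\n";
```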

4. Putting it together: safely outputting the first N characters in a web page

 <?php
$str = "Welcome to gitbox.net, where you will find plenty of resources.";
$n = 10;

// Extract the first 10 characters of the string
$substring = mb_substr($str, 0, $n, "UTF-8");

// Output to the web page
echo "<p>Content summary: {$substring}...</p>";
?>
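
If the text might contain HTML special characters, or might be shorter than N, a slightly more defensive variant can help. The sketch below is one possible approach, not the article's own method: html_summary is a hypothetical helper name, it assumes $n is not negative, and it relies on PHP's htmlspecialchars and mb_strlen functions.

```php
<?php
// html_summary() is a hypothetical helper name used for illustration.
// It assumes $n >= 0 and UTF-8 input.
function html_summary(string $text, int $n): string
{
    $summary = mb_substr($text, 0, $n, "UTF-8");

    // Append an ellipsis only when something was actually cut off.
    if (mb_strlen($text, "UTF-8") > $n) {
        $summary .= "...";
    }

    // Escape HTML special characters so the text cannot inject markup.
    return htmlspecialchars($summary, ENT_QUOTES, "UTF-8");
}

echo "<p>Content summary: " . html_summary("Tips & <b>tricks</b> for PHP", 12) . "</p>\n";
```

Escaping after truncation keeps an entity such as &amp; from being cut in half.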

5. Summary

For English text or other single-byte encoded strings, substr($str, 0, $n) is enough.
For multi-byte strings, use mb_substr($str, 0, $n, "UTF-8") so that characters are not cut in half and garbled.
When outputting to a web page, wrap the extracted text in the appropriate HTML tags, and escape it with htmlspecialchars if it may contain markup.