How to Capture Match Results with the mb_ereg Function? Complete Steps and Key Considerations

gitbox 2025-09-02

1. Basic Usage of the mb_ereg Function

The mb_ereg function has the following basic syntax:

```php
mb_ereg(pattern, string, &regs)
```

  • pattern: The regular expression pattern.

  • string: The target string to be matched.

  • regs: This is an optional parameter used to store the match results. If provided, mb_ereg will store the matched parts in the array regs, where index 0 corresponds to the entire match, and the subsequent indices correspond to the matches of sub-patterns.

Example:

```php
$string = "Welcome to PHP Tutorial";
$pattern = "PHP";
$regs = [];
if (mb_ereg($pattern, $string, $regs)) {
    echo "Match successful\n";
    print_r($regs);  // Output captured match results
} else {
    echo "Match failed\n";
}
```

In the example above, the string "Welcome to PHP Tutorial" contains "PHP", so mb_ereg reports a successful match and stores the matched text "PHP" in $regs[0].

2. Capturing Match Results

The key to capturing match results lies in the regs parameter. Through this parameter, mb_ereg stores matched content sequentially and can capture multiple sub-pattern matches.

Example:

```php
$string = "This is an example combining PHP and MySQL";
// The spaces between the groups are required: they match the spaces in $string.
$pattern = "(PHP) (and) (MySQL)";
$regs = [];
if (mb_ereg($pattern, $string, $regs)) {
    echo "Match successful\n";
    print_r($regs);
} else {
    echo "Match failed\n";
}
```

Output:

```
Match successful
Array
(
    [0] => PHP and MySQL
    [1] => PHP
    [2] => and
    [3] => MySQL
)
```

In this example, the three parenthesized groups in the pattern capture "PHP", "and", and "MySQL". $regs[0] holds the entire matched string, while $regs[1], $regs[2], and $regs[3] hold the text matched by each group, in order.
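
The same indexing applies to multibyte text. As a minimal sketch, the following pulls the year, month, and day out of a Chinese-style date. The helper name `extractDateParts` and the sample string are assumptions made for illustration, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper (not part of mbstring): returns the captured
// year, month, and day, or null when the pattern does not match.
function extractDateParts(string $text): ?array
{
    mb_regex_encoding('UTF-8'); // assumes the input is UTF-8

    $regs = [];
    // Three capture groups: year, month, day.
    if (mb_ereg('([0-9]{4})年([0-9]{1,2})月([0-9]{1,2})日', $text, $regs) === false) {
        return null;
    }

    // $regs[0] is the whole match; the groups start at index 1.
    return ['year' => $regs[1], 'month' => $regs[2], 'day' => $regs[3]];
}

$parts = extractDateParts('发布于2025年9月2日');
```

Returning a named array keeps calling code from depending on numeric group positions, which shift whenever a group is added to the pattern.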

3. Key Considerations

3.1 Writing Regular Expressions

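mb_ereg is built on the Oniguruma regex engine. Its syntax is close to the Perl-style regex used by preg_match, with two practical differences: the pattern is written without delimiters (for example "PHP", not "/PHP/"), and matching works on characters in the current regex encoding rather than on bytes. Pay special attention to character ranges, particularly when matching Chinese characters.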

Example:

```php
$pattern = '^[\x{4e00}-\x{9fa5}]+$';  // Matches strings made up only of Chinese characters
```

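This regular expression matches only strings that consist entirely of Chinese characters in the Unicode range U+4E00 to U+9FA5 (written \x{4e00} to \x{9fa5} in the pattern), which covers the commonly used CJK Unified Ideographs.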

3.2 Parameter Passing

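The regs parameter is declared as a reference in the function signature, so pass an ordinary variable ($regs) and mb_ereg fills it in. Do not write &$regs at the call site, since call-time pass-by-reference is a fatal error in PHP 5.4 and later. If you omit the argument, the captured results are not available.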
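
The following sketch shows what happens to the variable across a successful and a failed call. The variable names are illustrative only, and the reset-on-failure behaviour noted in the comments applies to PHP 7.1 and later.

```php
<?php
$regs = ['stale']; // deliberately pre-filled to show that mb_ereg overwrites it

// Pass the variable itself; the function signature already takes it by reference.
$matched = mb_ereg('(P)(HP)', 'Welcome to PHP Tutorial', $regs);
$firstRegs = $regs; // keep a copy of the captures from the successful call

// On PHP 7.1 and later, a failed match resets $regs to an empty array.
$missed = mb_ereg('Python', 'Welcome to PHP Tutorial', $regs);
```

Copying the captures before the next call matters whenever the same variable is reused, because each call replaces its contents.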

3.3 Function Return Value

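As of PHP 8.0, mb_ereg returns a boolean: true if the pattern matched and false otherwise. Earlier versions returned the byte length of the matched string when regs was supplied (or 1 when it was omitted or the match was empty) and false on failure, so comparing the result against false works across versions.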
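
One way to keep calling code independent of the PHP version is to normalise the result to a strict boolean. The wrapper name `mbMatches` below is hypothetical and is shown only as a sketch.

```php
<?php
// Hypothetical helper: normalises mb_ereg's result to a strict bool so the
// calling code behaves the same on PHP 7 (int or false) and PHP 8 (bool).
function mbMatches(string $pattern, string $subject): bool
{
    return mb_ereg($pattern, $subject) !== false;
}

$hasPhp  = mbMatches('PHP', 'Welcome to PHP Tutorial');
$hasRuby = mbMatches('Ruby', 'Welcome to PHP Tutorial');
```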

3.4 Encoding Settings

To ensure multibyte text is matched correctly, set the character encoding before calling mb_ereg. The mbstring regex functions use the encoding set with mb_regex_encoding(), while mb_internal_encoding() sets the default for the other mbstring functions, so it is safest to set both:

```php
mb_internal_encoding("UTF-8");
mb_regex_encoding("UTF-8");
```

If you are working with a non-UTF-8 character set (e.g., GBK or Shift_JIS), set that encoding instead, confirm that the call succeeds, and make sure the pattern and the target string use the same encoding.
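
If only part of an application needs a particular encoding, one option is to switch the regex encoding for a single match and then restore it. The helper `matchWithEncoding` below is a hypothetical sketch of that idea, not an mbstring function.

```php
<?php
// Hypothetical helper: runs one match under a specific regex encoding,
// then restores whatever encoding was active before.
function matchWithEncoding(string $pattern, string $subject, string $encoding): array
{
    $previous = mb_regex_encoding();   // no argument: returns the current encoding name
    mb_regex_encoding($encoding);

    $regs = [];
    $ok = mb_ereg($pattern, $subject, $regs) !== false;

    mb_regex_encoding($previous);      // restore the caller's setting
    return $ok ? $regs : [];
}

$regs = matchWithEncoding('(日本)(語)', '日本語のテキスト', 'UTF-8');
```

Restoring the previous value avoids surprising other code that relies on the global regex encoding.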

3.5 Performance Optimization

Because mb_ereg decodes multibyte characters as it matches, its running time depends on the encoding and on the length of the string, and it can be slower than byte-oriented regex functions. When processing large amounts of data, keep the number of regex calls down, for example by combining several patterns into one or by skipping strings that cannot match.
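
As a rough sketch of combining patterns, the code below contrasts one mb_ereg call per keyword with a single call that uses an alternation. The keyword list is made up for illustration and is assumed to contain no regex metacharacters. Whether the single call is actually faster depends on the data, so measure before relying on it.

```php
<?php
$text = 'This is an example combining PHP and MySQL';
$keywords = ['Redis', 'MySQL', 'PHP']; // illustrative; assumed free of regex metacharacters

// One mb_ereg call per keyword: up to count($keywords) scans of $text.
$foundLoop = false;
foreach ($keywords as $keyword) {
    if (mb_ereg($keyword, $text) !== false) {
        $foundLoop = true;
        break;
    }
}

// A single call with an alternation scans $text once.
$regs = [];
$foundOnce = mb_ereg('(' . implode('|', $keywords) . ')', $text, $regs) !== false;
// $regs[1] holds whichever keyword occurs first in $text.
```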

4. Conclusion

mb_ereg provides a powerful tool in PHP for regex matching with multibyte character sets. By correctly using the regs parameter, you can easily capture and handle match results. Understanding how to write regex, pass parameters, and set encoding will help you use mb_ereg more efficiently for string processing.