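In PHP, preg_replace() is the Swiss Army knife of text processing. Built on PCRE (Perl Compatible Regular Expressions), it handles simple find-and-replace as well as more involved jobs such as structural rewriting, data cleaning, and batch renaming. This article starts from scratch and walks through the function signature, regex syntax, common scenarios, and pitfalls of preg_replace(), so that you can pick up the core techniques of regex replacement quickly.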
Function signature:
```php
preg_replace(
    string|array $pattern,
    string|array $replacement,
    string|array $subject,
    int $limit = -1,
    int &$count = null
): string|array|null
```
$pattern: the regular expression (an array means multiple rules)
$replacement: the replacement text (may be an array whose entries pair up with $pattern by position)
$subject: the string (or array of strings) to process
$limit: maximum number of replacements per pattern in each subject string (the default -1 means no limit)
$count: output parameter that receives the number of replacements actually made
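To make the parameter list concrete, here is a small sketch (the sample strings are made up for illustration) that passes an array as $subject, caps the work with $limit, and reads the total back from $count. Note that $limit applies to each subject string separately, while $count accumulates across all of them.

```php
<?php
// Illustrative input: two subject strings processed in one call.
$subjects = ['a-b-c', 'x-y'];

// Replace at most one "-" in each subject; $count is filled in by reference.
$out = preg_replace('/-/', '+', $subjects, 1, $count);

echo implode(' | ', $out), PHP_EOL; // a+b-c | x+y
echo $count, PHP_EOL;               // 2
```

Because $subject was an array, the return value is an array with the same keys.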
Minimal example:
```php
<?php
$text = "Color or Colour? I like the color blue.";
$result = preg_replace('/colou?r/i', 'color', $text, -1, $count);
echo $result;                       // color or color? I like the color blue.
echo PHP_EOL . "Replaced: $count";  // Replaced: 3
```
/colou?r/i: the ? makes the preceding u optional, and the i modifier ignores case. Because of i, the capitalized "Color" also matches and is rewritten in lowercase, which is why three replacements are reported.
Common delimiters include / # ~ % as well as bracket pairs such as {} and (). Picking a delimiter that does not clash with the pattern saves the most trouble:
```php
preg_replace('#https?://[^\s]+#', '[link]', $text);
```
When a pattern contains many / characters, switching to # avoids a pile of escapes.
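The delimiter question comes up again when part of a pattern comes from data rather than from you. One way to handle it, sketched below with made-up input, is preg_quote(), whose second argument names the delimiter to escape along with the regex metacharacters.

```php
<?php
// Hypothetical literal text that happens to contain "/", "#" and ".".
$needle  = 'a/b#c.d';
$subject = 'see a/b#c.d and a/b#cxd';

// Escape regex metacharacters and the chosen delimiter (#) before embedding.
$pattern = '#' . preg_quote($needle, '#') . '#';

$result = preg_replace($pattern, '[x]', $subject);
echo $result, PHP_EOL; // see [x] and a/b#cxd
```

The second occurrence is left alone because the escaped dot no longer matches an arbitrary character.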
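Commonly used modifiers:
i: case-insensitive matching
m: multiline mode (^ and $ also match at the start and end of every line)
s: dot-all mode (. also matches newlines)
u: treat pattern and subject as UTF-8 (strongly recommended for Chinese text and emoji)
x: ignore whitespace and # comments inside the pattern (better readability)
U: invert greediness (quantifiers become lazy by default, and a trailing ? makes them greedy again)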
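To see what m and s each change on their own, the following sketch runs the same two patterns with and without the modifier on a two-line string (the input is illustrative):

```php
<?php
$text = "a\nb";

// Without m, ^ matches only at the very start of the subject.
$plain = preg_replace('/^/', '> ', $text);      // "> a\nb"
// With m, ^ also matches right after every line break.
$multiline = preg_replace('/^/m', '> ', $text); // "> a\n> b"

// Without s, the dot refuses to cross the line break, so nothing matches.
$noDotAll = preg_replace('/a.b/', 'X', $text);  // "a\nb" (unchanged)
// With s, the dot matches the newline as well.
$dotAll = preg_replace('/a.b/s', 'X', $text);   // "X"

var_dump($plain, $multiline, $noDotAll, $dotAll);
```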
Example (multiline + dot-all):
```php
$log = "ID:42\nPayload:\n{\n \"a\":1\n}\nEnd";
echo preg_replace('/^Payload:(.*)End$/ims', '[DATA HIDDEN]', $log);
// ID:42
// [DATA HIDDEN]
```
Character classes: [abc], [^abc], \d digit, \w word character, \s whitespace
Anchors: ^ start of the subject (of each line with m), $ end of the subject (of each line with m), \b word boundary
Quantifiers: * (0 or more), + (1 or more), ? (0 or 1), {m,n} (range)
Greedy vs. lazy: + is greedy, +? is lazy (matches as little as possible)
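The greedy/lazy distinction is easiest to see side by side. This is a small sketch with a made-up HTML fragment; the third variant shows that a narrow character class often removes the need for laziness altogether.

```php
<?php
$html = '<b>one</b> and <b>two</b>';

// Greedy: .+ runs to the last </b>, swallowing everything in between.
$greedy = preg_replace('#<b>.+</b>#', '[B]', $html);
// Lazy: .+? stops at the first </b> it can reach.
$lazy = preg_replace('#<b>.+?</b>#', '[B]', $html);
// Narrow class: [^<]+ cannot run past the next tag in the first place.
$narrow = preg_replace('#<b>[^<]+</b>#', '[B]', $html);

echo $greedy, PHP_EOL; // [B]
echo $lazy, PHP_EOL;   // [B] and [B]
echo $narrow, PHP_EOL; // [B] and [B]
```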
Example (masking an email address):
```php
$email = 'alice@example.com'; // illustrative address
echo preg_replace('/(?<=.).+?(?=@)/', '***', $email);
// a***@example.com
```
The lookbehind (?<=...) and lookahead (?=...) pin down exactly the range to replace without consuming the characters around it.
Capturing groups: (...) stores the matched text in $1, $2, ... for use in the replacement string
Non-capturing groups: (?:...) only groups and stores nothing, which keeps group numbering simple and is slightly cheaper
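A few reference forms are worth trying out in the replacement string. The sketch below uses invented inputs and single-quoted strings throughout, so that PHP itself does not try to interpret the $ sequences.

```php
<?php
// Non-capturing group for the title, so the surname is still group 1.
$surname = preg_replace('/(?:Mr|Ms)\. (\w+)/', '$1', 'Ms. Lovelace');
echo $surname, PHP_EOL; // Lovelace

// $0 refers to the whole match.
$wrapped = preg_replace('/\d+/', '[$0]', 'a1b22');
echo $wrapped, PHP_EOL; // a[1]b[22]

// ${1} keeps the reference apart from a digit that follows it;
// a bare $10 would be read as a reference to group 10.
$scaled = preg_replace('/(\d+)px/', '${1}0px', 'width: 12px');
echo $scaled, PHP_EOL; // width: 120px
```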
Example (name formatting: 張三-李四 → 張三 & 李四):
```php
$name = '張三-李四';
echo preg_replace('/^(\S+)\s*-\s*(\S+)$/u', '$1 & $2', $name);
// 張三 & 李四
```
Example (URL normalization: HTTP://EXAMPLE.COM/Path → lowercase domain), as it appears in legacy code:
```php
// Legacy only: relies on the /e modifier, which no longer works on PHP 7+.
$url = 'HTTP://EXAMPLE.COM/Path';
echo preg_replace('/^(https?):\/\/([^\/]+)/ie', "'$1://'.strtolower('$2')", $url);
```
Warning: old code may still use the /e modifier, which was deprecated in PHP 5.5 and removed in PHP 7.0. Do not use it; use a callback instead (see the next section).
When the replacement value has to be computed (case conversion, dynamic numbering, conditional logic), a callback is the safer choice:
```php
$input = "HTTP://EXAMPLE.COM/Path and http://MiXeD.com/Another";
$result = preg_replace_callback(
    '#\bhttps?://([^/\s]+)#i',
    function ($m) {
        // $m[0] is the whole match, $m[1] is the host
        $scheme = stripos($m[0], 'https://') === 0 ? 'https://' : 'http://';
        return $scheme . strtolower($m[1]);
    },
    $input
);
echo $result;
// http://example.com/Path and http://mixed.com/Another
```
There is also a sibling function for applying several rules in one call: preg_replace_callback_array() (PHP 7.0+) registers multiple patterns, each with its own callback:
```php
$text = "Price: 19.99 USD, Date: 2025-08-25";
$result = preg_replace_callback_array([
    '/\b(\d+(?:\.\d{2})?)\s*USD\b/' => fn($m) => '$' . $m[1],
    '/\b(\d{4})-(\d{2})-(\d{2})\b/' => fn($m) => "{$m[2]}/{$m[3]}/{$m[1]}",
], $text);
echo $result; // Price: $19.99, Date: 08/25/2025
```
Both $pattern and $replacement accept arrays; entries are paired by position and applied in order. If $replacement is a single string, it is used for every pattern, and if it is an array with fewer entries than $pattern, the leftover patterns are replaced with an empty string. A one-to-one example:
```php
$input = "foo 123 bar 456 baz";
$patterns = ['/\bfoo\b/', '/\d+/', '/\bbaz\b/'];
$replacements = ['FOO', '[NUM]', 'BAZ'];
echo preg_replace($patterns, $replacements, $input);
// FOO [NUM] bar [NUM] BAZ
```
Add the u modifier by default whenever the text may contain multibyte characters, so that they are not split apart.
Can \b be used for Chinese word boundaries? Not reliably. \b is a boundary between a word character and a non-word character: without u, only ASCII letters, digits, and the underscore reliably count as word characters, and with u every Han character counts as one, so no boundary exists between adjacent Chinese words. For Chinese text, use explicit character classes or lookarounds instead.
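The effect of u is easy to check directly. The sketch below assumes the source file is saved as UTF-8; the sample strings are illustrative.

```php
<?php
$s = '中文'; // two characters, six bytes in UTF-8

// Without u, the dot matches one byte at a time.
$bytes = preg_replace('/./', '*', $s);
// With u, the dot matches one code point at a time.
$chars = preg_replace('/./u', '*', $s);

echo $bytes, PHP_EOL; // ******
echo $chars, PHP_EOL; // **

// With u, \w covers Han characters, so \b only fires at the two ends
// of a run of Chinese text, never between the words inside it.
$marked = preg_replace('/\b/u', '|', '你好世界');
echo $marked, PHP_EOL; // |你好世界|
```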
Example (adding spaces between Chinese characters and digits):
```php
$str = "版本2已發佈在2025年8月25天";
$str = preg_replace('/(?<=[\x{4e00}-\x{9fa5}])(?=\d)/u', ' ', $str);
$str = preg_replace('/(?<=\d)(?=[\x{4e00}-\x{9fa5}])/u', ' ', $str);
echo $str; // 版本 2 已發佈在 2025 年 8 月 25 天
```
\x{4e00}-\x{9fa5} covers the commonly used Han characters; \x{...} code points above U+00FF are only accepted with the u modifier.
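As an alternative to a hard-coded range, PCRE's Unicode script property \p{Han} can stand in for the character class. The sketch below assumes PHP's bundled PCRE has Unicode property support (the usual case) and folds both directions into one pattern.

```php
<?php
$str = "版本2已發佈在2025年8月25天";

// Match the empty position between a Han character and a digit, in either order.
$spaced = preg_replace('/(?<=\p{Han})(?=\d)|(?<=\d)(?=\p{Han})/u', ' ', $str);

echo $spaced, PHP_EOL; // 版本 2 已發佈在 2025 年 8 月 25 天
```

\p{Han} also covers Han characters outside the \x{4e00}-\x{9fa5} block, which may or may not be what a given cleanup wants.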
Goal: strip all tags and keep only the plain text.
```php
$html = "<p>Hello <strong>world</strong> &copy; 2025</p>";
$plain = preg_replace('/<[^>]+>/', '', $html);
echo $plain; // Hello world &copy; 2025
```
This is fine for simple cleanup, although entities such as &copy; are left undecoded; for complex HTML structures, a DOM parser is the robust choice.
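For comparison, here is roughly what the DOM route could look like. This is a sketch that assumes the dom extension is enabled and that the input is ASCII-only; non-ASCII input would additionally need an explicit charset hint for loadHTML().

```php
<?php
$html = "<p>Hello <strong>world</strong> &copy; 2025</p>";

$doc = new DOMDocument();
// Collect parser warnings quietly instead of emitting them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// textContent concatenates all text nodes, with entities already decoded.
$plain = $doc->documentElement->textContent;
echo $plain, PHP_EOL; // Hello world © 2025
```

Unlike the regex, the parser copes with attributes that contain ">" and with nested or malformed markup.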
Example (masking a phone number):
```php
$phone = "13812345678";
echo preg_replace('/(\d{3})\d{4}(\d{4})/', '$1****$2', $phone);
// 138****5678
```
Example (filling template placeholders):
```php
$template = "Hi {name}, your order {id} is {status}.";
$data = ['name' => 'Alice', 'id' => 42, 'status' => 'shipped'];
echo preg_replace_callback('/\{(\w+)\}/', function ($m) use ($data) {
    // Unknown placeholders are left untouched
    return (string) ($data[$m[1]] ?? $m[0]);
}, $template);
// Hi Alice, your order 42 is shipped.
```
Example (Markdown image to HTML):
```php
$md = '![alt text](/img/logo.png "Title")';
$img = preg_replace(
    '/!\[([^\]]*)\]\((\S+)(?:\s+"([^"]*)")?\)/',
    '<img src="$2" alt="$1" title="$3">',
    $md
);
echo $img;
// <img src="/img/logo.png" alt="alt text" title="Title">
```
Example (normalizing whitespace and punctuation):
```php
$text = "Hello,world!   PHP\tis\ngreat.";
// Collapse runs of whitespace other than line breaks into a single space
$text = preg_replace('/[^\S\r\n]+/', ' ', $text);
// Replace the full-width (Chinese) comma with an ASCII comma plus a space
$text = preg_replace('/,/u', ', ', $text);
echo $text;
// Hello, world! PHP is
// great.
```
Example (reformatting a log line):
```php
$line = '2025-08-25 14:03:22 [INFO] user=alice ip=203.0.113.9';
$fmt = preg_replace(
    '/^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) \[(\w+)\] user=(\w+) ip=([\d.]+)$/',
    '[$3][$1T$2Z] $4@$5',
    $line
);
echo $fmt;
// [INFO][2025-08-25T14:03:22Z] alice@203.0.113.9
```
Specific first, general later: keep character classes as narrow as possible and avoid overusing .*. Where needed, switch to lazy quantifiers or lookarounds.
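Avoid catastrophic backtracking: structures such as (.+)+ or (.*){m,} can backtrack exponentially and hit PCRE's backtrack limit, in which case preg_replace() returns null. When you can state a clear boundary, do not reach for a greedy catch-all.
Use the u modifier: it is required whenever the text contains multibyte characters (Chinese, emoji); otherwise characters may be corrupted.
Callbacks instead of /e: any replacement that needs computation should go through preg_replace_callback(), which is safer.
Control $limit: set $limit to 1 when you only want to replace the first match.
Count and test: use $count to collect the number of replacements, and write unit tests for critical patterns that cover the edge cases.
Precompilation and caching: PHP caches compiled PCRE patterns internally to some extent, but on hot paths try not to build varying patterns inside loops.
Minimal reproduction: break long patterns into small pieces and verify them one at a time.
Readability: use the x modifier to write patterns with comments:
```php
$pattern = '/
    ^\s*            # leading whitespace allowed
    (?P<key>\w+)    # key
    \s*=\s*
    (?P<val>.+?)    # value (lazy)
    \s*$
/x';
```
Escaping awareness: a pattern is interpreted twice, once as a PHP string literal and once by the regex engine. Prefer single-quoted strings, where '\d' reaches PCRE unchanged; matching a literal backslash takes '\\\\' in the PHP source.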
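Several of the points above, catastrophic backtracking in particular, come down to noticing when PCRE gives up. Here is a minimal sketch, assuming a hypothetical helper named safe_replace() and a deliberately lowered pcre.backtrack_limit so that the failure is easy to trigger:

```php
<?php
// Hypothetical helper (not part of PHP): fail loudly instead of passing null along.
function safe_replace(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
    }
    return $result;
}

// Demo settings only: a tiny limit and no JIT, so the runaway pattern trips quickly.
ini_set('pcre.backtrack_limit', '1000');
ini_set('pcre.jit', '0');

$failed = false;
try {
    // Nested quantifiers: (?:\D+|<\d+>)* can split the text in exponentially many ways.
    safe_replace('/(?:\D+|<\d+>)*[!?]/', '', 'foobar foobar foobar');
} catch (RuntimeException $e) {
    $failed = true;
    echo $e->getMessage(), PHP_EOL;
}
```

The error code can be compared against constants such as PREG_BACKTRACK_LIMIT_ERROR to decide whether to retry with a simpler pattern.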
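Replace only the first match: preg_replace($p, $r, $s, 1, $count);
Remove script tags (a rough cleanup, not a substitute for a real HTML sanitizer): preg_replace('#<script\b[^>]*>.*?</script>#is', '', $html);
Rename a URL query parameter: match ([?&])old=([^&#]*) and replace with $1new=$2
Insert thousands separators (plain integers): preg_replace('/\B(?=(\d{3})+(?!\d))/', ',', $n);
Collapse extra blank lines: preg_replace('/(\R)\s*(\R)/', '$1$2', $text);
Remove control characters (this range also strips tabs and line breaks): preg_replace('/[\x00-\x1F\x7F]/', '', $s);
camelCase to snake_case: strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $camel));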
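A few of these one-liners, run on made-up inputs, look like this. The sketch spells the query-parameter reference as ${1} purely for readability; $1new behaves the same.

```php
<?php
// Thousands separators for a plain integer string.
$n = preg_replace('/\B(?=(\d{3})+(?!\d))/', ',', '1234567');
echo $n, PHP_EOL; // 1,234,567

// camelCase to snake_case: prefix inner capitals with "_", then lowercase.
$snake = strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', 'fooBarBaz'));
echo $snake, PHP_EOL; // foo_bar_baz

// Rename the query parameter "old" to "new", keeping its value.
$url = preg_replace('/([?&])old=([^&#]*)/', '${1}new=$2', '/page?x=1&old=42#top');
echo $url, PHP_EOL; // /page?x=1&new=42#top

// Collapse a run of blank lines down to a single blank line.
$text = preg_replace('/(\R)\s*(\R)/', '$1$2', "a\n\n\n\nb");
echo json_encode($text), PHP_EOL; // "a\n\nb"
```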
```php
<?php
// Illustrative log lines
$log = <<<LOG
[2025-08-25 10:00:01] user=john phone=13812345678 email=john@example.com
[2025-08-25 10:05:09] user=林 phone=13987654321 email=lin@example.cn
LOG;
// 1) Basic masking: star out the middle four digits of the phone number,
//    and keep only the first character of the email user name
$log = preg_replace('/(\b1\d{2})\d{4}(\d{4}\b)/', '$1****$2', $log);
$log = preg_replace('/\b([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*(?=@)/', '$1*', $log);
// 2) Structural rewrite: turn each line into a CSV row
$csv = preg_replace(
    '/^\[(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2})\]\s+user=([^\s]+)\s+phone=([^\s]+)\s+email=([^\s]+)$/mu',
    '$1,$2,$3,$4,$5',
    $log
);
echo $csv;
/*
2025-08-25,10:00:01,john,138****5678,j*@example.com
2025-08-25,10:05:09,林,139****4321,l*@example.cn
*/
```
The power of preg_replace() lies in precisely describing the pattern you are looking for and rewriting it into the shape you need. Get comfortable with delimiters and modifiers, make good use of capturing groups and lookarounds, and switch to callbacks when a replacement needs computation, and you will handle everyday text tasks from cleaning to rewriting with ease. Write lots of small examples and test the edge cases, and regular expressions will turn from "black magic" into a handy everyday tool.