When developing a website, it is often important to know where your visitors came from, especially for statistics, redirect control, permission checks, and logging. PHP provides a built-in function, parse_url, which extracts the components of a URL, such as the scheme, host, path, and query string. This article shows how to use parse_url to analyze the referring address of a user's request.
parse_url is a function used in PHP to parse URLs. Its basic syntax is as follows:
parse_url(string $url, int $component = -1): int|string|array|null|false
$url is the URL string to parse.
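$component is an optional parameter that restricts the result to a single part of the URL (PHP_URL_SCHEME, PHP_URL_HOST, PHP_URL_PORT, PHP_URL_PATH, PHP_URL_QUERY, and so on).
Without $component, the return value is an associative array containing only the components that are present in the URL. With $component, the function returns that part as a string (an int for PHP_URL_PORT), or null if the URL does not contain it. For seriously malformed URLs it may return false. Note that parse_url does not validate the URL; it only splits it into parts.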
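The different return types are easiest to see side by side. The following is a minimal sketch; the sample URL (including its port and fragment) is made up for illustration.

```php
<?php
// Hypothetical sample URL used only for illustration.
$url = 'https://gitbox.net:8443/products/view?id=123#reviews';

// No $component: an associative array of the parts that are present.
$parts = parse_url($url);

// A single component is a string for most parts...
$host = parse_url($url, PHP_URL_HOST);

// ...an int for the port...
$port = parse_url($url, PHP_URL_PORT);

// ...and null when the requested component is absent from the URL.
$user = parse_url($url, PHP_URL_USER);

// Seriously malformed input yields false instead.
$bad = parse_url('http:///example.com');

var_dump($host, $port, $user, $bad);
print_r($parts);
```

Because of the null and false cases, it is safer to compare results with === than to rely on truthiness.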
The referring address is normally available in $_SERVER['HTTP_REFERER'], which PHP fills from the Referer header of the request. It tells us which page the user navigated from. Because the header may be missing, read it with a default value:
$referer = $_SERVER['HTTP_REFERER'] ?? '';
Next, we use parse_url to parse the address:
if (!empty($referer)) {
    $urlParts = parse_url($referer);
    print_r($urlParts);
}
If the user jumped from https://gitbox.net/products/view?id=123 , the output will be similar to:
Array
(
    [scheme] => https
    [host] => gitbox.net
    [path] => /products/view
    [query] => id=123
)
To get only the referring hostname, pass PHP_URL_HOST as the second argument:
$host = parse_url($referer, PHP_URL_HOST);
echo "Source host: $host";
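Since the host lookup can return null or false, it may help to wrap it in a small function. The sketch below is one possible way to do so; refererHost and isInternalReferer are hypothetical helper names, not part of PHP.

```php
<?php
/**
 * Return the lower-cased host of a Referer value, or null when the value
 * is empty, malformed, or has no host part. (Hypothetical helper.)
 */
function refererHost(string $referer): ?string
{
    if ($referer === '') {
        return null;
    }
    $host = parse_url($referer, PHP_URL_HOST);
    // parse_url() gives null for a missing host and false for malformed input.
    return is_string($host) && $host !== '' ? strtolower($host) : null;
}

/** True when the referer's host equals $siteHost, ignoring case. */
function isInternalReferer(string $referer, string $siteHost): bool
{
    return refererHost($referer) === strtolower($siteHost);
}

// Usage with literal values; in a real request you would pass
// $_SERVER['HTTP_REFERER'] ?? '' instead.
var_dump(isInternalReferer('https://GitBox.net/products/view?id=123', 'gitbox.net'));
var_dump(isInternalReferer('https://example.com/', 'gitbox.net'));
var_dump(refererHost('/relative/path'));
```

Lower-casing matters because parse_url returns the host exactly as it was written, while hostnames are case-insensitive.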
You may also want to know which page the user came from. The path and the query string can be read the same way:
$path = parse_url($referer, PHP_URL_PATH);
$query = parse_url($referer, PHP_URL_QUERY);
echo "Path: $path\n";
echo "Query string: $query";
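You can further split the query string into an associative array with parse_str. Because $query is null when the referrer has no query string, fall back to an empty string:
parse_str($query ?? '', $queryParams);
print_r($queryParams);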
The output may be:
Array
(
    [id] => 123
)
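parse_str also URL-decodes values and turns bracketed names into arrays, which is worth knowing before reading individual parameters. The following sketch assumes a made-up referrer with an encoded value and a repeated tag[] parameter.

```php
<?php
// Hypothetical referer with an encoded value and an array-style parameter.
$referer = 'https://gitbox.net/search?q=php%20url&tag[]=a&tag[]=b';

$query = parse_url($referer, PHP_URL_QUERY);

// Guard against null (no query string) and false (malformed URL).
parse_str(is_string($query) ? $query : '', $params);

// Values arrive URL-decoded, and "tag[]" becomes a PHP array.
$search = $params['q'] ?? '';
$tags   = $params['tag'] ?? [];

// Read a missing parameter with a default and cast it to the expected type.
$page = (int) ($params['page'] ?? 1);

var_dump($search, $tags, $page);
```

Every value comes from the client, so treat it as untrusted input and validate it before use.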
Here is a complete example that parses and displays all information about the referring address:
<?php
// Read the Referer header; it is absent on direct visits.
$referer = $_SERVER['HTTP_REFERER'] ?? '';

if ($referer !== '') {
    // Note: escape these values with htmlspecialchars() if you output them as HTML.
    echo "Original Referer: $referer\n\n";

    // parse_url() returns false for seriously malformed URLs.
    $urlParts = parse_url($referer) ?: [];
    echo "Parsed URL structure:\n";
    print_r($urlParts);

    $host  = $urlParts['host'] ?? '';
    $path  = $urlParts['path'] ?? '';
    $query = $urlParts['query'] ?? '';

    echo "\nSource host: $host\n";
    echo "Source path: $path\n";
    echo "Query string: $query\n";

    // Split the query string into an associative array.
    parse_str($query, $queryParams);
    echo "Parsed query parameters:\n";
    print_r($queryParams);
} else {
    echo "No source information (Referer is not set)";
}
?>
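Keep the following caveats in mind when relying on the Referer header:
Referer is not always present: some browsers and request tools do not send it, and users may disable it through privacy settings. It is also missing when the user types the address directly or opens a bookmark.
Referer can be forged: the client controls the header, so do not use it as the only basis for a security decision.
Cross-origin requests may carry a truncated Referer: modern browsers default to the strict-origin-when-cross-origin referrer policy, so a request arriving from another site typically carries only the origin (scheme and host), without the path or query string.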
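The caveats above suggest handling a missing, malformed, or origin-only Referer explicitly. This is a minimal sketch of one way to do that; summarizeReferer and the keys of the array it returns are conventions invented for this example.

```php
<?php
/**
 * Summarise a Referer value defensively. (Hypothetical helper; the
 * array keys are this sketch's own convention.)
 */
function summarizeReferer(?string $referer): array
{
    $parts = ($referer === null || $referer === '') ? false : parse_url($referer);

    // Missing header, malformed URL, or a value without a host part.
    if (!is_array($parts) || !isset($parts['host'])) {
        return ['known' => false, 'host' => null, 'path' => null, 'originOnly' => false];
    }

    $path = $parts['path'] ?? '/';

    return [
        'known' => true,
        'host'  => strtolower($parts['host']),
        'path'  => $path,
        // Only "scheme://host/" arrived. This may be a truncated cross-origin
        // Referer, but it cannot be told apart from a real home-page visit.
        'originOnly' => $path === '/' && !isset($parts['query']),
    ];
}

print_r(summarizeReferer(null));
print_r(summarizeReferer('https://example.com/'));
print_r(summarizeReferer('https://gitbox.net/products/view?id=123'));
```

Even when 'known' is true, the value is only a hint supplied by the client and should not be treated as proof of where the request came from.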
Typical uses of referrer analysis include:
Ad tracking: determine whether the user arrived through an ad link.
Hotlink protection: check the Referer to reject resource requests that do not originate from your own site.
User behavior analysis: combine the referring addresses recorded in your logs with other data to analyze traffic sources.
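As an illustration of the analysis use case, the sketch below tallies referring hosts from a list of raw Referer values. The function name countRefererHosts, the "(direct)" label, and the sample data are all assumptions made for this example.

```php
<?php
/**
 * Count visits per referring host. (Hypothetical helper; "(direct)" is this
 * sketch's label for requests with no usable Referer.)
 *
 * @param string[] $referers Raw Referer values, e.g. read from an access log.
 * @return array<string,int> Host => count, highest count first.
 */
function countRefererHosts(array $referers): array
{
    $counts = [];
    foreach ($referers as $referer) {
        $host = $referer === '' ? null : parse_url($referer, PHP_URL_HOST);
        $key  = is_string($host) && $host !== '' ? strtolower($host) : '(direct)';
        $counts[$key] = ($counts[$key] ?? 0) + 1;
    }
    arsort($counts); // sort by count, descending, keeping the host keys
    return $counts;
}

// Made-up sample data standing in for log records.
$sample = [
    'https://gitbox.net/products/view?id=123',
    'https://example.com/ad?campaign=spring',
    'https://gitbox.net/cart',
    '',
];
print_r(countRefererHosts($sample));
```

Grouping by host rather than by full URL keeps the report stable even when cross-origin referrers arrive without a path.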
By combining parse_url with $_SERVER['HTTP_REFERER'], we can analyze where users come from and use that information to support website operations and security. Because both the URL and the header are supplied by the client, always validate and filter the data before using it.