Use parse_url to analyze the source of the user's requested address

gitbox 2025-05-26

When developing a website, it is often useful to know which page a user came from, especially for statistics, redirect control, permission checks and logging. PHP provides a built-in function, parse_url , that splits a URL into its components, such as scheme, host, path and query string. This article shows how to use parse_url to analyze the source of the user's requested address.

1. Introduction to parse_url function

parse_url is a function used in PHP to parse URLs. Its basic syntax is as follows:

 parse_url(string $url, int $component = -1): int|string|array|null|false
  • $url is the URL string to parse.

  • $component is an optional parameter that restricts the result to a single part of the URL (for example PHP_URL_HOST or PHP_URL_PATH ).

  • Without $component , the return value is an associative array containing only the components that are present in the URL. With $component , the return value is that component as a string (an int for PHP_URL_PORT ), or null if the URL does not contain it. For seriously malformed URLs the function returns false . Note that parse_url does not validate the URL; it only splits it into parts.
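The different return types are easiest to see side by side. The following is a minimal sketch; the sample URLs (including the port and fragment) are made up for illustration and do not come from a real request.

```php
<?php
// Hypothetical sample URL, used only for illustration.
$url = 'https://gitbox.net:8080/products/view?id=123#reviews';

// No $component: an associative array of the parts that are present
// (scheme, host, port, path, query and fragment for this URL).
$parts = parse_url($url);

// With $component: a single value. The port comes back as an int.
$host = parse_url($url, PHP_URL_HOST);
$port = parse_url($url, PHP_URL_PORT);

// A component that is absent yields null, not an empty string.
$missingQuery = parse_url('https://gitbox.net/products', PHP_URL_QUERY);

// A seriously malformed URL (here: no host after the scheme) yields false.
$malformed = parse_url('http:///example.com');

var_dump($host, $port, $missingQuery, $malformed);
```

Because null and false are both possible, strict comparisons ( === ) are the safer way to test the result.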

2. Obtain the user source URL

The browser sends the address of the referring page in the Referer request header, which PHP exposes as $_SERVER['HTTP_REFERER'] . From it we can tell which page the user came from.

 $referer = $_SERVER['HTTP_REFERER'] ?? '';

Next, we use parse_url to parse the address:

 if (!empty($referer)) {
    $urlParts = parse_url($referer);
    print_r($urlParts);
}

If the user arrived from https://gitbox.net/products/view?id=123 , the output will look like this:

 Array
(
    [scheme] => https
    [host] => gitbox.net
    [path] => /products/view
    [query] => id=123
)

3. Extract specific information

1. Extract the host name

To get the source hostname, you can write it like this:

 $host = parse_url($referer, PHP_URL_HOST);
echo "Source host: $host";

2. Get path and query parameters

You may also want to know which page the user came from. The following code extracts the path and the query string:

 $path = parse_url($referer, PHP_URL_PATH);
$query = parse_url($referer, PHP_URL_QUERY);
echo "Path: $path\n";
echo "Query string: $query";

You can also further parse the query parameters:

 // $query is null when the URL has no query string, so fall back to ''.
parse_str($query ?? '', $queryParams);
print_r($queryParams);

The output may be:

 Array
(
    [id] => 123
)
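Two details of parse_str are worth keeping in mind: values are URL-decoded, and every scalar value is a string. The sketch below uses a made-up query string to show both, along with the bracket syntax for arrays.

```php
<?php
// Hypothetical query string, as parse_url(..., PHP_URL_QUERY) might return it.
$query = 'id=123&q=hello%20world&tags[]=php&tags[]=url';

parse_str($query, $queryParams);

// Values are URL-decoded and always strings; bracketed names become arrays.
var_dump($queryParams['id']);    // string "123", not int 123
var_dump($queryParams['q']);     // string "hello world"
var_dump($queryParams['tags']);  // array containing "php" and "url"
```

If a parameter such as id is needed as a number, convert and validate it explicitly rather than relying on the raw string.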

4. Complete example

Here is a complete example to analyze and display all information about the user's source address:

 <?php
$referer = $_SERVER['HTTP_REFERER'] ?? '';

if ($referer) {
    // The Referer is user-controlled; escape it (e.g. with htmlspecialchars)
    // before printing it into an HTML page.
    echo "Original Referer: $referer\n\n";

    // parse_url returns false for seriously malformed URLs; fall back to an empty array.
    $urlParts = parse_url($referer) ?: [];
    echo "Parsed URL structure:\n";
    print_r($urlParts);

    $host = $urlParts['host'] ?? '';
    $path = $urlParts['path'] ?? '';
    $query = $urlParts['query'] ?? '';

    echo "\nSource host: $host\n";
    echo "Source path: $path\n";
    echo "Query string: $query\n";

    parse_str($query, $queryParams);
    echo "Parsed query parameters:\n";
    print_r($queryParams);
} else {
    echo "No source information (Referer does not exist)";
}
?>

5. Things to note

  1. The Referer does not always exist : some browsers or request tools do not send a Referer, and users can disable it in their privacy settings.

  2. The Referer can be forged : do not use it as the only basis for security decisions.

  3. Cross-origin requests : under the default Referrer-Policy of modern browsers ( strict-origin-when-cross-origin ), a cross-origin request carries only the origin (scheme, host and port), not the full path and query string.
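One way to account for these caveats is to funnel every access through a small helper that treats a missing, malformed or host-less Referer the same way. The function name refererHost below is hypothetical; it is a sketch of the idea, not a complete validation routine.

```php
<?php
/**
 * Hypothetical helper: returns the lower-cased Referer host, or null when the
 * header is missing, malformed, or has no host part.
 */
function refererHost(array $server): ?string
{
    $referer = $server['HTTP_REFERER'] ?? '';
    if (!is_string($referer) || $referer === '') {
        return null;
    }

    // parse_url gives false for malformed input and null when there is no host.
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }

    return strtolower($host);
}

// Typical call site: pass the superglobal in, handle null explicitly.
$host = refererHost($_SERVER);
echo $host === null ? "No usable Referer\n" : "Referer host: $host\n";
```

Passing $_SERVER in as an argument keeps the helper easy to exercise with sample arrays. Even a non-null result is still only a hint, since the header can be forged.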

6. Examples of application scenarios

  • Ad tracking : determine whether the user arrived through an ad link.

  • Hotlink protection : check the Referer and reject resource requests that do not originate from this site.

  • User behavior analysis : combine the source addresses recorded in the logs with other data for analysis.
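The scenarios above mostly come down to comparing the Referer host against known hosts. The sketch below shows one possible shape for that; the function name classifyReferer , the category labels and the ad host ads.example.com are all assumptions made for illustration.

```php
<?php
/**
 * Hypothetical classifier for a Referer value.
 * $siteHost and $adHosts are assumed to come from your own configuration.
 */
function classifyReferer(string $referer, string $siteHost, array $adHosts): string
{
    $host = $referer === '' ? null : parse_url($referer, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return 'none';       // missing or unparseable Referer
    }

    $host = strtolower($host);
    if ($host === strtolower($siteHost)) {
        return 'internal';   // same site: a hotlink check would allow this
    }
    if (in_array($host, $adHosts, true)) {
        return 'ad';         // arrived from a configured ad host
    }
    return 'external';       // any other site
}

// ads.example.com is a made-up ad host.
$adHosts = ['ads.example.com'];
echo classifyReferer('https://gitbox.net/products/view?id=123', 'gitbox.net', $adHosts), "\n";
echo classifyReferer('https://ads.example.com/click?c=42', 'gitbox.net', $adHosts), "\n";
echo classifyReferer('', 'gitbox.net', $adHosts), "\n";
```

The host is compared exactly rather than with a substring search on the full URL, because a substring check for gitbox.net would also match a host such as gitbox.net.evil.example . How to treat the 'none' case (allow or reject) is a policy decision, since legitimate visitors may send no Referer at all.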

Conclusion

By combining parse_url with $_SERVER['HTTP_REFERER'] , we can analyze where a user came from and use that information to support website operations and security. Because both URLs and request headers are user-controlled input, always validate and filter them before use to prevent security issues.