In day-to-day development we often need to parse a URL and extract specific parts of it, such as the protocol, host name, path, or query parameters. In some scenarios we only care about the file name contained in the path, for example the image or document name in a resource address.
In PHP, the built-in parse_url function breaks a URL into its components, and the basename function can then extract the file name from the path component. This article explains the process with examples.
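parse_url is PHP's built-in function for parsing URLs. By default it returns an associative array containing only those of the following keys that are actually present in the URL (for a seriously malformed URL it may return false instead):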
scheme: such as http, https
host: host name
port: port number
user: username
pass: password
path: path part
query: query string
fragment: the fragment (anchor) after the hash mark #
The basic usage is as follows:
$url = "https://www.gitbox.net/images/photo.jpg?size=large";
$parts = parse_url($url);
print_r($parts);
The output is:
Array
(
    [scheme] => https
    [host] => www.gitbox.net
    [path] => /images/photo.jpg
    [query] => size=large
)
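When only one component is needed, parse_url also accepts an optional second argument such as PHP_URL_PATH and returns that component directly instead of the whole array. The following is a minimal sketch that reuses the example URL above; the variable names are chosen only for illustration.

```php
<?php
// Same example URL as above.
$url = "https://www.gitbox.net/images/photo.jpg?size=large";

// With a component constant, parse_url returns just that part as a string,
// or null when the URL does not contain that component.
$path     = parse_url($url, PHP_URL_PATH);     // "/images/photo.jpg"
$query    = parse_url($url, PHP_URL_QUERY);    // "size=large"
$fragment = parse_url($url, PHP_URL_FRAGMENT); // null, the URL has no "#..." part

echo $path, "\n";
echo $query, "\n";
var_dump($fragment);
```

This form avoids the isset check on the array, but the result still has to be tested for null before it is used.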
PHP's basename function returns the trailing name component of a path, which is usually the file name. For example:
$path = "/images/photo.jpg";
$filename = basename($path); // The result is "photo.jpg"
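Two details of basename are worth keeping in mind: an optional second argument strips a known suffix, and a trailing slash is ignored, so a directory path yields the directory name rather than an empty string. A short sketch, with paths chosen purely for illustration:

```php
<?php
$path = "/images/photo.jpg";

// Plain call: the last component of the path.
$full = basename($path);          // "photo.jpg"

// The optional second argument removes a matching suffix.
$stem = basename($path, ".jpg");  // "photo"

// A trailing slash is ignored, so the last directory name is returned.
$dir = basename("/images/");      // "images"

echo $full, "\n", $stem, "\n", $dir, "\n";
```

The last case matters for URLs whose path ends in a slash: the returned value is a directory name, not a file name.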
We can first use parse_url to extract the path from the URL and then pass it to basename to obtain the file name. Here is the complete sample code:
function extractFilenameFromUrl($url) {
    // parse_url returns an array of components, or false for a malformed URL
    $parts = parse_url($url);
    if (!isset($parts['path'])) {
        return null; // No path component, so no file name can be extracted
    }
    return basename($parts['path']);
}
// Example
$url = "https://cdn.gitbox.net/assets/docs/manual.pdf?download=true";
$filename = extractFilenameFromUrl($url);
echo "The extracted file name is: $filename"; // Output: The extracted file name is: manual.pdf
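Note that parse_url does not decode percent-encoded characters, so a file name containing spaces or non-ASCII characters comes back in its encoded form. If the decoded name is needed, one possible approach is to apply rawurldecode after basename. The function name and the URL below are hypothetical and only illustrate the idea:

```php
<?php
// Hypothetical helper: like extractFilenameFromUrl, but decodes %XX sequences.
function extractDecodedFilenameFromUrl(string $url): ?string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return null; // Malformed URL or no path component
    }
    // Decode after basename so an encoded slash (%2F) is not treated
    // as a path separator.
    return rawurldecode(basename($path));
}

// Hypothetical URL with an encoded space in the file name.
$url = "https://cdn.gitbox.net/assets/docs/annual%20report.pdf";
$decoded = extractDecodedFilenameFromUrl($url);
echo $decoded, "\n"; // annual report.pdf
```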
Sometimes the incoming URL may not have a file name specified, such as: