The curl_multi_* functions in PHP are commonly used to perform multiple HTTP requests concurrently, especially in high-concurrency scenarios. curl_multi_close is the function that closes a multi cURL handle created by curl_multi_init(). Although it works well in many scenarios, can it become a performance bottleneck in high-concurrency environments? This article analyzes how curl_multi_close works, discusses the performance problems it may cause, and offers optimization suggestions.
The curl_multi_* series of functions provides a mechanism that lets PHP handle multiple HTTP requests at the same time. The typical steps are:
curl_multi_init() : Initializes a multi cURL handle.
curl_multi_add_handle() : Adds individual cURL handles to the multi handle.
curl_multi_exec() : Executes all of the cURL requests.
curl_multi_getcontent() : Retrieves the response content of a request.
curl_multi_close() : Closes the multi handle and releases its resources.
The job of curl_multi_close is to close the multi handle and release the corresponding resources. It should be called after all requests have completed, the relevant data has been retrieved, and the individual handles have been removed and closed.
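In skeletal form, the typical call order looks like this; it is only a minimal sketch, with error handling omitted and placeholder example.com URLs:

$ch1 = curl_init("https://example.com/a");         // placeholder URLs
$ch2 = curl_init("https://example.com/b");
curl_setopt($ch1, CURLOPT_RETURNTRANSFER, true);   // required for curl_multi_getcontent()
curl_setopt($ch2, CURLOPT_RETURNTRANSFER, true);

$multiHandle = curl_multi_init();                  // 1. create the multi handle
curl_multi_add_handle($multiHandle, $ch1);         // 2. attach the easy handles
curl_multi_add_handle($multiHandle, $ch2);

$active = null;
do {                                               // 3. drive the transfers
    curl_multi_exec($multiHandle, $active);
    if ($active) {
        curl_multi_select($multiHandle);
    }
} while ($active);

$response1 = curl_multi_getcontent($ch1);          // 4. read the buffered responses
$response2 = curl_multi_getcontent($ch2);

curl_multi_remove_handle($multiHandle, $ch1);      // 5. detach and close the easy handles
curl_multi_remove_handle($multiHandle, $ch2);
curl_close($ch1);
curl_close($ch2);

curl_multi_close($multiHandle);                    // 6. finally close the multi handle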
However, in high concurrency environments, especially when processing hundreds of thousands of requests, the execution of curl_multi_close can become a performance bottleneck. Why? Let's take a deeper look.
In high-concurrency environments, curl_multi_close and the accompanying handle cleanup have to close all cURL handles one by one, which can involve a large number of memory release operations. For each request, cURL allocates memory to store connection information, request data, response data, and so on. When the number of requests is very large, releasing these resources one by one can lead to memory fragmentation and consume a significant amount of CPU time for cleanup, affecting the overall performance of the program.
Each cURL request is ultimately carried over an operating-system network connection (a socket), and libcurl keeps completed connections in its connection cache for possible reuse. When a large number of requests have finished, the shutdown triggered by curl_multi_close (together with closing the individual handles) tears down many of these connections, which takes time to complete. If the request volume is very large, this connection-closing phase can introduce delays and affect the system's response time.
Although PHP itself is single-threaded, libcurl multiplexes all of the concurrent transfers internally while curl_multi_exec() is driven in a loop. curl_multi_close can only be called once every transfer has finished, so if some requests take a long time to respond, the entire cleanup phase is delayed until the slowest request completes, resulting in degraded performance.
In order to avoid curl_multi_close becoming a performance bottleneck, we can take the following optimization measures:
Try to avoid initiating too many concurrent requests at once; instead, use a request-pool-like mechanism to execute the requests in batches so that only a bounded number of transfers is in flight at any time. Within the processing loop, curl_multi_select() lets the script wait for network activity instead of busy-looping on curl_multi_exec(). The example below uses this select-based loop, and a batched variant is sketched right after it.
$multiHandle = curl_multi_init();
$handles = [];

for ($i = 0; $i < 1000; $i++) {
    $url = "https://gitbox.net/api/v1/data/{$i}";
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($multiHandle, $ch);
    $handles[] = $ch;
}

$active = null;
do {
    $mrc = curl_multi_exec($multiHandle, $active);
    if ($active) {
        // Wait for activity on any transfer instead of busy-looping
        curl_multi_select($multiHandle);
    }
} while ($active && $mrc == CURLM_OK);

// Remove and close the individual handles, then close the multi handle
foreach ($handles as $ch) {
    curl_multi_remove_handle($multiHandle, $ch);
    curl_close($ch);
}
curl_multi_close($multiHandle);
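If the total number of requests is very large, the same loop can be applied batch by batch, so that only a limited number of handles exists at any time and the cleanup cost is spread out. Below is a rough sketch of such a request pool; the batch size of 100 is an assumption to be tuned for your environment:

$allUrls = [];
for ($i = 0; $i < 1000; $i++) {
    $allUrls[] = "https://gitbox.net/api/v1/data/{$i}";
}

$batchSize = 100; // assumed batch size; tune it to your workload

foreach (array_chunk($allUrls, $batchSize) as $batch) {
    $multiHandle = curl_multi_init();
    $handles = [];

    foreach ($batch as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($multiHandle, $ch);
        $handles[] = $ch;
    }

    // Drive only this batch to completion
    $active = null;
    do {
        $mrc = curl_multi_exec($multiHandle, $active);
        if ($active) {
            curl_multi_select($multiHandle);
        }
    } while ($active && $mrc == CURLM_OK);

    // Clean up the batch immediately, so resources are released incrementally
    foreach ($handles as $ch) {
        // $body = curl_multi_getcontent($ch); // process the response here
        curl_multi_remove_handle($multiHandle, $ch);
        curl_close($ch);
    }
    curl_multi_close($multiHandle);
}

Alternatively, a single multi handle can be reused across all batches and closed only once at the end; since libcurl's connection cache lives in the multi handle, this also keeps connections alive between batches.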
To keep requests from blocking for too long, set an appropriate timeout for each request so that a request that responds too slowly is cut off in time and the remaining requests can continue to be processed. Appropriate timeout settings reduce wasted system resources.
curl_setopt($ch, CURLOPT_TIMEOUT, 10); // Set the request timeout to 10 seconds
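In the context of the earlier loop, these options are set on each handle before it is added to the multi handle; the sketch below (reusing $batch, $multiHandle and $handles from the batching example) also adds a connect timeout, where the 5-second value is an illustrative assumption:

foreach ($batch as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // give up establishing the connection after 5 seconds
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);       // limit the whole request to 10 seconds
    curl_multi_add_handle($multiHandle, $ch);
    $handles[] = $ch;
}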
For scenarios with a large number of repeated requests to the same host, persistent (keep-alive) connections and connection reuse can substantially reduce the overhead of repeatedly establishing connections. cURL reuses connections by default; the CURLOPT_FORBID_REUSE option forces a connection to be closed after the request finishes, so leave it disabled when you want connections to be reused.
curl_setopt($ch, CURLOPT_FORBID_REUSE, false); // Allow the connection to be kept alive and reused (the default behavior)
Finally, make sure the server can handle a large number of concurrent connections. Performance can also be improved by increasing PHP's maximum execution time, raising the operating system's file-descriptor (handle) limits, and so on.
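As an illustration, a long-running batch script might raise PHP's own limits at the top of the script; the values below are assumptions, and the operating system's file-descriptor limit has to be raised outside PHP (for example with ulimit -n on Linux):

set_time_limit(0);               // remove PHP's execution time limit for this batch script (illustrative)
ini_set('memory_limit', '512M'); // allow room for many buffered responses (illustrative value)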
In high-concurrency environments, curl_multi_close can indeed become a performance bottleneck: when the number of requests is very large, the release of resources and the shutdown of connections may introduce latency and degrade performance. However, by reasonably limiting the level of concurrency, setting request timeouts, reusing connections, and tuning the server, system performance can be improved significantly and curl_multi_close can be kept from becoming a bottleneck.