Complete Guide to Integrating Baidu Speech Recognition API with PHP (Code Examples & Key Tips)

gitbox 2025-06-06

1. Introduction

Speech recognition has become an increasingly important feature in modern smart applications, especially in areas such as voice assistants, smart customer service, and voice input. Baidu’s speech recognition API provides developers with a powerful and stable voice-to-text solution. This guide walks you through how to implement it using PHP.

2. Overview of Baidu Speech Recognition API

The Baidu API converts audio into text. It supports several languages and accepts more than one audio input format; this guide uses raw PCM. Developers send audio data to the API in an HTTP request and receive the transcription as JSON, which their application can then process.

To successfully use the API, you need to configure the request parameters correctly and parse the JSON response to extract the recognized text.
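As a minimal sketch of the parsing step, the helper below decodes a response body and returns the first recognized string. The function name extractTranscript is hypothetical and not part of any Baidu SDK. It assumes the response carries an err_no field that is 0 on success, an err_msg description, and a result array of candidate transcripts; check these field names against the current Baidu documentation.

```php
<?php
/**
 * Hypothetical helper: pull the first transcript out of a response body.
 * Assumed fields: err_no (0 on success), err_msg, result (array of strings).
 */
function extractTranscript(string $json): string
{
    $data = json_decode($json, true);
    if (!is_array($data)) {
        throw new RuntimeException('Response is not valid JSON');
    }

    // A non-zero (or missing) err_no is treated as a failed recognition.
    if (($data['err_no'] ?? -1) !== 0) {
        throw new RuntimeException('Recognition failed: ' . ($data['err_msg'] ?? 'unknown error'));
    }

    $candidates = $data['result'] ?? null;
    if (!is_array($candidates) || !isset($candidates[0])) {
        throw new RuntimeException('Response contains no transcript');
    }

    return (string) $candidates[0];
}

// Example with a hand-written response body (not real API output).
$sample = '{"err_no":0,"err_msg":"success.","result":["hello world"]}';
echo extractTranscript($sample), "\n";
```

Throwing an exception on failure keeps the calling code from mistaking an error payload for an empty transcript.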

3. Implementation Steps in PHP

3.1 Preparation

Before calling the API, you must register an account on Baidu Cloud, create an application, and obtain the following credentials:

AppID
API Key
Secret Key

These credentials are essential for authentication. The API Key and Secret Key are exchanged for an access token, and that token is what you include in each recognition request. Also make sure the speech recognition service is enabled for your application.

3.2 PHP Code to Call the API


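<?php
// Baidu speech recognition endpoint (raw audio upload)
$endpoint = 'http://vop.baidu.com/server_api';

// Required parameters
$cuid   = "123456789"; // Unique user or device ID
$format = "pcm";       // Audio format
$rate   = 16000;       // Sample rate in Hz (raw upload assumes mono audio)
$token  = "24.f601973d83600bb9532f8c32ed61c45c.2592000.1570309632.282335-17098763"; // Sample access token; replace with your own

// Read the audio file contents
$audio = file_get_contents("test.pcm");
if ($audio === false) {
    exit("Could not read test.pcm\n");
}

// For a raw upload, cuid and token are sent as query parameters, not as headers
$url = $endpoint . '?' . http_build_query(array('cuid' => $cuid, 'token' => $token));

// The sample rate is passed as a parameter of the Content-Type header
$header = array(
    "Content-Type: audio/" . $format . ";rate=" . $rate,
    "Content-Length: " . strlen($audio)
);

$options = array(
    'http' => array(
        'method'  => 'POST',
        'header'  => implode("\r\n", $header),
        'content' => $audio,
        'timeout' => 30
    )
);

$context = stream_context_create($options);

// Send the request and decode the JSON response
$response = file_get_contents($url, false, $context);
if ($response === false) {
    exit("Request to the Baidu API failed\n");
}
$result = json_decode($response, true);

// err_no is 0 on success; the transcript candidates are in "result"
if (is_array($result) && isset($result['err_no']) && $result['err_no'] === 0) {
    echo $result['result'][0], "\n";
} else {
    echo "Recognition failed: ", isset($result['err_msg']) ? $result['err_msg'] : 'invalid response', "\n";
}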

The code can be broken down into three steps:

Set API endpoint and parameters
Read the PCM audio file
Send a POST request and parse the returned JSON

Note that the access_token is required for authentication and must be valid and unexpired, so a hardcoded token like the one in the sample only works until it expires. Refer to the official Baidu documentation for how to generate this token.
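Here is a rough sketch of requesting a token and tracking its expiry. It assumes Baidu's OAuth 2.0 client-credentials endpoint at https://aip.baidubce.com/oauth/2.0/token and a JSON reply containing access_token and expires_in; confirm both in the official documentation before relying on them. The function names buildTokenUrl, parseTokenResponse, and tokenIsFresh are hypothetical, as are the placeholder credentials.

```php
<?php
// Build the token request URL from the application's credentials.
// The endpoint and parameter names are assumptions, see the note above.
function buildTokenUrl(string $apiKey, string $secretKey): string
{
    return 'https://aip.baidubce.com/oauth/2.0/token?' . http_build_query(array(
        'grant_type'    => 'client_credentials',
        'client_id'     => $apiKey,
        'client_secret' => $secretKey,
    ), '', '&');
}

// Decode the token reply and turn the relative lifetime into an absolute expiry time.
function parseTokenResponse(string $json, int $now): array
{
    $data = json_decode($json, true);
    if (!is_array($data) || empty($data['access_token'])) {
        throw new RuntimeException('Token response did not contain access_token');
    }
    return array(
        'token'      => $data['access_token'],
        'expires_at' => $now + (int) ($data['expires_in'] ?? 0),
    );
}

// True while the token has more than $margin seconds of life left.
function tokenIsFresh(array $token, int $now, int $margin = 60): bool
{
    return $token['expires_at'] - $margin > $now;
}

// Usage (performs a network call, so it is left commented out):
// $json  = file_get_contents(buildTokenUrl('YOUR_API_KEY', 'YOUR_SECRET_KEY'));
// $token = parseTokenResponse($json, time());
// if (!tokenIsFresh($token, time())) { /* request a new token */ }
```

Storing the absolute expiry time alongside the token lets you cache it (in a file or database, for instance) and request a new one only when needed, rather than before every recognition call.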

3.3 Important Considerations

Audio File Requirements: The file must be 16-bit PCM, mono, with a 16 kHz sample rate, matching the format and rate values sent with the request.
User Identifier (cuid): Should uniquely identify the caller; a user ID or device ID is a common choice.
Access Token: Every request must carry a valid token, which the service uses for access control. The API also enforces daily usage limits.
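
The audio requirements above can be checked locally before uploading. The sketch below is a hypothetical pre-flight check; pcmDurationSeconds and validatePcm are illustrative names. It relies only on arithmetic: headerless 16-bit mono PCM at 16 kHz uses 2 bytes per sample, so 32,000 bytes per second. The 60-second ceiling is an assumption about the service's limit for a single request and should be confirmed in Baidu's documentation.

```php
<?php
const PCM_SAMPLE_RATE      = 16000; // samples per second
const PCM_BYTES_PER_SAMPLE = 2;     // 16-bit samples, one channel
const PCM_MAX_SECONDS      = 60;    // assumed upper limit per request

// Duration of headerless 16-bit mono PCM data, in seconds.
function pcmDurationSeconds(int $byteLength): float
{
    return $byteLength / (float) (PCM_SAMPLE_RATE * PCM_BYTES_PER_SAMPLE);
}

// Returns a list of problems found; an empty array means the data looks plausible.
function validatePcm(string $audio): array
{
    $problems = array();
    $length = strlen($audio);

    if ($length === 0) {
        $problems[] = 'audio is empty';
    }
    if ($length % PCM_BYTES_PER_SAMPLE !== 0) {
        $problems[] = 'byte length is odd, so the data is not whole 16-bit samples';
    }
    if (substr($audio, 0, 4) === 'RIFF') {
        $problems[] = 'data starts with a RIFF (WAV) header, not raw PCM';
    }
    if (pcmDurationSeconds($length) > PCM_MAX_SECONDS) {
        $problems[] = 'audio is longer than the assumed ' . PCM_MAX_SECONDS . ' second limit';
    }

    return $problems;
}

// Example: one second of silence passes the checks.
$silence = str_repeat("\0", 32000);
var_dump(validatePcm($silence) === array());
```

These checks cannot prove that a file really was recorded at 16 kHz in mono, since raw PCM carries no metadata. They only catch common mistakes such as uploading a WAV file under the pcm format or sending an overly long recording.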

4. Conclusion

This article explained how to implement speech recognition in PHP using Baidu’s API, including how to configure parameters, upload audio data, and handle the results. With proper integration, you can easily add speech-to-text functionality to your application and enhance user experience with voice capabilities.