Efficient Targeted Web Data Scraping with PHP and phpSpider Tutorial

gitbox 2025-06-27

Introduction

phpSpider is a lightweight PHP crawler framework. You write a small set of rules, and it scrapes the matching information from web pages and saves it to a local database. It suits developers who need to extract data in a specific format, and it removes much of the boilerplate involved in building a crawler.

Installing phpSpider

Before using phpSpider, you need to install the framework. phpSpider manages dependencies via Composer, so make sure Composer is installed on your system.

composer create-project phpspider/phpspider

After installation, you can verify the framework by running a test script:

cd phpspider
php tests/simple_test.php

Scraping Website Data

phpSpider places no cap on the number of pages or sites you scrape. The steps below demonstrate how to scrape data from a simple website.

Creating a Project

First, create a new project by executing the command below. phpSpider will generate the corresponding project folder automatically:

php phpspider startproject myproject

Writing Scraping Rules

Once the project is created, define scraping rules that instruct phpSpider how to extract data from the target website. Inside the myproject/rules directory, create a rule.php file with content similar to the example below:

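<?php
// rule.php: returns the scraping configuration as an array.
return [
    // Pages where the crawl starts.
    'start_urls' => [
        'http://www.example.com',
    ],
    'rules' => [
        [
            'type' => 'regex',
            // Matches an <a> element directly followed by </div>;
            // capture group 2 holds the link text.
            'pattern' => '/(<a.*?>(.*?)<\/a><\/div>)/',
            'id' => 1,
            // Fields to extract from each match.
            'fields' => [
                [
                    'name' => 'title',
                    'selector' => 'text()', // link text
                ],
                [
                    'name' => 'link',
                    'selector' => '@href', // href attribute
                ],
            ],
        ],
    ],
];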
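
To get a feel for what a rule like this would extract, you can try the same pattern and selectors in plain PHP, without phpSpider. The sketch below is only one plausible reading of the rule: it assumes the regex isolates each link fragment and that text() and @href are then evaluated as XPath expressions against that fragment. The sample markup and the extractLinks() helper are hypothetical, and the code needs PHP's DOM extension.

```php
<?php
// Hypothetical sample markup; not taken from any real site.
$html = '<div class="item"><a href="/post/1">First post</a></div>'
      . '<div class="item"><a href="/post/2">Second post</a></div>';

// Same pattern as in rule.php: an <a> element directly followed by </div>.
$pattern = '/(<a.*?>(.*?)<\/a><\/div>)/';

/**
 * Extract title/link pairs from the fragments matched by $pattern.
 * extractLinks() is an illustrative helper, not a phpSpider function.
 */
function extractLinks(string $html, string $pattern): array
{
    $items = [];
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);

    foreach ($matches as $match) {
        // $match[1] is "<a ...>...</a></div>"; drop the trailing </div>.
        $anchor = substr($match[1], 0, -strlen('</div>'));

        $doc = new DOMDocument();
        $doc->loadHTML($anchor);
        $xpath = new DOMXPath($doc);

        foreach ($xpath->query('//a') as $node) {
            $items[] = [
                // The two selectors from the rule, evaluated relative to <a>.
                'title' => $xpath->evaluate('string(text())', $node),
                'link'  => $xpath->evaluate('string(@href)', $node),
            ];
        }
    }

    libxml_clear_errors();
    return $items;
}

$items = extractLinks($html, $pattern);
print_r($items);
```

For the sample markup this yields two items, with titles "First post" and "Second post" and links /post/1 and /post/2. Note that the pattern only matches links that are immediately followed by a closing </div>, so links elsewhere on a page would be skipped.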

Running the Spider

After configuring the rules, you can run phpSpider to start scraping data:

php phpspider run myproject
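
Before starting a crawl, it can help to sanity-check the rule array in plain PHP, since a malformed URL or a regex that does not compile is easier to catch up front. The sketch below is an assumption about what is worth checking, based only on the keys used in the rule.php example above; validateRules() is a hypothetical helper, not part of phpSpider.

```php
<?php
/**
 * Return a list of problems found in a rule array shaped like rule.php.
 * validateRules() is an illustrative helper, not part of phpSpider.
 */
function validateRules(array $config): array
{
    $problems = [];

    if (empty($config['start_urls'])) {
        $problems[] = 'No start_urls defined';
    }
    foreach ($config['start_urls'] ?? [] as $url) {
        if (filter_var($url, FILTER_VALIDATE_URL) === false) {
            $problems[] = "Invalid start URL: $url";
        }
    }

    foreach ($config['rules'] ?? [] as $i => $rule) {
        // preg_match() returns false when the pattern fails to compile;
        // @ hides the warning so only our own message is reported.
        if (@preg_match($rule['pattern'] ?? '', '') === false) {
            $problems[] = "Rule $i: pattern does not compile";
        }
        if (empty($rule['fields'])) {
            $problems[] = "Rule $i: no fields defined";
        }
    }

    return $problems;
}

// In a real project you might load the file instead:
// $config = require 'myproject/rules/rule.php';
$config = [
    'start_urls' => ['http://www.example.com'],
    'rules' => [[
        'type'    => 'regex',
        'pattern' => '/(<a.*?>(.*?)<\/a><\/div>)/',
        'fields'  => [['name' => 'title', 'selector' => 'text()']],
    ]],
];

$problems = validateRules($config);
if ($problems === []) {
    echo 'Rules look OK', PHP_EOL;
} else {
    echo implode(PHP_EOL, $problems), PHP_EOL;
}
```

An empty result only means the array is well-formed; it does not show that the pattern matches anything on the target site.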

Conclusion

This article walked through the phpSpider workflow: installing the framework, creating a project, configuring scraping rules, and running the crawler. Because a crawl is described by a short rule file rather than custom code, phpSpider is well suited to quickly building targeted scraping projects.