
PHP DOMDocument Class Tutorial: Efficiently Handle HTML and XML Documents

gitbox 2025-06-28

PHP DOMDocument Class Tutorial

In PHP, the DOMDocument class represents an entire HTML or XML document as a tree of nodes that can be parsed, modified, and serialized. This article walks through using DOMDocument to load, manipulate, and save HTML and XML documents.

Creating a DOMDocument Object

The first step is to instantiate a DOMDocument object, which will hold the document tree that we operate on.


$dom = new DOMDocument();

Once instantiated, we can load an HTML or XML document for further manipulation.
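
The constructor also accepts two optional arguments, the XML version and the character encoding. The following is a minimal sketch of that form; the values '1.0' and 'UTF-8' are only illustrative choices, not something the rest of this article depends on.

```php
<?php
// Optional constructor arguments: XML version and character encoding.
$dom = new DOMDocument('1.0', 'UTF-8');

// Both values are exposed as properties on the document object.
echo $dom->xmlVersion, ' ', $dom->encoding, PHP_EOL;
```

Setting the encoding up front mainly matters when the document is later written out as XML, because it ends up in the XML declaration.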

Loading HTML or XML Documents

DOMDocument offers several methods for loading HTML or XML documents, including:

  • loadHTML(): Loads an HTML document from a string.
  • loadHTMLFile(): Loads an HTML document from a file.
  • loadXML(): Loads an XML document from a string.
  • load(): Loads an XML document from a file.

Here’s an example of loading an HTML document from a string:


$html = "<html><body><p>Hello, World!</p></body></html>";
$dom->loadHTML($html);

You can also load an HTML document from a file:


$dom->loadHTMLFile('example.html');

Or load an XML document from a file:


$dom->load('example.xml');
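
Loading can fail when the input is malformed, so it is usually worth checking the return value. The sketch below uses loadXML(), the string counterpart of load(), together with the libxml error functions. The sample XML string is hypothetical and is deliberately broken so that the error path runs.

```php
<?php
// Collect libxml parse errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$dom = new DOMDocument();

// Hypothetical sample data; the second <item> is deliberately left unclosed.
$xml = '<items><item>One</item><item>Two</items>';

// loadXML() returns false when the string is not well-formed XML.
$loaded = $dom->loadXML($xml);
$errors = libxml_get_errors();
libxml_clear_errors();

if (!$loaded) {
    foreach ($errors as $error) {
        echo 'Line ', $error->line, ': ', trim($error->message), PHP_EOL;
    }
}
```

The same pattern is often placed around loadHTML(), which can report warnings for markup the underlying parser does not recognize.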

Navigating and Manipulating the Document

Once the document is loaded, we can navigate and manipulate its content using methods of DOMDocument and of the nodes it contains. Some common methods include:

  • getElementsByTagName(): Retrieve elements by their tag name.
  • createElement(): Create a new element node.
  • appendChild(): Add a node as a child of another node.
  • removeChild(): Remove a child node from its parent.

For example, to retrieve all paragraph elements:


$paragraphs = $dom->getElementsByTagName('p');
foreach ($paragraphs as $paragraph) {
  echo $paragraph->nodeValue, PHP_EOL;
}
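
The object returned by getElementsByTagName() is a DOMNodeList, which can also be counted and indexed. Here is a small self-contained sketch; the markup and the "intro" class name are made up for the illustration.

```php
<?php
$dom = new DOMDocument();
// Hypothetical markup used only for this illustration.
$dom->loadHTML('<html><body><p class="intro">First</p><p>Second</p></body></html>');

$paragraphs = $dom->getElementsByTagName('p');

// length reports how many nodes matched; item() fetches one by index.
echo $paragraphs->length, PHP_EOL;
echo $paragraphs->item(0)->getAttribute('class'), PHP_EOL;

// Collect the text of every paragraph.
$texts = [];
foreach ($paragraphs as $paragraph) {
    $texts[] = $paragraph->textContent;
}
```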

Create a new heading element and append it to the document:


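// Assumes the loaded document has a <body> element to receive the new heading.
$parentElement = $dom->getElementsByTagName('body')->item(0);
$newElement = $dom->createElement('h2', 'New Heading');
$parentElement->appendChild($newElement);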
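
When the text of a new element may contain characters such as "&", one common approach is to create the element empty and attach the text as a separate text node, which is escaped on output. The sketch below also uses insertBefore() to place the element at the start of its parent instead of the end. The heading text and the "news" id are illustrative assumptions.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>Hello, World!</p></body></html>');

$body = $dom->getElementsByTagName('body')->item(0);

// Create the element empty, then attach its text as a separate text node
// so that characters such as "&" are escaped automatically.
$heading = $dom->createElement('h2');
$heading->appendChild($dom->createTextNode('News & Updates'));
$heading->setAttribute('id', 'news');

// insertBefore() places the heading ahead of the existing first child.
$body->insertBefore($heading, $body->firstChild);

echo $dom->saveHTML($body), PHP_EOL;
```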

Remove a node:


$childElement = $dom->getElementsByTagName('p')->item(0); // the node to remove
$childElement->parentNode->removeChild($childElement);
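
Removing several nodes in a loop needs some care, because the list returned by getElementsByTagName() is live and shrinks as nodes are removed. A minimal sketch of one way to handle this is to copy the list into an array first; the three-paragraph markup is a made-up example.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>A</p><p>B</p><p>C</p></body></html>');

// Copy the live node list into a plain array before changing the tree.
$paragraphs = iterator_to_array($dom->getElementsByTagName('p'));

foreach ($paragraphs as $paragraph) {
    // parentNode gives the node that removeChild() must be called on.
    $paragraph->parentNode->removeChild($paragraph);
}

echo $dom->getElementsByTagName('p')->length, PHP_EOL;
```

Iterating over the live list directly while removing from it can skip elements, which is why the snapshot is taken before the loop.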

Generating HTML or XML Documents

After manipulating the document, you can export it as either an HTML or XML document. The common methods are:

  • saveHTML(): Convert the DOMDocument object to an HTML string.
  • saveHTMLFile(): Save the DOMDocument object as an HTML file.
  • saveXML(): Convert the DOMDocument object to an XML string.
  • save(): Save the DOMDocument object as an XML file.

Convert the document to an HTML string:


$htmlString = $dom->saveHTML();
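
saveHTML() also accepts an optional node argument, which is useful when only part of the document is needed. The following sketch contrasts the two forms on a small example document.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>Hello, World!</p></body></html>');

// With no argument, saveHTML() serializes the whole document.
$fullDocument = $dom->saveHTML();

// Passing a node limits the output to that node and its children.
$paragraph = $dom->getElementsByTagName('p')->item(0);
$fragment = $dom->saveHTML($paragraph);

echo $fragment, PHP_EOL;
```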

Save the document as an HTML file:


$dom->saveHTMLFile('output.html');

Or save it as an XML file:


$dom->save('output.xml');
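
To illustrate the XML side end to end, here is a sketch that builds a small document from scratch and serializes it. The catalog and book element names are hypothetical, and a temporary file stands in for 'output.xml' so the example does not write into the current directory.

```php
<?php
$dom = new DOMDocument('1.0', 'UTF-8');
// Indent the output; this setting must be in place before serializing.
$dom->formatOutput = true;

// Hypothetical <catalog>/<book> structure used only for illustration.
$catalog = $dom->createElement('catalog');
$dom->appendChild($catalog);

$book = $dom->createElement('book', 'PHP Basics');
$book->setAttribute('id', '1');
$catalog->appendChild($book);

// saveXML() returns the document as a string; save() writes it to a file
// and returns the number of bytes written, or false on failure.
$xmlString = $dom->saveXML();
$path = tempnam(sys_get_temp_dir(), 'dom');
$bytes = $dom->save($path);

echo $xmlString;
```

Checking the value returned by save() is a simple way to detect an unwritable path.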

Summary

The DOMDocument class covers the full cycle of working with HTML and XML in PHP: loading a document from a string or file, navigating and modifying its node tree, and serializing the result back to a string or file. Whether parsing an HTML page or manipulating an XML file, DOMDocument is a highly useful tool.