PHP5 and XML
While PHP has offered XML support since its early versions, that support improved substantially with the introduction of PHP5. XML support in PHP4 was limited: only a SAX-based parser was enabled by default, and the PHP4 DOM extension did not implement the W3C standard. With PHP5, the PHP XML developers reinvented the wheel, so to speak, and complied with commonly used standards.
New for XML in PHP5
PHP5 includes rewritten and new extensions: the SAX parser, the DOM, SimpleXML, XMLReader, XMLWriter, and the XSLT processor. All of these extensions are now based on libxml2.
Along with improved SAX support, PHP5 supports both the DOM, implemented according to the W3C standard, and the SimpleXML extension. SAX, DOM, and SimpleXML are all enabled by default. If you are familiar with the DOM from other languages, you will have an easier time coding with similar functionality in PHP than before.
Reading, manipulating, and writing XML in PHP5
For developers working with straightforward, predictable, and relatively small XML documents, SimpleXML, combined where necessary with the DOM, is the ideal choice for reading, manipulating, and writing XML in PHP5.
Of the many APIs available in PHP5, the DOM is the most familiar and SimpleXML is the easiest to code. For the most common situations, like those you are dealing with here, they are also the most functional.
DOM extension
The Document Object Model (DOM) is a W3C standard set of objects for representing HTML and XML documents, a standard model of how you can combine these objects, and a standard interface for accessing and manipulating them. Many vendors support the DOM as an interface to their proprietary data structures and APIs, which makes it familiar to developers and gives it a lot of authority. The DOM is easy to understand and use because its structure in memory resembles the original XML document. To pass information to the application, the DOM creates a tree of objects that mirrors the tree of elements in the XML file, with every XML element being a node in the tree. Because this tree-based approach builds the entire document in memory, it uses a lot of memory and processor time, which makes it impractical for parsing large documents. The key use of the DOM extension in the context of this article is its ability to import SimpleXML format and output DOM format XML, or the reverse, for use as a string or XML file.
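To make the tree structure concrete, the following minimal sketch loads a small XML string and walks the resulting node tree, printing one indented line per element. The sample XML and the treeLines() helper are assumptions made for illustration; they are not part of the DOM extension.

```php
<?php
// Illustrative sample document (an assumption, not taken from a real file)
$dom = new DOMDocument('1.0');
$dom->loadXML('<books><book><title>Great American Novel</title></book></books>');

// Hypothetical helper: returns one indented line per element node,
// recursing through childNodes. Text nodes are skipped.
function treeLines(DOMNode $node, $depth = 0)
{
    $lines = array();
    if ($node->nodeType === XML_ELEMENT_NODE) {
        $lines[] = str_repeat('  ', $depth) . $node->nodeName;
        foreach ($node->childNodes as $child) {
            $lines = array_merge($lines, treeLines($child, $depth + 1));
        }
    }
    return $lines;
}

$lines = treeLines($dom->documentElement);
echo implode("\n", $lines), "\n";
```

The printed outline (books, then book, then title, each indented one level deeper) has the same shape as the element nesting in the source string, which is the sense in which the in-memory tree resembles the document.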
SimpleXML
The SimpleXML extension is the tool of choice for parsing an XML document. The SimpleXML extension requires PHP5 and includes interoperability with the DOM for writing XML files and built-in XPath support. SimpleXML works best with uncomplicated, record-like data, such as XML passed as a document or string from another internal part of the same application. Provided that the XML document isn't too complicated or too deep and contains no mixed content, SimpleXML is easier to code than the DOM, as its name implies. It is also more reliable if you work with a known document structure.
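As a rough illustration of the built-in XPath support, the sketch below selects an element by attribute value with SimpleXMLElement::xpath(). The sample XML string and the variable names are assumptions chosen for this example.

```php
<?php
// Illustrative sample data (an assumption for this sketch)
$xml = simplexml_load_string(
    '<books><book><title>Great American Novel</title>'
    . '<success type="bestseller">4</success>'
    . '<success type="bookclubs">9</success></book></books>'
);

// xpath() returns an array of matching SimpleXMLElement objects
$matches = $xml->xpath('//success[@type="bookclubs"]');

// Cast the first match to a string to get its text content
$bookclubs = (string) $matches[0];
echo $bookclubs, "\n";
```

The query matches only the second success element, so the script prints 9.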
The DOM in action
The DOM extension implements the same W3C DOM specification that you work with in a browser and manipulate with JavaScript. It offers the same methods, so you will use familiar coding techniques. Listing 2 illustrates the use of the DOM to create an XML string and XML document, formatted for your viewing pleasure.
Listing 2. Using the DOM
<?php
// Creates XML string and XML document using the DOM
$dom = new DOMDocument('1.0');
// add root - <books>
$books = $dom->appendChild($dom->createElement('books'));
// add <book> element to <books>
$book = $books->appendChild($dom->createElement('book'));
// add <title> element to <book>
$title = $book->appendChild($dom->createElement('title'));
// add text node to <title>
$title->appendChild($dom->createTextNode('Great American Novel'));
// generate xml
$dom->formatOutput = true; // set the formatOutput attribute of DOMDocument to true
// save XML as string or file
$test1 = $dom->saveXML(); // put string in test1
$dom->save('test1.xml');  // save as file
?>
This produces the output file in Listing 3.
<?xml version="1.0"?>
<books>
  <book>
    <title>Great American Novel</title>
  </book>
</books>
Listing 4 imports a SimpleXMLElement object into a DOMElement object, illustrating the interoperability of the DOM and SimpleXML.
Listing 4. Interoperability, Part 1 — DOM imports SimpleXML
<?php
$sxe = simplexml_load_string('<books><book><title>'.
    'Great American Novel</title></book></books>');
if ($sxe === false) {
    echo 'Error while parsing the document';
    exit;
}
$dom_sxe = dom_import_simplexml($sxe);
if (!$dom_sxe) {
    echo 'Error while converting XML';
    exit;
}
$dom = new DOMDocument('1.0');
// import the node into the new document, then attach it as the root
$dom_sxe = $dom->importNode($dom_sxe, true);
$dom_sxe = $dom->appendChild($dom_sxe);
// save() returns the number of bytes written, which echo prints
echo $dom->save('test2.xml');
?>
The function in Listing 5 takes a node of a DOM document and makes it into a SimpleXML node. You can then use this new object as a native SimpleXML element. If any errors occur, it returns FALSE.
Listing 5. Interoperability, Part 2 — SimpleXML imports DOM
<?php
$dom = new DOMDocument;
// loadXML() returns false when the string cannot be parsed
if (!$dom->loadXML('<books><book><title>Great American Novel</title></book></books>')) {
    echo 'Error while parsing the document';
    exit;
}
$s = simplexml_import_dom($dom);
echo $s->book[0]->title; // Great American Novel
?>
SimpleXML in action
The SimpleXML extension is the tool of choice for parsing an XML document. The SimpleXML extension includes interoperability with the DOM for writing XML files and built-in XPath support. SimpleXML is easier to code than the DOM, as its name implies.
For those of you who might be new to PHP, Listing 6 formats a test XML file as an include for your convenience.
Listing 6. Test XML file formatted as a PHP include called example.php in the following code samples
<?php
$xmlstr = <<<XML
<books>
  <book>
    <title>Great American Novel</title>
    <characters>
      <character>
        <name>Cliff</name>
        <desc>really great guy</desc>
      </character>
      <character>
        <name>Lovely Woman</name>
        <desc>matchless beauty</desc>
      </character>
      <character>
        <name>Loyal Dog</name>
        <desc>sleepy</desc>
      </character>
    </characters>
    <plot>
      Cliff meets Lovely Woman. Loyal Dog sleeps, but wakes up to bark
      at mailman.
    </plot>
    <success type="bestseller">4</success>
    <success type="bookclubs">9</success>
  </book>
</books>
XML;
?>
In an Ajax application, you might want to extract the zip code from an XML document and query a database. Listing 7 extracts <plot> from your example XML include directly above.
Listing 7. Extracting the Node — How easy does it get?
<?php
include 'example.php';
$xml = new SimpleXMLElement($xmlstr);
echo $xml->book[0]->plot; // "Cliff meets Lovely Woman. ..."
?>
On the other hand, you might want to extract a multi-line address. When multiple instances of an element exist as children of a single parent element, normal iteration techniques apply. Listing 8 demonstrates this functionality.
Listing 8. Extracting multiple instances of an element
<?php
include 'example.php';
$xml = new SimpleXMLElement($xmlstr);
/* For each <book> node, echo a separate <plot>. */
foreach ($xml->book as $book) {
    echo $book->plot, '<br />';
}
?>
In addition to reading element names and their values, SimpleXML can also access element attributes. In Listing 9, access attributes of an element just as you would elements of an array.
Listing 9. Demonstrating SimpleXML accessing the attributes of an element
<?php
// Input XML file repeated for your convenience
$xmlstr = <<<XML
<?xml version='1.0' standalone='yes'?>
<books>
  <book>
    <title>Great American Novel</title>
    <characters>
      <character>
        <name>Cliff</name>
        <desc>really great guy</desc>
      </character>
      <character>
        <name>Lovely Woman</name>
        <desc>matchless beauty</desc>
      </character>
      <character>
        <name>Loyal Dog</name>
        <desc>sleepy</desc>
      </character>
    </characters>
    <plot>
      Cliff meets Lovely Woman. Loyal Dog sleeps, but wakes up to bark
      at mailman.
    </plot>
    <success type="bestseller">4</success>
    <success type="bookclubs">9</success>
  </book>
</books>
XML;
?>
<?php
include 'example.php';
$xml = new SimpleXMLElement($xmlstr);
/* Access the <success> nodes of the first book.
 * Output the success indications, too. */
foreach ($xml->book[0]->success as $success) {
    switch ((string) $success['type']) { // Get attributes as element indices
        case 'bestseller':
            echo $success, ' months on bestseller list';
            break;
        case 'bookclubs':
            echo $success, ' bookclub listings';
            break;
    }
}
?>
To compare an element or attribute with a string or pass it into a function that requires a string, you must cast it to a string using (string). Otherwise, by default, PHP treats the element as an object, as Listing 10 demonstrates.
Listing 10. Call it a string or lose
<?php
include 'example.php';
$xml = new SimpleXMLElement($xmlstr);
if ((string) $xml->book->title == 'Great American Novel') {
    print 'My favorite book.';
}
htmlentities((string) $xml->book->title);
?>
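The difference the cast makes is easiest to see with a strict comparison. The following sketch, which uses a small inline XML string and variable names chosen only for illustration, shows that an uncast element is an object and therefore never strictly equal to a string.

```php
<?php
// Illustrative sample document (an assumption for this sketch)
$xml = new SimpleXMLElement(
    '<books><book><title>Great American Novel</title></book></books>'
);
$title = $xml->book->title;

// Without a cast, the element is a SimpleXMLElement object
$isObject = is_object($title);

// Strict comparison of an object with a string is always false
$strictRaw = ($title === 'Great American Novel');

// After the cast, both operands are strings with the same content
$strictCast = ((string) $title === 'Great American Novel');

var_dump($isObject, $strictRaw, $strictCast);
```

This prints bool(true), bool(false), and bool(true), in that order.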
Data in SimpleXML doesn't have to be constant. Listing 11 will output a new XML document, shown below, just like the original, except that the new XML will change Cliff to Big Cliff.
Listing 11. Changing text node using SimpleXML
<?php
$xmlstr = <<<XML
<?xml version='1.0' standalone='yes'?>
<books>
  <book>
    <title>Great American Novel</title>
    <characters>
      <character>
        <name>Cliff</name>
        <desc>really great guy</desc>
      </character>
      <character>
        <name>Lovely Woman</name>
        <desc>matchless beauty</desc>
      </character>
      <character>
        <name>Loyal Dog</name>
        <desc>sleepy</desc>
      </character>
    </characters>
    <plot>
      Cliff meets Lovely Woman. Loyal Dog sleeps, but wakes up to bark
      at mailman.
    </plot>
    <success type="bestseller">4</success>
    <success type="bookclubs">9</success>
  </book>
</books>
XML;
?>
<?php
include 'example.php';
$xml = new SimpleXMLElement($xmlstr);
$xml->book[0]->characters->character[0]->name = 'Big Cliff';
echo $xml->asXML();
?>
Since PHP 5.1.3, SimpleXML has had the ability to easily add children and attributes. Listing 12 will output an XML document based on the original but having a new character and descriptor.
Listing 12. Adding children and text nodes using SimpleXML
<?php
$xmlstr = <<<XML
<?xml version='1.0' standalone='yes'?>
<books>
  <book>
    <title>Great American Novel</title>
    <characters>
      <character>
        <name>Cliff</name>
        <desc>really great guy</desc>
      </character>
      <character>
        <name>Lovely Woman</name>
        <desc>matchless beauty</desc>
      </character>
      <character>
        <name>Loyal Dog</name>
        <desc>sleepy</desc>
      </character>
      <character>
        <name>Yellow Cat</name>
        <desc>aloof</desc>
      </character>
    </characters>
    <plot>
      Cliff meets Lovely Woman. Loyal Dog sleeps, but wakes up to bark
      at mailman.
    </plot>
    <success type="bestseller">4</success>
    <success type="bookclubs">9</success>
  </book>
</books>
XML;
?>
<?php
include 'example.php';
$xml = new SimpleXMLElement($xmlstr);
$character = $xml->book[0]->characters->addChild('character');
$character->addChild('name', 'Yellow Cat');
$character->addChild('desc', 'aloof');
$success = $xml->book[0]->addChild('success', '2');
$success->addAttribute('type', 'reprints');
echo $xml->asXML();
?>
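Elements added with addChild() are serialized by asXML() without extra indentation. One way to get formatted output is to hand the SimpleXML tree to the DOM, as in this minimal sketch. The compact sample XML is an assumption for illustration, and the approach relies on the source containing no whitespace-only text nodes.

```php
<?php
// Compact sample XML (an assumption for illustration); it has no
// whitespace-only text nodes, so formatOutput can indent it cleanly.
$xml = new SimpleXMLElement(
    '<books><book><title>Great American Novel</title><characters>'
    . '<character><name>Cliff</name><desc>really great guy</desc></character>'
    . '</characters></book></books>'
);

// Add a new character with SimpleXML
$character = $xml->book[0]->characters->addChild('character');
$character->addChild('name', 'Yellow Cat');
$character->addChild('desc', 'aloof');

// dom_import_simplexml() returns a DOMElement; its ownerDocument
// is the DOMDocument that holds the whole tree.
$dom = dom_import_simplexml($xml)->ownerDocument;
$dom->formatOutput = true;

// Serialize with the DOM instead of asXML()
$pretty = $dom->saveXML();
echo $pretty;
```

Calling $dom->save() with a file name instead of saveXML() would write the same formatted document to disk, as Listing 2 does.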