ext/dom and libxml2 charset and entities behaviors
In case you are unaware, as of PHP 5.1.0 there is a second argument to the DOMDocument->saveXML() method.
This argument currently supports only one value, the constant LIBXML_NOEMPTYTAG. This option makes sure that you do not end up with <tag /> but with <tag></tag> instead. This can make things easier if you need more predictable text to perform other changes on later.
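As a minimal sketch of what the flag does on its own (the tiny document and the variable names here are made up for illustration, not taken from my real markup):

```php
<?php
// Hypothetical two-element document with one empty element.
$dom = new DOMDocument();
$dom->loadXML('<root><empty/></root>');

// Without options, the empty element is serialized in self-closing form.
$short = $dom->saveXML($dom->documentElement);

// With LIBXML_NOEMPTYTAG, it gets an explicit start tag and end tag.
$expanded = $dom->saveXML($dom->documentElement, LIBXML_NOEMPTYTAG);

echo $short, "\n";     // <root><empty/></root>
echo $expanded, "\n";  // <root><empty></empty></root>
```

Passing the root element as the first argument keeps the XML prolog out of the output, which makes the two strings easy to compare.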
However, while playing around with the option, I noticed that the size of my markup changed significantly (it’s a large document). Some further experimenting showed that the following six uses of DOMDocument->saveXML() produce different results:
&#160; is a non-breaking space character (&nbsp; in HTML). ext/dom defaults to UTF-8.
<?php
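// loadXML() is an instance method; calling it statically is deprecated and fails on PHP 8
$dom = new DOMDocument();
$dom->loadXML("<xml><test />&#160;</xml>");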
echo $dom->saveXML();
/*
Default behavior: the character is written as a numeric character reference (libxml2 uses the hex form), no encoding added to the XML prolog
<?xml version="1.0"?>
<xml><test/>&#xA0;</xml>
*/
echo $dom->saveXML($dom->documentElement);
/*
Entities are transformed to output charset, no XML prolog
<xml><test/>[nbsp char]</xml>
*/
echo $dom->saveXML($dom);
/*
Entities are transformed to output charset, encoding added to the XML prolog
<?xml version="1.0" encoding="UTF-8"?>
<xml><test/>[nbsp char]</xml>
*/
echo $dom->saveXML($dom->documentElement, LIBXML_NOEMPTYTAG);
/*
Entities are transformed to output charset, no XML prolog, tags expanded
<xml><test></test>[nbsp char]</xml>
*/
echo $dom->saveXML($dom, LIBXML_NOEMPTYTAG);
/*
Entities are transformed to output charset, encoding added to the XML prolog, tags expanded
<?xml version="1.0" encoding="UTF-8"?>
<xml><test></test>[nbsp char]</xml>
*/
echo $dom->saveXML(null, LIBXML_NOEMPTYTAG);
/*
The character is written as a numeric character reference (libxml2 uses the hex form), no encoding added to the XML prolog, tags expanded
<?xml version="1.0"?>
<xml><test></test>&#xA0;</xml>
*/
?>
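Since the size difference is what tipped me off, here is a rough sketch for checking it yourself. The variable names are illustrative, and the exact results may depend on your PHP and libxml2 versions, so treat the listing above as what I observed rather than a guarantee:

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<xml><test/>&#160;</xml>');

// Whole document, no node argument.
$asDocument = $dom->saveXML();
// Same content, but with the root element passed as the node.
$asNode = $dom->saveXML($dom->documentElement);

// In UTF-8 a non-breaking space is the two bytes 0xC2 0xA0,
// whereas a numeric character reference takes several ASCII bytes.
$nbsp = "\xC2\xA0";

// Does the raw character appear in each output?
var_dump(strpos($asDocument, $nbsp) !== false);
var_dump(strpos($asNode, $nbsp) !== false);

// Byte lengths (the first one also includes the XML prolog).
printf("%d bytes vs %d bytes\n", strlen($asDocument), strlen($asNode));
```

On a large document with many such characters, the gap between a character reference and the raw UTF-8 bytes adds up quickly.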
Just something to keep in mind next time you’re fooling around with the DOM.
- Davey
