An XML Callback Antipattern
Here’s another quick tip-n-trick for Ajax server-side mocking today. This should really be classed as an anti-pattern in XML parsing, but it’s common enough that you’re bound to run into it sooner or later.
For those who haven’t heard the term, an anti-pattern encapsulates a common bad practice in software development. Writing extremely long methods, tangling up presentation and logic, and falling asleep in meetings can all be considered anti-patterns. Lots of people do them out of habit, but they don’t help you to get on.
Back to the XML then. Have a look at the following bit of server-side code that generates some XML:
Response.ContentType="text/xml";
StringBuilder builder=new StringBuilder("<results>");
//...run database query...
foreach (DataRow row in results.Rows){
    // one <match> element per row, written with no whitespace between tags
    builder.Append("<match name='"+row["name"].ToString()+"'>");
    builder.Append("<address>");
    builder.Append(row["address"].ToString());
    builder.Append("</address>");
    builder.Append("</match>");
}
builder.Append("</results>");
Response.Write(builder.ToString());
Here’s a typical output from said code, right?
<results>
<match name="dave">
<address>Stroud, UK</address>
</match>
<match name="somebody else">
<address>world</address>
</match>
</results>
Wrong! In fact, it’ll look like this:
<results><match name='dave'><address>Stroud, UK</address></match><match name='somebody else'><address>world</address></match></results>
Structurally, the two are pretty much the same, but we humans like to indent and structure our XML, whereas the machines that read it don’t.
Now have a look at this client-side code, which attempts to parse the XML output.
var xmlDoc=this.req.responseXML;
var elDocRoot=xmlDoc.getElementsByTagName("results")[0];
if (elDocRoot){
    var children=elDocRoot.childNodes;
    for (var i=0;i<children.length;i++){
        // assumes every child node is a <match> element
        var match=children[i];
        var attrs=match.attributes;
        var name=attrs.getNamedItem("name").value;
        //do stuff...
    }
}
This code will run in IE but not in Firefox if it is served nicely indented XML. Why is that? The answer lies in the use of the childNodes property. The Mozilla XML parser keeps the whitespace between elements as child nodes, whereas IE’s doesn’t. Each whitespace node is a plain Text node, which has no attributes, so the call to getNamedItem throws an error.
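To make the mismatch concrete, here is a minimal sketch that lists the node names in a childNodes-style collection. The describeChildren helper and the hand-built indentedChildren array are hypothetical stand-ins for what a whitespace-preserving parser hands back for the indented document; they are not part of the original code. The nodeType values 1 (element) and 3 (text) are the standard DOM constants.

```javascript
// Hypothetical helper: collect the nodeName of every child, in order.
function describeChildren(children) {
  var names = [];
  for (var i = 0; i < children.length; i++) {
    names.push(children[i].nodeName);
  }
  return names;
}

// Hand-built stand-in for the children of <results> in the indented document.
// Plain objects are used so the sketch runs without a browser.
var indentedChildren = [
  { nodeType: 3, nodeName: "#text", nodeValue: "\n  " },
  { nodeType: 1, nodeName: "match" },
  { nodeType: 3, nodeName: "#text", nodeValue: "\n  " },
  { nodeType: 1, nodeName: "match" },
  { nodeType: 3, nodeName: "#text", nodeValue: "\n" }
];

console.log(describeChildren(indentedChildren).join(", "));
// prints "#text, match, #text, match, #text"
```

Five children rather than two: the loop in the snippet above trips over the very first one.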
So, what’s the anti-pattern, and what’s the solution? When we walk childNodes (or reach for firstChild) and assume that every node we find is an element we care about, we’re coupling the structural detail of our document to the client code. We ought to parse the XML based solely on its semantic content, and retrieve elements by their names and attributes, not their place in the document structure. The final snippet rewrites our client code to allow it to read the indented XML safely.
var xmlDoc=this.req.responseXML;
var elDocRoot=xmlDoc.getElementsByTagName("results")[0];
if (elDocRoot){
    var children=elDocRoot.childNodes;
    for (var i=0;i<children.length;i++){
        var match=children[i];
        // skip whitespace text nodes and anything else that isn't a <match>
        if (match.nodeName.toLowerCase()=="match"){
            var attrs=match.attributes;
            var name=attrs.getNamedItem("name").value;
            //do stuff...
        }
    }
}
Checking the nodeName property not only filters out the whitespace, but allows us greater flexibility in restructuring the XML at a later date (say, for example, that we want to add
Getting back to the mock XML practice, parsing the response nicely (i.e. based on semantic structure rather than node order) allows us to hand-write our mock documents in a much more natural way.
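If the same check is needed in several callbacks, it can be pulled out into a small helper. What follows is only a sketch under stated assumptions: elementChildrenNamed, makeStubMatch and mockResults are hypothetical names, and the stub objects stand in for the parsed <results> element of a hand-written, indented mock document so that the example runs outside a browser.

```javascript
// Hypothetical helper: return the element children of `parent` whose
// tag name matches `tagName`, ignoring case and skipping everything else.
function elementChildrenNamed(parent, tagName) {
  var wanted = tagName.toLowerCase();
  var found = [];
  var children = parent.childNodes;
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    // nodeType 1 is an element; whitespace text nodes are nodeType 3.
    if (child.nodeType === 1 && child.nodeName.toLowerCase() === wanted) {
      found.push(child);
    }
  }
  return found;
}

// Hypothetical stub for a <match name="..."> element.
function makeStubMatch(name) {
  return {
    nodeType: 1,
    nodeName: "match",
    getAttribute: function (attr) {
      return attr === "name" ? name : null;
    }
  };
}

// Stand-in for the <results> element of an indented mock document.
var whitespace = { nodeType: 3, nodeName: "#text" };
var mockResults = {
  childNodes: [
    whitespace, makeStubMatch("dave"),
    whitespace, makeStubMatch("somebody else"),
    whitespace
  ]
};

var found = elementChildrenNamed(mockResults, "match");
for (var i = 0; i < found.length; i++) {
  console.log(found[i].getAttribute("name"));
}
```

In a browser, the same helper should accept the real element taken from responseXML, since it only relies on childNodes, nodeType and nodeName.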
October 31st, 2005 at 2:22 pm
Why not just use the handy getElementsByTagName, as defined at http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-core.html#ID-745549614 instead of directly accessing the childNodes? You might not be able to get away with dodgy variations in case of your tag names, but I’ve found it simpler and neater, while allowing for both the whitespace vagaries and minor structural changes.
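The suggestion above might look something like the following sketch. The readMatchNames function and the stubDoc object are hypothetical; the stub only imitates the two DOM calls used here (getElementsByTagName and getAttribute) so the example can run outside a browser. One caveat worth keeping in mind: getElementsByTagName returns every descendant with that tag name, not just direct children, so it fits best when a tag name is used at only one level of the document.

```javascript
// Hypothetical function: read the name attribute of every <match> element
// in the document, wherever it sits in the tree.
function readMatchNames(xmlDoc) {
  var matches = xmlDoc.getElementsByTagName("match");
  var names = [];
  for (var i = 0; i < matches.length; i++) {
    names.push(matches[i].getAttribute("name"));
  }
  return names;
}

// Hypothetical stub element exposing only getAttribute.
function stubElement(attrs) {
  return {
    getAttribute: function (attr) {
      return attrs.hasOwnProperty(attr) ? attrs[attr] : null;
    }
  };
}

// Hypothetical stub document; a real one would come from responseXML.
var stubDoc = {
  getElementsByTagName: function (tag) {
    if (tag === "match") {
      return [
        stubElement({ name: "dave" }),
        stubElement({ name: "somebody else" })
      ];
    }
    return [];
  }
};

console.log(readMatchNames(stubDoc).join(", "));
// prints "dave, somebody else"
```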