February 07, 2007

What I have to say about what Dave Winer has to say about JSON

Dave Winer, inventor of XML-RPC, apparently first found out about the data format known as JSON just a couple of months ago. There was a firestorm of comments, but my comment got too long, so I've reformatted it as an article in my blog and linked to it. I know you can't reply to me here, so I guess reply to me there instead.

My comment continues beyond the jump.

Valid are the comments that not everyone chooses to handle data in their native languages as XML. Most people use XML as a document description language and DOM as an accessor to such documents. This is no different from using PDF to describe printable pages digitally and using some PDF library to get at the data inside. Nobody would choose to convert something that's not a PDF (like an associative array in Perl) into a PDF just to get it to another client or server and convert it back into an associative array (be the recipient JavaScript, Java, Ruby, ASP or whatever).

This is a long way of saying that PDF would be a bad choice of a serialization language. For the same reason, XML is also a bad choice of a serialization language. You can do many Grand Things using XML and solve many Grand Problems. I don't think I've run into any of those problems yet. I just have data in language X and want to wind up with the same data in language Y. I want a language neutral serialization format.

It is true that JSON can be consumed by JavaScript using eval. It is also true that as long as your receiver trusts your generator this can be a huge win. That describes 90% of your transactions: a web client consuming data from its own server.
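Here is roughly what that trusted case looks like. This is only a sketch; the payload and its field names are made up for illustration.

```javascript
// Hypothetical payload, as a server might send it to its own web client.
var text = '{"user": "jesse", "posts": [1, 2, 3], "admin": false}';

// The wrapping parentheses make the braces parse as an object literal
// instead of as a block statement.
var data = eval('(' + text + ')');

console.log(data.user);         // jesse
console.log(data.posts.length); // 3
```

No parser library, no DOM walking: the text simply becomes a native object.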

But this approach is tainted by the "security concerns" given light in Henry's statement:
> JavaScript has the equivalent of Lisp’s ‘eval’ , but not ‘read’.
> And for some further unfathomable reason, nobody seems
> to understand this simple yet important difference.

To wit, the uninitiated may try to "eval()" a JSON string meaning to unserialize harmless data, but wind up executing harmful code instead. In the browser this could be a nuisance or a security violation, with attackers stealing cookie data. On the server side (using SSJS or Python) attackers could conceivably try to attack your database or file system.
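To make the danger concrete, here is a small sketch. The `secrets` and `stolen` variables are hypothetical stand-ins (think `document.cookie` and an attacker's collection point), not anything from a real attack.

```javascript
// Stand-in for something worth stealing (hypothetical).
var secrets = { cookie: 'session=abc123' };
var stolen = [];

// Looks like data at a glance, but one of the values is a function call.
var hostile = '{"name": "x", "payload": stolen.push(secrets.cookie)}';

// eval runs the call while it builds the object.
var result = eval('(' + hostile + ')');

console.log(stolen.length); // 1
```

The string never had to look suspicious. Anything that is a legal JavaScript expression gets executed, because eval evaluates code rather than reading data.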

However, that by itself fails to make JSON a kludge. The fact remains that an exceedingly efficient regex will sanitize a JSON string in such a way that regex + eval() = Lisp's read(). This approach is so efficient that it pays to black-box it and never trust a bare eval.
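A sketch of regex + eval, assuming the regex in question is along the lines of the check published in section 6 of RFC 4627. The wrapper name `safeRead` is my own invention for this example. Treat it as an illustration of the idea rather than a vetted decoder: it is a character filter, not a full grammar check.

```javascript
// safeRead is a hypothetical name; the two regexes follow RFC 4627, section 6.
function safeRead(text) {
  // 1. Blank out every string literal so its contents cannot hide anything.
  var skeleton = text.replace(/"(\\.|[^"\\])*"/g, '');

  // 2. What is left may contain only JSON punctuation, digits, whitespace,
  //    and the letters that occur in true, false and null.
  if (/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(skeleton)) {
    throw new SyntaxError('refusing to eval: not plain JSON');
  }

  // 3. Only now is the original text handed to eval.
  return eval('(' + text + ')');
}

var ok = safeRead('{"a": [1, 2, {"b": null}], "ok": true}');
console.log(ok.a.length); // 3
```

A string containing a call such as `stolen.push(1)` never reaches eval, because parentheses and most letters fail the filter in step 2.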

That of course secures your JavaScript, and apparently your Python. Tweak the regex a bit to pre-process into whatever object literal notation you want, and you've written a safe decoder for nearly every language which has an eval. Even Lisp.

I take it as read that by now Mr. Whiner knows that nobody invented JSON. JSON is in fact older than XML or XML-RPC. It has been in use in a longhand form, as mentioned by jk, since 1995. With its shorthand format ratified by ECMA in '99, it has slowly seeped into virtually every web browser. Being older in this case means it has wider and more mature support than XMLHttpRequest and can be used to provide a greater number of end users the benefits of client-side interactivity.

Just as XML sucks for serialization, nobody argues against the assertion that JSON sucks for document representation. When people cheer or boo about JSON "replacing" XML they are doing so assuming different domains for replacement.

Most of the people cheering are tired of manipulating the DOM to get a simple piece of data from what should be a simple web service, and want to replace XML's use in that domain.

Most of the people booing are passing complicated documents back and forth in domains and back offices I know nothing about, and are afraid of being overrun by cowboys breaking their standards and their discipline.

Well, David, this may come as another surprise to you, but out here in the Wild Wild West you are bound to get your boots dirty. You may have coined XML-RPC and purport to love XML and the discipline it requires — and doing so may serve your purposes famously in your clean offices trading TPS reports with various departments — but when you come out here to decry JSON in public with the italicized "IT’S NOT EVEN XML!" you do so using a web page with this doctype:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

And here is what validator.w3.org has to say about your opinion:
"This page is not Valid XHTML 1.0 Transitional!"

Posted by jesse at 02:25 PM