Saturday, August 06, 2005

RSS vs. Atom

I was looking at implementing a syndication feed for our catalog the other day. An XML feed allows data to be transferred between entities without the formatting markup included. Or you can include some of it, if you really want. It's structured data that can be easily parsed by software. HTML does not have this characteristic, but XML does. This blog, for example, has a feed. What prompted me was seeing an increase in websites that include the "alternate" link tag, which tells feed readers where to find a site's feed. I like to be the first craft site on the block to add a new feature, so I looked into it. The big decision is the feed format. There are currently two strongly supported formats for syndication. The first is RSS, which stands for Really Simple Syndication, or something else depending on who you talk to. RSS has been around for some time and you can view its history by clicking on the aforementioned link. The other format is Atom, which is being pushed by Google. Both formats have published specifications, and both are supported by a wide variety of software.
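
To make the "easily parsed by software" point concrete, here is a minimal sketch in Python that reads a tiny Atom feed with the standard library. The sample feed, product names, and example.com URLs are made up for illustration; only the Atom namespace and element names come from the format itself.

```python
import xml.etree.ElementTree as ET

# The Atom 1.0 namespace, in the {uri}tag form ElementTree expects.
ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed; the products and URLs are invented.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Craft Catalog</title>
  <entry>
    <title>Beaded Bracelet Kit</title>
    <link href="http://example.com/products/1"/>
    <updated>2005-08-06T12:00:00Z</updated>
  </entry>
  <entry>
    <title>Wool Yarn, Blue</title>
    <link href="http://example.com/products/2"/>
    <updated>2005-08-05T09:30:00Z</updated>
  </entry>
</feed>
"""


def read_entries(feed_xml):
    """Return one dict per <entry> with its title, link, and updated time."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall(ATOM_NS + "entry"):
        link = entry.find(ATOM_NS + "link")
        entries.append({
            "title": entry.findtext(ATOM_NS + "title"),
            # The link target lives in the href attribute, not in the text.
            "link": link.get("href") if link is not None else None,
            "updated": entry.findtext(ATOM_NS + "updated"),
        })
    return entries


if __name__ == "__main__":
    for item in read_entries(SAMPLE_FEED):
        print(item["updated"], item["title"], item["link"])
```

Nothing here has to guess at page layout: every value sits in a named element, which is the whole advantage over scraping HTML.
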
But which is better? I decided on Atom. My reasons reach beyond just what Google tells me to do and have more to do with why Google is backing it. I think that the reason Google favors Atom over RSS is that Atom is better. "RSS is too complicated" is the basis for that. This is an overgeneralized statement, but I'm hoping to boil the argument down without delving into excruciating detail. I have seen the new RSS 2.0 format, and yes, it is simpler, but it is not simple enough and it allows for considerable abuse. RSS 1.0 is based on RDF, which is a really good way to store data and is used in software like Mozilla, which is one of the reasons it is a really good thing. The downfall of RSS, I believe, is that it allows for extensibility via RDF. This is normally a good thing and a very powerful feature. But when trading data between websites, it is unnecessary. Since both websites must agree on the XML format prior to transfer, there is no way for the receiving website to interpret tags that the sending website has added to the format. This doesn't make RDF or RSS bad; rather, it means that they are not ideal for the use I intend. Atom lacks the overhead of RDF-based tag extension and as a result is simpler.
On the internet, there is a trend that is usually true. In a conflict of two standards, one will win. Not all the time, but most of the time. By win I mean widespread adoption at the expense of the other. I think that some websites will support only RSS for a long time, but they will be a shrinking minority. Many websites will support both for a while, until they drop RSS and support only Atom to reduce maintenance. Some sites, probably the majority, will support just Atom. Market forces will push increasingly towards Atom over time.
So I settled on Atom. I didn't do the actual implementation though. I don't store the last modified time for products and categories in my database, and Atom requires an updated timestamp on the feed and on every entry. Including the last modified time was something I thought about during design, but I discarded it because I couldn't see a use for it later on. My mistake. As soon as I add the extra columns to my database, I'll be able to finish the feed. Hopefully, in time, product feeds from different sites will allow consumers to comparison shop accurately without significant overhead. I'm tired of the tab-delimited format standard. Which makes me wonder, why doesn't Froogle take Atom?
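
For what it's worth, here is a rough sketch of what the feed generation might look like once those columns exist. It is written in Python with only the standard library and is an illustration, not my actual implementation. The product fields (`name`, `url`, `last_modified`) and the function names are assumptions; the `id`, `title`, `updated`, and `author` elements are the ones Atom expects.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_URI = "http://www.w3.org/2005/Atom"


def rfc3339(dt):
    """Format a timezone-aware datetime as a UTC timestamp for Atom."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def _child(parent, tag, text=None, **attrs):
    """Append a namespaced Atom element to parent and return it."""
    node = ET.SubElement(parent, "{%s}%s" % (ATOM_URI, tag), attrs)
    if text is not None:
        node.text = text
    return node


def build_feed(feed_id, title, author_name, products):
    """Build an Atom feed string from product rows (hypothetical schema).

    Each product is a dict with 'name', 'url', and a timezone-aware
    'last_modified' datetime, i.e. the column the catalog was missing.
    """
    # Serialize Atom as the default namespace instead of an ns0: prefix.
    ET.register_namespace("", ATOM_URI)
    feed = ET.Element("{%s}feed" % ATOM_URI)
    _child(feed, "id", feed_id)
    _child(feed, "title", title)
    # The feed's updated time is the newest product change; fall back to
    # the current time if the catalog is empty.
    newest = max((p["last_modified"] for p in products),
                 default=datetime.now(timezone.utc))
    _child(feed, "updated", rfc3339(newest))
    author = _child(feed, "author")
    _child(author, "name", author_name)
    for product in products:
        entry = _child(feed, "entry")
        # Using the product URL as the entry id is a simplifying assumption.
        _child(entry, "id", product["url"])
        _child(entry, "title", product["name"])
        _child(entry, "link", href=product["url"])
        _child(entry, "updated", rfc3339(product["last_modified"]))
    return ET.tostring(feed, encoding="unicode")
```

The only part that touches the database design is `last_modified`: without it there is nothing honest to put in `updated`, which is exactly where I got stuck.
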
