November 2002 Archives

Web Services We'd Like To See.

In a recent CNET article, Margaret Kane reports on Google and Amazon's success with Web services and the benefits they are beginning to reap. Tim O'Reilly provides his commentary on the piece here. O'Reilly notes a key takeaway from Amazon and Google’s success is "...the importance of a decentralized approach rather than a top-down approach by a single vendor." In addition to his comments, I think it's also interesting to note that two service providers are driving real Web service adoption and not software vendors such as Microsoft and IBM. (Could this be an indication of a significant shift in the industry?)

As a developer, I want to see more service providers support general-use Web services. I've been pondering this for a while, but the recent CNET article and the attention it has received put my thinking into high gear. Here are some of the service providers that I would like to see follow Google and Amazon's lead:

  • eBay. As the CNET article indicates, eBay has begun down this path on a limited basis presumably with third-party add-on service providers like AuctionWatch. eBay needs to open it up for general use. Some of the obvious uses include more powerful search and monitoring agents (think of melding a news aggregator with "real time" auction information) and "PowerSellers" integrating their eBay activity into their own web sites. This would only be the beginning. As Amazon and Google have proved, the net will find new and inventive uses for services that the original providers never could have imagined. I am just as curious to see what changes in eBay's auction policy may occur from the availability of this new capability.
  • PayPal. I realize they are now an eBay company, but since their integration has only just begun, I chose to highlight them separately. PayPal's signature service would be trickier to implement than others because real money is being handled, but it's not impossible. PayPal already offers "convenient" HTML interfaces to payment and shopping cart functionality. PayPal has been a major contributor to making e-commerce transactions possible for the masses. A move towards Web services would be the next logical step in my opinion.
  • FedEx. UPS. While keeping with the e-commerce theme, I'll mention FedEx and UPS here. I don’t think general Web service availability from them would be nearly as earth shaking as the other examples here, but it would be useful to the burgeoning market of mom and pop e-commerce that eBay and PayPal have helped foster. Like PayPal, both offer HTML interfaces to some of their services that can be made machine readable with some screen scraping, which can be problematic and brittle. In addition to package tracking, both companies should create APIs for managing their shipping contacts, calculating costs relative to weight and location, and subscribing to package delivery alerts.
  • MapQuest. Opening their geographic plotting engine and roadmap data could yield some very interesting applications. An obvious use would be integrating directions with traffic advisories. (Call me odd, but I find it helpful to know that "fastest route" has two lanes closed for construction.) I had once hoped Vicinity/MapBlast would have made a move into Web services to gain ground and push MapQuest, but now as a Microsoft company that hope has waned given Redmond's top down approach to Web services.
  • Yahoo. The Web portal company has been marketing premium services to generate new revenue streams. Yahoo's relative success, in addition to Apple's similar foray with their .Mac services, indicates that there probably is a market. I believe the availability of a Web service API would be helpful to furthering those efforts. Yahoo Stores would benefit immediately from such a move. (See my comments on PayPal.) Yahoo Groups can be accessed in part through email, and recent messages can be retrieved as an RSS feed, but further accessibility via Web services could be useful. Other potential targets for Web services would include interfaces to personal email, calendar, briefcase and address book. (I would pay a nominal fee to Yahoo for this.)
  • Any site with information and news lacking RSS support. Whether it is just assumed or simply overlooked, RSS is the most widely deployed Web service across the Internet. Granted, most RSS feeds have very simple interfaces with almost as simple backends that are unlike the Web services that usually come to mind. (Who says Web services need to be complex or sophisticated anyhow?) Under the principles of the REST architectural style that the Web was built on, RSS feeds do qualify. Consider that any site search engine would become a Web service if it could emit its results in RSS, and the format's potential in the realm of Web services becomes more apparent.
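To make the last point concrete, here is a minimal sketch in Python of a site search that hands back its hits as an RSS 2.0 document. It assumes the search itself has already happened and produced (title, link, summary) tuples. The function name, its arguments and the example.com URLs are hypothetical and do not describe any real site's interface.

```python
import xml.etree.ElementTree as ET


def search_results_to_rss(query, results, site_url):
    """Render search hits as a minimal RSS 2.0 document.

    `results` is a list of (title, link, summary) tuples. The function name
    and argument layout are hypothetical, not any real site's API.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Search results for " + query
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "Items matching " + query
    for title, link, summary in results:
        # Each hit becomes one <item> with the three basic elements.
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        ET.SubElement(item, "description").text = summary
    return ET.tostring(rss, encoding="unicode")


# Made-up search hits, purely for illustration.
hits = [
    ("REST in brief", "http://example.com/rest", "A short overview."),
    ("RSS & search", "http://example.com/rss", "Feeds as query results."),
]
feed = search_results_to_rss("rss", hits, "http://example.com/")
```

Any aggregator that already reads RSS could then subscribe to a query the same way it subscribes to a weblog.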

This is by no means a complete list. It is, however, a starting point for further discussion about the future of Web service adoption and the potential of Web services to help drive innovation. What do you think?

XHTML as Syndication Format?

[The following are highlights of comments I made in response to Anil Dash's proposal to use XHTML as a syndication format instead of RSS. I've lightly edited the text for relevancy.]

A set of tags already exists: they're called RSS. The point of XHTML (or one of them at least) is that it can leverage the full range of XML toolkits and specifications. This includes XML Namespaces, which allow tags from other schemas to be included, thereby extending the original beyond its designed purpose. Some have already been experimenting with combining RSS and XHTML tags into their pages here. (Do a view source to see what I mean.)

What Anil proposes (H3 with this class...) has been done in the past, before we had XML. It's called screen scraping: more refined screen scraping, but screen scraping nonetheless. It all seems rather retro to me.

I don't mean to sound harsh. I really don't. It's an interesting notion that has its merits. I can understand the argument that ideally content authors shouldn't have to create two versions. I think that in practice its limitations will outweigh its benefits.

A separate syndication file should be more bandwidth efficient, especially with aggregators and the like banging away frequently on it. Aggregators have recently improved from their early days of brute force updates -- downloading a feed on some interval regardless of changes. RSS is more about data (that just so happens to be about content) whereas XHTML is more about display. Combining the two is fine, but inefficient in that the information necessary for one task must be ignored when the document is used for the other.
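The usual way for an aggregator to get past brute force polling is an HTTP conditional GET. The client sends back the ETag and Last-Modified values it saw last time, and the server answers 304 Not Modified with no body when nothing has changed. Here is a minimal sketch using Python's standard urllib. The function names are hypothetical, and it assumes the server actually supports these validators.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the request headers for a conditional GET from saved validators."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None if the feed is unchanged."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: keep the validators we already have.
            return None, etag, last_modified
        raise
```

An aggregator would store the returned validators next to each subscription and pass them in on the next poll, so an unchanged feed costs only a few headers.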

I completely object to the claim that RSS exists out of laziness, as Anil says. If a content author is "too lazy" to generate two versions of their content, I'd suggest that they author their content in RSS. You can easily convert RSS into XHTML, and do so more efficiently and reliably than the reverse. RSS is for machine processing while XHTML is designed for display. In fact I could be really lazy as a content author and have multiple XHTML pages generated from one RSS file.
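As a rough illustration of going from RSS to XHTML, here is a minimal Python sketch that turns the items of a feed into an XHTML list fragment. The sample feed and the function name are made up for the example, and it assumes a plain RSS 2.0 layout with plain-text descriptions.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# A made-up feed used only to exercise the function below.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example weblog</title>
    <link>http://example.com/</link>
    <description>Sample channel</description>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <description>Hello, world.</description>
    </item>
    <item>
      <title>Q&amp;A</title>
      <link>http://example.com/2</link>
      <description>Questions and answers.</description>
    </item>
  </channel>
</rss>"""


def rss_to_xhtml_list(rss_text):
    """Render each RSS item as a linked list entry in an XHTML fragment."""
    channel = ET.fromstring(rss_text).find("channel")
    lines = ["<ul>"]
    for item in channel.findall("item"):
        # Escape text and quote the attribute so the output stays well-formed.
        title = escape(item.findtext("title", default=""))
        link = quoteattr(item.findtext("link", default=""))
        summary = escape(item.findtext("description", default=""))
        lines.append("  <li><a href=%s>%s</a>: %s</li>" % (link, title, summary))
    lines.append("</ul>")
    return "\n".join(lines)


fragment = rss_to_xhtml_list(SAMPLE_RSS)
```

The same feed could be run through several such functions, one per page design, which is the "multiple XHTML pages from one RSS file" case.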

I think of RSS files as more of a Web service than a web page. That may help provide a different perspective.

In responding to my comments Scott Andrew LePera writes:

Properly-structured XHTML is far more robust than RSS for providing syntactic structure for a Web document, and is just as machine readable. The fact that an <EM> is rendered as italicized text in a browser is completely incidental.

The more I learn about these issues, the more I become convinced that it's wrong to ask authors to jump through additional hoops to support formats for alternate endpoints like RSS newsreaders. At the end of the day, I'm paying the RSS tax through additional bandwidth and ensuring that what I put in my XHTML won't break my RSS (like matching character encodings and avoiding relative links).

Syntactic structure of a Web document is not the intended purpose of RSS. RSS was designed to syndicate a collection of online resources called a channel. Admittedly the RSS format and its documentation have been a disaster and are lacking, so let me clarify: RSS files were never intended to contain or transport HTML. If you go back to the version 0.91 format documents, you will see that <description> was to be "plain text" that was required/recommended to be 500 characters or less. The <description> element was to be an excerpt or brief abstract of the content on the other side of the <link>. What has become common practice is the work of a one-man design committee and his personal agenda. Let us not confuse intent with misuse.

I'm all for experiments such as Anil's proposal, though I'm personally skeptical that it will be more successful than (proper) RSS and XHTML working together.

[UPDATE: Related to this growing discussion, Mark Pilgrim writes in "The rebellion will be syndicated": Tantek Çelik: XHTML vs. the world. Bet on the world. Another good point.]

A RESTful Publishing API.

I thought the Blogger Web service API and its close cousin the MetaWeblog API were interesting ideas. After attempting to experiment with them a bit, I came to realize that their implementations were lacking the features or flexibility to take full advantage of MovableType features. For instance, neither API supported MT's extended fields such as excerpt, "more text" and, recently, a keywords/meta data field. In the case of Blosxom, another blogging tool I use, implementing these APIs would be more substantial than the tool itself -- a disproportionate fit to say the least.

While (re)acquainting myself with the concepts of REST architecture and other related thoughts, I began to consider how they could be applied to developing a flexible and scalable interface that was better suited for a wider range of publishing tools like MT and Blosxom. The thought has been bouncing around my head for months and then I saw the making of what I was thinking.

Joe Gregorio has opened The Well-Formed Web site and begun work on a weblogging tool (dubbed RESTLog) featuring a Web service API based on REST architecture principles. (Nice work getting something out there Joe!)

While a good piece of work, the implementation of the RESTful interface is not exactly what I had in mind though. Also at issue, the API and the tool are currently married. This is primarily why the Blogger API is ineffective: it reflects the simple functionality of Blogger and was not designed with other tools in mind. I'd suggest that the scope of the API not be limited to blogging tools. This API could perhaps be added to a Wiki tool or other similar tool.

If this API were to have a life away from Joe's tool, the language currently in the API's description would need to reflect the different implementations behind the interface. For instance when POSTing a weblog entry, the current interface documentation reads "Creates a new news item and rebuilds the index.html and index.rss files..." Not all tools need to statically generate (rebuild) pages. For example, Blosxom dynamically generates its pages. Other implementations may not want (or be able) to use the index.html or index.rss names. Another such instance is the template interfaces that assume only two templates are in use -- one for HTML and another for RSS. This of course is not the case for most other tools.

The interface should allow for processing options to be passed to the receiving system. Going back to my experience using MovableType, authors can check options for automatically converting line breaks to HTML and allowing comments or TrackBack pings. The interface would have to allow for such options to be sent and modified.
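One possible shape for this, sketched in Python: POST the entry to a collection URL and carry the processing options as query parameters. Everything here is an assumption made for illustration, including the endpoint, the option names and the choice of query parameters over headers or elements in the body. It is not how RESTLog or MovableType work. The request is built but never sent.

```python
import urllib.parse
import urllib.request


def build_post_request(endpoint, entry_xml, options):
    """Build (but do not send) a POST carrying an entry plus processing options.

    Passing options as query parameters is an assumption made for this sketch;
    the option names used below are hypothetical as well.
    """
    query = urllib.parse.urlencode(
        {name: "1" if enabled else "0" for name, enabled in options.items()}
    )
    return urllib.request.Request(
        endpoint + "?" + query,
        data=entry_xml.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


request = build_post_request(
    "http://example.com/weblog/entries",
    "<item><title>Hello</title><description>First entry.</description></item>",
    {"convert_breaks": True, "allow_comments": True, "allow_pings": False},
)
```

A receiving tool that does not understand an option could ignore it, which keeps the interface usable across tools with different feature sets.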

I think Joe has the right idea leveraging RSS in the interface. However, the interface needs to be more specific by setting very clear constraints and guidelines, particularly when it comes to RSS 2.0. Interoperability and the reliable expectation of elements are issues without additional constraints because the RSS 2.0 format is too loose and ambiguous otherwise. The XSS Profile is an example of such an attempt. Perhaps it could offer some suggestions -- a simple core of mostly required elements augmented by modules via XML namespaces with all prior cruft deprecated.

While a good starting point, I am a bit skeptical that RSS will entirely suffice in this role. If the <description> element is being used to transport the body of a post, where would an excerpt go? The <description> element was originally designed to carry an excerpt of the entry. Something like <content:encoded> could be used instead, but should an element that is core be in a module? A <body> would make sense, but there is no such tag in the RSS format. Deviating from the RSS format by introducing tags specific to the task at hand is, in my opinion, likely to be necessary.
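To show what the <content:encoded> option looks like in practice, here is a minimal Python sketch that builds an item whose <description> carries a short plain-text excerpt and whose <content:encoded> carries the full body. The namespace URI is the one used by the RSS content module. The helper names and the sample entry are hypothetical, and the 500-character limit simply follows the old 0.91 guidance mentioned earlier.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS content module, registered so output uses "content:".
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"
ET.register_namespace("content", CONTENT_NS)


def make_excerpt(plain_text, limit=500):
    """Collapse whitespace and trim the text to at most `limit` characters."""
    collapsed = " ".join(plain_text.split())
    if len(collapsed) <= limit:
        return collapsed
    return collapsed[: limit - 3].rstrip() + "..."


def build_item(title, link, body_html, plain_text):
    """Build an <item> with an excerpt in <description> and the body in a module."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    ET.SubElement(item, "description").text = make_excerpt(plain_text)
    # ElementTree escapes the markup, so the body travels entity-encoded.
    ET.SubElement(item, "{%s}encoded" % CONTENT_NS).text = body_html
    return item


item = build_item(
    "A RESTful Publishing API",
    "http://example.com/2002/11/restful",
    "<p>The full post, markup and all.</p>",
    "The full post, markup and all.",
)
xml_text = ET.tostring(item, encoding="unicode")
```

The sketch makes the awkwardness visible: the one element a publishing interface cannot do without lives in an optional module rather than in the core format.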

I think this effort is a good idea that has been long overdue. I'm enthusiastic to see it advance. Hopefully my comments are helpful in doing so.

Raising the Bar on the RSS Validator.

Sam Ruby links to my O'Reilly RSS article and asks "Perhaps the RSS validator should optionally issue warnings (as opposed to errors) nudging people in the directions of best practices such as the ones that Tim has outlined?"

Needless to say I agree. As I've written in the past, being valid RSS does not guarantee that a feed's content is well-formed enough to be useful to an end user. Thanks to the loose design of the RSS 2 format, which generally has made things worse rather than better, there are many perfectly "legal" but less than neighborly uses. It's why I drafted the XSS profile and it's why I support such warnings being added to the RSS Validator.

Raising the Bar on RSS Feed Quality.

Earlier this week O'Reilly published my latest article "Raising the Bar on RSS Feed Quality." In it I offer recommendations for authoring more useful and effective feeds with an approach that is neutral, practical, and conservative.

This article requires more than just a summary and link though. It was much too involved an effort to not say more.

I am no stranger to publishing, having co-produced my own indie music fanzine for several years that eventually made its way onto the web and into the My Netscape Network that started it all. What spurred my interest and eventually led to this article was a conversation I've highlighted here before.

"You see," lamented Mark Pilgrim, "most RSS Feeds Suck." Mark's comments couldn't have been timelier when first published. I had been experimenting with ways to streamline my intake of weblogs and news using Rael Dornfest's lightweight Perl aggregator blagg. (I now have taken on furthering an RSS feed plugin for MovableType and have other related projects in the works.) While I had achieved a certain level of success, I was surprised and taken aback by the varying quality and inconsistencies of RSS feeds that made my solution less than optimal and at times unreliable. I stopped reading some weblogs because their feeds were too poorly done and simply not useful or worth the hassle. I'm certain I am not the first.

Mark's solution for the technical issues was to develop and publish an "ultra liberal parser" that would allow for common mistakes and other anomalies while processing output.

Joe Gregorio, developer of the Aggie RSS news aggregator, agreed with Mark's assessment, but questioned if Mark's solution was too liberal. "...where is the motivation to fix those feeds?" he asked.

Mark followed with a very noteworthy response: "I. Was. Missing. News." End users don't care about standards. They care about, in this case, getting their news. Developers care about standards because they help developers.

Both viewpoints expressed are valid, important and symbiotic. Without the predictability and structure that standards provide, application developers will struggle to reliably deliver content from feeds to end users.

This exchange only focused on the technical issues of consuming a feed. It does not address other issues that can detract from a feed's utility and effectiveness such as the absence of basic elements or a lack of descriptive and meaningful content that are sadly quite prevalent. (Hence my article.)

Not to sit idly by, Mark along with Sam Ruby and Bill Kearney developed the RSS Validator service that checks RSS feeds for problems and generates friendly and instructive messages for fixing them. The service is optimized for RSS 2.0, but supports other versions of the format. This recent development is significant because it provides a much-needed tool for alerting publishers to issues in their syndication feeds. More work is still needed.

As I mentioned, this article was involved. Not necessarily the article itself, but all of the discussions, debates and projects it led me to. In addition to the mt-rssfeed plugin that I completely rewrote, I got involved in the great RSS "war" back in September. I lobbied for RSS 1.0 reform with a simpler RDF-based format and learned a bit about the shortcomings and merits of RDF. I also drafted a "more sane" RSS 2 profile dubbed Extremely Simple Syndication or XSS when it became apparent that RSS 2.0 was going to be one step forward and two steps back. I also learned about the proper use of XML namespaces in developing my own liberal parser (for the plugin), the shortcomings of the XML::Parser::Lite module (the hard way), using HTTP ETags and Last-Modified and all about proper XML encoding. Currently I have a plugin for MovableType that does proper XML encoding/decoding (UTF8, CDATA...) nearing release and version 1.1 of the mt-rssfeed plugin on the drawing boards.

Now that the article is finally out after nearly 3 months (it's a long story I won't go into), I'm looking forward to finishing some of these projects off -- at least for a while. I'd like to turn my attention to something a bit different since I'm a technology generalist.

It's been a long strange trip.

Analyze This.

Mark Pilgrim has posted an amusing analysis of Dave Winer's comments on REST vs. SOAP. This reminds me that I have a few posts to make, but it's Friday and I need a break. Stay tuned; more to come.

My friend Pete (who doesn't have a weblog) sent me a link to this well-done editorial by David Pogue entitled "Profit and Innovation at Microsoft." (Free registration required.) Pogue's editorial succinctly and rationally gets to the crux of the matter. Having been on the receiving end of Microsoft's "innovation" more times than I care to remember, I can appreciate this viewpoint.

Why does Microsoft bother me so? Because, in my view, its success relies primarily on this unique "you're our customer whether you like it or not" arrangement. If Microsoft won through the superiority of its products or the brilliance of its new ideas, I wouldn't resent its dominance one bit. (You go, Sony!)

But that's not very likely to happen. Beyond Windows and Office, when has Microsoft become the dominant player in a market it covets? It's either a distant second-place player or a complete loser in palmtops, digital music formats, online services, set top boxes, game consoles, phones and other areas it has set out to conquer, no matter how many hundreds of millions of dollars it spends. If Microsoft were truly the quality-driven innovator it claims to be, surely it would have claimed the #1 spot in some of these other categories.

Instead, according to an article this week in The Financial Times, the numbers tell the real story: Microsoft's Xbox game division lost $177 million last quarter, its MSN online service lost $97 million, its application-software division lost $68 million and its palmtop division lost $33 million. The only profits at Microsoft, in fact, came from its Windows monopoly money: $2.84 billion. (If there's any doubt that Microsoft is abusing its monopoly, that's an 85 percent profit margin.)

RDF Follow-Up.

The recent "what's wrong with RDF?" discussion has been highly enlightening to watch from the sidelines. It clarified some of the issues RDF has yet to address adequately and put other aspects into perspective for me.

Over at O'Reilly, Simon St. Laurent follows my summary with "What's right with RDF" where he writes "RDF is excellent at addressing a particular set of problems. The Resource Description Framework's primary approach is description. XML often presents something (a document, a table) directly; RDF more typically presents a description of something, not the thing itself. For some applications - like metadata and ontology development - this approach fits beautifully with the problem set." He continues "if RDF fits your problem set, run with it. If it doesn't fit, fight it - for that problem set. We're not all going to be happy all of the time, but RDF's strengths should not be forgotten in the arguing."

Also over at O'Reilly, Kendall Grant Clark reviews Tim Bray's proposal for an alternate RDF/XML serialization, called RPV, that is unambiguous and highly human-readable. The goal of RPV is not to support the full RDF specification, but rather the most common elements. Bray explained, "What RDF needs is the equivalent of XML, a brutal reduction (at least at the syntax level) that hits 80/20 points and anybody can figure out in 15 minutes by looking at it." He continued "I'm not saying RPV is the way to go, it's just a challenge: it proves that you there is a way to encode resource/property/value triples in XML that is human-readable and human-writeable."

Shelley Powers makes some final clarifications on her comments during the debate.

...I do not discount the complexity and difficulty inherent with RDF. I am aware, all too aware, of how complex the RDF Model documents can be. I know that there is much of the lab and not enough of the real world associated with the effort. And I'm not trying to dismiss people's concerns with the model or the RDF/XML serialization when I say that we need to release the RDF specification rather than start over.

When I say that I don't have problems with the RDF/XML, people should be aware that this is because I spent an enormous amount of time with the RDF specifications learning the core of the RDF model. I then spent a considerable amount of time learning how RDF is serialized with RDF/XML. I will now spend a significant amount of time reading through the newly released specifications to see where my understanding differs from the newest releases.

All of this has taken time and effort. I do not deny this.

I sincerely appreciate Shelley's honesty, effort, patience and passion in this recent debate. My personal encounters with RDF advocates/experts have been lacking and generally unsatisfactory. They've lacked a sense of clarity or an acknowledgment of the realities of the "real world." Shelley's comments provided the perspective and sense of mutual understanding I wish most of the RDF community would exercise. (I don't hold this against RDF though.)

Shelley writes "...I'm not speaking for the RDF Working Group, in any way. I am giving my own viewpoints and opinions, which the WG may not agree with. No one can speak for the WG members, but they, themselves."

I wish Shelley did speak for the RDF working group. I found it rather odd, almost disconcerting, during the whole affair that few (any?) members of the RDF working group got involved. I would feel even better about this recent discussion hearing their viewpoints and knowing they are listening and have taken heed.

Commenting on Shelley's post Simon St. Laurent writes:

Thanks, Shelley.

You've brought some really difficult issues to a much broader group of people than the usual suspects, and I'm hoping that we'll see some interesting results over the next few months as people think about the questions you and other participants in the discussion have raised.

I doubt we'll all be living in peace and harmony by then, but we might at least have a better perspective on what we all see going on with these technologies.

I agree.

Fighting Spam with Digital Identities.

Kevin Werbach's "Death by Spam" has been the talk of the net lately. Werbach predicts the end of email as we know it (pervasive, flexible, universal connectivity) as the spam problem continues to worsen. Werbach concludes "Like it or not, the only way to kill spam is for an element of e-mail to die as well."

Jon Udell offers an intriguing thought (again) for fighting spam that Werbach does not cover. Instead of "whitelists", the equivalent of Instant Messaging buddy lists, Udell proposes that the use of digital identities could help filter email into two piles. Those who have asserted their identity go into one pile (mail you want to read) and those who have not go into another (mail you probably don't want to read). Digital certificates are better than whitelists in that they facilitate "trusted communication without prearrangement."
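A very rough Python sketch of the two-pile idea follows. It only checks whether a message claims to be signed, meaning its top-level content type is multipart/signed, the container used by S/MIME and PGP/MIME. It does not verify the signature or the certificate behind it, which a real filter would have to do. The sample messages and the function name are made up.

```python
from email import message_from_string

# Two made-up messages: one in a multipart/signed container, one plain.
SIGNED = (
    "From: alice@example.com\n"
    "Subject: Lunch?\n"
    'Content-Type: multipart/signed; protocol="application/pkcs7-signature";'
    ' boundary="b1"\n'
    "\n"
    "--b1\n"
    "Content-Type: text/plain\n"
    "\n"
    "Noon works for me.\n"
    "--b1\n"
    "Content-Type: application/pkcs7-signature\n"
    "\n"
    "(signature bytes would go here)\n"
    "--b1--\n"
)
UNSIGNED = "From: stranger@example.net\nSubject: Act now\n\nLimited time offer.\n"


def sort_into_piles(raw_messages):
    """Split messages into (claims a signature, does not).

    This is only a first-pass sort: the signature itself is NOT verified here.
    """
    signed, unsigned = [], []
    for raw in raw_messages:
        message = message_from_string(raw)
        if message.get_content_type() == "multipart/signed":
            signed.append(message)
        else:
            unsigned.append(message)
    return signed, unsigned


probably_wanted, probably_unwanted = sort_into_piles([SIGNED, UNSIGNED])
```

Unlike a whitelist, nothing in this sort depends on having heard from the sender before.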

Every user suffers from and understands this plague. Blocking spam could finally incent the masses to use digital identities. There are other issues that need to be addressed. Udell goes on to point out that users still need to "jump through the hoops that now complicate the acquisition of a digital ID -- or to spur vendors to simplify that process. I've often wondered what it would take to get us over the activation threshold."

I wonder too. Personally I do not use a digital identity though I realize I should. It is a weak excuse, but I'm aware acquiring a digital ID is burdensome and simply haven't had the time or energy to take that on. I'm very reliant on email and I'm growing increasingly sick of spam. Perhaps it has come time to take this on and do my part to set an example.

What's Wrong With RDF?

"What puzzles and confuses me is why there is so much animosity towards RDF" writes Shelley Powers, author of O'Reilly's upcoming book on RDF.

Shelley's post was made in response to Tim Bray's attempt to implement an RDF model into the RDDL specification that ultimately led to his recommendation to use XLink instead. Bray's comments were picked up throughout the community, unleashing a torrent of criticism and "animosity" directed at RDF. Jonathan Borden summarizes the significance of Bray's comments when he wrote, "this is the crux of the problem. If Tim Bray can't do RDDL/RDF using his little toe, with his hand tied behind his back and the rest of him hog tied and upside down, then what prayer to we have trying to foist this upon the rest of the world, i.e. people who just want to create and document XML namespaces?"

Shelley Powers' post to the xml-dev list touched off a heated discussion late last week that continued across mailing lists and weblogs through the weekend. In this weblog post I will attempt to highlight and summarize this conversation. I have ordered the comments so that they make sense and follow a roughly, though not entirely, chronological order. I have attempted to compensate for the distributed and parallel nature of the conversation in order to maintain some semblance of its flow.

"I am particularly unhappy because of Tim Bray's involvement in all of this," wrote Shelley Powers on her weblog. "There's an implication and an assumption made that because Tim Bray 'invented' XML, he's qualified to be a definitive judge of RDF and RDF/XML. However, the two efforts are not the same: XML deals with meta-language, RDF with meta-data. Tim has a right to his opinion, and I don't fault him for it though I don't have a tremendous amount of respect for his half-hearted and rather dubious effort to use RDF/XML to model RDDL."

(Jonathan Borden subsequently posted a human-readable, RDF-compliant RDDL format to demonstrate that one could be created with an RDF model.)

Shelley offered some advice to anyone put off by RDF: "If you don't understand it, and don't want to take the time to understand it, or don't feel it will buy you anything, or hate the acronym, or you're in a general bitchy mood that's easily triggered if someone uses "Semantic" in the same sentence that contains "Web", the solution is simple: don't use it. Don't use it. Don't study it, look at it, listen about it, work with it, sleep with it, or generally go out and dance late at night with it."

She also notes "However, you may feel about RDF, the spec, or RDF/XML, the serialization, I would hope that you all remember one thing: in the last few days, the RDF Working Group has released not one, not two, but six new working drafts. Six. That's a hell of a lot of work."

(See this post for more on these latest RDF drafts from the W3C's Semantic Web Activity.)

Simon St. Laurent writes "I have a lot of respect for certain RDF applications that appear to be working, a general lack of interest in describing the world as graphs, and a serious distaste for RDF syntax. I genuinely resent what I see as the unfortunate influence of RDF on XML's post-1.0 development and the URI-centric viewpoint it has foisted on XML."

Simon later went on to say "RDF is powerful stuff, great for those who want to use it. Just keep it off _my_ dance floor, please."

Tim Bray responded "I'd go further. I think the current RDF/XML syntax is so B.A.D. (broken as designed) that it has seriously got in the way of people being open-minded about RDF. I'm baffled why the RDF working group has been forbidden to work on replacing that syntax."

In response Shelley Powers posted "because, Tim, there are implementations of RDF/XML as described, including Mozilla and RSS 1.0. I know you don't approve of them, but they are real, they are production, they are in use. Bitch about them as much as you want, but people use them."

On the comment board to Shelley's weblog Mark Pilgrim offered his take. "The fundamental flaw of the overzealous RDF advocates is the implicit assumption that "because I want to work with this data as RDF, it must be produced and stored natively as RDF." This is demonstrably false, and is what people are objecting to when they talk about "the RDF tax"."

Joe Gregorio published similar sentiments, "...my animosity comes from a push by possibly overzealous RDF proponents to change every format they come in contact with into valid RDF serialized as XML. I can point to RSS 1.0 and now the abortive RDDL as RDF attempt as failures of that strategy. On the other hand I can point to the use of RDF in Mozilla as a successful strategy of *leaving the native format alone* but still getting the benefits of RDF, as I pointed out Wednesday."

(Gregorio later published that Mozilla's use of RDF is smaller than first believed according to an OSAF mailing list thread.)

Gregorio continues, "I think a healthy dose of skepticism and a critical eye turned on it by people outside of the usual circle would be helpful to RDF, and it certainly couldn't hurt the XML serialization."

Elsewhere Tim Bray offered: "<famous-anecdote>Stuart Feldman, the Bell Labs guy who invented "make", woke up one morning a few weeks after he'd released it, and realized that the syntax basically sucked - all those tabs and colons and weird continuation rules. He started working on something better and was shot down because someone said "Stuart, there are *dozens* of people using this, it's too late to change it."</famous-anecdote> I think the number of people who are now using RDF is comparable, in relation to the number of people who need something like RDF, to the couple of dozen make users in 1970-something. It is *not* too late to fix the RDF syntax, it just takes some courage and initiative."

Shelley Powers responded:

"Yeah, but who is to say that [Stuart Feldman's] new approach would have been better? We can work and work and work a spec until we're blue in the face and not find a perfect solution. People learn to work the situation, or they learn to automate it -- i.e. autoconf, automake, and libtool.

Tim, we need the [workgroup] to finish. We have been waiting over a year for them to finish. We need something stable that we can work with. We do NOT need to start all over again. I would pack it in at that point. I really would."

Responding to a comparison of RDF now to HTML in the early days of the Web, Bray wrote, "HTML was by no means "bad". It was exactly what the world needed, and millions of people started using it because they liked it and because they could do "view source" and figure it out. My gripe with RDF/XML is precisely that it's failing to learn this lesson from HTML's success. Thus not enough people are using it, even though it's arguably what they need."

Shelley Powers notes, "the RDF Working Group was given a charter not to rewrite RDF/XML but to answer issues and provide as much cleanup and clarification as they could but to still remain within that support for previous implementations. It's sad that one can't just throw things out and start over again, but that's the way of the real world."

To that Bray responds "No it's not and yes you can, and you should."

Elsewhere Mark Pilgrim wrote in response to similar comments by Shelley, "you're hurting yourself more than anyone else by defending the status quo. You have a lot invested in RDF (the theory), and it'll all go to hell. The rest of the world will remain blissfully unaware that there was this great idea here, buried under mounds and mounds of incomprehensible angle brackets."

Tim Bray also wrote "The proponents of RDF (including myself) say that RDF's value add is that it allows the efficient interchange and manipulation of [Resource, Property, Value] triples. I happen to believe this propaganda and I also believe that one of the obstacles is the human-incomprehensible syntax. If you believe that RDF/XML's current syntax is not a problem please continue with your project of trying to sell it to the world, but it feels to me you're trying to accomplish a good thing with one hand tied behind your back."

(Mark Pilgrim offers a personal account of his attempts and frustrations with RDF here. I don't quote it here since the entire post is worth a read as a first-hand account of the issues being discussed throughout this discussion.)

In addressing the XML serialization of RDF, Danny Ayers offers, "probably the primary cause of the ugliness of RDF/XML is the mismatch between the tree model of XML and the graph model of RDF. To explicitly represent a graph in XML the syntax will start getting ugly whatever you do. This is a weakness of XML, not RDF."

In a post to the xml-dev mailing list Shelley Powers wrote "I'll be honest, I don't care about the human readable/writeable aspects of RDF/XML as much as I do care that there are tools and APIs that manage it all for me. Sorry -- but I just don't think that is the most important aspect of either XML or RDF/XML. Again, IMHO."

To which Sean McGrath replied:

"I'm afraid, I take a diametrically opposite view. Things should be as complex as necessary but not more complex.

Punting to tools and APIs to salvage mankind from complexity of its own making is one of the main reason this industry is constantly battling the alligators rather than clearing out the swamps."

Elsewhere Jack William Bell echoed the same sentiment. "I have a problem with [an (easily) human readable format not being necessary]. If you don't care about being able to read it easily, why not use a binary format of some kind in the first place and reduce the bandwidth footprint?"

Tim Bray writes "I guess where Shelley and I would agree to disagree is that she doesn't think that easy human-readability is very important in the data formats she uses, and I think it's terribly, terribly important; I think one of the central lessons of the Web is that enabling people to do a "View Source" and roll their own based on what they see is, well... there's nothing more important."

Shelley Powers explained "RDF/XML is a mapping of that model to XML -- a mapping that's not necessarily easy or uncomplicated. XML was picked because XML is the prime metalanguage format used in many intra-mechanical transitions, such as forming the messages and providing the framework for something such as SOAP. It wasn't necessarily picked because XML is human readable, though we hope that would be a side benefit."

Tim Bray writes:

"for the record, I did *not* invent XML, I was a member of a [workgroup] of 11 people supported by an interest group of another hundred or so who subsetted an existing standard called SGML whose position was spookily similar to where RDF is today: it's important, some smart people are using it to do some big things, but it has no grass-roots uptake.

Turns out that some of the things you could do with SGML you can't do with XML, and some of them are awfully handy, but in the end it turned out that the complexity cost for doing them pushed the cost/benefit ratio into really lousy territory. Hmm, there's an echo in here."

In response to a post on why RDF is hard, Simon St. Laurent wrote "I don't think the RDF community has ever really understood that what they do is genuinely difficult for most people. The RDF community seems very self-selecting to me - those who can cope with RDF like it, and the rest of us keep our distance. I'm not sure it's ever been clear to people who find RDF intuitive why so many people bounce off of it completely, and I'm not convinced that it's possible to explain that to someone who genuinely likes RDF."

Shelley Powers replied "No one is forcing anyone to use RDF. This isn't a dismissal -- this was meant to be a reassurance."

Bluetooth: Teething Pains or Cavities?

In response to the Bluetooth article "Teething Pains" published by the Boston Globe on November 11th, Bob Frankston provides some notable insights and criticism of the wireless networking technology's design.

We should learn from the example of X.400. X.400 was (is?) a mail protocol approved and required by essentially all the telecommunication agencies throughout the world. It was designed over a period of ten years yet failed against SMTP (Simple Mail Transport Protocol) which could be implemented in an afternoon. Like x.400, the Bluetooth was designed and promulgated before anyone could learn from the first generation. Bluetooth is designed to work in the specific cases imagined by its designers and thus will perform very well in precisely those scenarios and these are the scenarios touted in press releases.

Frankston continues...

Bluetooth is in the mainstream of the old model of telecommunications in which all the services are defined by the center and every new capability must be approved before it can be deployed and thus before we even understand it. 802.11 is simply a transport for packets and doesn't stand in the way of creating new capabilities.

Once again we face a familiar paradox. Bluetooth which defines so much of the solution is thus limited to what it defines and that is very little and it only works among a few nearby devices. 802.11 which makes few promises inherits the existing richness of the Internet Protocols and has no such limits of distance.

RSS 1.0's Deeper Value?

Bill Kearney writes, referring to Edd Dumbill's article Addressing RSS's logical model:

Basically what Edd's saying is that there could be more than one channel in a feed document and that the items could be shared among those different channels. This as opposed to sending out a separate feed for each channel and duplicating items all over the place. The channel itself has a sequence container that indicates not only what order to use but what items to use. That rdf:Seq container is telling you what items to use in the channel.

So, once again, the strange way RSS-1.0 appears to be doing things has a much deeper value.
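
To make the arrangement Kearney describes concrete, here is a rough sketch of resolving one channel's rdf:Seq against a shared pool of items. The two-channel document, the example.org URIs, and the channel_titles helper are all made up for illustration; the sketch also assumes the items are referenced through an rdf:resource attribute on each rdf:li.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"

# Hypothetical document: two channels sharing one pool of items.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/all">
    <title>All posts</title>
    <items><rdf:Seq>
      <rdf:li rdf:resource="http://example.org/2"/>
      <rdf:li rdf:resource="http://example.org/1"/>
    </rdf:Seq></items>
  </channel>
  <channel rdf:about="http://example.org/rss">
    <title>RSS category</title>
    <items><rdf:Seq>
      <rdf:li rdf:resource="http://example.org/2"/>
    </rdf:Seq></items>
  </channel>
  <item rdf:about="http://example.org/1"><title>First post</title></item>
  <item rdf:about="http://example.org/2"><title>Second post</title></item>
</rdf:RDF>"""


def channel_titles(feed_xml, channel_uri):
    """Return item titles for one channel, in the order its rdf:Seq lists them."""
    root = ET.fromstring(feed_xml)
    # Index every item in the document by its rdf:about URI.
    items = {
        item.get(f"{{{RDF}}}about"): item
        for item in root.findall(f"{{{RSS}}}item")
    }
    for channel in root.findall(f"{{{RSS}}}channel"):
        if channel.get(f"{{{RDF}}}about") != channel_uri:
            continue
        refs = channel.findall(f"{{{RSS}}}items/{{{RDF}}}Seq/{{{RDF}}}li")
        return [
            items[li.get(f"{{{RDF}}}resource")].findtext(f"{{{RSS}}}title")
            for li in refs
        ]
    return []  # no channel with that URI in this document
```

Calling channel_titles(FEED, "http://example.org/rss") picks out just the one item that channel lists, but only after the whole document, every channel and every item, has been fetched and parsed.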

While in theory RSS 1.0 may be doing something of deeper value, that value is generally not feasible to realize in practice. I raised this question some time ago and never got a satisfactory reply that justified the need for this "value."

Recent threads have indicated that RSS bandwidth consumption, whether centralized or decentralized, is becoming a major issue. Part of this can be solved by designing smarter aggregators, which has begun to happen. The other part rests on publishers developing feeds that conserve bandwidth while still remaining useful to their target audience.

In certain circumstances having one item in multiple channels may help conserve bandwidth, but overall I believe it would not. In fact, if put into practice this would enlarge RSS feed files even further and make it more difficult for consumers to conserve bandwidth. Suppose I were to use RSS 1.0 to create one "uber" RSS file with channels for my latest posts overall and in each category. If a consumer wanted the items in one channel, they would have to download all the items and channels, then resolve my channel's <Seq> list against the items. Perhaps the items in the channel of interest have not been updated while an item in another channel has, thereby changing the file's modified-on timestamp -- more wasted bandwidth. Since I use a tool that generates my RSS feeds, as do most others, creating separate files for each channel is trivial. This approach can conserve bandwidth because each file contains only the information relevant to its channel. (This is of the utmost importance when the end user is in a low-bandwidth, resource-constrained environment such as a mobile phone.) It also allows end users to subscribe only to the channels that interest them, each by a unique URI. (Multiple channels in one file/URI is not very RESTful. RESTonians have railed against SOAP's one-URI, many-methods approach. How is this different?) It's also more straightforward to process and allows the HTTP Last-Modified and ETag headers to be reliable and accurate.
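
A toy simulation of the point about HTTP validators, with no network involved. Everything here is an assumption for illustration: the make_etag and fetch helpers are hypothetical, the feed bodies are stand-ins, and deriving an ETag from a SHA-1 of the body is just one way a server might do it. The fetch function mimics a conditional GET: if the ETag the aggregator cached still matches, the server answers 304 and sends no body.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the feed body (an assumed server policy)."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def fetch(cached_etag: str, current_body: bytes):
    """Simulate a conditional GET; returns (status, bytes_transferred)."""
    if cached_etag == make_etag(current_body):
        return 304, 0  # Not Modified: nothing is re-sent
    return 200, len(current_body)  # changed: the full file is re-sent


# Stand-in feed bodies for two categories.
rss_v1 = b"<item>RSS post</item>"
bt_v1 = b"<item>Bluetooth post</item>"
bt_v2 = bt_v1 + b"<item>New Bluetooth post</item>"  # a new Bluetooth entry

# Separate files: a reader of the RSS category is untouched by the new post.
status_separate, sent_separate = fetch(make_etag(rss_v1), rss_v1)

# One combined file: the same reader must download everything again.
combined_v1 = rss_v1 + bt_v1
combined_v2 = rss_v1 + bt_v2
status_combined, sent_combined = fetch(make_etag(combined_v1), combined_v2)
```

With separate files the subscriber to the unchanged category gets a 304 and transfers nothing; with the combined file any change in any channel invalidates the validator for everyone.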

I'm of the mind that the <rss> and <channel> tags should get folded together into one tag -- preferably just <channel>. Assuming the window has not already closed, I'm also of the mind that rather than try to justify its design decisions with theoretical idealism, the RSS 1.0 working group should spend its time modifying the specification to be more useful and practical -- or perhaps they should be the ones to take up knitting.

Cory Doctorow at Boing Boing announced that Clay Shirky will be their latest guest blogger.

I'm looking forward to this. Clay is one of those people I wish would publish their own weblog. BEA's Adam Bosworth is another. I guess I'll just have to settle for a sampling of my wish.

Un-Neighborly RSS.

[From my post to the RSS2-Support list.]

I've noticed recently that a number of feeds I follow have not been working properly, in that my aggregator doesn't generate a hyperlink to the entry. I took a look at these feeds and noted they are all RSS 2.0 feeds generated by Radio. Furthermore, there is a <guid> element, but no <link> tag. The <guid> is in fact a valid URL that points to the content, so all that has happened is a renaming of a tag.

What was the point of renaming the <link> element? I know the "spec" says this is legal, but it seems silly given that, in Sam Ruby's quick survey of RSS tag usage not long ago, nearly 88% of feeds provided an item <link>. It's breaking interoperability that the user base has defined, and for what? In order for me to read these feeds the way I like, I have to modify or upgrade my software? That doesn't seem very neighborly at all.
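
For what it's worth, the change an aggregator needs is small. Below is a minimal sketch of the fallback, assuming the RSS 2.0 rule that a <guid> is a permalink unless it carries isPermaLink="false". The item_url helper and the sample feed are hypothetical, not taken from any particular aggregator.

```python
import xml.etree.ElementTree as ET


def item_url(item):
    """Pick a URL for an RSS 2.0 item: prefer <link>, else a permalink <guid>."""
    link = item.findtext("link")
    if link and link.strip():
        return link.strip()
    guid = item.find("guid")
    # isPermaLink is treated as true when the attribute is absent.
    if guid is not None and guid.text and guid.get("isPermaLink", "true") != "false":
        return guid.text.strip()
    return None  # nothing usable as a hyperlink


# Made-up feed covering the three cases.
SAMPLE = """<rss version="2.0"><channel><title>Example</title>
<item><title>A</title><link>http://example.org/a</link></item>
<item><title>B</title><guid>http://example.org/b</guid></item>
<item><title>C</title><guid isPermaLink="false">tag:example.org,2002:c</guid></item>
</channel></rss>"""

urls = [item_url(item) for item in ET.fromstring(SAMPLE).iter("item")]
```

Item B, the guid-only case from the feeds above, resolves to its <guid> URL, while item C has no hyperlink at all because its <guid> is explicitly not a permalink.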

About this Archive

This page is an archive of entries from November 2002 listed from newest to oldest.
