Producing valid RSS from my XSL Transform

In a previous post, I showed the XSL transform I designed to transform a simple XML file I use into “(almost) valid RSS.” I called it (almost) valid because of one problem, which I address here.

The problem reviewed

My original XSL assigned an arbitrary ID to each entry in the news file based on its order in the list. What this meant was that the first item received the integer 1 for an ID, the second 2, the third 3, etc. Here it is:

<!--old method, produces invalid link & guid-->
<link><xsl:text>http://www.duncanandmeg.org/news.php#</xsl:text>
  <xsl:value-of select="position()" /></link>
<guid><xsl:text>http://www.duncanandmeg.org/news.php#</xsl:text>
  <xsl:value-of select="position()" /></guid>

This was fine for my original purposes, but a position number is hardly a unique and permanent identifier for a post, and that caused problems once it was echoed into the RSS <guid> and <link> elements. The <guid> element in particular loses its meaning under this approach. Every time I added an item to the list (see the previous post for an explanation of my schema), the new item was assigned the ID 1, and the older posts were renumbered 2, 3, 4, 5, and so on.

Because the new item reused a guid that had previously belonged to an older post, most RSS aggregators would not detect that a new post had appeared and would not update accordingly.
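To make the failure concrete, here is a rough sketch of the relevant part of the feed before and after one new post. The headlines are made up for illustration; only the URL pattern comes from the stylesheet above.

```xml
<!-- Fragment of the feed BEFORE adding a post (hypothetical headlines) -->
<item>
  <title>Garden update</title>
  <guid>http://www.duncanandmeg.org/news.php#1</guid>
</item>
<item>
  <title>New photos</title>
  <guid>http://www.duncanandmeg.org/news.php#2</guid>
</item>

<!-- Fragment of the feed AFTER adding "Trip report" as the first entry -->
<item>
  <title>Trip report</title>
  <!-- reuses the guid that used to belong to "Garden update" -->
  <guid>http://www.duncanandmeg.org/news.php#1</guid>
</item>
<item>
  <title>Garden update</title>
  <guid>http://www.duncanandmeg.org/news.php#2</guid>
</item>
<item>
  <title>New photos</title>
  <!-- the only guid the aggregator has not seen, attached to an old post -->
  <guid>http://www.duncanandmeg.org/news.php#3</guid>
</item>
```

An aggregator that tracks items by guid already knows #1 and #2, so the genuinely new post looks like something it has seen before.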

Choosing a unique ID schema

Since I could no longer use my simple numbering scheme based on position in the original XML file, I had to come up with some other identifier. I learned about the Tag URI scheme, but decided that it was more complex than I really needed for this application, and the schema of my original XML file didn’t lend itself to producing tag URIs anyway.

Since I don’t update the entries in this file frequently, I invented a simple method that produces reasonably unique identifiers from each entry’s data. All my new IDs consist of two parts taken from the data in my XML file:

  1. The text from the entry’s title preceding the first space (" ") character
    (If the title consists of only one word, it contains no space, so this part will be empty and the ID will consist of the date alone. This is a weakness that would be more important if updates were frequent)
  2. The date of the entry encoded in the format mm-dd-yy
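As a worked example of the scheme, take a made-up entry. The headline and date values are invented, the <entry> wrapper name is an assumption, and I am assuming the date is stored with slashes as mm/dd/yy; `headline` and `date` are the element names the stylesheet reads.

```xml
<!-- Hypothetical source entry; the <entry> wrapper name is an assumption -->
<entry>
  <headline>Vacation photos posted</headline>
  <date>06/14/05</date>
</entry>

<!-- Part 1, text before the first space:    Vacation -->
<!-- Part 2, date with "/" replaced by "-":  06-14-05 -->
<!-- Resulting ID (parts joined directly):   Vacation06-14-05 -->
<!-- Resulting link: http://www.duncanandmeg.org/news.php#Vacation06-14-05 -->
```

Two entries would collide only if they shared both a first word and a date, which is unlikely at my posting rate.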

Once I settled on my scheme, producing it in XSL turned out to be fairly simple.

XSL transform

I’m including only the portion of the XSL transform that deals with the entry <link> and <guid> here. The rest of it is in the previous post.

<item>
  <title><xsl:value-of select="headline" /></title>

  <!--unique id created and stored in XSL variable-->
  <xsl:variable name="bookmark">

    <!--find word before first space-->
    <xsl:value-of select="substring-before(headline, ' ')" />
    <!--produce date in format mm-dd-yy-->
    <xsl:value-of select="translate(date, '/', '-')" />

  </xsl:variable>

  <!--variable placed in link and guid elements-->
  <link><xsl:text>http://www.duncanandmeg.org/news.php#</xsl:text>
    <xsl:value-of select="$bookmark" /></link>
  <guid><xsl:text>http://www.duncanandmeg.org/news.php#</xsl:text>
    <xsl:value-of select="$bookmark" /></guid>

  <!-- continue with transform, see original post -->

</item>
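The one-word-title weakness could probably be closed with a small change to the variable. What follows is only a sketch of one possible approach, not what the stylesheet above does: it uses the standard XPath 1.0 `contains()` function to fall back to the whole headline when there is no space to split on.

```xml
<!-- Sketch of an alternative "bookmark" variable (not the version in use) -->
<xsl:variable name="bookmark">
  <xsl:choose>
    <!-- multi-word headline: keep the text before the first space -->
    <xsl:when test="contains(headline, ' ')">
      <xsl:value-of select="substring-before(headline, ' ')" />
    </xsl:when>
    <!-- one-word headline: substring-before would return an empty
         string, so use the whole headline instead -->
    <xsl:otherwise>
      <xsl:value-of select="headline" />
    </xsl:otherwise>
  </xsl:choose>
  <!-- date part is unchanged: "/" becomes "-" -->
  <xsl:value-of select="translate(date, '/', '-')" />
</xsl:variable>
```

The rest of the <item> template would stay the same, since <link> and <guid> only read `$bookmark`.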

With a reasonably unique guid, most RSS readers now update properly when I post new items.
