How to use XSLT to transform XML

A client came to us with an interesting request: he wanted to transform XML using XSLT in the Translator. What’s more, he had already written a large amount of the conversion logic in the XSLT language, and he wanted to avoid having to rewrite it all in Lua.

Fortunately, the Translator is flexible enough to handle cases like this. All we needed was an external program that can process the XSLT.

Our solution is demonstrated in the example code below. In this example, we use the command-line tool msxsl.exe as a wrapper around Microsoft’s MSXML library. We chose this program because the client in question was running Iguana on a Windows platform. That said, there are several other XSLT processors for a variety of platforms that can be invoked externally. Here is a list of potential programs: http://en.wikipedia.org/wiki/Xslt#Processor_implementations.

local XML_TEMPLATE = [[
<?xml version="1.0"?>
<patients>
  <patient id="12345678">
    <first-name>John</first-name>
    <last-name>Smith</last-name>
  </patient>
  <patient id="42759178">
    <first-name>Sarah</first-name>
    <last-name>Campbell</last-name>
  </patient>
</patients>
]]

-- Note: If your chosen XSLT processor is not accessible through your PATH
-- and is not located in the Iguana installation directory then this variable
-- will need to contain the absolute filepath to the executable for Iguana to
-- find it.
local XSLT_PROCESSOR = "msxsl.exe"

local function addQuotes(Str)
   return '"' .. Str .. '"'
end

function main(Data)
   -- Use the XML template as input data if no other sample data
   -- has been provided.
   if Data == "" and iguana.isTest() then
      Data = XML_TEMPLATE
   end

   trace(Data)

   -- Load the input data into a temporary file to be used by the
   -- command-line tool.
   local TempFilename = os.tmpname()
   local TempFilehandle, Msg = io.open(TempFilename, "w")
   assert(TempFilehandle ~= nil, Msg)
   TempFilehandle:write(Data)
   TempFilehandle:close()

   -- Retrieve the filepath to the XSLT file containing the transformations
   -- to perform.
   local TransformFilename = "transform.xsl"

   local ProjectFilepaths = iguana.project.files()
   local TransformFilepath = ProjectFilepaths["other/" .. TransformFilename]

   -- Error checking on the filepath retrieved.
   assert(TransformFilepath ~= nil, "Could not locate the file " ..
      TransformFilename .. ".")

   -- Now we can perform the transformation with the two files as input.
   -- The filenames are surrounded by quotes in case they contain any spaces.
   -- Also, we redirect stderr to stdout so that any error message printed by
   -- the program is captured along with its output.
   local Command = table.concat({XSLT_PROCESSOR, addQuotes(TempFilename),
         addQuotes(TransformFilepath), "2>&1"}, " ")
   local Prog = io.popen(Command)
   local Output = Prog:read("*a")
   Prog:close()

   -- Remove the temporary file, since it's no longer needed.
   local Result, Msg = os.remove(TempFilename)
   assert(Result ~= nil, Msg)

   -- At this point we can do something with the transformed data, like send
   -- it to the Iguana queue.
   queue.push{data=Output}
end
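
Because the command ends with "2>&1", anything the processor prints to stderr ends up in Output and would be pushed to the queue as if it were transformed data. The following is a minimal sketch of a sanity check that could run before queue.push. It assumes that a successful transformation always produces output beginning with "<" (optionally preceded by a UTF-8 byte order mark or whitespace). The function name looksLikeXml and the sample strings are hypothetical; in particular, the error text shown is made up and is not msxsl.exe's actual wording.

```lua
-- Hypothetical helper: returns true when the captured output appears to
-- start with an XML tag or declaration, false otherwise.
local function looksLikeXml(Output)
   -- Drop a leading UTF-8 byte order mark (bytes EF BB BF), if present.
   local Body = Output:gsub("^\239\187\191", "")
   -- Skip leading whitespace and require "<" as the first real character.
   return Body:match("^%s*<") ~= nil
end

-- Made-up sample strings, for illustration only.
print(looksLikeXml('<?xml version="1.0" encoding="UTF-8"?><root/>'))
print(looksLikeXml("Error: the stylesheet could not be loaded."))
```

In the script, a call such as assert(looksLikeXml(Output), Output) placed before queue.push would stop the message and surface the processor's error text instead of queuing it.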

Here are the contents of the “transform.xsl” file referenced by the script:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <xsl:template match="patients">
    <root>
      <xsl:apply-templates select="patient"/>
    </root>
  </xsl:template>

  <xsl:template match="patient">
    <name id="{@id}">
      <xsl:value-of select="first-name"/><xsl:text> </xsl:text><xsl:value-of select="last-name"/>
    </name>
  </xsl:template>

</xsl:stylesheet>

Note: For the script to work correctly, this file should be located in the “other” folder referenced by your project.

Additional Notes

Here are some additional notes about the project:

  • The XML template defined at the top of the main script is for illustration only. In a Filter or To Translator component, you can remove it once you have loaded some appropriate sample data. In a From Translator component, you can remove it once you have set up a data source for your script’s execution.
  • Similarly, the file “transform.xsl” in the project’s “other” folder is only an example. This can be removed or replaced once you have added your own XSLT file to the project.
  • This script was written to work with Iguana 5.5.1. In earlier versions of Iguana, files is a plain field rather than a function, so line 46 of the script will need to access the project files without calling it: iguana.project.files instead of iguana.project.files().
  • For the example to work correctly on your machine, your chosen XSLT processor may need to be in the Iguana installation directory. This is the working directory used by the Translator when it executes an external program. If the program is outside this directory and is not accessible through your machine’s PATH environment variable, the execution will likely fail. Alternatively, you can provide the absolute file path to the program when the script calls io.popen.
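
The version difference described above can be handled in one place. Here is a minimal sketch, assuming only what the note states: that iguana.project.files is a function from 5.5.1 onward and a plain table before that. The helper name resolveProjectFiles and the stub values are hypothetical and exist only so the snippet can run outside Iguana.

```lua
-- Hypothetical helper: accepts the value of iguana.project.files and
-- returns the table of project file paths, whichever form it takes.
local function resolveProjectFiles(FilesField)
   if type(FilesField) == "function" then
      return FilesField()
   end
   return FilesField
end

-- Stubs standing in for the two forms of iguana.project.files.
local AsTable = { ["other/transform.xsl"] = "/hypothetical/path/transform.xsl" }
local AsFunction = function() return AsTable end

print(resolveProjectFiles(AsTable)["other/transform.xsl"])
print(resolveProjectFiles(AsFunction)["other/transform.xsl"])
```

Inside the Translator, the equivalent call would be resolveProjectFiles(iguana.project.files), which should behave the same way on either side of the version boundary.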

Why is the incoming message written to a file before being transformed?

This may seem like a poor design decision on our part, and indeed we could have passed the message to msxsl.exe directly instead of writing it to a file first. The choice illustrates a dilemma that you may encounter yourself on occasion: should I maximize the speed of my script, or go for increased fault tolerance instead? Usually there isn’t an easy answer, as it often comes down to the mindset of the individual developer. In this case, we opted for fault tolerance.

Passing the XML input directly to a command-line tool poses a difficult problem because XML files contain several characters which are interpreted as shell “meta” characters. For example, take the “>” character used to delimit an XML tag. On most platforms, this character is taken to mean “redirect the output of program x to file y”. As such, if you were to give your XML message to an external program directly without performing any escaping first, you would likely get very different results from what you were expecting.

Yes, you could escape the contents of your XML message prior to calling the external program, but doing this correctly is not a trivial task. An XML message could potentially contain any of the shell meta characters, and the correct way to escape these characters varies greatly with each platform. As an example, a nice little helper function could be written to escape the incoming message beforehand, but what if you forget to escape “$” characters (which are used on most systems to substitute a value for a variable)? Imagine that your interface receives a message one day that contains a dollar sign within one of its tags. Now manual intervention is required to repair the interface before it can successfully process these types of messages again.
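
To get a feel for how easily a message can trip over the shell, here is a small sketch that reports which special characters a message contains. The character list is illustrative and deliberately incomplete, since the real set differs between shells and platforms. The names SHELL_META and findShellMetaChars and the sample message are hypothetical.

```lua
-- Illustrative, non-exhaustive list of characters that common shells
-- treat specially. The real set depends on the shell and platform.
local SHELL_META = { "<", ">", "|", "&", "$", '"', "'", "`", ";" }

-- Returns the characters from SHELL_META that occur in Message,
-- in the order they are listed above.
local function findShellMetaChars(Message)
   local Found = {}
   for _, Char in ipairs(SHELL_META) do
      -- The fourth argument (true) requests a plain-text search, so
      -- "$" is not interpreted as a Lua pattern anchor.
      if Message:find(Char, 1, true) then
         Found[#Found + 1] = Char
      end
   end
   return Found
end

local Sample = '<patient id="12345678"><note>Balance: $20 &amp; rising</note></patient>'
print(table.concat(findShellMetaChars(Sample), " "))
```

Even this short, well-formed message contains six of the nine characters in the list, each of which would need platform-specific escaping.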

With this in mind, we chose to load the contents of the XML message into a temporary file before passing it to the external program. This lets us avoid escaping the contents of the message at all. The trade-off is decreased performance in exchange for a more robust interface. The additional file operation that the script performs does affect speed somewhat, but we suspect that this won’t be noticeable unless your interface is receiving a high volume of traffic.
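
The temporary-file pattern can also be wrapped in a small helper so that the file is removed even when the step that uses it raises an error. This is a sketch rather than part of the original script; the name withTempFile is hypothetical, and it relies only on the standard Lua io and os libraries.

```lua
-- Hypothetical helper: writes Data to a temporary file, calls
-- Callback(Path), removes the file, and returns Callback's first result.
local function withTempFile(Data, Callback)
   local Path = os.tmpname()
   local Handle, Msg = io.open(Path, "w")
   assert(Handle ~= nil, Msg)
   Handle:write(Data)
   Handle:close()

   -- pcall captures any error raised by Callback so that the cleanup
   -- below still runs before the error is raised again.
   local Ok, Result = pcall(Callback, Path)
   os.remove(Path)
   if not Ok then
      error(Result, 0)
   end
   return Result
end

-- Usage: read the file back to confirm the round trip.
local Echoed = withTempFile("<patients/>", function(Path)
   local Handle = assert(io.open(Path, "r"))
   local Contents = Handle:read("*a")
   Handle:close()
   return Contents
end)
print(Echoed)
```

In the script above, the callback would build the command from Path and run it through io.popen, so the message contents never appear on the command line.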