Setting an XML node to a space “ ” character

This addresses an XML Oddity that was raised in our forums. You may also want to view “XML Oddity II” for handling empty XML nodes.

The issue is a simple one: If you use the node:setInner(" ") function to set an XML node to a space (“ ”), it is instead set to an empty string (“”).

Unfortunately there is no “one size fits all” solution, so we present several possibilities. You need to choose the one that matches your requirements.

We present these options:

  1. Best Practice: Use the node:setText() function, from the xml custom module, to add a TEXT element containing a space.
  2. Recommended: Use the node:append() function to add a TEXT element containing a space.
  3. Recommended: Create a new node function node:setInnerSpace() that sets the node to a space.
  4. Override the builtin node:setInner() function so that it can set a node to a space.

The last option is “elegant”, but it runs the risk of breaking code that expects the old behaviour, or reverting to the old behaviour (if the overriding function is accidentally omitted or removed). The other options are probably safer. The choice is yours.

Sample Code

Option One

Code for main():

require 'xml'

function main()
   local X = xml.parse{data='<SPACETIME></SPACETIME>'}
   local tim = X.SPACETIME:append(xml.ELEMENT,'TIME')
   tim:setInner(os.date())
   local spc = X.SPACETIME:append(xml.ELEMENT,'SPACE')

   -- Adds a space
   -- Updates the *first* TEXT element
   -- If there is no TEXT element then it appends one and updates it
   -- WARNING: This overwrites any data in the first TEXT element
   spc:setText(' ')
end

Code for the xml module: copy the module code from the Sample Code section of “Useful XML node functions module”.

Option Two

Code for main():

function main()
   local X = xml.parse{data='<SPACETIME></SPACETIME>'}
   local tim = X.SPACETIME:append(xml.ELEMENT,'TIME')
   tim:setInner(os.date())
   local spc = X.SPACETIME:append(xml.ELEMENT,'SPACE')

   -- Adds a space
   -- Space is added by appending a text element containing a space
   -- Always appends a new text element (use node:setInner() to overwrite an existing element)
   spc:append(xml.TEXT, ' ')
end

Option Three

Code for main():

require 'xml'

function main()
   local X   = xml.parse{data='<SPACETIME></SPACETIME>'}
   local tim = X.SPACETIME:append(xml.ELEMENT,'TIME')
   tim:setInner(os.date())
   local spc = X.SPACETIME:append(xml.ELEMENT,'SPACE')

   -- White space is ignored
   spc:setInner(' ')
   trace(X:S())
   
   -- Adds a space
   -- Overwrites the element it is pointed at with a space (just like setInner)
   -- Can "accidentally" delete content if you point it at a non-empty element
   spc:setInnerSpace()
   trace(X:S())
end

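Code for the module:

-- module is likely to be re-used so use a descriptive name like "xml"

-- setInnerSpace(): new node function that sets a node to a space

function node.setInnerSpace(XmlNode)
   XmlNode:setInner('x') -- placeholder value: an empty node cannot be set by index
   XmlNode[1] = ' '      -- overwrite the placeholder with a space
   return XmlNode
end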

Option Four

Code for main():

require 'xml'

function main()
   local X   = xml.parse{data='<SPACETIME></SPACETIME>'}
   local tim = X.SPACETIME:append(xml.ELEMENT,'TIME')
   tim:setInner(os.date())
   local spc = X.SPACETIME:append(xml.ELEMENT,'SPACE')
      
   -- Updated setInner() to work with spaces
   -- Overwrites the element it is pointed at with a space
   -- Can "accidentally" delete content if you point it at a non-empty element
   spc:setInner(' ')
   trace(X:S())
end

Code for the module:

-- this module contains functions customized for this channel only
-- so we named it after the channel that uses it = unique (and obvious) name

-- Name the function setInner() so that it overrides the builtin setInner()

local SetInnerCopy = node.setInner -- copy of the builtin setInner(), called by the override

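function node.setInner(XmlNode, Value)
   if Value == ' ' then
      SetInnerCopy(XmlNode, 'x') -- placeholder value: an empty node cannot be set by index
      XmlNode[1] = ' '           -- overwrite the placeholder with a space
   else
      SetInnerCopy(XmlNode, Value) -- pass every other value through to the builtin unchanged
   end
   return XmlNode
end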

Using the code

  • This code would probably be used in a To Translator, Filter or From Translator component script
  • Choose the option you prefer:
    • Use node:append(): This is a builtin function so no additional module is needed
    • setInnerSpace(): Because the code is a good candidate for re-use by other channels, place it in a module with a descriptive name (we used “xml”)
      Note: If you already have an “xml” module you can add this code to it (or use a different module name).
    • Override setInner(): Because the code should not be re-used by other channels, we recommend indicating it is private by using a module named after the channel:
      • This gives you a unique name that is easy to remember
      • The unique name prevents the module from being “accidentally discovered” and re-used
      • This module should only contain code unique to this channel

How it works

The issue occurs because the builtin node:setInner() parses the XML message and ignores the singleton space “ ” character.

Option One

In our example the “expected” text element is missing, which causes the index out of bounds error. Using node:setText() adds the “missing” element into the node tree.

This method adds a text element containing a space under the specified element (in this case “SPACE”).

Option Two

In our example the “expected” text element is missing, which causes the index out of bounds error. Using node:append() adds the “missing” element into the node tree.

This method adds a text element containing a space under the specified element (in this case “SPACE”).

Note: This method appends a new text element each time it is used, so if you use it multiple times you will add multiple elements.

If you want to update the same element again (rather than adding more), use node:setInner().
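The update-or-append logic that the Option One comments describe for node:setText() can be sketched in plain Lua. This is only an illustration of the logic, not the xml module's code: the element is a stand-in table, and the names newElement and setTextOnce are hypothetical helpers that do not exist in Iguana.

```lua
-- Stand-in element (hypothetical, not Iguana's node type):
-- children are held in a plain array
local function newElement(name)
   return { name = name, children = {} }
end

-- Update the first text child if one exists, otherwise append one
local function setTextOnce(element, value)
   for _, child in ipairs(element.children) do
      if child.type == 'text' then
         child.value = value -- WARNING: overwrites the existing text
         return child
      end
   end
   local text = { type = 'text', value = value }
   table.insert(element.children, text)
   return text
end

local spc = newElement('SPACE')
setTextOnce(spc, ' ') -- appends a text child containing a space
setTextOnce(spc, ' ') -- updates that same child instead of appending another
print(#spc.children)  -- 1
```

Calling the helper twice leaves a single text child, whereas two plain appends would leave two.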

Options Three and Four

The issue occurs because the builtin node:setInner() parses the XML message and ignores the singleton space “ ” character.

Both options do the same thing under the hood: they use node:setInner() to set the node to “x”, then they reference the node using an index and set it to a space “ ”.
Note: Setting an empty node with an index does not work, so you need to set it to “x” (or some other value) first.

Option Three adds the new function node:setInnerSpace(). Option Four overrides the builtin node:setInner().
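The wrap-and-delegate pattern behind the override can be tried outside Iguana with a stand-in. In the sketch below the node table, its children array and newNode are all hypothetical. The first setInner only imitates the behaviour described in this article (a whitespace-only value leaves the node empty), and writing to children[1].value stands in for the index assignment used in the real module.

```lua
-- Stand-in for Iguana's node table (hypothetical, for illustration only)
local node = {}
node.__index = node

-- Imitates the behaviour described above: a lone space is discarded
function node.setInner(self, value)
   if value:match('^%s*$') then
      self.children = {}
   else
      self.children = { { type = 'text', value = value } }
   end
   return self
end

-- Keep a reference to the original before replacing it
local builtinSetInner = node.setInner

function node.setInner(self, value)
   if value == ' ' then
      builtinSetInner(self, 'x')   -- placeholder creates the text child
      self.children[1].value = ' ' -- then overwrite it with a space
   else
      builtinSetInner(self, value) -- every other value passes through unchanged
   end
   return self
end

local function newNode()
   return setmetatable({ children = {} }, node)
end

local spc = newNode():setInner(' ')
local tim = newNode():setInner('12:00')
print('[' .. spc.children[1].value .. ']') -- [ ]
print('[' .. tim.children[1].value .. ']') -- [12:00]
```

The else branch must forward the caller's value to the saved original. Otherwise every call that is not a single space would also end up with the placeholder.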

Note: Be aware that both of these options will overwrite any element/node with a space, exactly as node:setInner() does. So be sure that you are only using them to update an empty node.

For example, pointing either function at a non-empty element replaces its content with a space, which is probably not what you want.

This behaviour is correct because it is consistent with how node:setInner() behaves.
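If overwriting is a concern, one possible safeguard is to write the space only when the element has no children. The sketch below uses a stand-in table rather than an Iguana node, and newElement and setSpaceIfEmpty are hypothetical names. Inside Iguana the emptiness check would need to use the node's own functions instead.

```lua
-- Stand-in element (hypothetical): children are held in a plain array
local function newElement(name, text)
   local element = { name = name, children = {} }
   if text then
      element.children[1] = { type = 'text', value = text }
   end
   return element
end

-- Set the element to a single space only when it has no children.
-- Returns true if the space was written, false if the element was left alone.
local function setSpaceIfEmpty(element)
   if #element.children > 0 then
      return false
   end
   element.children[1] = { type = 'text', value = ' ' }
   return true
end

local empty = newElement('SPACE')
local full  = newElement('TIME', '12:00')
print(setSpaceIfEmpty(empty)) -- true
print(setSpaceIfEmpty(full))  -- false
```

Returning a boolean lets the calling script decide whether a skipped update should be logged or treated as an error.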

Best Practices

  • You need to make sure that the option you choose matches your requirements
  • Test the code to make sure it works correctly with the type of XML data you receive
  • Using a node function makes the code easier to read and the behaviour change is limited to one place

What not to do

  • Assume the function will do what you need without understanding how it works
  • Put the module into production without testing against representative sample data
  • If you use the last option overriding node:setInner(), don’t omit the overriding function
    Note: Be aware that someone else could update the module and “accidentally” remove the overriding function without your knowledge