An HL7 message consists of a series of segments that are separated by a single “\r” terminator character. The HL7 standard states very precisely that each segment should be terminated with just one 0x0D (“\r”) character. However, it is quite common to receive messages that use non-conforming terminators. For example, we often come across messages that use 0x0A (“\n”) or 0x0D0A (“\r\n”) as their terminators. These issues stem from the fact that, for historical reasons, different operating systems use different newline characters for text files.
This FAQ will show you an extremely simple method (two lines of code) that will convert most offending terminators to the standard “\r”.
Before You Start
Before we show you how to fix non-conforming terminators, let’s discuss why they happen and how to diagnose them.
Reasons for non-conforming terminators
A file can end up with non-conforming terminators for several reasons:
- When the file was transported between systems, the terminators were (incorrectly) converted to match the target operating system. For example, using ftp in ASCII mode produces this result. That is why we recommend using ftp in binary mode (which performs no conversions).
- Older systems may use terminators specific to operating systems:
  - Linux/Unix might use 0x0A (“\n”)
  - Windows is likely to use 0x0D0A (“\r\n”)
- If you cut and paste message text in Windows, it will “helpfully” convert “\r” to “\n” without your knowledge
Finding non-conforming terminators
Non-conforming terminators can be difficult to diagnose because most modern text editors will display all newline variations as a new line. To see the offending characters themselves, you need an editor that can display special characters, or an editor that offers a hexadecimal viewing mode. Fortunately, Iguana offers both these display options!
These annotation dialogs clearly show you the non-conforming terminator (shown as either “\n” or 0x0A, depending on the view mode selected).
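Outside Iguana, the same check can be approximated in a few lines. The following is only a sketch in Python, not part of the Iguana tooling; the function name `count_terminators` and the sample message are hypothetical. It prints the raw bytes with escape sequences visible and counts each terminator variant.

```python
import re
from collections import Counter


def count_terminators(data: bytes) -> Counter:
    """Count each line-ending variant found in raw message bytes."""
    # CRLF is listed first so it is counted once, not as a separate CR and LF.
    return Counter(re.findall(rb"\r\n|\r|\n", data))


# Hypothetical sample: two segments end in LF, the last one in CR.
sample = b"MSH|^~\\&|SENDING_APP\nPID|1||12345\nPV1|1|I\r"

print(repr(sample))               # repr() displays terminators as \n and \r
print(sample.hex(" "))            # hexadecimal view: look for 0a and 0d
print(count_terminators(sample))
```

Reading the file in binary mode matters here, because text mode may translate line endings before you get to inspect them.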
How It Works
Once you have confirmed that you do have non-conforming terminators, you can easily correct them using a simple string substitution. As demonstrated in the example below, we will use just two lines of code to convert “\r\n” and “\n” (the two most common offenders) to “\r”.
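As an illustration, here is a minimal Python sketch of the two substitutions. It is not the original Iguana script; the function name `normalize_terminators` is an assumption made for this example.

```python
def normalize_terminators(data: str) -> str:
    """Convert CRLF and bare LF terminators to the standard CR."""
    data = data.replace("\r\n", "\r")  # first: Windows-style CRLF -> CR
    data = data.replace("\n", "\r")    # second: any remaining bare LF -> CR
    return data


# Hypothetical message mixing all three terminator styles.
mixed = "MSH|^~\\&|SENDING_APP\r\nPID|1||12345\nPV1|1|I\r"
print(repr(normalize_terminators(mixed)))
```

A message that already uses “\r” passes through unchanged, so the conversion is safe to apply to every incoming message.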
Note: As simple as this seems, please notice that the conversion order is important! If we reversed the order, then “\r\n” would be incorrectly converted to “\r\r”:
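The effect of the ordering can be reproduced with a short Python sketch (the function names are hypothetical and exist only for this comparison):

```python
def convert_right_order(data: str) -> str:
    data = data.replace("\r\n", "\r")  # CRLF handled while it is still intact
    return data.replace("\n", "\r")


def convert_wrong_order(data: str) -> str:
    data = data.replace("\n", "\r")    # "\r\n" becomes "\r\r" here
    return data.replace("\r\n", "\r")  # no "\r\n" is left, so this never matches


crlf_message = "MSH|1\r\nPID|2\r\n"
print(repr(convert_right_order(crlf_message)))
print(repr(convert_wrong_order(crlf_message)))
```

In the wrong order, every segment ends up followed by an extra “\r”, which a parser may treat as an empty segment.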
It is also very easy to add other conversions to your code, as needed. What if you need to interact with systems that (sometimes) introduce extra terminators, e.g. “\n\n” or “\r\r”? To resolve this issue, simply add the following additional lines to your script:
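A possible shape for these extra conversions, sketched in Python under the assumption that doubled terminators should collapse to a single “\r” (the function name is hypothetical, and the original script may order the lines differently):

```python
def normalize_with_doubles(data: str) -> str:
    """Normalize terminators and collapse doubled ones to a single CR."""
    data = data.replace("\r\n", "\r")  # CRLF -> CR
    data = data.replace("\n\n", "\r")  # extra: doubled LF -> one CR
    data = data.replace("\n", "\r")    # bare LF -> CR
    data = data.replace("\r\r", "\r")  # extra: doubled CR -> one CR
    return data


# Hypothetical message with doubled LF, doubled CR, and a CRLF.
doubled = "MSH|1\n\nPID|2\r\rPV1|3\r\n"
print(repr(normalize_with_doubles(doubled)))
```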
You could even be a bit “clever” (and obscure) by using a single extra conversion instead:
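One way to read this, shown as a hedged Python sketch: once every “\n” has been converted, a doubled “\n\n” has already become “\r\r”, so a single “\r\r” conversion covers both cases. The function name is hypothetical.

```python
def normalize_and_collapse(data: str) -> str:
    """Normalize terminators, then collapse doubled CRs in one extra step."""
    data = data.replace("\r\n", "\r")  # CRLF -> CR
    data = data.replace("\n", "\r")    # LF -> CR, so "\n\n" is now "\r\r"
    return data.replace("\r\r", "\r")  # the single extra conversion


doubled = "MSH|1\n\nPID|2\r\rPV1|3\r\n\r\n"
print(repr(normalize_and_collapse(doubled)))
```

Note that one pass only halves longer runs: three consecutive terminators are reduced to two, not one. If longer runs can occur, a regular expression such as `re.sub("\r+", "\r", data)` collapses runs of any length.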
As simple as this string substitution might be, you may be tempted to try alternative methods. For example, you may want to split segments on the terminator.
Imagine that your sample data has “\n” terminators, like this:
You might be tempted to change the code to split segments on “\n”, like this:
At first glance, the error seems fixed!
Alas, if a correctly formatted HL7 message with “\r” delimiters is passed through your script, the error simply occurs again:
The correct solution is to convert the non-conforming “\n” to “\r” before splitting, as we showed you in the previous section:
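The difference between the two approaches can be sketched in Python as follows. The function names and sample messages are hypothetical, and the original Iguana script is not reproduced here.

```python
def split_on_lf(message: str) -> list[str]:
    """Tempting shortcut: split directly on the non-conforming LF."""
    return [seg for seg in message.split("\n") if seg]


def split_after_normalizing(message: str) -> list[str]:
    """Convert terminators to CR first, then split on CR."""
    normalized = message.replace("\r\n", "\r").replace("\n", "\r")
    return [seg for seg in normalized.split("\r") if seg]


lf_message = "MSH|^~\\&|SENDING_APP\nPID|1||12345\n"  # non-conforming
cr_message = "MSH|^~\\&|SENDING_APP\rPID|1||12345\r"  # conforming

print(len(split_on_lf(lf_message)))              # 2 segments
print(len(split_on_lf(cr_message)))              # 1: the message is not split
print(len(split_after_normalizing(lf_message)))  # 2 segments
print(len(split_after_normalizing(cr_message)))  # 2 segments
```

Splitting on “\n” only moves the failure from non-conforming messages to conforming ones, whereas normalizing first handles both.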
Once you are comfortable resolving this issue, check out these examples that use the same or similar techniques:
- Processing a batch of HL7 messages
- Remove selected characters from any HL7 field for any message
- Parsing an arbitrary text file
- Fixing broken MSH segments