Migrating from legacy HL7 technology to Iguana

Using regression testing to recreate interfaces

Introduction [top]

The idea is to use regression testing similar to that mapped out in the From DB to HL7 tutorial. The precise details of how to do it will vary with the data transformation the system in question performs.

The key is to build some mechanism, using Iguana and the existing legacy technology, that makes each message traceable through the system. Often the unique message control IDs associated with each message are a good way to do this. You can use them as a primary key when storing the raw text of each message in a database – as mentioned in the ‘more complicated’ legacy system example – or use them as filenames, <message control id>.txt, each containing the message text, as done in the From DB to HL7 tutorial. The principle is the same.
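To make the file-based approach concrete, here is a minimal sketch in Python. Iguana Translator scripts are actually written in Lua, so treat this as the pattern rather than Iguana code; it assumes plain pipe-delimited HL7 v2 text with carriage-return segment separators, and the function names are hypothetical.

    import os

    def control_id(message):
        # MSH-10, the message control ID, is the tenth field of the MSH
        # segment; splitting on '|' puts it at index 9 because the field
        # separator itself counts as MSH-1.
        msh_segment = message.split('\r')[0]
        return msh_segment.split('|')[9]

    def save_message(message, directory='messages'):
        # One file per message, named <message control id>.txt.
        os.makedirs(directory, exist_ok=True)
        path = os.path.join(directory, control_id(message) + '.txt')
        with open(path, 'w') as f:
            f.write(message)
        return path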

If message control IDs are not guaranteed to be unique, another good trick is to use MD5 checksums of the raw message text as keys instead.
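Identical messages hash to the same checksum, while any textual difference yields a new one, so the checksum is unique for all practical purposes. A sketch using Python’s standard hashlib (again hypothetical names, not Iguana’s Lua API):

    import hashlib

    def message_key(message):
        # The hex digest of the raw message text serves as a stable,
        # effectively unique key for filenames or database rows.
        return hashlib.md5(message.encode('utf-8')).hexdigest()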

The Translator is equally good at pulling information from databases or from files, so both techniques work well.

The basic goal is to make it as fast as possible to see the differences between the old and new logic – fortunately, the dynamic visual nature of the Translator gives us great tools to achieve this. The cycle of detecting errors, correcting the code and observing the results is extremely fast.

Regression testing an entire database [top]

One final check for validating that the new Iguana-based logic is equivalent to the old logic is this:

  1. Start with two empty copies of the database that the application feeds data into.
  2. Populate one database by running the legacy technology against the HL7 dataset.
  3. Populate the second database by running the new Iguana-based technology against the same HL7 dataset.
  4. Use the Translator to run queries against both databases and find the differences (see the sketch after this list).
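A minimal sketch of step 4, assuming both databases are reachable from one script and share a table layout with a common primary key; SQLite and the table and column names here are stand-ins for whatever the application actually uses. From within Iguana, the same idea would run as a Lua Translator script using its database functions.

    import sqlite3

    def load_rows(db_path, table, key):
        # Index every row by its primary key so the two databases can be
        # compared key by key.
        conn = sqlite3.connect(db_path)
        conn.row_factory = sqlite3.Row
        rows = {row[key]: dict(row)
                for row in conn.execute('SELECT * FROM ' + table)}
        conn.close()
        return rows

    def diff_table(legacy_db, iguana_db, table, key):
        # Report every key whose row differs, or which exists in only
        # one of the two databases.
        old = load_rows(legacy_db, table, key)
        new = load_rows(iguana_db, table, key)
        for k in sorted(set(old) | set(new)):
            if old.get(k) != new.get(k):
                print('%s %s\n  legacy: %s\n  iguana: %s'
                      % (table, k, old.get(k), new.get(k)))

    diff_table('legacy.db', 'iguana.db', 'patients', 'patient_id')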

The Iguana Translator is a great tool for doing this comparison since you can selectively filter out ‘noise’ – for instance, if the code generates random GUIDs and so on that differ on every run, these can be excluded from the comparison.
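Building on the sketch above, noise columns can simply be dropped from each row before comparing; which columns count as noise is application-specific, and the names below are hypothetical.

    NOISE_COLUMNS = {'row_guid', 'inserted_at'}

    def strip_noise(row):
        # Remove columns that legitimately differ on every run (random
        # GUIDs, insert timestamps) so they never register as differences.
        return {c: v for c, v in row.items() if c not in NOISE_COLUMNS}

Applying strip_noise to each row inside load_rows above keeps these columns out of every comparison.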

Checking the entire database is best as a final check since it’s something of a blunt tool:

  1. A full run will usually take a fair amount of time.
  2. If differences are found earlier on, it’s much faster to correct them in the local context of a single message transaction.

However, if the new logic passes this test, you can have complete confidence that it’s equivalent.

The only real dilemma is that this process often uncovers bugs in the old logic which you might wish to correct in the new logic. This is why it is nice to be able to selectively ignore some differences.