Working with Databases

Introduction

This article contains useful techniques that we have developed for working with databases. We suggest you use this article as a reference and read the section(s) that are relevant to what you are doing. You may also want to skim the article to get an overview of the various techniques.

For an overview of how Iguana works with databases read Database Fundamentals.

Tip: In this article we use local functions in the main module, with the main() function at the end of the module. Why do we do this?

  • It is the simplest way to make a well-structured Lua program in Iguana
  • It allows us to demonstrate all the code in a single screenshot
  • It allows you to copy a single snippet of code to the main module to try it out
  • Using local functions hides them from other channels (which is a very good thing)
  • We need to put main() at the end so it can see the local functions (because Lua uses a single pass compiler)

Note: We recommend using local and shared modules for production code.

Efficient Database Access [top]

When you start using the Translator with databases the “obvious” thing to do is to read and process the data in a From Translator component. We do not recommend this approach.

Our recommended best practice is structure the code to do as little work as possible in the From Translator. A powerful technique is to do most of the real work in a Filter Translator or To Translator.

How It Works

It’s easiest to think of a concrete example of feeding from a database.

For example, you can set up a From Translator component to poll a Patient table. The logic can be very simple:

  1. In the From Translator simply push Patient IDs into the queue
  2. Then in the Filter or To Translator component use those IDs to drive more elaborate logic:
    1. Pull an ID off the queue
    2. SELECT the data from the database to flesh out the message
    3. Perform business logic/processing
    4. Finally push the processed message onto the queue

The benefit of this is that the the polling code is completely separated from the business logic.

Any Exceptions to the Rule?

One exception to this is when you have a very slow source database. In this case it may be best to minimize database access by querying all the data in a From Translator.

You can structure the work flow like this:

  1. Query the all the data you need to build up the message in the From Translator component
  2. Then serialize that data into a convenient format like XML or JSON and enqueue it:
  3. Read the data from the queue in a Filter or To Translator and map it into the desired format
    1. Pull the message data off the queue
    2. Perform business logic/processing
    3. Finally push the processed message onto the queue

The benefit of doing things this way is that it minimizes queries to the “slow database”. The responsiveness of the Translator in step 3 will be snappy, since there are no slow blocking queries to the database involved. Step 1 may still be slow (because of the slow database), but it can be optimized to retrieve the data as efficiently as possible.

What’s Next?

We explained the principles of efficient database access. The next step is to look at these two practical examples:

Please contact support at support@interfaceware.com if you need more help.

Escaping values for SQL [top]

If your code needs to generate straight SQL calls, you will need to be able to process data that contains double-quote ” and single-quote ‘ characters, which have special meanings in SQL.

We supply the conn:quote{} function to handle this requirement.

Note: The conn:quote{} function and the sqlescape module quote all returned strings, so you must not add quotes around strings in SQL query text.

How It Works

Simply call conn:quote{} to perform the specific database quoting required:

 

The conn:quote{} function quotes all returned strings, so you must not add quotes around strings in SQL query text.

Doing so will produce an error as shown:

Adding quotes without escaping worked (above) for “Smith”, but it fails (below) for “O’Toole” because the apostrophe is not escaped:

Sample Code

Copy the Escape a SQL query string code sample from our repository.

What’s Next?

We showed you how to escape strings that you use for SQL queries. We strongly recommend that you use the conn:quote{} function, rather than the legacy sqlescape module.

For more information on working with databases see the following pages:

Handling tables that use an auto-incremented ID column [top]

Vendors often create tables in their databases that use auto-incremented key columns. If your code is using conn:merge{} to implement insertion operations, handling these key columns requires a little more thought.

Before You Start

For the code in this example to work you need to create a Patient table with an auto-incrementing Id column.

Follow these steps for MySQL:

Note: we assume you are reasonably familiar with MySQL Workbench or a similar tool for MySQL.

How It Works

Suppose that your database contains a patient table with an auto-incremented Id column. You will need to implement this type of algorithm:

  1. Check whether an external patient identifier has appeared in the inbound feed
  2. Do a SELECT query on the patient table, with a WHERE clause that checks for the presence of the patient in the table
  3. If the patient exists, leave the main Id column in the table. This ensures that the conn:merge{} call will do an UPDATE for this Id
  4. Otherwise, if the patient does not exist, use T:remove('Id') to remove the ID column, then conn:merge{} will then do an insert, and a new auto-incremented Id will be created

Here is an example:

In this case the Mrn (External Id) is not found, so a new row will be inserted in the patient table.

The example uses the demo-autoinc.vmd, and you can get the Using a DB with an autoincrementing ID code from our repository.

The merge will only run in the editor if live=true for db.tables{} and conn:merge{}.

Foreign Keys for Databases [top]

This is a common conceptual problem that people need to solve. Most databases involve relationships between data: for instance, a one-to-many relationship of lab results to a patient.

So when mapping, for example, HL7 data to a database, you need to have a solution to take a primary key, such as a patient ID, and associate it with related data.

Three typical solutions discussed are:

  • Pass foreign keys, such as Patient ID, as arguments into mapping routines which need them
  • Add foreign keys to child tables after they are populated
  • Generate GUIDs and use these to associate records together

Pass foreign keys in as arguments

This is one technique of handling foreign key relationships in data. Just pass in the foreign key into the routine used to map the data.

Add foreign keys to child tables after they are populated

This is an alternative way of putting in a foreign key. It’s slightly more modular, since the AddPatientId() function can be put into a shared helper library.

Generate GUIDs and use these to associate records together

This is another technique for adding foreign keys. In this case, we generate a random GUID and use it as Primary Key for the Patient table and as a Foreign Key for the Kin table:

Note: This method could be useful if you want to share patients with other systems that accept/use GUID identifiers, as it ensures that identifiers will not collide.

This solution assumes that the Patient.Id field in the database will accept a GUID value (not guaranteed). You will also need to ensure that the HL7 Mrn is stored in another field like Patient.Mrn – so you can match incoming HL7 messages to the patient.

Alternatively you could create a second identifier like Patient.Guid (an alternate key in DB speak), and use it to store the GUID:

Note: This method is probably cleaner as you can use any value for the (internal) Patient.Id, e.g., an auto-incrementing key, the HL7 Mrn, or some other proprietary identifier.

util.guid() is a method supplied by iNTERFACEWARE. It generates a cryptographically strong unique global identifier.

Translating Code Sets [top]

The codemap module can be used to translate code sets, this can be useful for any channel not just one using HL7, see the Translating code sets page for more information.

There are two versions of the codemap module: one allows for a default value to be returned if there is no matching code, the second returns nil if there is no match. We generally recommend the second more general module.

Simple list

You can define a list like this:

Code translation table

Looking up unknown code “W” returns nil:

Now we add a default so “Other” is returned:

Polling a database checking for flag fields to send out data [top]

Assume you have an existing HIS system with data stored in its on table schema, and you need to generate messages based off that schema. There is no interface system provided, so you need to query the database directly.

Before You Start

This example assumes that a couple of tables exist:

  • A Patient table populated by the existing HIS, which contains the patient demographic information. It has an Admit Date column that is used to determine when to send out messages.
  • A patient_polling_table, which contains a single value called PollTime, which will record the last time a poll was done for messages.
  • You can use the example_demo.vmd file to create these tables in the Translator.

How It Works

You will set use a channel with ‘From’ and ‘To Translator’ components:

  1. Use the live=true setting to get the ‘From Translator’ component working
    • When the ‘From Translator’ component once it is working you can set live=false (to prevent accidental updates)
  2. Run the channel to push the messages to the queue and the logs
    • Sample messages are not needed for the ‘From Translator’ as it reads from a database
    • You can change the time in patient_polling_table to force re-processing of patients
  3. Open the ‘To Translator’ component for the channel and import sample data from the logs
    • Note: You could just add some sample IDs manually
  4. Use the live=true setting to get the ‘To Translator’ component working
  5. When both components are working you just run the channel and the ‘From Translator’ will process the queued messages

Source: From Translator

First, set up Iguana to have a From Translator->To Translator channel. The ‘From Translator’ script will check when the database was last polled, and identify any admissions that occurred after that time. It then pushes the ID for each row to Iguana’s queuing system.

In this example ID “4525285” was admitted after the last time the database was polled:

The code, for reference:

conn = db.connect{
   api=db.MY_SQL, 
   name='test',
   user='root', 
   password='password',
   live=true
}

-- Here we get multiple patient ids from a table, based on a AdmitDate column
-- looking for rows that are ready to be processed
function main()
   local PollTimeRs = conn:query('SELECT PollTime FROM patient_polling_table;')

   if #PollTimeRs == 1 then
      local PollTime = PollTimeRs[1].PollTime;
      --now select for all patient ids from the last time we polled
      local SelectSql = "SELECT Id from Patient WHERE AdmitDate>'"..PollTime.."' AND  AdmitDate<=NOW();"
      local IdRs = conn:query(SelectSql)
      -- we push in the patient id into the queue. The To Translator component
      -- will generate the messages needed
      for i = 1,#IdRs do
         queue.push{data=IdRs[i].Id:nodeValue()}
      end
      -- once added the the queue, update the polltime
      conn:execute{sql='UPDATE patient_polling_table SET PollTime=NOW();',live=true}
   end
end

We set the conn:execute() second param to true to execute SQL interactively in the Iguana editor. The interactive option does not push items onto the queue. To test the ‘To Translator’ component you need to import sample data from the logs (details at the bottom of the page).

Destination: To Translator

By storing patient ids in Iguana’s queue system, we can import the ids as sample messages while developing the ‘To Translator’ component.

The script does three things:

  1. SELECTs the corresponding row from the Patient table (using the Id from the queue)
  2. Maps the patient data into HL7 message node tree
  3. Passes the message on for further processing, e.g., write it to a file, database, etc

The code for generating messages in the ‘To Translator’ component is as follows:

Here is the example_demo.vmd and the code:

conn = db.connect{
   api=db.MY_SQL, name='test',
   user='root', 
   password='password',
   live=true
}

-- you should put this function in a local string module 
function string.removeNL(S)
 if S:byte(#S-1) == 13 and S:byte(#S) == 10 then
 return S:sub(1, #S-2)
 end
 return S
end

local function mapPatient(msg,RS)
 -- TODO map other segments
 msg.PID[2][1]=RS[1].Id
 msg.PID[5][1][1][1] = RS[1].LastName
 msg.PID[5][1][2] = RS[1].GivenName
 msg.PID[19] = RS[1].Ssn
 msg.PID[10][1][1] = RS[1].Race
 msg.PID[13][1][1] = RS[1].PhoneHome
 msg.PV2[8] = RS[1].AdmitDate
end

function main(Data)
   --remove any newlines that are present
   Data = string.removeNL(Data)
   --select for data
   local RS = conn:query(
      "SELECT * FROM Patient WHERE Id='"..Data.."'"
   )
   if #RS > 0 then
      --load the vmd file containing the message definition 
      local msg = hl7.message{
         vmd='example/demo.vmd',name='ADT'
      }

      -- now map what we want and generate the message
      mapPatient(msg,RS)

      -- now we have the message, at this point we can 
      -- write it to a file put it into a db, etc
      msg:S()
   end
end

What’s Next?

As you can see, this is a straightforward method for generating a message from an existing database table schema. You can adapt this code to your needs, by saving the message data to a file, database, or transmitting it to another channel, etc.

You may be interested in the Interfaces section in the wiki.

Please contact support at support@interfaceware.com if you need more help.

Polling a database, and generating CSV files [top]

This page shows you how to create a simple “From Database to CSV file” channel. These same ideas can be applied to create a “From Database to Anything” channel, or a “From Anything to CSV file” channel.

Before You Start

We are going to follow the structure recommended in the parent page, Efficient Database Access. Basically, this means we will do as little work as possible in the From Translator component, and do most of the work in a Filter Translator. This approach leverages the power and convenience of the Iguana queue.

There are various advantages to doing things this way:

  • The From Translator code is very simple, just SELECT the unprocessed record IDs and push them into the Iguana queue
  • Once the IDs have been queued it is very easy to import them from the log into the Filter to use as sample data
  • Once you have sample data it is easy to generate the full messages in the filter script (using the sample IDs)
  • You can also easily test your code against many messages, by quickly moving through the sample IDs

Note: if you had done all the processing in a From Translator the testing, in particular, would be much more difficult.

How It Works

First you create a channel in Iguana with a “From Translator” source, and a “To File” destination.

This channel will also use a Translator in its Filter, so when I initialize the channel’s properties, I check the “Use filter” checkbox in the Filter tab (making sure to leave the other Filter settings as their default values).

Source: From Translator

First things first: You start by writing a simple database polling function. All the script needs to do is poll the database for rows that have not been processed (determined by a NULL value in the Status column). It then pushes the ID for each row to Iguana’s queuing system.

In this example we are pushing the ID “MM241” onto the queue:

Here is the entire script:

local conn = db.connect{
   api=db.MY_SQL, 
   name='test', 
   user='root', 
   password='password', 
   live = true
}

local function PushId(Id)
 queue.push(Id)
end

-- The main function is the first function called from Iguana.
function main()
   local RS = conn:query('SELECT Id FROM patient WHERE Status IS NULL')

   for i = 1, #RS do
      PushId(RS[i].Id:nodeValue())
   end
end

We set live = true to execute SQL interactively in the Iguana script editor (we do the same in filter script below).

Filter: Translator

The filter script does four things:

  1. SELECTs the corresponding row from the Patient table (using the Id from the queue).
  2. Creates the CSV data by concatenating the fields in the row (excluding the Status field).
  3. Marks the row as processed.
  4. Pushes the CSV data onto the queue.

The script is as follows:

local csv = require 'csv_db'

conn = db.connect{
   api=db.MY_SQL, 
   name='test',
   user='root', 
   password='password'
}

function main(Data)
   -- Get all patient fields
   local RS = conn:query("SELECT * FROM patient WHERE Id='"..Data.."'")

   -- Generate CSV content. Here we will include every field
   -- other than the Status column.
   local Patient = RS[1]
   local CSV = ''
   for i=1,#Patient do
      if Patient[i]:nodeName() ~= 'Status' then
         if i > 1 then CSV = CSV..',' end
         CSV = CSV..csv.escape(Patient[i])
      end
   end

   -- Mark the row as "processed"
   conn:execute{
      sql = "UPDATE patient SET Status=NULL WHERE Id='"..Data.."'",
      live = true
      }

   -- Send the CSV content out
   queue.push{data=CSV}
   return CSV
end

You need to escape the CSV data as follows:

  • Escape any double quotes inside a string, i.e., ‘xx”x’ becomes ‘xx””x’
  • Wrap strings in double quotes, i.e., ‘xxx’ becomes ‘”xxx”‘

Use this Lua simple module to do the escaping, put the code in a local module csv_db:

local csv = {}

-- Pad the value with double quotes (if the value is a string),
-- and escape any double quotes inside the value.
--
function csv.escape(N)
   if N:isNull() then
      return ''
   elseif N:nodeType() ~= 'string' then
      return tostring(N)
   end
   local V = tostring(N)
   V = V:gsub('"', '""')
   return '"'..V..'"'
end
return csv

Destination: To File

The To File is configured to write each message out into its own .csv file, like this:

And that’s all there is to it!

What’s Next?

We showed you how simple it is to read from a database and output the data into CSV files. The next step is to adapt the code sample to your own needs, you can poll for any value/condition and do whatever processing you need. For example, the next page shows how to poll for a flag field and generate an HL7 message in response.

You may also be interested in some of these pages:

Please contact support at support@interfaceware.com if you need more help.

Pipe-lining [top]

If you are running a high volume data center, then one topic you might be interested in is what I call pipe-lining, or batching of data from several messages together into a single transaction.

Iguana is extremely fast, the database becomes the biggest bottleneck.

Now, if you think about the underlying mechanics of transactions in databases, the bottleneck is always the time needed to physically sync data down to the disc storage. So, with databases, if you insert, say, 100 rows within one transaction, you’ll get dramatically better performance than if you were to do 100 individual transactions. Relating this to HL7: if the nature of your data set is that you are always inserting data, this is a possible optimization that you can do.

This type of technique is worthwhile considering if you are processing large batches of HL7 transactions. Say for instance you have files which contain hundreds of HL7 messages. Then an algorithm you might want to consider is this:

  1. Use a From Translator component and Lua file handling APIs to break the incoming files into chunks of 100 or say messages and enqueue these blocks.
  2. In a To Translator component break the chunks of 100 into individual messages.
  3. Parse each message and map out the data into a single set of tables as you obtain from db.tables{}
  4. Then do one call to conn:merge{} which commits all data mapped from the 100 messages in a single database transaction.

This has the potential to give you a much faster throughput for a large amount of batched HL7 data. The performance gains can be dramatic as writing 100 records as opposed to writing a single record is very little extra overhead for a database engine, so saving the data may not be 100 times faster but it certainly will be many times faster!

If are interested in doing something like this implemented contact us at support@interfaceware.com and we can arrange a quote from our professional services team to assist you.

Inserting data into a ‘hostile’ vendor application [top]

There are many times the business requirement arises where you want to insert data into a third party application database directly without the co-operation of the vendor.

Typical scenarios this may occur are:

  • You are a vendor selling a replacement system, your customer still have this system from a competitor which they are forced to use for a time.
  • It might be an end of lifed system which is no longer supported – perhaps the vendor is no longer in business.

For these types of scenarios it’s not a bad strategy to consider actually mapping data directly into the database of such a system. The database schema is static and so it often easier to feed data directly into the database rather than attempting to use a poorly documented ‘official’ HL7 or other type of interface.

It’s going to depend on the application in question. So the million dollar question – how difficult is this and what’s a good strategy to do it?

What we recommend is:

  1. Get the application in a virtualized image like a VMware image.
  2. Use tricks the one for quickly surveying a database.
  3. This means you can use a process of trial and error with inserting data into the system in the convenience of your work environment and easily roll back any mistakes.
  4. When it comes to production deployment it makes sense to set up a test system and clearly the define the user acceptance criteria for the integration.

The amount of effort required will be directly proportional to:

  1. How much data you attempt to push into the system.
  2. The quality of the data model – if it’s a poorly organized datamodel with a lot of complexity and duplication of data then it will make the integration work harder.
  3. How skilled the person programming is at navigating and comprehending the database of the app.

Using the Iguana Translator makes this work a lot more rapid since you can iteratively and quickly verify the integration but unavoidably there will be a considerable amount of analysis required of the application you are feeding into.

Dealing with binary data [top]

Binary data is currently supported for SQL server ODBC connections.

Unique identifier

UUIDs are read from the database as binary data. The following code can be used to convert the binary data into a readable representation:

local function binToUUID (Bin)
   -- Byte ordering for endian conversion
   local ByteOrdering = { 4, 3, 2, 1, 0, 6, 5, 0, 8, 7, 0, 9, 10, 0, 11, 12, 13, 14, 15, 16 }

   local OutText = ""
   for i = 1, #ByteOrdering do
      local bo = ByteOrdering[i]
      if bo == 0 then
         OutText = OutText..'-'
      else
         OutText = OutText .. string.format ("%02X", string.byte(Bin,bo))
      end
   end
   return OutText   
end

ASCII hex display

The following snippet can be used to convert binary data into a human-readable hex based ASCII:

local function binToHex (Bin, RowWidth)
   local OutText = ""
   for i = 1, #Bin - 1 do
      if (i > 1 and RowWidth and (i-1) % RowWidth == 0) then
         OutText = OutText .. '\n'
      end
      OutText = OutText .. string.format ("%02X", string.byte(Bin,i)) .. ' '
   end
   OutText = OutText .. string.format ("%02X", string.byte(Bin,#Bin))
   return OutText  
end

Trick for surveying a database [top]

Often times the best place to find data in a system is to grab from the database. The APIs that many applications provide can be more bothersome and less reliable than bypassing them to go to the database.

A trick we have found useful for quickly accessing the content of a database at least with MySQL is the following code fragment:

local conn = db.connect{
   api=db.MY_SQL,
   user='root',
   password='password',
   name='test'
}

local function ShowTable(Name)
 local R = conn:query('SELECT * FROM '..Name..' LIMIT 1')
 return R
end

function main()
   local T = {}
   local R = conn:query('SHOW TABLES') 
   for i = 1, #R do 
      T[i]={}
      T[i].name = R[i][1]:nodeValue()
      T[i].data = ShowTable(R[i][1]:nodeValue())   
   end
   return T   
end

This loops through all the tables in the database and queries the first row of data. It’s usually pretty fast and provides a quick browse-able tree of the database.

Ironically it often seems faster than using many of the dedicated tools available for this purpose. Less clicks required.

Hint – for Microsoft SQL Server the equivalent SQL is “SELECT name from sysobjects where xtype = “U”

See also:

Leave a Reply