Processing Excel and CSV files

Note: This code on this page uses the legacy dbc module which is a wrapper for the legacy db.<xxx> database functions.

If you are using Iguana 5.5 it is recommended best practice to use database connection objects.

The code on this page will work with all versions of Iguana including 5.5 – it just uses older style functions.

This is a detailed comparison of Excel and Text (CSV) ODBC drivers.

The example uses four spreadsheets saved as xls and CSV files:

There are a few points already:

  • I am using a little “cheat” with api=db.SQL_SERVER as Iguana does not have explicit support for Excel and CSV (no api=db.EXCEL or api=db.TEXT)
  • An Excel DSN refers to a specific Excel file, so there a four DSN’s one for each sample file
  • The Text (csv) DSN refers to a directory so there is only one DSN needed, to refer to a file you just name it as the table in the query.

Next lets compare how the data looks once it is loaded into Iguana

The Patient_Col file has a header row with column names and works well with both drivers (Excel ODBC on the left)

  • Both drivers pick up the column names, though you may need to set this up for the Text driver
  • As you can see the Excel driver imports blank and NULL fields, the Text (csv) driver omits the blank data – this could be annoying even though no data is lost

The Patient_NoCol file has no header row, there is only data – this does not work well the Excel driver (on the left)

  • The Excel driver drops a real clanger here – it expects column names in the first row and has used data as field names. This is not acceptable as data is lost. There is no setting to change this in the ODBC administrator, so you would need to modify the file (or perhaps find a better third party ODBC driver).
  • The Text (CSV) driver works well as it assigns columns names, and imports the data correctly (albeit leaving out blank data as before). It is possible to use the odbc administrator to assign column names manually.