Processing Excel and CSV files

Introduction

Excel and CSV files are very common for reports and data interchange, so how can we process them in Iguana? I will discuss two approaches here.

Both methods use a From Translator Source to monitor the “incoming” directory for files to be processed.

Limitation: unfortunately when Iguana is run as a service on Windows 7 the script for converting xls files to CSV does not work (and probably will not work with Vista). This appears to do with the increased security/access restrictions  that Windows 7 places on Services. The workaround is to run Iguana from the command line as iguana --run when using Windows 7. We will update this section when we find a solution.

The first method uses separate From and To Translators:

This follows Best Practise for Iguana by separating loading (From Translator source ) and processing of data (To Translator destination)

  • From Translator:
    • Convert xls files to CSV format.
    • Load the CSV data flle using Lua.
    • Queue the data for processing in the To Translator.
    • Move the file to a “processed” directory.
  • To Translator:
    • Process the queued data by splitting out rows and columns and loading them into a Lua table.
    • Apply filtering logic to clean up the data if necessary.
    • Map the Lua tables to your database.
    • And finally merge the data into your database.

For more information see this example of importing CSV files.

The second method does all the processing in a single From Translator:

  • First setup a Text (csv) DSN with the ODBC Administrator Tool.
  • Convert xls files to CSV format.
  • Use db.query() to load the CSV file into DB Result Set node tree reflecting your database structure.
  • Apply filtering logic to clean up the data if necessary.
  • Move the file to a “processed” directory.
  • Map the Lua tables to your database.
  • And finally merge the data into your database.

ODBC information

  • Unfortunately the Microsoft Excel ODBC driver locks the attached xls file, so it cannot be used in an automated import process (because you need manual intervention to move the file after processing).
  • The Text (CSV) ODBC driver does not place a lock on the attached file and is well suited to an automatic import process (because you can move the file programmatically after processing).

    Note: It is only because we need to use the Text (CSV) ODBC driver the first step for processing xls files is to convert them into CSV format.

Here is a more detailed comparison of Excel and Text (CSV) ODBC drivers.

You may need to install MDAC or the MS Access engine to get the ODBC drivers. You can setup the data sources using the ODBC Administrator Tool.

Leave A Comment?