We have a requirement to read a PDF, change a few pieces of text in it, and send it on to the other party. The document also contains graphical information (an ECG).
I found two help pages that recommend using a third-party conversion program.
Is there, by any chance, an internal or Iguana-specific way of doing this without third-party programs or APIs?
Otherwise, I would appreciate it if anyone who has done this before (even with a third-party API) could share their experience. We are working on the Windows platform.
Iguana does not have a built-in PDF parser. Most of the PDF workflows our customers run involve embedding PDFs in messages (think patient charts and the like).
The help pages that you found offer a good recipe for a PDF-XML-PDF transformation (I assume you want to send the document on in PDF format). Iguana can parse any XML format to enable content extraction and transformation.
I would just note that the actual conversion process may be the least challenging part. My understanding is that the XML schema for structuring a PDF is focused on describing presentation rather than data structure. Extracting information from it may be as unreliable as, for example, "screen scraping" data off a web page by parsing its HTML. This Stack Overflow thread explains it well: https://stackoverflow.com/questions/12126282/pdf-to-xml-and-back-to-pdf-again
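To make the "screen scraping" comparison concrete, here is a rough sketch in Python (not Iguana code) of what a rule against presentation-oriented XML tends to look like. The XML layout is an assumption: the `page`, `text`, and `image` elements and the `x`/`y` attributes are hypothetical and will differ from what any particular converter produces. The `find_value` helper is likewise made up for illustration. The point is that a label and its value are usually separate positioned fragments, so the rule has to pair them by coordinates rather than by structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical converter output. Element and attribute names are assumptions,
# not the schema of any specific PDF-to-XML tool.
SAMPLE_XML = """
<page number="1">
  <text x="40" y="100">Patient Name:</text>
  <text x="160" y="100">DOE, JOHN</text>
  <text x="40" y="120">Heart Rate:</text>
  <text x="160" y="120">72 bpm</text>
  <image x="40" y="200" src="ecg.png"/>
</page>
"""

def find_value(page, label, y_tolerance=2.0):
    """Return the <text> element just right of `label` on the same line, or None."""
    fragments = page.findall("text")
    anchor = next((t for t in fragments if (t.text or "").strip() == label), None)
    if anchor is None:
        return None
    ax, ay = float(anchor.get("x")), float(anchor.get("y"))
    # Candidates: other fragments on (roughly) the same baseline, to the right.
    same_line = [
        t for t in fragments
        if t is not anchor
        and abs(float(t.get("y")) - ay) <= y_tolerance
        and float(t.get("x")) > ax
    ]
    # Pick the nearest one; None if the line has nothing after the label.
    return min(same_line, key=lambda t: float(t.get("x")), default=None)

page = ET.fromstring(SAMPLE_XML)
value = find_value(page, "Patient Name:")
if value is not None:
    value.text = "DOE, JANE"  # change the text, leave everything else alone

updated_xml = ET.tostring(page, encoding="unicode")
print(updated_xml)
```

A rule like this only holds while every PDF places the value on the same line as its label. If the layout shifts, or the converter splits the value into several fragments, the rule silently picks the wrong text or nothing at all, which is why it is worth checking several real samples first.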
I would begin by testing a few PDFs using a conversion utility and bringing the XML output into the Translator editor as sample data to see how difficult it will be to build consistent rules for extracting or changing the data. If you feel confident logical rules can be built to do this, then you can focus on the recipe for automating the PDF-XML conversion on the command line from Iguana and pulling the files in for processing.
Director, Client Education