This topic contains 2 replies, has 2 voices, and was last updated by Casey Trauer 3 years, 2 months ago.
Splitting Xml
Hi, I have a large XML file and I want to break it into chunks. Can someone advise how to achieve this in Iguana without using an external library?
Example:
<Staffs>
<Staff>…</Staff>
.
.
.
3000 staffs
.
<Staff>…</Staff>
</Staffs>

How can I split the 3000 staff entries into chunks of 500 each?
Thanks in advance
Hi Sandeep,
I am going to make the following assumptions about what you want to achieve for my answer:
1) You want to output XML containing 500 staff elements from an input XML of ~3000 staff elements.
2) You have not ruled out using xml.parse as part of your process.

OK, so…
The basic solution involves these steps:
- Parse the XML file.
- Loop through staff entries to build X separate files. (We will actually do a loop within a loop.)
- After all the entries for each file are collected, serialize to XML by rebuilding a string.
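The looping steps above can be sketched in plain Lua, independent of Iguana. This is only a minimal sketch, assuming the staff entries are already in a simple list: `chunk` is a hypothetical helper name (not an Iguana API), and plain strings stand in for the parsed staff nodes.

```lua
-- Split the array part of a table into consecutive chunks of at most `size` items.
-- `chunk` is a hypothetical helper for illustration, not part of Iguana.
local function chunk(list, size)
   assert(size >= 1, "size must be at least 1")
   local chunks = {}
   for first = 1, #list, size do
      -- Clamp the end point so the last chunk does not run past the list
      local last = math.min(first + size - 1, #list)
      local part = {}
      for i = first, last do
         -- Append, so every chunk is indexed from 1 and #part is correct
         part[#part + 1] = list[i]
      end
      chunks[#chunks + 1] = part
   end
   return chunks
end

-- Stand-in data: 12 strings instead of parsed staff nodes
local staff = {}
for i = 1, 12 do staff[i] = "staff" .. i end

local groups = chunk(staff, 5)
for n, group in ipairs(groups) do
   print(n, #group, group[1], group[#group])
end
```

With 12 entries and a chunk size of 5 this gives three chunks of 5, 5, and 2 entries. Appending with `part[#part + 1]` rather than reusing the outer index matters, because Lua's `#` operator only counts entries that start at index 1.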
Here is the script for my suggestion. There are probably more elegant ways to do it, but I hope mine is clear and easy to follow. (I have also attached it as a project that you can import into your Iguana.)
function main(Data)
   -- Convert XML file into a table you can iterate through
   local StaffList = xml.parse{data=Data}

   -- Set the parameters of our loop through the full list
   -- RecordMaxCount will be max count of staff entries per file
   -- (5 here for testing; use 500 for the case in the question)
   local StartingPoint = 1
   local TotalNumberOfStaff = #StaffList.Staffs
   local RecordMaxCount = 5
   --trace(#StaffList.Staffs[6]:nodeName())

   -- Set the starting point for each sub list
   local NewStartingPoint = 1

   -- Start loop through full list
   for i=StartingPoint, TotalNumberOfStaff, RecordMaxCount do
      -- Create table to hold sub list
      local NewStaffList = {}
      -- Define default endpoint for sub list
      local NewEndPoint

      -- This bit of logic is to determine if we are near the end of our
      -- staff list and the total number of remaining entries is less than
      -- our RecordMaxCount.
      -- We want to reset our sub loop end point so that it doesn't try
      -- to process entries that don't exist.
      if (TotalNumberOfStaff - NewStartingPoint) < RecordMaxCount then
         NewEndPoint = TotalNumberOfStaff
      else
         NewEndPoint = i + RecordMaxCount - 1
      end

      -- Now we are going to loop through subsection to collect entries.
      -- Append to the sub list so it is always indexed from 1; otherwise
      -- #NewStaffList would be 0 for every sub list after the first.
      for j=NewStartingPoint, NewEndPoint do
         NewStaffList[#NewStaffList + 1] = StaffList.Staffs[j]
         trace(NewStaffList)
      end

      SerializeStaffList(NewStaffList)
      NewStartingPoint = NewStartingPoint + RecordMaxCount
   end
end

function SerializeStaffList(NewStaffList)
   -- Variable to hold serialized XML
   local NewStaffFile = ""

   -- Loop through sub list and serialize by rebuilding XML as string
   for i=1,#NewStaffList do
      local Entry = "<" .. NewStaffList[i]:nodeName() .. ">" ..
                    NewStaffList[i]:nodeText() ..
                    "</" .. NewStaffList[i]:nodeName() .. ">"
      NewStaffFile = NewStaffFile .. Entry
   end

   -- Add root nodes / XML envelope
   NewStaffFile = "<Staffs>\n" .. NewStaffFile .. "</Staffs>"
   trace(NewStaffFile)

   -- Process new file.
   -- Now that you have your broken up file, what are you doing with it?
end

Casey Trauer,
Director, Client Education
iNTERFACEWARE

By the way, I would download the project instead of copying and pasting the inline code. I see that some of it didn’t render correctly because it won’t print the XML tags.
Casey Trauer,
Director, Client Education
iNTERFACEWARE