This topic contains 2 replies, has 2 voices, and was last updated by Casey Trauer 4 months ago.

Splitting XML

  • Hi, I have a large XML file and I want to break it into chunks. Can someone advise how to achieve this in Iguana without using an external library?

    Example

    <Staffs>
    <Staff>…</Staff>
    .
    .
    .
    3000 staffs
    .
    <Staff>…</Staff>
    </Staffs>

    How can I split the 3000 staffs into chunks of 500 each?

    Thanks in advance

    Hi Sandeep,

    I am going to make the following assumptions about what you want to achieve for my answer:

    1) You want to output XML files of 500 staff elements each from an input XML of ~3000 staff elements.
    2) You have not ruled out using xml.parse as part of your process.

    OK, so…

    The basic solution involves these steps:

    1. Parse the XML file.
    2. Loop through staff entries to build X separate files. (We will actually do a loop within a loop.)
    3. After all the entries for each file are collected, serialize to XML by rebuilding a string.
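
The index arithmetic behind step 2 can be sketched in plain Lua, independent of Iguana. ChunkRanges is a hypothetical helper name, not an Iguana API; it only works out the first and last index of each sub list and clamps the final one to the total, so a short last chunk does not run past the end of the list.

```lua
-- Hypothetical helper (not an Iguana API): returns a list of {First, Last}
-- index pairs covering 1..Total in chunks of at most Size entries.
function ChunkRanges(Total, Size)
   local Ranges = {}
   for First = 1, Total, Size do
      -- Clamp the last chunk so it never runs past Total
      local Last = math.min(First + Size - 1, Total)
      Ranges[#Ranges + 1] = {First = First, Last = Last}
   end
   return Ranges
end

-- 3000 entries in chunks of 500 gives 6 ranges: 1-500, 501-1000, ... 2501-3000
for _, R in ipairs(ChunkRanges(3000, 500)) do
   print(R.First, R.Last)
end
```

Each pair gives the bounds for one inner loop over the parsed staff entries.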

    Here is the script for my suggestion. There are probably more elegant ways to do it, but I hope mine is clear and easy to follow. (I have also attached it as a project that you can import into your Iguana.)

    
    function main(Data)

       -- Convert XML file into a node tree you can iterate through
       local StaffList = xml.parse{data=Data}

       -- Set the parameters of our loop through the full list.
       -- RecordMaxCount is the max count of staff entries per sub list
       -- (5 here; set it to 500 for the split described in the question).
       local TotalNumberOfStaff = #StaffList.Staffs
       local RecordMaxCount = 5

       -- Start loop through full list, one sub list per iteration
       for StartingPoint = 1, TotalNumberOfStaff, RecordMaxCount do

          -- Create table to hold sub list
          local NewStaffList = {}

          -- If we are near the end of our staff list and fewer than
          -- RecordMaxCount entries remain, stop at the last entry so the
          -- sub loop doesn't try to process entries that don't exist.
          local EndPoint = math.min(StartingPoint + RecordMaxCount - 1,
                                    TotalNumberOfStaff)

          -- Now we are going to loop through the subsection to collect entries.
          -- Append at the next free index so every sub list starts at 1;
          -- indexing by j would leave #NewStaffList at 0 after the first pass.
          for j = StartingPoint, EndPoint do
             NewStaffList[#NewStaffList + 1] = StaffList.Staffs[j]
          end
          trace(NewStaffList)

          SerializeStaffList(NewStaffList)

       end

    end
    
    
    function SerializeStaffList(NewStaffList) 
    
      
       -- Variable to hold serialized XML
       local NewStaffFile = ""
       
       -- Loop through sub list and serialize by rebuilding XML as string
       for i=1,#NewStaffList do 
          local Entry = "<" .. NewStaffList[i]:nodeName() .. ">" ..
               NewStaffList[i]:nodeText() ..
               "</" .. NewStaffList[i]:nodeName() .. ">"
          NewStaffFile = NewStaffFile .. Entry
       end
    
       -- Add root nodes / XML envelope (root name taken from the example)
       NewStaffFile = "<Staffs>" .. NewStaffFile .. "</Staffs>"
       
       trace (NewStaffFile)
       
       -- Process new file. 
       -- Now that you have your broken up file, what are you doing with it?
       
    end
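
If xml.parse turns out not to be an option (my assumption 2), a pattern-based split in plain Lua is one possible alternative. This is only a sketch: SplitStaffXml is a hypothetical name, and it assumes Staff elements are never nested or self-closing and that the root element is Staffs, as in the example. It uses Lua string patterns rather than a real XML parser, so it will not cope with comments or CDATA sections that contain Staff tags.

```lua
-- Plain-Lua sketch, no xml.parse: split on <Staff>...</Staff> blocks.
-- Assumes Staff elements are not nested and are never self-closing.
function SplitStaffXml(Data, ChunkSize)
   local Chunks, Current = {}, {}
   -- "<Staff[%s>]" matches <Staff> or <Staff attr=...> but not <Staffs>
   for Element in Data:gmatch("<Staff[%s>].-</Staff>") do
      Current[#Current + 1] = Element
      if #Current == ChunkSize then
         Chunks[#Chunks + 1] = "<Staffs>" .. table.concat(Current) .. "</Staffs>"
         Current = {}
      end
   end
   -- Flush the final, possibly shorter, chunk
   if #Current > 0 then
      Chunks[#Chunks + 1] = "<Staffs>" .. table.concat(Current) .. "</Staffs>"
   end
   return Chunks
end
```

Each entry in the returned table is a complete XML string, so SplitStaffXml(Data, 500) on the 3000-staff input would give six strings to pass on to whatever processes them next.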
    
    

    Casey Trauer,
    Director, Client Education
    iNTERFACEWARE

    By the way, I would download the project instead of copying and pasting the inline code. I see that some of it didn’t render correctly because the forum won’t print the XML tags.

    Casey Trauer,
    Director, Client Education
    iNTERFACEWARE
