This topic contains 0 replies, has 1 voice, and was last updated by lev 10 years, 3 months ago.
Thai cp874 (Windows 874) to UTF-8
Iguana supports languages encoded in UTF-8. Unfortunately, some systems use cp874 (Windows-874), so Thai-language users cannot pipe inbound data to Iguana because of the encoding mismatch.
Fortunately, transcoding from cp874 to UTF-8 is easy with the small module offered below.
Say your inbound data arrives in files encoded in cp874.
In the Translator script of the channel's Source component, read the file content in binary mode ('rb').
Pass the content through the module below by calling cp874toUTF8():
    utf8sOutData = transcode.cp874toUTF8(cp874InData)
Push the UTF-8 result to the Iguana queue with queue.push{} so that the channel's Filter or Destination component can process it as usual.
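The steps above can be sketched as follows. This is only a minimal sketch under stated assumptions, not the script from the original post: the `feeder` table, `feeder.readBinary`, `feeder.processFile`, and the `convert` and `push` parameters are hypothetical names introduced for illustration. Inside Iguana you would presumably pass `transcode.cp874toUTF8` as `convert` and a small wrapper around `queue.push{}` as `push`.

```lua
-- Hypothetical helper table, named here for illustration only.
feeder = {}

-- Read a whole file as raw bytes. Mode 'rb' avoids newline translation
-- on Windows, so the cp874 bytes arrive unchanged.
function feeder.readBinary(path)
   local f, err = io.open(path, 'rb')
   if not f then
      return nil, err
   end
   local content = f:read('*a')
   f:close()
   return content
end

-- Read a file, transcode it, and hand the result on.
--   convert: function(bytes) returning a UTF-8 string
--            (assumed to be transcode.cp874toUTF8 in the channel)
--   push:    function(utf8) that queues the message
--            (assumed to wrap queue.push{} in the channel)
function feeder.processFile(path, convert, push)
   local raw, err = feeder.readBinary(path)
   if not raw then
      error('cannot read ' .. path .. ': ' .. tostring(err))
   end
   push(convert(raw))
end
```

Passing the converter and the push function in as parameters keeps the sketch runnable outside Iguana, where `queue` is not defined.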
    transcode = {}

    -- Replace every byte of Data that has an entry in Map with the mapped string.
    function transcode.convert(Data, Map)
       local j = 1
       local p = {}
       for i = 1, #Data do
          local C = Map[Data:byte(i)]
          if C then
             p[#p+1] = Data:sub(j, i-1)
             p[#p+1] = C
             j = i + 1
          end
       end
       p[#p+1] = Data:sub(j, #Data)
       return table.concat(p)
    end

    function transcode.cp874toUTF8(D)
       local d1 = string.char(0xe0, 0xb8)  -- UTF-8 lead bytes for U+0E00..U+0E3F
       local d2 = string.char(0xe0, 0xb9)  -- UTF-8 lead bytes for U+0E40..U+0E7F
       local d3 = string.char(0xe2, 0x80)  -- UTF-8 lead bytes for U+2000..U+203F
       -- Non-Thai cp874 bytes, mapped to their complete UTF-8 sequences.
       local cp874CodeSet = {
          [128] = string.char(0xe2, 0x82, 172),  -- euro sign
          [133] = d3..string.char(166),          -- horizontal ellipsis
          [145] = d3..string.char(152), [146] = d3..string.char(153),  -- single quotes
          [147] = d3..string.char(156), [148] = d3..string.char(157),  -- double quotes
          [149] = d3..string.char(162),          -- bullet
          [150] = d3..string.char(147), [151] = d3..string.char(148),  -- en and em dash
          [160] = string.char(0xc2, 0xa0)        -- no-break space
       }
       local out = {}
       for i = 1, #D do
          local b = D:byte(i)
          if b < 0x80 then
             out[#out+1] = D:sub(i, i)           -- ASCII passes through unchanged
          elseif cp874CodeSet[b] then
             out[#out+1] = transcode.convert(D:sub(i, i), cp874CodeSet)
          elseif (b >= 0xa1 and b <= 0xda) or b == 0xdf then
             out[#out+1] = d1..string.char(b - 0x20)
          elseif b >= 0xe0 and b < 0xfc then
             out[#out+1] = d2..string.char(b - 0x60)
          end                                    -- bytes undefined in cp874 are dropped
       end
       return table.concat(out)
    end
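As a quick check of the offset arithmetic used for the Thai ranges: cp874 bytes 0xA1..0xDA and 0xDF correspond to U+0E01..U+0E3A and U+0E3F, whose UTF-8 form is E0 B8 followed by the byte minus 0x20, while bytes 0xE0..0xFB correspond to U+0E40..U+0E5B, encoded as E0 B9 followed by the byte minus 0x60. The sketch below is self-contained and only illustrates that arithmetic; `thaiByteToUTF8` and `hex` are hypothetical helper names, not part of the module.

```lua
-- Map one cp874 byte in the Thai ranges to its three-byte UTF-8 sequence.
-- Returns nil for any byte outside 0xA1..0xDA, 0xDF, and 0xE0..0xFB.
function thaiByteToUTF8(b)
   if (b >= 0xA1 and b <= 0xDA) or b == 0xDF then
      return string.char(0xE0, 0xB8, b - 0x20)
   elseif b >= 0xE0 and b <= 0xFB then
      return string.char(0xE0, 0xB9, b - 0x60)
   end
   return nil
end

-- Render a byte string as space-separated hex pairs for inspection.
function hex(s)
   local t = {}
   for i = 1, #s do
      t[#t+1] = string.format('%02X', s:byte(i))
   end
   return table.concat(t, ' ')
end

print(hex(thaiByteToUTF8(0xA1)))  -- THAI CHARACTER KO KAI, U+0E01
print(hex(thaiByteToUTF8(0xE0)))  -- THAI CHARACTER SARA E, U+0E40
```

For 0xA1 the third byte is 0xA1 - 0x20 = 0x81, so the sequence is E0 B8 81; for 0xE0 it is 0xE0 - 0x60 = 0x80, giving E0 B9 80.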
Let me know if it works for you. Comments are welcome. Happy transcoding.