pyshapelib unicode saga
Bram de Greve
bram.degreve at gmail.com
Thu Mar 15 16:33:56 CET 2007
Philippe Le Grand wrote:
> The DBF format specifies that field names and the contents of
> character fields are ASCII, using the OEM code page (a.k.a. IBM PC
> code page, a.k.a. code page 437; see wikipedia).
> I believe FoxPro uses a flag to identify alternate codepages at offset
> 1Dh in the header of the file, but whether that is actually part of
> the standard is unclear to me.
> You can find dbf file specs at:
> http://www.dbf2002.com/dbf-file-format.html (dbf III+, IV)
> or http://www.dbase.com/KnowledgeBase/int/db7_file_fmt.htm (dbf VII)
> The dbf associated with shapefiles is version III+, I believe.
Great, I'll take a look at those ...
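
For anyone who wants to check what their own files carry at that offset, here is a minimal sketch in Python. It only reads the raw byte at 1Dh from the 32-byte fixed part of the header. The function names are made up for illustration, and it makes no claim about how the value should be interpreted, since it is unclear whether the flag is part of the dBase III+ format at all.

```python
# Sketch only: peek at the byte at offset 1Dh of a DBF header.
# Function names are hypothetical; nothing here comes from pyshapelib.

CODEPAGE_FLAG_OFFSET = 0x1D   # offset mentioned in the quoted mail
FIXED_HEADER_SIZE = 32        # fixed-size part of a DBF header


def read_codepage_flag(header: bytes) -> int:
    """Return the raw byte at offset 1Dh of a DBF header.

    The value is returned uninterpreted; mapping it to a code page
    is left to the caller.
    """
    if len(header) <= CODEPAGE_FLAG_OFFSET:
        raise ValueError("header too short to contain offset 1Dh")
    return header[CODEPAGE_FLAG_OFFSET]


def read_codepage_flag_from_file(path: str) -> int:
    """Read the fixed header from a .dbf file and return the flag byte."""
    with open(path, "rb") as f:
        return read_codepage_flag(f.read(FIXED_HEADER_SIZE))
```

A value of 0 here would simply mean the byte is unset in that file, which is what I would expect from plain dBase III+ writers.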
> For portability (which is the only relevant purpose of shapefiles as
> far as I am concerned), you might want to restrict yourself to the
> most common features of the standard, i.e. ASCII field names and
> character field contents.
OK, so basically there's no work that needs to be done if we follow
the specs strictly, since the current implementation only allows ASCII anyway =)
But, OTOH, if Thuban aspires to use Unicode all the way, we have hit a
barrier here. Anyway, if there were to be some extension to support other
encodings, at the very least the default should be ASCII ...
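
To make the "default should be ASCII" idea concrete, here is a rough sketch in Python of what such an extension could look like. The helper names and the `encoding` parameter are hypothetical and are not part of pyshapelib or shapelib; the point is only that strict ASCII stays the default and any other code page has to be requested explicitly.

```python
# Sketch only: hypothetical helpers, not the pyshapelib API.

def encode_field_value(value: str, encoding: str = "ascii") -> bytes:
    """Encode a character field for writing to a DBF file.

    With the default encoding, any non-ASCII character raises
    UnicodeEncodeError, so files stay portable unless the caller
    opts in to another code page.
    """
    return value.encode(encoding)


def decode_field_value(raw: bytes, encoding: str = "ascii") -> str:
    """Decode a character field read from a DBF file."""
    return raw.decode(encoding)
```

With the default, `encode_field_value("caf\u00e9")` fails loudly, while passing `encoding="cp437"` would store it using the OEM code page mentioned above.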
> Thanks for your work. I hope to be able to soon start testing, and
> giving you feedback.
That would be great.