Python SAX

xml.sax._exceptions.SAXParseException: not well-formed (invalid token)


UnicodeDecodeError: 'utf8' codec can't decode byte 0x9a in position: unexpected code byte


1) My file was not actually encoded in UTF-8 (Eclipse can be used to check or change a file's encoding).
2) Added the following line to the top of the data file:
<?xml version="1.0" encoding="UTF-8"?>

    parser.parse(open(fileName))
    #parser.parse(codecs.open(fileName, "r", "utf-8"))
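The encoding fix above can also be done in code rather than in an editor. The following is a minimal sketch, not the method used in this note: the helper names `load_as_utf8`, `parse_file` and `TextCollector` are hypothetical, and the fallback encoding `cp1252` is only a guess (byte 0x9a happens to be a valid character there). It also assumes the XML declaration is either absent or says UTF-8, since the bytes are converted to UTF-8 before parsing.

```python
import xml.sax


def load_as_utf8(path, fallback_encoding="cp1252"):
    """Return the file content as UTF-8 bytes, re-encoding if needed.

    fallback_encoding is an assumption; use the encoding the file
    was really saved in.
    """
    with open(path, "rb") as f:
        raw = f.read()
    try:
        raw.decode("utf-8")  # already valid UTF-8: keep the bytes as they are
        return raw
    except UnicodeDecodeError:
        # Not UTF-8: decode with the assumed encoding, then re-encode.
        return raw.decode(fallback_encoding).encode("utf-8")


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that collects all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_file(path, fallback_encoding="cp1252"):
    """Parse the file with SAX after making sure its bytes are UTF-8."""
    handler = TextCollector()
    xml.sax.parseString(load_as_utf8(path, fallback_encoding), handler)
    return "".join(handler.chunks)
```

Passing bytes to the parser lets it apply the encoding named in the XML declaration, so no `codecs.open` wrapper is needed.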

3) Error: xml.sax._exceptions.SAXParseException: junk after document element

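Solution:
An XML document may have only one root element, and my file contained several top-level elements. I wrapped them in a single root by adding <ListRecords> at the beginning of the document and </ListRecords> at the end.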
[http://mail.python.org/pipermail/python-list/2002-November/172310.html]
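If editing the data file is not an option, the same wrapping can be done in memory before parsing. This is only a sketch under assumptions: `parse_fragments` and `ElementNames` are hypothetical names, and the input is assumed not to start with an XML declaration (a declaration is only allowed at the very start of a document, so it would have to be stripped first).

```python
import xml.sax


class ElementNames(xml.sax.ContentHandler):
    """Hypothetical handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_fragments(fragment_bytes, root_tag="ListRecords"):
    """Wrap several top-level elements in one root element and parse them.

    Assumes fragment_bytes has no leading <?xml ...?> declaration.
    """
    tag = root_tag.encode("ascii")
    wrapped = b"<" + tag + b">" + fragment_bytes + b"</" + tag + b">"
    handler = ElementNames()
    xml.sax.parseString(wrapped, handler)
    return handler.names
```

For example, `parse_fragments(b"<record>a</record><record>b</record>")` parses cleanly, while handing the same bytes to `xml.sax.parseString` without the wrapper raises a SAXParseException.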


Refs:
http://evanjones.ca/python-utf8.html
http://bytes.com/groups/python/818634-problem-parsing-utf-8-encoded-xml-minidom