Error: xml.sax._exceptions.SAXParseException: not well-formed (invalid token)
Error: UnicodeDecodeError: 'utf8' codec can't decode byte 0x9a in position: unexpected code byte
Solution:
1) My file was not encoded in UTF-8 (Eclipse can be used to check or change a file's encoding).
2) Added the following line to the top of the data file:
<?xml version="1.0" encoding="UTF-8"?>
3) Passed the parser a plain file object rather than a codecs-decoded stream:
parser.parse(open(fileName))
#parser.parse(codecs.open(fileName, "r", "utf-8"))
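The steps above can be sketched in Python 3 roughly as follows. This is only an illustration under assumptions, not the original script: the helper names (find_invalid_utf8, TextCollector, parse_text), the sample document, and the guess that the file was really Windows-1252 (where 0x9a is "š") are all hypothetical. The sketch locates the first byte that is not valid UTF-8, shows the parser rejecting the mislabeled bytes, then re-encodes to UTF-8 and parses again.

```python
import xml.sax


def find_invalid_utf8(data: bytes):
    """Return (offset, byte value) of the first invalid UTF-8 byte, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


class TextCollector(xml.sax.ContentHandler):
    """Collects character data so the parse result can be inspected."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(data: bytes) -> str:
    """Parse raw bytes with SAX and return the concatenated text content."""
    handler = TextCollector()
    xml.sax.parseString(data, handler)
    return "".join(handler.chunks)


# Hypothetical sample: declares UTF-8 but is really Windows-1252,
# where the letter "š" is stored as the single byte 0x9a.
DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>'
raw = (DECLARATION + "<name>Neil\u0161</name>").encode("cp1252")

# Offset and value of the offending byte, or None if the data is valid UTF-8.
print(find_invalid_utf8(raw))

try:
    parse_text(raw)
except xml.sax.SAXParseException as err:
    print("parse failed:", err.getMessage())

# Re-encode from the assumed real encoding to UTF-8, then parse again.
fixed = raw.decode("cp1252").encode("utf-8")
print(parse_text(fixed))
```

The parser is given raw bytes here so that it can apply the encoding named in the XML declaration itself. If the real encoding of a file is unknown, the "cp1252" guess has to be replaced with whatever the editor reports.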
Error: xml.sax._exceptions.SAXParseException: junk after document element
Solution:
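XML allows only one root element, and the document had content after the first top-level element closed. Wrapping everything in a single root fixed it: I added <ListRecords> at the beginning and </ListRecords> at the end of the document.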
[http://mail.python.org/pipermail/python-list/2002-November/172310.html]
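A minimal sketch of that fix in Python 3, assuming the data is a run of sibling elements with no XML declaration. The <record> elements, the ElementNames handler, and the parsed_names helper are made up for illustration; only the <ListRecords> wrapper comes from the note above.

```python
import xml.sax


class ElementNames(xml.sax.ContentHandler):
    """Records the name of every element the parser opens."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parsed_names(data: bytes):
    """Return the element names in document order, or None if parsing fails."""
    handler = ElementNames()
    try:
        xml.sax.parseString(data, handler)
    except xml.sax.SAXParseException:
        return None
    return handler.names


# Two top-level elements: not a well-formed document on their own.
fragments = b"<record>a</record><record>b</record>"

# Wrap the fragments in a single root element.
wrapped = b"<ListRecords>" + fragments + b"</ListRecords>"

print(parsed_names(fragments))  # parsing fails on the second top-level element
print(parsed_names(wrapped))
```

If the file starts with an <?xml ...?> declaration, the opening <ListRecords> has to go after that line, since the declaration must stay at the very start of the document.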
Refs:
http://evanjones.ca/python-utf8.html
http://bytes.com/groups/python/818634-problem-parsing-utf-8-encoded-xml-minidom
Python SAX
Submitted by Neil Rubens on Tue, 05/12/2009 - 15:35