python - BeautifulSoup fails parsing when it hits an unescaped bracket


I am having trouble loading a page that contains a literal (unescaped) email tag, such as:

  <html><head><title>Trial</title></head><body><p>test test.</p><p>This <joe@somewhere.com> has an email address</p></body></html>

Parsing fails when it hits that block:

  File "/tools/oss/packages/x86_64-rhel5/python/2.7.1/lib/python2.7/HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
  HTMLParseError: malformed start tag, at line 748, column 82

I doubt I am the first to hit this, but I cannot find any help or useful documentation right away. Is there something obvious I am missing?

Thank you,

- Paul

Always the way: as soon as you post a question, you find the answer.

It seems I hit a known bug. Updating to a later version of BeautifulSoup does fix the problem.
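
If upgrading is not an option, one possible workaround is to escape the bracketed address before the markup reaches the parser. The following is only a minimal sketch, assuming Python 3 and the standard library; the helper `escape_bracketed_emails`, its regular expression, and the `TextCollector` class are illustrative names and are not part of BeautifulSoup:

```python
import re
from html.parser import HTMLParser

# Matches a bracketed email address such as <joe@somewhere.com>.
# The pattern is an illustrative assumption, not a full address grammar.
EMAIL_IN_BRACKETS = re.compile(r"<([^<>\s@]+@[^<>\s@]+)>")


def escape_bracketed_emails(markup):
    """Rewrite <user@host> as &lt;user@host&gt; so parsers treat it as text."""
    return EMAIL_IN_BRACKETS.sub(r"&lt;\1&gt;", markup)


class TextCollector(HTMLParser):
    """Collects text nodes so the effect of the escaping is visible."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


page = (
    "<html><head><title>Trial</title></head><body><p>test test.</p>"
    "<p>This <joe@somewhere.com> has an email address</p></body></html>"
)

cleaned = escape_bracketed_emails(page)

collector = TextCollector()
collector.feed(cleaned)
collector.close()
text = "".join(collector.chunks)
print(text)
```

The cleaned string can then be handed to BeautifulSoup in place of the raw page. Real tags are left alone because the pattern requires an `@` and no whitespace between the brackets.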

