Ben McCann

Co-founder at Connectifier.
ex-Googler. CMU alum.

Ben McCann on LinkedIn Ben McCann on AngelList Ben McCann on Twitter


HTML Parsing using the Firefox DLLs


One of my first posts was a comparison of HTML parsers. Today I found a particularly challenging document to parse. None of the parsers I had compared earlier were able to handle the malformed HTML in this table where the td elements were prematurely ended. The behavior of Neko and HtmlCleaner made the most sense Read More