Wednesday, May 2, 2012

Parsing HTML output in Plone functional doctests with lxml

When writing functional doctests, I used to fumble a bit when inspecting what was in the HTML. Today I looked into lxml, and it makes testing a lot easier. The XPath support in particular makes for very readable tests.

For example, to test that a certain text appears in a viewlet but not in the page itself, it is convenient to parse the document into a tree and query only the part you care about. (Use case: a viewlet that displays "Other Items".)

This snippet tests our viewlet, which should show exactly one item at that point in the test:

    >>> from lxml import etree
    >>> html = etree.HTML(browser.contents)
    >>> len(html.xpath('//*[@id="other-advertorial-texts"]/div[@class="box"]'))
    1
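
The "appears in the viewlet, but not in the page itself" check can be sketched along the same lines. This is only a rough, standalone illustration: the SAMPLE_PAGE markup, the "content" id and the item title are made up for the example, and in a real functional doctest the HTML would come from browser.contents instead.

```python
from lxml import etree

# Hypothetical stand-in for browser.contents; the markup is invented
# for this sketch and only loosely resembles a Plone page.
SAMPLE_PAGE = """
<html><body>
  <div id="content"><p>Main article text</p></div>
  <div id="other-advertorial-texts">
    <div class="box"><a href="/item-2">Second advertorial</a></div>
  </div>
</body></html>
"""

html = etree.HTML(SAMPLE_PAGE)

# Same query as in the doctest above: count the boxes inside the viewlet.
boxes = html.xpath('//*[@id="other-advertorial-texts"]/div[@class="box"]')

# string(...) returns the concatenated text of the first matching element,
# so each check is scoped to one part of the page.
viewlet_text = html.xpath('string(//*[@id="other-advertorial-texts"])')
content_text = html.xpath('string(//*[@id="content"])')

assert len(boxes) == 1
assert "Second advertorial" in viewlet_text
assert "Second advertorial" not in content_text
```

One thing to keep in mind: @class="box" only matches when the class attribute is exactly "box". If the element carries several classes, something like contains(@class, "box") is more forgiving.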