Working with Selenium


This extension is currently ALPHA.

Things will change, break, or not work as expected, and the documentation still needs serious work.

This section is here to give a brief overview but is neither complete nor definitive.

You’ve been warned.

Writing web crawlers with Bonobo and Selenium is easy.

First, install bonobo-selenium:

$ pip install bonobo-selenium

The idea is to have each callable crawl one thing and delegate any drill-down to callables further down the chain.

An example chain could be:

digraph { rankdir = LR; login -> paginate -> list -> details -> "ExcelWriter(...)"; }

Where each step would do the following:

  • login() is in charge of opening an authenticated session in the browser.
  • paginate() opens each page of a hypothetical list and passes it to the next step.
  • list() takes every list item and yields it.
  • details() extracts the data you’re interested in.
  • … and the writer saves it somewhere.
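
The chain above can be sketched in plain Python to show how each step yields work for the next one. This is only an illustration under stated assumptions, not the bonobo-selenium API: FAKE_SITE stands in for a real browser session, run_chain() is a hypothetical helper that mimics a linear chain, and list() is renamed list_items() to avoid shadowing the Python builtin.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical stand-in for a website: page number -> items on that page.
FAKE_SITE: Dict[int, List[dict]] = {
    1: [{"id": 1, "title": "First"}, {"id": 2, "title": "Second"}],
    2: [{"id": 3, "title": "Third"}],
}


def login() -> Iterator[dict]:
    """Open a (fake) authenticated session and pass it down the chain."""
    yield {"user": "demo", "site": FAKE_SITE}


def paginate(session: dict) -> Iterator[List[dict]]:
    """Yield each page of the list, one page at a time."""
    for page_number in sorted(session["site"]):
        yield session["site"][page_number]


def list_items(page: List[dict]) -> Iterator[dict]:
    """Yield every item found on a page."""
    for item in page:
        yield item


def details(item: dict) -> Iterator[Tuple[int, str]]:
    """Extract only the data we are interested in."""
    yield (item["id"], item["title"].upper())


def _apply(step: Callable, rows: Iterable) -> Iterator:
    """Feed every row into one step and flatten whatever it yields."""
    for row in rows:
        yield from step(row)


def run_chain(first: Callable, *steps: Callable) -> list:
    """Hypothetical runner: pipe the output of each step into the next."""
    rows = first()
    for step in steps:
        rows = _apply(step, rows)
    # The final list plays the role of the writer at the end of the chain.
    return list(rows)


if __name__ == "__main__":
    for row in run_chain(login, paginate, list_items, details):
        print(row)
```

Each callable knows about one level only, so changing how pages are found does not affect how details are extracted. In a real crawler the session would be a Selenium browser object, and the callables would be wired into a Bonobo graph rather than passed to run_chain().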