Working with Selenium
This extension is currently ALPHA.
Things will change, break, or not work as expected, and the documentation still needs serious work.
This section is here to give a brief overview but is neither complete nor definitive.
You’ve been warned.
Writing web crawlers with Bonobo and Selenium is easy.
First, install bonobo-selenium:
$ pip install bonobo-selenium
The idea is to have each callable crawl one thing and delegate drill-downs to callables further down the chain.
An example chain could be:
login() → paginate() → list() → details() → writer
Where each step would do the following:
login() is in charge of opening an authenticated session in the browser.
paginate() opens each page of a fictitious list and passes it to the next step.
list() takes every item on the page and yields it.
details() extracts the data you’re interested in.
… and the writer saves it somewhere.
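The delegation pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions: it uses neither Bonobo's nor bonobo-selenium's API, the FAKE_SITE dictionary stands in for a real browser session, and run_chain, crawl and list_items (used instead of list() to avoid shadowing the Python built-in) are hypothetical names.

```python
# Plain-Python sketch of the chain; no Bonobo or Selenium API is used here.
# FAKE_SITE and every function body are hypothetical stand-ins for a browser.

FAKE_SITE = {
    1: [{"title": "first", "price": "10"}, {"title": "second", "price": "20"}],
    2: [{"title": "third", "price": "30"}],
}


def login():
    """Open an 'authenticated session' (here: just a dict) and pass it on."""
    yield {"user": "demo", "site": FAKE_SITE}


def paginate(session):
    """Yield each page of the fictitious list."""
    for number in sorted(session["site"]):
        yield session["site"][number]


def list_items(page):
    """Yield every item found on one page (the list() step in the text)."""
    for item in page:
        yield item


def details(item):
    """Extract only the fields of interest from one item."""
    yield {"title": item["title"], "price": int(item["price"])}


def run_chain(*steps):
    """Feed every output of one step into the next one, depth-first."""
    first, *rest = steps

    def feed(value, remaining):
        if not remaining:
            yield value
            return
        head, *tail = remaining
        for out in head(value):
            yield from feed(out, tail)

    for value in first():
        yield from feed(value, rest)


def crawl():
    """Run the whole chain and collect the rows, standing in for the writer."""
    return list(run_chain(login, paginate, list_items, details))
```

Each callable only knows about its own level (session, page, item), which is what lets a drill-down be handed to the next step instead of being handled in one large function. In a real crawler the stand-in bodies would drive a browser, and the final step would write the rows somewhere instead of collecting them in a list.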