As for Selenium, no. Never use Selenium for scraping unless whatever you're trying to do genuinely can't be done without a browser, and 99% of the time scraping can be done without one. Selenium's main use case is things like automating tests for website frontends. For scraping you really don't want to depend on a browser, along with its 1,000,000+ lines of code worth of overhead and attack surface, just to extract some simple data out of text/HTML.
As is usually the case with Python (you forgot to mention the language), there are multiple options. My suggestions would be:
- requests (easy to use library for doing HTTP requests)
- lxml (very fast and lightweight XML and HTML parser written in C, with XPATH support)
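To give an idea of how little code that combination takes, here's a minimal sketch of the workflow. The HTML snippet and the XPath expressions are made up for illustration; in practice you'd get the page bytes from requests.get(url).content instead of a literal string.

```python
from lxml import html

# Stand-in for the response body you'd normally get via
# requests.get(url).content (hypothetical markup).
page = b"""
<html><body>
  <div class="item"><a href="/a">First</a></div>
  <div class="item"><a href="/b">Second</a></div>
</body></html>
"""

tree = html.fromstring(page)
# XPath: link text and hrefs inside every div with class "item"
titles = tree.xpath('//div[@class="item"]/a/text()')
links = tree.xpath('//div[@class="item"]/a/@href')
print(titles)  # ['First', 'Second']
print(links)   # ['/a', '/b']
```

That's the whole loop for most scraping jobs: fetch bytes, parse once, pull out exactly the nodes you want with XPath.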
Those two are really all you need. In some rare cases you might need JavaScript support. I've personally never needed it, because oftentimes you can parse out whatever you need with substring operations or regular expressions on the relevant piece of JS. That saves a lot of overhead, and running a JS engine again introduces a rather large attack surface. But should you need it anyway, there are Python bindings for Google's V8 engine.
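As an example of the regex approach: pages often embed their data in an inline script as a JS object literal that happens to be valid JSON, so you can cut it out and load it without ever running the script. The page snippet and variable name below are made up:

```python
import re
import json

# Hypothetical inline script found in the fetched HTML.
page = '<script>var config = {"user": "alice", "count": 3};</script>'

# Grab the object literal assigned to "config" (non-greedy, so it
# stops at the first closing brace followed by a semicolon).
match = re.search(r'var config = (\{.*?\});', page)
config = json.loads(match.group(1))
print(config["user"])   # alice
print(config["count"])  # 3
```

If the object literal isn't valid JSON (unquoted keys, trailing commas), you can usually still get at individual values with a slightly more targeted regex.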