Tuesday, 20 August 2013

Web Crawler with Ajax/JavaScript

I tried to use HtmlUnit to implement a crawler that can obtain the
content generated by Ajax requests and JavaScript execution. However,
HtmlUnit is not powerful enough for my needs, because it cannot obtain
all of the rendered DOM elements produced by JavaScript or Ajax. I then
tried pywebkitgtk and pyQtwebkit; they did generate some of the dynamic
DOM elements, but they do not run stably, and I have no idea how to fix
that. Someone also mentioned using Selenium. Can anybody give me some
suggestions for implementing an Ajax crawler? Many thanks!
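
For reference, here is a minimal sketch of what the Selenium approach might look like in Python. It is untested against any real site and rests on several assumptions: the Selenium 4 Python bindings and a Firefox driver are installed, and the page signals that its Ajax content has loaded by adding an element that matches a known CSS selector. The names fetch_rendered_html, extract_links, LinkCollector and wait_selector are hypothetical helpers made up for this illustration, not part of Selenium.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return the links found in an HTML string, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def fetch_rendered_html(url, wait_selector, timeout=10):
    """Load url in a real browser and return the DOM after scripts have run.

    Assumption: an element matching wait_selector appears once the
    Ajax content is ready. Raises a timeout error otherwise.
    """
    # Imported lazily so extract_links works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Poll until at least one matching element exists in the DOM.
        WebDriverWait(driver, timeout).until(
            lambda d: d.find_elements(By.CSS_SELECTOR, wait_selector)
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

A crawl step would then call fetch_rendered_html(page_url, "#results") (the selector is a placeholder) and pass the returned HTML to extract_links to find the next pages to visit. Driving a full browser is slower than HtmlUnit, but the DOM it returns is the one the browser actually rendered.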
