web scraping - Using Tor and Python to scrape Google Scholar
I am working on a project to analyze how journal articles are cited. I have a large file of journal article titles. I intend to pass them to Google Scholar and see how many citations each has.
The strategy I am following is:
- Use "scholar.py", a previously written Python script that searches Google Scholar and returns information on the first hit in CSV format (including the number of citations).
- Google Scholar blocks you after a certain number of searches (I have about 3000 article titles to query). I have found that most people use Tor to solve this problem. Tor is a service that gives you a random IP address every few minutes.
I have both scholar.py and Tor successfully set up and working. I am not very familiar with Python or the urllib2 library, and I wonder what changes need to be made to scholar.py so that queries are routed through Tor.
I am also open to suggestions for an easier (and potentially quite different) approach to large-scale Google Scholar queries, if one exists.
Thank you in advance
The best way to use Tor.
See.
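As an illustration of what the question asks for, here is a minimal sketch of sending urllib requests through a proxy. It is not taken from scholar.py or from the answer above. It assumes an HTTP proxy such as Privoxy or Polipo is running locally and forwarding to Tor; the address 127.0.0.1:8118 and the function names are assumptions for illustration. The code uses Python 3's urllib.request; in Python 2 the same ProxyHandler, build_opener and install_opener calls live in urllib2.

```python
import urllib.request

# Assumption: an HTTP proxy (e.g. Privoxy or Polipo) listens on this address
# and forwards traffic to the local Tor client. Adjust to your own setup.
TOR_HTTP_PROXY = "http://127.0.0.1:8118"


def build_tor_opener(proxy_url=TOR_HTTP_PROXY):
    """Return an opener that sends HTTP and HTTPS requests via proxy_url."""
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(proxy_handler)


def install_tor_opener(proxy_url=TOR_HTTP_PROXY):
    """Make later urllib.request.urlopen() calls go through the proxy."""
    opener = build_tor_opener(proxy_url)
    urllib.request.install_opener(opener)
    return opener
```

Calling install_tor_opener() once at start-up only affects code that goes through urlopen(). If a script builds its own opener, the ProxyHandler would instead have to be passed to that script's build_opener call.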