Tom Permalink
November 09, 2009, 19:42

I am getting the following error when scraping Google from Python:

urllib2.HTTPError: HTTP Error 503: Service Unavailable

The strange thing is that I can view results through a browser on the same machine (same IP, etc.) with no problem. I am already masquerading the "User-Agent" header to mimic Firefox. I am also collecting the cookies and feeding them back with each request.

Does anyone have an idea how Google is telling the Python screen scraper apart from the browser? Perhaps some other header in the request?
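
For illustration, the setup described above might look roughly like the sketch below. It assumes Python 3's urllib.request (the successor to urllib2) rather than the exact code that produced the error. The header values and the helper name build_opener_and_request are hypothetical. It only shows where extra browser-like headers such as Accept and Accept-Language would go, and it is not a known fix for the 503.

```python
import http.cookiejar
import urllib.request

# Hypothetical header set: a real browser sends more than just User-Agent,
# so a request carrying only that one header can still look unusual.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}


def build_opener_and_request(url):
    """Return (opener, request, jar) without sending anything over the network."""
    # The cookie jar stores cookies from responses and replays them on
    # later requests made through the same opener.
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Attach the browser-like headers to the request itself.
    request = urllib.request.Request(url, headers=BROWSER_HEADERS)
    return opener, request, jar
```

Calling opener.open(request) would actually send the request; a 503 response then surfaces as urllib.error.HTTPError, whose code attribute holds the status. Comparing request.header_items() against what the browser sends is one way to spot a header that is missing.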

I would be very grateful for anybody's thoughts and experiences on this.

Tom
