You're replying to a comment by Tom.
I am getting the following error when scraping Google from Python:
urllib2.HTTPError: HTTP Error 503: Service Unavailable
The strange thing is that I can view the results through a browser on the same machine (same IP, etc.) with no problem. I am already masquerading the "User-Agent" header to mimic Firefox. I am also collecting the cookies and feeding them back with each request.
Does anyone have an idea how Google tells the Python screen scraper apart from the browser? Perhaps some other header in the request?
I would be very grateful for anybody's thoughts and experiences on this.
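For anyone comparing the two requests, here is a minimal sketch of the setup described above (spoofed User-Agent plus cookies fed back on every request), written with Python 3's `urllib.request`, the successor to `urllib2`. The header values and the helper names `make_opener` and `fetch` are assumptions made for illustration, and nothing here is known to be what Google actually checks. It only makes it easier to see what the script sends and what comes back with the 503.

```python
import http.cookiejar
import urllib.error
import urllib.request

# Hypothetical header set, roughly what a desktop Firefox might send.
# These values are illustrative assumptions, not a known-good recipe.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) "
                  "Gecko/20100101 Firefox/115.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}


def make_opener():
    """Build an opener that stores cookies and sends browser-like headers."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Replace the default "User-agent: Python-urllib/x.y" header list.
    opener.addheaders = list(BROWSER_HEADERS.items())
    return opener, jar


def fetch(opener, url):
    """Return (status, headers, body), including for error responses."""
    try:
        with opener.open(url, timeout=10) as resp:
            return resp.status, dict(resp.headers), resp.read()
    except urllib.error.HTTPError as err:
        # A 503 still carries headers and a body that are worth inspecting.
        return err.code, dict(err.headers), err.read()
```

Calling `fetch(opener, url)` on the failing URL and printing the returned status, headers, and the start of the body may show whether the 503 comes with a `Retry-After` header or an explanatory page. Comparing `BROWSER_HEADERS` against the headers the real browser sends (as shown in its network inspector) is one way to spot a difference.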
Peteris Krumins' blog about programming, hacking, software reuse, software ideas, computer security, browserling, google and technology.