You're viewing a comment by chris and its responses.
Very nice program! I'm finding that Google hates being scraped unless I put in a huge delay. I need to thoroughly sift one single domain name for tens of thousands of pages of data. At this rate it is going to take weeks. Does anyone know where I can get or buy archived search data, so I could sort it locally without the lag and the terms of service issues?
Get a web crawling program, and just crawl that domain. Skip Google.
I do this all the time, using a free program called WinHTTrack (on Windows; also available on other platforms). See http://www.httrack.com/
Aim HTTrack at the top page of the site, and start the crawling. It does a great job grabbing anything that is linked.
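If you would rather script the crawl than run HTTrack, the same idea (start at the top page, follow every link, stay on the one domain) can be sketched in a few lines of Python using only the standard library. This is a rough illustration, not how HTTrack itself works: the names `LinkParser`, `fetch`, `crawl`, `max_pages` and `delay` are made up for this example, and it ignores robots.txt, non-HTML content and retries, all of which a real crawl should handle.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download one page and return its text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=100, delay=0.0):
    """Breadth-first crawl that never leaves the host of start_url.

    Returns a dict mapping each visited URL to its HTML text.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except OSError:
            continue  # skip pages that fail to download
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop the #fragment part.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
        time.sleep(delay)  # be polite to the server
    return pages
```

Calling `crawl("http://example.com/", delay=1.0)` would fetch up to 100 pages from that one host, pausing a second between requests. The `fetch_page` parameter is only there so the download step can be swapped out, for example to read from a local cache or to test the link-following logic without touching the network.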
Peter Krumins' blog about programming, hacking, software reuse, software ideas, computer security, browserling, google and technology.