You're viewing a comment by alessio and its responses.


Hello, you did a great job, and your code really helped me. For the number of results I have a very dirty solution that works with the current version of Google; it could probably be improved by someone with regexp skills.
Just add this function (adapted from the previous _extract_info):
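    # Requires `import re` at the top of the module.
    def _extract_total_results_num(self, soup):
        empty_info = {'from': 0, 'to': 0, 'total': 0}
        # The result count lives in <div id="resultStats">.
        div_ssb = soup.find('div', id='resultStats')
        if not div_ssb:
            self._maybe_raise(
                ParseError,
                "Div with number of results was not found on Google search page",
                soup)
            return empty_info
        # Flatten the div to plain text and drop thousands separators.
        txt = ''.join(div_ssb.findAll(text=True))
        txt = txt.replace(',', '')
        # Capture the digits directly instead of slicing the matched string.
        matches = re.search(r'About (\d+) results', txt, re.U)
        if not matches:
            return ''
        # The count is returned as a string of digits, e.g. '1230000'.
        return matches.group(1)

And then modify the get_results function by substituting: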
with:
Just let me know if it worked for someone else...
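For anyone who wants to tighten the regexp part, here is a minimal standalone sketch of just the parsing step. It assumes the stats text looks like "About 1,230,000 results (0.32 seconds)" or "1 result"; the name parse_total_results is hypothetical and not part of the original library, and the function takes the already-extracted text rather than the soup object.

```python
import re

# Hypothetical helper, not part of the original library.
# Assumed input: the text of the resultStats div, for example
# "About 1,230,000 results (0.32 seconds)" or "1 result".
_RESULTS_RE = re.compile(r'(?:About\s+)?([\d,]+)\s+results?')


def parse_total_results(stats_text):
    """Return the result count as an int, or None if no count is found."""
    match = _RESULTS_RE.search(stats_text)
    if not match:
        return None
    # Drop thousands separators before converting to an integer.
    digits = match.group(1).replace(',', '')
    return int(digits) if digits else None
```

Returning an int (or None when nothing matches) gives the caller a single type to check, and the optional "About" prefix lets the same pattern cover pages that show an exact count.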