You're viewing a comment by alessio and its responses.

alessio
June 22, 2011, 12:47

Hello, you did a great job and your code really helped me. To get the total number of results, I have a quick and dirty solution that works with the current Google version; someone with regexp skills could probably improve it.
Just add this method (adapted from the previous _extract_info):

def _extract_total_results_num(self, soup):
    # Returns the total result count as an int, or 0 if it cannot be parsed.
    # Needs "import re" at the top of the module.
    div_ssb = soup.find('div', id='resultStats')
    if not div_ssb:
        self._maybe_raise(ParseError, "Div with number of results was not found on Google search page", soup)
        return 0

    txt = ''.join(div_ssb.findAll(text=True))
    txt = txt.replace(',', '')  # drop thousands separators
    matches = re.search(r'About \d+ results', txt, re.U)
    if not matches:
        return 0
    # Strip the leading "About " (6 chars) and the trailing " results" (8 chars).
    res_num = int(matches.group(0)[6:-8])
    return res_num

Then modify the get_results function by replacing:

'total': MAX_VALUE}

with:

'total': self._extract_total_results_num(page)}
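For anyone who wants to check the parsing step on its own, here is a minimal standalone sketch that does not need the rest of the library. The function name parse_total_results and the sample strings are assumptions made for illustration, and the wording Google actually puts in the resultStats div may differ. It uses a capture group instead of slicing, and also accepts the text without the leading "About".

```python
import re

def parse_total_results(stats_text):
    """Return the result count from text like 'About 1,230,000 results', or 0."""
    # Drop thousands separators so the digits form a single run.
    cleaned = stats_text.replace(',', '')
    # "About " is optional; the capture group holds only the digits.
    match = re.search(r'(?:About )?(\d+) results?', cleaned)
    return int(match.group(1)) if match else 0

if __name__ == '__main__':
    # Hypothetical sample text, not copied from a real results page.
    print(parse_total_results('About 1,230,000 results (0.21 seconds)'))
    print(parse_total_results('no stats here'))
```

Returning 0 when nothing matches keeps the 'total' value numeric either way.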

Just let me know if it worked for someone else...

Comment Responses

July 14, 2012, 13:02

Hi, I applied all the patches but num_results is still zero. Could you send your code to my email (googcheng at) or point me to the website where you host it? I hope you can help; I want to calculate the PMI.
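Once the result counts come back as numbers, PMI (pointwise mutual information) can be estimated from them. The sketch below is only one common way to do it: the function name pmi_from_hits is hypothetical, and total_pages (the size of the index) is a value you have to assume, since the search page does not report it. It uses PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ), with each probability approximated as hits / total_pages.

```python
import math

def pmi_from_hits(hits_x, hits_y, hits_xy, total_pages):
    """Estimate PMI in bits from hit counts; return None if any count is not positive."""
    if min(hits_x, hits_y, hits_xy, total_pages) <= 0:
        return None  # log of zero is undefined, e.g. when num_results is 0
    # p(x,y) / (p(x) * p(y)) simplifies to hits_xy * N / (hits_x * hits_y).
    return math.log2(hits_xy * total_pages / (hits_x * hits_y))

if __name__ == '__main__':
    # Made-up counts for illustration only.
    print(pmi_from_hits(1000, 1000, 100, 10000))
```

A zero count gives None rather than an error, so a failed results parse is easy to spot.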
