Google Python Search Library

Here is a quick hack that I wrote. It's a Python library to search Google without using their API. It's quick and dirty, just the way I love it.

Why didn't I use Google's provided REST API? Because it says "you can only get up to 8 results in a single call and you can't go beyond the first 32 results". Seriously, what am I gonna do with just 32 results?

I wrote it because I want to do various Google hacks automatically, monitor popularity of some keywords and sites, and to use it for various other reasons.

One of my next posts is going to expand on this library and build a tool that perfects your English. I have been using Google for a while to find the correct use of various English idioms, phrases, and grammar. For example, "i am programmer" vs. "i am a programmer". The first one is missing the indefinite article "a", but the second is correct. Googling for these terms reveals that the first has 6,230 results, but the second has 136,000 results, so I pretty much trust that the second is more correct than the first.
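The comparison boils down to picking the phrase with the larger hit count. Here is a minimal sketch of that idea, using the two counts quoted above as hard-coded data rather than live queries. The helper name `pick_more_common` is made up for illustration and is not part of the library.

```python
def pick_more_common(hit_counts):
    """Return the phrase with the highest hit count.

    hit_counts maps each candidate phrase to its number of search results.
    """
    return max(hit_counts, key=hit_counts.get)


# Counts quoted in the post, hard-coded here instead of queried live.
counts = {
    "i am programmer": 6230,
    "i am a programmer": 136000,
}
best = pick_more_common(counts)

# How many times more common the winning phrase is.
ratio = counts["i am a programmer"] / float(counts["i am programmer"])
```

A large ratio is only a hint, not proof, that one phrasing is the correct one.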

Subscribe to my posts via catonmat's rss, if you are intrigued and would love to receive my posts automatically!

How to use the library?

First download the xgoogle library, and extract it somewhere.

Download: xgoogle library (.zip)
Downloaded: 33536 times.
Download url:

At the moment it contains just the code for Google search, but in the future I will add other searches (google sets, google suggest, etc).

To use the search, from "" import "GoogleSearch" and, optionally, "SearchError".

GoogleSearch is the class you will use to do Google searches. SearchError is an exception class that GoogleSearch throws in case of various errors.

Pass the keyword you want to search as the first parameter to GoogleSearch's constructor. The constructed object has several public methods and properties:

  • method get_results() - gets a page of results, returning a list of SearchResult objects. It returns an empty list if there are no more results.
  • property num_results - returns number of search results found.
  • property results_per_page - sets/gets the number of results to get per page. Possible values are 10, 25, 50, 100.
  • property page - sets/gets the search page.

As I said, the get_results() method returns a list of SearchResult objects. Each one has three attributes -- "title", "desc", and "url". They are Unicode strings, so encode them properly before outputting them.
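To make the encoding step concrete, here is a small sketch that runs on both Python 2 and Python 3. The string below is invented and only stands in for something like a result's "title" attribute.

```python
# A Unicode string with two non-ASCII characters (e-acute and an en dash).
title = u"Caf\u00e9 \u2013 quick and dirty"

# Encoding turns the Unicode string into UTF-8 bytes, safe to write out.
encoded = title.encode("utf8")

# Decoding the bytes gives back the original Unicode string.
decoded = encoded.decode("utf8")
```

The encoded form is longer than the original because each non-ASCII character takes more than one byte in UTF-8.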

Here is a screenshot that illustrates the "title", "desc", and "url" attributes:

Google Search Result, url, title, description
Google search result for "catonmat".

Here is an example program that does a Google search. It searches for a fixed phrase and prints the results:

from import GoogleSearch, SearchError
try:
  gs = GoogleSearch("quick and dirty")
  gs.results_per_page = 50
  results = gs.get_results()
  for res in results:
    print res.title.encode("utf8")
    print res.desc.encode("utf8")
    print res.url.encode("utf8")
except SearchError, e:
  print "Search failed: %s" % e

This code fragment sets up a search for "quick and dirty" and specifies that a result page should have 50 results. Then it calls get_results() to get a page of results. Finally it prints the title, description and url of each search result.

Here is the output from running this program:

Quick-and-dirty - Wikipedia, the free encyclopedia
Quick-and-dirty is a term used in reference to anything that is an easy way to implement a kludge. Its usage is popular among programmers, ...

Grammar Girl's Quick and Dirty Tips for Better Writing - Wikipedia ...
"Grammar Girl's Quick and Dirty Tips for Better Writing" is an educational podcast that was launched in July 2006 and the title of a print book that was ...Writing - 39k -

Quick & Dirty Tips :: Grammar  Girl
Quick & Dirty Tips(tm) and related trademarks appearing on this website are the property of Mignon Fogarty, Inc. and Holtzbrinck Publishers Holdings, LLC. ...

Quick and Dirty search on Google
Compare these results to the output above.

You could also have specified which search page to start from. For example, the following code will get 25 results per page and start the search at the 2nd page.

gs = GoogleSearch("quick and dirty")
gs.results_per_page = 25
gs.page = 2
results = gs.get_results()

You can also quickly write a scraper to get all the results for a given search term:

from import GoogleSearch, SearchError
try:
  gs = GoogleSearch("quantum mechanics")
  gs.results_per_page = 100
  results = []
  while True:
    tmp = gs.get_results()
    if not tmp: # no more results were found
      break
    results.extend(tmp)
  # ... do something with all the results ...
except SearchError, e:
  print "Search failed: %s" % e
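The same "keep asking until an empty list comes back" loop can be wrapped in a generator, so the caller never has to think about pages. This is only a sketch: `iter_all_results` and `FakeSearch` are hypothetical names, and the fake object merely imitates the get_results() contract described earlier so that the example runs without touching Google.

```python
import time


def iter_all_results(search, delay=0.0, sleep=time.sleep):
    """Yield results one by one, fetching page after page.

    `search` is any object whose get_results() returns a list of results
    and an empty list once there are no more.
    """
    while True:
        page = search.get_results()
        if not page:  # empty list: no more results
            return
        for result in page:
            yield result
        if delay:
            sleep(delay)  # pause between page fetches


class FakeSearch(object):
    """Stand-in for a real search object (hypothetical, for illustration)."""

    def __init__(self, pages):
        self._pages = list(pages)

    def get_results(self):
        return self._pages.pop(0) if self._pages else []


fake = FakeSearch([["a", "b"], ["c"]])
all_results = list(iter_all_results(fake))
```

With a real search object you would pass a non-zero `delay` so that the page requests are spaced out.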

You can use this library to constantly monitor how your website is ranking for a given search term. Suppose your website has a domain "" and the search term you want to find your position for is "python videos".

Here is code that outputs your ranking (it looks through the first 100 results; if you need more, put a loop there):

import re
from urlparse import urlparse
from import GoogleSearch, SearchError

target_domain = ""
target_keyword = "python videos"

def mk_nice_domain(domain):
    """Convert domain into a nicer one by stripping a leading www or wwwN prefix."""
    domain = re.sub(r"^www(\d+)?\.", "", domain)
    # add more here
    return domain

gs = GoogleSearch(target_keyword)
gs.results_per_page = 100
results = gs.get_results()
for idx, res in enumerate(results):
  parsed = urlparse(res.url)
  domain = mk_nice_domain(parsed.netloc)
  if domain == target_domain:
    print "Ranking position %d for keyword '%s' on domain %s" % (idx+1, target_keyword, target_domain)

Output of this program:

Ranking position 6 for keyword python videos on domain
Ranking position 7 for keyword python videos on domain
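The ranking logic itself does not depend on Google, so it can be tried on a plain list of URLs. The sketch below reuses the same prefix-stripping regex. The function name `find_positions` and the example.com, example.org and example.net addresses are placeholders made up for this illustration.

```python
import re

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2


def mk_nice_domain(domain):
    """Strip a leading "www." or "www2." style prefix."""
    return re.sub(r"^www(\d+)?\.", "", domain)


def find_positions(urls, target_domain):
    """Return the 1-based positions at which target_domain appears."""
    positions = []
    for idx, url in enumerate(urls):
        if mk_nice_domain(urlparse(url).netloc) == target_domain:
            positions.append(idx + 1)
    return positions


# Made-up result URLs standing in for a page of search results.
urls = [
    "http://www.example.org/a",
    "http://www.example.com/videos",
    "http://www2.example.com/more",
    "http://example.net/",
]
positions = find_positions(urls, "example.com")
```

A domain can show up more than once, which is why the function returns a list of positions instead of a single number.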

Here is a much more wicked example. It uses the GeoIP Python module to find 10 websites for the keyword "wicked code" that are physically hosted in California or New York in the USA. Make sure you download the GeoLiteCity database from "" and extract it to "/usr/local/geo_ip".

import GeoIP
from urlparse import urlparse
from import GoogleSearch, SearchError

class Geo(object):
  GEO_PATH = "/usr/local/geo_ip/GeoLiteCity.dat"

  def __init__(self):
    self.geo = GeoIP.open(self.GEO_PATH, GeoIP.GEOIP_STANDARD)

  def detect_by_host(self, host):
    try:
      gir = self.geo.record_by_name(host)
      return {'country': gir['country_code'].lower(),
              'region': gir['region'].lower()}
    except Exception, e:
      return {'country': 'none', 'region': 'none'}

dst_country = 'us'
dst_states = ['ca', 'ny']
dst_keyword = "wicked code"
num_results = 10
final_results = []
geo = Geo()

gs = GoogleSearch(dst_keyword)
gs.results_per_page = 100

seen_websites = []
while len(final_results) < num_results:
  results = gs.get_results()
  if not results:  # ran out of search results
    break
  domains = [urlparse(r.url).netloc for r in results]
  for d in domains:
    geo_loc = geo.detect_by_host(d)
    if (geo_loc['country'] == dst_country and
                 geo_loc['region'] in dst_states and
                 d not in seen_websites):
      final_results.append((d, geo_loc['region']))
      seen_websites.append(d)
      if len(final_results) == num_results:
        break

print "Found %d websites:" % len(final_results)
for w in final_results:
    print "%s (state: %s)" % w

Here is the output of running it:

Found 10 websites:
 (state: ca)
 (state: ca)
 (state: ca)
 (state: ca)
 (state: ca)
 (state: ca)
 (state: ca)
 (state: ca)
 (state: ny)
 (state: ca)

You may modify these examples the way you wish. I'd love to hear some comments about what you can come up with!

And just for fun, here are some other simple uses:

You can make your own Google Fight:

import sys
from import GoogleSearch, SearchError

args = sys.argv[1:]
if len(args) < 2:
  print 'Usage: "keyword 1" "keyword 2"'
  sys.exit(1)

try:
  n0 = GoogleSearch('"%s"' % args[0]).num_results
  n1 = GoogleSearch('"%s"' % args[1]).num_results
except SearchError, e:
  print "Google search failed: %s" % e
  sys.exit(1)

if n0 > n1:
  print "%s wins with %d results! (%s had %d)" % (args[0], n0, args[1], n1)
elif n1 > n0:
  print "%s wins with %d results! (%s had %d)" % (args[1], n1, args[0], n0)
else:
  print "It's a tie! Both keywords have %d results!" % n1

Downloaded: 8076 times.
Download url:

Here is an example usage of

$ ./ google microsoft
google wins with 2680000000 results! (microsoft had 664000000)

$ ./ "linux ubuntu" "linux gentoo"
linux ubuntu wins with 4300000 results! (linux gentoo had 863000)

After I wrote this, I generalized Google Fight to take N keywords, and made passing them to the program easier by allowing them to be separated by commas.

import sys
from operator import itemgetter
from import GoogleSearch, SearchError

args = sys.argv[1:]
if not args:
  print "Usage: keyword one, keyword two, ..."
  sys.exit(1)

keywords = [k.strip() for k in ' '.join(args).split(',')]
try:
  results = [(k, GoogleSearch('"%s"' % k).num_results) for k in keywords]
except SearchError, e:
  print "Google search failed: %s" % e
  sys.exit(1)

results.sort(key=itemgetter(1), reverse=True)

results.sort(key=itemgetter(1), reverse=True)
for res in results:
    print "%s: %d" % res

Downloaded: 8268 times.
Download url:

Here is an example usage of

$ ./ earth atmospehere, sun atmosphere, moon atmosphere, jupiter atmosphere
earth atmospehere: 685000
jupiter atmosphere: 31400
sun atmosphere: 24900
moon atmosphere: 8130
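The comma trick works because the shell splits the command line on spaces, so the program first glues the words back together and then splits on commas. Here is a sketch of just that parsing step plus the final sort, with counts copied from the output above instead of queried live. The function names are made up, and dropping empty keywords is a small addition of mine.

```python
def parse_keywords(args):
    """Join command-line words and split them into comma-separated keywords."""
    joined = " ".join(args)
    # Drop empty pieces, e.g. those caused by a trailing comma.
    return [k.strip() for k in joined.split(",") if k.strip()]


def rank_by_count(counts):
    """Sort (keyword, count) pairs from most to fewest results."""
    return sorted(counts.items(), key=lambda pair: pair[1], reverse=True)


# What the shell would pass for: sun atmosphere, moon atmosphere
args = ["sun", "atmosphere,", "moon", "atmosphere"]
keywords = parse_keywords(args)

# Counts taken from the example output above, not queried live.
ranking = rank_by_count({"moon atmosphere": 8130, "sun atmosphere": 24900})
```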

I am going to expand on this library and add search for Google Sets, Google Sponsored Links, Google Suggest, and perhaps some other Google searches. Then I'm going to build various tools on them, like a sponsored links competitor finder, use Google Suggest together with Google Sets to find various phrases in English, and apply them to tens of my other ideas.

Download "xgoogle" library and examples:

Download: xgoogle library (.zip)
Downloaded: 33536 times.
Download url:



March 12, 2009, 16:15

Doesn't google throttle you after a while, if you scrape their pages too often? Too many requests from one IP and Google stops responding...

Daniel Vitor Morilha Permalink
March 12, 2009, 17:48

I usually google words to get the correct spell of it. Actually I had an idea a while ago to make a editor based on phrases popularity :P

March 12, 2009, 18:30

Looks like the scraper example you published pulls tons of dupes. Any way to fix?

March 12, 2009, 18:44

Jorge, it does. got to be careful. put a sleep between calls if you are doing a lot of scraping.

Daniel, me too. I have had this idea for a while as well. :)

Steve, can you tell me the query you used? Google sometimes displays 2 results from the same site (2nd usually indented to the right), that's an ok behavior. One way to escape that is to keep a list or dict of seen urls, then check if you have seen the url already.
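The "keep a set of seen urls" idea could look roughly like this. It is a sketch, not library code: `dedupe_by_url` and the tiny `Result` class are invented here, and the only assumption is that each result has a "url" attribute.

```python
def dedupe_by_url(results, seen=None):
    """Return results whose url has not been seen yet, keeping order.

    `results` is any iterable of objects with a `url` attribute.
    `seen` is a set that can be shared across several pages.
    """
    if seen is None:
        seen = set()
    unique = []
    for res in results:
        if res.url not in seen:
            seen.add(res.url)
            unique.append(res)
    return unique


class Result(object):
    """Minimal stand-in for a search result (hypothetical)."""

    def __init__(self, url):
        self.url = url


page = [Result("http://example.com/a"),
        Result("http://example.com/b"),
        Result("http://example.com/a")]
unique_urls = [r.url for r in dedupe_by_url(page)]
```

Passing the same `seen` set for every page filters duplicates across the whole scrape, not just within one page.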

malik Permalink
November 06, 2014, 07:33

which type of python version does it supporting?

Kamil Permalink
March 12, 2009, 22:00

Kid, use corpora for such checks, don't burn energy on google's servers. Jeez.

malik Permalink
November 06, 2014, 07:31

can u tell me what is Corpora ? Because i am finding out the tutorials that would help me to extract all base on query. for example if i type 'Programming in c++" in python it should generate all types of link. So help me if u know.

March 12, 2009, 22:42

Just remember about ;)

March 12, 2009, 22:52

Kamil, are there publicly available corpora? I know Google's one but it's on 6 DVDs and costs $150.

Gints, shhhhhh.

Madars Permalink
March 12, 2009, 23:25

as I said to you in private discussion, I still miss iteration so much. that would make xgoogle more pythonic and useful.

March 13, 2009, 03:12

Gints beat me to commenting about that... which is why I've been using other search engines so far.

I've been looking into this kind of thing. I had found different python bindings to their ajax search.


This looks interesting though. Bookmarked.

March 13, 2009, 13:02

Hi there,

you could also use the "pyajaxgoogle" binding [1] to search.


March 13, 2009, 14:34

ME, not really. That is their api that gives 32 results.

March 13, 2009, 15:07

Peter: Yes, but at least it's legal.

March 13, 2009, 15:54

Everything is legal... Having a mindset that something is illegal is just wrong (in a sense that you will always hold back from creating something cool, because you think it's illegal).

Varun Thacker Permalink
March 13, 2009, 18:55

my college uses a proxy server(NTLM) which is with authentication.How do I use xgoogle?

March 13, 2009, 20:01

Varun, you can set environment variable:

$ http_proxy=""
$ export http_proxy

and then run the application that uses xgoogle.

Other way is to edit xgoogle/ file and add


to list of handlers.
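The environment-variable route relies on Python's standard URL machinery reading proxy settings from the environment, assuming the library fetches pages through it. The sketch below only shows how the variable is picked up. The proxy address is a placeholder, and nothing here deals with NTLM authentication itself.

```python
import os

try:
    from urllib.request import getproxies_environment  # Python 3
except ImportError:
    from urllib import getproxies_environment          # Python 2

# Placeholder proxy address; substitute your own.
os.environ["http_proxy"] = "http://proxy.example:3128"

# Scan the environment for <scheme>_proxy variables.
proxies = getproxies_environment()
```

The returned dictionary maps a scheme such as "http" to the proxy URL that will be used for it.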

March 19, 2009, 08:21

Hi! This is a beautifull example how to use google and python. I have my own projects with python. First i try to show the people what simple and good is python. I started with Romania.

Goodidea Permalink
April 06, 2009, 02:58

Nice idea i will explore the code as a tutorial
Thanks a lot men :-)

May 05, 2009, 00:59

Hi Peter, I was looking at your code. I have done a much simpler utility for searching google. I used to use beautifulsoup also, but now I use lxml and xpath. It produces much quicker and cleaner code... here is an example that returns an array of the urls and text:

from lxml import etree as et
from urllib import quote_plus,urlopen

def gsearch(q='',num=10,datelimit=''):  
        for a in links:
        return returninfo

Let me know what you think!

May 05, 2009, 01:10

Chad, thanks for leaving a comment. There are a couple of points that I want to make:

1. you don't check for errors - my code does very rigorous error checking so that my applications do not suddenly die because of unhandled exceptions.

2. i love the conciseness of your code - mine is 10x longer.

3. i did not know lxml supported xpath - very nice to learn that.

4. i know lxml is much faster than BeautifulSoup, but it's also less tolerant of malformed HTML. but perhaps in the case when we parse Google it's not that important.

That's everything that I can think of at the moment.

May 05, 2009, 01:32

The code was just a snippet of some other code I have, but in regards to your points:

1. There are only 3 points where errors can creep in that I see:
1- if the urlopen fails or
2- during the htmlparser() if the html is super-malformed (same w/ BeautifulSoup).
3- if google changes their html format(but that will screw up almost any scraper)

The xpath and rest of the code will work without problem since xpath will return '[]' if the xpath fails.

4. If you change the line:
then lxml handles malformed html almost as well as BeautifulSoup.

Anyways, I enjoy your blog, and just thought that I'd throw that out there.

Maj Permalink
June 08, 2009, 22:36

Scraping is dangerous because all it takes is one change to destroy all the work.

Sam Permalink
June 17, 2009, 04:16

How would you recommend folding this into a script that uses a set google query and set parameters and writes the output to a file? This way it could be used regularly without feeding it all the variables over and over... (Sorry if this is obvious to others out there!)

June 28, 2009, 16:29

Thank you for this nice library. It is very useful I think. I have got 503 errors, even if I use sleep between search actions. I think this can be related to agent setting. How can we set a custom browser agent in you code?

June 28, 2009, 16:30

Volkan, it's somewhere in the source. I did not make it explicitly changeable.

Munir Permalink
July 27, 2009, 09:16

Just cant stop my self to comment on your blog. Good post.

Orz Permalink
July 27, 2009, 19:16

Dude, awesome! This works for my IRC bot!

July 28, 2009, 11:18

Any chance you'd be willing to put it up on Bitbucket or GitHub? :)

July 28, 2009, 12:20

Oh - and licensing it as open source?

July 31, 2009, 03:30

Doug, about bitbucket or github: Sure. I will. I just have to automate my tools more, to push out changes from my repo to bitbucket or github. I don't want to do anything manually. I haven't yet done this, but I soon will. At the moment the latest version is always at

Doug, about licensing: All my work is open source. You may use it any way you wish.

lowel Permalink
August 09, 2009, 22:51

thanks for a very nice lib.
I have added two more parameters domain and hl.

However changing the language parameters will give no results. It seems to be the html that is slightly different using i.e. hl=sv (swedish). I have been flagging your code for some hours now as this is my first time using python. Would you have the solution for this? Even though is probably the most used way, I am interested in the local versions as well.


stray Permalink
August 23, 2009, 19:00

Hi Peter! I'm using your lib, but i've some problems surfing google pages. What is the best way to change it?
How can i know when they're over?



August 23, 2009, 19:33

Hi Stray! They are over when get_results() returns an empty list.

To get all results do this:

results = []
while True:
  tmp = gs.get_results()
  if not tmp: # no more results were found
    break
  results.extend(tmp)

August 25, 2009, 02:34


Thanks a lot for the code! I was rather saddened to see pygoogle no longer being maintained, nor Google releasing any SOAP API keys any longer. This is perfect.

One thing I run across- I wondered if there was an easy way to enclose a search in quotes ("") instead of the default?

I'm searching for rather long strings that pretty much require some quotes around it to pull the exact results.

Thanks for any input!


balcon Permalink
August 28, 2009, 16:34

I have an error:
Search failed: Failed getting HTTP Error 503: Service Unavailable
What i do wrong?

September 06, 2009, 23:22

Hi Peter,

I've used this library for a variety of things so I thought I would just pop in and say thanks for providing something that works well and is easy to use.

I just started writing a replacement google library for my company's internal use. Like chad, I am also a huge fan of lxml for a variety of reasons. I also would like to make the library a bit more "pythonic" in general by adding smart generators so that you can iterate over results without worrying about what page they are on. I wrap your SearchResults objects already so I will probably provide a class/function hook in the constructor so can yield() instances of WhateverClass.

balcon: that 503 error is 99% likely to mean that you tried to scrape google too quickly. Bare minimum time between searches is about 10 seconds if you're doing more than 5-6 requests.
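One way to act on that advice is to retry a failed request after a growing pause. This is a generic sketch, not something the library provides: `fetch_with_backoff` is a made-up helper, the 10-second base delay simply echoes the figure mentioned above, and the failing fetch is faked so the example runs instantly without any network access.

```python
import time


def fetch_with_backoff(fetch, retries=3, base_delay=10.0, sleep=time.sleep):
    """Call fetch(), waiting longer after each failure.

    The wait doubles each time: base_delay, 2*base_delay, 4*base_delay...
    The last exception is re-raised if every attempt fails.
    """
    for attempt in range(retries + 1):
        try:
            return fetch()
        except Exception:  # real code would catch the specific search error
            if attempt == retries:
                raise
            sleep(base_delay * (2 ** attempt))


# Demonstration with a fake fetch that fails twice, then succeeds.
calls = {"n": 0}
waits = []  # records the requested pauses instead of really sleeping


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("HTTP Error 503: Service Unavailable")
    return ["result"]


outcome = fetch_with_backoff(flaky_fetch, sleep=waits.append)
```

In real use you would leave `sleep` at its default and pass a function that performs one page fetch.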



stray Permalink
October 07, 2009, 17:37

Hi Pete! I'm again here to post :=)
You suggest me that they(google pages) are over when get_results() returns an empty list. Using your lib i've found it is not so true in fact look at this scenario:

-> Google results: 1680000
-> Dork: inurl:polito (it's my university :P)

... I print all links

-> Results: 100

But as you can see they're only 100 instead of 1680000. I don't demand to have all these outcomes (I've used your examples)

Have fun!



October 07, 2009, 18:51

Stray, here is the code that I just tried:

>>> from import GoogleSearch
>>> import time
>>> gs = GoogleSearch("inurl:polito")
>>> gs.results_per_page = 100
>>> res = []
>>> while True:
...  tmp = gs.get_results()
...  if not tmp:
...   break
...  res.extend(tmp)               
...  time.sleep(5)
>>> print len(res)

Seems to work for me.

The thing is that Google can show that it has 10 billion results but in reality it will return only 1000 for any search. And if it thinks there are some duplicates in those 1000, then it will return even less. In this case it returned 618 results.

lowel Permalink
October 16, 2009, 10:26

Did someone manage to use different parameters for other google domains and languages?

Kind regards,

October 27, 2009, 08:32

I can't download file. Please fix link. Thanks very much.

October 27, 2009, 09:23

AloneRoad, works for me. Try to see what is going on on your side.

Tom Permalink
November 09, 2009, 19:42

I am getting the following error scraping Google from python:

urllib2.HTTPError: HTTP Error 503: Service Unavailable

Strange thing is that I can view results through a browser on the same machine (same IP etc.) no problem. I am already masquerading the "User Agent Id" to mimic Firefox. I am also collecting the cookies and feeding these back with the request.

Does anyone have an idea as to how Google will be telling the Python screen scraper versus the browser apart? Perhaps some other header in the request?

I would be very grateful to get anybody's thoughts and experiences on this.


November 23, 2009, 00:32

Well done, thanks for providing this.

Given that you are a scientist, have a go at a 'quick and dirty', easy-to-use Google Scholar lib next. That's duly needed, as Google doesn't provide an API for that service (yet?).

The scientific community will be eternally thankful. Some use cases and suggestions can be found here:

November 23, 2009, 01:07

Alessandro, thanks for the comment. The never ending list of requests to add API for Google Scholar inspired me to write it right this very moment! I am doing it!

November 23, 2009, 01:31

Tom, I didn't notice your comment.

Perhaps Google figured that your user agent was spammy. Try creating GoogleSearch object with random_agent argument set to true:

from import GoogleSearch

gs = GoogleSearch(query, random_agent=True)
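For readers wondering what a random agent amounts to, here is a generic sketch of picking a User-Agent header at random. It is not how the library implements random_agent. The agent strings, the `USER_AGENTS` list and the `build_headers` function are all illustrative assumptions.

```python
import random

# Illustrative User-Agent strings; any realistic list would do.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


def build_headers(random_agent=False, rng=random):
    """Return request headers, optionally with a randomly chosen User-Agent."""
    agent = rng.choice(USER_AGENTS) if random_agent else USER_AGENTS[0]
    return {"User-Agent": agent}


headers = build_headers(random_agent=True)
```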
November 23, 2009, 02:30

Thanks for the code, can't wait to put it to use!

December 01, 2009, 22:57

Marhaban (More informal Hi, greetings in Arabic) Peter.

Firstly, Excellent work!

I do very specialized English to Arabic names and terms transcription work; verifying their integrity and veracity.

"Transcriptions" is the formal linguistic terminology for "spellings".

Without explaining further you can bring up my unique, very easy to use web page:

Your xgoogle Google parser is what I've had in mind for a long time. Once I (the user) verifies which Arabic transcription variation(s) to use as search terms your Google parser is a powerful adjunct.

Here is my slight deviation on your original code with the native Arabic search term: native Arabic "Philadelphia".

For those of you versed in Arabic or other Semitic languages such as Hebrew Philadelphia in the incorrectly reads left to right where it should read right to left with contiguous characters.

Hopefully on the submit the Arabic characters will retain their at least human readable form even though they're in the wrong direction and not revert to some %hex encoding.

But no worries! From a functional respect it all correctly "comes out in the wash" (i.e. run the script)

My question is it seems on the whole your script works fine but on looking at a corresponding "native" Google search via Firefox I seem to be missing some URL's per page.

I was wondering if you have any plans to upgrade to different languages? Maybe there's some encodings not being recognized by your code ... perhaps some setting or designation I can do from my side?

That said the number of URLs I seem to miss nowhere near invalidates using your work as a wonderful adjunct to mine.

Job well done!


Joel S.

# -*- coding: utf-8 -*-

import sys


from import GoogleSearch, SearchError
try:
    gs = GoogleSearch("فيلادلفيا")
    gs.results_per_page = 100
    gs.page = 2
    results = gs.get_results()

    counter = 0

    for res in results:
        print res.title.encode('utf8')
        print res.desc.encode('utf8')
        print res.url.encode('utf8')
        counter = counter + 1
        print counter

except SearchError, e:
    print "Search failed: %s" % e

December 01, 2009, 23:13

Marhaban Peter; Quick addendum - on pasting my code snippet the Arabic Philadelphia seems to have "righted itself" and appears in perfect,correct human readable form reading right to left.

Yes, for anyone who wants to try my Python snippet it reverts back reading to right in the Python editor.

It should still work fine for you provided you make the proper tweaks from your system.

مع سلامة (Maa Salama or Ciao y'all in Arabic)

Joel S.

Nicolas Couture Permalink
December 06, 2009, 16:29


Thanks for sharing this code. I was wondering is there a particular reason why you are packaging the BeautifulSoup module in xgoogle?

Recently I've ran into troubles using a new version of BeautifulSoup with soup2text ( and using an older version fixed the problem.

Did you encounter a similar situation and if so do you know what additions in BeautifulSoup broke your code?

December 06, 2009, 18:42

Hi Nicolas,

Yes, I encountered a similar situation. I am packaging BeautifulSoup in my code because it's the most stable version I have ever used. The new BS uses a different parsing engine and when doing tests it would throw unexpected errors such as EncodeError, IndexError and others. And the old one parses it just fine.

December 10, 2009, 06:16

Looks awesome, very thorough. In my script I wrote a simple class with a static search method to grab result links from the page source... all I really needed at the time. Though this will definitely be useful to me in the future. Nicely done.

December 22, 2009, 02:50

I am getting a weird error

your example code in the readme file works great in the interactive but fails when I put it in a file
this is the code -->
>>> from import GoogleSearch
>>> gs = GoogleSearch("catonmat")
>>> gs.results_per_page = 25
>>> results = gs.get_results()
>>> for res in results:
... print res.title.encode('utf8')

$ ./
from: can't read /var/mail/
./ line 4: syntax error near unexpected token `('
./ line 4: `gs = GoogleSearch("quick and dirty")'

Whyever is it looking for /var/mail/xgoogle, when I have it loaded in dist-pkgs where the interactive imports it just fine. What am I missing?

December 22, 2009, 03:26

OK. Never mind, I got it working. back-tics crawled in somehow. gs=GoogleSearch(`name')

Debsankha Permalink
January 11, 2010, 13:51

Thanks a lot for the awesome code man. This was exactly what I was looking for in order to write a lyric downloader for amarok in python.

"Google's Terms of Service do not allow the sending of automated queries of any sort to our system without express permission in advance from Google."

If every website adopted this policy, Google themselves would be out of business tomorrow. Or maybe compete with DMOZ. There probably exists no company that sends out more automated queries than Google. It's not exactly the height of hypocrisy, because they do respect robots.txt and if you don't want to be there you don't have to. However, we all know that if you are not to be found on google, you don't exist. Descartes famously said, "Cogito ergo sum" I think therefore I am. Today he would say, "Above the fold on google ergo sum."

Mosalam Permalink
January 21, 2010, 02:42

Thanks for the handy script. I modified to support all languages when it parses from/to/total numbers:

January 26, 2010, 14:39

I'd like to modify the lib to search in differents web sites:,,,

I'll try this afternoon. Do you think it's an easy task? I'll send you a patch then.

January 26, 2010, 14:46

It shouldn't be difficult, Juanjo.

Clone latest code from github:

xgoogle at github.

March 02, 2010, 02:38

Does this API provide a method to get the estimated total number of results for a given search string?

March 02, 2010, 02:48

no, don't bother answering my last question. my bad, i didn't read the post properly.

sunshine Permalink
March 13, 2010, 21:09


I really loved this library. I had some problems with getting results for queries in Hebrew. It looks like it has problems parsing the results page.

Do you plan to add a support?


Yuriy Zhilovets Permalink
March 22, 2010, 14:41

It's not too difficult to send an http query to google and parse an answer. But what to do with Google's captcha? Your library does not seem to take into account such a possibility.

April 17, 2010, 11:15

Thank you, very useful.

Corsair Permalink
April 23, 2010, 02:22

Used this library in a little calculator that queries Google instead of actually calculating anything. Compiled in py2exe just to try and works great! Thanks a lot!

April 27, 2010, 15:36

Yeah - That 8 results thing is really lame on Google's part. Come-on, are they trying to encourage scraping or what?

Sofoklis Permalink
May 06, 2010, 19:33

Scripts don't work anymore. Maybe the recent change in the interface of Google...? Help please!

May 06, 2010, 23:11

I don't have time to fix it now. If someone fixes it, I'll accept the patch and put it on github.

xgoogle repo at github

May 08, 2010, 13:24

I only use xgoogle for checking results for one phrase from time to time and it was sufficient to comment out this code in, method get_results():

#if self.num_results == 0:
# self.eor = True
# return []

It's quick and dirty, just the way author loves it (and maybe even more) ;-)

TimoLindfors Permalink
May 08, 2010, 15:24

Thanks, this works!

ALU Permalink
May 08, 2010, 20:20

Thx man, u are great! u solve my problem!
But can you explains how come by removing the following codes will makes the program works?

revan blezinsky Permalink
May 10, 2010, 08:10


Have you fix this bug?
Exception: __init__() got an unexpected keyword argument 'page'
Traceback (most recent call last):
File "./", line 299, in


g = googleScan(config)
File "/pentest/web/fimap/", line 33, in __init__ = GoogleSearch(self.config["p_query"], page=self.config["p_skippages"])
TypeError: __init__() got an unexpected keyword argument 'page'

Thank You.

vics Permalink
May 14, 2010, 15:41

it's true! google blocked me on second day...

vics Permalink
May 14, 2010, 15:52

sorry! i haven't read this..

Stefan Permalink
May 17, 2010, 18:53

On your quick-and-dirty example search I get the error
Search failed: Div with number of results was not
found on Google search page

Is this due to google search page redesign?

amions Permalink
May 19, 2010, 02:05

Yes, Google search redesigned the structure of the sum of search results which is located on the right top of the search page.
Google would return the begin and end number of the all search result in plain text at before, while now this numbers have been changed by script.

And these changes could influence the xgoogle's function "_extract_info(self, soup)" in the file ""

The crude solution I have taken as following:Change the function get_results(self)" in the file ""

Wish this will helpful for you ~

def get_results(self):
""" Gets a page of results """
if self.eor:
return []
MAX_VALUE = 1000000
page = self._get_results_page()
#search_info = self._extract_info(page)
results = self._extract_results(page)
search_info = {'from': self.results_per_page*self._page, \
'to': self.results_per_page*self._page + len(results),
'total': MAX_VALUE}
if not self.results_info:
self.results_info = search_info
if self.num_results == 0:
self.eor = True
return []
if not results:
self.eor = True
return []
if self._page > 0 and search_info['from'] == self._last_from:
self.eor = True
return []
if search_info['to'] == search_info['total']:
self.eor = True
self._page += 1
self._last_from = search_info['from']
return results

May 21, 2010, 07:12

Hi thanks for the patch but I can't figure out the formatting. Can you post the patch again with correct formatting? Use <pre> ... </pre> to wrap the code. Thanks!

May 20, 2010, 23:14

Hey, Pete. I keep running into problems and am having trouble figuring out what's gone wrong.

Here's my sample code...

$ from import GoogleSearch
$ gs = GoogleSearch('oh yes')
$ gs.get_results()

At this point, I get an empty list back from, from this code section...

if not self.results_info:
	self.results_info = search_info
	if self.num_results == 0:
		self.eor = True
		return []

Can you help a brother out here, please?

The odd part is that sometimes, I get results and sometimes I don't. ...almost like Google is just flat out denying me. However, I would think that I would get an HTTP error, and I am not.

Any help is much appreciated!

May 21, 2010, 07:13

Hi. Xgoogle is currently broken and I don't have enthusiasm to fix it. Someone pasted a patch above but it's not indented right and again I don't have enough enthusiasm to figure out how to indent it right.

May 21, 2010, 12:46

Okay, no problem, Pete. That's all I needed to know. Thanks!

Andrew Permalink
May 21, 2010, 22:14

Hi I think this is the correct formatting.



    def get_results(self):
        """ Gets a page of results """
        if self.eor:
            return []
        MAX_VALUE = 1000000
        page = self._get_results_page()
        #search_info = self._extract_info(page)
        results = self._extract_results(page)
        search_info = {'from': self.results_per_page*self._page,
                       'to': self.results_per_page*self._page + len(results),
                       'total': MAX_VALUE}
        if not self.results_info:
            self.results_info = search_info
            if self.num_results == 0:
                self.eor = True
                return []
        if not results:
            self.eor = True
            return []
        if self._page > 0 and search_info['from'] == self._last_from:
            self.eor = True
            return []
        if search_info['to'] == search_info['total']:
            self.eor = True
        self._page += 1
        self._last_from = search_info['from']
        return results
May 22, 2010, 14:07

I am still getting an empty list back from after replacing `get_results()' with this version. What am I missing/screwing up?

Has this 'patch' worked for anyone else?

May 23, 2010, 13:58

Wow, Andrew, thanks for taking the time to put it together. I'm gonna try it out now and if it works, put it in xgoogle.

Yoyo, will get back to you with results soon.

Andrew Permalink
May 25, 2010, 15:12

Hey yoyo!

Are you getting an empty list or are you just not doing anything with the data? Have you tried running the examples on this site?

Your source does the same for me when running from my ide it returns nothing because I have not done anything with the data.

Let me know cheers

May 27, 2010, 15:28

Well, I am trying out the correctly tabbed patch that Andrew posted now. My command line test worked (Andrew, I failed to post the rest of the commands that showed my work). Since Goo has me pegged already, I am very careful about how I run my program now.

I will get back later on today, with the results.


May 27, 2010, 23:58

Success! Thanks a million!

Here are my results.

Andrew Permalink
May 31, 2010, 01:39

Hey no worries!

Thanks for the props on your site, glad I could help

Have a good one



Andrew Permalink
May 21, 2010, 22:25

Would you by chance need a hand with the development of this library? I am semi-skilled in Python (self-taught) and can offer help to keep it afloat. If you need web hosting, I can offer that for free as well :D I have two dedicated Linux servers at my disposal.

It's a good library, and I am in the process of developing some simple SEO tools with it for a company my web design team works for.

Let me know cheers



May 23, 2010, 14:00

Andrew, sure, join my team!

The latest source code of xgoogle is at github: xgoogle github repo.

Feel free to fork it and start hacking on it! If you wish, you can even be the project leader, as I am currently so busy that I can't spend much time on this project.

Talking about hosting, thanks for the offer! I actually have dedicated server already :) But if I ever need a new server, I'll keep you in mind! Thanks again!

Andrew Permalink
May 21, 2010, 22:35

Simple hack to use more than one google...

In the tool I am developing I needed to be able to input different google's to search so i came up with this simple hack

class GoogleSearch(object):
    def __init__(self, query, tld, random_agent=False, debug=False, lang="en", re_search_strings=None):
        self.query = query
        self._tld = tld

You can then specify a different google by typing

GoogleSearch("keywords", tld="")
Ayushman Permalink
May 28, 2010, 10:54

i have been trying to use the library but it always returns zero results

Thanks in advance

detcad Permalink
May 31, 2010, 20:13

Same here.

detcad Permalink
May 31, 2010, 20:22

The solution seems to be to change get_results() in to this:

June 01, 2010, 19:18

Thanks, it works!

Nick Permalink
June 11, 2010, 19:51

found this, but I've been unable to make it work. I've applied the patch to get_results, but I continue to show 0 results with the examples provided.

is there another known issue?

shun Permalink
June 22, 2010, 00:02

I'm also getting 0 results. The above patch seems to just comment out error checking. Can anyone fix this for the new format? (The new format says "About XXXXX results" instead of "Results X-XX of about XXXXX results")

June 23, 2010, 06:36

Hi, thanks for this. It's amazing and working perfectly for me. Any idea how you would construct the url in class GoogleSearch in to do an google image and google news search? Is it possible. Sorry if this is obvious. I am a Python newbie.

Lamka Permalink
June 28, 2010, 07:58

The script doesn't work for me, I just want to get the total number of results return, but it always return 0, please help!!

July 08, 2010, 16:24

So if I want to use this in a web application, that is a public one, should I put some note or license in case I use this code?


July 22, 2010, 10:18

It seems that this library is not working. When I use this code:

from import GoogleSearch

if __name__ == '__main__':
    gs = GoogleSearch('google')
    print gs.num_results
    results = gs.get_results()
    print len(results)

It prints


Maybe Google changed its page

sebastian Permalink
August 01, 2010, 22:02

Maybe google has changed something on their pages because this library doesn't work.

sebastian Permalink
August 01, 2010, 22:26

oops haven't read the comments above lol, fix has already been posted, thanks for this lib, its pretty cool

August 12, 2010, 21:19

Thanks for an awesome search library. Any plans to release a regex version instead of BeautifulSoup?

chris Permalink
August 23, 2010, 10:50

Very nice program! I'm finding google hates being scraped unless I put in a huge delay. I have a need to thoroughly sift one single domain name for tens of thousands of pages of data. This is going to take weeks at this rate. Does anyone know where I can get / buy archived search data so I could sort it locally without lag and terms of service issues?

Bob Permalink
October 19, 2010, 21:13

Get a web crawling program, and just crawl that domain. Skip Google.

I do this all the time, using a free program called WinHTTrack (on Windows; also available on other platforms). See

Aim HTTrack at the top page of the site, and start the crawling. It does a great job grabbing anything that is linked.

Ugo Permalink
August 30, 2010, 16:08

Is there a way to handle the new Google UI ?
I'm looking for a way to count the number of pages indexed for my website in Google, this library is awesome.
The fix that the person gave above just removes all of this.... any plan on fixing the regexp ?
I've tried all I could, but I'm very bad at using regexp ....

Thanks a lot,


Mike Permalink
September 01, 2010, 21:38

Has anyone gotten this to work lately? Even with the patch, I still get 0 results when I try to search.

Cheese Permalink
September 08, 2010, 07:03

not working anymore?

September 10, 2010, 19:22

What about a version of this for python 3 ?

whatever Permalink
September 24, 2010, 15:06

@RY, if you have to ask this question then you shouldn't be using python 3.

zaggi Permalink
September 11, 2010, 12:27

its working but you have to change this part in
matches = re.search(r'%s (\d+) - (\d+) %s (?:%s )?(\d+)' % self._re_search_strings, txt, re.U)
if not matches:
    return empty_info
return {'from': int(matches.group(1)), 'to': int(matches.group(2)), 'total': int(matches.group(3))}
cos regex below doesnt suit current google template. I do not post my cos it is only for parsing all the results of . Made a quick fix for this

ugo Permalink
September 27, 2010, 08:43

Would be great if you could post your fix though :)

Better than nothing !



September 27, 2010, 11:04

It should be fixed now. Someone sent me in a patch and it semi-works (doesn't return the number of results correctly).

September 20, 2010, 09:41

Is the search limit now set to 10?
It's not possible to increase the number of results beyond 10.

Anyone could confirm this issue?

Timo Juhani Lindfors Permalink
September 21, 2010, 08:59

ecasbas, yes it seems that gs.results_per_page has no effect anymore and I always get 10 results. The query

GET /search?hl=en&q=python&num=50&btnG=Google+Search HTTP/1.1
Accept-Encoding: identity
Accept-Language: en-us,en;q=0.5
Connection: close
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/2009011913 Firefox/3.0.6

just gets a reply with only 10 results. Maybe num=XX parameter just got deprecated?
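For anyone who wants to reproduce that request line while experimenting, here is a rough sketch of how such a query string can be assembled. The function name `build_search_path` is made up for illustration and is not part of xgoogle; the parameter names (hl, q, num, btnG) are simply the ones visible in the GET line above, and whether Google still honours num is exactly the open question.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2


def build_search_path(query, results_per_page=10, lang="en"):
    """Build a /search path like the GET line quoted above (hypothetical helper)."""
    # A list of pairs keeps the parameters in the same order as the request above.
    params = [
        ("hl", lang),
        ("q", query),
        ("num", results_per_page),
        ("btnG", "Google Search"),
    ]
    # urlencode escapes each value and turns spaces into '+'.
    return "/search?" + urlencode(params)


print(build_search_path("python", results_per_page=50))
```

Comparing the printed path for different num values against what the server actually returns is one way to check whether the parameter is being ignored.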

Timo Juhani Lindfors Permalink
September 21, 2010, 22:52

It seems using

gs = GoogleSearch(raw_query, random_agent = True)

instead of

gs = GoogleSearch(raw_query)

works most of the time but of course it might be a good idea to remove the non-working user-agent from the list.

shun Permalink
October 04, 2010, 18:55

Everyone seems to be missing the issue.

elif self.hl == 'en':
    matches = re.search(r'Results (\d+) - (\d+) of (?:about )?(\d+)', txt, re.U)

Google does not display its results this way anymore. Shouldn't the correct regular expression be something like (r'About....

I am such a python wimp that I can't do it on my own. Anybody there that can help?
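A minimal sketch of what such an expression could look like, handling both wordings mentioned in this thread. The sample strings are assumptions based on the "Results X - XX of about XXXXX" and "About XXXXX results" descriptions above, not captured Google output, and `parse_total` is a standalone helper for illustration, not xgoogle's own code.

```python
import re

# Old wording, e.g. "Results 1 - 10 of about 1,230,000"
OLD_STATS = re.compile(r'Results (\d[\d,]*) - (\d[\d,]*) of (?:about )?(\d[\d,]*)', re.U)
# Newer wording, e.g. "About 1,230,000 results (0.25 seconds)"
NEW_STATS = re.compile(r'(?:About )?(\d[\d,]*) results?', re.U)


def parse_total(txt):
    """Return the total result count found in txt, or 0 if neither wording matches."""
    for pattern in (OLD_STATS, NEW_STATS):
        match = pattern.search(txt)
        if match:
            # In both patterns the total is the last captured group.
            return int(match.groups()[-1].replace(',', ''))
    return 0


print(parse_total("About 1,230,000 results (0.25 seconds)"))
print(parse_total("Results 1 - 10 of about 6,230 for python"))
```

This only covers English pages; other languages would need their own strings, which is presumably what self._re_search_strings is for.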

October 15, 2010, 21:26

Thanks for the great script. As a python newbie I'm just getting my feet wet, but this is a great start. I appreciate you sharing the module!

whelanr Permalink
October 31, 2010, 04:55

Seems google is blocking the search of any server side file searches such as "*.asp" and the likes. No practical application of this that is ethical but just a heads up.

Daniel Permalink
November 02, 2010, 10:42

Hi, I think that Google is blocking the search... every time I run the program google_fight or google_fight2, it answers 0 for all words. Can someone tell me what is happening, and how I can resolve this?


Arv Permalink
November 02, 2010, 19:01

I was trying to use the sponsored links module for the first time today, but I kept getting 404 errors. Regular google search works fine. Does anyone know if google change their system?


November 11, 2010, 01:00


Great lib, really helps a Python n00b like me out. Using this I don't have to re-invent the wheel.

I just need to figure out where to put the urllib2 line to use a proxy.
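In case it helps, a minimal sketch of the standard-library side of that. It only shows how a proxy-aware opener is built; where xgoogle makes its HTTP call is not shown in this thread, so hooking the opener in is left open. The proxy address is a placeholder, and `make_proxy_opener` is a made-up helper name.

```python
try:
    import urllib.request as urlrequest  # Python 3
except ImportError:
    import urllib2 as urlrequest  # Python 2


def make_proxy_opener(proxy_url):
    """Return an opener that sends http and https requests through proxy_url."""
    handler = urlrequest.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urlrequest.build_opener(handler)


# Placeholder address; replace it with a real proxy.
opener = make_proxy_opener("http://127.0.0.1:8080")

# Calling install_opener(opener) would make every later urlopen() call in the
# process go through the proxy, including calls made inside other modules.
# urlrequest.install_opener(opener)
```

Installing the opener globally is the least invasive route, since it needs no change to the library's source.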

Berry van der Linden
PS you seem to have encountered some spammers see three comment above mine.

Petar Permalink
November 19, 2010, 07:21

This piece looks very nice. However, few people asked about language and domain restrictions. I know that domain can be defined in the query itself, like: "%s" % word. However, this is not possible for language, since lr=? is defined elsewhere. That can be handy, don't you think? How can this be made?


Lisbeth Permalink
December 22, 2010, 02:50


I would like to use the script to find all Google results for a search (say 'foo bar').

I am fairly new to Python (<2 days) - can anyone give pointers on where to change to allow allow results to be returned (rather than first page only).

NB: I have been using '' which returns results from first search page.

Thank you.

Sebastian Permalink
January 05, 2011, 04:00

Does the library got broken again? I'm not sure how to debug it but im not getting results and neither any error.

Leke Permalink
January 09, 2011, 10:36

I had this problem the first time I tried to run the library. It turned out I was calling the library from the wrong folder, because I didn't realise the xgoogle folder that had the library was inside another folder called xgoogle.

Leke Permalink
January 09, 2011, 10:39

Just noticed I sometimes get "Timed out" errors like...

Search failed: Failed getting timed out

Is this google throttling me?

Chris Permalink
January 24, 2011, 17:22

Hi Peter, I am new to python, can you tell me how I make this work on my python instance? I had thought it was something like 'python install' however, there is not a with your zip file.

Thanks in advance and sorry for the basic question.


Chris S Permalink
February 17, 2011, 23:35

I was running into issues calculating the number of search results (num_results) returned from Google as they changed their layout and formating. The following is one solution I found to the problem. I have edited the regular expression to accept the new format along with small other tweaks.

def _extract_info(self, soup):
    empty_info = {'from': 0, 'to': 0, 'total': 0}
    div_results = soup.find('div', id='resultStats')  #div handle has changed
    if not div_results:
        self._maybe_raise(ParseError, "Div with number of results was not found on Google search page", soup)
        return empty_info

    txt = ''.join(div_results.findAll(text=True))
    txt = txt.replace(',', '')          #Remove commas
    txt = txt.rstrip(' ')		#Remove line break

    ##new format: About XXXXX results  (x.xx seconds)
    matches = re.search(r'%s (\d+) %s\s+\((\d+\.\d+) %s\)' % self._re_search_strings, txt, re.U)

    if not matches:
        return empty_info
    return {'total': int(matches.group(1)), 'time': float(matches.group(2))}

I have only tested the above code using a few queries but it appears to return the correct results. I hope this will be of help to someone.


February 19, 2011, 11:07

Hi Peter,

Thank you for sharing your code! It's a very useful library :)


sheen Permalink
June 20, 2011, 10:29

num_results always shows the result 1000000 why?
and the result_per_page always give 10 result per page it is not accepting result_per_page=25/50/100 . why?

sheen Permalink
June 20, 2011, 10:30

how can i solve this above problem?????

alessio Permalink
June 22, 2011, 13:11

just see the comment below: guess it's what u're searching (even if not so clean...)

alessio Permalink
June 22, 2011, 12:47

Hello, you did a great job and your code really helped me. For the results number I have a very dirty solution that works with the current google version, and that might be improved by someone with regexp skills.
Just add this function (adapted from the previous _extract_info):

def _extract_total_results_num(self, soup):
        empty_info = {'from': 0, 'to': 0, 'total': 0}
        div_ssb = soup.find('div', id='resultStats')
        if not div_ssb:
            self._maybe_raise(ParseError, "Div with number of results was not found on Google search page", soup)
            return empty_info

        txt = ''.join(div_ssb.findAll(text=True))
        txt = txt.replace(',', '')
        matches = re.search(r'About \d* results', txt, re.U)
        if not matches:
            return ''
        res_num = matches.group(0)[6:-8]
        return res_num

And then modify the get_results function by substituting:

'total': MAX_VALUE}

with:

'total': self._extract_total_results_num(page)}

Just let me know if it worked for someone else...

July 14, 2012, 13:02

Hi , i do all the patches and gets the num_results equals zero , could you send your code to my mail : googcheng at or the wensite where you host it , Hope you help me and wanna calculate the PMI

June 30, 2011, 01:14

after changing all the patches mentioned above, I got the working but I don't think it is stable. Sometimes, it gives me 10 results from the google result page but sometimes, it just give me two reference entries

Jun Hou Permalink
July 20, 2011, 04:52

Agree. For num_results, some search query works and some does not. I try to use NGD to calculate semantics between two words. Can anyone fix this?
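For reference, once the hit counts are available, the NGD formula itself is short. This is a sketch of the usual definition of Normalized Google Distance: f(x) and f(y) are the hit counts for each term, f(x, y) is the count for both together, and N is an estimate of the number of indexed pages. The value of N and the sample counts below are made-up numbers, and returning infinity for zero counts is a choice made here, not part of the definition.

```python
import math


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from hit counts fx, fy, fxy and index size n."""
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # log(0) is undefined; treat terms that never co-occur as infinitely far apart.
        return float("inf")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Made-up counts, only to show the call.
print(ngd(100, 1000, 10, 10 ** 6))
```

Note that a num_results of 0 coming back from a broken parser makes the distance infinite here, so the counts should be checked first.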

nitinhardeniya Permalink
March 29, 2012, 14:34

I am also looking for NGD but i am getting 0 as results only
for any query.

Any other tool or/solution also welcome for NGD


July 14, 2012, 12:58

Hi , so am I, how to solve it and hope you share the way you copy with it!

July 14, 2012, 12:56

Hi, Jun Hou ! Have you solved the problem ? I am in the same problem , can you give some advice , thanks a lot!

July 01, 2011, 09:13

With new version of google xgoogle doesn't work anymore, i add this mod in and now work fine:

def _extract_results(self, soup):
results = soup.findAll('li','g')


def _extract_description(self, result):
desc_div = result.find('span', 'st'))


max5555 Permalink
July 01, 2011, 20:14

Last line has to be like this
desc_div = result.find('span', 'st')

Thank you very much securda!

Xiangyuan Tang Permalink
July 12, 2011, 01:43

Hi Securda! Where should I add this in Thanks very much!

Helpful Person Permalink
July 18, 2011, 18:32

In in the xgoogle folder, look for two methods named _extract_results and _extract_description .
Change the assignment variables for results and desc_div to the new ones given. Old code has been commented out below and new code is present. Thanks securda!!! This works like a charm. :)

First method to change:
def _extract_results(self, soup):
    #results = soup.findAll('li', {'class': 'g'})
    results = soup.findAll('li','g')

Second method to change:
def _extract_description(self, result):
    #desc_div = result.find('div', {'class': re.compile(r'\bs\b')})
    desc_div = result.find('span', 'st')

HamSandwich Permalink
April 18, 2012, 21:55

i checked out master on github recently and this hack is still needed. afterwards it works great. thanks

nitinhardeniya Permalink
March 29, 2012, 14:14

Hey does the xgoogle still works with
num_results it's giving me always 0

Please help even your Google Fight is also giving 0 results

Shadab Alam Permalink
July 14, 2011, 19:37

I have downloaded and tried the examples.
when I run the example1 then it shows following output.

salam@Mac10:~/Desktop/webcoding/xgoogle/xgoogle$ python The Quick and Dirty Guide to Learning Languages Fast ...
: A. G. Hawke: Books.

The script gives only one site and that too not the first site.
Can you please tell me the reason for this behaviour.


Jun Hou Permalink
July 20, 2011, 04:36

I have the same problem!! And I can not get the correct result page numbers with "num_results"
Some1 helps! Thank you!

cocolapin Permalink
July 26, 2011, 15:13

Same problem. Google has probably changed the output of its data to block third-party solutions and to force people to use its front-end.

Google doesn't offer a library for using its search engine, yet it offers plenty of libraries for its other applications, like calendar, maps, and videos. Google does that to collect as much data as it can.

In the end, Google offers no solution; because we don't have the source code, we can't own the application. In Google's view, we are just users who provide more information :S

Sam Permalink
August 27, 2011, 14:08

After some searches Google redirects me to their captcha page, probably because no cookies are set... Anyone has a solution?
If not I will code a small search tool using scoogle, I think that should work without cookies and the html response is much easier to parse ;)

rafeeque Permalink
November 24, 2011, 16:22

The first program, the one searching for "quick and dirty", prints only one result.
Google Sets shows an error.
I'm not getting the results shown in the sample output.
Why is that?

December 19, 2011, 05:43

How do I change the search country. It seems like the the basic search returns New York data. I want to be able to choose which state or country. could someone update this code to handle that.
from import GoogleSearch, SearchError
try:
    gs = GoogleSearch("quick and dirty")
    gs.results_per_page = 50
    results = gs.get_results()
    for res in results:
        print res.title.encode("utf8")
        print res.desc.encode("utf8")
        print res.url.encode("utf8")
except SearchError, e:
    print "Search failed: %s" % e

January 22, 2012, 11:25

arggg, the last post is from 2009, is anybody still here?
The lib doesn't work now; Google has certainly changed the format of its pages 3 to 5 times.
I tried to run the and changed these lines:
query = "xxx"
random_agent = True
debug = True
lang = "en"
tld = "com"
from import GoogleSearch, SearchError
gs = GoogleSearch(query, random_agent, debug, lang, tld)
results = gs.get_results()
print results

and get empty result ?
>>> ================================ RESTART ================================

is any body have an updated version ?

January 22, 2012, 11:29

lol sorry the last post is from : December 19, 2011, 05:43

January 22, 2012, 11:34

lol sorry once again: it doesn't work in the IDLE Python app, but it works from the command line
:-( shame on me

Mathew Permalink
April 12, 2012, 15:12

Could you please anyone tell me, how can I make query using xgoogle with Arabic?

Vish Permalink
April 19, 2012, 22:51

>>> gs = GoogleSearch("Newyork").num_results
>>> gs

the property num_results does not return number of search results found.It simply returns 0

David Permalink
July 16, 2012, 05:54

Hey,pal did u solve it? if yes, please share the code

fatemeh Permalink
May 04, 2012, 20:40

hi, thanks of this library
i wrote this example:

from import GoogleSearch, SearchError
try:
    gs = GoogleSearch("quick and dirty")
    gs.results_per_page = 50
    results = gs.get_results()
    for res in results:
        print res.title.encode("utf8")
        print res.desc.encode("utf8")
        print res.url.encode("utf8")
except SearchError, e:
    print "Search failed: %s" % e

but it returned me just one result.i tried it to search for other words but it returned one result. why?
thanks alot

islamEltally Permalink
May 15, 2012, 09:13

I want to use xgoogle library for translation in my python app bet i have some problems in install and run
please send to me an simple example and installation steps
thanks in advance

illa Permalink
July 18, 2012, 01:47

same here can't get past 1 result..

fatemeh Permalink
May 04, 2012, 21:06

sorry i did not want to write comment 2 times.sorry

October 09, 2012, 08:08

Hi Peter,

I am reading line 252-255:

url = title_a['href']
match = re.match(r'/url\?q=(http[^&]+)&', url)
if match:
    url = urllib.unquote(match.group(1))

So url matches somthing like "/url?q=http://domain.tld/&name=joe" but not, say, "".

Can you please show me some use cases in the search results page?
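A small standalone sketch of what those lines do, using the regular expression quoted above. The sample hrefs are invented for illustration (they are not captured from a real results page), and `unwrap_result_url` is a made-up helper name.

```python
import re

try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2


def unwrap_result_url(href):
    """Pull the real target out of a '/url?q=...&...' redirect link, if it is one."""
    match = re.match(r'/url\?q=(http[^&]+)&', href)
    if match:
        # The captured group is percent-encoded, so decode it.
        return unquote(match.group(1))
    # Anything that is not a redirect link is returned unchanged.
    return href


for href in ("/url?q=http://domain.tld/&name=joe",
             "/url?q=http%3A%2F%2Fexample.com%2Fa%20b&sa=U",
             "http://example.com/"):
    print(unwrap_result_url(href))
```

One thing the sketch makes visible: the pattern requires a trailing "&", so a redirect link with no further parameters after q= would not match and would be passed through as-is.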


Ayman Permalink
October 15, 2012, 06:41

I hope that someone fixes the xgoogle result problem; I still get 0 results from Google search. Could anybody help me calculate the Google distance similarity,
or fix the problem of getting 0 results?

patfla Permalink
November 01, 2012, 03:29

It seems that _socket.pyd doesn't load in 2.7.3

>>> C:\projects\python>python
Python 2.7.3 (default, Jun 11 2012, 17:36:33) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from import GoogleSearch, SearchError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "xgoogle\", line 13, in <module>
    import urllib
  File "C:\Python27\Lib\", line 26, in <module>
    import socket
  File "C:\Python27\Lib\", line 47, in <module>
    import _socket
ImportError: DLL load failed: The specified module could not be found.

And I say that for the following reason:

It seems that Spiros fixed the problem however I've never been in the position of needing a nightly build for python and when I look for one now, I don't find anything.

How does one apply (or even find) patches like that indicated in issue 16119?

Although even if I figure out how to fix this (maybe just back my installation up to 2.7.2) my guess is that from a) a previous Django exercise (that I'm trying to double-check with xgoogle and b) the most recent posts above that google is now able to detect programmatic access and doesn't return any results in order to discourage (prevent) programmatic access.

kishwar Permalink
November 07, 2012, 07:14

After applying all the patches, i am getting only 10 results instead of 50.This is very important for me.Please help me.

January 22, 2013, 18:01

Hello, first thank for this share, i need this working to check position of my domain for specific KW. I tried this:

import re
from urlparse import urlparse
from import GoogleSearch, SearchError

target_domain = ""
target_keyword = "python videos"

def mk_nice_domain(domain):
    """convert domain into a nicer one (eg. into"""
    domain = re.sub(r"^www(\d+)?\.", "", domain)
    # add more here
    return domain

gs = GoogleSearch(target_keyword)
gs.results_per_page = 100
results = gs.get_results()
for idx, res in enumerate(results):
    parsed = urlparse(res.url)
    domain = mk_nice_domain(parsed.netloc)
    if domain == target_domain:
        print "Ranking position %d for keyword '%s' on domain %s" % (idx+1, target_keyword, target_domain)


And its working fine, but when i change that KW's its not working i get in my terminal something like this:

tripz0r-iMac-2:~ tripz0r$ python /Users/tripz0r/Desktop/urldigger-02c/
Ranking position 5 for keyword 'python videos' on domain
tripz0r-iMac-2:~ tripz0r$ python /Users/tripz0r/Desktop/urldigger-02c/
tripz0r-iMac-2:~ tripz0r$

So no any result when i change KW and URL in:

target_domain = ""
target_keyword = "my kw"

Any suggestion, how to fix it?

And also one question i have for you, could this software search for some KW for 100 pages in Google? Or just in first 10 or?

Thanks so much with sharing this.

David Sadler Permalink
May 03, 2013, 09:23

Thanks for coding this - it's a very useful library!

Forgive my ignorance but could you help me understand the relationship between results_per_page and get_results

for example if my google query returns 1000 results (in a browser - verified actually 1000)

and the gs.results_per_page = 1000 only returns about 70 results.

If I set the gs.results_per_page = 10 - How to I call the next page?

Or is the intended functionality to return all results with the gs.results_per_page?

Thanks for any guidance on this.
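As far as the post describes it, get_results() returns one page per call and an empty list once there is nothing left, so "the next page" is just the next call. A rough sketch of that loop follows; `FakeSearch` is a stand-in written for this example (it is not part of xgoogle) so the loop can be run without touching Google.

```python
class FakeSearch(object):
    """Stand-in that mimics the described get_results() behaviour; NOT xgoogle."""

    def __init__(self, total, results_per_page=10):
        self.results_per_page = results_per_page
        self._items = ["result %d" % i for i in range(total)]
        self._page = 0

    def get_results(self):
        # Return the next slice; an empty list signals the end.
        start = self._page * self.results_per_page
        self._page += 1
        return self._items[start:start + self.results_per_page]


def iter_all_results(search):
    """Keep asking for pages until an empty one comes back."""
    while True:
        page = search.get_results()
        if not page:
            return
        for item in page:
            yield item


print(len(list(iter_all_results(FakeSearch(25, results_per_page=10)))))
```

With a real search object, a pause between calls is probably wise, given the throttling reports elsewhere in this thread.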

Xiaozhu Meng Permalink
May 15, 2013, 03:57

This looks like what I am looking for.

Can I use this library to do advanced google search like only returning results from certain site or domain?
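One approach that might work is to put Google's site: operator into the query string itself. The sketch below only builds the string; it assumes the library passes the query through to Google unchanged, which is not confirmed in this thread. `restrict_to_site` is a hypothetical helper and example.com is a placeholder domain.

```python
def restrict_to_site(keywords, domain):
    """Prefix a query with Google's site: operator (hypothetical helper)."""
    return "site:%s %s" % (domain, keywords)


query = restrict_to_site("python videos", "example.com")
print(query)
# The resulting string would then be passed as the first argument to GoogleSearch.
```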


Sorin Permalink
July 10, 2013, 15:03

It seems that the project is currently located at -- this being much better place for filing bugs or adding patches.

Lucia Maria Permalink
July 19, 2013, 08:39

Hi there!

I'm doing a google search for 100 results per page, however, it only returns 3 or 4 results. Could you guys help me?

Purplex Permalink
February 28, 2014, 04:54

Hello Lucia!
Even I'm getting only 3 results. I can't figure out how to make it work. Please let me know if you've solved it.

suki Permalink
August 04, 2013, 04:10

I went download the codes. So first I chmod 755 the and got error line 24 from import GoogleSearch, SearchError, ParseError. So then I went into the folder xgoogle and also chmod 755 but still the same. What else can I do ?

August 27, 2013, 07:03

I'm trying to use this lib to search for PDFs on the internet. The problem I'm having is that, e.g., if I search for "Medicine:pdf", the first page it returns is not the first page Google returns when I actually use Google. I don't know what's wrong.

Cass Permalink
January 28, 2014, 16:35

I am seeing the same thing just using the first example above (quick and dirty). If I type it into Google I get one result but when I run the script I get different results. It there a way to get the same result as a Google search returns?

hcast Permalink
September 12, 2013, 16:20

This would be a great lib as google's API is horrible. I've been trying to use the search and i continually get no results using a copy of the quick and dirty code. Has anyone else seen this?

Bernardo Permalink
December 03, 2013, 16:32

Does this still functions? cause it only gives me 3-4 results, and ive already tried the solution suggested by HelpfulPerson with no luck

Petar Permalink
January 24, 2014, 17:40

Hi, I tried xgoogle with a list of words to get the number of hits (that is only thing I really need; number of hits meight be used as a proxy for word frequency, as proposed by Grefenstette and Nioche (2000)). I am experiencing so many problems, mainly because my word lists contain about 250 words. Google returns nothing -- xgoogle gives me a list with all zero-values... Any idea? I have a code for Bing frequencies, but those are weird. Especially if I want to get frequency estimates for "small languages", such as Croatian or Slovakian etc.

Please, help me! Desperately stuck scientist... :-(

Purplex Permalink
February 28, 2014, 04:49

Hello Peter!
Thanks for this great and useful script.

I have got one doubt here, whenever I use
>>> while True:
...     tmp = gs.get_results()
...     if not tmp:
...         break
...     res.extend(tmp)
...     time.sleep(5)

It returns a 'failed getting


timed out'. I can't figure out what is causing that error. It would be great if you can help me on that.
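If the timeouts are intermittent, one thing to try is retrying a failed page with a longer pause instead of giving up. Below is a sketch of the same loop with retries added. `fetch_with_retries` and its parameters are made up for this example; `get_page` stands for something like gs.get_results, and the exception types to retry on (for xgoogle, probably SearchError) are an assumption you would need to adjust.

```python
import time


def fetch_with_retries(get_page, max_retries=3, base_delay=5.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Collect pages from get_page() until it returns an empty list.

    A failed call is retried up to max_retries times in a row, doubling the
    wait each time; a successful call resets the failure counter.
    """
    results = []
    failures = 0
    while True:
        try:
            page = get_page()
        except retry_on:
            failures += 1
            if failures > max_retries:
                raise
            sleep(base_delay * 2 ** failures)  # back off: 10s, 20s, 40s, ...
            continue
        failures = 0
        if not page:
            break
        results.extend(page)
        sleep(base_delay)  # ordinary pause between successful pages
    return results
```

Usage would be along the lines of res = fetch_with_retries(gs.get_results). Whether a failed call leaves the search object in a state where calling it again fetches the same page is not something this thread confirms, so results should be checked for gaps or duplicates.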

someguy Permalink
March 29, 2014, 18:48

what about duckduckgo?

Monika Permalink
August 13, 2014, 08:05

I download the xgoogle library zip folder and now i m trying to use this code for accessing google but this will give invalid syntax error

from import GoogleSearch, SearchError
gs = GoogleSearch("quick and dirty")
gs.results_per_page = 50
results = gs.get_results()
for res in results:
print res.title.encode("utf8")
print res.desc.encode("utf8")
print res.url.encode("utf8")
except SearchError, e:
print "Search failed: %s" % e

Monika Permalink
August 13, 2014, 08:06

Plz tell me how to use this library to access google

venkatesh Permalink
September 07, 2014, 19:31

Hi all, please help me resolve this issue.
I got the following error when trying to run the file.

Traceback (most recent call last):
File "C:\Users\Satman\Desktop\STUDY\PYTHON\XGoogle\examples\", line 7, in


from import (GoogleSearch, SearchError)
ImportError: cannot import name GoogleSearch

venkatesh Permalink
September 07, 2014, 20:08

Hi thanks, got the error cleared myself after some googling.
I placed the xgoogle folder in this path C:\Python27\Lib\site-packages and got the error cleared
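An alternative to copying the folder into site-packages is to add the directory that contains the package folder to the import path at the top of the script. A minimal sketch follows; the directory name is a placeholder, and `add_to_import_path` is a made-up helper.

```python
import os
import sys


def add_to_import_path(parent_dir):
    """Make packages located inside parent_dir importable (hypothetical helper)."""
    parent_dir = os.path.abspath(parent_dir)
    if parent_dir not in sys.path:
        # Insert at the front so this copy wins over any other installed copy.
        sys.path.insert(0, parent_dir)
    return parent_dir


# Placeholder: the folder that CONTAINS the xgoogle package folder,
# not the package folder itself.
add_to_import_path("downloads/xgoogle")
```

This matters because the zip reportedly unpacks to an xgoogle folder inside another xgoogle folder, so pointing at the wrong level is an easy mistake.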

Ankur Permalink
September 18, 2014, 09:09

I am using xgoogle API. Great API I must say, but I am getting 1 problem that even when I set
results_per_page = 50
I only get 5 results and also these results are not in sync with Google Search.

Can you please help me out with possible solution.

Thanks in advance !

Hayssam Traboulsi Permalink
October 03, 2014, 07:36

Thanks for sharing this library!
Actually, I would like to ask how to extract the body of a certain result in addition to its title, description, url

Valerio Permalink
December 02, 2014, 20:13

Hi can you please let me know if is possible to modify the code to set the different googles as a parameter and returning ranking from them? e.g. returns different results from, etc Also I get very different results if I run a search on from a web browser, why?

Developer Permalink
February 28, 2015, 17:02

I tried running your ranking script but then My cursor changes and i get this error

from: can't read /var/mail/urlparse
from: can't read /var/mail/
./ line 5: target_domain: command not found
./ line 6: target_keyword: command not found
./ line 8: syntax error near unexpected token `('
./ line 8: `def mk_nice_domain(domain):'

What could be wrong I am confused please
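Those messages look like what a shell prints when it tries to execute a Python file as a shell script: "from" gets run as a shell command (hence the /var/mail lines), and "import" is usually a screen-capture tool, which would explain the cursor change. Assuming that is what happened here, either run the file with the interpreter (python yourscript.py, where yourscript.py is a placeholder name) or give it a shebang line as in this minimal skeleton:

```python
#!/usr/bin/env python
"""Skeleton showing the shebang line; the body is only a placeholder."""
import sys


def main():
    # If this runs, the file is being executed by Python, not by the shell.
    return "running under Python %d.%d" % sys.version_info[:2]


if __name__ == "__main__":
    print(main())
```

The shebang must be the very first line of the file, and the file needs execute permission for ./yourscript.py to work.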

Hafiz Shafiq Permalink
December 09, 2015, 08:47

If I have to search from some other tabs of google e.g. videos. How it will works ?

djien kwee Permalink
February 03, 2016, 19:50

My environment: Ubuntu 12.04, python 2.7.3, python-bs4 is installed. Then run xgoogle/examples/
The results is an empty list.

Please inform me the reason

Sandro Pamrihno Permalink
July 25, 2017, 17:42


I checked your github and the last commits are some years ago.
Is this project still under active development and fully working ?

Thank you for your reply.


Praveen Permalink
November 09, 2017, 05:21

Did anyone find a updated solution for this. Thanks

Denis Permalink
April 19, 2018, 08:31

Interesting script, one question: is it possible to implement this library for advanced google search according a specifis domain?

Best Denis
