Asynchronous DNS

Once upon a time, I had to quickly resolve thousands of DNS names. My first solution was to call gethostbyname repeatedly for each of the hosts. This turned out to be extremely slow. I could only do 200 hosts in a minute. I talked with someone and he suggested to try to do it asynchronously. I looked around and found adns - asynchronous dns library. Since I was writing the code in Python, I looked around some more and found Python bindings for adns. I tried adns and - wow - I could do 20000 hosts in a minute!

In this post I want to share the slow code and the fast asynchronous code. The slow code is only useful if you need to resolve just several domains. The asynchronous code is much more useful. I made it as a Python module so that you can reuse it. It's called "async_dns.py" and an example of how to use it is included at the bottom of the post.

Here is the slow code that uses gethostbyname. The only reusable part of this code is "resolve_slow" function that takes a list of hosts to resolve, resolves them, and returns a dictionary containing { host: ip } pairs.

To measure how fast it is I made it resolve hosts "www.domain0.com", "www.domain1.com", ..., "www.domain999.com" and print out how long the whole process took.

#!/usr/bin/python

import socket
from time import time

def resolve_slow(hosts):
    """
    Given a list of hosts, resolves them and returns a dictionary
    containing {'host': 'ip'}.
    If resolution for a host failed, 'ip' is None.
    """
    resolved_hosts = {}
    for host in hosts:
        try:
            host_info = socket.gethostbyname(host)
            resolved_hosts[host] = host_info
        except socket.gaierror, err:
            resolved_hosts[host] = None
    return resolved_hosts

if __name__ == "__main__":
    host_format = "www.domain%d.com"
    number_of_hosts = 1000

    hosts = [host_format % i for i in range(number_of_hosts)]

    start = time()
    resolved_hosts = resolve_slow(hosts)
    end = time()

    print "It took %.2f seconds to resolve %d hosts." % (end-start, number_of_hosts)

And here is the fast code that uses adns. I created a class "AsyncResolver" that can be reused if you import it from this code. Just like "resolve_slow" from the previous code example, it takes a list of hosts to resolve and returns a dictionary of { host: ip } pairs.

If you run this code, it will print out how long it took to resolve 20000 hosts.

#!/usr/bin/python
#

import adns
from time import time

class AsyncResolver(object):
    def __init__(self, hosts, intensity=100):
        """
        hosts: a list of hosts to resolve
        intensity: how many hosts to resolve at once
        """
        self.hosts = hosts
        self.intensity = intensity
        self.adns = adns.init()

    def resolve(self):
        """ Resolves hosts and returns a dictionary of { 'host': 'ip' }. """
        resolved_hosts = {}
        active_queries = {}
        host_queue = self.hosts[:]

        def collect_results():
            for query in self.adns.completed():
                answer = query.check()
                host = active_queries[query]
                del active_queries[query]
                if answer[0] == 0:
                    ip = answer[3][0]
                    resolved_hosts[host] = ip
                elif answer[0] == 101: # CNAME
                    query = self.adns.submit(answer[1], adns.rr.A)
                    active_queries[query] = host
                else:
                    resolved_hosts[host] = None

        def finished_resolving():
            return len(resolved_hosts) == len(self.hosts)

        while not finished_resolving():
            while host_queue and len(active_queries) < self.intensity:
                host = host_queue.pop()
                query = self.adns.submit(host, adns.rr.A)
                active_queries[query] = host
            collect_results()

        return resolved_hosts

if __name__ == "__main__":
    host_format = "www.host%d.com"
    number_of_hosts = 20000

    hosts = [host_format % i for i in range(number_of_hosts)]

    ar = AsyncResolver(hosts, intensity=500)
    start = time()
    resolved_hosts = ar.resolve()
    end = time()

    print "It took %.2f seconds to resolve %d hosts." % (end-start, number_of_hosts)

I wrote it in a manner that makes it reusable in other programs. Here is an example of how to reuse this code:

from async_dns import AsyncResolver

ar = AsyncResolver(["www.google.com", "www.reddit.com", "www.nonexistz.net"])
resolved = ar.resolve()

for host, ip in resolved.items():
  if ip is None:
    print "%s could not be resolved." % host
  else:
    print "%s resolved to %s" % (host, ip)

Output:

www.nonexistz.net could not be resolved.
www.reddit.com resolved to 159.148.86.207
www.google.com resolved to 74.125.39.99

Download "async_dns.py":

Download: async_dns.py
Downloaded: 4115 times.
Download url: http://www.catonmat.net/download/async_dns.py

I hope someone finds this useful!

Comments

f00li5h Permalink
September 17, 2009, 23:47

leaving paw prints all over your latest article

Eric TF Bat Permalink
September 18, 2009, 01:19

Got this at the top of your blog:

"You came here from www.google.com.au searching for . You might be also interest in these posts:

* No related posts"

Might want to debug that. I use Google Reader in my iGoogle page, so you can probably assume that anything coming from *.google.* with a blank search query is actually a link via and RSS reader. Not sure if you can fix that yourself or notify the maintainer of your plugin, but there you go.

argv Permalink
September 18, 2009, 02:47

Returning to catonmat to reference that excellent "command line history" article. But I'll leave another silly comment while I'm here... hope you'll look more into DNS and maybe write some more articles on it.

re: async lookups

My command line browser has had a built-in async dns resolver for the last 10 yrs. And it has a switch so it can be used as a stand alone lookup tool, e.g. in a shell script.

If someone besides me did a 'speed test' of this browser against any high level language module (like lwp and all its progeny) - whether for doing dns lookups alone or completing entire http transactions - and they asked me to bet the likely winner... even after 10 yrs of "development", I would still put my money on the C code of my browser over any perl or python module as well as any other later developed scripting language challenger. This is one reason I why I never bother with the high level scripting languages. (Another reason is I'm too lazy to learn them.) Anyway, my vote is for the many async dns libs available in C.

With DNS lookups, shouldn't we also be mindful of what the servers can handle? What is the usual and reasonable QPS load for the server you are querying? I think this is may be more important than how fast we can send out bulk queries and receieve the reponses.

September 20, 2009, 08:26

Hi argv. Thanks for your excellent comments again. I absolutely love them. Keep them coming!

You are absolutely right that your code would beat all the high level language modules. In this article though, I am using a Python interface to the C module. I guess this code would also be slower than yours because it's more generic and is written to handle most of DNS protocol. As I understood your code handles only resolution of A records. There is less code to execute and it would beat other implementations.

If I developed a high speed DNS resolver on my own, I'd make it use tens of DNS servers to minimize load on single server and maximize throughput.

And if I was totally into it, I'd find the optimal number of queries per second to send to each server. The code would start at something like 100 queries per second, and change that rate, and see how the responses vary. If increasing queries per sec also increases responses per sec, I'd increase queries some more. But if I'd get less responses per second, I'd decrease them. This way I'd find the optimal load for each of the servers.

It seems my ISPs DNS server was able to handle my 20000 requests per minute.

Btw, any suggestions on what I could write about DNS next?

September 20, 2009, 08:44

Eric TF Bat, it's fixed now! Thanks for reporting this!

October 06, 2009, 15:13

You could have used gethostbyname2_r() in C. The default resolver functions aren't re-entrant because they use thread-specific data in libc. It's not standard, but it's included in netdb.h on both Linux and BSD. I've successfully used the standard gethostbyname() function in concurrent programming by isolating the critical section with a POSIX Threads mutex. There's a download link to the source code of my nameserver security scanner which demonstrates this here:

http://www.security-database.com/toolswatch/PorkBind-updated-to-1-3.html

October 06, 2009, 15:23

By the way, they're optional but you download the entire domain's IN A records with a zone transfer.

http://en.wikipedia.org/wiki/DNS_zone_transfer

The zone transfer query type is AXFR. There's also the ZXFR query type which Wikipedia seems to have left out. ZXFR adds compression. dig supports both, IIRC.

October 24, 2009, 14:29

Zone transfers are forbidden by any well setup system. There's simply too much security risk in place, and this has been the case for a decade or more. Therefore, you can generally only look up individual records. Zone transfers are restricted only to only between authoritative name servers.

December 01, 2009, 03:59

I had to do something similar, but due to my addiction to Perl I used Net::DNS module to achieve parallelism.

http://cpansearch.perl.org/src/OLAF/Net-DNS-0.65/demo/mresolv

There is one more sophisticated way, that I know of, for doing bulk DNS lookups as opposed to the round robin approach, which assuming you have increasing volumes of queries, will eventually overwhelm, or at least generate complaints, from the weakest DNS server in the chain. If you are only looking up hundreds of thousands and not hundreds of millions of records, the round robin approach will probably be ok.

Another interesting DNS topic would be a discussion of recursive DNS resolutions. Taking the IP address and resolving the in-addr.arpa record.

It would be nice to be able to edit posts. In my previous post, I meant to say "Reverese - not recursive":

Another interesting DNS topic would be a discussion of reverse DNS resolutions. Taking the IP address and resolving the domain name.

One productive use of reverse DNS mappings is a simple dual authentication method to help thwart spammers. Check if hostname to IP address, and IP address to hostname match. Hostname in this case being the MX record of the sender.

I wanted to test your adns solution, but could use a touch of advice here. When your code does 'import adns' I get the "module not found" error. I downloaded git://github.com/lucab/adns.git as well. Just need to figure out how to get all the pieces talking to each other. Any suggestions? Thanks.

December 28, 2009, 18:30

Abottleinfrontofme Frontallobotomy, Thanks for all the comments first of all!

Here are my suggestions:

It may be pretty tricky to set it up. First make sure you have compiled adns C library [1] (compiled as adns.so or adns.a) in your system library path (or local library path, exported via LD_LIBRARY_PATH).

Next, get the adns Python library wrapper around adns C library at [2] and build it via command [3].

Next, you can install the library wrapper via [4] and then finally try to run 'import adns'.

Another way is to do it all under virtualenv [5]. This way library installations won't pollute the system.

[1] http://www.chiark.greenend.org.uk/~ian/adns/
[2] http://code.google.com/p/adns-python/downloads/list
[3] python setup.py build
[4] python setup.py install
[5] http://pypi.python.org/pypi/virtualenv

prabin Permalink
October 26, 2010, 14:47

I try to use adns module and install according to the instruction
but while I try “import adns” from the python then error occurs as
import adns
Traceback (most recent call last):
ImportError: libadns.so.1: cannot open shared object file: No such
file or directory

Please help me on this

hansly Permalink
March 24, 2014, 09:25

Conclusions big pass4sure MB5-705 companies uses their reputation to exploit potential employees, they don't care much if pass4sure MB4-874 a bright mind would not be recruited as pass4sure M70-201 they care only not to recruit a bad one, as mentioned prior in comments.

argv Permalink
January 19, 2010, 09:20

well done peteris.

so we can optimise the dns query in terms of: 1. the user's code and 2. the server's response time.

but i think the real slowdown with dns is that not all servers are configured the same. as a result we often have to do more lookups (more recursion) than is really necessary.

this is i think what slows down so many programs that rely on DNS lookups.

i played around with trying to write my own simplistic recursion script using only standard UNIX tools and a simple 'host'-type lookup tool. e.g., the adns libs come with one called adnshost.

what i found is that ideally it should never take more than 3 queries to work from root to tld to host ip (A record). but people use gimmicks like CNAME or they configure their dns records in a number of different ways that force us to make extra queries, e.g. i counted 7 in one instance (i've read about people counting many more)- all this just to find a single ip.

it is very inefficient.

so here's an idea for you. this may get some of your readers agitated because in reaches back to the future.

compare the speed of async lookups over the internet versus the speed of having the ip's already in your hosts file, mounted in ram via a symlink. no disk reads.

in other words, you prefetch all the ip's using bulk queries (sounds like the chrome approach, eh?) and drop them in /etc/hosts to make them available to your DNS-requiring programs. the question is: is this faster than the 'real-time' way (DNS lookups, piecemeal, on the fly)?
google chrome takes this prefetch approach.

for something a bit more 'sophisticated' than a hosts file approach (i.e. something that could potentially scale up), here's a similar setup i'm experimenting with at the moment:

we run 'pdns_recursor' on a local ip, non-standard port, not port 53. then we point our 'host' lookup tool at that port, for our bulk queries- i.e., the 'prefetch' stage. pdns_r send out the queries and caches the reponses in memory. we dump the cache from pdns to a very simple, single zone file (our own simple 'root zone'), sort it and prune it. this zone file is then compiled into a binary .db format.
on a local ip, port 53, we run 'nsd', a fast authoritative only nameserver, serving up records that we prefetched, to our DNS-requiring programs. it simply reads the .db file. that's all it does. no slaves or zone xfers (although it can do those things). in our use, it serves only the local user programs that require DNS.

in other words we separate the lookups over the internet (the slow step) from the lookups done by our DNS-requiring programs. all lookups sent to port 53 by our DNS-requiring programs are local and never recursive. at that stage we have 'eliminated' the variablity introduced into DNS by varying nameserver configurations. local lookups. same effect as making use of a large hosts file.

as stated above, the bulk querying over the internet uses pdns_recursor. a few hundred lines of c++. it works pretty good.

i gave up trying to write my own recursion script. i concluded the reason it's difficult is server 'misconfigurations' (unnecessary complexity, imo). in other words, a very simple script will handle the majority of nameservers, which are configured in a straightforward manner and don't use forwarding gimmicks like CNAME.

elhoim Permalink
March 01, 2010, 12:40

Concerning the status code contained in the object answer being returned by query.check(), where does it come from?
I checked the adns header file and python-adns module but i cannot find it.

argv Permalink
March 23, 2010, 07:37

Someone at Google actually wrote some python script to find the DNS servers with the lowest response times. It's called namebench and is available on Google Code.

PythonOne Permalink
June 15, 2010, 19:09

Does this module support multi-threading?

September 06, 2010, 22:16

This code will forever loop on an incorrect circular CNAME setup.
A --> B --> C --> A or A --> B --> A
Since it resubmits the found record back in the query list.

Thiru Permalink
May 21, 2012, 12:05

What does the line if __name__ == "__main__": and hosts = [host_format % i for i in range(number_of_hosts)]
do

May 22, 2012, 22:52

Line __name__ == "__main__" means "if the program is running from command line".

Line hosts = [host_format % i for i in range(number_of_hosts)] means "create a list of hosts, where each host is a formatted string "www.host%d.com", where %d ranges from 0 to number_of_hosts-1".

Vijayendra Permalink
April 09, 2013, 17:55

Very nice article. I want to know what is the basic difference between synchronous and Asynchronous DNS.
How would you put in simple terms.

Thanks

Leave a new comment

(why do I need your e-mail?)

(Your twitter name, if you have one. (I'm @pkrumins, btw.))

Type the first letter of your name: (just to make sure you're a human)

Please preview the comment before submitting to make sure it's OK.

Advertisements