How can I do DNS lookups in Python, including referring to /etc/hosts?

Each Answer to this Q is separated by one/two green lines.

dnspython will do my DNS lookups very nicely, but it entirely ignores the contents of /etc/hosts.

Is there a python library call which will do the right thing? ie check first in etc/hosts, and only fall back to DNS lookups otherwise?

I’m not really sure if you want to do DNS lookups yourself or if you just want a host’s ip. In case you want the latter,

/!\ socket.gethostbyname is deprecated, prefer socket.getaddrinfo

from man gethostbyname:

The gethostbyname*(), gethostbyaddr*(), […] functions are obsolete. Applications should use getaddrinfo(3), getnameinfo(3),

import socket
print(socket.gethostbyname('localhost')) # result from hosts file
print(socket.gethostbyname('google.com')) # your os sends out a dns query

The normal name resolution in Python works fine. Why do you need DNSpython for that. Just use socket‘s getaddrinfo which follows the rules configured for your operating system (on Debian, it follows /etc/nsswitch.conf:

>>> print(socket.getaddrinfo('google.com', 80))
[(10, 1, 6, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 1, 6, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 1, 6, '', ('2a00:1450:8006::93', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::93', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::93', 80, 0, 0)), (2, 1, 6, '', ('209.85.229.104', 80)), (2, 2, 17, '', ('209.85.229.104', 80)), (2, 3, 0, '', ('209.85.229.104', 80)), (2, 1, 6, '', ('209.85.229.99', 80)), (2, 2, 17, '', ('209.85.229.99', 80)), (2, 3, 0, '', ('209.85.229.99', 80)), (2, 1, 6, '', ('209.85.229.147', 80)), (2, 2, 17, '', ('209.85.229.147', 80)), (2, 3, 0, '', ('209.85.229.147', 80))]

It sounds like you don’t want to resolve DNS yourself. dnspython is a standalone DNS client that will understandably ignore your operating system because it’s bypassing the operating system’s utilities.

We can look at a shell utility named getent to understand how the (Debian 11-like) operating system resolves DNS for programs. This is likely the standard for all *nix like systems that use a socket implementation.

See man getent‘s “hosts” section, which mentions the use of getaddrinfo, which we can see as man getaddrinfo.

To use it in Python, we have to extract some info from the data structures:

import socket

def get_ipv4_by_hostname(hostname):
    # see `man getent` `/ hosts `
    # see `man getaddrinfo`

    return list(
        i        # raw socket structure
            [4]  # internet protocol info
            [0]  # address
        for i in 
        socket.getaddrinfo(
            hostname,
            0  # port, required
        )
        if i[0] is socket.AddressFamily.AF_INET  # ipv4

        # ignore duplicate addresses with other socket types
        and i[1] is socket.SocketKind.SOCK_RAW  
    )

print(get_ipv4_by_hostname('localhost'))
print(get_ipv4_by_hostname('google.com'))

list( map( lambda x: x[4][0], socket.getaddrinfo( \
     'www.example.com.',22,type=socket.SOCK_STREAM)))

gives you a list of the addresses for www.example.com. (ipv4 and ipv6)

This code works well for returning all of the IP addresses that might belong to a particular URI. Since many systems are now in a hosted environment (AWS/Akamai/etc.), systems may return several IP addresses. The lambda was “borrowed” from @Peter Silva.

def get_ips_by_dns_lookup(target, port=None):
    '''
        this function takes the passed target and optional port and does a dns
        lookup. it returns the ips that it finds to the caller.

        :param target:  the URI that you'd like to get the ip address(es) for
        :type target:   string
        :param port:    which port do you want to do the lookup against?
        :type port:     integer
        :returns ips:   all of the discovered ips for the target
        :rtype ips:     list of strings

    '''
    import socket

    if not port:
        port = 443

    return list(map(lambda x: x[4][0], socket.getaddrinfo('{}.'.format(target),port,type=socket.SOCK_STREAM)))

ips = get_ips_by_dns_lookup(target="google.com")

I found this way to expand a DNS RR hostname that expands into a list of IPs, into the list of member hostnames:

#!/usr/bin/python

def expand_dnsname(dnsname):
    from socket import getaddrinfo
    from dns import reversename, resolver
    namelist = [ ]
    # expand hostname into dict of ip addresses
    iplist = dict()
    for answer in getaddrinfo(dnsname, 80):
        ipa = str(answer[4][0])
        iplist[ipa] = 0
    # run through the list of IP addresses to get hostnames
    for ipaddr in sorted(iplist):
        rev_name = reversename.from_address(ipaddr)
        # run through all the hostnames returned, ignoring the dnsname
        for answer in resolver.query(rev_name, "PTR"):
            name = str(answer)
            if name != dnsname:
                # add it to the list of answers
                namelist.append(name)
                break
    # if no other choice, return the dnsname
    if len(namelist) == 0:
        namelist.append(dnsname)
    # return the sorted namelist
    namelist = sorted(namelist)
    return namelist

namelist = expand_dnsname('google.com.')
for name in namelist:
    print name

Which, when I run it, lists a few 1e100.net hostnames:


The answers/resolutions are collected from stackoverflow, are licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0 .

Leave a Reply

Your email address will not be published.