Userlevel IPv6 Programming Introduction

Ulrich Drepper
2008-10-30

Despite the 10+ years of development many people still don't know how to program for IPv6 or, better yet, for protocol independence. This can be seen by the high number of programs which are still not IPv6-ready. I'll explain here a few things related to the userlevel APIs people should be using. The text should be helpful for people who write new code as well as for those converting IPv4-only code.

In the following text I'll restrict the description to storing, resolving, and printing and scanning IP addresses. At some later time I might add more details of advanced features.

I will exclusively cover C interfaces here. I assume that C++ programs use the same interfaces because up to this day I still haven't found a C++ net class library which doesn't make me angry or sick.

Storing Addresses

Programs in an IPv4-only world stored IP addresses often in ints. If the programmer was a bit more clueful she used struct in_addr. Regardless, the type must be (at least) 32 bits wide. That is not enough for IPv6 addresses which are 128 bits (16 bytes) wide.

Instead of going ahead and hardcoding the 16 byte requirement programs should instead use the struct sockaddr_storage type defined in <sys/socket.h>. This type should be large enough to hold all types of addresses.

This advice should be qualified a bit, though. The type size is really big (128 bytes). If lots of addresses have to be stored one might want to use a representation which is just large enough. Ideally this means no fixed size arrays but instead (length, byte sequence) pairs. All the interfaces providing addresses either report how large an address is or they produce only addresses of a specific form. This means the program can always know the size of the address.
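
A minimal sketch of such a (length, byte sequence) pair follows. The names struct stored_addr, stored_addr_new, and stored_addr_get are made up for this illustration and are not part of any standard API; the sketch assumes the caller already knows the address length, for instance from the ai_addrlen field or from accept().

```c
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical container: a (length, byte sequence) pair which is
   allocated just large enough for one socket address.  */
struct stored_addr
{
  socklen_t len;
  unsigned char bytes[];   /* flexible array member holding the address */
};

/* Copy a socket address of any family into a right-sized heap object.
   Returns NULL if no memory is available.  */
struct stored_addr *
stored_addr_new (const struct sockaddr *sa, socklen_t salen)
{
  struct stored_addr *p = malloc (sizeof (*p) + salen);
  if (p != NULL)
    {
      p->len = salen;
      memcpy (p->bytes, sa, salen);
    }
  return p;
}

/* Copy the stored bytes back into a struct sockaddr_storage, which is
   suitably aligned for every address family.  Returns the address
   length, or 0 if the stored address would not fit.  */
socklen_t
stored_addr_get (const struct stored_addr *p, struct sockaddr_storage *out)
{
  if (p->len > sizeof (*out))
    return 0;
  memset (out, '\0', sizeof (*out));
  memcpy (out, p->bytes, p->len);
  return p->len;
}
```

With this layout an IPv4 socket address costs sizeof(struct sockaddr_in) bytes plus the length field instead of the full 128 bytes. The struct sockaddr_storage object is only needed temporarily, when the address is handed to a socket function.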

Resolving Addresses

There are two directions for resolving of addresses. In the first case a (host) name is resolved to one or more addresses. The second direction is resolving addresses to (host) names. This functionality has been around for a long time. But the interfaces used were designed for IPv4 only and also for a simpler time where each machine had exactly one IP address. Times change but let's examine the old, sorry state first.

Historic Name and Address Lookup

The initial interface to resolve a name to an address was:

#include <netdb.h>
struct hostent *gethostbyname(const char *name);

The hostent structure returns the official host name, the address type, the address length, a number of addresses, and possible aliases. The function can return both IPv4 and IPv6 addresses (controlled by the _res variable) but never can return both types at the same time. The use of _res can also have nasty side effects on concurrently running code.

As a stop-gap solution the gethostbyname2 function was introduced:

#include <netdb.h>
struct hostent *gethostbyname2(const char *name, int af);

It allows requesting IPv4 or IPv6 addresses based on the second parameter which can be AF_INET or AF_INET6. This eliminates the use of the _res variable and potentially allows requesting addresses for other protocols. I am not aware of any implementation which handles anything other than the two. This still does not eliminate one big problem with the interfaces: they are not thread safe since they return a pointer to static memory. For this we have two other interfaces (gethostbyname_r and gethostbyname2_r).

These interfaces were OK in the early days. Machines had one network interface. Yes, some names are associated with multiple addresses but this is mainly meant for load balancing between different machines. But there are two big problems:

  1. the interfaces force one to make a decision about using IPv4 and/or IPv6 and, if both, in which order to try the connections.
  2. the order in which the addresses are returned by each lookup is not specified. In fact, for DNS lookups the order is determined by the name server. The server cannot know which address is best for the client.

Not trying the best address first can mean that connections which cannot succeed are tried (causing delays), traffic can be routed over public lines instead of through intranet lines, the speed can be reduced because indirections are used, etc. These problems could in theory be solved by the programmer but that requires a whole lot of quite complex code. It should not be the responsibility of the programmer.

Determining a host name from an IP address happens using the gethostbyaddr interface:

#include <netdb.h>
struct hostent *gethostbyaddr(const void *addr, int len, int type);

This function could be used for all protocols. But the usual problem is that the function is not thread-safe. There is gethostbyaddr_r but like the other *_r functions it depends on the programmer allocating the memory and adjusting the buffer size. gethostbyaddr also returns addresses, which really aren't needed in these situations.

Modern Address Lookup

POSIX specifies the interface which should be used for address lookups:

#include <netdb.h>
int getaddrinfo(const char *node, const char *service,
                const struct addrinfo *hints, struct addrinfo **res);

This is a terribly complicated interface with many options. I'll explain a few of them but first the most basic lookup and, more importantly, how to handle the results. This is a simple code sequence:

struct addrinfo *res;
struct addrinfo hints;
memset(&hints, '\0', sizeof(hints));
hints.ai_socktype = SOCK_STREAM;
hints.ai_flags = AI_ADDRCONFIG;
int e = getaddrinfo("www.redhat.com", "http", &hints, &res);
if (e != 0) {
  printf("failure %s\n", gai_strerror (e));
  return;
}
int sock = -1;
for (struct addrinfo *r = res; r != NULL; r = r->ai_next) {
  sock = socket(r->ai_family, r->ai_socktype, r->ai_protocol);
  if (sock != -1 && connect(sock, r->ai_addr, r->ai_addrlen) == 0)
    break;
  if (sock != -1) {
    close(sock);
    sock = -1;
  }
}
freeaddrinfo(res);
if (sock != -1) {
  ... use socket ...
}

As you can see the getaddrinfo function provides all the information needed for the connection. The example code also shows how the different addresses are returned.

The most important thing when using getaddrinfo is to make sure that all results are used in order. To stress the important words again: all and order. Too many (incorrect) programs only use the first result. The order in which the results are returned is not arbitrary or random. The results are in fact sorted.

The POSIX specification does not require this but this is mainly because POSIX is protocol agnostic. But there is RFC 3484. It explains how addresses should be sorted based on source and destination addresses. I have another write-up on just this topic so I won't repeat it here. The important thing to remember here is that the addresses are returned in an order where the first returned address has the highest probability to succeed.

There are a few more hints which can be passed. The following flags can be set in ai_flags (the list is not complete):

AI_ADDRCONFIG
This flag should always be set when the returned values are needed to make connections. If no specific protocol is requested, the Linux getaddrinfo implementation returns both IPv4 and IPv6 addresses. This can be less than optimal and is certainly slower if the machine has only interfaces for one protocol. These days there are still many systems which have no configured IPv6 address at all. In that case using an IPv6 address will always fail. Worse, it might cause the IPv6 kernel module to be loaded unnecessarily. Using AI_ADDRCONFIG avoids this by determining what protocols are supported by the currently configured network interfaces and returning only addresses for those.
AI_CANONNAME
The first element of the returned list has the ai_canonname filled in with the official name of the machine. This name might be different from the name passed in as the first parameter. The value is only really meaningful when the name lookup happens using DNS which has the concept of a canonical host name.
AI_V4MAPPED
When looking up IPv6 addresses the function will return mapped IPv4 addresses if there is no IPv6 address available at all.
AI_ALL
If this flag is set along with AI_V4MAPPED when looking up IPv6 addresses the function will return all IPv6 addresses as well as all IPv4 addresses, the latter mapped to IPv6 format.
AI_IDN
Setting this flag indicates that the provided host name is encoded using the International Domain Name format (RFC 3490).
AI_CANONIDN
Setting this flag indicates that the returned canonical host name should be encoded using the IDN format.

The ai_family and ai_protocol fields in the hint structure can also be set. Most of the time this is not needed. There are a lot more details of getaddrinfo. The POSIX specification describes them pretty well. Just use the man pages (man getaddrinfo requests the Linux version, man 3p getaddrinfo the POSIX man page).
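
As a small illustration of setting ai_family together with ai_flags, the following sketch asks for an IPv6 result and uses AI_V4MAPPED so that an IPv4 input comes back in mapped form. The helper name lookup_v4mapped is made up for this example. To keep the sketch independent of any name server it also sets AI_NUMERICHOST, so only numeric address strings are accepted; that this combination yields a mapped address is how the implementations I know behave, so treat it as an assumption elsewhere.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: convert a numeric address string into an IPv6
   socket address.  IPv4 input is returned as a V4-mapped IPv6 address
   because of AI_V4MAPPED.  Returns 0 or an EAI_* error code.  */
int
lookup_v4mapped (const char *text, struct sockaddr_in6 *out)
{
  struct addrinfo hints;
  struct addrinfo *res;
  memset (&hints, '\0', sizeof (hints));
  hints.ai_family = AF_INET6;        /* only IPv6-format results */
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_V4MAPPED | AI_NUMERICHOST;
  int e = getaddrinfo (text, NULL, &hints, &res);
  if (e != 0)
    return e;
  if (res->ai_family != AF_INET6 || res->ai_addrlen > sizeof (*out))
    {
      /* Not expected with the hints above, but do not overflow OUT.  */
      freeaddrinfo (res);
      return EAI_FAMILY;
    }
  memcpy (out, res->ai_addr, res->ai_addrlen);
  freeaddrinfo (res);
  return 0;
}
```

A caller can check the result with the IN6_IS_ADDR_V4MAPPED macro from <netinet/in.h>. Without AI_V4MAPPED the same call with an IPv4 string is expected to fail because no IPv6 address exists for it.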

Modern Host Name Lookup

To determine the name of a host and the name of a service from the socket address one should use getnameinfo:

#include <netdb.h>
int getnameinfo(const struct sockaddr *sa, socklen_t salen,
                char *node, socklen_t nodelen,
                char *service, socklen_t servicelen, int flags);

The first two parameters describe the socket address. These can be values as returned by getaddrinfo or getpeername or they can be constructed by hand. The important point is that the code is protocol independent.

The result strings are returned in the buffers pointed to by node and service. The length of the two buffers is specified by the nodelen and servicelen parameters respectively. Either buffer pointer can be a null pointer in which case no such information is returned. If a buffer is too small the function returns EAI_OVERFLOW.

The flags parameter can have one or more of the NI_* flags defined in <netdb.h> set. I'm not going over all of them here. It is perhaps worthwhile to mention that the NI_DGRAM flag can be used to look up UDP services instead of the default TCP services. This corresponds roughly to the second parameter of getservbyport.

Printing and Scanning Addresses

The routines to generate textual representations of IP addresses and to convert text to IP addresses are also protocol independent but not automatic. Two functions which can be used are:

#include <arpa/inet.h>
int inet_pton(int af, const char *cp, void *buf);
const char *inet_ntop (int af, const void *cp, char *buf, socklen_t len);

For both functions the programmer has to make calls for the different protocols which are supported. Supported are only AF_INET and AF_INET6. The second parameter of inet_pton is a string. The third parameter points to a memory region large enough to hold an address. For AF_INET the requirement is NS_INADDRSZ bytes (i.e., sizeof(struct in_addr)) and for AF_INET6 it is NS_IN6ADDRSZ (i.e., sizeof(struct in6_addr)).

For inet_ntop the second parameter is the address (usually the sin_addr field of a struct sockaddr_in object or the sin6_addr field of a struct sockaddr_in6 object). The buffer to store the result in is passed in the third parameter, its length in the fourth. The buf value is also the return value of a successful call.
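
The "not automatic" part is easiest to see in code. In the following sketch (the helper name canonical_ip is made up for this example) the caller has to try each address family explicitly and has to remember which one matched in order to print the address again.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: parse TEXT as an IPv6 or IPv4 address and write
   the normalized textual form to OUT.  Returns the address family that
   matched or -1 if TEXT is not a valid address or OUT is too small.  */
int
canonical_ip (const char *text, char *out, socklen_t outlen)
{
  /* Large enough and suitably aligned for either address type.  */
  union
  {
    struct in_addr v4;
    struct in6_addr v6;
  } addr;
  static const int families[] = { AF_INET6, AF_INET };

  for (size_t i = 0; i < sizeof (families) / sizeof (families[0]); ++i)
    if (inet_pton (families[i], text, &addr) == 1)
      {
        if (inet_ntop (families[i], &addr, out, outlen) == NULL)
          return -1;
        return families[i];
      }
  return -1;
}
```

A buffer of INET6_ADDRSTRLEN bytes is large enough for the output of either family.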

These are the lowlevel functions and some people might want to use them. But they are more cumbersome to use than they are worth. There are two higher-level interfaces and we've already talked about them. getaddrinfo and getnameinfo can be used to convert addresses back and forth. And what is more, they automatically recognize the address format and handle it transparently.

A call to getaddrinfo with the string as the first parameter and ideally the AI_NUMERICHOST bit set in the ai_flags field of the hints will return an object with the converted address. This is a bit slower due to the dynamic memory allocation but simplifies things a lot. That's usually more important. Another plus: the result is a fully usable socket address structure. If a service name is also passed to getaddrinfo even the port is already filled in.

Similarly, to convert an address to a string use getnameinfo with the NI_NUMERICHOST flag set. There are really no disadvantages to using the function instead of inet_ntop.
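
The two conversions just described could look roughly like this. The function names text_to_sockaddr and sockaddr_to_text are made up for this sketch. It additionally sets AI_NUMERICSERV, which is my choice to keep the port conversion independent of the services database; pass a service name and drop the flag if that is what you need.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: convert a numeric address string and a numeric
   port string into a ready-to-use socket address.  The address family
   is recognized automatically.  Returns 0 or an EAI_* error code.  */
int
text_to_sockaddr (const char *host, const char *port,
                  struct sockaddr_storage *out, socklen_t *outlen)
{
  struct addrinfo hints;
  struct addrinfo *res;
  memset (&hints, '\0', sizeof (hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;   /* no lookups */
  int e = getaddrinfo (host, port, &hints, &res);
  if (e != 0)
    return e;
  memcpy (out, res->ai_addr, res->ai_addrlen);
  *outlen = res->ai_addrlen;
  freeaddrinfo (res);
  return 0;
}

/* Hypothetical helper: the reverse direction, without any name lookup.  */
int
sockaddr_to_text (const struct sockaddr_storage *ss, socklen_t sslen,
                  char *host, socklen_t hostlen)
{
  return getnameinfo ((const struct sockaddr *) ss, sslen,
                      host, hostlen, NULL, 0, NI_NUMERICHOST);
}
```

Note that neither function contains a single AF_INET or AF_INET6 test. The same two calls handle "192.0.2.1" and "2001:db8::1" alike.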

Interface Checklist

The following is a table listing on the left side the interfaces which must/should be avoided and on the right side the replacements. The list also contains interfaces not mentioned above because they should be avoided when possible.

Obsolete Interface                          Protocol Independent Interface

struct sockaddr_in (when used to store      struct sockaddr_storage
addresses)

gethostbyname                               getaddrinfo
gethostbyname2
getservbyname when filling in port number

gethostbyaddr                               getnameinfo
getservbyport when handling socket address

inet_addr                                   getaddrinfo with AI_NUMERICHOST flag
inet_aton
inet_nsap_addr

inet_ntoa                                   getnameinfo with NI_NUMERICHOST flag
inet_nsap_ntoa

inet_makeaddr                               Avoid completely, we don't have this
inet_netof                                  concept anymore
inet_network
inet_neta
inet_net_ntop
inet_net_pton

rcmd                                        rcmd_af
rexec                                       rexec_af
rresvport                                   rresvport_af

Example Client Code

The following is a trivial client program. In fact, there are two programs. One is using the old interfaces and it uses IPv4 only. The second program is protocol independent. Comparing the two shows the differences. Obviously not all IPv4 programs look exactly like this but it should be easy enough to recognize commonalities and therefore adjust the code for protocol independence.

IPv4-only version:

#include <errno.h>
#include <error.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

int
main (int argc, char *argv[])
{
  int result = 0;
  struct hostent *h = gethostbyname (argv[1]);
  if (h == NULL)
    error (EXIT_FAILURE, errno, "gethostbyname");
  struct servent *s = getservbyname ("echo", "tcp");
  if (s == NULL)
    error (EXIT_FAILURE, errno, "getservbyname");
  struct in_addr **addrs = (struct in_addr **) h->h_addr_list;
  while (*addrs != NULL)
    {
      int sock = socket (PF_INET, SOCK_STREAM, 0);
      if (sock != -1)
        {
          struct sockaddr_in sin;
          sin.sin_family = AF_INET;
          sin.sin_port = s->s_port;
          sin.sin_addr = **addrs;
          if (connect (sock,
                       (struct sockaddr *) &sin, sizeof (sin)) == 0)
            {
              char *line = NULL;
              size_t len = 0;
              ssize_t n = getline (&line, &len, stdin);
              write (sock, line, n);
              n = read (sock, line, len);
              write (STDOUT_FILENO, line, n);
              close (sock);
              goto out;
            }
          close (sock);
        }
      ++addrs;
    }
  error (0, 0, "cannot contact %s", argv[1]);
  result = 1;
 out:
  return result;
}

Protocol independent version:

#include <errno.h>
#include <error.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

int
main (int argc, char *argv[])
{
  int result = 0;
  struct addrinfo *ai;
  struct addrinfo hints;
  memset (&hints, '\0', sizeof (hints));
  hints.ai_flags = AI_ADDRCONFIG;
  hints.ai_socktype = SOCK_STREAM;
  int e = getaddrinfo (argv[1], "echo", &hints, &ai);
  if (e != 0)
    error (EXIT_FAILURE, 0, "getaddrinfo: %s", gai_strerror (e));
  struct addrinfo *runp = ai;
  while (runp != NULL)
    {
      int sock = socket (runp->ai_family, runp->ai_socktype,
                         runp->ai_protocol);
      if (sock != -1)
        {
          if (connect (sock,
                       runp->ai_addr, runp->ai_addrlen) == 0)
            {
              char *line = NULL;
              size_t len = 0;
              ssize_t n = getline (&line, &len, stdin);
              write (sock, line, n);
              n = read (sock, line, len);
              write (STDOUT_FILENO, line, n);
              close (sock);
              goto out;
            }
          close (sock);
        }
      runp = runp->ai_next;
    }
  error (0, 0, "cannot contact %s", argv[1]);
  result = 1;
 out:
  freeaddrinfo (ai);
  return result;
}

The protocol independent code is actually smaller, requires fewer function calls, and is less error prone.

Example Server Code

The following is a trivial TCP server, again in two versions. The server binds to all available addresses. There is one little oddity to observe. The Linux kernel by default does not allow binding an IPv4 and an IPv6 socket to the same port at the same time (this can be changed with /proc/sys/net/ipv6/bindv6only). An IPv4 socket cannot accept an IPv6 connection; the client's connect() call would fail with ECONNREFUSED. But it is possible to accept an IPv4 connection with an IPv6 socket, in which case the address returned by the accept() call is a V4-mapped IPv6 address. This means that a server which is meant to accept both IPv4 and IPv6 connections has to bind an IPv6 socket. For the benefit of systems which allow or require binding both an IPv4 and an IPv6 socket, the server code still iterates over all addresses returned by getaddrinfo(). The program recognizes the limitation the Linux kernel imposes by checking for the EADDRINUSE error from the bind() call and simply ignoring it.

IPv4-only version:

#include <error.h>
#include <errno.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/poll.h>

int
main (int argc, char *argv[])
{
  struct servent *s = getservbyname ("echo", "tcp");
  if (s == NULL)
    error (EXIT_FAILURE, errno, "getservent");
  struct pollfd fds[1];
  int nfds = 0;
  fds[nfds].fd = socket (AF_INET, SOCK_STREAM, 0);
  if (fds[nfds].fd == -1)
    error (EXIT_FAILURE, errno, "socket");
  fds[nfds].events = POLLIN;
  int opt = 1;
  setsockopt (fds[nfds].fd, SOL_SOCKET, SO_REUSEADDR,
              &opt, sizeof (opt));
  struct sockaddr_in sin;
  sin.sin_family = AF_INET;
  sin.sin_port = s->s_port;
  sin.sin_addr.s_addr = INADDR_ANY;
  if (bind (fds[nfds].fd,
            (struct sockaddr *) &sin, sizeof (sin)) != 0)
    error (EXIT_FAILURE, errno, "bind");
  if (listen (fds[nfds].fd, SOMAXCONN) != 0)
    error (EXIT_FAILURE, errno, "listen");
  ++nfds;
  int i = 0;
  while (1)
    {
      int n = poll (fds, nfds, -1);
      if (n > 0)
        if (fds[i].revents & POLLIN)
          {
            struct sockaddr_in rem;
            socklen_t remlen = sizeof (rem);
            int fd = accept (fds[i].fd, (struct sockaddr *) &rem, &remlen);
            if (fd != -1)
              {
                struct hostent *h = gethostbyaddr (&rem.sin_addr,
                                                   sizeof (rem.sin_addr),
                                                   rem.sin_family);
                char *buf1 = h ? h->h_name : "???";
                char *buf2 = inet_ntoa (rem.sin_addr);
                printf ("connection from %s (%s)\n", buf1, buf2);
                char buf[1000];
                ssize_t l = read (fd, buf, sizeof (buf));
                write (fd, buf, l);
                close (fd);
              }
          }
    }
}

Protocol independent version:

#include <error.h>
#include <errno.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/poll.h>

int
main (int argc, char *argv[])
{
  struct addrinfo *ai;
  struct addrinfo hints;
  memset (&hints, '\0', sizeof (hints));
  hints.ai_flags = AI_PASSIVE | AI_ADDRCONFIG;
  hints.ai_socktype = SOCK_STREAM;
  int e = getaddrinfo (NULL, "echo", &hints, &ai);
  if (e != 0)
    error (EXIT_FAILURE, 0, "getaddrinfo: %s", gai_strerror (e));
  int nfds = 0;
  struct addrinfo *runp = ai;
  while (runp != NULL)
    {
      ++nfds;
      runp = runp->ai_next;
    }
  struct pollfd fds[nfds];
  for (nfds = 0, runp = ai; runp != NULL; runp = runp->ai_next)
    {
      fds[nfds].fd = socket (runp->ai_family, runp->ai_socktype,
                             runp->ai_protocol);
      if (fds[nfds].fd == -1)
        error (EXIT_FAILURE, errno, "socket");
      fds[nfds].events = POLLIN;
      int opt = 1;
      setsockopt (fds[nfds].fd, SOL_SOCKET, SO_REUSEADDR,
                  &opt, sizeof (opt));
      if (bind (fds[nfds].fd,
                runp->ai_addr, runp->ai_addrlen) != 0)
        {
          if (errno != EADDRINUSE)
            error (EXIT_FAILURE, errno, "bind");
          close (fds[nfds].fd);
        }
      else
        {
          if (listen (fds[nfds].fd, SOMAXCONN) != 0)
            error (EXIT_FAILURE, errno, "listen");
          ++nfds;
        }
    }
  freeaddrinfo (ai);
  while (1)
    {
      int n = poll (fds, nfds, -1);
      if (n > 0)
        for (int i = 0; i < nfds; ++i)
          if (fds[i].revents & POLLIN)
            {
              struct sockaddr_storage rem;
              socklen_t remlen = sizeof (rem);
              int fd = accept (fds[i].fd, (struct sockaddr *) &rem, &remlen);
              if (fd != -1)
                {
                  char buf1[200];
                  if (getnameinfo ((struct sockaddr *) &rem, remlen,
                                   buf1, sizeof (buf1), NULL, 0, 0) != 0)
                    strcpy (buf1, "???");
                  char buf2[100];
                  (void) getnameinfo ((struct sockaddr *) &rem, remlen,
                                      buf2, sizeof (buf2), NULL, 0,
                                      NI_NUMERICHOST);
                  printf ("connection from %s (%s)\n", buf1, buf2);
                  char buf[1000];
                  ssize_t l = read (fd, buf, sizeof (buf));
                  write (fd, buf, l);
                  close (fd);
                }
            }
    }
}

Note that this code works as expected because IPv6 addresses are sorted before IPv4 addresses in the RFC 3484 sorting.

In case a server has to be bound to a specific address one usually does not have to call getaddrinfo() to determine the address because it is determined by a call to a function like getifaddrs(). getaddrinfo() can still be called to get the port number for the service.
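
A rough sketch of that last step, getting only the port number out of getaddrinfo(), might look as follows. The helper name service_port is made up for this example, and pulling the port out of the first result is just one way to do it. The address itself would come from elsewhere, for instance from getifaddrs().

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: determine the TCP port for SERVICE (a service
   name or a decimal string).  Returns the port in host byte order or
   -1 if the service is unknown.  */
int
service_port (const char *service)
{
  struct addrinfo hints;
  struct addrinfo *res;
  memset (&hints, '\0', sizeof (hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_PASSIVE;       /* no host name is needed */
  if (getaddrinfo (NULL, service, &hints, &res) != 0)
    return -1;

  int port = -1;
  for (struct addrinfo *r = res; r != NULL && port == -1; r = r->ai_next)
    if (r->ai_family == AF_INET)
      port = ntohs (((struct sockaddr_in *) r->ai_addr)->sin_port);
    else if (r->ai_family == AF_INET6)
      port = ntohs (((struct sockaddr_in6 *) r->ai_addr)->sin6_port);
  freeaddrinfo (res);
  return port;
}
```

The value would then be stored with htons() in the sin_port or sin6_port field of the address obtained for the interface before calling bind().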