XDR allocate, free and destroy

Introduction

XDR is the External Data Representation used by ONCRPC (a.k.a. SunRPC) and described in RFC 4506. The protocol itself is well-documented and supported, but the common tools around it are hardly documented at all, except in some old books on RPC.

In particular it is not clear how to use the rpcgen tool and the xdr* functions together to serialise data structures, correctly handling memory allocation and freeing.

This paper documents how to do this and provides simple-to-follow code samples.

General principles

Filters

Each XDR primitive (string, int, etc.) has a matching XDR function (xdr_string, xdr_int, etc.). When you define your own XDR structure (eg. struct my_struct { ... }), you use rpcgen to generate a matching function for it (eg. xdr_my_struct). See the rpcgen section below for how to use rpcgen.

The single function is multi-purpose. It can be used both to serialise ("encode") and to deserialise ("decode") the structure to and from the wire format. The same function can also be used to free up the structure.

In the language of the XDR documentation, this single function with three purposes is called a filter. The single function associated with each XDR primitive type is called a filter primitive.

Streams

XDR can run over several different underlying streams. The commonly used streams are:

Stream: Files
Stream creation function: xdrstdio_create
Notes: Using FILE * from <stdio.h>.

Stream: In-memory buffers
Stream creation function: xdrmem_create
Notes: Buffers have a fixed maximum size, specified in advance. They cannot grow on demand.

Stream: Records
Stream creation function: xdrrec_create
Notes: Intended for use with sockets. You have to provide read and write functions. Unless you want to write a complete custom stream, records provide a fairly flexible mechanism for writing your own XDR back-ends. Records are used almost exclusively by SunRPC. Record streams behave a bit differently from the other XDR stream types - I have devoted an entire section to them at the end of this document.

Stream: Custom
Stream creation function: (Your own)
Notes: Write your own custom stream. You have to supply some basic functions to serialise bytes and 32-bit ints, and all the other primitives and filters should just work on top. See the x_ops functions in <rpc/xdr.h>.

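When you create one of the above XDR streams, you must specify an "operation" - a kind of mode for the stream. The mode is one of:

XDR_ENCODE: Encode from the in-memory structure to the wire.
XDR_DECODE: Decode from the wire to the in-memory structure.
XDR_FREE: Free the in-memory structure.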

Note that in the case of xdrmem_create "wire" means "memory buffer", but it still contains opaque serialised data.

Examples of use are given below.

Filter / stream interaction

The stream's operation (XDR_ENCODE etc.) is passed into the filter function, and this is how the filter function knows whether it should be used to encode, decode or free.

Useful man page

The main manual page which describes all these filter primitives and other library functions is xdr(3).

rpcgen

Despite its name, rpcgen can be used to generate just XDR code, without any SunRPC dependencies.

Your custom data types should be written into a ".x" file. The format of this file is superficially similar to a C header file, but contains many extensions to what is possible in C. It is described in RFC 4506 which you should read now.

For this document, I use an example of a linked list of strings, which comes from RFC 4506. My test.x, which you can find in the downloads tarball, describes the linked list of strings like so:

struct stringentry {
  string item<1024>;
  stringentry *next;
};

typedef stringentry *stringlist;

This defines two data types called stringentry and stringlist.

We use rpcgen to compile this into a C header file and a C source file. The C header file will contain C data types equivalent to the stringentry and stringlist. The C source file will contain two filter functions called xdr_stringentry and xdr_stringlist. As described above, each filter function is multi-purpose. It can be used to encode, decode and free the datatype.

$ rpcgen -c -o test.c test.x
$ rpcgen -h -o test.h test.x

Encoding

Encoding is the process of going from an in-memory structure (eg. a list of strings stored in memory), to the XDR-encoded machine-independent data on the wire.

In this example "on the wire" is an external file, so we are using the xdrstdio_create stream.

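XDR xdr;
FILE *fp;
stringlist strings = /* see below */;

/* Open the output file which will hold the XDR-encoded data. */
fp = fopen ("test1.out", "w");
if (fp == NULL) {
  perror ("test1.out");
  exit (1);
}

/* Create a stdio stream in encode mode on top of the file. */
xdrstdio_create (&xdr, fp, XDR_ENCODE);

/* Serialise the whole list to the stream. */
if (!xdr_stringlist (&xdr, &strings)) {
  fprintf (stderr, "test1: could not encode\n");
  exit (1);
}

/* Flush the stream, then close the file ourselves. */
xdr_destroy (&xdr);
fclose (fp);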

So here we first open our output file, then use xdrstdio_create to create a stream (XDR xdr). Notice how I pass XDR_ENCODE as the operation - we are going to use this stream to encode from in-memory structures to the wire format. All filter functions and filter primitives will see this operation, and will change their behaviour accordingly.

xdr_stringlist is one of the filter functions generated by rpcgen. It encodes the strings structure and writes it out to the stream (xdr). Because it sees xdr->x_op == XDR_ENCODE it knows to encode.

Finally we must do two things to finish off encoding. Firstly we have to call xdr_destroy, which just calls the hidden internal function xdr->x_ops->x_destroy. This is specific to the stream, and for a stdio stream it results in fflush(3) being called on the underlying file pointer. Then we have to close the file pointer ourselves (the stream does not close it for us).

Allocating and freeing strings

If I had allocated strings statically then there would be no reason to worry about freeing it. Normally though this structure would be allocated dynamically. In this example:

stringlist strings;

strings = malloc (sizeof (struct stringentry));
strings->item = strdup ("hello");
strings->next = malloc (sizeof (struct stringentry));
strings->next->item = strdup ("goodbye");
strings->next->next = NULL;

We could free even this dynamically allocated structure by hand. However rpcgen generates a filter function for us which can also be used to free the structure. To use it, we just call:

xdr_free ((xdrproc_t) xdr_stringlist, (char *) &strings);

This frees everything, and assumes that everything was allocated dynamically, even strings. This is why I strdup'd the strings while allocating above.

xdr_free is really a helper function. Notice that there is no XDR (stream) passed here, yet xdr_stringlist, like all filters, requires a stream. In fact xdr_free creates a dummy XDR stream, sets its operation to XDR_FREE, then calls the filter. The filter sees that the operation is XDR_FREE and changes its behaviour to "free mode".

Decoding

Decoding is the process of going from the XDR-encoded machine-independent data on the wire, back to an in-memory structure.

In this example "on the wire" is an external file, so we are using the xdrstdio_create stream.

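XDR xdr;
stringlist strings;
stringentry *entry;
FILE *fp;

fp = fopen ("test1.out", "r");
if (fp == NULL) {
  perror ("test1.out");
  exit (1);
}
xdrstdio_create (&xdr, fp, XDR_DECODE);

/* NULL tells the filter to allocate the list for us. */
strings = NULL;
if (!xdr_stringlist (&xdr, &strings)) {
  fprintf (stderr, "test1: could not decode\n");
  exit (1);
}

for (entry = strings; entry; entry = entry->next)
  printf ("entry->item = %s\n", entry->item);

xdr_free ((xdrproc_t) xdr_stringlist, (char *) &strings);

/* Destroy the stream before closing the file: for stdio streams
   xdr_destroy flushes the underlying FILE *, so fclose comes last. */
xdr_destroy (&xdr);
fclose (fp);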

Firstly I open the file and associate it with the file stream (xdr) in decode operation (XDR_DECODE).

The next code decodes the contents of the file into the stringlist strings structure in memory.

Important note: Setting strings = NULL is vital in this case. It tells the filter to dynamically allocate the required structures for us. I could have passed in the address of an existing structure here, in which case the filter decodes into that existing structure. That works fine for fixed-size structures, but not so well for linked lists which really have to be allocated dynamically since we don't necessarily know in advance how many strings there will be.

Thirdly I print out the strings, just to prove that the encode/decode process actually worked.

Lastly I free the dynamically allocated list of strings and destroy the stream. (These operations can be done in either order since they are independent of each other. In fact the destroy could have been done much earlier - after the decode).

Inconsistency in freeing structures vs pointers

In the linked list example above, stringlist strings is a pointer to a stringentry. xdr_free ((xdrproc_t) xdr_stringlist, (char *)&strings) frees everything including strings itself.

This is contrary to the documentation which states:

xdr_free (proc, objp)

Generic freeing routine. The first argument is the XDR routine for the object being freed. The second argument is a pointer to the object itself. Note: the pointer passed to this routine is not freed, but what it points to is freed (recursively).

In the download tarball, file "test2.c", I encode and decode request and reply structures. I put the actual structures on the stack, but the rest is allocated dynamically. Thus my allocation and free looks like:

request rq = { 0 };

if (!xdr_request (&xdr, &rq)) {
  fprintf (stderr, "read_request: could not decode\n");
  exit (1);
}

xdr_free ((xdrproc_t) xdr_request, (char *) &rq);

This works, and behaves as per the documentation. ie. the stack-allocated rq structure is untouched, but all dynamic sub-allocations off this structure are freed correctly (verified with valgrind).

The inconsistency arises in the xdr_reference filter primitive, and may be a bug. However, sending structures over the wire as in the request/reply example is likely to be much more common, so the form of usage in "test2.c" is likely to be more useful.

Record streams

(This section to come)

Security

There is general advice given in the RFC on security considerations when decoding XDR data from untrusted sources.

You should always use strings, arrays, etc. with a <maximum> constraint on their length. For example: string filename<1024>;. That way the decoding filter will allocate only a bounded amount of memory, or fail cleanly.

Recursive structures, such as the linked list of strings above, cannot be decoded safely by the filters written by rpcgen. An untrusted source could feed an unbounded list of strings and cause a stack overflow. The only solution is to modify the rpcgen-generated filter function by hand to catch this case, or else to avoid recursive structures altogether (arrays with an upper bound can be used in preference to linked lists).


rjones AT redhat DOT com

$Id: index.html,v 1.3 2007/04/11 11:45:47 rjones Exp $