XDR is the External Data Representation used by ONC RPC (a.k.a. SunRPC) and described in RFC 4506. The protocol itself is well documented and supported, but the common tools around it are hardly documented at all, except in some old books on RPC.
In particular it is not clear how to use the rpcgen tool
and the xdr* functions together to serialise data
structures while correctly handling memory allocation and freeing.
This paper documents how to do this and provides simple-to-follow code samples.
Each XDR primitive (string, int, etc.) has a matching XDR function
(xdr_string, xdr_int, etc.). When you
define your own XDR structure (e.g. struct my_struct {
... }), you use rpcgen to generate a matching function for it
(e.g. xdr_my_struct). See the next section for how to use
rpcgen.
The single function is multi-purpose. It can be used both to serialise ("encode") and to deserialise ("decode") the structure to and from the wire format. The same function can also be used to free up the structure.
In the language of the XDR documentation, this single function with three purposes is called a filter. The single function associated with each XDR primitive type is called a filter primitive.
XDR can run over several different underlying streams. The commonly used streams are:
| Stream | Stream creation function | Notes |
|---|---|---|
| Files | xdrstdio_create | Using FILE * from <stdio.h>. |
| In-memory buffers | xdrmem_create | Buffers have a fixed maximum size, specified in advance. They cannot grow on demand. |
| Records | xdrrec_create | Intended for use with sockets. You have to provide read and write functions. Unless you want to write a complete custom stream, records provide a fairly flexible mechanism for writing your own XDR back-ends. Records are used almost exclusively by SunRPC. Record streams behave a bit differently from the other XDR stream types - I have devoted an entire section to them at the end of this document. |
| Custom | (Your own) | Write your own custom stream. You have to supply some basic functions to serialise bytes and 32-bit ints, and all the other primitives and filters should just work on top. See the x_ops functions in <rpc/xdr.h>. |
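The idea behind a custom stream, a small table of basic operations with everything else layered on top, can be sketched without the real library. The following is a hand-rolled illustration only: grow_stream, grow_ops and the grow_* functions are hypothetical names, not part of <rpc/xdr.h>, and the sketch covers encoding only. It shows a buffer that grows on demand, which the built-in memory stream cannot do.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct grow_stream;

/* Hypothetical operations table, loosely modelled on the idea of x_ops. */
struct grow_ops {
  int (*put_bytes) (struct grow_stream *s, const void *data, size_t len);
  int (*put_u32) (struct grow_stream *s, uint32_t v);
};

struct grow_stream {
  const struct grow_ops *ops;
  unsigned char *buf;   /* heap buffer, NULL until first write */
  size_t len, cap;      /* bytes used and bytes allocated */
};

/* Basic operation 1: append raw bytes, doubling the buffer as needed. */
int grow_put_bytes (struct grow_stream *s, const void *data, size_t len)
{
  assert (s->len <= s->cap);
  if (len == 0)
    return 1;
  if (len > s->cap - s->len) {
    size_t cap = s->cap ? s->cap : 16;
    unsigned char *p;
    while (cap - s->len < len) {
      if (cap > SIZE_MAX / 2)
        return 0;
      cap *= 2;
    }
    p = realloc (s->buf, cap);
    if (p == NULL)
      return 0;
    s->buf = p;
    s->cap = cap;
  }
  memcpy (s->buf + s->len, data, len);
  s->len += len;
  return 1;
}

/* Basic operation 2: append a 32-bit int in big-endian (XDR) byte order. */
int grow_put_u32 (struct grow_stream *s, uint32_t v)
{
  unsigned char be[4];
  be[0] = (unsigned char) (v >> 24);
  be[1] = (unsigned char) (v >> 16);
  be[2] = (unsigned char) (v >> 8);
  be[3] = (unsigned char) v;
  return s->ops->put_bytes (s, be, sizeof be);
}

const struct grow_ops grow_ops_table = { grow_put_bytes, grow_put_u32 };

/* A higher-level encoder built only from the two basic operations:
   4-byte length, the bytes, then zero padding up to a multiple of 4. */
int grow_put_string (struct grow_stream *s, const char *str)
{
  static const unsigned char zeros[4] = { 0, 0, 0, 0 };
  size_t len = strlen (str);
  if (len > UINT32_MAX)
    return 0;
  return s->ops->put_u32 (s, (uint32_t) len)
    && s->ops->put_bytes (s, str, len)
    && s->ops->put_bytes (s, zeros, (4 - len % 4) % 4);
}
```

A stream is started with `struct grow_stream s = { &grow_ops_table, NULL, 0, 0 };` and the caller frees s.buf when done. A real custom stream would fill in the library's own operations table instead of this simplified one.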
When you create one of the above XDR streams, you must specify an "operation" - a kind of mode for the stream. The mode is one of:
- XDR_ENCODE - You are going to serialise (encode) in-memory structures onto the wire.
- XDR_DECODE - You are going to deserialise (decode) from the wire into in-memory structures.
- XDR_FREE - You want to free an in-memory structure. You shouldn't normally use this! See the description of the xdr_free function below.
Note that in the case of xdrmem_create
"wire" means "memory buffer", but it still contains opaque serialised
data.
Examples of use are given below.
The stream's operation (XDR_ENCODE etc.) is passed
into the filter function, and this is how the filter function
knows whether it should be used to encode, decode or free.
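How one function can serve all three purposes is easiest to see in a toy re-implementation. The sketch below is not the library's code: toy_op, toy_stream, toy_filter_u32 and toy_filter_string are hypothetical stand-ins for the operation, the stream and two filter primitives, and the stream is only a fixed memory buffer. The wire layout (big-endian 32-bit length, bytes, zero padding to a multiple of four) follows RFC 4506.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for XDR_ENCODE, XDR_DECODE and XDR_FREE. */
enum toy_op { TOY_ENCODE, TOY_DECODE, TOY_FREE };

/* Hypothetical stand-in for an XDR stream over a fixed-size buffer. */
struct toy_stream {
  enum toy_op op;
  unsigned char *buf;
  size_t size;          /* capacity of buf */
  size_t pos;           /* current read/write offset */
};

/* Filter for an unsigned 32-bit int, big-endian on the wire. */
int toy_filter_u32 (struct toy_stream *s, uint32_t *v)
{
  assert (s->pos <= s->size);
  if (s->op == TOY_FREE)
    return 1;                       /* nothing was allocated */
  if (s->size - s->pos < 4)
    return 0;                       /* buffer exhausted */
  if (s->op == TOY_ENCODE) {
    s->buf[s->pos + 0] = (unsigned char) (*v >> 24);
    s->buf[s->pos + 1] = (unsigned char) (*v >> 16);
    s->buf[s->pos + 2] = (unsigned char) (*v >> 8);
    s->buf[s->pos + 3] = (unsigned char) *v;
  } else {
    *v = ((uint32_t) s->buf[s->pos + 0] << 24)
       | ((uint32_t) s->buf[s->pos + 1] << 16)
       | ((uint32_t) s->buf[s->pos + 2] << 8)
       |  (uint32_t) s->buf[s->pos + 3];
  }
  s->pos += 4;
  return 1;
}

/* Filter for a string with a maximum length.  When decoding, a NULL *sp
   means "allocate for me"; a non-NULL *sp is assumed to point at a
   buffer of at least maxlen + 1 bytes. */
int toy_filter_string (struct toy_stream *s, char **sp, uint32_t maxlen)
{
  uint32_t len = 0, pad;

  if (s->op == TOY_FREE) {          /* free mode: release and reset */
    free (*sp);
    *sp = NULL;
    return 1;
  }
  if (s->op == TOY_ENCODE) {
    size_t n;
    if (*sp == NULL)
      return 0;
    n = strlen (*sp);
    if (n > maxlen)
      return 0;
    len = (uint32_t) n;
  }
  if (!toy_filter_u32 (s, &len))    /* writes or reads the length word */
    return 0;
  if (len > maxlen)                 /* reject oversized input early */
    return 0;
  pad = (4 - len % 4) % 4;
  if (s->size - s->pos < (size_t) len + pad)
    return 0;
  if (s->op == TOY_ENCODE) {
    memcpy (s->buf + s->pos, *sp, len);
    memset (s->buf + s->pos + len, 0, pad);
  } else {
    if (*sp == NULL && (*sp = malloc ((size_t) len + 1)) == NULL)
      return 0;
    memcpy (*sp, s->buf + s->pos, len);
    (*sp)[len] = '\0';
  }
  s->pos += (size_t) len + pad;
  return 1;
}
```

The caller never says "encode" or "decode" when calling the filter; it only chooses the operation when setting up the stream, which is the same division of labour the real library uses.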
The main manual page which describes all these filter primitives and other library functions is xdr(3).
Despite its name, rpcgen can be used to generate just
XDR code, without any SunRPC dependencies.
Your custom data types should be written into a ".x"
file. The format of this file is superficially similar to a C header
file, but contains many extensions to what is possible in C. It is
described in RFC
4506 which you should read now.
For this document, I use an example of a linked list of strings, which
comes from RFC 4506. My test.x, which you can find in
the downloads tarball, describes the linked list of strings like so:
struct stringentry {
  string item<1024>;
  stringentry *next;
};
typedef stringentry *stringlist;
This defines two data types called stringentry and
stringlist.
We use rpcgen to compile this into a C header file
and a C source file. The C header file will contain C data types
equivalent to the stringentry and stringlist.
The C source file will contain two filter functions called
xdr_stringentry and xdr_stringlist.
As described above, each filter function is multi-purpose. It
can be used to encode, decode and free the datatype.
$ rpcgen -c -o test.c test.x
$ rpcgen -h -o test.h test.x
Encoding is the process of going from an in-memory structure (eg. a list of strings stored in memory), to the XDR-encoded machine-independent data on the wire.
In this example "on the wire" is an external file, so we are
using the xdrstdio_create stream.
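XDR xdr;
FILE *fp;
stringlist strings = /* see below */;
fp = fopen ("test1.out", "w");
if (fp == NULL) {
  perror ("test1.out");
  exit (1);
}
xdrstdio_create (&xdr, fp, XDR_ENCODE);
if (!xdr_stringlist (&xdr, &strings)) {
  fprintf (stderr, "test1: could not encode\n");
  exit (1);
}
xdr_destroy (&xdr);   /* flushes the underlying file pointer */
fclose (fp);          /* the stream does not close the file for us */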
So here we first open our output file, then use
xdrstdio_create to create a stream (XDR
xdr). Notice how I pass XDR_ENCODE as
the operation - we are going to use this stream to encode from
in-memory structures to the wire format. All filter functions and
filter primitives will see this operation, and will change their
behaviour accordingly.
xdr_stringlist is one of the filter functions generated
by rpcgen. It encodes the strings structure and writes
it out to the stream (xdr). Because it sees that the stream's
x_op field is XDR_ENCODE, it knows to encode.
Finally we must do two things to finish off encoding. Firstly we have
to call xdr_destroy which just calls the hidden internal
function xdr->x_ops->x_destroy. This is specific
to the stream, and in fact results in the standard library call
fflush(3) being called on the underlying file pointer.
Then we have to close the file pointer ourselves (the stream does not
close it for us).
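To make "wire format" concrete, the size of the encoded list can be worked out by hand from the rules in RFC 4506: each optional pointer is written as a 4-byte boolean (1 if an entry follows, 0 at the end of the list), and each string is a 4-byte length followed by its bytes, zero-padded to a multiple of four. The sketch below only does that arithmetic and does not call the XDR library. The struct mirrors what rpcgen would be expected to emit for test.x, and the wire_size_* function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed shape of the C types generated from test.x. */
struct stringentry {
  char *item;
  struct stringentry *next;
};
typedef struct stringentry *stringlist;

/* Bytes used by one XDR string: length word + data padded to 4 bytes. */
size_t wire_size_string (const char *s)
{
  size_t len = strlen (s);
  return 4 + ((len + 3) / 4) * 4;
}

/* Bytes used by the whole list, including the final "no more" marker. */
size_t wire_size_stringlist (stringlist list)
{
  size_t total = 0;
  const struct stringentry *e;

  for (e = list; e != NULL; e = e->next) {
    assert (e->item != NULL);
    total += 4;                          /* "entry follows" boolean */
    total += wire_size_string (e->item);
  }
  return total + 4;                      /* terminating boolean */
}
```

For a two-entry list holding "hello" and "goodbye" this gives 16 + 16 + 4 = 36 bytes, so a 36-byte test1.out would be consistent with these rules for that input.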
Freeing strings
If I had allocated strings statically then there would be
no reason to worry about freeing it. Normally though this structure
would be allocated dynamically. In this example:
stringlist strings;
strings = malloc (sizeof (struct stringentry));
strings->item = strdup ("hello");
strings->next = malloc (sizeof (struct stringentry));
strings->next->item = strdup ("goodbye");
strings->next->next = NULL;
We could free even this dynamically allocated structure by hand. However, rpcgen has already generated a filter function for us which can also be used to free the structure. To use it, we just call:
xdr_free ((xdrproc_t) xdr_stringlist, (char *) &strings);
This frees everything, and assumes that everything was allocated
dynamically, even the string contents. This is why I strdup'd the
strings while allocating above.
xdr_free is really a helper function. Notice that there
is no XDR (stream) passed here, yet
xdr_stringlist, like all filters, requires a stream. In
fact xdr_free creates a dummy XDR stream,
sets its operation to XDR_FREE, then calls the filter.
The filter sees that the operation is XDR_FREE and
changes its behaviour to "free mode".
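The helper pattern can be sketched in a few lines. This is a toy model, not the library's implementation: toy_stream, toy_proc, toy_free, struct person and toy_filter_person are all hypothetical names, and only the free branch of the filter is filled in.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the operation and the stream. */
enum toy_op { TOY_ENCODE, TOY_DECODE, TOY_FREE };
struct toy_stream { enum toy_op op; };

/* Generic filter signature, playing the role of xdrproc_t. */
typedef int (*toy_proc) (struct toy_stream *s, void *objp);

/* An example structure whose members are allocated dynamically. */
struct person {
  char *name;
  char *email;
};

/* Filter for struct person; only the free branch is implemented here. */
int toy_filter_person (struct toy_stream *s, void *objp)
{
  struct person *p = objp;

  if (s->op == TOY_FREE) {
    free (p->name);
    p->name = NULL;
    free (p->email);
    p->email = NULL;
    return 1;
  }
  return 0;   /* encode and decode are omitted from this sketch */
}

/* Helper in the style of xdr_free: build a dummy stream whose only job
   is to carry the "free" operation, then call the filter with it.  The
   object itself (*objp) is not freed, only what it points to. */
void toy_free (toy_proc proc, void *objp)
{
  struct toy_stream dummy = { TOY_FREE };

  assert (proc != NULL && objp != NULL);
  proc (&dummy, objp);
}
```

Because the helper never frees objp itself, the sketch works equally well for a struct person on the stack or on the heap; in the second case the caller still has to free the struct.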
Decoding is the process of going from the XDR-encoded machine-independent data on the wire, back to an in-memory structure.
In this example "on the wire" is an external file, so we are
using the xdrstdio_create stream.
XDR xdr;
stringlist strings;
stringentry *entry;
FILE *fp;
fp = fopen ("test1.out", "r");
if (fp == NULL) {
  perror ("test1.out");
  exit (1);
}
xdrstdio_create (&xdr, fp, XDR_DECODE);
strings = NULL;   /* NULL asks the filter to allocate the list for us */
if (!xdr_stringlist (&xdr, &strings)) {
  fprintf (stderr, "test1: could not decode\n");
  exit (1);
}
for (entry = strings; entry; entry = entry->next)
  printf ("entry->item = %s\n", entry->item);
xdr_free ((xdrproc_t) xdr_stringlist, (char *) &strings);
xdr_destroy (&xdr);
fclose (fp);      /* close last: destroying the stream flushes fp */
Firstly I open the file and associate it with the file stream
(xdr) in decode operation (XDR_DECODE).
The next piece of code decodes the contents of the file into the
stringlist strings structure in memory.
Important note: Setting strings = NULL is vital in this
case. It tells the filter to dynamically allocate the required
structures for us. I could have passed in the address of an existing
structure here, in which case the filter decodes into that existing
structure. That works fine for fixed-size structures, but not so well
for linked lists which really have to be allocated dynamically since
we don't necessarily know in advance how many strings there will be.
Thirdly I print out the strings, just to prove that the encode/decode process actually worked.
Lastly I free the dynamically allocated list of strings and destroy the stream. (These operations can be done in either order since they are independent of each other. In fact the destroy could have been done much earlier - after the decode).
In the linked list example above,
stringlist strings is a pointer to a
stringentry.
xdr_free ((xdrproc_t) xdr_stringlist, (char *)&strings)
frees everything including strings itself.
This is contrary to the documentation, which states:
"xdr_free (proc, objp): Generic freeing routine. The first argument is the XDR routine for the object being freed. The second argument is a pointer to the object itself. Note: the pointer passed to this routine is not freed, but what it points to is freed (recursively)."
In the download tarball, file "test2.c", I encode and decode request and reply structures. I put the actual structures on the stack, but the rest is allocated dynamically. Thus my allocation and free looks like:
request rq = { 0 };
if (!xdr_request (&xdr, &rq)) {
  fprintf (stderr, "read_request: could not decode\n");
  exit (1);
}
xdr_free ((xdrproc_t) xdr_request, (char *) &rq);
This works, and behaves as per the documentation, i.e. the
stack-allocated rq structure is untouched, but all
dynamic sub-allocations off this structure are freed correctly
(verified with valgrind).
The inconsistency arises in the xdr_reference filter
primitive, and may be a bug. However, sending structures over the
wire as in the request/reply example is likely to be much more common,
so the form of usage in "test2.c" is likely to be more useful.
(The section on record streams is still to come.)
There is general advice given in the RFC on security considerations when decoding XDR data from untrusted sources.
You should always use strings, arrays, etc. with a <maximum>
constraint on their length. For example: string
filename<1024>;. That way the decoding filter will
allocate only a bounded amount of memory, or fail cleanly.
Recursive structures, such as the linked list of strings above, cannot
be decoded safely by the filters written by rpcgen. An
untrusted source could feed an unbounded list of strings and cause a
stack overflow. The only solution is to modify the
rpcgen-generated filter function by hand to catch this
case, or else to avoid recursive structures altogether (arrays with an
upper bound can be used in preference to linked lists).
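One way to picture both safeguards, a maximum on each string and a cap on the number of entries, is a hand-written iterative decoder. The sketch below does not patch the rpcgen-generated filter; it reads the wire bytes directly from a memory buffer, following the RFC 4506 layout. MAX_ENTRIES is a hypothetical limit chosen for illustration, and decode_stringlist_bounded, get_u32 and free_list are hypothetical names. It is also stricter than necessary in accepting only 0 or 1 as the "entry follows" boolean.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ITEM_LEN 1024u   /* matches string item<1024> in test.x */
#define MAX_ENTRIES  100u    /* hypothetical cap for this sketch */

struct stringentry {
  char *item;
  struct stringentry *next;
};

/* Read one big-endian 32-bit word; fails if the buffer is exhausted. */
static int get_u32 (const unsigned char *buf, size_t size, size_t *pos,
                    uint32_t *v)
{
  if (*pos > size || size - *pos < 4)
    return 0;
  *v = ((uint32_t) buf[*pos] << 24) | ((uint32_t) buf[*pos + 1] << 16)
     | ((uint32_t) buf[*pos + 2] << 8) | (uint32_t) buf[*pos + 3];
  *pos += 4;
  return 1;
}

/* Free a list iteratively, so that long lists cannot exhaust the stack. */
void free_list (struct stringentry *list)
{
  while (list != NULL) {
    struct stringentry *next = list->next;
    free (list->item);
    free (list);
    list = next;
  }
}

/* Decode with a loop instead of recursion.  Returns 1 on success and
   stores the list in *out; returns 0 on truncated, malformed or
   oversized input, leaving *out untouched. */
int decode_stringlist_bounded (const unsigned char *buf, size_t size,
                               struct stringentry **out)
{
  struct stringentry *head = NULL, **tail = &head;
  size_t pos = 0;
  uint32_t more, len, count = 0;

  assert (buf != NULL && out != NULL);
  for (;;) {
    struct stringentry *e;
    size_t padded;

    if (!get_u32 (buf, size, &pos, &more))
      goto fail;
    if (more == 0)
      break;                                /* end of list */
    if (more != 1 || ++count > MAX_ENTRIES)
      goto fail;                            /* bad flag or too many */
    if (!get_u32 (buf, size, &pos, &len) || len > MAX_ITEM_LEN)
      goto fail;                            /* check before allocating */
    padded = ((size_t) len + 3) / 4 * 4;
    if (size - pos < padded)
      goto fail;
    e = calloc (1, sizeof *e);
    if (e == NULL)
      goto fail;
    *tail = e;                              /* link in first, so that */
    tail = &e->next;                        /* fail: can free it */
    e->item = malloc ((size_t) len + 1);
    if (e->item == NULL)
      goto fail;
    memcpy (e->item, buf + pos, len);
    e->item[len] = '\0';
    pos += padded;
  }
  *out = head;
  return 1;

fail:
  free_list (head);
  return 0;
}
```

The length is checked against MAX_ITEM_LEN before any memory is allocated, and the entry count is checked on every iteration, so hostile input costs a bounded amount of memory and no stack depth.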
$Id: index.html,v 1.3 2007/04/11 11:45:47 rjones Exp $