Guide to writing well formed audit events
=========================================

Background
----------
The audit system is a security subsystem that monitors system and user
activities based on admin-defined rules to ensure compliance with
organizational security policies. The events should contain enough information
that a security officer can later determine what happened on a system.
For example, there may be a policy where access to files in a specific
directory is on a "need to know" basis. The security officer would restrict
access and allow intended access through group permissions or POSIX ACLs. But
how would the officer know if the intended security policy is working? The
audit system can be configured to watch who accesses those files and provide
the information. The security officer can then run reports periodically to
make sure no unusual access has occurred.

Events
------
An audit event consists of all records that have the same host (node),
timestamp, and serial number. Each event on a host (node) has a unique
combination of timestamp and serial number. The serial number separates events
that occur during the same millisecond. An event may be composed of multiple
records, each of which holds information about a different aspect of the event.
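
As a rough illustration of how records are grouped into events, the sketch
below pulls the identifying values out of a raw record and compares two
records. It assumes the record carries them as
msg=audit(seconds.milliseconds:serial), optionally preceded by node=<host>, as
seen in raw logs written by auditd. The struct and function names are made up
for this example and are not part of any audit library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the values that identify one event. */
struct event_id {
    char node[64];         /* host name, empty if the record has no node= */
    unsigned long sec;     /* timestamp, seconds */
    unsigned int milli;    /* timestamp, milliseconds */
    unsigned long serial;  /* serial number */
};

/* Fill in id from one raw record. Returns 0 on success, -1 on failure. */
int parse_event_id(const char *record, struct event_id *id)
{
    const char *p;

    assert(record != NULL && id != NULL);
    memset(id, 0, sizeof(*id));

    /* Assumption: node=, when present, is the first field of the record. */
    if (strncmp(record, "node=", 5) == 0)
        sscanf(record + 5, "%63s", id->node);

    p = strstr(record, "msg=audit(");
    if (p == NULL)
        return -1;
    if (sscanf(p, "msg=audit(%lu.%u:%lu)",
               &id->sec, &id->milli, &id->serial) != 3)
        return -1;
    return 0;
}

/* Two records belong to the same event when all identifying values match. */
int same_event(const struct event_id *a, const struct event_id *b)
{
    return strcmp(a->node, b->node) == 0 && a->sec == b->sec &&
           a->milli == b->milli && a->serial == b->serial;
}
```

With this sketch, a SYSCALL record and a CWD record that both carry
msg=audit(1364481363.243:24287) compare as the same event, while a record
with serial 24288 in the same millisecond does not.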

The audit system has been designed to be able to answer the question, "who did
what to whom and what was the result." These pieces of information have names.
The 'who' is the subject. This is the thing doing the action. It is usually
a process that is running on a user's behalf. The 'what' is generally described
by the audit event type or a key (an admin-defined name for the event).
It lets you know if we are dealing with a file access, login, system shutdown,
etc. The 'to whom' is called the object. The object is what is being acted
upon, such as a file, socket, user account, or process. The result is either
success or failure: the action either did or did not complete.

When writing events, the author will need to list several attributes so that
no doubt is left as to what is being recorded. For example, when recording the
subject, one would naturally assume we are speaking of the current uid, which
is after all the user. But because of things like sudo or su, users can change
accounts. The audit system therefore has a concept of loginuid (also referred
to as auid), which is the account originally used to log in. To perform an
action, the user invokes a process, so we likely want to show which process is
acting on the user's behalf. But is pid alone useful when reconstructing
events? Probably not, so we should record the executable name. However, if the
user invoked a script, the executable is the interpreter, so we also need the
command name. Because users can be logged into multiple sessions at the same
time, we should also disambiguate which session they are in with the session
id. To summarize, we may need to record the uid, auid, pid, exe, comm, and ses
fields just to specify exactly who the subject is. Some cases may require more
fields.
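
The subject fields summarized above could be laid out roughly as in the sketch
below. The struct and function are hypothetical and only show the shape of the
output. The sketch assumes comm and exe have already been checked and contain
nothing that would require hex encoding; real code must not skip that check
for strings that user space can control.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical description of a subject, for illustration only. */
struct subject {
    unsigned int pid;   /* process acting on the user's behalf */
    unsigned int uid;   /* current user id */
    unsigned int auid;  /* login uid, unchanged by su or sudo */
    unsigned int ses;   /* session id */
    const char *comm;   /* command name, e.g. the script */
    const char *exe;    /* executable, e.g. the interpreter */
};

/*
 * Write the subject as space-separated name=value fields.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int format_subject(char *buf, size_t size, const struct subject *s)
{
    int n;

    assert(buf != NULL && s != NULL);
    n = snprintf(buf, size,
                 "pid=%u uid=%u auid=%u ses=%u comm=\"%s\" exe=\"%s\"",
                 s->pid, s->uid, s->auid, s->ses, s->comm, s->exe);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

For a user with auid 1000 who ran a script as root, this would produce a
string such as:
pid=4242 uid=0 auid=1000 ses=3 comm="backup.py" exe="/usr/bin/python3"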

The same kind of thought needs to go into recording the object.
Suppose the object is a file. Files are contained in one or more inodes on a
device. This means they have an inode number associated with them. However, a
user may use an editor to change the file, making a temporary copy during
editing. This temporary file replaces the original, which is deleted, so the
inode number associated with the file changes. So, we need to record
the path. But if the path could be relative, we also need to record the
current working directory. An inode can represent different kinds of file
objects (like a fifo, directory, socket, or a regular file), so that
information needs to be included. And in case there is a question about
whether access should have been allowed, we should also gather attributes such
as owner and access modes.
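
To make the list of object attributes concrete, the sketch below gathers them
for a file from user space with stat() and getcwd(). The function name is
hypothetical. The field names (cwd, name, inode, mode, ouid, ogid) are the
ones commonly seen in kernel CWD and PATH records, and printing the mode in
octal with the file type bits included is an assumption modeled on those
records. As in any illustration of this kind, the path and cwd strings are
assumed to be safe to quote; untrusted values need the encoding described
under Fields.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Describe the file at path as space-separated name=value fields.
 * Returns 0 on success, -1 on error or if the buffer is too small.
 */
int format_object(char *buf, size_t size, const char *path)
{
    struct stat st;
    char cwd[4096];
    int n;

    assert(buf != NULL && path != NULL);

    /* inode, type, owner, and access modes all come from stat() */
    if (stat(path, &st) != 0)
        return -1;
    /* a relative path is only meaningful together with the cwd */
    if (getcwd(cwd, sizeof(cwd)) == NULL)
        return -1;

    /* st_mode holds both the file type and the permission bits */
    n = snprintf(buf, size,
                 "cwd=\"%s\" name=\"%s\" inode=%lu mode=%#o ouid=%u ogid=%u",
                 cwd, path, (unsigned long)st.st_ino,
                 (unsigned int)st.st_mode,
                 (unsigned int)st.st_uid, (unsigned int)st.st_gid);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```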

The event writer should always consider whether enough information has been
recorded so that a security officer will later know what the event means.

Fields
------
An audit record is composed of multiple fields. Each field holds its
information as a name/value pair joined by an '=' (name=value). Fields are
separated from one another by a space.

The value recorded is typically numeric. No attempt should be made to interpret
the meaning of the value during the creation of the event. For example, if
uid=0 is being recorded, it is not necessary to say that it's the root account.
That can be looked up in post processing.

If the value side is not numeric and user space can influence the value
(such as file names, unauthenticated acct names, process names, etc.), or it
contains whitespace or control characters, then certain precautions need to
be taken. A clever user may try to trick naive parsing to pin blame on another
account or to make it look like something else was being accessed.

The established convention in this case is to scan the value string to see if
it has characters that have special meaning to the audit record parser. If it
does not, the value is enclosed in double quotes ("). If it contains a
character with meaning to the parser, then all characters in the value are
converted to a hex character encoding so that parsing the field is
unambiguous. This can also be used for recording data structures if it were
ever needed. Hex encoding doubles the number of bytes needed to represent the
value. This is only done when recording a non-numeric value that user space can
control.
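
The convention can be sketched as follows. This is only an illustration, not
the library implementation; real user space code should call a libaudit helper
such as audit_encode_nv_string(). The function names here are made up, and the
exact set of characters treated as special (double quote, space, control
characters, and anything outside printable ASCII) is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if value holds a character the parser could misread. */
int needs_encoding(const char *value)
{
    const unsigned char *p;

    for (p = (const unsigned char *)value; *p; p++)
        if (*p == '"' || *p < 0x21 || *p > 0x7e)
            return 1;
    return 0;
}

/*
 * Build name="value" when the value is safe, or name=<hex> otherwise.
 * Returns a malloc'ed string that the caller must free, or NULL.
 */
char *encode_field(const char *name, const char *value)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t nlen, vlen, i;
    char *out, *p;

    assert(name != NULL && value != NULL);
    nlen = strlen(name);
    vlen = strlen(value);

    /* large enough for either form: name, '=', 2 * value, quotes, NUL */
    out = malloc(nlen + 2 * vlen + 4);
    if (out == NULL)
        return NULL;

    if (!needs_encoding(value)) {
        sprintf(out, "%s=\"%s\"", name, value);
        return out;
    }

    /* every byte becomes two hex digits, with no quotes around them */
    p = out + sprintf(out, "%s=", name);
    for (i = 0; i < vlen; i++) {
        unsigned char c = (unsigned char)value[i];
        *p++ = hex[c >> 4];
        *p++ = hex[c & 0x0f];
    }
    *p = '\0';
    return out;
}
```

With this sketch, encode_field("acct", "root") yields acct="root", while
encode_field("acct", "bad user") yields acct=6261642075736572. The embedded
space can no longer be mistaken for a field separator.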

If the value side is text and needs more than one word to explain its meaning,
then you must "glue" the words together so that they make one word. Remember,
a space is the separator between fields, so using a space means the parser
will not pick up the words after it. Hex encoding them is also not preferred,
as this doubles the disk space needed for something that is entirely avoidable.
To "glue" the words together, it's recommended to use a hyphen.

Field Names
-----------
Field names in a record should be consistent so that the parser can make
sense of the value associated with a field. When writing events, always use
a known field name and don't make one up. If nothing fits, take a guess and
check with the linux-audit mailing list to see if it's acceptable.
Always make the field name entirely lower case. If you need to make the name
compound (for example, prefixed), use a hyphen to "glue" the two pieces
together. The value associated with the field needs to have the same
formatting as listed here, or translations of the values can have errors. The
following list enumerates known field names:

[ NOTE - This list is no longer maintained in this document. A short sample
  is listed below to illustrate the information collected. The full list can
  be retrieved as a csv file from
  http://people.redhat.com/sgrubb/audit/field-dictionary.csv ]

	a?         - numeric, the arguments to a syscall
	acct       - encoded, a user's account name
	addr       - the remote address that the user is connecting from
	arch       - numeric, the elf architecture flags
	argc       - numeric, the number of arguments to an execve syscall
	audit_backlog_limit - numeric, audit system's backlog queue size
	audit_enabled - numeric, audit system's enable/disable status
	audit_failure - numeric, audit system's failure mode
	auid       - numeric, login user id


Maintenance
-----------
Over time compliance regulations change, as do Common Criteria needs.
Generally, once you write an event, you should never alter it. If you must,
it's best to send an email to the linux-audit mailing list explaining the
change before implementing it. This allows people who might have analysis
programs to learn of the change or discuss options. Additionally, it may
become necessary to alter your event's formatting, field names, or field order
over time. So, please get review to make sure everyone agrees, and if asked to
make changes due to new requirements, please help out.

If you do make changes to your event, you should use the ausearch-test program
to make sure the new event is well formed. Passing the tests shows that the
event can still be searched, but you should also ask ausearch to interpret the
event to make sure any interpretation is what you expect. If not, then you have
probably used a pre-existing field name for a different purpose.

Logging Code Examples
---------------------
Kernel:

if (audit_enabled) {
    struct audit_buffer *ab;
    char comm[sizeof(current->comm)];
    uid_t loginuid = from_kuid(&init_user_ns, audit_get_loginuid(current));
    unsigned int sessionid = audit_get_sessionid(current);

    ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_KERNEL_OTHER);
    if (!ab)
        return;
    /* numeric fields are logged as-is */
    audit_log_format(ab, "auid=%u ses=%u", loginuid, sessionid);
    audit_log_task_context(ab);
    /* comm can be set by user space, so it must be encoded */
    audit_log_format(ab, " comm=");
    audit_log_untrustedstring(ab, get_task_comm(comm, current));
    audit_log_end(ab);
}

User space:

    char buf[4096], *acct;
    int fd = audit_open();
    if (fd < 0)
        return;
    // acct is an untrusted string and must be encoded
    acct = audit_encode_nv_string("acct", pamh->user, 0);
    if (acct) {
        snprintf(buf, sizeof(buf), "op=change-password sauid=%u %s",
                audit_getloginuid(), acct);
        // the last argument is the result: 1 for success, 0 for failure
        audit_log_user_message(fd, AUDIT_USER_CHAUTHTOK, buf, NULL, NULL,
                NULL, 0);
        free(acct);
    }
    audit_close(fd);