Container Security


Eric Paris

Architect OpenShift

Container Security

As explained by the three pigs

Glossary
Pig == Application

Chapter 1

Where should the pigs live?

When should I use containers versus virtual machines?

Standalone Homes
(Separate Physical Machines)

Duplex Home
(Virtual Machines)

Hostel
(Services Same Machine)

Park
(setenforce 0)

Apartment Building
(Containers)

Pigs in Apartment Buildings

Best combination of resource sharing,
ease of maintenance, and security.

Chapter 2

What kind of
apartment building?

What platform should host your containers?

Straw?

Running containers on a do-it-yourself platform.

Sticks?

Running containers on a community platform.

Brick?

Running containers on RHEL and OpenShift.

Chapter 3

How do I separate/secure pig apartments?

How do you ensure container separation?

Containers do not Contain

http://www.maritimenz.govt.nz/images/Incident-area/Rena7.jpg

Do you care?

Treat Container Services just like
regular services

Drop privileges as quickly as possible

Run your services as non-root whenever possible

Treat root within a container the same as root outside of the container
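
The advice on dropping privileges can be sketched in Python. This is a minimal sketch, assuming a service that starts as root only for privileged setup and then switches to an unprivileged account. The function name `drop_privileges` and the UID/GID 65534 mentioned below are illustrative assumptions, not something taken from the talk.

```python
import os


def drop_privileges(uid: int, gid: int) -> bool:
    """Permanently switch to an unprivileged uid/gid.

    Returns True if privileges were dropped, False if the process
    was already running without root.
    """
    if os.geteuid() != 0:
        return False  # nothing to drop

    os.setgroups([])  # clear supplementary groups inherited from root
    os.setgid(gid)    # change group first; after setuid() this would fail
    os.setuid(uid)    # as root this sets the real, effective and saved UID

    # Sanity check: regaining root must no longer be possible.
    try:
        os.setuid(0)
    except PermissionError:
        return True
    raise RuntimeError("privileges were not dropped permanently")
```

A service would call something like `drop_privileges(65534, 65534)` right after its privileged setup, before it handles any untrusted input. The order matters: groups first, user last.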

"Docker is about running random crap from the internet as root on your host"

Only run containers from trusted parties

Why don't containers contain?

Not everything in Linux is namespaced

Containers are not comprehensive like virtual machines (kvm)

Kernel file systems: /sys, /sys/fs, /proc/sys

Cgroups, SELinux, /dev/mem, kernel modules

But Eric

Our engineers say their
applications need to run as root

Protecting Host Kernel
from processes within containers

Protecting Kernel file systems

Protect Kernel file systems: /sys, /sys/fs, /proc/sys

Read Only Mount Points

Mask Out Kernel file systems
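
One way to check the read-only protection from inside a container is to look at the mount table. The sketch below is an assumption-laden illustration: it parses text in the format of /proc/self/mounts and reports mounts at or below /sys and /proc/sys that are still read/write. The helper name `writable_kernel_mounts` is hypothetical.

```python
PROTECTED = ("/sys", "/proc/sys")  # kernel file systems named on the slide


def writable_kernel_mounts(mounts_text: str, protected=PROTECTED) -> list[str]:
    """Return mount points at or below `protected` that are mounted read/write."""
    writable = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        under_protected = any(
            mount_point == p or mount_point.startswith(p + "/") for p in protected
        )
        if under_protected and "rw" in options:
            writable.append(mount_point)
    return writable
```

On a Linux system the input would be the contents of /proc/self/mounts. The sketch only sees separate mounts; a path without its own mount entry inherits the flags of its parent mount.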

Limiting the power of root

Stop root from remounting file systems as read/write

Capabilities

man capabilities

DESCRIPTION
     For  the  purpose  of  performing  permission  checks, traditional UNIX
     implementations distinguish two  categories  of  processes:  privileged
     processes  (whose  effective  user ID is 0, referred to as superuser or
     root), and unprivileged processes (whose  effective  UID  is  nonzero).
     Privileged processes bypass all kernel permission checks, while 
     unprivileged processes are subject to full permission checking based on
     the process's credentials (usually: effective UID, effective GID, and 
     supplementary group list).

     Starting with kernel 2.2, Linux divides  the  privileges  traditionally
     associated  with  superuser into distinct units, known as capabilities,
     which can be independently enabled and disabled.   Capabilities  are  a
     per-thread attribute.

Capabilities Removed

CAP_SETPCAP          Modify process capabilities
CAP_SYS_MODULE       Insert/remove kernel modules
CAP_SYS_RAWIO        Modify kernel memory
CAP_SYS_PACCT        Configure process accounting
CAP_SYS_NICE         Modify priority of processes
CAP_SYS_RESOURCE     Override resource limits
CAP_SYS_TIME         Modify the system clock
CAP_SYS_TTY_CONFIG   Configure tty devices
CAP_AUDIT_WRITE      Write the audit log
CAP_AUDIT_CONTROL    Configure audit subsystem
CAP_MAC_OVERRIDE     Ignore kernel MAC policy
CAP_MAC_ADMIN        Configure MAC configuration
CAP_SYSLOG           Modify kernel printk behavior

CAP_NET_ADMIN        Configure the network
CAP_SYS_ADMIN        Catch-all
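
To see whether a process still holds any of the capabilities listed as removed, the effective capability mask can be decoded. This is a minimal sketch: the bit numbers follow linux/capability.h, while the helper names `removed_caps_still_held` and `parse_cap_eff` are hypothetical.

```python
# Bit numbers from linux/capability.h for the capabilities
# the slides list as removed.
REMOVED_CAPS = {
    "CAP_SETPCAP": 8,
    "CAP_NET_ADMIN": 12,
    "CAP_SYS_MODULE": 16,
    "CAP_SYS_RAWIO": 17,
    "CAP_SYS_PACCT": 20,
    "CAP_SYS_ADMIN": 21,
    "CAP_SYS_NICE": 23,
    "CAP_SYS_RESOURCE": 24,
    "CAP_SYS_TIME": 25,
    "CAP_SYS_TTY_CONFIG": 26,
    "CAP_AUDIT_WRITE": 29,
    "CAP_AUDIT_CONTROL": 30,
    "CAP_MAC_OVERRIDE": 32,
    "CAP_MAC_ADMIN": 33,
    "CAP_SYSLOG": 34,
}


def removed_caps_still_held(mask: int) -> list[str]:
    """Names of 'removed' capabilities whose bit is set in the mask."""
    return sorted(name for name, bit in REMOVED_CAPS.items() if mask & (1 << bit))


def parse_cap_eff(status_text: str) -> int:
    """Extract the CapEff hex mask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")
```

Feeding the contents of /proc/self/status through `parse_cap_eff` and then `removed_caps_still_held` inside a container should return an empty list if the runtime dropped everything on the list.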

SYS_ADMIN

less /usr/include/linux/capability.h 
...
/* Allow configuration of the secure attention key */
/* Allow administration of the random device */
/* Allow examination and configuration of disk quotas */
/* Allow setting the domainname */
/* Allow setting the hostname */
/* Allow calling bdflush() */
/* Allow mount() and umount(), setting up new smb connection */
/* Allow some autofs root ioctls */
/* Allow nfsservctl */
/* Allow VM86_REQUEST_IRQ */
/* Allow to read/write pci config on alpha */
/* Allow irix_prctl on mips (setstacksize) */
/* Allow flushing all cache on m68k (sys_cacheflush) */
/* Allow removing semaphores */
/* Used instead of CAP_CHOWN to "chown" IPC message queues, semaphores
   and shared memory */
/* Allow locking/unlocking of shared memory segment */
/* Allow turning swap on/off */
/* Allow forged pids on socket credentials passing */
/* Allow setting readahead and flushing buffers on block devices */

/* Allow setting geometry in floppy driver */
/* Allow turning DMA on/off in xd driver */
/* Allow administration of md devices (mostly the above, but some
   extra ioctls) */
/* Allow tuning the ide driver */
/* Allow access to the nvram device */
/* Allow administration of apm_bios, serial and bttv (TV) device */
/* Allow manufacturer commands in isdn CAPI support driver */
/* Allow reading non-standardized portions of pci configuration space */
/* Allow DDI debug ioctl on sbpcd driver */
/* Allow setting up serial ports */
/* Allow sending raw qic-117 commands */
/* Allow enabling/disabling tagged queuing on SCSI controllers and sending
   arbitrary SCSI commands */
/* Allow setting encryption key on loopback filesystem */
/* Allow setting zone reclaim policy */

CAP_SYS_ADMIN

That's not all folks.

$ git grep 'capable(CAP_SYS_ADMIN)' | wc -l
533
		 

Limiting a process's view of the operating system

Namespaces

PID namespace

Network namespace
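
Which namespaces a process lives in is visible under /proc/<pid>/ns, where each entry is a symbolic link such as pid:[4026531836]. The following sketch reads those links; two processes share a namespace when the inode numbers match. The function names are hypothetical and the code assumes a Linux /proc file system.

```python
import os
import re

NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target: str) -> tuple[str, int]:
    """Split a link target such as 'pid:[4026531836]' into (type, inode)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def namespaces_of(pid: str = "self") -> dict[str, int]:
    """Read /proc/<pid>/ns and return {entry name: namespace inode}."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for entry in os.listdir(ns_dir):
        _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        result[entry] = inode
    return result
```

Comparing `namespaces_of("self")["pid"]` with the value for a process on the host shows whether both sit in the same PID namespace.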

Controlling interaction with Device nodes

Cgroups

Device Cgroup

Controls which device nodes can be created within a namespace

Device nodes allow processes to configure the kernel

/dev/console   /dev/zero   /dev/null      /dev/fuse
/dev/full      /dev/tty*   /dev/urandom   /dev/random

Images are mounted with nodev
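
The allowlist idea behind the device cgroup can be modeled in a few lines. This is only a toy model of the cgroup v1 `devices.allow` rule syntax ("TYPE MAJOR:MINOR ACCESS"), not the kernel's implementation; the function names and the `ALLOW` list are assumptions made for illustration. The device numbers are the conventional Linux ones.

```python
def rule_matches(rule: str, dev_type: str, major: int, minor: int, access: str) -> bool:
    """Check one 'TYPE MAJOR:MINOR ACCESS' rule, e.g. 'c 1:3 rwm'.

    TYPE is 'c' (char), 'b' (block) or 'a' (all); MAJOR and MINOR may be '*';
    ACCESS is a subset of 'r' (read), 'w' (write), 'm' (mknod).
    """
    r_type, r_dev, r_access = rule.split()
    r_major, r_minor = r_dev.split(":")
    return (
        r_type in ("a", dev_type)
        and r_major in ("*", str(major))
        and r_minor in ("*", str(minor))
        and set(access) <= set(r_access)
    )


def is_allowed(allow_rules: list[str], dev_type: str, major: int, minor: int, access: str) -> bool:
    """Default deny: access is granted only if some allow rule matches."""
    return any(rule_matches(r, dev_type, major, minor, access) for r in allow_rules)


# Allowlist covering a few of the device nodes from the slide.
ALLOW = [
    "c 1:3 rwm",  # /dev/null
    "c 1:5 rwm",  # /dev/zero
    "c 1:7 rwm",  # /dev/full
    "c 1:8 rwm",  # /dev/random
    "c 1:9 rwm",  # /dev/urandom
    "c 5:1 rwm",  # /dev/console
]
```

With this list, `is_allowed(ALLOW, "c", 1, 3, "rw")` succeeds for /dev/null, while a request for /dev/mem (character device 1:1) is refused because no rule matches it.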

Protecting the
host file system

SELinux

Everyone Please Recite With Me

SELinux is a LABELING system

Every Process has a LABEL

Every File, Directory, System object has a LABEL

Policy rules control access between labeled processes and labeled objects

The Kernel enforces the rules

Type Enforcement

Protects the host system from container processes

Container processes can only read/execute /usr files

Container processes can only write to container files.

process type: container_t
file type:    container_file_t
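
Type enforcement boils down to a lookup: an access is allowed only if a rule names the process type, the object type, the object class and the permission. The toy model below illustrates that default-deny lookup. It is not real SELinux policy; the rule set is made up for illustration, and `usr_t` is assumed here as the type of files under /usr.

```python
# Toy allow rules: (process type, object type, object class, permission).
# Anything not listed is denied, mirroring how type enforcement behaves.
ALLOW_RULES = {
    ("container_t", "usr_t", "file", "read"),
    ("container_t", "usr_t", "file", "execute"),
    ("container_t", "container_file_t", "file", "read"),
    ("container_t", "container_file_t", "file", "write"),
}


def te_allows(process_type: str, object_type: str, object_class: str, permission: str) -> bool:
    """Return True only if an explicit allow rule exists."""
    return (process_type, object_type, object_class, permission) in ALLOW_RULES
```

In this model a `container_t` process may write `container_file_t` files but not `usr_t` files, and any type that has no rule at all is out of reach.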

MCS Enforcement

Multi Category Security

Based on Multi Level Security (MLS)

Protects containers from each other.

Container Processes can only read/write their own files.

The container runtime picks a unique random MCS label, e.g. s0:c1,c2

Assigns the MCS label to all content

Launches the container processes with the same label
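
The MCS check can be illustrated with a simplified model: a process may access a file when the process's categories include every category on the file. The sketch below assumes labels of the form s0:c1,c2, assumes 1024 categories (c0 to c1023), and uses hypothetical function names; it ignores type enforcement, which still applies on top.

```python
import random


def parse_mcs(level: str) -> tuple[str, frozenset[int]]:
    """Parse 's0:c1,c2' into ('s0', {1, 2}). Ranges like 'c0.c3' are expanded."""
    sensitivity, _, cats = level.partition(":")
    categories = set()
    for part in filter(None, cats.split(",")):
        if "." in part:
            low, high = part.split(".")
            categories.update(range(int(low[1:]), int(high[1:]) + 1))
        else:
            categories.add(int(part[1:]))
    return sensitivity, frozenset(categories)


def mcs_allows(process_level: str, file_level: str) -> bool:
    """Access needs the process categories to include every file category."""
    _, proc_cats = parse_mcs(process_level)
    _, file_cats = parse_mcs(file_level)
    return file_cats <= proc_cats


def random_mcs_level(rng: random.Random, max_category: int = 1023) -> str:
    """Pick two distinct random categories for a new container."""
    a, b = sorted(rng.sample(range(max_category + 1), 2))
    return f"s0:c{a},c{b}"
```

Two containers labeled s0:c1,c2 and s0:c3,c4 cannot touch each other's files in this model, even though both run with the same process type.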

Limiting the syscall attack surface on the kernel

seccomp

Shrink the attack surface on the kernel

Eliminate syscalls
kexec_load, open_by_handle_at, init_module, finit_module, delete_module, iopl, ioperm, swapon, swapoff, sysfs, sysctl, adjtimex, clock_adjtime, lookup_dcookie, perf_event_open, fanotify_init, kcmp

block 32 bit syscalls

block old weird networks
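
A syscall filter is usually handed to the runtime as a JSON profile. The sketch below builds a small denylist profile from the syscalls on the slide, using the `defaultAction` / `syscalls` layout of Docker-style seccomp profiles. It is an illustration under assumptions: default profiles shipped by runtimes are typically allowlists, and `build_profile` is a hypothetical helper.

```python
import json

# Syscall names copied from the slide. Runtimes commonly spell the
# "sysctl" entry as "_sysctl"; it is kept here as it appears in the talk.
BLOCKED = [
    "kexec_load", "open_by_handle_at", "init_module", "finit_module",
    "delete_module", "iopl", "ioperm", "swapon", "swapoff", "sysfs",
    "sysctl", "adjtimex", "clock_adjtime", "lookup_dcookie",
    "perf_event_open", "fanotify_init", "kcmp",
]


def build_profile(blocked: list[str]) -> dict:
    """Allow everything by default, return an error for the blocked calls."""
    return {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [{"names": sorted(blocked), "action": "SCMP_ACT_ERRNO"}],
    }


profile_json = json.dumps(build_profile(BLOCKED), indent=2)
```

Written to a file, a profile of this shape can be passed to a runtime with an option such as `--security-opt seccomp=profile.json`.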

#nobigfatdaemons.

CRI-O https://cri-o.io

Dedicated small daemon for running containers under Kubernetes

Buildah https://buildah.io

Dedicated tool for building container images

Podman https://podman.io

Replacement CLI for Docker
Run/develop containers as non root

Security Context Constraints

$ oc describe scc restricted
Name:                            restricted
Priority:
Access:
  Users:
  Groups:                        system:authenticated
Settings:
  Allow Privileged:              false
  Default Add Capabilities:
  Required Drop Capabilities:    KILL,MKNOD,SYS_CHROOT,SETUID,SETGID
  Allowed Capabilities:
  Allowed Volume Types:          configMap,downwardAPI,emptyDir,persistentVolumeClaim,secret
  Allow Host Network:            false
  Allow Host Ports:              false
  Allow Host PID:                false
  Allow Host IPC:                false
  Read Only Root Filesystem:     false
  Run As User Strategy:          MustRunAsRange
    UID:
    UID Range Min:
    UID Range Max:
  SELinux Context Strategy:      MustRunAs
    User:
    Role:
    Type:
    Level:
  FSGroup Strategy:              MustRunAs
    Ranges:
  Supplemental Groups Strategy:  RunAsAny
    Ranges:
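
The settings in this output can be read as a checklist. The sketch below checks a container description against a few of them. It is a simplified model, not how the OpenShift admission controller works, and the dictionary keys (`privileged`, `host_network`, `drop_capabilities`, `add_capabilities`, `volume_types`) are hypothetical names chosen for the example.

```python
# Values copied from the `oc describe scc restricted` output.
REQUIRED_DROP = {"KILL", "MKNOD", "SYS_CHROOT", "SETUID", "SETGID"}
ALLOWED_VOLUMES = {"configMap", "downwardAPI", "emptyDir",
                   "persistentVolumeClaim", "secret"}


def violations(container: dict) -> list[str]:
    """List the ways a (hypothetical) container description breaks the SCC."""
    problems = []
    if container.get("privileged", False):
        problems.append("privileged containers are not allowed")
    if container.get("host_network", False):
        problems.append("host network is not allowed")
    missing = REQUIRED_DROP - set(container.get("drop_capabilities", []))
    if missing:
        problems.append("must drop: " + ",".join(sorted(missing)))
    if container.get("add_capabilities"):
        problems.append("no capabilities may be added")
    bad_volumes = set(container.get("volume_types", [])) - ALLOWED_VOLUMES
    if bad_volumes:
        problems.append("volume types not allowed: " + ",".join(sorted(bad_volumes)))
    return problems
```

An empty result means the description passes these particular checks; the UID range, SELinux context and group strategies are left out of the sketch.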

Questions?

Chapter 4

How do you furnish the pig's apartment?

How do I secure content inside a container?

LINUX 1999

Where did you go to get software?

Go to yahoo.com or AltaVista.com
and google it?

I found it on rpmfind.net, downloaded it, and installed it.

Hey, I hear there is a big security vulnerability in zlib.

How many copies of the zlib vulnerability do you have?

I have no clue!!!

Red Hat to the rescue

Red Hat Enterprise Linux solved this problem

Certified software and hardware platforms

People have no idea of the quality of the software in container images

Or they are building them themselves?

Let's Talk About DevOps

Containers move the responsibility for security updates from the Operator to the Developer.

Do you trust developers to
fix security issues in their images?

What happens when the next Shellshock hits?

RHEL Certified Images

Who maintains your container environment?

Community Standards?

You need Control

Don't let this be you.

Questions?