From: Mikulas Patocka This device-mapper target creates a read-only device that transparently validates the data on one underlying device against a pre-generated tree of cryptographic checksums stored on a second device. Two checksum device formats are supported: version 0 which is already shipping in Chromium OS and version 1 which incorporates some improvements. Signed-off-by: Mikulas Patocka Signed-off-by: Mandeep Singh Baines Signed-off-by: Will Drewry Signed-off-by: Elly Jones Cc: Milan Broz Cc: Olof Johansson Cc: Steffen Klassert Cc: Andrew Morton Signed-off-by: Alasdair G Kergon FIXME Add simple veritytool examples FIXME Confirm veritytool location once uploaded --- Documentation/device-mapper/verity.txt | 194 +++++++ drivers/md/Kconfig | 20 drivers/md/Makefile | 1 drivers/md/dm-verity.c | 913 +++++++++++++++++++++++++++++++++ 4 files changed, 1128 insertions(+) Index: linux-3.3/Documentation/device-mapper/verity.txt =================================================================== --- /dev/null +++ linux-3.3/Documentation/device-mapper/verity.txt @@ -0,0 +1,194 @@ +dm-verity +========== + +Device-Mapper's "verity" target provides transparent integrity checking of +block devices using a cryptographic digest provided by the kernel crypto API. +This target is read-only. + +Construction Parameters +======================= + + + + + + + This is the version number of the on-disk format. + + 0 is the original format used in the Chromium OS. + The salt is appended when hashing, digests are stored continuously and + the rest of the block is padded with zeros. + + 1 is the current format that should be used for new devices. + The salt is prepended when hashing and each digest is + padded with zeros to the power of two. + + + This is the device containing the data the integrity of which needs to be + checked. It may be specified as a path, like /dev/sdaX, or a device number, + :. + + + This is the device that that supplies the hash tree data. It may be + specified similarly to the device path and may be the same device. If the + same device is used, the hash_start should be outside of the dm-verity + configured device size. + + + The block size on a data device. Each block corresponds to one digest on + the hash device. + + + The size of a hash block. + + + The number of data blocks on the data device. Additional blocks are + inaccessible. You can place hashes to the same partition as data, in this + case hashes are placed after . + + + This is the offset, in -blocks, from the start of hash_dev + to the root block of the hash tree. + + + The cryptographic hash algorithm used for this device. This should + be the name of the algorithm, like "sha1". + + + The hexadecimal encoding of the cryptographic hash of the root hash block + and the salt. This hash should be trusted as there is no other authenticity + beyond this point. + + + The hexadecimal encoding of the salt value. + +Theory of operation +=================== + +dm-verity is meant to be setup as part of a verified boot path. This +may be anything ranging from a boot using tboot or trustedgrub to just +booting from a known-good device (like a USB drive or CD). + +When a dm-verity device is configured, it is expected that the caller +has been authenticated in some way (cryptographic signatures, etc). +After instantiation, all hashes will be verified on-demand during +disk access. If they cannot be verified up to the root node of the +tree, the root hash, then the I/O will fail. This should identify +tampering with any data on the device and the hash data. + +Cryptographic hashes are used to assert the integrity of the device on a +per-block basis. This allows for a lightweight hash computation on first read +into the page cache. Block hashes are stored linearly-aligned to the nearest +block the size of a page. + +Hash Tree +--------- + +Each node in the tree is a cryptographic hash. If it is a leaf node, the hash +is of some block data on disk. If it is an intermediary node, then the hash is +of a number of child nodes. + +Each entry in the tree is a collection of neighboring nodes that fit in one +block. The number is determined based on block_size and the size of the +selected cryptographic digest algorithm. The hashes are linearly-ordered in +this entry and any unaligned trailing space is ignored but included when +calculating the parent node. + +The tree looks something like: + +alg = sha256, num_blocks = 32768, block_size = 4096 + + [ root ] + / . . . \ + [entry_0] [entry_1] + / . . . \ . . . \ + [entry_0_0] . . . [entry_0_127] . . . . [entry_1_127] + / ... \ / . . . \ / \ + blk_0 ... blk_127 blk_16256 blk_16383 blk_32640 . . . blk_32767 + + +On-disk format +============== + +Below is the recommended on-disk format. The verity kernel code does not +read the on-disk header. It only reads the hash blocks which directly +follow the header. It is expected that a user-space tool will verify the +integrity of the verity_header and then call dmsetup with the correct +parameters. Alternatively, the header can be omitted and the dmsetup +parameters can be passed via the kernel command-line in a rooted chain +of trust where the command-line is verified. + +The on-disk format is especially useful in cases where the hash blocks +are on a separate partition. The magic number allows easy identification +of the partition contents. Alternatively, the hash blocks can be stored +in the same partition as the data to be verified. In such a configuration +the filesystem on the partition would be sized a little smaller than +the full-partition, leaving room for the hash blocks. + +struct superblock { + uint8_t signature[8] + "verity\0\0"; + + uint8_t version; + 1 - current format + + uint8_t data_block_bits; + log2(data block size) + + uint8_t hash_block_bits; + log2(hash block size) + + uint8_t pad1[1]; + zero padding + + uint16_t salt_size; + big-endian salt size + + uint8_t pad2[2]; + zero padding + + uint32_t data_blocks_hi; + big-endian high 32 bits of the 64-bit number of data blocks + + uint32_t data_blocks_lo; + big-endian low 32 bits of the 64-bit number of data blocks + + uint8_t algorithm[16]; + cryptographic algorithm + + uint8_t salt[384]; + salt (the salt size is specified above) + + uint8_t pad3[88]; + zero padding to 512-byte boundary +} + +Directly following the header (and with sector number padded to the next hash +block boundary) are the hash blocks which are stored a depth at a time +(starting from the root), sorted in order of increasing index. + +Status +====== +V (for Valid) is returned if every check performed so far was valid. +If any check failed, C (for Corruption) is returned. + +Example +======= + +Setup a device: + dmsetup create vroot --table \ + "0 2097152 "\ + "verity 1 /dev/sda1 /dev/sda2 4096 4096 2097152 1 "\ + "4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 "\ + "1234000000000000000000000000000000000000000000000000000000000000" + +A command line tool veritysetup is available to compute or verify +the hash tree or activate the kernel driver. This is available from +the LVM2 upstream repository and may be supplied as a package called +device-mapper-verity-tools: + git://sources.redhat.com/git/lvm2 + http://sourceware.org/git/?p=lvm2.git + http://sourceware.org/cgi-bin/cvsweb.cgi/LVM2/verity?cvsroot=lvm2 + +veritysetup -a vroot /dev/sda1 /dev/sda2 \ + 4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 Index: linux-3.3/drivers/md/Kconfig =================================================================== --- linux-3.3.orig/drivers/md/Kconfig +++ linux-3.3/drivers/md/Kconfig @@ -370,4 +370,24 @@ config DM_FLAKEY ---help--- A target that intermittently fails I/O for debugging purposes. +config DM_VERITY + tristate "Verity target support (EXPERIMENTAL)" + depends on BLK_DEV_DM && EXPERIMENTAL + select CRYPTO + select CRYPTO_HASH + select DM_BUFIO + ---help--- + This device-mapper target creates a read-only device that + transparently validates the data on one underlying device against + a pre-generated tree of cryptographic checksums stored on a second + device. + + You'll need to activate the digests you're going to use in the + cryptoapi configuration. + + To compile this code as a module, choose M here: the module will + be called dm-verity. + + If unsure, say N. + endif # MD Index: linux-3.3/drivers/md/Makefile =================================================================== --- linux-3.3.orig/drivers/md/Makefile +++ linux-3.3/drivers/md/Makefile @@ -42,6 +42,7 @@ obj-$(CONFIG_DM_LOG_USERSPACE) += dm-log obj-$(CONFIG_DM_ZERO) += dm-zero.o obj-$(CONFIG_DM_RAID) += dm-raid.o obj-$(CONFIG_DM_THIN_PROVISIONING) += dm-thin-pool.o +obj-$(CONFIG_DM_VERITY) += dm-verity.o ifeq ($(CONFIG_DM_UEVENT),y) dm-mod-objs += dm-uevent.o Index: linux-3.3/drivers/md/dm-verity.c =================================================================== --- /dev/null +++ linux-3.3/drivers/md/dm-verity.c @@ -0,0 +1,913 @@ +/* + * Copyright (C) 2012 Red Hat, Inc. + * + * Author: Mikulas Patocka + * + * Based on Chromium dm-verity driver (C) 2011 The Chromium OS Authors + * + * This file is released under the GPLv2. + * + * In the file "/sys/module/dm_verity/parameters/prefetch_cluster" you can set + * default prefetch value. Data are read in "prefetch_cluster" chunks from the + * hash device. Setting this greatly improves performance when data and hash + * are on the same disk on different partitions on devices with poor random + * access behavior. + */ + +#include "dm-bufio.h" + +#include +#include +#include + +#define DM_MSG_PREFIX "verity" + +#define DM_VERITY_IO_VEC_INLINE 16 +#define DM_VERITY_MEMPOOL_SIZE 4 +#define DM_VERITY_DEFAULT_PREFETCH_SIZE 262144 + +#define DM_VERITY_MAX_LEVELS 63 + +static unsigned dm_verity_prefetch_cluster = DM_VERITY_DEFAULT_PREFETCH_SIZE; + +module_param_named(prefetch_cluster, dm_verity_prefetch_cluster, uint, S_IRUGO | S_IWUSR); + +struct dm_verity { + struct dm_dev *data_dev; + struct dm_dev *hash_dev; + struct dm_target *ti; + struct dm_bufio_client *bufio; + char *alg_name; + struct crypto_shash *tfm; + u8 *root_digest; /* digest of the root block */ + u8 *salt; /* salt: its size is salt_size */ + unsigned salt_size; + sector_t data_start; /* data offset in 512-byte sectors */ + sector_t hash_start; /* hash start in blocks */ + sector_t data_blocks; /* the number of data blocks */ + sector_t hash_blocks; /* the number of hash blocks */ + unsigned char data_dev_block_bits; /* log2(data blocksize) */ + unsigned char hash_dev_block_bits; /* log2(hash blocksize) */ + unsigned char hash_per_block_bits; /* log2(hashes in hash block) */ + unsigned char levels; /* the number of tree levels */ + unsigned char version; + unsigned digest_size; /* digest size for the current hash algorithm */ + unsigned shash_descsize;/* the size of temporary space for crypto */ + int hash_failed; /* set to 1 if hash of any block failed */ + + mempool_t *io_mempool; /* mempool of struct dm_verity_io */ + mempool_t *vec_mempool; /* mempool of bio vector */ + + struct workqueue_struct *verify_wq; + + /* starting blocks for each tree level. 0 is the lowest level. */ + sector_t hash_level_block[DM_VERITY_MAX_LEVELS]; +}; + +struct dm_verity_io { + struct dm_verity *v; + struct bio *bio; + + /* original values of bio->bi_end_io and bio->bi_private */ + bio_end_io_t *orig_bi_end_io; + void *orig_bi_private; + + sector_t block; + unsigned n_blocks; + + /* saved bio vector */ + struct bio_vec *io_vec; + unsigned io_vec_size; + + struct work_struct work; + + /* A space for short vectors; longer vectors are allocated separately. */ + struct bio_vec io_vec_inline[DM_VERITY_IO_VEC_INLINE]; + + /* + * Three variably-size fields follow this struct: + * + * u8 hash_desc[v->shash_descsize]; + * u8 real_digest[v->digest_size]; + * u8 want_digest[v->digest_size]; + * + * To access them use: io_hash_desc(), io_real_digest() and io_want_digest(). + */ +}; + +static struct shash_desc *io_hash_desc(struct dm_verity *v, struct dm_verity_io *io) +{ + return (struct shash_desc *)(io + 1); +} + +static u8 *io_real_digest(struct dm_verity *v, struct dm_verity_io *io) +{ + return (u8 *)(io + 1) + v->shash_descsize; +} + +static u8 *io_want_digest(struct dm_verity *v, struct dm_verity_io *io) +{ + return (u8 *)(io + 1) + v->shash_descsize + v->digest_size; +} + +/* + * Auxiliary structure appended to each dm-bufio buffer. If the value + * hash_verified is nonzero, hash of the block has been verified. + * + * The variable hash_verified is set to 0 when allocating the buffer, then + * it can be changed to 1 and it is never reset to 0 again. + * + * There is no lock around this value, a race condition can at worst cause + * that multiple processes verify the hash of the same buffer simultaneously + * and write 1 to hash_verified simultaneously. + * This condition is harmless, so we don't need locking. + */ +struct buffer_aux { + int hash_verified; +}; + +/* + * Initialize struct buffer_aux for a freshly created buffer. + */ +static void dm_bufio_alloc_callback(struct dm_buffer *buf) +{ + struct buffer_aux *aux = dm_bufio_get_aux_data(buf); + + aux->hash_verified = 0; +} + +/* + * Translate input sector number to the sector number on the target device. + */ +static sector_t verity_map_sector(struct dm_verity *v, sector_t bi_sector) +{ + return v->data_start + dm_target_offset(v->ti, bi_sector); +} + +/* + * Return hash position of a specified block at a specified tree level + * (0 is the lowest level). + * The lowest "hash_per_block_bits"-bits of the result denote hash position + * inside a hash block. The remaining bits denote location of the hash block. + */ +static sector_t verity_position_at_level(struct dm_verity *v, sector_t block, + int level) +{ + return block >> (level * v->hash_per_block_bits); +} + +static void verity_hash_at_level(struct dm_verity *v, sector_t block, int level, + sector_t *hash_block, unsigned *offset) +{ + sector_t position = verity_position_at_level(v, block, level); + unsigned idx; + + *hash_block = v->hash_level_block[level] + (position >> v->hash_per_block_bits); + + if (!offset) + return; + + idx = position & ((1 << v->hash_per_block_bits) - 1); + if (!v->version) + *offset = idx * v->digest_size; + else + *offset = idx << (v->hash_dev_block_bits - v->hash_per_block_bits); +} + +/* + * Verify hash of a metadata block pertaining to the specified data block + * ("block" argument) at a specified level ("level" argument). + * + * On successful return, io_want_digest(v, io) contains the hash value for + * a lower tree level or for the data block (if we're at the lowest leve). + * + * If "skip_unverified" is true, unverified buffer is skipped and 1 is returned. + * If "skip_unverified" is false, unverified buffer is hashed and verified + * against current value of io_want_digest(v, io). + */ +static int verity_verify_level(struct dm_verity_io *io, sector_t block, + int level, bool skip_unverified) +{ + struct dm_verity *v = io->v; + struct dm_buffer *buf; + struct buffer_aux *aux; + u8 *data; + int r; + sector_t hash_block; + unsigned offset; + + verity_hash_at_level(v, block, level, &hash_block, &offset); + + data = dm_bufio_read(v->bufio, hash_block, &buf); + if (unlikely(IS_ERR(data))) + return PTR_ERR(data); + + aux = dm_bufio_get_aux_data(buf); + + if (!aux->hash_verified) { + struct shash_desc *desc; + u8 *result; + + if (skip_unverified) { + r = 1; + goto release_ret_r; + } + + desc = io_hash_desc(v, io); + desc->tfm = v->tfm; + desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP; + r = crypto_shash_init(desc); + if (r < 0) { + DMERR("crypto_shash_init failed: %d", r); + goto release_ret_r; + } + + if (likely(v->version >= 1)) { + r = crypto_shash_update(desc, v->salt, v->salt_size); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + goto release_ret_r; + } + } + + r = crypto_shash_update(desc, data, 1 << v->hash_dev_block_bits); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + goto release_ret_r; + } + + if (!v->version) { + r = crypto_shash_update(desc, v->salt, v->salt_size); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + goto release_ret_r; + } + } + + result = io_real_digest(v, io); + r = crypto_shash_final(desc, result); + if (r < 0) { + DMERR("crypto_shash_final failed: %d", r); + goto release_ret_r; + } + if (unlikely(memcmp(result, io_want_digest(v, io), v->digest_size))) { + DMERR_LIMIT("metadata block %llu is corrupted", + (unsigned long long)hash_block); + v->hash_failed = 1; + r = -EIO; + goto release_ret_r; + } else + aux->hash_verified = 1; + } + + data += offset; + + memcpy(io_want_digest(v, io), data, v->digest_size); + + dm_bufio_release(buf); + return 0; + +release_ret_r: + dm_bufio_release(buf); + + return r; +} + +/* + * Verify one "dm_verity_io" structure. + */ +static int verity_verify_io(struct dm_verity_io *io) +{ + struct dm_verity *v = io->v; + unsigned b; + int i; + unsigned vector = 0, offset = 0; + + for (b = 0; b < io->n_blocks; b++) { + struct shash_desc *desc; + u8 *result; + int r; + unsigned todo; + + if (likely(v->levels)) { + /* + * First, we try to get the requested hash for + * the current block. If the hash block itself is + * verified, zero is returned. If it isn't, this + * function returns 0 and we fall back to whole + * chain verification. + */ + int r = verity_verify_level(io, io->block + b, 0, true); + if (likely(!r)) + goto test_block_hash; + if (r < 0) + return r; + } + + memcpy(io_want_digest(v, io), v->root_digest, v->digest_size); + + for (i = v->levels - 1; i >= 0; i--) { + int r = verity_verify_level(io, io->block + b, i, false); + if (unlikely(r)) + return r; + } + +test_block_hash: + desc = io_hash_desc(v, io); + desc->tfm = v->tfm; + desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP; + r = crypto_shash_init(desc); + if (r < 0) { + DMERR("crypto_shash_init failed: %d", r); + return r; + } + + if (likely(v->version >= 1)) { + r = crypto_shash_update(desc, v->salt, v->salt_size); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + return r; + } + } + + todo = 1 << v->data_dev_block_bits; + do { + struct bio_vec *bv; + u8 *page; + unsigned len; + + BUG_ON(vector >= io->io_vec_size); + bv = &io->io_vec[vector]; + page = kmap_atomic(bv->bv_page); + len = bv->bv_len - offset; + if (likely(len >= todo)) + len = todo; + r = crypto_shash_update(desc, + page + bv->bv_offset + offset, len); + kunmap_atomic(page); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + return r; + } + offset += len; + if (likely(offset == bv->bv_len)) { + offset = 0; + vector++; + } + todo -= len; + } while (todo); + + if (!v->version) { + r = crypto_shash_update(desc, v->salt, v->salt_size); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + return r; + } + } + + result = io_real_digest(v, io); + r = crypto_shash_final(desc, result); + if (r < 0) { + DMERR("crypto_shash_final failed: %d", r); + return r; + } + if (unlikely(memcmp(result, io_want_digest(v, io), v->digest_size))) { + DMERR_LIMIT("data block %llu is corrupted", + (unsigned long long)(io->block + b)); + v->hash_failed = 1; + return -EIO; + } + } + BUG_ON(vector != io->io_vec_size); + BUG_ON(offset); + + return 0; +} + +/* + * End one "io" structure with a given error. + */ +static void verity_finish_io(struct dm_verity_io *io, int error) +{ + struct bio *bio = io->bio; + struct dm_verity *v = io->v; + + bio->bi_end_io = io->orig_bi_end_io; + bio->bi_private = io->orig_bi_private; + + if (io->io_vec != io->io_vec_inline) + mempool_free(io->io_vec, v->vec_mempool); + + mempool_free(io, v->io_mempool); + + bio_endio(bio, error); +} + +static void verity_work(struct work_struct *w) +{ + struct dm_verity_io *io = container_of(w, struct dm_verity_io, work); + + verity_finish_io(io, verity_verify_io(io)); +} + +static void verity_end_io(struct bio *bio, int error) +{ + struct dm_verity_io *io = bio->bi_private; + + if (error) { + verity_finish_io(io, error); + return; + } + + INIT_WORK(&io->work, verity_work); + queue_work(io->v->verify_wq, &io->work); +} + +/* + * Prefetch buffers for the specified io. + * The root buffer is not prefetched, it is assumed that it will be cached + * all the time. + */ +static void verity_prefetch_io(struct dm_verity *v, struct dm_verity_io *io) +{ + int i; + + for (i = v->levels - 2; i >= 0; i--) { + sector_t hash_block_start; + sector_t hash_block_end; + verity_hash_at_level(v, io->block, i, &hash_block_start, NULL); + verity_hash_at_level(v, io->block + io->n_blocks - 1, i, &hash_block_end, NULL); + if (!i) { + unsigned cluster = *(volatile unsigned *)&dm_verity_prefetch_cluster; + + cluster >>= v->data_dev_block_bits; + if (unlikely(!cluster)) + goto no_prefetch_cluster; + + if (unlikely(cluster & (cluster - 1))) + cluster = 1 << (fls(cluster) - 1); + + hash_block_start &= ~(sector_t)(cluster - 1); + hash_block_end |= cluster - 1; + if (unlikely(hash_block_end >= v->hash_blocks)) + hash_block_end = v->hash_blocks - 1; + } +no_prefetch_cluster: + dm_bufio_prefetch(v->bufio, hash_block_start, + hash_block_end - hash_block_start + 1); + } +} + +/* + * Bio map function. It allocates dm_verity_io structure and bio vector and + * fills them. Then it issues prefetches and the I/O. + */ +static int verity_map(struct dm_target *ti, struct bio *bio, + union map_info *map_context) +{ + struct dm_verity *v = ti->private; + struct dm_verity_io *io; + + bio->bi_bdev = v->data_dev->bdev; + bio->bi_sector = verity_map_sector(v, bio->bi_sector); + + if (((unsigned)bio->bi_sector | bio_sectors(bio)) & + ((1 << (v->data_dev_block_bits - SECTOR_SHIFT)) - 1)) { + DMERR_LIMIT("unaligned io"); + return -EIO; + } + + if ((bio->bi_sector + bio_sectors(bio)) >> + (v->data_dev_block_bits - SECTOR_SHIFT) > v->data_blocks) { + DMERR_LIMIT("io out of range"); + return -EIO; + } + + if (bio_data_dir(bio) == WRITE) + return -EIO; + + io = mempool_alloc(v->io_mempool, GFP_NOIO); + io->v = v; + io->bio = bio; + io->orig_bi_end_io = bio->bi_end_io; + io->orig_bi_private = bio->bi_private; + io->block = bio->bi_sector >> (v->data_dev_block_bits - SECTOR_SHIFT); + io->n_blocks = bio->bi_size >> v->data_dev_block_bits; + + bio->bi_end_io = verity_end_io; + bio->bi_private = io; + io->io_vec_size = bio->bi_vcnt - bio->bi_idx; + if (io->io_vec_size < DM_VERITY_IO_VEC_INLINE) + io->io_vec = io->io_vec_inline; + else + io->io_vec = mempool_alloc(v->vec_mempool, GFP_NOIO); + memcpy(io->io_vec, bio_iovec(bio), + io->io_vec_size * sizeof(struct bio_vec)); + + verity_prefetch_io(v, io); + + generic_make_request(bio); + + return DM_MAPIO_SUBMITTED; +} + +/* + * Status: V (valid) or C (corruption found) + */ +static int verity_status(struct dm_target *ti, status_type_t type, + char *result, unsigned maxlen) +{ + struct dm_verity *v = ti->private; + unsigned sz = 0; + unsigned x; + + switch (type) { + case STATUSTYPE_INFO: + DMEMIT("%c", v->hash_failed ? 'C' : 'V'); + break; + case STATUSTYPE_TABLE: + DMEMIT("%u %s %s %u %u %llu %llu %s ", + v->version, + v->data_dev->name, + v->hash_dev->name, + 1 << v->data_dev_block_bits, + 1 << v->hash_dev_block_bits, + (unsigned long long)v->data_blocks, + (unsigned long long)v->hash_start, + v->alg_name + ); + for (x = 0; x < v->digest_size; x++) + DMEMIT("%02x", v->root_digest[x]); + DMEMIT(" "); + if (!v->salt_size) + DMEMIT("-"); + else + for (x = 0; x < v->salt_size; x++) + DMEMIT("%02x", v->salt[x]); + break; + } + + return 0; +} + +static int verity_ioctl(struct dm_target *ti, unsigned cmd, + unsigned long arg) +{ + struct dm_verity *v = ti->private; + int r = 0; + + if (v->data_start || + ti->len != i_size_read(v->data_dev->bdev->bd_inode) >> SECTOR_SHIFT) + r = scsi_verify_blk_ioctl(NULL, cmd); + + return r ? : __blkdev_driver_ioctl(v->data_dev->bdev, v->data_dev->mode, + cmd, arg); +} + +static int verity_merge(struct dm_target *ti, struct bvec_merge_data *bvm, + struct bio_vec *biovec, int max_size) +{ + struct dm_verity *v = ti->private; + struct request_queue *q = bdev_get_queue(v->data_dev->bdev); + + if (!q->merge_bvec_fn) + return max_size; + + bvm->bi_bdev = v->data_dev->bdev; + bvm->bi_sector = verity_map_sector(v, bvm->bi_sector); + + return min(max_size, q->merge_bvec_fn(q, bvm, biovec)); +} + +static int verity_iterate_devices(struct dm_target *ti, + iterate_devices_callout_fn fn, void *data) +{ + struct dm_verity *v = ti->private; + + return fn(ti, v->data_dev, v->data_start, ti->len, data); +} + +static void verity_io_hints(struct dm_target *ti, struct queue_limits *limits) +{ + struct dm_verity *v = ti->private; + + if (limits->logical_block_size < 1 << v->data_dev_block_bits) + limits->logical_block_size = 1 << v->data_dev_block_bits; + + if (limits->physical_block_size < 1 << v->data_dev_block_bits) + limits->physical_block_size = 1 << v->data_dev_block_bits; + + blk_limits_io_min(limits, limits->logical_block_size); +} + +static void verity_dtr(struct dm_target *ti) +{ + struct dm_verity *v = ti->private; + + if (v->verify_wq) + destroy_workqueue(v->verify_wq); + + if (v->vec_mempool) + mempool_destroy(v->vec_mempool); + + if (v->io_mempool) + mempool_destroy(v->io_mempool); + + if (v->bufio) + dm_bufio_client_destroy(v->bufio); + + kfree(v->salt); + kfree(v->root_digest); + + if (v->tfm) + crypto_free_shash(v->tfm); + + kfree(v->alg_name); + + if (v->hash_dev) + dm_put_device(ti, v->hash_dev); + + if (v->data_dev) + dm_put_device(ti, v->data_dev); + + kfree(v); +} + +/* + * Target parameters: + * The current format is version 1. + * Vsn 0 is compatible with original Chromium OS releases. + * + * + * + * + * + * + * + * + * Hex string or "-" if no salt. + */ +static int verity_ctr(struct dm_target *ti, unsigned argc, char **argv) +{ + struct dm_verity *v; + unsigned num; + unsigned long long num_ll; + int r; + int i; + sector_t hash_position; + char dummy; + + v = kzalloc(sizeof(struct dm_verity), GFP_KERNEL); + if (!v) { + ti->error = "Cannot allocate verity structure"; + return -ENOMEM; + } + ti->private = v; + v->ti = ti; + + if ((dm_table_get_mode(ti->table) & ~FMODE_READ)) { + ti->error = "Device must be readonly"; + r = -EINVAL; + goto bad; + } + + if (argc != 10) { + ti->error = "Invalid argument count: exactly 10 arguments required"; + r = -EINVAL; + goto bad; + } + + if (sscanf(argv[0], "%d%c", &num, &dummy) != 1 || + num < 0 || num > 1) { + ti->error = "Invalid version"; + r = -EINVAL; + goto bad; + } + v->version = num; + + r = dm_get_device(ti, argv[1], FMODE_READ, &v->data_dev); + if (r) { + ti->error = "Data device lookup failed"; + goto bad; + } + + r = dm_get_device(ti, argv[2], FMODE_READ, &v->hash_dev); + if (r) { + ti->error = "Data device lookup failed"; + goto bad; + } + + if (sscanf(argv[3], "%u%c", &num, &dummy) != 1 || + !num || (num & (num - 1)) || + num < bdev_logical_block_size(v->data_dev->bdev) || + num > PAGE_SIZE) { + ti->error = "Invalid data device block size"; + r = -EINVAL; + goto bad; + } + v->data_dev_block_bits = ffs(num) - 1; + + if (sscanf(argv[4], "%u%c", &num, &dummy) != 1 || + !num || (num & (num - 1)) || + num < bdev_logical_block_size(v->hash_dev->bdev) || + num > INT_MAX) { + ti->error = "Invalid hash device block size"; + r = -EINVAL; + goto bad; + } + v->hash_dev_block_bits = ffs(num) - 1; + + if (sscanf(argv[5], "%llu%c", &num_ll, &dummy) != 1 || + num_ll << (v->data_dev_block_bits - SECTOR_SHIFT) != + (sector_t)num_ll << (v->data_dev_block_bits - SECTOR_SHIFT)) { + ti->error = "Invalid data blocks"; + r = -EINVAL; + goto bad; + } + v->data_blocks = num_ll; + + if (ti->len > (v->data_blocks << (v->data_dev_block_bits - SECTOR_SHIFT))) { + ti->error = "Data device is too small"; + r = -EINVAL; + goto bad; + } + + if (sscanf(argv[6], "%llu%c", &num_ll, &dummy) != 1 || + num_ll << (v->hash_dev_block_bits - SECTOR_SHIFT) != + (sector_t)num_ll << (v->hash_dev_block_bits - SECTOR_SHIFT)) { + ti->error = "Invalid hash start"; + r = -EINVAL; + goto bad; + } + v->hash_start = num_ll; + + v->alg_name = kstrdup(argv[7], GFP_KERNEL); + if (!v->alg_name) { + ti->error = "Cannot allocate algorithm name"; + r = -ENOMEM; + goto bad; + } + + v->tfm = crypto_alloc_shash(v->alg_name, 0, 0); + if (IS_ERR(v->tfm)) { + ti->error = "Cannot initialize hash function"; + r = PTR_ERR(v->tfm); + v->tfm = NULL; + goto bad; + } + v->digest_size = crypto_shash_digestsize(v->tfm); + if ((1 << v->hash_dev_block_bits) < v->digest_size * 2) { + ti->error = "Digest size too big"; + r = -EINVAL; + goto bad; + } + v->shash_descsize = + sizeof(struct shash_desc) + crypto_shash_descsize(v->tfm); + + v->root_digest = kmalloc(v->digest_size, GFP_KERNEL); + if (!v->root_digest) { + ti->error = "Cannot allocate root digest"; + r = -ENOMEM; + goto bad; + } + if (strlen(argv[8]) != v->digest_size * 2 || + hex2bin(v->root_digest, argv[8], v->digest_size)) { + ti->error = "Invalid root digest"; + r = -EINVAL; + goto bad; + } + + if (strcmp(argv[9], "-")) { + v->salt_size = strlen(argv[9]) / 2; + v->salt = kmalloc(v->salt_size, GFP_KERNEL); + if (!v->salt) { + ti->error = "Cannot allocate salt"; + r = -ENOMEM; + goto bad; + } + if (strlen(argv[9]) != v->salt_size * 2 || + hex2bin(v->salt, argv[9], v->salt_size)) { + ti->error = "Invalid salt"; + r = -EINVAL; + goto bad; + } + } + + v->hash_per_block_bits = + fls((1 << v->hash_dev_block_bits) / v->digest_size) - 1; + + v->levels = 0; + if (v->data_blocks) + while (v->hash_per_block_bits * v->levels < 64 && + (unsigned long long)(v->data_blocks - 1) >> + (v->hash_per_block_bits * v->levels)) + v->levels++; + + if (v->levels > DM_VERITY_MAX_LEVELS) { + ti->error = "Too many tree levels"; + r = -E2BIG; + goto bad; + } + + hash_position = v->hash_start; + for (i = v->levels - 1; i >= 0; i--) { + sector_t s; + v->hash_level_block[i] = hash_position; + s = verity_position_at_level(v, v->data_blocks, i); + s = (s >> v->hash_per_block_bits) + + !!(s & ((1 << v->hash_per_block_bits) - 1)); + if (hash_position + s < hash_position) { + ti->error = "Hash device offset overflow"; + r = -E2BIG; + goto bad; + } + hash_position += s; + } + v->hash_blocks = hash_position; + + v->bufio = dm_bufio_client_create(v->hash_dev->bdev, + 1 << v->hash_dev_block_bits, 1, sizeof(struct buffer_aux), + dm_bufio_alloc_callback, NULL); + if (IS_ERR(v->bufio)) { + ti->error = "Cannot initialize dm-bufio"; + r = PTR_ERR(v->bufio); + v->bufio = NULL; + goto bad; + } + + if (dm_bufio_get_device_size(v->bufio) < v->hash_blocks) { + ti->error = "Hash device is too small"; + r = -E2BIG; + goto bad; + } + + v->io_mempool = mempool_create_kmalloc_pool(DM_VERITY_MEMPOOL_SIZE, + sizeof(struct dm_verity_io) + v->shash_descsize + v->digest_size * 2); + if (!v->io_mempool) { + ti->error = "Cannot allocate io mempool"; + r = -ENOMEM; + goto bad; + } + + v->vec_mempool = mempool_create_kmalloc_pool(DM_VERITY_MEMPOOL_SIZE, + BIO_MAX_PAGES * sizeof(struct bio_vec)); + if (!v->vec_mempool) { + ti->error = "Cannot allocate vector mempool"; + r = -ENOMEM; + goto bad; + } + + /* WQ_UNBOUND greatly improves performance when running on ramdisk */ + v->verify_wq = alloc_workqueue("kverityd", WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | WQ_UNBOUND, num_online_cpus()); + if (!v->verify_wq) { + ti->error = "Cannot allocate workqueue"; + r = -ENOMEM; + goto bad; + } + + return 0; + +bad: + verity_dtr(ti); + + return r; +} + +static struct target_type verity_target = { + .name = "verity", + .version = {1, 0, 0}, + .module = THIS_MODULE, + .ctr = verity_ctr, + .dtr = verity_dtr, + .map = verity_map, + .status = verity_status, + .ioctl = verity_ioctl, + .merge = verity_merge, + .iterate_devices = verity_iterate_devices, + .io_hints = verity_io_hints, +}; + +static int __init dm_verity_init(void) +{ + int r; + + r = dm_register_target(&verity_target); + if (r < 0) + DMERR("register failed %d", r); + + return r; +} + +static void __exit dm_verity_exit(void) +{ + dm_unregister_target(&verity_target); +} + +module_init(dm_verity_init); +module_exit(dm_verity_exit); + +MODULE_AUTHOR("Mikulas Patocka "); +MODULE_AUTHOR("Mandeep Baines "); +MODULE_AUTHOR("Will Drewry "); +MODULE_DESCRIPTION(DM_NAME " target for transparent disk integrity checking"); +MODULE_LICENSE("GPL");