From: Kiyoshi Ueda Multiple instances of dec_pending() can run concurrently so a lock is needed when it saves the first error code. I have never experienced actual problem without locking and just found this during code inspection while implementing the barrier support patch for request-based dm. This patch adds the locking. I've done compile, boot and basic I/O testings. Signed-off-by: Kiyoshi Ueda Signed-off-by: Jun'ichi Nomura Signed-off-by: Alasdair G Kergon --- drivers/md/dm.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) Index: linux-2.6.32-rc3/drivers/md/dm.c =================================================================== --- linux-2.6.32-rc3.orig/drivers/md/dm.c +++ linux-2.6.32-rc3/drivers/md/dm.c @@ -47,6 +47,7 @@ struct dm_io { atomic_t io_count; struct bio *bio; unsigned long start_time; + spinlock_t endio_lock; }; /* @@ -576,8 +577,12 @@ static void dec_pending(struct dm_io *io struct mapped_device *md = io->md; /* Push-back supersedes any I/O errors */ - if (error && !(io->error > 0 && __noflush_suspending(md))) - io->error = error; + if (unlikely(error)) { + spin_lock_irqsave(&io->endio_lock, flags); + if (!(io->error > 0 && __noflush_suspending(md))) + io->error = error; + spin_unlock_irqrestore(&io->endio_lock, flags); + } if (atomic_dec_and_test(&io->io_count)) { if (io->error == DM_ENDIO_REQUEUE) { @@ -1224,6 +1229,7 @@ static void __split_and_process_bio(stru atomic_set(&ci.io->io_count, 1); ci.io->bio = bio; ci.io->md = md; + spin_lock_init(&ci.io->endio_lock); ci.sector = bio->bi_sector; ci.sector_count = bio_sectors(bio); if (unlikely(bio_empty_barrier(bio)))