From: Jonathan Brassow

After a leg fails and a write returns, '__bio_mark_nosync' is used to
mark the region out-of-sync.  This state is stored in a region structure
that remains in the region hash.  It is not removed from the region hash
until the mirror is destroyed because it never goes on the clean_regions
list.

Right now, this is not a problem because when a device fails, the mirror
is destroyed and a new mirror is created without the failed device.

In the future, when we wish to handle transient failures, we would simply
suspend and resume to restart recovery.  In that case, some machines in
the cluster would only write to the primary for regions that are cached
as not-in-sync, due to the '__bio_mark_nosync'.

The fix is to simply clear out the region hash when a mirror is suspended.

Signed-off-by: Jonathan Brassow

[AGK How does this fit with other recent precautionary changes?]
[AGK Looks more like a potential enhancement than a fix, and I don't
really understand if it's an enhancement we want to implement this way.]
---
 drivers/md/dm-raid1.c          |    6 ++++++
 drivers/md/dm-region-hash.c    |    8 +++++++-
 include/linux/dm-region-hash.h |    3 ++-
 3 files changed, 15 insertions(+), 2 deletions(-)

Index: linux-2.6.33-rc6/drivers/md/dm-raid1.c
===================================================================
--- linux-2.6.33-rc6.orig/drivers/md/dm-raid1.c
+++ linux-2.6.33-rc6/drivers/md/dm-raid1.c
@@ -1301,6 +1301,12 @@ static void mirror_postsuspend(struct dm
 	struct mirror_set *ms = ti->private;
 	struct dm_dirty_log *log = dm_rh_dirty_log(ms->rh);
 
+	/*
+	 * Clear the region cache to prevent stale information
+	 * on the next resume.
+	 */
+	dm_region_hash_clear(ms->rh);
+
 	if (log->type->postsuspend && log->type->postsuspend(log))
 		/* FIXME: need better error handling */
 		DMWARN("log postsuspend failed");
Index: linux-2.6.33-rc6/drivers/md/dm-region-hash.c
===================================================================
--- linux-2.6.33-rc6.orig/drivers/md/dm-region-hash.c
+++ linux-2.6.33-rc6/drivers/md/dm-region-hash.c
@@ -230,7 +230,7 @@ struct dm_region_hash *dm_region_hash_cr
 }
 EXPORT_SYMBOL_GPL(dm_region_hash_create);
 
-void dm_region_hash_destroy(struct dm_region_hash *rh)
+void dm_region_hash_clear(struct dm_region_hash *rh)
 {
 	unsigned h;
 	struct dm_region *reg, *nreg;
@@ -243,6 +243,12 @@ void dm_region_hash_destroy(struct dm_re
 			mempool_free(reg, rh->region_pool);
 		}
 	}
+}
+EXPORT_SYMBOL_GPL(dm_region_hash_clear);
+
+void dm_region_hash_destroy(struct dm_region_hash *rh)
+{
+	dm_region_hash_clear(rh);
 
 	if (rh->log)
 		dm_dirty_log_destroy(rh->log);
Index: linux-2.6.33-rc6/include/linux/dm-region-hash.h
===================================================================
--- linux-2.6.33-rc6.orig/include/linux/dm-region-hash.h
+++ linux-2.6.33-rc6/include/linux/dm-region-hash.h
@@ -29,7 +29,7 @@ enum dm_rh_region_states {
 };
 
 /*
- * Region hash create/destroy.
+ * Region hash create/clear/destroy.
  */
 struct bio_list;
 struct dm_region_hash *dm_region_hash_create(
@@ -40,6 +40,7 @@ struct dm_region_hash *dm_region_hash_cr
 		sector_t target_begin, unsigned max_recovery,
 		struct dm_dirty_log *log, uint32_t region_size,
 		region_t nr_regions);
+void dm_region_hash_clear(struct dm_region_hash *rh);
 void dm_region_hash_destroy(struct dm_region_hash *rh);
 
 struct dm_dirty_log *dm_rh_dirty_log(struct dm_region_hash *rh);
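
For illustration only, the clear/destroy split above can be sketched as a
small userspace analogue.  This is not the kernel code: the names rh_create,
rh_mark_nosync, rh_get_state, rh_clear and rh_destroy are hypothetical, the
buckets are plain singly linked lists rather than list_head chains backed by
a mempool, and an uncached region is assumed to report "clean" where the
driver would consult the dirty log.

```c
#include <stdlib.h>

/* Hypothetical, simplified region states (the driver has more). */
enum region_state { REGION_CLEAN, REGION_NOSYNC };

struct region {
	unsigned long key;
	enum region_state state;
	struct region *next;
};

struct region_hash {
	unsigned nr_buckets;
	struct region **buckets;
};

struct region_hash *rh_create(unsigned nr_buckets)
{
	struct region_hash *rh = malloc(sizeof(*rh));

	if (!rh)
		return NULL;
	rh->buckets = calloc(nr_buckets, sizeof(*rh->buckets));
	if (!rh->buckets) {
		free(rh);
		return NULL;
	}
	rh->nr_buckets = nr_buckets;
	return rh;
}

static struct region *rh_lookup(struct region_hash *rh, unsigned long key)
{
	struct region *reg;

	for (reg = rh->buckets[key % rh->nr_buckets]; reg; reg = reg->next)
		if (reg->key == key)
			return reg;
	return NULL;
}

/* Cache a region as out-of-sync.  Returns 0, or -1 if allocation fails. */
int rh_mark_nosync(struct region_hash *rh, unsigned long key)
{
	struct region *reg = rh_lookup(rh, key);

	if (!reg) {
		reg = malloc(sizeof(*reg));
		if (!reg)
			return -1;
		reg->key = key;
		reg->next = rh->buckets[key % rh->nr_buckets];
		rh->buckets[key % rh->nr_buckets] = reg;
	}
	reg->state = REGION_NOSYNC;
	return 0;
}

/* Assumption: an uncached region reports CLEAN in this sketch. */
enum region_state rh_get_state(struct region_hash *rh, unsigned long key)
{
	struct region *reg = rh_lookup(rh, key);

	return reg ? reg->state : REGION_CLEAN;
}

/* Drop every cached region but keep the table usable afterwards. */
void rh_clear(struct region_hash *rh)
{
	unsigned h;
	struct region *reg, *nreg;

	for (h = 0; h < rh->nr_buckets; h++) {
		for (reg = rh->buckets[h]; reg; reg = nreg) {
			nreg = reg->next;
			free(reg);
		}
		rh->buckets[h] = NULL;	/* bucket is empty again */
	}
}

/* Destroy is clear plus releasing the table itself. */
void rh_destroy(struct region_hash *rh)
{
	rh_clear(rh);
	free(rh->buckets);
	free(rh);
}
```

In this sketch, a region marked with rh_mark_nosync() reports REGION_NOSYNC
until rh_clear() is called, after which it is no longer cached and the table
can be reused.  The sketch resets each bucket head after freeing its entries,
since the table stays in use after a clear.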