Hi Mikulas, Thanks for taking the time to look through this. I think you're right on all counts. - Joe On Mon, Mar 10, 2014 at 08:08:14PM -0400, Mikulas Patocka wrote: > Hi > > I reviewed dm-era, here is a possible list of bugs: > > There is no tracking of in-progress writes. Suppose that this happens: > 1) a write is received, era_map sees that metadata_current_marked(era->md, > block) is true, so it lets the write through > 2) the user sends the checkpoint message > 3) the write received in step 1) hits the disk and modifies it > 4) the user sends metadata_take_snap message, the write is not in the > snapshot, although it modified the disk between steps 2) and 4) > You can use the functions dm_internal_suspend and dm_internal_resume to > clear in-progress bios - this is lightweight suspend/resume callable only > from the kernel. > > era_destroy calls metadata_close(era->md); without checking that era->md > is not NULL. This can cause crash if era_destroy is called early in the > constructor. > > era->sectors_per_block should be validated that it is not zero. > > There is no write memory barrier between the setting of rpc->result and > rpc->complete: > if (r) > list_for_each_entry_safe(rpc, tmp, &calls, list) > rpc->result = r; > } > > list_for_each_entry_safe(rpc, tmp, &calls, list) { > atomic_set(&rpc->complete, 1); > And there is no read memory barrier between the reading of rpc->complete > and rpc->result: > wait_event(rpc->wait, atomic_read(&rpc->complete)); > > return rpc->result; > You should use completion (include/linux/completion.h) instead of > rpc->complete and rpc->wait, it simplifies the code and takes care of > memory barriers. > > metadata_space_map_root is not aligned on 64-bit boundary, but it is > accessed as 64-bit numbers in sm_ll_open_metadata. > > Mixing 32-bit and 64-bit current_era (I'm not sure it this is a bug or > not). > > Missing read and write memory barriers around accesses to > archived_writesets (I'm not sure if this is a bug, but it looks strange). > > create_fresh_metadata doesn't free md->tm and md->sm on errors. > > format_metadata doesn't undo the effect of create_fresh_metadata if > write_superblock fails. > > superblock_all_zeroes checks the full block for zeroes, although lvm > zeroes only zeroes 4k at the beginning of a new logical volume. > > test_bit and test_and_set_bit take signed integer as a parameter (on some > architectures the argument is long, on some int), so you must restrict the > maximum number of blocks to 2^31-1. Or, alternatively, do not use test_bit > and test_and_set_bit and implement these functions on your own - this will > allow you to have 64-bit block numbers. > > The argument to bitset_size may overflow, it is not checked. If you allow > 64-bit argument to bitset_size, you must also check that the conversion to > size_t does not overflow. > > era_status: /* FIXME: do we need to protect this? */ dm_sm_get_nr_free - > on 32-bit architectures you need to lock it because 64-bit accesses are > nonatomic. You can see the implementation of i_size_read and i_size_write > and do something similar. > > era_map doesn't distinguish REQ_FLUSH bios, it should just pass them > through without accessing bio->bi_iter.bi_sector. > > > Possible improvements: > > atomic64_inc(&md->current_era); - there is no concurrent access to > current_era, so you can use this to avoid the interlocked instruction: > atomic64_set(&md->current_era, atomic64_read(&md->current_era) + 1) > > process_deferred_bios: deferred_lock is never taken from interrupt, so you > can replace spin_lock_irqsave/spin_unlock_irqrestore with > spin_lock/spin_unlock. > > writeset_test_and_set: test_and_set_bit is interlocked instruction - here, > no concurrent access happens, so use __test_and_set_bit, which is not > interlocked. Alternatively - reimplement it if you want 2^31 or more > blocks - see above. > > > Mikulas -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel