On Fri, Mar 16, 2012 at 8:16 PM, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote: > Hi Will > > > On Wed, 14 Mar 2012, Will Drewry wrote: > >> Hi Mikulas, >> >> This is a nice rewrite and takes advantage of your dm-bufio layer. I >> wish it'd existed (and or we wrote it :) in 2009 when we started this >> work! Some comments below: >> >> > --- >> > +static void verity_prefetch_io(struct dm_verity *v, struct dm_verity_io *io) >> > +{ >> > + int i; >> > + for (i = v->levels - 2; i >= 0; i--) { >> > + sector_t hash_block_start; >> > + sector_t hash_block_end; >> > + verity_hash_at_level(v, io->block, i, &hash_block_start, NULL); >> > + verity_hash_at_level(v, io->block + io->n_blocks - 1, i, &hash_block_end, NULL); >> > + if (!i) { >> > + unsigned cluster = prefetch_cluster; >> > + /* barrier to stop GCC from re-reading prefetch_cluster again */ >> > + barrier(); >> > + cluster >>= v->data_dev_block_bits; >> >> Would: >> unsigned cluster = prefetch_cluster >> v->data_dev_block_bits; >> not have similar behavior without a barrier? (Yeah yeah I could >> compile and see, but I was curious if you already had.) >> >> Since the max iterations here is 61 in a worst-case, I don't think >> it's a big deal to barrier() each time, just thought I'd ask. >> >> > + if (unlikely(!cluster)) >> > + goto no_prefetch_cluster; >> > + if (unlikely(cluster & (cluster - 1))) >> > + cluster = 1 << (fls(cluster) - 1); >> > + >> > + hash_block_start &= ~(sector_t)(cluster - 1); >> > + hash_block_end |= cluster - 1; >> > + if (unlikely(hash_block_end >= v->hash_blocks)) >> > + hash_block_end = v->hash_blocks - 1; >> > + } >> > +no_prefetch_cluster: >> > + dm_bufio_prefetch(v->bufio, hash_block_start, >> > + hash_block_end - hash_block_start + 1); > > The problem here is this. If you look at the code, you think that after > the clause "if (unlikely(!cluster)) goto no_prefetch_cluster;", cluster > can't be zero. But this assumption is wrong. The C compiler is allowed to > transform the above code into: > > unsigned cluster; > if (!(prefetch_cluster >> v->data_dev_block_bits)) > goto no_prefetch_cluster; > cluster = prefetch_cluster >> v->data_dev_block_bits; > if (unlikely(cluster & (cluster - 1))) > cluster = 1 << (fls(cluster) - 1); > > I know it's suboptimal, but the C compiler is just allowed to perform this > transformation. Now, if you know that "prefetch_cluster" can change > asynchronously by another thread running simultaneously, the condition "if > (!(prefetch_cluster >> v->data_dev_block_bits))" is useless --- > prefetch_cluster may change just after this condition and we won't catch > the zero value. (if the cluster value is zero in the above code, it ends > up in hash_block_end being ORed with -1 and the prefetch goes wild over > the whole hash device). > > That's why I put that "barrier()" there. It would be better to declare > "prefetch_cluster" as volatile, but the module param macros issue warnings > if the variable is volatile. > > Or maybe I can change it this way: > "unsigned cluster = *(volatile unsigned *)&prefetch_cluster", it could be > better than the "barrier()". I think that'd read a little bit more clearly, and I think the C standard supports that approach. If it doesn't work in practice, the barrier isn't the worst. >> > + case STATUSTYPE_TABLE: >> > + DMEMIT("%u %s %s %llu %u %s ", >> > + 0, >> > + v->data_dev->name, >> > + v->hash_dev->name, >> >> I understand the new approach is to use major:minor instead of the >> device name. I don't care which, but I believe agk@ requested that. > > All the device mappers report dm_dev->name in their status routine, so I > do it this way too. > >> > +static int verity_ioctl(struct dm_target *ti, unsigned cmd, >> > + unsigned long arg) >> > +{ >> > + struct dm_verity *v = ti->private; >> > + int r = 0; >> > + >> > + if (v->data_start || >> > + ti->len != i_size_read(v->data_dev->bdev->bd_inode) >> SECTOR_SHIFT) >> > + r = scsi_verify_blk_ioctl(NULL, cmd); >> > + >> >> Is it worth supporting ioctl at all given these hoops? Nothing stops >> a privileged user from directly running the ioctl on the underlying >> device/devices, it's just very inconvenient :) > > I don't know. The other dm targets attempt to pass-thru ioctls too. > > You need ioctl pass-thru if you want to run it over a cd-rom because > the iso9660 filesystem needs to send an ioctl to find its superblock. > Other than that I don't know if other filesystems need ioctls. Makes sense. I just think the passthrough condition is ugly, but at least it provides some support. >> > + if (ti->len > i_size_read(v->data_dev->bdev->bd_inode) >> SECTOR_SHIFT) { >> > + ti->error = "Data device si too small"; >> >> s/si/is >> >> Should this also check ti->start + ti->len to ensure it isn't reading >> off the end or do you just rely on the requests failing? > > ti->start is the offset in the target table --- so it shouldn't be checked > here (for example, you can map a verity device having 1024 blocks to a > sector offset 1000000 in the table --- so ti->start == 1000000 and ti->len > == 1024 --- in this case, you have test that the underlying device has at > least 1024 blocks, but you shouldn't test it for 1000000 sectors --- > 1000000 is offset in the table, not required device size. > > But this reminds me that I had the size test wrong in verity_map ... > fixed. Well at least some good came of it! I typed ti->start, but I had mean v->data_start. However, I was misinterpreting the i_size_read as giving the last sector not the actual size, so my comment was still pointless. >> > +MODULE_AUTHOR("Mikulas Patocka <mpatocka@xxxxxxxxxx>"); >> >> As per linux/module.h, I'd welcome additional authors as per the >> lkml/patch lineage: >> MODULE_AUTHOR("Mandeep Baines <msb@xxxxxxxxxxxx>"); >> MODULE_AUTHOR("Will Drewry <wad@xxxxxxxxxxxx>"); > > OK, I added you there. Very much appreciated. >> Regardless, I'll just be happy to see this functionality merge. >> >> > +MODULE_DESCRIPTION(DM_NAME " target for transparent disk integrity checking"); >> > +MODULE_LICENSE("GPL"); >> > + >> > Index: linux-3.3-rc6-fast/drivers/md/dm-bufio.c >> >> This should be in a separate patch I think. > > Yes, it is a separate patch. > >> > b->hold_count++; >> >> Are these hold_counts safe on architectures with weak memory models? >> Should they be atomic_ts? I haven't looked at them in context, but >> based on what I see here they make me a bit nervous. >> >> Thanks for jumping in to the fray! None of my comments are blocking, >> so I believe the following is appropriate (if not >> s/Signed-off/Reviewed-by/). >> >> Signed-off-by: Will Drewry <wad@xxxxxxxxxxxx> >> >> cheers! >> will > > hold_count is read or changed only when we hold dm_bufio_client->lock, so > it doesn't have to be atomic. Ah nice! The little snippet had me worried, but I should've looked at the full context first. Thanks! will -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel