On 2018/10/29 下午5:43, fdmanana@xxxxxxxxxx wrote: > From: Filipe Manana <fdmanana@xxxxxxxx> > > Unless the '-s' option is passed to fssum, it should not detect file holes > and have their existence influence the computed checksum for a file. This > tool was added to test btrfs' send/receive feature, so that it checks for > any metadata and data differences between the original filesystem and the > filesystem that receives send streams. > > For a long time the test btrfs/007, which tests btrfs' send/receive with > fsstress, fails sporadically reporting data differences between files. > However the md5sum/sha1sum from the reported files in the original and > new filesystems are the same. The reason why fssum fails is because even > in normal mode it still accounts for number of holes that exist in the > file and their respective lengths. This is done using the SEEK_DATA mode > of lseek. The btrfs send feature does not preserve holes nor prealloc > extents (not supported by the current protocol), so whenever a hole or > prealloc (unwritten) extent is detected in the source filesystem, it > issues a write command full of zeroes, which will translate to a regular > (written) extent in the destination filesystem. This is why fssum reports > a different checksum. A prealloc extent also counts as hole when using > lseek. > > For example when passing a seed of 1540592967 to fsstress in btrfs/007, > the test fails, as file p0/d0/f7 has a prealloc extent in the original > filesystem (in the incr snapshot). > > Fix this by making fssum just read the hole file and feed its data to the > digest calculation function when option '-s' is not given. If we ever get > btrfs' send/receive to support holes and fallocate, we can just change > the test and pass the '-s' option to all fssum calls. However this is causing more problem here. Now since fssum doesn't skip holes, for a super large sparse file, fssum will try to fill all the holes with zero and spend tons of CPU time calculating the result. E.g. when seed = 1550703281. We will have a PiB level file: $ ls /mnt/scratch/base/p0/f3 -alh -rw-rw-rw- 1 1441 2774 666P Feb 13 13:39 /mnt/scratch/base/p0/f3 (well, 666, no wonder this will fail) But it only takes aroud 1M: $ du /mnt/scratch/base/p0/f3 1044 /mnt/scratch/base/p0/f3 This is even more annoying than test failure. Can't we just filter out the fallocate/hole punching part in the test itself? Thanks, Qu > > Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx> > --- > src/fssum.c | 65 +++++-------------------------------------------------------- > 1 file changed, 5 insertions(+), 60 deletions(-) > > diff --git a/src/fssum.c b/src/fssum.c > index 5da39abf..f1da72fb 100644 > --- a/src/fssum.c > +++ b/src/fssum.c > @@ -224,71 +224,16 @@ int > sum_file_data_permissive(int fd, sum_t *dst) > { > int ret; > - off_t pos; > - off_t old; > - int i; > - uint64_t zeros = 0; > - > - pos = lseek(fd, 0, SEEK_CUR); > - if (pos == (off_t)-1) > - return errno == ENXIO ? 0 : -2; > > while (1) { > - old = pos; > - pos = lseek(fd, pos, SEEK_DATA); > - if (pos == (off_t)-1) { > - if (errno == ENXIO) { > - ret = 0; > - pos = lseek(fd, 0, SEEK_END); > - if (pos != (off_t)-1) > - zeros += pos - old; > - } else { > - ret = -2; > - } > - break; > - } > ret = read(fd, buf, sizeof(buf)); > - assert(ret); /* eof found by lseek */ > - if (ret <= 0) > + if (ret < 0) > + return -errno; > + sum_add(dst, buf, ret); > + if (ret < sizeof(buf)) > break; > - if (old < pos) /* hole */ > - zeros += pos - old; > - for (i = 0; i < ret; ++i) { > - for (old = i; buf[i] == 0 && i < ret; ++i) > - ; > - if (old < i) /* code like a hole */ > - zeros += i - old; > - if (i == ret) > - break; > - if (zeros) { > - if (verbose >= 2) > - fprintf(stderr, > - "adding %llu zeros to sum\n", > - (unsigned long long)zeros); > - sum_add_u64(dst, 0); > - sum_add_u64(dst, zeros); > - zeros = 0; > - } > - for (old = i; buf[i] != 0 && i < ret; ++i) > - ; > - if (verbose >= 2) > - fprintf(stderr, "adding %u non-zeros to sum\n", > - i - (int)old); > - sum_add(dst, buf + old, i - old); > - } > - pos += ret; > } > - > - if (zeros) { > - if (verbose >= 2) > - fprintf(stderr, > - "adding %llu zeros to sum (finishing)\n", > - (unsigned long long)zeros); > - sum_add_u64(dst, 0); > - sum_add_u64(dst, zeros); > - } > - > - return ret; > + return 0; > } > > int >
Attachment:
signature.asc
Description: OpenPGP digital signature