Adding stale-data detection to verify logic

As written, the verify logic doesn't appear to detect dropped writes (stale data) for data at rest. Only two temporally-variant fields are presently used in the verify pattern:

verify_header.rand_seed
verify_header.numberio

These fields are verified during read+write invocations but not during read-only invocations. This means a dropped write won't be detected if it was the most recent write to a given block, because all the non-temporally-variant fields in the older data will still pass verification. This is particularly problematic when reusing a device for separate fio invocations during a series of tests, as there will be valid but stale data at rest from previous invocations.

For example, if a user does the following after previous fio invocations:

1) Performs a write workload without verify, then runs a subsequent invocation with a read/verify-only workload against the same dataset.

2) Performs a write workload and uses a trigger to perform a power-interruption test, then runs a subsequent invocation with a read/verify-only workload, using verify_state_load=1.

It could be argued that the onus is on the user to wipe data before every invocation, but I'm not sure that's reasonable.

I'd like to implement an invocation-variant check that will catch the case of any data at rest stale relative to previous invocations. There would be an invocation-unique identifier, either passed via a command-line option or generated randomly. It would be added to verify_header and checked during all verify-reads. To support its use for subsequent read-only invocations it would be added to the verify_state file and used whenever verify_state_load=1. It would also be utilized when the identifier is specified on the command line.
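
As a rough illustration of the check described above (a sketch only; the struct, the invocation_id field and the sketch_* helpers are hypothetical names and do not exist in fio today), the write path would stamp the identifier into each header and every verify-read would compare it against the identifier loaded from the state file or the command line:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified header: NOT fio's real struct verify_header. */
struct sketch_verify_header {
	uint64_t rand_seed;
	uint16_t numberio;
	uint64_t invocation_id;	/* proposed new field (assumed name and width) */
};

/* Stamp the invocation identifier into a header before the block is written. */
static void sketch_stamp_header(struct sketch_verify_header *hdr,
				uint64_t invocation_id)
{
	assert(hdr != NULL);
	hdr->invocation_id = invocation_id;
}

/*
 * Return true when the block carries the expected identifier. A mismatch
 * means the data at rest came from some other invocation, i.e. it is stale.
 */
static bool sketch_invocation_matches(const struct sketch_verify_header *hdr,
				      uint64_t expected_id)
{
	assert(hdr != NULL);
	return hdr->invocation_id == expected_id;
}
```

Because the comparison does not depend on td_rw() or verify_only, it would apply equally to read+write and read-only invocations.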

An alternative would be to use the existing verify_header.time_sec field and flag any block whose timestamp is older than the start time of the most recent invocation, which we'd encode in the state file. A command-line option for specifying that time would be a little more cumbersome than one taking an opaque identifier.
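
The time-based variant reduces to a single comparison. The sketch below uses a hypothetical stand-in struct (not fio's real header; the field width is an assumption) and assumes the invocation start time has already been read from the state file or the command line:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified header: NOT fio's real struct verify_header. */
struct sketch_time_header {
	uint64_t time_sec;	/* write time in seconds (width is an assumption) */
};

/*
 * Return true when the block was written before the invocation started.
 * A block written in the same second as the start is treated as current.
 */
static bool sketch_block_is_stale(const struct sketch_time_header *hdr,
				  uint64_t invocation_start_sec)
{
	assert(hdr != NULL);
	return hdr->time_sec < invocation_start_sec;
}
```

One-second granularity and any wall-clock adjustment between invocations would both weaken this check, which is another reason an opaque identifier may be preferable.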

Note this won't catch a dropped write when a block is written multiple times within a given invocation, as that would require a block-specific sidecar map that tracks write counts per block (or stores a subset of the hash for the most recent write to each block). I've implemented such a feature in a proprietary tool and would consider it for fio if there's interest. The downside is the creation of, and dependency on, a large sidecar file. The upside is that it would add verification support for sparsely-random workloads.
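
To make the sidecar idea concrete, here is a minimal in-memory sketch of a per-block write-count map. All names are hypothetical, it is not taken from the proprietary tool mentioned above, and persisting the map to a file is left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sidecar map: one write counter per block. */
struct sketch_write_map {
	uint16_t *counts;
	size_t nr_blocks;
};

/* Allocate a zeroed map; returns false if the allocation fails. */
static bool sketch_map_init(struct sketch_write_map *map, size_t nr_blocks)
{
	assert(map != NULL);
	map->counts = calloc(nr_blocks, sizeof(*map->counts));
	map->nr_blocks = map->counts ? nr_blocks : 0;
	return map->counts != NULL;
}

static void sketch_map_free(struct sketch_write_map *map)
{
	assert(map != NULL);
	free(map->counts);
	map->counts = NULL;
	map->nr_blocks = 0;
}

/* Record a write and return the count to stamp into the block's header. */
static uint16_t sketch_map_record_write(struct sketch_write_map *map,
					size_t block)
{
	assert(map != NULL && block < map->nr_blocks);
	return ++map->counts[block];
}

/* A read passes only if the count found on media matches the map. */
static bool sketch_map_verify(const struct sketch_write_map *map,
			      size_t block, uint16_t on_media_count)
{
	assert(map != NULL && block < map->nr_blocks);
	return map->counts[block] == on_media_count;
}
```

With two bytes per block the map grows linearly with device size, which is the sidecar cost noted above; storing a truncated hash per block instead would change the entry type but not the overall structure.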

Code references showing that the temporally-variant fields are not checked for read-only workloads:

verify_io_u() forces the seeds to match the header's seed when !td_rw():

/*
 * Make rand_seed check pass when have verify_backlog or
 * zone reset frequency for zonemode=zbd.
 */
if (!td_rw(td) || (td->flags & TD_F_VER_BACKLOG) ||
    td->o.zrf.u.f)
    io_u->rand_seed = hdr->rand_seed;

verify_header() bypasses numberio check for read-only invocations:

/*
 * For read-only workloads, the program cannot be certain of the
 * last numberio written to a block. Checking of numberio will be
 * done only for workloads that write data.  For verify_only,
 * numberio check is skipped.
 */
if (td_write(td) && (td_min_bs(td) == td_max_bs(td)) &&
    !td->o.time_based)
    if (!td->o.verify_only)
        if (hdr->numberio != io_u->numberio) {
            log_err("verify: bad header numberio %"PRIu16
                ", wanted %"PRIu16,
                hdr->numberio, io_u->numberio);
            goto err;
        }

Adam (horshack@xxxxxxxx)


