Adding stale-data detection to verify logic

Adam Horshack <horshack@xxxxxxxx> · Fri, 3 Feb 2023 14:08:36 -0500

As written, the verify logic doesn't appear to detect dropped writes
(stale data) for data at rest. Only two temporally-variant fields are
currently used in the verify pattern:

verify_header.rand_seed
verify_header.numberio

These fields are verified during read+write invocations but not during
read-only invocations. This means a dropped write won't be detected if
it was the most recent write to a given block: the block still holds
older, self-consistent data, so all the non-temporally-variant fields
pass verification. This is particularly problematic when reusing a
device for separate fio invocations during a series of tests, as there
will be valid but stale data at rest from previous invocations.

For example, stale data can go undetected if a user does either of the
following after previous fio invocations:

1) Performs a write workload, without verify. When complete, runs a
subsequent invocation with a read/verify-only workload against the same
dataset.

2) Performs a write workload and uses a trigger to perform a
power-interruption test. Then runs a subsequent invocation with a
read/verify-only workload, using verify_state_load=1.

It could be argued the onus is on the user to wipe data before every 
invocation but I'm not sure that's reasonable.

I'd like to implement an invocation-variant check that will catch data
at rest that is stale relative to previous invocations. There would be
an invocation-unique identifier, either passed via a command-line option
or generated randomly. It would be added to verify_header and checked
during all verify-reads. To support subsequent read-only invocations, it
would be added to the verify_state file and used whenever
verify_state_load=1. It would also be used when the identifier is
specified on the command line.
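
A minimal sketch of the identifier check described above. This is not
fio code: the struct is a trimmed-down stand-in for verify_header, and
the invocation_id field and the sketch_* function names are hypothetical
names chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for verify_header. */
struct sketch_verify_header {
	uint64_t rand_seed;
	uint16_t numberio;
	uint64_t invocation_id;	/* proposed field; name is an assumption */
};

/* Write path: stamp the header with the current invocation's ID. */
static void sketch_fill_hdr(struct sketch_verify_header *hdr,
			    uint64_t rand_seed, uint16_t numberio,
			    uint64_t invocation_id)
{
	assert(hdr);
	hdr->rand_seed = rand_seed;
	hdr->numberio = numberio;
	hdr->invocation_id = invocation_id;
}

/*
 * Verify path: returns 0 if the block was written by the expected
 * invocation, nonzero if it is stale data from some other invocation.
 * expected_id would come from the command line or the verify_state file.
 */
static int sketch_check_invocation(const struct sketch_verify_header *hdr,
				   uint64_t expected_id)
{
	assert(hdr);
	return hdr->invocation_id == expected_id ? 0 : 1;
}
```

Because the check compares against a single expected value, it works the
same way for read+write and read-only invocations, which is the point of
the proposal.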

An alternative would be to use the existing verify_header.time_sec field
and check for any blocks older than the start time of the most recent
invocation, which we'd encode in the state file. The drawback is that a
command-line option for specifying a time is a little more cumbersome
than one for an opaque identifier.
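
A rough sketch of the timestamp alternative, under the assumption that
the invocation start time is available at verify time as a seconds
value. Only time_sec is taken from verify_header; the struct and
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the timestamp from verify_header. */
struct sketch_hdr_time {
	uint32_t time_sec;
};

/*
 * Returns nonzero if the block was written before the invocation
 * started, i.e. it is stale relative to that invocation.
 * start_sec would be loaded from the state file or the command line.
 */
static int sketch_block_is_stale(const struct sketch_hdr_time *hdr,
				 uint32_t start_sec)
{
	assert(hdr);
	return hdr->time_sec < start_sec;
}
```

With seconds granularity, a block written in the same second as the
start time is treated as current, so a stale block from an invocation
that ended within that second would slip through this sketch.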

Note this won't catch dropped writes when a block is written multiple
times within a given invocation, as that would require a block-specific
sidecar map that tracks write counts per block (or stores a subset of
the hash for the most recent write for each block). I've implemented
such a feature in a proprietary tool and would consider it for fio if
there's interest. The downside is the creation of, and dependency on, a
large sidecar file. The upside is it would add verification support for
sparsely-random workloads.
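
To illustrate the sidecar idea, here is an in-memory sketch of a
per-block write-count map. It is not the proprietary implementation
mentioned above and not fio code; all names are hypothetical, and
persisting the map to a sidecar file is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sidecar map: one write counter per block. */
struct sketch_sidecar {
	uint64_t nr_blocks;
	uint16_t *write_count;
};

/* Allocate a zeroed map. Returns 0 on success, -1 on allocation failure. */
static int sketch_sidecar_init(struct sketch_sidecar *sc, uint64_t nr_blocks)
{
	sc->nr_blocks = nr_blocks;
	sc->write_count = calloc(nr_blocks, sizeof(*sc->write_count));
	return sc->write_count ? 0 : -1;
}

/*
 * Write path: bump the counter and return the value that would be
 * stamped into the block's header.
 */
static uint16_t sketch_sidecar_note_write(struct sketch_sidecar *sc,
					  uint64_t block)
{
	assert(block < sc->nr_blocks);
	return ++sc->write_count[block];
}

/*
 * Verify path: returns 0 if the count read back from the block matches
 * the map, nonzero if the block holds an older (dropped-write) version.
 */
static int sketch_sidecar_verify(const struct sketch_sidecar *sc,
				 uint64_t block, uint16_t on_disk_count)
{
	assert(block < sc->nr_blocks);
	return sc->write_count[block] == on_disk_count ? 0 : 1;
}

static void sketch_sidecar_free(struct sketch_sidecar *sc)
{
	free(sc->write_count);
	sc->write_count = NULL;
}
```

The memory cost is one counter per block, which is where the large
sidecar file comes from once the map has to survive across invocations.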

Code references showing that the temporally-variant fields are not
checked for read-only workloads:

verify_io_u() forces the seed to match the header's seed when !td_rw():

/*
 * Make rand_seed check pass when have verify_backlog or
 * zone reset frequency for zonemode=zbd.
 */
if (!td_rw(td) || (td->flags & TD_F_VER_BACKLOG) ||
    td->o.zrf.u.f)
    io_u->rand_seed = hdr->rand_seed;

verify_header() bypasses the numberio check for read-only invocations:

/*
 * For read-only workloads, the program cannot be certain of the
 * last numberio written to a block. Checking of numberio will be
 * done only for workloads that write data.  For verify_only,
 * numberio check is skipped.
 */
if (td_write(td) && (td_min_bs(td) == td_max_bs(td)) &&
    !td->o.time_based)
    if (!td->o.verify_only)
        if (hdr->numberio != io_u->numberio) {
            log_err("verify: bad header numberio %"PRIu16
                ", wanted %"PRIu16,
                hdr->numberio, io_u->numberio);
            goto err;
        }

Adam (horshack@xxxxxxxx)