Hi Chris,

We are using software we developed called CrashMonkey [1]. It
simulates the state on storage after a crash (taking into account
the FLUSH and FUA flags). Talk slides on how it works can be found
here [2].

It is similar to dm-log-writes if you have used that in the past.

[1] https://github.com/utsaslab/crashmonkey
[2] http://www.cs.utexas.edu/~vijay/papers/hotstorage17-crashmonkey-slides.pdf

Thanks,
Vijay Chidambaram

On Tue, Apr 24, 2018 at 10:07 PM, Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
>
> On Tue, Apr 24, 2018 at 8:35 PM, Jayashree Mohan <jayashree2912@xxxxxxxxx> wrote:
>>
>> Hi,
>>
>> While investigating crash consistency bugs on btrfs, we came across
>> workloads that demonstrate inconsistent behavior of fsync.
>>
>> Consider the following workload where fsync on the directory did not persist it.
>>
>> Workload 1:
>>
>> mkdir A
>> sync
>> rename (A, B)
>> creat B/foo
>> fsync B/foo
>> fsync B
>> ---crash---
>>
>> In this case, the directory B as well as file B/foo are missing.
>> What's more worrying is that, on recovery from crash, we expect the
>> contents of the directories to be
>>
>> Dir A : should not exist
>> Dir B :
>> foo
>>
>> But instead, what we see is that:
>> Dir A :
>> foo
>> Dir B : doesn't exist
>>
>>
>> This state is acceptable if we had created the file foo in dir A and
>> then renamed the directory - in that case it would mean the rename did
>> not persist. However, what we see here is that a file created in
>> directory B falsely appears in A, which is incorrect.
>>
>> However, if we did not persist the initial create of directory A, i.e.
>>
>> Workload 2:
>>
>> mkdir A
>> rename (A, B)
>> creat B/foo
>> fsync B/foo
>> fsync B
>> ---crash---
>>
>> the directory B and its entry both get persisted in this case.
>>
>> Is this something to do with the directory entry A being already
>> present in the FS/subvolume tree and then the changes to the directory
>> inode going into the fsync log?
>>
>> We do not clearly understand the reason for such inconsistent
>> behavior, but it does seem incorrect.
>>
>> Consider another case where we found inconsistent behavior in the way
>> fsync is handled.
>>
>> Workload 3:
>>
>> mkdir A
>> mkdir B
>> creat A/foo
>> link (A/foo, B/foo)
>> fsync A/foo
>> fsync B/foo
>> ---crash---
>>
>> In this case, file A/foo is persisted, but in spite of an explicit
>> fsync on B/foo, the file goes missing.
>>
>> Workload 4:
>>
>> mkdir A
>> mkdir B
>> creat A/foo
>> link (A/foo, B/foo)
>> fsync B/foo
>> fsync A/foo
>> ---crash---
>>
>> Note that the only difference between workloads 3 and 4 is the order
>> of fsync on files A/foo and B/foo. In this case, the file B/foo is
>> persisted, but A/foo is missing.
>>
>> What we interpret from the above workloads is that the second fsync
>> is behaving like a no-op, and in either case, only the file that is
>> fsynced first gets persisted. If we insert a sleep(45) between the two
>> fsyncs in the workloads above, we see both the files A/foo and B/foo
>> being persisted.
>>
>> No matter how many more links we create and fsync, only the first
>> fsync persists the file, i.e. for example,
>>
>> Workload 5:
>>
>> mkdir A
>> mkdir B
>> mkdir C
>> creat A/foo
>> link (A/foo, B/foo)
>> link (A/foo, C/foo)
>> fsync B/foo
>> fsync A/foo
>> fsync C/foo
>> ---crash---
>>
>> Only file B/foo gets persisted, and both A/foo and C/foo are missing.
>>
>> This seems like inconsistent behavior as only the first fsync persists
>> the file, while all others don't seem to. Do you agree that this is
>> indeed incorrect and needs fixing?
>>
>> All the above tests pass on ext4 and xfs.
>>
>> Please let us know what you feel about such inconsistency.
>>
>
> I don't have an answer to your question, but I'm curious exactly how you simulate a crash?
> For my own really rudimentary testing I've been doing crazy things like:
>
> # grub-mkconfig -o /boot/efi && echo b > /proc/sysrq-trigger
>
> And seeing what makes it to disk - or not. And I'm finding that some
> non-deterministic results are possible even in a VM, which is a bit
> confusing. I'm sure with real hardware I'd find even more inconsistency.
>
>
> --
> Chris Murphy
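
For anyone who wants to issue the same operations by hand, Workloads 1 and 3
from the quoted report can be sketched roughly as the system-call sequences
below. This is only an illustration: the helper names (fsync_path,
create_file, workload_1, workload_3) and the root argument are made up here
and are not part of CrashMonkey. The script runs the operations up to the
crash point and does not simulate the crash itself; for that you still need
something like CrashMonkey or dm-log-writes, with root on the file system
under test.

```python
import os
import tempfile


def fsync_path(path):
    """Open a file or directory read-only, fsync it, then close it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def create_file(path):
    """Create an empty file (the 'creat' step in the workloads)."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    os.close(fd)


def workload_1(root):
    """mkdir A; sync; rename(A, B); creat B/foo; fsync B/foo; fsync B."""
    a = os.path.join(root, "A")
    b = os.path.join(root, "B")
    os.mkdir(a)
    os.sync()                      # persist the creation of A first
    os.rename(a, b)
    create_file(os.path.join(b, "foo"))
    fsync_path(os.path.join(b, "foo"))
    fsync_path(b)
    # ---crash--- would be injected here by the crash simulator


def workload_3(root):
    """mkdir A; mkdir B; creat A/foo; link(A/foo, B/foo); fsync both."""
    a_foo = os.path.join(root, "A", "foo")
    b_foo = os.path.join(root, "B", "foo")
    os.mkdir(os.path.join(root, "A"))
    os.mkdir(os.path.join(root, "B"))
    create_file(a_foo)
    os.link(a_foo, b_foo)          # hard link: same inode, second name
    fsync_path(a_foo)
    fsync_path(b_foo)              # swap these two fsyncs for Workload 4
    # ---crash--- would be injected here by the crash simulator


if __name__ == "__main__":
    # Assumption: a scratch directory stands in for the btrfs mount point.
    workload_1(tempfile.mkdtemp())
    workload_3(tempfile.mkdtemp())
```

Without a crash, both functions leave the state the report expects after
recovery (only B/foo for Workload 1, both A/foo and B/foo for Workload 3),
which is the baseline to compare the post-crash state against.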