[LSF/MM/BPF TOPIC] scaling error injection for block / fs

We *used* to not have any error handling in the block layer's
*add_disk() paths, going all the way back to the code's inception.
Fortunately that's now history, but the one piece of that effort which
was dropped was error injection [0]. It was dropped because Hannes noted
that we may want to discuss whether the approach is the best we can do.

I looked into this and indeed the BPF method to do error injection
is a viable alternative [1]. However, this was later generalized
beyond kprobes [2], and all one now needs is to sprinkle
ALLOW_ERROR_INJECTION() on the functions we want to enable error
injection for. This makes things much easier: instead of having to
write a kernel BPF program, load it with load_bpf_file(), run it, and
then specify the commands you want on the shell, you can now use
something as simple as a shell script:

-------------------------------------------------------------------
#!/bin/bash

rm -f testfile.img
dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
DEVICE=$(losetup --show -f testfile.img)
mkfs.btrfs -f $DEVICE
mkdir -p tmpmnt

FAILTYPE=fail_function
FAILFUNC=open_ctree
# Select the function to fail and the value it should return (-12 is -ENOMEM)
echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
echo -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
# Fail for every task, on every call, with no limit on the number of failures
echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 100 > /sys/kernel/debug/$FAILTYPE/probability
echo 0 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

# open_ctree() is now forced to fail, so the mount is expected to fail too
mount -t btrfs $DEVICE tmpmnt
if [ $? -ne 0 ]
then
       echo "SUCCESS!"
else
       echo "FAILED!"
       umount tmpmnt
fi

echo > /sys/kernel/debug/$FAILTYPE/inject

rmdir tmpmnt
losetup -d $DEVICE
rm testfile.img
-------------------------------------------------------------------

This seems to be much more adaptable to what we do in blktests and
fstests. So before I go forward with adding error injection for the
block layer (only one user) or fs (only btrfs uses this so far), I think
it would be prudent for us to socialize if this *is* the scalable
strategy we'd like to see moving forward. If not then LSFMM seems like a
good place to iron out any possible kinks or concerns folks might have.
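
To give an idea of how this could be wrapped up for blktests / fstests,
here is a rough sketch of two helpers built from the same knobs as the
script above. The names fail_function_arm, fail_function_disarm and the
FAIL_ROOT variable are made up for illustration; nothing like this
exists in either test suite today as far as this thread goes. FAIL_ROOT
is only overridable so the helpers can be dry-run against a scratch
directory.

```shell
#!/bin/bash
# Hypothetical helpers, names are illustrative only.
# FAIL_ROOT defaults to the fail_function debugfs directory.
FAIL_ROOT=${FAIL_ROOT:-/sys/kernel/debug/fail_function}

# Arm error injection: make function $1 return $2 on every call,
# for every task, with no limit on the number of failures.
fail_function_arm()
{
	local func=$1 retval=$2

	printf '%s\n' "$func" > "$FAIL_ROOT/inject" || return 1
	printf '%s\n' "$retval" > "$FAIL_ROOT/$func/retval" || return 1
	echo N > "$FAIL_ROOT/task-filter"
	echo 100 > "$FAIL_ROOT/probability"
	echo 0 > "$FAIL_ROOT/interval"
	echo -1 > "$FAIL_ROOT/times"
}

# Disarm: an empty write to "inject" clears the injection list,
# same as the last echo in the script above.
fail_function_disarm()
{
	echo > "$FAIL_ROOT/inject"
}
```

A test would then call "fail_function_arm open_ctree -12", run the
operation that is expected to fail, and call fail_function_disarm on
its way out.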

If we already have consensus and this is *the* way to go then I'll just
go forward and start adding some knobs / tests for this.

Perhaps the only issue I can think of is that you need a kernel
which enables error injection, and so production kernels would not
cut it.
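
For reference, the fail_function interface depends on
CONFIG_FUNCTION_ERROR_INJECTION, CONFIG_FAULT_INJECTION_DEBUG_FS and
CONFIG_FAIL_FUNCTION. A test could check for support at run time and
skip itself otherwise. A minimal sketch, assuming the read-only
"injectable" file lists one function per line with the function name as
the first field; the helper name fail_function_supported and the
FAIL_ROOT variable are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper, name is illustrative only.
FAIL_ROOT=${FAIL_ROOT:-/sys/kernel/debug/fail_function}

# Return 0 if function $1 is listed as injectable by the running kernel,
# non-zero if it is not listed or the interface is not available at all.
fail_function_supported()
{
	local func=$1

	[ -r "$FAIL_ROOT/injectable" ] || return 1
	# Assumed line format: "<function> <error type>"; match field 1 exactly.
	awk -v f="$func" '$1 == f { found = 1 } END { exit !found }' \
		"$FAIL_ROOT/injectable"
}
```

With this, something like "fail_function_supported open_ctree || exit 0"
at the top of a test keeps it from reporting bogus failures on kernels
built without error injection.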

Thoughts?

[0] https://lkml.kernel.org/r/20210512064629.13899-9-mcgrof@xxxxxxxxxx
[1] https://lwn.net/Articles/740146/
[2] https://patchwork.ozlabs.org/project/netdev/patch/151563182380.628.2420967932180154822.stgit@devbox/

  Luis


