Re: [Lsf-pc] [LSF/MM/BPF TOPIC] tracing the source of errors

On 2/7/24 5:23 AM, Miklos Szeredi wrote:
> On Wed, 7 Feb 2024 at 12:00, Jan Kara <jack@xxxxxxx> wrote:
> 
>> The problem always has been how to implement this functionality in a
>> transparent way so the code does not become a mess. So if you have some
>> idea, I'd say go for it :)
> 
> My first idea would be to wrap all instances of E* (e.g. ERR(E*)).
> But this could be made completely transparent by renaming current
> definition of E* to _E* and defining E* to be the wrapped ones.
> There's probably a catch (or several catches) somewhere, though.
> 
> Thanks,
> Miklos
> 

Just FWIW, XFS has kind of been there and back again on wrapping error returns
with macros.

Long ago, we had an XFS_ERROR() macro, e.g.

	if (error)
		return -XFS_ERROR(error);

sprinkled (randomly) throughout the code.

(it didn't make it out through strace, and was pretty clunky but could printk or
BUG based on which error you were looking for, IIRC.)

In 2014(!) I removed it, pointing out that systemtap could essentially do the
same thing, and do it more flexibly (see: [PATCH 2/2] xfs: Nuke XFS_ERROR macro):

# probe module("xfs").function("xfs_*").return { if (@defined($return) &&
$return == VALUE) { ... } }

hch pointed out that systemtap was not a viable option for many, and further
discussion turned up a slightly kludgey way to use kprobes:

-- from dchinner --
#!/bin/bash

TRACEDIR=/sys/kernel/debug/tracing

grep -i 't xfs_' /proc/kallsyms | awk '{print $3}' | while read F; do
	echo "r:ret_$F $F \$retval" >> $TRACEDIR/kprobe_events
done

for E in $TRACEDIR/events/kprobes/ret_xfs_*/enable; do
	echo 1 > $E
done;

echo 'arg1 > 0xffffffffffffff00' > $TRACEDIR/events/kprobes/filter

for T in $TRACEDIR/events/kprobes/ret_xfs_*/trigger; do
	echo 'traceoff if arg1 > 0xffffffffffffff00' > $T
done
--------

which yields, e.g.:

# dd if=/dev/zero of=/mnt/scratch/newfile bs=513 oflag=direct
dd: error writing ‘/mnt/scratch/newfile’: Invalid argument
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000259882 s, 0.0 kB/s
root@test4:~# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 1/1   #P:16
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
           <...>-8073  [006] d... 145740.460546: ret_xfs_file_dio_aio_write:
(xfs_file_aio_write+0x170/0x180 <- xfs_file_dio_aio_write) arg1=0xffffffffffffffea

where that last value is the negative errno, here -EINVAL (-22), matching the
"Invalid argument" that dd reported.
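
Reading an errno off a raw arg1 value means undoing the two's complement by
hand. A minimal sketch of a helper for that follows; the function name
decode_retval is made up for illustration, and it assumes the value is an
error return printed as 16 hex digits by a 64-bit kernel, as in the trace
above. It only looks at the low 32 bits, so that it does not depend on how the
shell handles 64-bit overflow.

```shell
#!/bin/bash

# decode_retval is a hypothetical helper, not part of any tracing tool.
# It turns a kprobe $retval such as 0xffffffffffffffea into a signed errno.
# Only meaningful for error returns (values that matched the filter above).
decode_retval() {
	# Strip the 0x prefix and keep the last 8 hex digits (low 32 bits).
	low_hex=$(printf '%s' "${1#0x}" | tail -c 8)
	# Reinterpret the low 32 bits as a negative two's-complement number.
	echo $(( 0x$low_hex - 0x100000000 ))
}

decode_retval 0xffffffffffffffea	# prints -22
```

The printed number can then be looked up in errno-base.h or errno.h; 22 is
EINVAL.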

Not the prettiest thing but something that works today and could maybe be improved?

-Eric



