On Mon, May 25, 2020 at 9:23 AM Michal Soltys <msoltyspl@xxxxxxxxx> wrote:
>
> On 5/19/20 1:55 AM, Song Liu wrote:
> >
> > 2. try use bcc/bpftrace to trace r5l_recovery_read_page(),
> > specifically, the 4th argument.
> > With bcc, it is something like:
> >
> >     trace.py -M 100 'r5l_recovery_read_page() "%llx", arg4'
> >
> > -M above limits the number of outputs to 100 lines. We may need to
> > increase the limit or remove the constraint. If the system doesn't
> > have bcc/bpftrace. You can also try with kprobe.
> >
>
> Trace keeps outputting the following data (with steadily growing 4th
> argument):
>
> PID     TID     COMM    FUNC                    -
> 3456    3456    mdadm   r5l_recovery_read_page  98f65b8
> 3456    3456    mdadm   r5l_recovery_read_page  98f65c0
> 3456    3456    mdadm   r5l_recovery_read_page  98f65c8
> 3456    3456    mdadm   r5l_recovery_read_page  98f65d0
> 3456    3456    mdadm   r5l_recovery_read_page  98f65d8
> 3456    3456    mdadm   r5l_recovery_read_page  98f65e0
> 3456    3456    mdadm   r5l_recovery_read_page  98f65e8
> 3456    3456    mdadm   r5l_recovery_read_page  98f65f0
> 3456    3456    mdadm   r5l_recovery_read_page  98f65f8
> 3456    3456    mdadm   r5l_recovery_read_page  98f6600
> 3456    3456    mdadm   r5l_recovery_read_page  98f65c0
> 3456    3456    mdadm   r5l_recovery_read_page  98f65c8
> 3456    3456    mdadm   r5l_recovery_read_page  98f65d0
> 3456    3456    mdadm   r5l_recovery_read_page  98f65d8
> 3456    3456    mdadm   r5l_recovery_read_page  98f65e0
> 3456    3456    mdadm   r5l_recovery_read_page  98f65e8
> 3456    3456    mdadm   r5l_recovery_read_page  98f65f0
> 3456    3456    mdadm   r5l_recovery_read_page  98f65f8
> 3456    3456    mdadm   r5l_recovery_read_page  98f6600
> 3456    3456    mdadm   r5l_recovery_read_page  98f6608
> 3456    3456    mdadm   r5l_recovery_read_page  98f6610
>
> ... a few minutes later ...
>
> PID     TID     COMM    FUNC                    -
> 3456    3456    mdadm   r5l_recovery_read_page  9b69b60
> 3456    3456    mdadm   r5l_recovery_read_page  9b69b68
> 3456    3456    mdadm   r5l_recovery_read_page  9b69b70
> 3456    3456    mdadm   r5l_recovery_read_page  9b69b78
> 3456    3456    mdadm   r5l_recovery_read_page  9b69b80
> 3456    3456    mdadm   r5l_recovery_read_page  9b69b88
> 3456    3456    mdadm   r5l_recovery_read_page  9b69b90
> 3456    3456    mdadm   r5l_recovery_read_page  9b69b98
> 3456    3456    mdadm   r5l_recovery_read_page  9b69ba0
> 3456    3456    mdadm   r5l_recovery_read_page  9b69ba8
> 3456    3456    mdadm   r5l_recovery_read_page  9b69bb0
> 3456    3456    mdadm   r5l_recovery_read_page  9b69bb8
> 3456    3456    mdadm   r5l_recovery_read_page  9b69bc0
> 3456    3456    mdadm   r5l_recovery_read_page  9b69bc8
> 3456    3456    mdadm   r5l_recovery_read_page  9b69b90
> 3456    3456    mdadm   r5l_recovery_read_page  9b69b98
> 3456    3456    mdadm   r5l_recovery_read_page  9b69ba0
> 3456    3456    mdadm   r5l_recovery_read_page  9b69ba8
> 3456    3456    mdadm   r5l_recovery_read_page  9b69bb0
> 3456    3456    mdadm   r5l_recovery_read_page  9b69bb8
>
> ... and so on

Looks like the kernel has processed about 1.2GB of journal
(9b69bb8 - 98f65b8 sectors) between these two samples. The limit is
min(1/4 disk size, 10GB).

I just checked the code: it should stop once it hits a checksum
mismatch. Does it keep going after half an hour or so?

Thanks,
Song
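
For reference, the 1.2GB estimate can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes the traced 4th argument is an offset in 512-byte sectors, and the helper name journal_span is made up for illustration.

```python
# Assumption: the values printed by the trace are offsets in 512-byte sectors.
SECTOR_BYTES = 512


def journal_span(first_hex, last_hex, sector_bytes=SECTOR_BYTES):
    """Return (sectors, bytes, GiB) covered between two hex sector offsets."""
    sectors = int(last_hex, 16) - int(first_hex, 16)
    size_bytes = sectors * sector_bytes
    return sectors, size_bytes, size_bytes / 2**30


# First offset of the first sample and last offset of the later sample.
sectors, size_bytes, gib = journal_span("98f65b8", "9b69bb8")
print(f"{sectors} sectors = {size_bytes} bytes = {gib:.2f} GiB")
```

Under the same assumption, consecutive trace lines differ by 8 sectors, i.e. one 4096-byte page per call.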