Re: 5.5.0-0.rc1 hang, could be zstd compression related

I applied that single line on top of 5.5.0-rc3, and the build fails:

fs/btrfs/compression.c:449:17: error: implicit declaration of function
‘bio_set_bev’; did you mean ‘bio_set_dev’?
[-Werror=implicit-function-declaration]

If I use bio_set_dev instead, it builds, with one objtool warning:

...
  CC [M]  fs/btrfs/compression.o
fs/btrfs/compression.o: warning: objtool:
end_compressed_bio_read.cold()+0x11: unreachable instruction
  LD [M]  fs/btrfs/btrfs.o
  GEN     .version
...

Despite that warning, it seems to work: no crash with the reproducer.
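
For anyone following along, here is a small userspace model of why the
ordering matters. It is only a sketch: the model_* and submit_* names are
made up for illustration and are not kernel APIs, and it assumes (as the
oops in bio_associate_blkg_from_css suggests) that the association step
reads through the device attached to the bio.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real kernel definitions. */
struct model_queue { int id; };
struct model_bdev { struct model_queue *queue; };
struct model_bio {
	struct model_bdev *bdev;    /* NULL until a device is set */
	struct model_queue *blkg_q; /* filled in by the association step */
};

/* Models bio_set_dev(): attach a device to the bio. */
void model_bio_set_dev(struct model_bio *bio, struct model_bdev *bdev)
{
	assert(bio != NULL);
	bio->bdev = bdev;
}

/*
 * Models the association step. Returns -1 where the real kernel
 * would dereference a NULL pointer, 0 on success.
 */
int model_associate_blkg(struct model_bio *bio)
{
	if (bio->bdev == NULL)
		return -1;
	bio->blkg_q = bio->bdev->queue;
	return 0;
}

/* Unpatched order: associate before any device is known. */
int submit_unpatched(void)
{
	struct model_bio bio = { NULL, NULL };

	return model_associate_blkg(&bio);
}

/* Patched order: set the device first, then associate. */
int submit_patched(struct model_bdev *latest_bdev)
{
	struct model_bio bio = { NULL, NULL };

	model_bio_set_dev(&bio, latest_bdev);
	return model_associate_blkg(&bio);
}
```

With a valid device, submit_patched() returns 0 while submit_unpatched()
returns -1, which matches the crash going away once the device is set
before the association call.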

On Wed, Dec 11, 2019 at 8:59 AM David Sterba <dsterba@xxxxxxx> wrote:
>
> On Wed, Dec 11, 2019 at 04:55:53PM +0100, David Sterba wrote:
> > On Wed, Dec 11, 2019 at 09:58:45AM -0500, Josef Bacik wrote:
> > > On 12/10/19 11:00 PM, Chris Murphy wrote:
> > > > Could continue to chat in one application, the desktop environment was
> > > > responsive, but no shells worked and I couldn't get to a tty and I
> > > > couldn't ssh into remotely. Looks like the journal has everything up
> > > > until I pressed and held down the power button.
> > > >
> > > >
> > > > /dev/nvme0n1p7 on / type btrfs
> > > > (rw,noatime,seclabel,compress=zstd:1,ssd,space_cache=v2,subvolid=274,subvol=/root)
> > > >
> > > > dmesg pretty
> > > > https://pastebin.com/pvG3ERnd
> > > >
> > > > dmesg (likely MUA stomped)
> > > > [10224.184137] flap.local kernel: perf: interrupt took too long (2522 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
> > > > [14712.698184] flap.local kernel: perf: interrupt took too long (3153 > 3152), lowering kernel.perf_event_max_sample_rate to 63000
> > > > [17903.211976] flap.local kernel: Lockdown: systemd-logind:
> > > > hibernation is restricted; see man kernel_lockdown.7
> > > > [22877.667177] flap.local kernel: BUG: kernel NULL pointer
> > > > dereference, address: 00000000000006c8
> > > > [22877.667182] flap.local kernel: #PF: supervisor read access in kernel mode
> > > > [22877.667184] flap.local kernel: #PF: error_code(0x0000) - not-present page
> > > > [22877.667187] flap.local kernel: PGD 0 P4D 0
> > > > [22877.667191] flap.local kernel: Oops: 0000 [#1] SMP PTI
> > > > [22877.667194] flap.local kernel: CPU: 2 PID: 14747 Comm: kworker/u8:7
> > > > Not tainted 5.5.0-0.rc1.git0.1.fc32.x86_64+debug #1
> > > > [22877.667196] flap.local kernel: Hardware name: HP HP Spectre
> > > > Notebook/81A0, BIOS F.43 04/16/2019
> > > > [22877.667226] flap.local kernel: Workqueue: btrfs-delalloc
> > > > btrfs_work_helper [btrfs]
> > > > [22877.667233] flap.local kernel: RIP:
> > > > 0010:bio_associate_blkg_from_css+0x1c/0x3b0
> > >
> > > This looks like the extent_map bdev cleanup thing that was supposed to be fixed,
> > > did you send the patch without the fix for it Dave?  Thanks,
> >
> > The fix for NULL bdev was added in 429aebc0a9a063667dba21 (and tested
> > with cgroups v2) and it's in a different function than the one that
> > appears on the stacktrace.
> >
> > This seems to be another instance where the bdev is needed right after
> > the bio is created but way earlier than it's actually known for real,
> > yet still needed for the blkcg thing.
> >
> >  443         bio = btrfs_bio_alloc(first_byte);
> >  444         bio->bi_opf = REQ_OP_WRITE | write_flags;
> >  445         bio->bi_private = cb;
> >  446         bio->bi_end_io = end_compressed_bio_write;
> >  447
> >  448         if (blkcg_css) {
> >  449                 bio->bi_opf |= REQ_CGROUP_PUNT;
> >  450                 bio_associate_blkg_from_css(bio, blkcg_css);
> >  451         }
> >
> > Strange that it takes so long to reproduce, meaning the 'if' branch is
> > not taken often.
>
> Compile tested only:
>
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -446,6 +446,7 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
>         bio->bi_end_io = end_compressed_bio_write;
>
>         if (blkcg_css) {
> +               bio_set_bev(bio, fs_info->fs_devices->latest_bdev);
>                 bio->bi_opf |= REQ_CGROUP_PUNT;
>                 bio_associate_blkg_from_css(bio, blkcg_css);
>         }
>


-- 
Chris Murphy