Re: 5.5.0-0.rc1 hang, could be zstd compression related

On Wed, Dec 11, 2019 at 09:58:45AM -0500, Josef Bacik wrote:
> On 12/10/19 11:00 PM, Chris Murphy wrote:
> > Could continue to chat in one application, the desktop environment was
> > responsive, but no shells worked and I couldn't get to a tty and I
> > couldn't ssh into remotely. Looks like the journal has everything up
> > until I pressed and held down the power button.
> > 
> > 
> > /dev/nvme0n1p7 on / type btrfs
> > (rw,noatime,seclabel,compress=zstd:1,ssd,space_cache=v2,subvolid=274,subvol=/root)
> > 
> > dmesg pretty
> > https://pastebin.com/pvG3ERnd
> > 
> > dmesg (likely MUA stomped)
> > [10224.184137] flap.local kernel: perf: interrupt took too long (2522 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
> > [14712.698184] flap.local kernel: perf: interrupt took too long (3153 > 3152), lowering kernel.perf_event_max_sample_rate to 63000
> > [17903.211976] flap.local kernel: Lockdown: systemd-logind:
> > hibernation is restricted; see man kernel_lockdown.7
> > [22877.667177] flap.local kernel: BUG: kernel NULL pointer
> > dereference, address: 00000000000006c8
> > [22877.667182] flap.local kernel: #PF: supervisor read access in kernel mode
> > [22877.667184] flap.local kernel: #PF: error_code(0x0000) - not-present page
> > [22877.667187] flap.local kernel: PGD 0 P4D 0
> > [22877.667191] flap.local kernel: Oops: 0000 [#1] SMP PTI
> > [22877.667194] flap.local kernel: CPU: 2 PID: 14747 Comm: kworker/u8:7
> > Not tainted 5.5.0-0.rc1.git0.1.fc32.x86_64+debug #1
> > [22877.667196] flap.local kernel: Hardware name: HP HP Spectre
> > Notebook/81A0, BIOS F.43 04/16/2019
> > [22877.667226] flap.local kernel: Workqueue: btrfs-delalloc
> > btrfs_work_helper [btrfs]
> > [22877.667233] flap.local kernel: RIP:
> > 0010:bio_associate_blkg_from_css+0x1c/0x3b0
> 
> This looks like the extent_map bdev cleanup thing that was supposed to be fixed, 
> did you send the patch without the fix for it Dave?  Thanks,

The fix for NULL bdev was added in 429aebc0a9a063667dba21 (and tested
with cgroups v2) and it's in a different function than the one that
appears on the stacktrace.

This seems to be another instance where the bdev is needed right after
the bio is created, well before the real target device is known,
because the blkcg association requires it.

        bio = btrfs_bio_alloc(first_byte);
        bio->bi_opf = REQ_OP_WRITE | write_flags;
        bio->bi_private = cb;
        bio->bi_end_io = end_compressed_bio_write;

        if (blkcg_css) {
                bio->bi_opf |= REQ_CGROUP_PUNT;
                bio_associate_blkg_from_css(bio, blkcg_css);
        }

Strange that it takes so long to reproduce; that suggests the
'if (blkcg_css)' branch is not taken often.
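
To illustrate the ordering problem, here is a minimal userspace sketch.
All toy_* names are hypothetical stand-ins, not the kernel's structures
or API. It assumes the association step reaches the queue through the
bio's device pointer, which would be consistent with a NULL dereference
at a small offset when no device has been set yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
struct toy_queue { int id; };
struct toy_disk  { struct toy_queue *queue; };
struct toy_bio {
        struct toy_disk  *disk;        /* NULL until a device is assigned */
        struct toy_queue *blkg_queue;  /* filled in by the association step */
};

/* Assign the target device to the bio. */
static void toy_set_dev(struct toy_bio *bio, struct toy_disk *disk)
{
        assert(disk != NULL);
        bio->disk = disk;
}

/*
 * Associate the bio with a queue through its device.
 * Returns 0 on success, -1 if no device has been set yet.
 * Without the NULL check, bio->disk->queue would read through a NULL
 * pointer, which is the failure pattern assumed in this sketch.
 */
static int toy_associate_blkg(struct toy_bio *bio)
{
        if (bio->disk == NULL)
                return -1;
        bio->blkg_queue = bio->disk->queue;
        return 0;
}
```

In this toy model, calling toy_associate_blkg() right after allocation
fails, while calling toy_set_dev() first lets it succeed. Whether the
real fix should set a device early or defer the association is a
separate question that the sketch does not answer.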


