RE: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg

On 12.04.21 12:53, Muchun Song wrote:
On Mon, Apr 12, 2021 at 6:42 PM Christian Borntraeger
<borntraeger@xxxxxxxxxx> wrote:

FWIW, I was away last week, and I checked the regression runs for yesterday's next (e99d8a849517).
I still see errors in our CI system:

[ 2263.021681] ------------[ cut here ]------------
[ 2263.021697] percpu ref (obj_cgroup_release) <= 0 (0) after switching to atomic
[ 2263.021748] WARNING: CPU: 4 PID: 0 at lib/percpu-refcount.c:196 percpu_ref_switch_to_atomic_rcu+0x1ea/0x1f8
[ 2263.021756] Modules linked in: scsi_debug vfio_pci irqbypass vfio_virqfd kvm vhost_vsock vmw_vsock_virtio_transport_common vsock vhost vhost_iotlb xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT xt_tcpudp nft_compat nf_nat_tftp nft_objref nf_conntrack_tftp nft_counter bridge stp llc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink dm_service_time zfcp scsi_transport_fc dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua rpcrdma sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib dm_mod ib_uverbs ib_core s390_trng vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio eadm_sch zcrypt_cex4 sch_fq_codel configfs ip_tables x_tables ghash_s390 prng aes_s390 des_s390 libdes sha3_512_s390 sha3_256_s390 mlx5_core sha512_s390 sha256_s390 sha1_s390 sha_common nvme nvme_core pkey zcrypt rng_core autofs4 [last unloaded: vfio_ap]
[ 2263.021820] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.12.0-20210412.rc6.git0.e99d8a849517.300.fc33.s390x+next #1
[ 2263.021823] Hardware name: IBM 8561 T01 703 (LPAR)
[ 2263.021825] Krnl PSW : 0704c00180000000 000000025b234c1e (percpu_ref_switch_to_atomic_rcu+0x1ee/0x1f8)
[ 2263.021829]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[ 2263.021832] Krnl GPRS: c0000000fffeffff 00000002f7212818 0000000000000042 00000000fffeffff
[ 2263.021834]            00000000ffffffea 0000038000000001 0000000000000000 000003800000017c
[ 2263.021836]            000000025b980988 00000000b774d0e0 000003fee191d5d8 8000000000000000
[ 2263.021838]            000000008034c000 00000002f7227570 000000025b234c1a 00000380000aba28
[ 2263.021849] Krnl Code: 000000025b234c0e: e3309fe8ff04        lg      %r3,-24(%r9)
                            000000025b234c14: c0e5001ebe92        brasl   %r14,000000025b60c938
                           #000000025b234c1a: af000000            mc      0,0
                           >000000025b234c1e: a7f4ffcc            brc     15,000000025b234bb6
                            000000025b234c22: 0707                bcr     0,%r7
                            000000025b234c24: 0707                bcr     0,%r7
                            000000025b234c26: 0707                bcr     0,%r7
                            000000025b234c28: eb6ff0480024        stmg    %r6,%r15,72(%r15)
[ 2263.021912] Call Trace:
[ 2263.021914]  [<000000025b234c1e>] percpu_ref_switch_to_atomic_rcu+0x1ee/0x1f8
[ 2263.021917] ([<000000025b234c1a>] percpu_ref_switch_to_atomic_rcu+0x1ea/0x1f8)
[ 2263.021919]  [<000000025abe16fe>] rcu_do_batch+0x146/0x608
[ 2263.021924]  [<000000025abe5ff4>] rcu_core+0x124/0x1d0
[ 2263.021926]  [<000000025b62a222>] __do_softirq+0x13a/0x3c8
[ 2263.021930]  [<000000025ab5d3f6>] irq_exit+0xce/0xf8
[ 2263.021934]  [<000000025b61a5f6>] do_ext_irq+0xd6/0x160
[ 2263.021937]  [<000000025b627c3c>] ext_int_handler+0xc4/0xf4
[ 2263.021939]  [<0000000000000000>] 0x0
[ 2263.021943]  [<000000025b62775a>] default_idle_call+0x42/0x110
[ 2263.021945]  [<000000025ab99328>] do_idle+0xd8/0x168
[ 2263.021949]  [<000000025ab99576>] cpu_startup_entry+0x36/0x40
[ 2263.021952]  [<000000025ab1f33a>] smp_start_secondary+0x82/0x88
[ 2263.021955] Last Breaking-Event-Address:
[ 2263.021955]  [<000000025abc8828>] vprintk_emit+0xa8/0x110
[ 2263.021961] Kernel panic - not syncing: panic_on_warn set ...
[ 2263.021962] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.12.0-20210412.rc6.git0.e99d8a849517.300.fc33.s390x+next #1
[ 2263.021964] Hardware name: IBM 8561 T01 703 (LPAR)
[ 2263.021965] Call Trace:
[ 2263.021966]  [<000000025b60bc9a>] show_stack+0x92/0xd8
[ 2263.021972]  [<000000025b6161c0>] dump_stack+0x90/0xc0
[ 2263.021975]  [<000000025b60cab2>] panic+0x112/0x308
[ 2263.021977]  [<000000025ab5571a>] __warn+0xc2/0x158
[ 2263.021981]  [<000000025b2a5e4a>] report_bug+0xb2/0x130
[ 2263.021984]  [<000000025ab09ef4>] monitor_event_exception+0x44/0xc0
[ 2263.021986]  [<000000025b61a1e8>] __do_pgm_check+0xe0/0x1f0
[ 2263.021988]  [<000000025b627b30>] pgm_check_handler+0x118/0x160
[ 2263.021990]  [<000000025b234c1e>] percpu_ref_switch_to_atomic_rcu+0x1ee/0x1f8
[ 2263.021992] ([<000000025b234c1a>] percpu_ref_switch_to_atomic_rcu+0x1ea/0x1f8)
[ 2263.021993]  [<000000025abe16fe>] rcu_do_batch+0x146/0x608
[ 2263.021995]  [<000000025abe5ff4>] rcu_core+0x124/0x1d0
[ 2263.021997]  [<000000025b62a222>] __do_softirq+0x13a/0x3c8
[ 2263.021998]  [<000000025ab5d3f6>] irq_exit+0xce/0xf8
[ 2263.022000]  [<000000025b61a5f6>] do_ext_irq+0xd6/0x160
[ 2263.022001]  [<000000025b627c3c>] ext_int_handler+0xc4/0xf4
[ 2263.022003]  [<0000000000000000>] 0x0
[ 2263.022004]  [<000000025b62775a>] default_idle_call+0x42/0x110
[ 2263.022006]  [<000000025ab99328>] do_idle+0xd8/0x168
[ 2263.022008]  [<000000025ab99576>] cpu_startup_entry+0x36/0x40

So either the fix was not complete or it is still missing in next.

The fix is now in the mm tree. I guess the branch you
tested does not contain this fix patch. You can check whether
the function obj_cgroup_get_many() exists. If it
doesn't, my guess is correct.

Right, the next tree from April 9th does not yet contain obj_cgroup_get_many().
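
For anyone who wants to repeat the check suggested above on their own checkout, here is a minimal sketch. It only looks for the identifier obj_cgroup_get_many in C sources and headers. The helper name find_symbol and the searched directories (mm and include/linux) are assumptions made for illustration, not something stated in this thread, so adjust them to your tree.

```python
import os
import re


def find_symbol(tree_root, symbol="obj_cgroup_get_many",
                subdirs=("mm", "include/linux")):
    """Return sorted relative paths of .c/.h files that mention `symbol`.

    The default subdirs are an assumption about where the helper would
    live; they are not taken from the mail thread.
    """
    # Match the symbol as a whole identifier, so obj_cgroup_get() alone
    # does not count as a hit.
    pattern = re.compile(r"\b" + re.escape(symbol) + r"\b")
    hits = []
    for sub in subdirs:
        # os.walk() yields nothing for a missing directory.
        for dirpath, _dirs, files in os.walk(os.path.join(tree_root, sub)):
            for name in files:
                if not name.endswith((".c", ".h")):
                    continue
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as fh:
                    if pattern.search(fh.read()):
                        hits.append(os.path.relpath(path, tree_root))
    return sorted(hits)
```

Calling find_symbol() with the path to a kernel checkout returns a list of matching files. An empty list suggests the tested tree does not yet carry the fix, which would match the result reported for the April 9th next tree.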



