I am sorry so later for replying this message due to something.
This is the backtrace edited by me we met.
[pid:3471,cpu4,thread-3]------------[ cut here ]------------
[pid:3471,cpu4,thread-3]kernel bug at ../../../../../../mm/zsmalloc.c:1455!
[pid:3471,cpu4,thread-3]internal error: oops - bug: 0 [#1] preempt smp
[pid:3471,cpu4,thread-3]modules linked in:
[pid:3471,cpu4,thread-3]cpu: 4 pid: 3471 comm: thread-3 tainted: g w 4.9.84 #1
[pid:3471,cpu4,thread-3]tgid: 715 comm: proc-a
[pid:3471,cpu4,thread-3]task: ffffffcc83ba1d00 task.stack: ffffffcad99b0000
[pid:3471,cpu4,thread-3]pc is at zs_map_object+0x1e0/0x1f0
[pid:3471,cpu4,thread-3]lr is at zs_map_object+0x9c/0x1f0
[pid:3471,cpu4,thread-3]pc : [] lr : [] pstate: 20000145
[pid:3471,cpu4,thread-3]sp : ffffffcad99b3530
[pid:3471,cpu4,thread-3]x29: ffffffcad99b3530 x28: ffffffcc97533c40
[pid:3471,cpu4,thread-3]x27: ffffffcc974dd720 x26: ffffffcad99b0000
[pid:3471,cpu4,thread-3]x25: 0000000001fa9f80 x24: 0000000000000002
[pid:3471,cpu4,thread-3]x23: ffffff89c3a27000 x22: ffffff89c30e6000
[pid:3471,cpu4,thread-3]x21: ffffff89c354f000 x20: ffffff89c3234720
[pid:3471,cpu4,thread-3]x19: 0000000000000f90 x18: 0000000000000008
[pid:3471,cpu4,thread-3]x17: 00000000bbb877ff x16: 00000000ffdba560
[pid:3471,cpu4,thread-3]x15: ffffffcaeab13ff5 x14: 000000009e3779b1
[pid:3471,cpu4,thread-3]x13: 0000000000000ff4 x12: ffffffcaeab13fd9
[pid:3471,cpu4,thread-3]x11: ffffffcaeab13ffa x10: ffffffcaeab13ff8
[pid:3471,cpu4,thread-3]x9 : ffffffca8cc201b8 x8 : ffffffca8cc20190
[pid:3471,cpu4,thread-3]x7 : 000000000000008e x6 : 000000000000009b
[pid:3471,cpu4,thread-3]x5 : 0000000000000000 x4 : 0000000000000001
[pid:3471,cpu4,thread-3]x3 : 00000042d42a9000 x2 : 00000000000009d0
[pid:3471,cpu4,thread-3]x1 : ffffffcc994ddbc0 x0 : 0000000000000000
[pid:3471,cpu4,thread-3] zs_map_object+0x1e0/0x1f0
[pid:3471,cpu4,thread-3] zs_zpool_map+0x44/0x54
[pid:3471,cpu4,thread-3] zpool_map_handle+0x44/0x58
[pid:3471,cpu4,thread-3] zram_bvec_write+0x22c/0x76c
[pid:3471,cpu4,thread-3] zram_bvec_rw+0x288/0x488
[pid:3471,cpu4,thread-3] zram_rw_page+0x124/0x1a4
[pid:3471,cpu4,thread-3] bdev_write_page+0x8c/0xd8
[pid:3471,cpu4,thread-3] __swap_writepage+0x1c0/0x3a8
[pid:3471,cpu4,thread-3] swap_writepage+0x3c/0x64
[pid:3471,cpu4,thread-3] shrink_page_list+0x844/0xd84
[pid:3471,cpu4,thread-3] reclaim_pages_from_list+0xf4/0x1bc
[pid:3471,cpu4,thread-3] reclaim_pte_range+0x208/0x2a0
[pid:3471,cpu4,thread-3] walk_pgd_range+0xe8/0x238
[pid:3471,cpu4,thread-3] walk_page_range+0x7c/0x164
[pid:3471,cpu4,thread-3] reclaim_write+0x208/0x608
[pid:3471,cpu4,thread-3] __vfs_write+0x50/0x88
[pid:3471,cpu4,thread-3] vfs_write+0xbc/0x2b0
[pid:3471,cpu4,thread-3] sys_write+0x60/0xc4
[pid:3471,cpu4,thread-3] el0_svc_naked+0x34/0x38
[pid:3471,cpu4,thread-3]code: 17ffffdd d4210000 97ffff1f 97ffff83 (d4210000)
[pid:3471,cpu4,thread-3]---[ end trace 652caafc4c4b6d06 ]---
随心邮-在微信里收发邮件,及时省电又安心Hi Sergey, On Mon, Aug 13, 2018 at 07:55:36PM +0900, Sergey Senozhatsky wrote: > On (08/13/18 15:05), Minchan Kim wrote: > > > From: zhouxianrong> > > > > > The last partial object in last subpage of zspage should not be linked > > > in allocation list. Otherwise it could trigger BUG_ON explicitly at > > > function zs_map_object. But it happened rarely. > > > > Could you be more specific? What case did you see the problem? > > Is it a real problem or one founded by review? > [..] > > > Signed-off-by: zhouxianrong > > > --- > > > mm/zsmalloc.c | 2 ++ > > > 1 file changed, 2 insertions(+) > > > > > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c > > > index 8d87e973a4f5..24dd8da0aa59 100644 > > > --- a/mm/zsmalloc.c > > > +++ b/mm/zsmalloc.c > > > @@ -1040,6 +1040,8 @@ static void init_zspage(struct size_class *class, struct zspage *zspage) > > > * Reset OBJ_TAG_BITS bit to last link to tell > > > * whether it's allocated object or not. > > > */ > > > + if (off > PAGE_SIZE) > > > + link -= class->size / sizeof(*link); > > > link->next = -1UL << OBJ_TAG_BITS; > > > } > > > kunmap_atomic(vaddr); > > Hmm. This can be a real issue. Unless I'm missing something. > > So... I might be wrong, but the way I see the bug report is: > > When we link objects during zspage init, we do the following: > > while ((off += class->size) < PAGE_SIZE) { > link->next = freeobj++ << OBJ_TAG_BITS; > link += class->size / sizeof(*link); > } > > Note that we increment the link first, link += class->size / sizeof(*link), > and check for the offset only afterwards. So by the time we break out of > the while-loop the link *might* point to the partial object which starts at > the last page of zspage, but *never* ends, because we don't have next_page > in current zspage. So that's why that object should not be linked in, > because it's not a valid allocates object - we simply don't have space > for it anymore. > > zspage [ page 1 ][ page 2 ] > ...............................link > [..###] > > therefore the last object must be "link - 1" for such cases. > > I think, the following change can also do the trick: > > while ((off + class->size) < PAGE_SIZE) { > link->next = freeobj++ << OBJ_TAG_BITS; > link += class->size / sizeof(*link); > off += class->size; > } > > Once again, I might be wrong on this. > Any thoughts? If we want a refactoring, I'm not against but description said it tiggered BUG_ON on zs_map_object rarely. That means it should be stable material and need more description to understand. Please be more specific with some example. The reason I'm hesitating is zsmalloc moves ZS_FULL group when the zspage->inuse is equal to class->objs_per_zspage so I thought it shouldn't allocate last partial object. Thanks.