reply:Re: [PATCH] zsmalloc: fix linking bug in init_zspage

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Minchan:
We do not know how to repetition of this issue. Just met this problem two times. But yudongbin find that if disabled the function zs_page_isolate this issue never happen. It seems like mapping's .migratepage affects this issue .For now we just know these. As you said the last partial object is never allocated because its zspage already has been moved to ZS_FULL.

THANKS

Hi zhouxianrong,

Please could you be more sepcific what case can we encounter below BUG?
(Please use plain text)
What zs_class size did you this this problem?
Could you say how that can happen?

As I wrote in other reply, zsmalloc should never allocate last parital
object when I look at source code so we need to understand what specific
case we are missing if it's a real zsmalloc bug.

Please explain how that can be happen with a real example.

Thanks.

On Thu, Aug 16, 2018 at 08:10:42AM +0800, zhouxianrong wrote:
> Hi:

  I am sorry so later for replying this message due to something.

This is the backtrace edited by me we met.

[pid:3471,cpu4,thread-3]------------[ cut here ]------------
[pid:3471,cpu4,thread-3]kernel bug at ../../../../../../mm/zsmalloc.c:1455!
[pid:3471,cpu4,thread-3]internal error: oops - bug: 0 [#1] preempt smp
[pid:3471,cpu4,thread-3]modules linked in:
[pid:3471,cpu4,thread-3]cpu: 4 pid: 3471 comm: thread-3 tainted: g        w       4.9.84 #1
[pid:3471,cpu4,thread-3]tgid: 715 comm: proc-a
[pid:3471,cpu4,thread-3]task: ffffffcc83ba1d00 task.stack: ffffffcad99b0000
[pid:3471,cpu4,thread-3]pc is at zs_map_object+0x1e0/0x1f0
[pid:3471,cpu4,thread-3]lr is at zs_map_object+0x9c/0x1f0
[pid:3471,cpu4,thread-3]pc : [] lr : [] pstate: 20000145
[pid:3471,cpu4,thread-3]sp : ffffffcad99b3530
[pid:3471,cpu4,thread-3]x29: ffffffcad99b3530 x28: ffffffcc97533c40
[pid:3471,cpu4,thread-3]x27: ffffffcc974dd720 x26: ffffffcad99b0000
[pid:3471,cpu4,thread-3]x25: 0000000001fa9f80 x24: 0000000000000002
[pid:3471,cpu4,thread-3]x23: ffffff89c3a27000 x22: ffffff89c30e6000
[pid:3471,cpu4,thread-3]x21: ffffff89c354f000 x20: ffffff89c3234720
[pid:3471,cpu4,thread-3]x19: 0000000000000f90 x18: 0000000000000008
[pid:3471,cpu4,thread-3]x17: 00000000bbb877ff x16: 00000000ffdba560
[pid:3471,cpu4,thread-3]x15: ffffffcaeab13ff5 x14: 000000009e3779b1
[pid:3471,cpu4,thread-3]x13: 0000000000000ff4 x12: ffffffcaeab13fd9
[pid:3471,cpu4,thread-3]x11: ffffffcaeab13ffa x10: ffffffcaeab13ff8
[pid:3471,cpu4,thread-3]x9 : ffffffca8cc201b8 x8 : ffffffca8cc20190
[pid:3471,cpu4,thread-3]x7 : 000000000000008e x6 : 000000000000009b
[pid:3471,cpu4,thread-3]x5 : 0000000000000000 x4 : 0000000000000001
[pid:3471,cpu4,thread-3]x3 : 00000042d42a9000 x2 : 00000000000009d0
[pid:3471,cpu4,thread-3]x1 : ffffffcc994ddbc0 x0 : 0000000000000000

[pid:3471,cpu4,thread-3] zs_map_object+0x1e0/0x1f0
[pid:3471,cpu4,thread-3] zs_zpool_map+0x44/0x54
[pid:3471,cpu4,thread-3] zpool_map_handle+0x44/0x58
[pid:3471,cpu4,thread-3] zram_bvec_write+0x22c/0x76c
[pid:3471,cpu4,thread-3] zram_bvec_rw+0x288/0x488
[pid:3471,cpu4,thread-3] zram_rw_page+0x124/0x1a4
[pid:3471,cpu4,thread-3] bdev_write_page+0x8c/0xd8
[pid:3471,cpu4,thread-3] __swap_writepage+0x1c0/0x3a8
[pid:3471,cpu4,thread-3] swap_writepage+0x3c/0x64
[pid:3471,cpu4,thread-3] shrink_page_list+0x844/0xd84
[pid:3471,cpu4,thread-3] reclaim_pages_from_list+0xf4/0x1bc
[pid:3471,cpu4,thread-3] reclaim_pte_range+0x208/0x2a0
[pid:3471,cpu4,thread-3] walk_pgd_range+0xe8/0x238
[pid:3471,cpu4,thread-3] walk_page_range+0x7c/0x164
[pid:3471,cpu4,thread-3] reclaim_write+0x208/0x608
[pid:3471,cpu4,thread-3] __vfs_write+0x50/0x88
[pid:3471,cpu4,thread-3] vfs_write+0xbc/0x2b0
[pid:3471,cpu4,thread-3] sys_write+0x60/0xc4
[pid:3471,cpu4,thread-3] el0_svc_naked+0x34/0x38
[pid:3471,cpu4,thread-3]code: 17ffffdd d4210000 97ffff1f 97ffff83 (d4210000)
[pid:3471,cpu4,thread-3]---[ end trace 652caafc4c4b6d06 ]---
Hi Sergey,
>
> On Mon, Aug 13, 2018 at 07:55:36PM +0900, Sergey Senozhatsky wrote:
> > On (08/13/18 15:05), Minchan Kim wrote:
> > > > From: zhouxianrong
> > > >
> > > > The last partial object in last subpage of zspage should not be linked
> > > > in allocation list. Otherwise it could trigger BUG_ON explicitly at
> > > > function zs_map_object. But it happened rarely.
> > >
> > > Could you be more specific? What case did you see the problem?
> > > Is it a real problem or one founded by review?
> > [..]
> > > > Signed-off-by: zhouxianrong
> > > > ---
> > > > mm/zsmalloc.c | 2 ++
> > > > 1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > > > index 8d87e973a4f5..24dd8da0aa59 100644
> > > > --- a/mm/zsmalloc.c
> > > > +++ b/mm/zsmalloc.c
> > > > @@ -1040,6 +1040,8 @@ static void init_zspage(struct size_class *class, struct zspage *zspage)
> > > > * Reset OBJ_TAG_BITS bit to last link to tell
> > > > * whether it's allocated object or not.
> > > > */
> > > > + if (off > PAGE_SIZE)
> > > > + link -= class->size / sizeof(*link);
> > > > link->next = -1UL << OBJ_TAG_BITS;
> > > > }
> > > > kunmap_atomic(vaddr);
> >
> > Hmm. This can be a real issue. Unless I'm missing something.
> >
> > So... I might be wrong, but the way I see the bug report is:
> >
> > When we link objects during zspage init, we do the following:
> >
> > while ((off += class->size) < PAGE_SIZE) {
> > link->next = freeobj++ << OBJ_TAG_BITS;
> > link += class->size / sizeof(*link);
> > }
> >
> > Note that we increment the link first, link += class->size / sizeof(*link),
> > and check for the offset only afterwards. So by the time we break out of
> > the while-loop the link *might* point to the partial object which starts at
> > the last page of zspage, but *never* ends, because we don't have next_page
> > in current zspage. So that's why that object should not be linked in,
> > because it's not a valid allocates object - we simply don't have space
> > for it anymore.
> >
> > zspage [ page 1 ][ page 2 ]
> > ...............................link
> > [..###]
> >
> > therefore the last object must be "link - 1" for such cases.
> >
> > I think, the following change can also do the trick:
> >
> > while ((off + class->size) < PAGE_SIZE) {
> > link->next = freeobj++ << OBJ_TAG_BITS;
> > link += class->size / sizeof(*link);
> > off += class->size;
> > }
> >
> > Once again, I might be wrong on this.
> > Any thoughts?
>
> If we want a refactoring, I'm not against but description said it tiggered
> BUG_ON on zs_map_object rarely. That means it should be stable material
> and need more description to understand. Please be more specific with
> some example. The reason I'm hesitating is zsmalloc moves ZS_FULL group
> when the zspage->inuse is equal to class->objs_per_zspage so I thought
> it shouldn't allocate last partial object.
>
> Thanks.
>

[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux