Re: [PATCH] cifs: Fix double add page to memcg when cifs_readpages

On 2020/6/23 10:54, ronnie sahlberg wrote:
Acked-by: Ronnie Sahlberg <lsahlber@xxxxxxxxxx>

Nice find!
So was the small change in semantics in 3fea5a499d57 intended, or
could it also affect other filesystems that would need a fix similar
to the cifs one?
afs_readpages has the same problem:
http://lists.infradead.org/pipermail/linux-afs/2020-June/002359.html

On Mon, Jun 22, 2020 at 7:29 PM Zhang Xiaoxu <zhangxiaoxu5@xxxxxxxxxx> wrote:

When running xfstests generic/451, there is a BUG at mm/memcontrol.c:
   page:ffffea000560f2c0 refcount:2 mapcount:0 mapping:000000008544e0ea
        index:0xf
   mapping->aops:cifs_addr_ops dentry name:"tst-aio-dio-cycle-write.451"
   flags: 0x2fffff80000001(locked)
   raw: 002fffff80000001 ffffc90002023c50 ffffea0005280088 ffff88815cda0210
   raw: 000000000000000f 0000000000000000 00000002ffffffff ffff88817287d000
   page dumped because: VM_BUG_ON_PAGE(page->mem_cgroup)
   page->mem_cgroup:ffff88817287d000
   ------------[ cut here ]------------
   kernel BUG at mm/memcontrol.c:2659!
   invalid opcode: 0000 [#1] SMP
   CPU: 2 PID: 2038 Comm: xfs_io Not tainted 5.8.0-rc1 #44
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_
     073836-buildvm-ppc64le-16.ppc.4
   RIP: 0010:commit_charge+0x35/0x50
   Code: 0d 48 83 05 54 b2 02 05 01 48 89 77 38 c3 48 c7
         c6 78 4a ea ba 48 83 05 38 b2 02 05 01 e8 63 0d9
   RSP: 0018:ffffc90002023a50 EFLAGS: 00010202
   RAX: 0000000000000000 RBX: ffff88817287d000 RCX: 0000000000000000
   RDX: 0000000000000000 RSI: ffff88817ac97ea0 RDI: ffff88817ac97ea0
   RBP: ffffea000560f2c0 R08: 0000000000000203 R09: 0000000000000005
   R10: 0000000000000030 R11: ffffc900020237a8 R12: 0000000000000000
   R13: 0000000000000001 R14: 0000000000000001 R15: ffff88815a1272c0
   FS:  00007f5071ab0800(0000) GS:ffff88817ac80000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 000055efcd5ca000 CR3: 000000015d312000 CR4: 00000000000006e0
   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
   Call Trace:
    mem_cgroup_charge+0x166/0x4f0
    __add_to_page_cache_locked+0x4a9/0x710
    add_to_page_cache_locked+0x15/0x20
    cifs_readpages+0x217/0x1270
    read_pages+0x29a/0x670
    page_cache_readahead_unbounded+0x24f/0x390
    __do_page_cache_readahead+0x3f/0x60
    ondemand_readahead+0x1f1/0x470
    page_cache_async_readahead+0x14c/0x170
    generic_file_buffered_read+0x5df/0x1100
    generic_file_read_iter+0x10c/0x1d0
    cifs_strict_readv+0x139/0x170
    new_sync_read+0x164/0x250
    __vfs_read+0x39/0x60
    vfs_read+0xb5/0x1e0
    ksys_pread64+0x85/0xf0
    __x64_sys_pread64+0x22/0x30
    do_syscall_64+0x69/0x150
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
   RIP: 0033:0x7f5071fcb1af
   Code: Bad RIP value.
   RSP: 002b:00007ffde2cdb8e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
   RAX: ffffffffffffffda RBX: 00007ffde2cdb990 RCX: 00007f5071fcb1af
   RDX: 0000000000001000 RSI: 000055efcd5ca000 RDI: 0000000000000003
   RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
   R10: 0000000000001000 R11: 0000000000000293 R12: 0000000000000001
   R13: 000000000009f000 R14: 0000000000000000 R15: 0000000000001000
   Modules linked in:
   ---[ end trace 725fa14a3e1af65c ]---

Since commit 3fea5a499d57 ("mm: memcontrol: convert page cache to a new
mem_cgroup_charge() API"), the page charge is not cancelled when adding a
page to the page cache fails, so a page may be added to the page cache twice:
thread1                       | thread2
cifs_readpages
readpages_get_pages
  add_to_page_cache_locked(head,index=n)=0
                               | readpages_get_pages
                               | add_to_page_cache_locked(head,index=n+1)=0
  add_to_page_cache_locked(head, index=n+1)=-EEXIST
  the next loop iteration then starts with the list head
  page at index=n+1, whose page->mapping is not NULL
readpages_get_pages
add_to_page_cache_locked(head, index=n+1)
  commit_charge
   VM_BUG_ON_PAGE

So we should not start the next loop iteration once adding any page to
the page cache has failed.

Reported-by: Hulk Robot <hulkci@xxxxxxxxxx>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@xxxxxxxxxx>
---
  fs/cifs/file.c | 11 +++++++----
  1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 4fe757cfc360..9b0f8f33f832 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -4336,7 +4336,8 @@ readpages_get_pages(struct address_space *mapping, struct list_head *page_list,
                         break;

                 __SetPageLocked(page);
-               if (add_to_page_cache_locked(page, mapping, page->index, gfp)) {
+               rc = add_to_page_cache_locked(page, mapping, page->index, gfp);
+               if (rc) {
                         __ClearPageLocked(page);
                         break;
                 }
@@ -4352,6 +4353,7 @@ static int cifs_readpages(struct file *file, struct address_space *mapping,
         struct list_head *page_list, unsigned num_pages)
  {
         int rc;
+       int err = 0;
         struct list_head tmplist;
         struct cifsFileInfo *open_file = file->private_data;
         struct cifs_sb_info *cifs_sb = CIFS_FILE_SB(file);
@@ -4396,7 +4398,7 @@ static int cifs_readpages(struct file *file, struct address_space *mapping,
          * the order of declining indexes. When we put the pages in
          * the rdata->pages, then we want them in increasing order.
          */
-       while (!list_empty(page_list)) {
+       while (!list_empty(page_list) && !err) {
                 unsigned int i, nr_pages, bytes, rsize;
                 loff_t offset;
                 struct page *page, *tpage;
@@ -4429,9 +4431,10 @@ static int cifs_readpages(struct file *file, struct address_space *mapping,
                         return 0;
                 }

-               rc = readpages_get_pages(mapping, page_list, rsize, &tmplist,
+               nr_pages = 0;
+               err = readpages_get_pages(mapping, page_list, rsize, &tmplist,
                                          &nr_pages, &offset, &bytes);
-               if (rc) {
+               if (!nr_pages) {
                         add_credits_and_wake_if(server, credits, 0);
                         break;
                 }
--
2.25.4
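
For readers who want to see the patched control flow in isolation, here is a minimal user-space sketch. It is only a toy model under stated assumptions: the names toy_add_to_cache(), toy_get_pages() and toy_readpages() are hypothetical stand-ins for add_to_page_cache_locked(), readpages_get_pages() and cifs_readpages(), the "page cache" is a plain array, and memcg charging, locking and credits are not modelled at all. What it does show is the shape of the fix: the helper reports the first insertion failure, and the outer loop submits whatever was added before the failure and then stops instead of retrying the failed page.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SLOTS 16

/* Toy stand-in for the page cache: slot i is true once index i is cached. */
static bool cache[CACHE_SLOTS];

/* Hypothetical analogue of add_to_page_cache_locked(): fails with -EEXIST
 * if another reader already inserted a page at this index. */
static int toy_add_to_cache(size_t index)
{
    if (index >= CACHE_SLOTS)
        return -EINVAL;
    if (cache[index])
        return -EEXIST;
    cache[index] = true;
    return 0;
}

/* Hypothetical analogue of readpages_get_pages(): take up to `batch` pages
 * starting at *next, stop at the first failed insertion, and return that
 * failure to the caller. */
static int toy_get_pages(size_t *next, size_t end, size_t batch,
                         size_t *nr_pages)
{
    int rc = 0;

    *nr_pages = 0;
    while (*next < end && *nr_pages < batch) {
        rc = toy_add_to_cache(*next);
        if (rc)
            break;              /* the failed page is not retried */
        (*next)++;
        (*nr_pages)++;
    }
    return rc;
}

/* Hypothetical analogue of the patched loop in cifs_readpages(): returns
 * the number of pages submitted for reading in [start, end). */
static size_t toy_readpages(size_t start, size_t end, size_t batch)
{
    size_t next = start, submitted = 0, nr_pages;
    int err = 0;

    while (next < end && !err) {
        err = toy_get_pages(&next, end, batch, &nr_pages);
        if (!nr_pages)
            break;              /* nothing was added, nothing to read */
        submitted += nr_pages;  /* pages added before a failure are still read */
    }
    return submitted;
}
```

If index 3 is already cached when toy_readpages(0, 8, 4) is called, the model submits indexes 0 to 2 and then leaves the loop, which mirrors the `!err` condition and the `!nr_pages` check in the diff above.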

