Re: [PATCH] hfsplus: release bnode pages after use, not before

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 18 June 2015 at 13:58, Sergei Antonov <saproj@xxxxxxxxx> wrote:
> On 18 June 2015 at 04:51, Hin-Tak Leung <hintak.leung@xxxxxxxxx> wrote:
>> Hi Andrew and everybody else,
>>
>> I looked through the pre-git history and seem to have found the reason of how
>> the current code had come to be, and why Sergei's fix seems to involve
>> re-enable "#if 0"'ed out code. It is this - I think the code was wrong on entry,
>> in that the first !REF_PAGES should be REF_PAGES. Then the "remove old
>> debug code"
>> commit below did not remove the right stuff.
>>
>> Looking at the old code, I think REF_PAGES may have started out being
>> 1 (i.e. pages are released right after create, then get/put when needed, no
>> need to release on bnode_free),
>> then the code was modified for efficiency so that pages are not released
>> on bnode_create and not put/get at bnode_put/get but release at bnode_free.
>> But it was still largely working in the REF_PAGE=1 mode (and with the commented
>> out release at bnode_free, which is supposed to be spanned by a !REF_PAGE).
>> Then it was flipped over, and seems to work, and things were forgotten.
>>
>> I think the release at bnode_free was commented out because the person who
>> wrote it wasn't sure - I am not sure either, since it seems like there might be
>> other/more places that one might need to release the pages than just
>> at bnode_free().
>>
>> Does this train of thought make sense?
>
> No. The occurrence of !REF_PAGES was not a typo, it seems to me. Well,
> at first it was my thought too. But later I considered what the logic
> of the author could have been with REF_PAGES=0 and REF_PAGES=1 and
> came to a conclusion that REF_PAGES=1 was a safe approach and
> REF_PAGES=0 was an experimental unsafe release-early approach, so
> !REF_PAGES was what he really meant.
>

You are wrong, it seems. I found even earlier versions of the code,
with history, at
http://sourceforge.net/projects/linux-hfsplus/ .
The earlier version of the code
(http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.1/)
does get_page and put_page at bnode_get and bnode_put, while the later version
(http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.2/
)
have those removed, dated 2001 december and 2002 june.

The change log says so:

Date:   Thu Jun 6 09:45:14 2002 +0000

    Use buffer cache instead of page cache in bnode.c. Cache inode extents.

The change also removed any page_cache_release() . It would seem that
the additional REF_PAGES
conditional was a fork/rewrite/clean-up which re-uses the earlier
development logic, and eventually took over.

The release-early code came first, not after. That makes sense, as
releasing early
and read_cache_page on each bnode_get is the slower, safer but
inefficient approach.
It would be absurd to go from caching pages, then to 'experiment' to
see if release-early + later read_cache_page
on each bnode_get 'improves'.

>>
>> Hin-Tak
>>
>> <commit>
>> commit a5e3985fa014029eb6795664c704953720cc7f7d
>> Author: Roman Zippel <zippel@xxxxxxxxxxxxxx>
>> Date:   Tue Sep 6 15:18:47 2005 -0700
>>
>>     [PATCH] hfs: remove debug code
>>
>>     This removes some old debug code, which is no longer needed.
>>
>>     Signed-off-by: Roman Zippel <zippel@xxxxxxxxxxxxxx>
>>     Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
>>     Signed-off-by: Linus Torvalds <torvalds@xxxxxxxx>
>>
>> diff --git a/fs/hfs/bnode.c b/fs/hfs/bnode.c
>> index a096c5a..3d5cdc6 100644
>> --- a/fs/hfs/bnode.c
>> +++ b/fs/hfs/bnode.c
>> @@ -13,8 +13,6 @@
>>
>>  #include "btree.h"
>>
>> -#define REF_PAGES    0
>> -
>>  void hfs_bnode_read(struct hfs_bnode *node, void *buf,
>>          int off, int len)
>>  {
>> @@ -289,9 +287,7 @@ static struct hfs_bnode *__hfs_bnode_create(struct
>> hfs_btree *tree, u32 cnid)
>>              page_cache_release(page);
>>              goto fail;
>>          }
>> -#if !REF_PAGES
>>          page_cache_release(page);
>> -#endif
>>          node->page[i] = page;
>>      }
>>
>> @@ -449,13 +445,6 @@ void hfs_bnode_get(struct hfs_bnode *node)
>>  {
>>      if (node) {
>>          atomic_inc(&node->refcnt);
>> -#if REF_PAGES
>> -        {
>> -        int i;
>> -        for (i = 0; i < node->tree->pages_per_bnode; i++)
>> -            get_page(node->page[i]);
>> -        }
>> -#endif
>>          dprint(DBG_BNODE_REFS, "get_node(%d:%d): %d\n",
>>                 node->tree->cnid, node->this, atomic_read(&node->refcnt));
>>      }
>> @@ -472,20 +461,12 @@ void hfs_bnode_put(struct hfs_bnode *node)
>>                 node->tree->cnid, node->this, atomic_read(&node->refcnt));
>>          if (!atomic_read(&node->refcnt))
>>              BUG();
>> -        if (!atomic_dec_and_lock(&node->refcnt, &tree->hash_lock)) {
>> -#if REF_PAGES
>> -            for (i = 0; i < tree->pages_per_bnode; i++)
>> -                put_page(node->page[i]);
>> -#endif
>> +        if (!atomic_dec_and_lock(&node->refcnt, &tree->hash_lock))
>>              return;
>> -        }
>>          for (i = 0; i < tree->pages_per_bnode; i++) {
>>              if (!node->page[i])
>>                  continue;
>>              mark_page_accessed(node->page[i]);
>> -#if REF_PAGES
>> -            put_page(node->page[i]);
>> -#endif
>>          }
>>
>>          if (test_bit(HFS_BNODE_DELETED, &node->flags)) {
>> diff --git a/fs/hfsplus/bnode.c b/fs/hfsplus/bnode.c
>> index 8868d3b..b85abc6 100644
>> --- a/fs/hfsplus/bnode.c
>> +++ b/fs/hfsplus/bnode.c
>> @@ -18,8 +18,6 @@
>>  #include "hfsplus_fs.h"
>>  #include "hfsplus_raw.h"
>>
>> -#define REF_PAGES    0
>> -
>>  /* Copy a specified range of bytes from the raw data of a node */
>>  void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int len)
>>  {
>> @@ -450,9 +448,7 @@ static struct hfs_bnode *__hfs_bnode_create(struct
>> hfs_btree *tree, u32 cnid)
>>              page_cache_release(page);
>>              goto fail;
>>          }
>> -#if !REF_PAGES
>>          page_cache_release(page);
>> -#endif
>>          node->page[i] = page;
>>      }
>>
>> @@ -612,13 +608,6 @@ void hfs_bnode_get(struct hfs_bnode *node)
>>  {
>>      if (node) {
>>          atomic_inc(&node->refcnt);
>> -#if REF_PAGES
>> -        {
>> -        int i;
>> -        for (i = 0; i < node->tree->pages_per_bnode; i++)
>> -            get_page(node->page[i]);
>> -        }
>> -#endif
>>          dprint(DBG_BNODE_REFS, "get_node(%d:%d): %d\n",
>>                 node->tree->cnid, node->this, atomic_read(&node->refcnt));
>>      }
>> @@ -635,20 +624,12 @@ void hfs_bnode_put(struct hfs_bnode *node)
>>                 node->tree->cnid, node->this, atomic_read(&node->refcnt));
>>          if (!atomic_read(&node->refcnt))
>>              BUG();
>> -        if (!atomic_dec_and_lock(&node->refcnt, &tree->hash_lock)) {
>> -#if REF_PAGES
>> -            for (i = 0; i < tree->pages_per_bnode; i++)
>> -                put_page(node->page[i]);
>> -#endif
>> +        if (!atomic_dec_and_lock(&node->refcnt, &tree->hash_lock))
>>              return;
>> -        }
>>          for (i = 0; i < tree->pages_per_bnode; i++) {
>>              if (!node->page[i])
>>                  continue;
>>              mark_page_accessed(node->page[i]);
>> -#if REF_PAGES
>> -            put_page(node->page[i]);
>> -#endif
>>          }
>>
>>          if (test_bit(HFS_BNODE_DELETED, &node->flags)) {
>> </commit>
>>
>> On 18 June 2015 at 00:41, Hin-Tak Leung <htl10@xxxxxxxxxxxxxxxxxxxxx> wrote:
>>> ------------------------------
>>> On Tue, Jun 9, 2015 7:06 PM BST Anton Altaparmakov wrote:
>>>
>>>>Hi Andrew,
>>>>
>>>>Can you please take this patch up and get it merged into mainline?  Despite Vyacheslav's lamentations this patch is obviously correct.  __hfs_bnode_create() is called in two places from the driver and the pages it returned are kmap()-ed and used (both read and write) so it is quite obvious they cannot be page_cache_release()-d until after they are kunmap()-ed instead of before they are even kmap()-ed as happens without this patch and this patch fixes exactly this by moving the page_cache_release() calls to after the pages are kunmap()-ed.
>>>>
>>>>Feel free to add:
>>>>
>>>>Reviewed-by: Anton Altaparmakov <anton@xxxxxxxxxx>
>>>>
>>>>Thanks a lot in advance!
>>>>
>>>>Best regards,
>>>>
>>>>    Anton
>>>
>>> I went around and looked at the code and Anton is right -  __hfs_bnode_create() is called from hfs_bnode_create() and bfs_bnode_find(), and the pages are played with afterwards a bit further, so page_cache_release() at where the current location is, is wrong.
>>>
>>> Since I have a way of seeing a likely-related problem, I'll try testing this change in the next few days and possibly adding a Signed-off at the point.
>>>
>>> BTW, there is a similiar problem with the hfs (no plus) driver also; the "if 0" part is commented out with // c++ style comments instead.
>>> I'm looking over the pre-git history to see how that comes about.
>>>
>>> Hin-Tak
>>>
>>>>> On 7 Jun 2015, at 03:42, Sergei Antonov <saproj@xxxxxxxxx> wrote:
>>>>>
>>>>> Fix this bugreport by Sasha Levin:
>>>>> http://lkml.org/lkml/2015/2/20/85 ("use after free")
>>>>> Make sure mapped pages are available for the entire lifetime of hfs_bnode.
>>>>>
>>>>> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
>>>>> Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>
>>>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>>>> Cc: Vyacheslav Dubeyko <slava@xxxxxxxxxxx>
>>>>> Cc: Hin-Tak Leung <htl10@xxxxxxxxxxxxxxxxxxxxx>
>>>>> Cc: Sougata Santra <sougata@xxxxxxxxxx>
>>>>> Reported-by: Sasha Levin <sasha.levin@xxxxxxxxxx>
>>>>> Signed-off-by: Sergei Antonov <saproj@xxxxxxxxx>
>>>>> ---
>>>>> fs/hfsplus/bnode.c | 6 ++----
>>>>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/fs/hfsplus/bnode.c b/fs/hfsplus/bnode.c
>>>>> index 759708f..5af50fb 100644
>>>>> --- a/fs/hfsplus/bnode.c
>>>>> +++ b/fs/hfsplus/bnode.c
>>>>> @@ -454,7 +454,6 @@ static struct hfs_bnode *__hfs_bnode_create(struct hfs_btree *tree, u32 cnid)
>>>>>             page_cache_release(page);
>>>>>             goto fail;
>>>>>         }
>>>>> -        page_cache_release(page);
>>>>>         node->page[i] = page;
>>>>>     }
>>>>>
>>>>> @@ -566,13 +565,12 @@ node_error:
>>>>>
>>>>> void hfs_bnode_free(struct hfs_bnode *node)
>>>>> {
>>>>> -#if 0
>>>>>     int i;
>>>>>
>>>>> -    for (i = 0; i < node->tree->pages_per_bnode; i++)
>>>>> +    for (i = 0; i < node->tree->pages_per_bnode; i++) {
>>>>>         if (node->page[i])
>>>>>             page_cache_release(node->page[i]);
>>>>> -#endif
>>>>> +    }
>>>>>     kfree(node);
>>>>> }
>>>>>
>>>>> --
>>>>> 2.3.0
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>>--
>>>>Anton Altaparmakov <anton at tuxera.com> (replace at with @)
>>>>Lead in File System Development, Tuxera Inc., http://www.tuxera.com/
>>>>Linux NTFS maintainer
>>>>
>>>
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux