On 18 June 2015 at 13:58, Sergei Antonov <saproj@xxxxxxxxx> wrote: > On 18 June 2015 at 04:51, Hin-Tak Leung <hintak.leung@xxxxxxxxx> wrote: >> Hi Andrew and everybody else, >> >> I looked through the pre-git history and seem to have found the reason of how >> the current code had come to be, and why Sergei's fix seems to involve >> re-enable "#if 0"'ed out code. It is this - I think the code was wrong on entry, >> in that the first !REF_PAGES should be REF_PAGES. Then the "remove old >> debug code" >> commit below did not remove the right stuff. >> >> Looking at the old code, I think REF_PAGES may have started out being >> 1 (i.e. pages are released right after create, then get/put when needed, no >> need to release on bnode_free), >> then the code was modified for efficiency so that pages are not released >> on bnode_create and not put/get at bnode_put/get but release at bnode_free. >> But it was still largely working in the REF_PAGE=1 mode (and with the commented >> out release at bnode_free, which is supposed to be spanned by a !REF_PAGE). >> Then it was flipped over, and seems to work, and things were forgotten. >> >> I think the release at bnode_free was commented out because the person who >> wrote it wasn't sure - I am not sure either, since it seems like there might be >> other/more places that one might need to release the pages than just >> at bnode_free(). >> >> Does this train of thought make sense? > > No. The occurrence of !REF_PAGES was not a typo, it seems to me. Well, > at first it was my thought too. But later I considered what the logic > of the author could have been with REF_PAGES=0 and REF_PAGES=1 and > came to a conclusion that REF_PAGES=1 was a safe approach and > REF_PAGES=0 was an experimental unsafe release-early approach, so > !REF_PAGES was what he really meant. > You are wrong, it seems. I found even earlier versions of the code, with history, at http://sourceforge.net/projects/linux-hfsplus/ . The earlier version of the code (http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.1/) does get_page and put_page at bnode_get and bnode_put, while the later version (http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.2/ ) have those removed, dated 2001 december and 2002 june. The change log says so: Date: Thu Jun 6 09:45:14 2002 +0000 Use buffer cache instead of page cache in bnode.c. Cache inode extents. The change also removed any page_cache_release() . It would seem that the additional REF_PAGES conditional was a fork/rewrite/clean-up which re-uses the earlier development logic, and eventually took over. The release-early code came first, not after. That makes sense, as releasing early and read_cache_page on each bnode_get is the slower, safer but inefficient approach. It would be absurd to go from caching pages, then to 'experiment' to see if release-early + later read_cache_page on each bnode_get 'improves'. >> >> Hin-Tak >> >> <commit> >> commit a5e3985fa014029eb6795664c704953720cc7f7d >> Author: Roman Zippel <zippel@xxxxxxxxxxxxxx> >> Date: Tue Sep 6 15:18:47 2005 -0700 >> >> [PATCH] hfs: remove debug code >> >> This removes some old debug code, which is no longer needed. >> >> Signed-off-by: Roman Zippel <zippel@xxxxxxxxxxxxxx> >> Signed-off-by: Andrew Morton <akpm@xxxxxxxx> >> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxx> >> >> diff --git a/fs/hfs/bnode.c b/fs/hfs/bnode.c >> index a096c5a..3d5cdc6 100644 >> --- a/fs/hfs/bnode.c >> +++ b/fs/hfs/bnode.c >> @@ -13,8 +13,6 @@ >> >> #include "btree.h" >> >> -#define REF_PAGES 0 >> - >> void hfs_bnode_read(struct hfs_bnode *node, void *buf, >> int off, int len) >> { >> @@ -289,9 +287,7 @@ static struct hfs_bnode *__hfs_bnode_create(struct >> hfs_btree *tree, u32 cnid) >> page_cache_release(page); >> goto fail; >> } >> -#if !REF_PAGES >> page_cache_release(page); >> -#endif >> node->page[i] = page; >> } >> >> @@ -449,13 +445,6 @@ void hfs_bnode_get(struct hfs_bnode *node) >> { >> if (node) { >> atomic_inc(&node->refcnt); >> -#if REF_PAGES >> - { >> - int i; >> - for (i = 0; i < node->tree->pages_per_bnode; i++) >> - get_page(node->page[i]); >> - } >> -#endif >> dprint(DBG_BNODE_REFS, "get_node(%d:%d): %d\n", >> node->tree->cnid, node->this, atomic_read(&node->refcnt)); >> } >> @@ -472,20 +461,12 @@ void hfs_bnode_put(struct hfs_bnode *node) >> node->tree->cnid, node->this, atomic_read(&node->refcnt)); >> if (!atomic_read(&node->refcnt)) >> BUG(); >> - if (!atomic_dec_and_lock(&node->refcnt, &tree->hash_lock)) { >> -#if REF_PAGES >> - for (i = 0; i < tree->pages_per_bnode; i++) >> - put_page(node->page[i]); >> -#endif >> + if (!atomic_dec_and_lock(&node->refcnt, &tree->hash_lock)) >> return; >> - } >> for (i = 0; i < tree->pages_per_bnode; i++) { >> if (!node->page[i]) >> continue; >> mark_page_accessed(node->page[i]); >> -#if REF_PAGES >> - put_page(node->page[i]); >> -#endif >> } >> >> if (test_bit(HFS_BNODE_DELETED, &node->flags)) { >> diff --git a/fs/hfsplus/bnode.c b/fs/hfsplus/bnode.c >> index 8868d3b..b85abc6 100644 >> --- a/fs/hfsplus/bnode.c >> +++ b/fs/hfsplus/bnode.c >> @@ -18,8 +18,6 @@ >> #include "hfsplus_fs.h" >> #include "hfsplus_raw.h" >> >> -#define REF_PAGES 0 >> - >> /* Copy a specified range of bytes from the raw data of a node */ >> void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int len) >> { >> @@ -450,9 +448,7 @@ static struct hfs_bnode *__hfs_bnode_create(struct >> hfs_btree *tree, u32 cnid) >> page_cache_release(page); >> goto fail; >> } >> -#if !REF_PAGES >> page_cache_release(page); >> -#endif >> node->page[i] = page; >> } >> >> @@ -612,13 +608,6 @@ void hfs_bnode_get(struct hfs_bnode *node) >> { >> if (node) { >> atomic_inc(&node->refcnt); >> -#if REF_PAGES >> - { >> - int i; >> - for (i = 0; i < node->tree->pages_per_bnode; i++) >> - get_page(node->page[i]); >> - } >> -#endif >> dprint(DBG_BNODE_REFS, "get_node(%d:%d): %d\n", >> node->tree->cnid, node->this, atomic_read(&node->refcnt)); >> } >> @@ -635,20 +624,12 @@ void hfs_bnode_put(struct hfs_bnode *node) >> node->tree->cnid, node->this, atomic_read(&node->refcnt)); >> if (!atomic_read(&node->refcnt)) >> BUG(); >> - if (!atomic_dec_and_lock(&node->refcnt, &tree->hash_lock)) { >> -#if REF_PAGES >> - for (i = 0; i < tree->pages_per_bnode; i++) >> - put_page(node->page[i]); >> -#endif >> + if (!atomic_dec_and_lock(&node->refcnt, &tree->hash_lock)) >> return; >> - } >> for (i = 0; i < tree->pages_per_bnode; i++) { >> if (!node->page[i]) >> continue; >> mark_page_accessed(node->page[i]); >> -#if REF_PAGES >> - put_page(node->page[i]); >> -#endif >> } >> >> if (test_bit(HFS_BNODE_DELETED, &node->flags)) { >> </commit> >> >> On 18 June 2015 at 00:41, Hin-Tak Leung <htl10@xxxxxxxxxxxxxxxxxxxxx> wrote: >>> ------------------------------ >>> On Tue, Jun 9, 2015 7:06 PM BST Anton Altaparmakov wrote: >>> >>>>Hi Andrew, >>>> >>>>Can you please take this patch up and get it merged into mainline? Despite Vyacheslav's lamentations this patch is obviously correct. __hfs_bnode_create() is called in two places from the driver and the pages it returned are kmap()-ed and used (both read and write) so it is quite obvious they cannot be page_cache_release()-d until after they are kunmap()-ed instead of before they are even kmap()-ed as happens without this patch and this patch fixes exactly this by moving the page_cache_release() calls to after the pages are kunmap()-ed. >>>> >>>>Feel free to add: >>>> >>>>Reviewed-by: Anton Altaparmakov <anton@xxxxxxxxxx> >>>> >>>>Thanks a lot in advance! >>>> >>>>Best regards, >>>> >>>> Anton >>> >>> I went around and looked at the code and Anton is right - __hfs_bnode_create() is called from hfs_bnode_create() and bfs_bnode_find(), and the pages are played with afterwards a bit further, so page_cache_release() at where the current location is, is wrong. >>> >>> Since I have a way of seeing a likely-related problem, I'll try testing this change in the next few days and possibly adding a Signed-off at the point. >>> >>> BTW, there is a similiar problem with the hfs (no plus) driver also; the "if 0" part is commented out with // c++ style comments instead. >>> I'm looking over the pre-git history to see how that comes about. >>> >>> Hin-Tak >>> >>>>> On 7 Jun 2015, at 03:42, Sergei Antonov <saproj@xxxxxxxxx> wrote: >>>>> >>>>> Fix this bugreport by Sasha Levin: >>>>> http://lkml.org/lkml/2015/2/20/85 ("use after free") >>>>> Make sure mapped pages are available for the entire lifetime of hfs_bnode. >>>>> >>>>> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> >>>>> Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx> >>>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> >>>>> Cc: Vyacheslav Dubeyko <slava@xxxxxxxxxxx> >>>>> Cc: Hin-Tak Leung <htl10@xxxxxxxxxxxxxxxxxxxxx> >>>>> Cc: Sougata Santra <sougata@xxxxxxxxxx> >>>>> Reported-by: Sasha Levin <sasha.levin@xxxxxxxxxx> >>>>> Signed-off-by: Sergei Antonov <saproj@xxxxxxxxx> >>>>> --- >>>>> fs/hfsplus/bnode.c | 6 ++---- >>>>> 1 file changed, 2 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/fs/hfsplus/bnode.c b/fs/hfsplus/bnode.c >>>>> index 759708f..5af50fb 100644 >>>>> --- a/fs/hfsplus/bnode.c >>>>> +++ b/fs/hfsplus/bnode.c >>>>> @@ -454,7 +454,6 @@ static struct hfs_bnode *__hfs_bnode_create(struct hfs_btree *tree, u32 cnid) >>>>> page_cache_release(page); >>>>> goto fail; >>>>> } >>>>> - page_cache_release(page); >>>>> node->page[i] = page; >>>>> } >>>>> >>>>> @@ -566,13 +565,12 @@ node_error: >>>>> >>>>> void hfs_bnode_free(struct hfs_bnode *node) >>>>> { >>>>> -#if 0 >>>>> int i; >>>>> >>>>> - for (i = 0; i < node->tree->pages_per_bnode; i++) >>>>> + for (i = 0; i < node->tree->pages_per_bnode; i++) { >>>>> if (node->page[i]) >>>>> page_cache_release(node->page[i]); >>>>> -#endif >>>>> + } >>>>> kfree(node); >>>>> } >>>>> >>>>> -- >>>>> 2.3.0 >>>>> >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in >>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> >>>>-- >>>>Anton Altaparmakov <anton at tuxera.com> (replace at with @) >>>>Lead in File System Development, Tuxera Inc., http://www.tuxera.com/ >>>>Linux NTFS maintainer >>>> >>> -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html