On 10 June 2015 at 00:15, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 8 Jun 2015 18:50:00 +0200 Sergei Antonov <saproj@xxxxxxxxx> wrote:
>
>> >> You are basically saying you don't understand it. Too bad, because the
>> >> bug is very simple. It is the "use after free" type of bug, and it can
>> >> be illustrated by this:
>> >> (1) void *ptr = malloc(…);
>> >> (2) free(ptr);
>> >> (3) memcpy(…, ptr, 1);
>> >> Guess which two of these three lines are executed in wrong order.
>> >>
>> >> My patch is about the same type of bug, but with memory pages mapping.
>> >> The driver currently accesses pages that may be unavailable, or
>> >> contain different data. The problem is more likely to occur when
>> >> memory is a limited resource. I reproduced it while running a
>> >> memory-hungry program.
>> >
>> > I worried not about myself but about potential readers of description of
>> > the fix. The description is completely obscure. And it needs to describe
>> > the fix in clear and descriptive manner. This is my request. Please,
>> > describe the fix in a clear way.
>>
>> The description is just right.
>
> Yes, I too would like to hear much more about your thinking on this,
> and a detailed description of the bug and how the patch fixes it.

By calling page_cache_release() when it is OK to.

> The code is distressingly undocumented and has been that way since
> Roman Zippel's original commit in 2004.

I looked into it before submitting the patch. The code submitted in 2004
was already broken.

> From the looks of it, that loop in __hfs_bnode_create() is simply doing
> readahead and is designed as a performance optimisation. The pages are
> pulled into pagecache in the expectation that they will soon be
> accessed. What your patch does is to instead pin the pages in
> pagecache until the bnode is freed. If we're going to do this then we
> need to be very careful about worst-case scenarios: we could even run
> the machine out of memory.
I did not try to change the logic of the driver, just fixed one glaring
defect. Which, by the way, in addition to the aforementioned bug reported
by Sasha Levin, also caused:
1. A "du"-related bug reported by Hin-Tak Leung earlier on the list.
2. https://bugzilla.kernel.org/show_bug.cgi?id=63841
3. https://bugzilla.kernel.org/show_bug.cgi?id=78761
4. https://bugzilla.kernel.org/show_bug.cgi?id=42342

> If I'm correct, and this is just readahead then the bug lies elsewhere:

I pinpointed the bug quite precisely. My test code dumped the content of
the page at the moment corrupted data was detected. Then I looked at the
dumped data, and - guess what - data from the memory-hungry program I was
running had sneaked in there! So I am certain the cause of the bug is
indeed the wrong sequence of
read_mapping_page()/kmap()/kunmap()/page_cache_release() calls.

> if other parts of hfsplus are assuming that this memory is in pagecache
> then that's an error - that code (where is it?) should instead be performing
> a pagecache lookup and if the page isn't present, read it from disk
> again.
>
> But for others to be able to review and understand this change and
> suggest alternatives, we'll need a much much better changelog!
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html