Jan Kara <jack@xxxxxxx> writes: >> I'm not sure I'm understanding your pseudocode logic correctly though. >> This logic doesn't seems to be a page forking specific issue. And >> this pseudocode logic seems to be missing the locking and revalidate of >> page. >> >> If you can show more details, it would be helpful to see more, and >> discuss the issue of page forking, or we can think about how to handle >> the corner cases. >> >> Well, before that, why need more details? >> >> For example, replace the page fork at (4) with "truncate", "punch >> hole", or "invalidate page". >> >> Those operations remove the old page from radix tree, so the >> userspace's write creates the new page, and HW still refererences the >> old page. (I.e. situation should be same with page forking, in my >> understand of this pseudocode logic.) > > Yes, if userspace truncates the file, the situation we end up with is > basically the same. However for truncate to happen some malicious process > has to come and truncate the file - a failure scenario that is acceptable > for most use cases since it doesn't happen unless someone is actively > trying to screw you. With page forking it is enough for flusher thread > to start writeback for that page to trigger the problem - event that is > basically bound to happen without any other userspace application > interfering. Acceptable conclusion is where came from? That pseudocode logic doesn't say about usage at all. And even if assume it is acceptable, as far as I can see, for example /proc/sys/vm/drop_caches is enough to trigger, or a page on non-exists block (sparse file. i.e. missing disk space check in your logic). And if really no any lock/check, there would be another races. >> IOW, this pseudocode logic seems to be broken without page forking if >> no lock and revalidate. Usually, we prevent unpleasant I/O by >> lock_page or PG_writeback, and an obsolated page is revalidated under >> lock_page. > > Well, good luck with converting all the get_user_pages() users in kernel to > use lock_page() or PG_writeback checks to avoid issues with page forking. I > don't think that's really feasible. What does all get_user_pages() conversion mean? Well, maybe right more or less, I also think there is the issue in/around get_user_pages() that we have to tackle. IMO, if there is a code that pseudocode logic actually, it is the breakage. And "it is acceptable and limitation, and give up to fix", I don't think it is the right way to go. If there is really code broken like your logic, I think we should fix. Could you point which code is using your logic? Since that seems to be so racy, I can't believe yet there are that racy codes actually. >> For page forking, we may also be able to prevent similar situation by >> locking, flags, and revalidate. But those details might be different >> with current code, because page states are different. > > Sorry, I don't understand what do you mean in this paragraph. Can you > explain it a bit more? This just means a forked page (old page) and a truncated page have different set of flags and state, so we may have to adjust revalidation. Thanks. -- OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx> -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html