On Wed, Jan 4, 2017 at 10:37 PM, George Spelvin
<linux@xxxxxxxxxxxxxxxxxxx> wrote:
> Back in 2001, Linus had some very negative things to say about MAP_COPY.
> I'm going to try to change that opinion.

Not going to happen.

Basically, the way you can change that opinion is if you can show some
clever zero-cost versioning model that "just works". With an actual patch.

Because I'm not seeing it. And without it being zero cost to all the
_real_ users, I'm not adding a MAP_COPY that absolutely nobody will ever
use because it's not standard, and it's not useful enough to them.

We've had a history of failed clever interfaces that end up being very
painful to maintain (splice() being the most obvious one, but we've had
a number of filesystem innovations that just didn't work either, devfs
being the most spectacularly bad one).

> I think I have a semantic for MAP_COPY that is both efficiently
> implementable and useful.

The semantic meaning is not my worry. The implementation is.

> The meaning is "For each page in the mapping, a snapshot of the backing
> file is taken at some undefined time between the mmap() call and the
> first access to the mapped memory. The time of the snapshot may (will!)
> be different for each page. Once taken, the snapshot will not be affected
> by later writes to the file.

Show me the efficient implementation. I see the trivial part: at page
fault time, just do a COW if the page has any other users. But to know
if it has "users", you now need another count that distinguishes between
plain other mappings or *writable* mappings (so "mapcount" needs to be
split up).

That part is fairly simple, because the "new writable mappings" is
hopefully just in a few places.

But the hard part is that all *other* users that might write to the page
now need to do the COW for somebody else.
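
As a rough illustration of the bookkeeping described above, here is a toy
userspace model in C. It is not kernel code and not taken from any patch.
Every name in it (toy_page, toy_copy_map, writable_mappers, copy_mappers,
the fixed-size rmap array) is a hypothetical stand-in, and it assumes one
reading of the argument: a snapshot mapping may share the page only while
nobody has it mapped writable, and any later writer must first hand a
private copy to every snapshot mapping still sharing the page.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_PAGE_SIZE 16
#define TOY_MAX_SNAPSHOTS 4

enum toy_state { TOY_UNTOUCHED, TOY_SHARING, TOY_PRIVATE };

struct toy_copy_map;

/* Hypothetical stand-in for a page-cache page; NOT the kernel's struct page. */
struct toy_page {
    char data[TOY_PAGE_SIZE];
    int writable_mappers;  /* assumed split-out count of writable mappings */
    int copy_mappers;      /* assumed count of snapshot mappings sharing the page */
    struct toy_copy_map *rmap[TOY_MAX_SNAPSHOTS]; /* toy reverse map */
};

/* One MAP_COPY-style mapping of a single page (hypothetical). */
struct toy_copy_map {
    struct toy_page *page;
    enum toy_state state;
    char priv[TOY_PAGE_SIZE]; /* private snapshot, valid once TOY_PRIVATE */
};

/* Reader side: the first access decides whether to share or copy. */
const char *toy_read(struct toy_copy_map *m)
{
    struct toy_page *pg = m->page;

    if (m->state == TOY_UNTOUCHED) {
        if (pg->writable_mappers > 0 ||
            pg->copy_mappers == TOY_MAX_SNAPSHOTS) {
            /* Someone can write behind our back: copy right now. */
            memcpy(m->priv, pg->data, TOY_PAGE_SIZE);
            m->state = TOY_PRIVATE;
        } else {
            /* Safe to share, but future writers must know about us. */
            pg->rmap[pg->copy_mappers++] = m;
            m->state = TOY_SHARING;
        }
    }
    return m->state == TOY_PRIVATE ? m->priv : pg->data;
}

/* Writer side: every path that writes the page must do this first. */
void toy_write(struct toy_page *pg, const char *src)
{
    assert(pg->copy_mappers <= TOY_MAX_SNAPSHOTS);

    /* Do the copy on behalf of each snapshot mapping still sharing. */
    for (int i = 0; i < pg->copy_mappers; i++) {
        struct toy_copy_map *m = pg->rmap[i];

        memcpy(m->priv, pg->data, TOY_PAGE_SIZE);
        m->state = TOY_PRIVATE;
        pg->rmap[i] = NULL;
    }
    pg->copy_mappers = 0;

    /* Only now may the shared page change. */
    snprintf(pg->data, TOY_PAGE_SIZE, "%s", src);
}
```

In this model the reader side is a single check, while toy_write() stands
for the extra step that every existing write path would have to grow. A
mapping that first touches the page after a write simply snapshots the
newer contents, which matches the "undefined time between the mmap() call
and the first access" wording in the quoted proposal.
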
So it basically requires a per-page count (possibly just a flag) of "this
has a copy mapping", and everybody who might write to the page (and who
currently just gets a ref to it) has to check that, and do the rmap thing
etc.

And just creating those two new fields is a big problem. We literally
had a long discussion just about getting a single new _bit_ freed up in
the page flags, because things are so tight. You need two new fields
entirely.

I'm not saying it's impossible. But it's a lot of details (and that
extra field to a very core data structure really is surprisingly
painful) for some very dubious gains. People simply won't be using it.

                Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx. For more info on Linux MM,
see: http://www.linux-mm.org/ .