On Wed, Apr 2, 2014 at 9:36 AM, Johannes Weiner <hannes@xxxxxxxxxxx> wrote: > On Tue, Apr 01, 2014 at 09:12:44PM -0700, John Stultz wrote: >> On 04/01/2014 04:01 PM, Dave Hansen wrote: >> > On 04/01/2014 02:35 PM, H. Peter Anvin wrote: >> >> On 04/01/2014 02:21 PM, Johannes Weiner wrote: >> > John, this was something that the Mozilla guys asked for, right? Any >> > idea why this isn't ever a problem for them? >> So one of their use cases for it is for library text. Basically they >> want to decompress a compressed library file into memory. Then they plan >> to mark the uncompressed pages volatile, and then be able to call into >> it. Ideally for them, the kernel would only purge cold pages, leaving >> the hot pages in memory. When they traverse a purged page, they handle >> the SIGBUS and patch the page up. > > How big are these libraries compared to overall system size? Mike or Taras would have to refresh my memory on this detail. My recollection is it mostly has to do with keeping the on-disk size of the library small, so it can load off of slow media very quickly. >> Now.. this is not what I'd consider a normal use case, but was hoping to >> illustrate some of the more interesting uses and demonstrate the >> interfaces flexibility. > > I'm just dying to hear a "normal" use case then. :) So the more "normal" use cause would be marking objects volatile and then non-volatile w/o accessing them in-between. In this case the zero-fill vs SIGBUS semantics don't really matter, its really just a trade off in how we handle applications deviating (intentionally or not) from this use case. So to maybe flesh out the context here for folks who are following along (but weren't in the hallway at LSF :), Johannes made a fairly interesting proposal (Johannes: Please correct me here where I'm maybe slightly off here) to use only the dirty bits of the ptes to mark a page as volatile. Then the kernel could reclaim these clean pages as it needed, and when we marked the range as non-volatile, the pages would be re-dirtied and if any of the pages were missing, we could return a flag with the purged state. This had some different semantics then what I've been working with for awhile (for example, any writes to pages would implicitly clear volatility), so I wasn't completely comfortable with it, but figured I'd think about it to see if it could be done. Particularly since it would in some ways simplify tmpfs/shm shared volatility that I'd eventually like to do. After thinking it over in the hallway, I talked some of the details w/ Johnnes and there was one issue that while w/ anonymous memory, we can still add a VM_VOLATILE flag on the vma, so we can get SIGBUS semantics, but since on shared volatile ranges, we don't have anything to hang a volatile flag on w/o adding some new vma like structure to the address_space structure (much as we did in the past w/ earlier volatile range implementations). This would negate much of the point of using the dirty bits to simplify the shared volatility implementation. Thus Johannes is reasonably questioning the need for SIGBUS semantics, since if it wasn't needed, the simpler page-cleaning based volatility could potentially be used. Now, while for the case I'm personally most interested in (ashmem), zero-fill would technically be ok, since that's what Android does. Even so, I don't think its the best approach for the interface, since applications may end up quite surprised by the results when they accidentally don't follow the "don't touch volatile pages" rule. That point beside, I think the other problem with the page-cleaning volatility approach is that there are other awkward side effects. For example: Say an application marks a range as volatile. One page in the range is then purged. The application, due to a bug or otherwise, reads the volatile range. This causes the page to be zero-filled in, and the application silently uses the corrupted data (which isn't great). More problematic though, is that by faulting the page in, they've in effect lost the purge state for that page. When the application then goes to mark the range as non-volatile, all pages are present, so we'd return that no pages were purged. From an application perspective this is pretty ugly. Johannes: Any thoughts on this potential issue with your proposal? Am I missing something else? thanks -john -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>