Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints

Hello, Michael

On Fri, Feb 06, 2015 at 04:41:12PM +0100, Michael Kerrisk (man-pages) wrote:
> On 02/05/2015 02:07 AM, Minchan Kim wrote:
> > Hello,
> > 
> > On Wed, Feb 04, 2015 at 08:24:27PM +0100, Michael Kerrisk (man-pages) wrote:
> >> On 4 February 2015 at 18:02, Vlastimil Babka <vbabka@xxxxxxx> wrote:
> >>> On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote:
> >>>>
> >>>> Hello Vlastimil,
> >>>>
> >>>> On 4 February 2015 at 14:46, Vlastimil Babka <vbabka@xxxxxxx> wrote:
> >>>>>>>
> >>>>>>> - that covers mlocking ok, not sure if the rest fits the "shared pages"
> >>>>>>> case though. I don't see any check for other kinds of shared pages in the
> >>>>>>> code.
> >>>>>>
> >>>>>>
> >>>>>> Agreed. "shared" here seems confused. I've removed it. And I've
> >>>>>> added mention of "Huge TLB pages" for this error.
> >>>>>
> >>>>>
> >>>>> Thanks.
> >>>>
> >>>>
> >>>> I also added those cases for MADV_REMOVE, BTW.
> >>>
> >>>
> >>> Right. There's also the following for MADV_REMOVE that needs updating:
> >>>
> >>> "Currently, only shmfs/tmpfs supports this; other filesystems return with
> >>> the error ENOSYS."
> >>>
> >>> - it's not just shmem/tmpfs anymore. It should be best to refer to
> >>> fallocate(2) option FALLOC_FL_PUNCH_HOLE which seems to be (more) up to
> >>> date.
> >>>
> >>> - AFAICS it doesn't return ENOSYS but EOPNOTSUPP. Also neither error code is
> >>> listed in the ERRORS section.
> >>
> >> Yup, I recently added that as well, based on a patch from Jan Chaloupka.
> >>
> >>>>>>>>> - The word "will result" did sound as a guarantee at least to me. So
> >>>>>>>>> here it could be changed to "may result (unless the advice is ignored)"?
> >>>>>>>>
> >>>>>>>> It's too late to fix documentation. Applications already depend on
> >>>>>>>> the behaviour.
> >>>>>>>
> >>>>>>> Right, so as long as they check for EINVAL, it should be safe. It
> >>>>>>> appears that jemalloc does.
> >>>>>>
> >>>>>> So, first a brief question: in the cases where the call does not error
> >>>>>> out, are we agreed that in the current implementation, MADV_DONTNEED will
> >>>>>> always result in zero-filled pages when the region is faulted back in
> >>>>>> (when we consider pages that are not backed by a file)?
> >>>>>
> >>>>> I'd agree at this point.
> >>>>
> >>>> Thanks for the confirmation.
> >>>>
> >>>>> Also we should probably mention anonymously shared pages (shmem). I think
> >>>>> they behave the same as file here.
> >>>>
> >>>> You mean tmpfs here, right? (I don't keep all of the synonyms straight.)
> >>>
> >>> shmem is tmpfs (that by itself would fit under "files" just fine), but also
> >>> sys V segments created by shmget(2) and also mappings created by mmap with
> >>> MAP_SHARED | MAP_ANONYMOUS. I'm not sure if there's a single manpage to
> >>> refer to the full list.
> >>
> >> So, how about this text:
> >>
> >>               After a successful MADV_DONTNEED operation, the seman‐
> >>               tics  of  memory  access  in  the specified region are
> >>               changed: subsequent accesses of  pages  in  the  range
> >>               will  succeed,  but will result in either reloading of
> >>               the memory contents from the  underlying  mapped  file
> >>               (for  shared file mappings, shared anonymous mappings,
> >>               and shmem-based techniques such  as  System  V  shared
> >>               memory  segments)  or  zero-fill-on-demand  pages  for
> >>               anonymous private mappings.
> > 
> > Hmm, I'd like to clarify.
> > 
> > Whether it was intended or not, some userspace developers believe that
> > the syscall drops the pages immediately when it returns without error, so
> > that they will see more free pages (i.e., RSS for the process will
> > decrease) while the VMA is kept. Can we rely on that?
> 
> I do not know. Michael?

It's important to identify the difference between MADV_DONTNEED and MADV_FREE,
so it would be better to clear this up while we have the chance.
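
To make the behaviour under discussion concrete, here is a minimal sketch,
assuming Linux and a private anonymous mapping. It checks only the
zero-fill-on-demand part of the proposed text, not whether RSS drops
immediately. The helper name dontneed_zero_fills() is hypothetical and is
used only for illustration.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a private anonymous page reads back as
 * zeros after a successful MADV_DONTNEED, 0 if old data is still visible,
 * and -1 if the setup (sysconf, mmap or madvise) failed.
 */
int dontneed_zero_fills(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    if (psz <= 0)
        return -1;
    size_t len = (size_t)psz;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Fault the page in and fill it with a non-zero pattern. */
    memset(p, 0xAA, len);

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* Touch the range again and check what the next access observes. */
    int all_zero = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            all_zero = 0;
            break;
        }
    }

    munmap(p, len);
    return all_zero;
}
```

A caller would expect 1 here if the current implementation behaves as
agreed earlier in the thread. With MADV_FREE the same check would not be
meaningful, which is why the two need to be described separately.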

> 
> > And we should update the error section, too.
> > "locked" covers mlock(2) and you said you will add hugetlb. Then what
> > about VM_PFNMAP? In that case, it fails. How should we describe
> > VM_PFNMAP? A special mapping for some drivers?
> 
> I'm open for offers on what to add.

I suggest quoting from LWN (http://lwn.net/Articles/162860/):
"*special mapping* which is not made up of "normal" pages.
It is usually created by device drivers which map special memory areas
into user space"
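
A VM_PFNMAP mapping needs a cooperating driver, so it is hard to show in a
few lines. The "locked" case from the same error list is easier to
reproduce. The following is a rough sketch, assuming Linux and a
RLIMIT_MEMLOCK that allows locking one page. The helper name
dontneed_on_locked_page() is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: locks one anonymous page with mlock(2) and then
 * calls madvise(MADV_DONTNEED) on it.
 * Returns 0 if madvise succeeded, the errno value if it failed, and -1 if
 * the setup (sysconf, mmap or mlock) could not be completed.
 */
int dontneed_on_locked_page(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    if (psz <= 0)
        return -1;
    size_t len = (size_t)psz;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* mlock can fail under a small RLIMIT_MEMLOCK; treat that as setup failure. */
    if (mlock(p, len) != 0) {
        munmap(p, len);
        return -1;
    }

    int result = 0;
    if (madvise(p, len, MADV_DONTNEED) != 0)
        result = errno;     /* save errno before any further calls */

    munlock(p, len);
    munmap(p, len);
    return result;
}
```

On the kernels discussed in this thread, the return value should be EINVAL
whenever the mlock step succeeds.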

>  
> > One more thing, "The kernel is free to ignore the advice".
> > It conflicts with "This call does not influence the semantics of the
> > application (except in the case of MADV_DONTNEED)", so
> > is it okay to read it as "The kernel is free to ignore the advice,
> > except for MADV_DONTNEED"?
> 
> I decided to just drop the sentence
> 
>      The kernel is free to ignore the advice.
> 
> It creates misunderstandings, and does not really add information.

Sounds good.

> 
> Cheers,
> 
> Michael
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .