Re: [Invitation] Linux MM Alignment Session on Memory Error Detector on Wednesday

Thanks everyone for the discussion on Wednesday! To follow up on the
questions I wasn't able to address during the meeting, and to share my
next steps:

Dave Hansen: If one motivation for this is "guest performance", then
it would be great to have some data to back that up, even if it is
worst-case data.
Jiaqi: Definitely, I will set up a memory cycler workload and try to
measure the delta with and without S2 shattering caused by memory
errors.

Peter Xu: we have folio_set_has_hwpoisoned() for folios, right?
(besides PG_hwpoison)
David Hildenbrand: Yes, for !hugetlb IIRC. It prevents us from having
to scan all page flags.
Jane Chu: right, to mark poisoned non-head subpages
Peter Xu: I thought that's OK: when freeing the folio, it knows
something within is poisoned, so it needs a split / degrade / etc., as
long as that is spread to hugetlb pages
Jiaqi: Ack, HasHWPoison is a good tool for the kernel to protect
against subsequent memory poison consumption (i.e. MCE again and again).
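
To illustrate the point David and Jane made above, here is a minimal
userspace sketch of why a folio-level "has poisoned" hint avoids scanning
every per-page flag. This is a toy model, not kernel code: struct
toy_folio and all toy_* names are hypothetical and only stand in for the
folio-level flag set by folio_set_has_hwpoisoned() and the per-page
PG_hwpoison flag discussed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model; sizes and names are assumptions for illustration. */
#define TOY_MAX_PAGES 512 /* e.g. a 2 MiB folio made of 4 KiB pages */

struct toy_folio {
    size_t nr_pages;                   /* pages covered by this folio */
    bool has_hwpoisoned;               /* folio-level hint, checked in O(1) */
    bool page_hwpoison[TOY_MAX_PAGES]; /* stand-in for per-page PG_hwpoison */
};

static void toy_folio_init(struct toy_folio *f, size_t nr_pages)
{
    assert(nr_pages > 0 && nr_pages <= TOY_MAX_PAGES);
    memset(f, 0, sizeof(*f));
    f->nr_pages = nr_pages;
}

/* Mark one subpage as poisoned and raise the folio-level hint. */
static void toy_poison_page(struct toy_folio *f, size_t idx)
{
    assert(idx < f->nr_pages);
    f->page_hwpoison[idx] = true;
    f->has_hwpoisoned = true;
}

/* Fast path at free time: one flag test instead of a scan of all pages. */
static bool toy_folio_needs_split(const struct toy_folio *f)
{
    return f->has_hwpoisoned;
}

/* Slow path: walk the per-page flags only when the hint says it is needed. */
static size_t toy_count_poisoned(const struct toy_folio *f)
{
    size_t n = 0;

    if (!f->has_hwpoisoned)
        return 0;
    for (size_t i = 0; i < f->nr_pages; i++)
        if (f->page_hwpoison[i])
            n++;
    return n;
}
```

In this model, the free path would call toy_folio_needs_split() first and
only fall back to the per-page walk when it returns true, which mirrors
the "split / degrade" decision Peter described.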

I think it would be better for me to separate the MFR policy from
subsequent memory poison protection: no matter what policy is set by
userspace and no matter how the policy is implemented, the kernel will
eventually take action to split, dissolve, and isolate the raw
poisoned page (let's stick with page size for now, instead of
cacheline size). With this requirement, I think the global policy
design is out.

I do want to spend some time thinking more about attaching the MFR
policy to memfd, the proposal from Jason. Memfd already works with
HugeTLBFS and THP, and IIRC will work with guestfs. I will send out an
RFC if I can come up with a design (and code).

Thanks,

Jiaqi


On Mon, Oct 28, 2024 at 4:50 PM Jiaqi Yan <jiaqiyan@xxxxxxxxxx> wrote:
>
> Thanks David for sending out the invite!
>
> On Mon, Oct 28, 2024 at 2:25 PM David Rientjes <rientjes@xxxxxxxxxx> wrote:
> >
> > Hi everybody,
> >
> > We host a biweekly series, the Linux MM Alignment Session, on Wednesdays.
> > We'd like to invite MM developers to attend and will announce the topic
> > for the next instance on the Monday prior to the next meeting.
> >
> > Our next Linux MM Alignment Session is scheduled for Wednesday.  The
> > details:
> >
> > Wednesday, October 30 * 9:00 - 10:00am PDT (GMT-7)
> > https://meet.google.com/csb-wcds-xya
> > backup: (US) +1 347-682-5874 PIN: 356 962 072#
> > international: https://tel.meet/csb-wcds-xya?pin=1301132214803
> >
> > This week's topic will be supporting Memory Error Detector APIs led by
>
> Just a minor correction: the topic is how to make userspace control
> memory failure recovery.
>
> > Jiaqi Yan.  See
> > https://lore.kernel.org/linux-mm/20240924043924.3562257-2-jiaqiyan@xxxxxxxxxx/T/
>
> I have prepared some draft slides here:
> https://docs.google.com/presentation/d/1tWqcuAqeCLhfd47uXXLdu2SzolKu7WYvM03vEkbhobc
>
> >
> > There is lots of interest in these topics, so feel free to forward this
> > along to anybody else who may be interested!
> >
> > Also: if anybody has ideas for future topics, please let me know and I'll
> > try to organize them.  We'd love to have volunteers to lead future topics
> > as well as requests for MM topics to be presented.
> >
> > Looking forward to seeing all of you on Wednesday!
> >




