On Tue, Jul 2, 2019 at 5:48 PM Oscar Salvador <osalvador@xxxxxxx> wrote:
On Tue, Jul 02, 2019 at 04:42:34PM +1000, Rashmica Gupta wrote:
> Hi David,
>
> Sorry for the late reply.
>
> On Wed, 2019-06-26 at 10:28 +0200, David Hildenbrand wrote:
> > On 26.06.19 10:15, Oscar Salvador wrote:
> > > On Wed, Jun 26, 2019 at 10:11:06AM +0200, David Hildenbrand wrote:
> > > > Back then, I already mentioned that we might have some users that
> > > > remove_memory() they never added, or in a granularity it wasn't
> > > > added in. My concerns back then were never fully sorted out.
> > > >
> > > > arch/powerpc/platforms/powernv/memtrace.c
> > > >
> > > > - Will remove memory in memory block size chunks it never added
> > > > - What if that memory resides on a DIMM added via MHP_MEMMAP_DEVICE?
> > > >
> > > > Will it at least bail out? Or simply break?
> > > >
> > > > IOW: I am not yet 100% convinced that MHP_MEMMAP_DEVICE is safe to
> > > > be introduced.
> > >
> > > Uhm, I will take a closer look and see if I can clear your concerns.
> > > TBH, I did not try to use arch/powerpc/platforms/powernv/memtrace.c yet.
> > >
> > > I will get back to you once I tried it out.
> > >
> >
> > BTW, I consider the code in arch/powerpc/platforms/powernv/memtrace.c
> > very ugly and dangerous.
>
> Yes it would be nice to clean this up.
>
> > We should never allow manually offlining/onlining pages or hacking
> > into memory block states.
> >
> > What I would want to see here is rather:
> >
> > 1. User space offlines the blocks to be used
> > 2. memtrace installs a hotplug notifier and hinders the blocks it wants
> > to use from getting onlined.
> > 3. memory is not added/removed/onlined/offlined in memtrace code.
> >
>
> I remember looking into doing it a similar way. I can't recall the
> details but my issue was probably 'how does userspace indicate to
> the kernel that this memory being offlined should be removed'?
>
> I don't know the mm code nor how the notifiers work very well so I
> can't quite see how the above would work. I'm assuming memtrace would
> register a hotplug notifier and when memory is offlined from userspace,
> the callback func in memtrace would be called if the priority was high
> enough? But how do we know that the memory being offlined is intended
> for us to touch? Is there a way to offline memory from userspace not
> using sysfs or have I missed something in the sysfs interface?
>
> On a second read, perhaps you are assuming that memtrace is used after
> adding new memory at runtime? If so, that is not the case. If not, then
> would you be able to clarify what I'm not seeing?
Hi Rashmica,
let us go the easy way here.
Could you please explain:
Sure!
1) How memtrace works
You write the size of the chunk of memory you want into the debugfs file
and memtrace will attempt to find a contiguous section of memory of that size
that can be offlined. If it finds that, then the memory is removed from the
kernel's mappings. If you want a different size, then you write that to the
debugfs file and memtrace will re-add the memory it first removed and then
try to offline and remove a chunk of the new size.
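
To make the flow above concrete, here is a rough userspace toy model in
Python. It is only a sketch of the behaviour described in this mail, not
the code in arch/powerpc/platforms/powernv/memtrace.c: the names
find_contiguous_run and ToyMemtrace are made up for illustration, memory
is modelled as a list of per-block "can be offlined" flags, and the
search order (lowest block first) is an assumption.

```python
from typing import List, Optional, Tuple


def find_contiguous_run(offlinable: List[bool], blocks_needed: int) -> Optional[int]:
    """Return the start index of the first run of `blocks_needed`
    consecutive offlinable blocks, or None if no such run exists."""
    if blocks_needed <= 0:
        raise ValueError("blocks_needed must be positive")
    run_start, run_len = 0, 0
    for i, ok in enumerate(offlinable):
        if ok:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == blocks_needed:
                return run_start
        else:
            run_len = 0  # run broken by a block that cannot be offlined
    return None


class ToyMemtrace:
    """Hypothetical model: one 'write to the debugfs file' == one request()."""

    def __init__(self, offlinable: List[bool]) -> None:
        self.offlinable = list(offlinable)
        self.removed: Optional[Tuple[int, int]] = None  # (start, count)

    def request(self, blocks_needed: int) -> bool:
        # A new size first gives back whatever was removed earlier.
        if self.removed is not None:
            start, count = self.removed
            for i in range(start, start + count):
                self.offlinable[i] = True
            self.removed = None
        start = find_contiguous_run(self.offlinable, blocks_needed)
        if start is None:
            return False  # no contiguous chunk of that size can be offlined
        # "Offline and remove": the blocks are no longer available.
        for i in range(start, start + blocks_needed):
            self.offlinable[i] = False
        self.removed = (start, blocks_needed)
        return True
```

In this model a failed request leaves nothing removed, because the
previous chunk was already re-added before the new search started.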
2) Why it was designed, what is the goal of the interface?
3) When it is supposed to be used?
There is a hardware debugging facility (htm) on some power chips. To use
this you need a contiguous portion of memory for the output to be dumped
to - and we obviously don't want this memory to be simultaneously used by
the kernel.
At boot time we can portion off a section of memory for this (and not tell the
kernel about it), but sometimes you want to be able to use the hardware
debugging facilities and you haven't done this and you don't want to reboot
your machine - and memtrace is the solution for this.
If you're curious, one tool that uses this debugging facility is here:
https://github.com/open-power/pdbg. Relevant files are libpdbg/htm.c and src/htm.c.
I have seen a couple of reports in the past from people running memtrace
and sometimes failing to do so, and back then I could not grasp why people
were using it, or under which circumstances it was nice to have.
So it would be nice to have a detailed explanation from the person who wrote
it.
Is that enough detail?
Thanks
--
Oscar Salvador
SUSE L3