On Thu, May 05, Andrew Morton wrote: > On Tue, 3 May 2011 21:08:06 +0200 > Olaf Hering <olaf at aepfle.de> wrote: > > This will reduce the time to read /proc/vmcore. Without this change a > > 512M guest with 128M crashkernel region needs 200 seconds to read it, > > with this change it takes just 2 seconds. > > Seems reasonable, I suppose. Andrew, Thanks for your feedback. > Is there some suitable ifdef we can put around this stuff to avoid > adding it to kernel builds which will never use it? The change is for pv-on-hvm guests. In this setup the (out-of-tree) paravirtualized drivers shutdown the emulated hardware, then they communicate directly with the backend. There is no ifdef right now. I guess at some point, when xen is fully merged, this hook can be put into some CONFIG_XEN_PV_ON_HVM thing. > > +void register_oldmem_pfn_is_ram(int (*fn)(unsigned long)) > > +{ > > + if (oldmem_pfn_is_ram == NULL) > > + oldmem_pfn_is_ram = fn; > > +} > > This is racy, and it should return a success code. And we may as well > mark it __must_check to prevent people from cheating. I will update that part. > > +void unregister_oldmem_pfn_is_ram(void) > > +{ > > + wait_event(oldmem_fn_waitq, atomic_read(&oldmem_fn_refcount) == 0); > > + oldmem_pfn_is_ram = NULL; > > + wmb(); > > +} > > I'd say we should do away with the (also racy) refcount thing. > Instead, require that callers not be using the thing when they > unregister. ie: that callers not be buggy. I think oldmem_pfn_is_ram can be cleared unconditionally, the NULL check in pfn_is_ram() below will prevent a crash. This whole refcount thing is to prevent a module unload while pfn_is_ram() is calling the hook, in other words the called code should not go away until the hook returns to pfn_is_ram(). Should the called code increase/decrease the modules refcount instead? I remember there was some MODULE_INC/MODULE_DEC macro (cant remember the exact name) at some point. What needs to be done inside the module to prevent an unload while its still in use? Is it __module_get/module_put for each call of fn()? The called function which will go into the xen source at some point looks like shown below. HVMOP_get_mem_type was just merged into xen-unstable. xen-unstable.hg/unmodified_drivers/linux-2.6/platform-pci/platform-pci.c #ifdef HAVE_OLDMEM_PFN_IS_RAM static int xen_oldmem_pfn_is_ram(unsigned long pfn) { struct xen_hvm_get_mem_type a; int ret; a.domid = DOMID_SELF; a.pfn = pfn; if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a)) return -ENXIO; switch (a.mem_type) { case HVMMEM_mmio_dm: ret = 0; break; case HVMMEM_ram_rw: case HVMMEM_ram_ro: default: ret = 1; break; } return ret; } #endif static int __devinit platform_pci_init(...) { /* other init stuff */ #ifdef HAVE_OLDMEM_PFN_IS_RAM register_oldmem_pfn_is_ram(&xen_oldmem_pfn_is_ram); #endif /* other init stuff */ } Also, this xen driver has no module_exit. So for xen the unregister is not an issue. I havent looked at the to-be-merged pv-on-hvm drivers, maybe they can be properly unloaded. > > +static int pfn_is_ram(unsigned long pfn) > > +{ > > + int (*fn)(unsigned long); > > + /* pfn is ram unless fn() checks pagetype */ > > + int ret = 1; > > + > > + atomic_inc(&oldmem_fn_refcount); > > + smp_mb__after_atomic_inc(); > > + fn = oldmem_pfn_is_ram; > > + if (fn) > > + ret = fn(pfn); > > + if (atomic_dec_and_test(&oldmem_fn_refcount)) > > + wake_up(&oldmem_fn_waitq); > > + > > + return ret; > > +} > > This function would have been a suitable place at which to document the > entire feature. As it stands, anyone who is reading this code won't > have any clue why it exists. I will add a comment. > > +EXPORT_SYMBOL_GPL(register_oldmem_pfn_is_ram); > > +EXPORT_SYMBOL_GPL(unregister_oldmem_pfn_is_ram); > > Each export should be placed immediately after the function which is > being exported, please. Checkpatch reports this. Please use checkpatch. Will do. > > +++ linux-2.6.39-rc5/include/linux/crash_dump.h > > @@ -66,6 +66,11 @@ static inline void vmcore_unusable(void) > > if (is_kdump_kernel()) > > elfcorehdr_addr = ELFCORE_ADDR_ERR; > > } > > + > > +#define HAVE_OLDMEM_PFN_IS_RAM 1 > > What's this for? So that out-of-tree drivers dont fail to compile when they call that hook unconditionally. Perhaps they could use the kernel version. Olaf