On Tue, Jul 09, 2013 at 03:44:38PM +0100, David Vrabel wrote:
> On 09/07/13 15:13, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jul 09, 2013 at 02:38:53PM +0100, David Vrabel wrote:
> >> From: David Vrabel <david.vrabel@xxxxxxxxxx>
> >>
> >> If there are UNUSABLE regions in the machine memory map, dom0 will
> >> attempt to map them 1:1 which is not permitted by Xen and the kernel
> >> will crash.
> >>
> >> There isn't anything interesting in the UNUSABLE region that the dom0
> >> kernel needs access to so we can avoid making the 1:1 mapping and
> >> leave the region as RAM.
> >>
> >> Since the obtaining the memory map for dom0 and domU are now more
> >> different, refactor each into separate functions.
> >>
> >> This fixes a dom0 boot failure if tboot is used (because tboot was
> >> marking a memory region as UNUSABLE).
> >
> > Please also include the serial log that shows the crash.
>
> It's a domain crash by Xen and it isn't useful as none of the stack is
> decoded.

Could you include the E820 at least to get a sense of where and how
this looks? As in - without tboot and then with tboot?

>
> >> +static int __init xen_get_memory_map_dom0(struct e820entry *map,
> >> +                                          unsigned *nr_entries)
> >> +{
> >> +        struct xen_memory_map memmap;
> >> +        unsigned i;
> >> +        int ret;
> >> +
> >> +        /*
> >> +         * Dom0 requires access to machine addresses for BIOS data and
> >> +         * MMIO (e.g. PCI) devices.  The reset of the kernel expects
> >> +         * to be able to access these through a 1:1 p2m mapping.
> >> +         *
> >> +         * We need to take the pseudo physical memory map and set up
> >> +         * 1:1 mappings corresponding to the RESERVED regions and
> >> +         * holes in the /machine/ memory map, adding/expanding the RAM
> >> +         * region at the end of the map for the relocated RAM.
>
> This is the key paragraph.  The apparent use of the machine memory map
> for dom0 is a confusing fiction.

OK, but I don't follow when dom0 would be using the E820_UNUSABLE regions.
Is it the xen_do_chunk that is failing on said PFNs? Or is it in this
code in xen_set_identity_and_release_chunk:

        /*
         * If the PFNs are currently mapped, the VA mapping also needs
         * to be updated to be 1:1.
         */
        for (pfn = start_pfn; pfn <= max_pfn_mapped && pfn < end_pfn; pfn++)
                (void)HYPERVISOR_update_va_mapping(
                        (unsigned long)__va(pfn << PAGE_SHIFT),
                        mfn_pte(pfn, PAGE_KERNEL_IO), 0);

which updates the initial PTEs with the 1-1 PFN, and the E820_UNUSABLE
region is somehow in between two E820_RAM regions?

>
> >> +         *
> >> +         * This is more easily done if we just start with the machine
> >> +         * memory map.
> >> +         *
> >> +         * UNUSABLE regions are awkward, they are not interesting to
> >> +         * dom0 and Xen won't allow them to be mapped so we want to
> >> +         * leave these as RAM in the pseudo physical map.
> >
> > We just want them as such in the P2M but not do any PTE creation for it?
> > Why do we care about it? We are not creating any page tables for
> > E820_UNUSABLE regions.
>
> I don't follow what you're asking here.

What code maps said PFNs?

>
> In dom0, UNUSABLE regions in the machine memory map are RAM regions on
> the pseudo-physical memory map.  So instead of playing games and making
> these regions special in the pseudo-physical map we just leave them as RAM.

... and then exposing them to the kernel to be used as normal RAM?

>
> >> +         *
> >> +         * Again, this is easiest if we take the machine memory map
> >> +         * and change the UNUSABLE regions to RAM.
> >
> > Won't then Linux try to map them then? In 3.9 (or was it 3.8?) and further
> > the page table creation starts ignoring any E820 entries that are not RAM.
> > See init_range_memory_mapping and its comment:
>
> Yes.  They are just regular RAM in the pseudo-physical map.

With your change it is. But without your change it would not map it.

>
> > /*
> >  * We need to iterate through the E820 memory map and create direct mappings
> >  * for only E820_RAM and E820_KERN_RESERVED regions.
> >  * We cannot simply
> >  * create direct mappings for all pfns from [0 to max_low_pfn) and
> >  * [4GB to max_pfn) because of possible memory holes in high addresses
> >  * that cannot be marked as UC by fixed/variable range MTRRs.
> >  * Depending on the alignment of E820 ranges, this may possibly result
> >  * in using smaller size (i.e. 4K instead of 2M or 1G) page tables.
> >  *
> >
> >
> > So in effect you are now altering them.
>
> No.
>
> >> +         */
> >> +
> >> +        memmap.nr_entries = *nr_entries;
> >> +        set_xen_guest_handle(memmap.buffer, map);
> >> +
> >> +        ret = HYPERVISOR_memory_op(XENMEM_machine_memory_map, &memmap);
> >> +        if (ret < 0)
> >> +                return ret;
> >> +
> >> +        for (i = 0; i < memmap.nr_entries; i++) {
> >> +                if (map[i].type == E820_UNUSABLE)
> >
> > What if the E820_UNUSABLE regions were manufactured by the BIOS? Or
> > somebody booted Xen with mem=3G (in which we clip the memory) on a 16GB
> > box.
>
> The resulting memory map should be clipped by the result of the call to
> xen_get_max_pages().

OK. What about the BIOS manufacturing it?
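
For reference, the relabelling step under discussion can be sketched in isolation. The quoted hunk is cut off after the `E820_UNUSABLE` test, so the loop body below is an assumption based on the patch comment ("change the UNUSABLE regions to RAM"), not the actual patch. `struct e820_entry_sketch` and `relabel_unusable_as_ram()` are hypothetical stand-ins for the kernel's `struct e820entry` and the loop in `xen_get_memory_map_dom0()`, written as plain userspace C with no hypercall.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* E820 type values as used by the x86 memory map. */
#define E820_RAM       1
#define E820_RESERVED  2
#define E820_UNUSABLE  5

/* Hypothetical, simplified stand-in for the kernel's struct e820entry. */
struct e820_entry_sketch {
        uint64_t addr;  /* start of the region */
        uint64_t size;  /* length of the region in bytes */
        uint32_t type;  /* one of the E820_* values above */
};

/*
 * Assumed behaviour: walk the map and relabel every UNUSABLE entry as
 * RAM, leaving all other entries untouched.  Returns the number of
 * entries that were changed.
 */
static unsigned relabel_unusable_as_ram(struct e820_entry_sketch *map,
                                        unsigned nr_entries)
{
        unsigned i, changed = 0;

        assert(map != NULL || nr_entries == 0);

        for (i = 0; i < nr_entries; i++) {
                if (map[i].type == E820_UNUSABLE) {
                        map[i].type = E820_RAM;
                        changed++;
                }
        }
        return changed;
}
```

Note that the sketch does not distinguish between UNUSABLE entries created by tboot, by Xen clipping memory, or by the BIOS; every such entry is relabelled alike, which is the case the last question above is asking about.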