On 03/22/2010 03:53 PM, Thomas Gleixner wrote:
> On Mon, 22 Mar 2010, Yinghai Lu wrote:
>> On 03/22/2010 03:09 PM, Thomas Gleixner wrote:
>>> On Mon, 22 Mar 2010, Yinghai Lu wrote:
>>>> On 03/22/2010 12:37 PM, Ingo Molnar wrote:
>>
>>>> 1. need to keep e820
>>>
>>> That's neither an argument for using lmb nor an argument not to use
>>> lmb. e820 is x86 specific BIOS wreckage and its whole purpose is
>>> just to feed information into a (hopefully) generic early resource
>>> management facility.
>>>
>>> e820 _CANNOT_ be generalized. Period.
>
> I still want to know, what "need to keep e820" means for you.

Keep most of arch/x86/kernel/e820.c, and later, when
finish_e820_parsing() is called, fill lmb.memory from the e820 entries
of type E820_RAM.

>
>>>> 2. use e820 range with RAM to fill lmb.memory when finizing_e820
>>>
>>> What's finizing_e820 ???
>> finish_e820_parsing();
>
> Yinghai, come on. Are you really expecting that everyone involved in
> this discussion goes to look up what the heck finish_e820_parsing()
> is doing ?
>
> You want to explain why your solution is better or why lmb is not
> sufficient, so you better go and explain what finish_e820_parsing()
> is, why finish_e820_parsing() is important and why lmb cannot cope
> with it.

Current x86:
a. set up the e820 array.
b. the early_param code for mem= and memmap= adjusts the e820 array.

So we don't need to call lmb_enforce_memory_limit().

>
>>>> 3. use lmb.reserved to replace early_res.
>>>
>>> What's the implication of doing that ?
>>
>> early_res array is only corresponding to lmb.reserved, aka reserved
>> region from kernel.
>
> Is it only corresponding (somehow) or is it a full equivalent ?

early_res is not sorted and not merged.

>
>>>> current lmb is merging the region, we can not use name tag any more.
>>>
>>> What's wrong with merging of regions ? Are you arguing about a
>>> specific region ("the region") ?
>
> Care to answer my question ?

If ranges get merged, you can no longer attach a name to them.

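To illustrate the point about name tags, here is a minimal userspace sketch (not kernel code) of what merging does to them. struct tagged_range and merge_ranges() are hypothetical names, only modelled on the early_res layout quoted in this thread; the merge rule is an assumption for the sake of the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model of a tagged reservation, shaped like early_res. */
struct tagged_range {
	uint64_t start, end;	/* covers [start, end) */
	char name[15];
};

/*
 * Merge b into a when the two ranges touch or overlap.
 * Returns 1 if merged, 0 otherwise.
 *
 * After a merge only a's name is left, so a later bogus free inside the
 * part that came from b can no longer be attributed to b's owner.
 */
static int merge_ranges(struct tagged_range *a, const struct tagged_range *b)
{
	if (b->start > a->end || a->start > b->end)
		return 0;	/* disjoint, nothing to merge */
	if (b->start < a->start)
		a->start = b->start;
	if (b->end > a->end)
		a->end = b->end;
	return 1;
}
```

Once two adjacent reservations are folded into one entry, there is a single name slot for the whole span, which is why a merging allocator would have to either drop the tags or stop merging tagged entries.
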
>
>>>
>>> Which name tag ? And why is that name tag important ?
>>
>> struct early_res {
>> 	u64 start, end;
>> 	char name[15];
>> 	char overlap_ok;
>> };
>
> I'm starting to get annoyed, really. What is that name field for and
> why is that "name" field important ?

At least later, when some code frees a wrong range, we can figure out
who caused the problem.

>
>>>
>>>> may need to dump early_memtest, and use early_res for bootmem at
>>>> first.
>>>
>>> Why exactly might early_memtest no longer be possible ?
>>
>> early_memtest need to call find_e820_area_size
>> current lmb doesn't have that kind of find util.
>> the one memory subtract reserved memory by kernel.
>
> What subtracts what ? And why is it that hard to fix that ?

lmb.memory - lmb.reserved
or
e820 E820_RAM entries - early_res

Move some code from early_res to lmb.c?

>
>>>
>>> What means "early_res for bootmem" ?
>>
>> use early_res to replace bootmem, the CONFIG_NO_BOOTMEM.
>> that need early_res can be double or increase the slots automatically.
>
> -ENOPARSE
>
> Yinghai, I asked you to take your time and explain things in detail
> instead of shooting unparseable answers within a minute.
>
> Everyone else in this discussion tries to be as explanatory as
> possible, just you expect that everyone else is going to dig out the
> crystal ball to understand the deeper meanings of your patches.
>
> Again, please take your time to explain what needs to be done or what
> is impossible to solve in your opinion, so we can get that resolved in
> a way which is satisfactory and useful for all parties involved.

To make x86 use lmb, we need to extend lmb to have find_early_area:

/* Return the index of the first reserved region overlapping [start, end) */
static int __init find_overlapped_early(u64 start, u64 end)
{
	int i;
	struct lmb_property *r;

	for (i = 0; i < lmb.reserved.cnt && lmb.reserved.region[i].size; i++) {
		r = &lmb.reserved.region[i];
		if (end > r->base && start < (r->base + r->size))
			break;
	}

	return i;
}

/* Check for already reserved areas */
static inline int __init bad_addr(u64 *addrp, u64 size, u64 align)
{
	int i;
	u64 addr = *addrp;
	int changed = 0;
	struct lmb_property *r;
again:
	i = find_overlapped_early(addr, addr + size);
	r = &lmb.reserved.region[i];
	if (i < lmb.reserved.cnt && r->size) {
		/* skip past the reserved region and try again */
		*addrp = addr = round_up(r->base + r->size, align);
		changed = 1;
		goto again;
	}
	return changed;
}

u64 __init find_early_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
			 u64 size, u64 align)
{
	u64 addr, last;

	addr = round_up(ei_start, align);
	if (addr < start)
		addr = round_up(start, align);
	if (addr >= ei_last)
		goto out;
	while (bad_addr(&addr, size, align) && addr+size <= ei_last)
		;
	last = addr + size;
	if (last > ei_last)
		goto out;
	if (last > end)
		goto out;

	return addr;

out:
	return -1ULL;
}

find_early_area_size()...

And using them we can have find_lmb_area:

/*
 * Find a free area with specified alignment in a specific range.
 */
u64 __init find_lmb_area(u64 start, u64 end, u64 size, u64 align)
{
	int i;

	for (i = 0; i < lmb.memory.cnt; i++) {
		u64 ei_start = lmb.memory.region[i].base;
		u64 ei_last = ei_start + lmb.memory.region[i].size;
		u64 addr;

		addr = find_early_area(ei_start, ei_last, start, end,
					 size, align);

		if (addr != -1ULL)
			return addr;
	}
	return -1ULL;
}

Also, later we can use it with active_range for the bootmem replacement:

u64 __init find_memory_core_early(int nid, u64 size, u64 align,
					u64 goal, u64 limit)
{
	int i;

	/* need to go over early_node_map to find out good range for node */
	for_each_active_range_index_in_nid(i, nid) {
		u64 addr;
		u64 ei_start, ei_last;

		ei_last = early_node_map[i].end_pfn;
		ei_last <<= PAGE_SHIFT;
		ei_start = early_node_map[i].start_pfn;
		ei_start <<= PAGE_SHIFT;
		addr = find_early_area(ei_start, ei_last,
					 goal, limit, size, align);

		if (addr == -1ULL)
			continue;

		return addr;
	}

	return -1ULL;
}

Yinghai
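
For anyone who wants to try the search logic outside the kernel, below is a minimal userspace sketch of the find_early_area() idea: walk one memory range and skip over reserved entries until an aligned, free candidate fits. It is not the proposed kernel code. struct region, round_up_u64(), find_free_area() and NO_AREA are hypothetical names, the reserved table is passed in explicitly instead of coming from lmb.reserved, and the sketch assumes align is a power of two and that addr + size does not overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_AREA UINT64_MAX	/* stands in for the -1ULL failure value */

/* Hypothetical stand-in for one reserved entry (base + size). */
struct region {
	uint64_t base, size;
};

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Find the lowest aligned address in the memory range [ei_start, ei_last),
 * not below start and with addr + size <= end, such that
 * [addr, addr + size) overlaps no entry of the reserved table.
 */
static uint64_t find_free_area(const struct region *res, size_t cnt,
			       uint64_t ei_start, uint64_t ei_last,
			       uint64_t start, uint64_t end,
			       uint64_t size, uint64_t align)
{
	uint64_t addr = round_up_u64(ei_start > start ? ei_start : start, align);
	size_t i;

	while (addr < ei_last && addr + size <= ei_last && addr + size <= end) {
		/* look for a reserved entry that overlaps the candidate */
		for (i = 0; i < cnt; i++)
			if (addr + size > res[i].base &&
			    addr < res[i].base + res[i].size)
				break;
		if (i == cnt)
			return addr;	/* no overlap: candidate is free */
		/* skip past the entry that is in the way and retry */
		addr = round_up_u64(res[i].base + res[i].size, align);
	}
	return NO_AREA;
}
```

With reservations at [0x1000, 0x2000) and [0x3000, 0x3800) inside a memory range [0x0, 0x10000), a 0x2000-byte, 0x1000-aligned request starting at 0x1000 skips both reservations and lands at 0x4000. Looping this over every memory range gives the "memory minus reserved" search that find_lmb_area() above does.
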