On Tue, 16 Jun 2009, Ingo Molnar wrote:
> * Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Mon, 2009-06-15 at 20:52 +0200, Ingo Molnar wrote:
> > > * Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > > On Mon, 2009-06-15 at 20:42 +0200, Ingo Molnar wrote:
> > > > > * Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > > > > On Mon, 2009-06-15 at 20:25 +0200, Ingo Molnar wrote:
> > > > > > > * Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > > but ... look at the APIs i propose above. We dont need _any_
> > > > > > > 'types'.
> > > > > > >
> > > > > > > That type enumeration is basically an open-coded allocator. If we do
> > > > > > > a _real_ allocator (a balanced stack of atomic kmaps) we dont need
> > > > > > > any of those indices, and all the potential for mismatch goes away
> > > > > > > as well - a stack nests trivially with IRQ and NMI and arbitrary
> > > > > > > other contexts.
> > > > > >
> > > > > > You want types because:
> > > > > > - they encode the intent, and can be verified
> > > > > > - they help keep track of the max nesting depth
> > > > > >
> > > > > > In the proposed implementation all type code basically falls away
> > > > > > no ! CONFIG_DEBUG_VM, but is kept around for robustness.
> > > > >
> > > > > But much of the fragility of the types (and their clumsiness - for
> > > > > example in highpte ops we have to know at which level of the
> > > > > pagetables we are, and use the right kind of index) is _precisely_
> > > > > because we have the types ...
> > > >
> > > > How will you manage the max depth?
> > >
> > > if (++depth == MAX_DEPTH) {
> > > 	print_all_entries_and_nasty_warning();
> > > 	/* hope we'll live long enough for the syslog to touch disk */
> > > 	depth = 0;
> > > }
> >
> > That will only trigger if we hit it, which will be _very_ rare.
> >
> > > unbalanced kmap is a bad bug - the easier we make it to catch,
> > > the better. The system wouldnt survive anyway.
> >
> > My proposed patch validates strict balance of types. But I can
> > easily add the above as well.
> >
> > By removing the types it becomes very difficult to verify the max
> > depth. I really don't like removing them.
>
> The fact that it implies an atomic section pretty much limits its
> depth in practice, doesnt it?
>
> All we need to track in the debug code is
> max-{syscall,softirq,hardirq,nmi}. The sum of these 4 counts must be
> smaller than the max - even if (as you are right to point out) we
> dont hit that magic combo that truly maximizes the depth.
>
> And note that in practice many of the current types are exclusive to
> each other - so using the stack would _reduce_ the amount of
> kmap-atomic space we need.

I'll briefly resurface into the discussion before submerging again ;)

I like very much the direction you're taking this, Ingo.

Yes, that is how I've sometimes thought we should go - though when
making the kmap_push/kmap_pop suggestion to Peter yesterday, I wasn't
expecting him to make that revolution, just provide a way to save a
current KM_type mapping and restore it later, so he can safely use the
standard primitives like pte_offset_map() within. I wasn't expecting
in_nmi() and in_irq() tests still to be there, even if only when debug.

I can understand Peter's lockdep background wanting to retain the
checking and KM_types, but if we're actually going to overhaul this
area, I'd love just to get rid of them.

Yes, that should reduce the amount of kmap_atomic space needed;
though I've not thought how we keep track of the maximum needed
as the kernel goes on developing.

There might be a very few places where we expect to
kmap_atomic A, kmap_atomic B, kunmap_atomic A, kunmap_atomic B?

Something else to throw in: what if they were not just atomic,
but also replaced the current sleeping kmaps? i.e. a task context
carries around its own stack of these.
I've always rejected that as introducing a pretty terrible overhead
just where we don't want it; but maybe you're ingenious enough to
devise ways of amortizing that cost. It would be nice to delete
mm/highmem.c if we could.

Ah, but there are probably places where one task passes a kmap
address to another?

Hugh
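
As a rough illustration of the balanced-stack idea discussed above, here
is a minimal userspace sketch in C. It is not kernel code and not Peter's
patch: kmap_stack, KMAP_STACK_MAX, kmap_stack_push() and kmap_stack_pop()
are hypothetical names invented for this example. It assumes strict LIFO
release, records a high-water mark so the maximum depth can be observed,
and refuses a push at the limit rather than resetting the depth.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bound for this sketch, not a real kernel constant. */
#define KMAP_STACK_MAX 8

/* Hypothetical per-context stack of mapping slots (no KM_type indices). */
struct kmap_stack {
	const void *slot[KMAP_STACK_MAX]; /* page held in each slot */
	size_t depth;                     /* slots currently in use */
	size_t max_seen;                  /* high-water mark for debugging */
};

/*
 * Take the next free slot for 'page'.
 * Returns the slot index, or -1 if the stack is already full
 * (a real implementation would print a loud warning here).
 */
static int kmap_stack_push(struct kmap_stack *s, const void *page)
{
	assert(s->depth <= KMAP_STACK_MAX); /* internal invariant */

	if (s->depth == KMAP_STACK_MAX)
		return -1;

	s->slot[s->depth] = page;
	s->depth++;
	if (s->depth > s->max_seen)
		s->max_seen = s->depth;

	return (int)(s->depth - 1);
}

/*
 * Release 'page', which must be the most recently pushed mapping.
 * Returns 0 on success, or -1 if the stack is empty or the release
 * is out of order (an unbalanced caller).
 */
static int kmap_stack_pop(struct kmap_stack *s, const void *page)
{
	if (s->depth == 0 || s->slot[s->depth - 1] != page)
		return -1;

	s->depth--;
	s->slot[s->depth] = NULL;
	return 0;
}
```

With two pages A and B, the sequence push A, push B, pop B, pop A
succeeds. The map A, map B, unmap A, unmap B ordering mentioned above
fails at the first pop, so any such caller would need converting before
a strict stack could replace the types.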