On Wed, Sep 12, 2012 at 10:45:22AM +0300, Avi Kivity wrote:
> On 09/12/2012 04:03 AM, Paul E. McKenney wrote:
> >> > > Paul, I'd like to check something with you here:
> >> > > this function can be triggered by userspace,
> >> > > any number of times; we allocate
> >> > > a 2K chunk of memory that is later freed by
> >> > > kfree_rcu.
> >> > >
> >> > > Is there a risk of DOS if RCU is delayed while
> >> > > lots of memory is queued up in this way?
> >> > > If yes is this a generic problem with kfree_rcu
> >> > > that should be addressed in core kernel?
> >> >
> >> > There is indeed a risk.
> >>
> >> In our case it's a 2K object. Is it a practical risk?
> >
> > How many kfree_rcu()s per second can a given user cause to happen?
>
> Not much more than a few hundred thousand per second per process (normal
> operation is zero).
>
I managed to do 21466 per second.

> >
> >> > The kfree_rcu() implementation cannot really
> >> > decide what to do here, especially given that it is callable with irqs
> >> > disabled.
> >> >
> >> > The usual approach is to keep a per-CPU counter and count it down from
> >> > some number for each kfree_rcu(). When it reaches zero, invoke
> >> > synchronize_rcu() as well as kfree_rcu(), and then reset it to the
> >> > "some number" mentioned above.
> >>
> >> It is a bit of a concern for me that this will hurt worst-case latency
> >> for realtime guests. In our case, we return error and this will
> >> fall back on not allocating memory and using slow all-CPU scan.
> >> One possible scheme that relies on this is:
> >> - increment an atomic counter, per vcpu. If above threshold ->
> >>   return with error
> >> - call_rcu (+ barrier vcpu destruct)
> >> - within callback decrement an atomic counter
> >
> > That certainly is a possibility, but...
> >
> >> > In theory, I could create an API that did this. In practice, I have no
> >> > idea how to choose the number -- much depends on the size of the object
> >> > being freed, for example.
> >>
> >> We could pass an object size, no problem :)
> >
> > ... before putting too much additional effort into possible solutions,
> > why not force the problem to occur and see what actually happens? We
> > would then be in a much better position to work out what should be done.
>
> Good idea. Michael, it should be easy to modify kvm-unit-tests to write
> to the APIC ID register in a loop.
>
I did. Memory consumption does not grow on an otherwise idle host.

--
			Gleb.
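
For reference, the per-vcpu counter scheme quoted above could look roughly
like the userspace sketch below. This is only an illustration under stated
assumptions, not kernel code: struct vcpu_budget, budget_reserve(),
budget_release() and MAX_PENDING_FREES are hypothetical names, the limit of
64 is arbitrary (the thread does not settle on a number), and C11 atomics
stand in for the kernel's atomic_t. The call_rcu() step and the barrier at
vcpu destruction are not modelled.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical cap on deferred frees in flight per vcpu. The thread does
 * not say how to choose it; 64 is a placeholder. */
#define MAX_PENDING_FREES 64

struct vcpu_budget {
	atomic_int pending;	/* deferred frees queued but not yet run */
};

/*
 * Reserve one slot before queueing a deferred free.
 * Returns 0 on success, or -EBUSY when the limit is reached and the caller
 * should fall back to the slow path instead of allocating.
 */
static int budget_reserve(struct vcpu_budget *b)
{
	if (atomic_fetch_add(&b->pending, 1) >= MAX_PENDING_FREES) {
		atomic_fetch_sub(&b->pending, 1);	/* over the limit, undo */
		return -EBUSY;
	}
	return 0;
}

/* Called from the (simulated) RCU callback once the object has been freed. */
static void budget_release(struct vcpu_budget *b)
{
	atomic_fetch_sub(&b->pending, 1);
}
```

With this shape the amount of memory waiting on a grace period is bounded by
MAX_PENDING_FREES times the object size per vcpu, and the update path never
blocks, which is the property wanted for realtime guests.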