On Tue, 26 Sep 2017 14:10:52 +0000
"Marciniszyn, Mike" <mike.marciniszyn@xxxxxxxxx> wrote:

> > My first question would be: is disabling preemption here really
> > necessary?
>
> The motivation for the preempt manipulation has nothing to do with
> the counter.
>
> The buffer allocation returns a set of ringed write-combining 64B
> MMIO buffers. The allocate lock is then dropped right after
> allocation to allow parallel allocates.
>
> The hardware keeps track of the buffer fill across multiple CPUs, so
> that after the oldest buffer is copied the ring is advanced.
>
> The idea was to minimize the time from the drop of the allocate lock
> to the end of the copy, to ensure the highest rate of copy to the
> ring from multiple cores.
>
> > AFAICT, preemption is only disabled to protect the access of the
> > 'buffers_allocated' percpu counters, nothing else.
> >
> > However, because these counters are only observed in aggregate,
> > across CPUs (see get_buffers_allocated()), the sum will be the
> > same regardless of whether a thread is migrated between a
> > this_cpu_inc(buffers_allocated) and a
> > this_cpu_dec(buffers_allocated).
>
> The counter exists purely as a hot-path optimization to avoid an
> atomic.
>
> The only time an accurate count is required is during reset
> sequences, to wait for allocate/copy pairs to finish, and we don't
> care whether the closing decrement lands on the same core.
>
> The percpu refcount might be a better choice, but I don't think it
> existed at the time the code was written.
>
> > If that's the case, perhaps the fix is as easy as removing the
> > preempt_disable() and preempt_enable()?
>
> That would certainly be the easiest solution, but it would
> compromise performance if a migration occurs after the allocate lock
> is dropped.

local_lock() disables migration. It currently exists only in the -rt
patch. It is designed to allow preemption to occur as much as
possible, since -rt cares about reaction time more than raw
performance (although we would love to keep the performance as well).

> We have been looking at the patch that was submitted for using the
> raw spin lock, and we share Arnaldo's concern, particularly about
> mixing lock types.
>
> Other locks can be acquired in the timeout case in
> sc_wait_for_packet_egress(), inside queue_work() and
> dd_dev_err()/printk().

A raw_spin_lock sacrifices reaction time, which goes against the goal
of -rt. When PREEMPT_RT is disabled, the local_lock becomes a
preempt_disable(), so it has no effect on the vanilla kernel.

-- Steve
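
For reference, a minimal sketch of the percpu counter pattern under
discussion, assuming a statically defined counter and illustrative
helper names (the real hfi1 code hangs the counter off the send
context and allocates it dynamically):

    /*
     * Sketch only, not the actual hfi1 code: a per-CPU counter
     * bumped around the allocate/copy window and only ever read
     * as a sum across all CPUs.
     */
    #include <linux/percpu.h>
    #include <linux/cpumask.h>

    static DEFINE_PER_CPU(int, buffers_allocated);

    static void sc_buffer_get(void)		/* hypothetical helper */
    {
    	/* The counter itself needs no preempt_disable(): */
    	this_cpu_inc(buffers_allocated);
    }

    static void sc_buffer_put(void)		/* hypothetical helper */
    {
    	/* A migration between inc and dec leaves the sum unchanged. */
    	this_cpu_dec(buffers_allocated);
    }

    /* Reset path: only the aggregate matters. */
    static int get_buffers_allocated(void)
    {
    	int cpu, count = 0;

    	for_each_possible_cpu(cpu)
    		count += per_cpu(buffers_allocated, cpu);
    	return count;
    }

Because the reset path only looks at the sum, an increment on one CPU
paired with a decrement on another still cancels out, which is the
point above about the counters being observed only in aggregate.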
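
And a rough sketch of what the local_lock() variant could look like,
using the out-of-tree -rt locallock API of that era
(linux/locallock.h); the lock name and placement are illustrative,
not the actual hfi1 patch:

    #include <linux/locallock.h>	/* -rt patch only */

    static DEFINE_LOCAL_IRQ_LOCK(pio_buffer_lock);

    static void sc_fill_buffer(void)	/* hypothetical helper */
    {
    	/*
    	 * On PREEMPT_RT this takes a per-CPU lock: the section stays
    	 * preemptible, but the task cannot migrate off this CPU.
    	 * On !PREEMPT_RT it compiles down to preempt_disable(), so
    	 * the vanilla kernel behaves exactly as it does today.
    	 */
    	local_lock(pio_buffer_lock);

    	/* ... copy into the write-combining MMIO buffer ... */

    	local_unlock(pio_buffer_lock);
    }

That keeps the no-migration guarantee the driver wants between the
drop of the allocate lock and the end of the copy, without disabling
preemption on -rt.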