On Fri, 20 Jul 2018 16:33:36 +0100 Dmitry Safonov <dima@xxxxxxxxxx> wrote: > On Fri, 2018-07-20 at 17:09 +0200, Petr Mladek wrote: > > > > On Tue, 2018-07-03 at 23:56 +0100, Dmitry Safonov wrote: > > > > > Currently ratelimit_state is protected with spin_lock. If the > > > > > .lock > > > > > is > > > > > taken at the moment of ___ratelimit() call, the message is > > > > > suppressed > > > > > to > > > > > make ratelimiting robust. > > > > > > > > > > That results in the following issue issue: > > > > > CPU0 CPU1 > > > > > ------------------ ------------------ > > > > > printk_ratelimit() printk_ratelimit() > > > > > | | > > > > > try_spin_lock() try_spin_lock() > > > > > | | > > > > > time_is_before_jiffies() return 0; // suppress > > > > > > > > > > So, concurrent call of ___ratelimit() results in silently > > > > > suppressing > > > > > one of the messages, regardless if the limit is reached or not. > > > > > And rc->missed is not increased in such case so the issue is > > > > > covered > > > > > from user. > > > > > > > > > > Convert ratelimiting to use atomic counters and drop spin_lock. > > > > > > > > > > Note: That might be unexpected, but with the first interval of > > > > > messages > > > > > storm one can print up to burst*2 messages. So, it doesn't > > > > > guarantee > > > > > that in *any* interval amount of messages is lesser than burst. > > > > > But, that differs lightly from previous behavior where one can > > > > > start > > > > > burst=5 interval and print 4 messages on the last milliseconds > > > > > of > > > > > interval + new 5 messages from new interval (totally 9 messages > > > > > in > > > > > lesser than interval value): > > > > > > > > > > msg0 msg1-msg4 msg0-msg4 > > > > > | | | > > > > > | | | > > > > > |--o---------------------o-|-----o--------------------|--> (t) > > > > > <-------> > > > > > Lesser than burst > > > > > > > > > I am a bit confused. Does this patch fix two problems? > > > > + Lost rc->missed update when try_spin_lock() fails > > + printing up to burst*2 messages in a give interval > > What I tried to solve here (maybe I could better point it in the commit > message): is the situation where ratelimit::burst haven't been hit yet > and there are calls for __ratelimit() from different CPUs. > So, neither of the messages should be suppressed, but as the spinlock > is taken already - one of them is dropped and even without updating > missed counter. > > > It would make the review much easier if you split it into more > > patches and fix the problems separately. > > Hmm, not really sure: the patch changes spinlock to atomics and I'm not > sure it make much sense to add atomics before removing the spinlock.. > > > Otherwise, it looks promissing... > > > > Best Regards, > > Petr > > > > PS: I have vacation the following two weeks. Anyway, please, CC me > > if you send any new version. > > Surely, thanks for your attention to the patch (and time). > I'm just catching up from my vacation. What about making rs->missed into an atomic, and have: if (!raw_spin_trylock_irqsave(&rs->lock, flags)) { atomic_inc(&rs->missed); return 0; } ? You would also need to do: if (time_is_before_jiffies(rs->begin + rs->interval)) { int missed = atomic_xchg(&rs->missed, 0); if (missed) { So that you don't have a race between checking rs->missed and setting it to zero. -- Steve _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx