Re: INFO: rcu detected stall in ext4_write_checks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jul 15, 2019 at 06:01:01AM -0700, Paul E. McKenney wrote:
> Title: Making SCHED_DEADLINE safe for kernel kthreads
> 
> Abstract:
> 
> Dmitry Vyukov's testing work identified some (ab)uses of sched_setattr()
> that can result in SCHED_DEADLINE tasks starving RCU's kthreads for
> extended time periods, not millisecond, not seconds, not minutes, not even
> hours, but days. Given that RCU CPU stall warnings are issued whenever
> an RCU grace period fails to complete within a few tens of seconds,
> the system did not suffer silently. Although one could argue that people
> should avoid abusing sched_setattr(), people are human and humans make
> mistakes. Responding to simple mistakes with RCU CPU stall warnings is
> all well and good, but a more severe case could OOM the system, which
> is a particularly unhelpful error message.
> 
> It would be better if the system were capable of operating reasonably
> despite such abuse. Several approaches have been suggested.
> 
> First, sched_setattr() could recognize parameter settings that put
> kthreads at risk and refuse to honor those settings. This approach
> of course requires that we identify precisely what combinations of
> sched_setattr() parameters settings are risky, especially given that there
> are likely to be parameter settings that are both risky and highly useful.

So we (the people poking at the DEADLINE code) are all aware of this,
and on the TODO list for making DEADLINE available for !priv users is
the item:

  - put limits on deadline/period

And note that that is both an upper and lower limit. The upper limit
you've just found why we need it, the lower limit is required because
you can DoS the hardware by causing deadlines/periods that are equal (or
shorter) than the time it takes to program the hardware.

There might have even been some patches that do some of this, but I've
held off because we have bigger problems and they would've established
an ABI while it wasn't clear it was sufficient or the right form.





[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux