在 2019/6/24 0:38, Thomas Gleixner 写道: > Zhiqiang, >> controlled by sysadmins to copy with hardware changes over time. > > So much for the theory. See below. Thanks for your reply. > >> Correspondingly, the MAX_SOFTIRQ_TIME should be able to be tunned by sysadmins, >> who knows best about hardware performance, for excepted tradeoff between latence >> and fairness. >> >> Here, we add sysctl variable max_softirq_time_usecs to replace MAX_SOFTIRQ_TIME >> with 2ms default value. > > ... > >> */ >> -#define MAX_SOFTIRQ_TIME msecs_to_jiffies(2) >> +unsigned int __read_mostly max_softirq_time_usecs = 2000; >> #define MAX_SOFTIRQ_RESTART 10 >> >> #ifdef CONFIG_TRACE_IRQFLAGS >> @@ -248,7 +249,8 @@ static inline void lockdep_softirq_end(bool in_hardirq) { } >> >> asmlinkage __visible void __softirq_entry __do_softirq(void) >> { >> - unsigned long end = jiffies + MAX_SOFTIRQ_TIME; >> + unsigned long end = jiffies + >> + usecs_to_jiffies(max_softirq_time_usecs); > > That's still jiffies based and therefore depends on CONFIG_HZ. Any budget > value will be rounded up to the next jiffie. So in case of HZ=100 and > time=1000us this will still result in 10ms of allowed loop time. > > I'm not saying that we must use a more fine grained time source, but both > the changelog and the sysctl documentation are misleading. > > If we keep it jiffies based, then microseconds do not make any sense. They > just give a false sense of controlability. > > Keep also in mind that with jiffies the accuracy depends also on the > distance to the next tick when 'end' is evaluated. The next tick might be > imminent. > > That's all information which needs to be in the documentation. > Thanks again for your detailed advice. As your said, the max_softirq_time_usecs setting without explaining the relationship with CONFIG_HZ will give a false sense of controlability. And the time accuracy of jiffies will result in a certain difference between the max_softirq_time_usecs set value and the actual value, which is in one jiffies range. I will add these infomation in the sysctl documentation and changelog in v2 patch. >> + { >> + .procname = "max_softirq_time_usecs", >> + .data = &max_softirq_time_usecs, >> + .maxlen = sizeof(unsigned int), >> + .mode = 0644, >> + .proc_handler = proc_dointvec_minmax, >> + .extra1 = &zero, >> + }, > > Zero as the lower limit? That means it allows a single loop. Fine, but > needs to be documented as well. > > Thanks, > > tglx > > . >