On Mon, Sep 8, 2014 at 7:57 PM, Luis R. Rodriguez <mcgrof@xxxxxxxxxxxxxxxx> wrote:
>> Why do we care about the priority of probing tasks? Does that
>> actually make any meaningful difference? If so, how?
>
> As I noted before -- I have yet to provide clear metrics but at least
> changing both init paths + probe from finit_module() to kthread
> certainly had a measurable time increase, I suspect using
> queue_work(system_unbound_wq, async_probe_work) will make probe
> slower. I'll get to these metrics this week.

The results are in and I'm glad to report my suspicion was incorrect:
queue_work(system_unbound_wq) is not slower than kthread, it actually
works slightly faster. Results will likely vary depending on the
subsystem, but in this particular case the cxgb4 driver was tested once
requiring firmware loading and once without requiring firmware loading,
and for both types of driver loading all mechanisms make probe take just
about the same amount of time. What was surprising was that when
firmware loading is required the time it takes to run probe does vary
between strategies, and quite considerably in terms of microseconds (up
to about 0.6 seconds per device). The discrepancies are by no means
terrible, but they should be considered if one is thinking of large
systems, and if we do wish to optimize things further and offer
equivalent behavior, especially when probing multiple devices with the
same driver.

The method used to collect the amount of time for probe was to use:

	ktime_t calltime, delta, rettime;
	unsigned long long duration;

	calltime = ktime_get();
	driver_attach();
	rettime = ktime_get();
	delta = ktime_sub(rettime, calltime);
	/* ns >> 10 approximates usecs (it divides by 1024) */
	duration = (unsigned long long) ktime_to_ns(delta) >> 10;

And then print that time in microseconds right after it finishes,
whether that be through the default synchronous kernel run or the async
runs. The collection and testing was then done by Santosh.
Details of the collections are at:

https://bugzilla.novell.com/show_bug.cgi?id=877622

The summary:

The driver actually probed 2 cards in the tests, so we don't have
results for 1 card. The kernel calls probe serially for each device, so
to get the amount of time for one run let's just divide the results by
2. Each strategy was run once with firmware loading required and once
with no firmware loading required. The results for both cards are:

=====================================================================|
strategy                         fw (usec)       no-fw (usec)        |
---------------------------------------------------------------------|
synchronous                      48945138        2615126             |
kthread                          50132831        2619737             |
queue_work(system_unbound_wq)    49827323        2615262             |
---------------------------------------------------------------------|

For one device then that comes out to:

=====================================================================|
strategy                         fw (usec)       no-fw (usec)        |
---------------------------------------------------------------------|
synchronous                      24472569        1307563             |
kthread                          25066415.5      1309868.5           |
queue_work(system_unbound_wq)    24913661.5      1307631             |
---------------------------------------------------------------------|

Converting that to seconds:

=====================================================================|
strategy                         fw (s)          no-fw (s)           |
---------------------------------------------------------------------|
synchronous                      24.47           1.31                |
kthread                          25.07           1.31                |
queue_work(system_unbound_wq)    24.91           1.31                |
---------------------------------------------------------------------|

Graph friendly versions of the results for probe of 1 device:

Probe with firmware:

http://drvbp1.linux-foundation.org/~mcgrof/images/probe-measurements/probe-cgxb4-firmware.png

Probe without firmware:

http://drvbp1.linux-foundation.org/~mcgrof/images/probe-measurements/probe-cgxb4-no-firmware.png

  Luis