Hi Ingo, On 01/24/2017 04:20 PM, Ingo Molnar wrote: > * Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote: > >> Hi Ingo, >> >> On 01/22/2017 05:04 PM, Ingo Molnar wrote: >>> * Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote: >>> >>>>>> +static void xdbc_runtime_delay(unsigned long count) >>>>>> +{ >>>>>> + udelay(count); >>>>>> +} >>>>>> +static void (*xdbc_delay)(unsigned long) = xdbc_early_delay; >>>>> Is this udelay() complication really necessary? udelay() should work fine even in >>>>> early code. It might not be precisely calibrated, but should be good enough. >>>> I tried udelay() in the early code. It's not precise enough for the >>>> hardware handshaking. >>> Possibly because on x86 early udelay() did not work at all - i.e. there's no delay >>> whatsoever. >> Yes. >> >>> Could you try it on top of this commit in tip:timers/core: >>> >>> 4c45c5167c95 x86/timer: Make delay() work during early bootup >>> >>> ? >> I tried tip:timers/core. It's not precise enough for my context either. >> >> __const_udelay(). >> >> 157 inline void __const_udelay(unsigned long xloops) >> 158 { >> 159 unsigned long lpj = this_cpu_read(cpu_info.loops_per_jiffy) ? : loops_per_jiffy; >> 160 int d0; >> 161 >> 162 xloops *= 4; >> 163 asm("mull %%edx" >> 164 :"=d" (xloops), "=&a" (d0) >> 165 :"1" (xloops), "0" (lpj * (HZ / 4))); >> 166 >> 167 __delay(++xloops); >> 168 } >> >> >> In my early code, loops_per_jiffy is not initialized yet. Hence "lpj" for the asm line >> is 4096 (default value). >> >> The cpu_info.loops_per_jiffy actually reads 8832000 after initialization. They are >> about 2000 times different. >> >> I did a hacky test in kernel to check the difference between these two different >> "lpj" values. (The hacky patch is attached.) Below is the output for 100ms delay. >> >> [ 2.494751] udelay_test uninitialized ---->start >> [ 2.494820] udelay_test uninitialized ---->end >> [ 2.494828] udelay_test initialized ---->start >> [ 2.595234] udelay_test initialized ---->end >> >> For 100ms delay, udelay() with uninitialized loops_per_jiffy only gives a delay of >> only 69us. > Ok, then could we add some simple calibration to make udelay work much better - or > perhaps move the udelay calibration up earlier? > > Hiding essentially an early udelay() implementation in an early-printk driver is > ugly and counterproductive. Sure. How about below change? diff --git a/drivers/usb/early/xhci-dbc.c b/drivers/usb/early/xhci-dbc.c index d3f0c84..940989e 100644 --- a/drivers/usb/early/xhci-dbc.c +++ b/drivers/usb/early/xhci-dbc.c @@ -587,6 +587,35 @@ static int xdbc_bulk_transfer(void *data, int size, bool read) return size; } +static void __init xdbc_udelay_calibration(void) +{ + unsigned long lpj = 0; + unsigned int tsc_khz, cpu_khz; + + if (!boot_cpu_has(X86_FEATURE_TSC)) + goto calibration_out; + + cpu_khz = x86_platform.calibrate_cpu(); + tsc_khz = x86_platform.calibrate_tsc(); + + if (tsc_khz == 0) + tsc_khz = cpu_khz; + else if (abs(cpu_khz - tsc_khz) * 10 > tsc_khz) + cpu_khz = tsc_khz; + + if (!tsc_khz) + goto calibration_out; + + lpj = tsc_khz * 1000; + do_div(lpj, HZ); + +calibration_out: + if (!lpj) + lpj = 1 << 22; + + loops_per_jiffy = lpj; +} + static int __init xdbc_early_setup(void) { int ret; @@ -686,6 +715,8 @@ int __init early_xdbc_parse_parameter(char *s) } xdbc.xdbc_reg = (struct xdbc_regs __iomem *)(xdbc.xhci_base + offset); + xdbc_udelay_calibration(); + return 0; } Best regards, Lu Baolu -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html