On 10/12/21 01:38, David Mosberger-Tang wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe > > I took linux-wireless off the CC list... > > On Thu, 2021-12-09 at 12:28 -0700, David Mosberger-Tang wrote: >> On Thu, 2021-12-09 at 12:12 -0700, David Mosberger-Tang wrote: >>> On Thu, 2021-12-09 at 11:15 -0700, David Mosberger-Tang wrote: >>> >>> As far as I can see, chip_wakeup() can trigger this legitimately: >>> >>> do { >>> h->hif_read_reg(wilc, WILC_SPI_WAKEUP_REG, ®); >>> h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, >>> reg | WILC_SPI_WAKEUP_BIT); >>> h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, >>> reg & ~WILC_SPI_WAKEUP_BIT); >>> >>> do { >>> usleep_range(2000, 2500); >>> wilc_get_chipid(wilc, true); >>> } while (wilc_get_chipid(wilc, true) == 0); >>> } while (wilc_get_chipid(wilc, true) == 0); >>> >>> Actually, the above looks rather gross. wilc_get_chip() reads both WILC_CHIPID >>> and WILC_RF_REVISION_ID and we do this at least three times in a row on each >>> chip_wakeup(). Wow. Is this really necessary? >> Looks like I'm on to something. If I replace the gross code with: >> >> h->hif_read_reg(wilc, WILC_SPI_WAKEUP_REG, ®); >> h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, reg | WILC_SPI_WAKEUP_BIT); >> h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, reg & ~WILC_SPI_WAKEUP_BIT); >> >> while (1) { >> if (wilc->hif_func->hif_read_reg(wilc, WILC_CHIPID, &chipid) == 0) >> break; >> usleep_range(2000, 2500); >> } >> >> Look what happens to TX throughput: >> >> [ 4] 0.00-120.00 sec 142 MBytes 9.92 Mbits/sec 6 sender >> >> This is with power_save mode turned on. Almost exactly 3 times faster than the >> 3.33 Mbits/sec I was seeing before. > Over 120 seconds of iperf3, a histogram of the number of WILC_CHIPID reads > required to get a successful read looked like this: > > [ 493.918785] hist[ 0]=70417 > [ 493.921609] hist[ 1]=0 > [ 493.923997] hist[ 2]=0 > [ 493.926424] hist[ 3]=0 > [ 493.928957] hist[ 4]=0 > [ 493.931359] hist[ 5]=0 > [ 493.933781] hist[ 6]=0 > [ 493.936210] hist[ 7]=0 > [ 493.938702] hist[ 8]=0 > [ 493.941097] hist[ 9]=0 > [ 493.943523] hist[ 10]=0 > [ 493.945951] hist[ 11]=0 > [ 493.948418] hist[ 12]=0 > [ 493.950812] hist[ 13]=0 > [ 493.953238] hist[ 14]=0 > [ 493.955667] hist[ 15]=0 > [ 493.958126] hist[ 16]=0 > [ 493.960527] hist[ 17]=0 > [ 493.962954] hist[ 18]=1 > [ 493.965382] hist[ 19]=7 > [ 493.967840] hist[ 20]=1 > [ 493.970242] hist[ 21]=13 > > Based on this, I changed the above code to: > > while (1) { > h->hif_read_reg(wilc, WILC_SPI_WAKEUP_REG, ®); > h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, > reg | WILC_SPI_WAKEUP_BIT); > h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, > reg & ~WILC_SPI_WAKEUP_BIT); > > if (wilc->hif_func->hif_read_reg(wilc, WILC_CHIPID, &chipid) == 0) > break; > usleep_range(2000, 2500); > } > > This gets me: > > TX/PSM off: [ 4] 0.00-120.00 sec 198 MBytes 13.8 Mbits/sec 9 sender > TX/PSM on : [ 4] 0.00-120.00 sec 140 MBytes 9.78 Mbits/sec 9 sender > RX/PSM off: [ 5] 0.00-120.10 sec 304 MBytes 21.2 Mbits/sec receiver > RX/PSM on : [ 5] 0.00-120.11 sec 288 MBytes 20.1 Mbits/sec receive > > Ajay, I'm guessing your kernel is either configured for tickless timer or > HZ=1000? That would explain why you'd see higher throughput with the original > code. My system is configured for HZ=100, so the usleep_range() would typically > take 10ms. I think the issue may not be related to the HZ configuration because the "usleep_range" API may have a delay of the same magnitude.Btw I too have CONFIG_HZ=100 configured for my setup. I overlooked that the throughput number for ttcp was in Bytes/sec whereas iPerf specified in (bits/sec) so it looks like the numbers are in same range. > > Any objections to the latest code? The above modification are not required because the API is already changed but it would be good to have the other patch which uses "power_save_mode" flag to avoid calling of chip-sleep code. Regards, Ajay