I took linux-wireless off the CC list... On Thu, 2021-12-09 at 12:28 -0700, David Mosberger-Tang wrote: > On Thu, 2021-12-09 at 12:12 -0700, David Mosberger-Tang wrote: > > On Thu, 2021-12-09 at 11:15 -0700, David Mosberger-Tang wrote: > > > > As far as I can see, chip_wakeup() can trigger this legitimately: > > > > do { > > h->hif_read_reg(wilc, WILC_SPI_WAKEUP_REG, ®); > > h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, > > reg | WILC_SPI_WAKEUP_BIT); > > h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, > > reg & ~WILC_SPI_WAKEUP_BIT); > > > > do { > > usleep_range(2000, 2500); > > wilc_get_chipid(wilc, true); > > } while (wilc_get_chipid(wilc, true) == 0); > > } while (wilc_get_chipid(wilc, true) == 0); > > > > Actually, the above looks rather gross. wilc_get_chip() reads both WILC_CHIPID > > and WILC_RF_REVISION_ID and we do this at least three times in a row on each > > chip_wakeup(). Wow. Is this really necessary? > > Looks like I'm on to something. If I replace the gross code with: > > h->hif_read_reg(wilc, WILC_SPI_WAKEUP_REG, ®); > h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, reg | WILC_SPI_WAKEUP_BIT); > h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, reg & ~WILC_SPI_WAKEUP_BIT); > > while (1) { > if (wilc->hif_func->hif_read_reg(wilc, WILC_CHIPID, &chipid) == 0) > break; > usleep_range(2000, 2500); > } > > Look what happens to TX throughput: > > [ 4] 0.00-120.00 sec 142 MBytes 9.92 Mbits/sec 6 sender > > This is with power_save mode turned on. Almost exactly 3 times faster than the > 3.33 Mbits/sec I was seeing before. Over 120 seconds of iperf3, a histogram of the number of WILC_CHIPID reads required to get a successful read looked like this: [ 493.918785] hist[ 0]=70417 [ 493.921609] hist[ 1]=0 [ 493.923997] hist[ 2]=0 [ 493.926424] hist[ 3]=0 [ 493.928957] hist[ 4]=0 [ 493.931359] hist[ 5]=0 [ 493.933781] hist[ 6]=0 [ 493.936210] hist[ 7]=0 [ 493.938702] hist[ 8]=0 [ 493.941097] hist[ 9]=0 [ 493.943523] hist[ 10]=0 [ 493.945951] hist[ 11]=0 [ 493.948418] hist[ 12]=0 [ 493.950812] hist[ 13]=0 [ 493.953238] hist[ 14]=0 [ 493.955667] hist[ 15]=0 [ 493.958126] hist[ 16]=0 [ 493.960527] hist[ 17]=0 [ 493.962954] hist[ 18]=1 [ 493.965382] hist[ 19]=7 [ 493.967840] hist[ 20]=1 [ 493.970242] hist[ 21]=13 Based on this, I changed the above code to: while (1) { h->hif_read_reg(wilc, WILC_SPI_WAKEUP_REG, ®); h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, reg | WILC_SPI_WAKEUP_BIT); h->hif_write_reg(wilc, WILC_SPI_WAKEUP_REG, reg & ~WILC_SPI_WAKEUP_BIT); if (wilc->hif_func->hif_read_reg(wilc, WILC_CHIPID, &chipid) == 0) break; usleep_range(2000, 2500); } This gets me: TX/PSM off: [ 4] 0.00-120.00 sec 198 MBytes 13.8 Mbits/sec 9 sender TX/PSM on : [ 4] 0.00-120.00 sec 140 MBytes 9.78 Mbits/sec 9 sender RX/PSM off: [ 5] 0.00-120.10 sec 304 MBytes 21.2 Mbits/sec receiver RX/PSM on : [ 5] 0.00-120.11 sec 288 MBytes 20.1 Mbits/sec receive Ajay, I'm guessing your kernel is either configured for tickless timer or HZ=1000? That would explain why you'd see higher throughput with the original code. My system is configured for HZ=100, so the usleep_range() would typically take 10ms. Any objections to the latest code? --david