Re: debugging TCP stalls on high-speed wifi

On 12/12/19 1:46 PM, Johannes Berg wrote:
On Thu, 2019-12-12 at 13:29 -0800, Ben Greear wrote:

(*) Hmm. Now I have another idea. Maybe we have some kind of problem
with the medium access configuration, and we transmit all this data
without the AP having a chance to send back all the ACKs? Too bad I
can't put an air sniffer into the setup - it's a conductive setup.

splitter/combiner?

I guess. I haven't looked at it, it's halfway around the world or
something :)

If it is just delayed acks coming back, which would slow down a stream, then
multiple streams would tend to work around that problem?

Only a bit, because it allows somewhat more outstanding data. But each
stream estimates the throughput lower in its congestion control
algorithm, so it would have a smaller window size?

What I was thinking is that if we have some kind of skew in the system
and always/frequently/sometimes make our transmissions have priority
over the AP transmissions, then we'd not get ACKs back, and that might
cause what I see - the queue drains entirely and *then* we get an ACK
back...

That's not a _bad_ theory and I'll have to find a good way to test it,
but I'm not entirely convinced that's the problem.

Oh, actually, I guess I know it's *not* the problem because otherwise
the ss output would show we're blocked on congestion window far more
than it looks like now? I think?

If you get rough packet counters or traffic characteristics, you could set up UDP flows to mimic
the download and upload packet behaviour and run them concurrently.  If you can still push a good
bit more UDP up even with small UDP packets emulating TCP ACKs coming down, then I think you can be
confident that it is not ACKs clogging up the RF or the AP being starved for airtime.
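
A minimal sketch of the UDP test suggested above, assuming plain Python sockets. The function name send_udp_flow, the payload sizes, and the rates are hypothetical and should be replaced with values from your own packet counters. Python will not saturate a high-speed link, so this only shows the shape of the experiment; a dedicated traffic generator would be used for real numbers.

```python
import socket
import time


def send_udp_flow(dest, payload_size, packets_per_sec, duration_s):
    """Send fixed-size UDP datagrams to dest at roughly packets_per_sec.

    dest is a (host, port) tuple. Returns the number of datagrams
    handed to the kernel; it does not confirm delivery.
    """
    payload = b"\x00" * payload_size
    interval = 1.0 / packets_per_sec
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        start = time.monotonic()
        next_send = start
        while True:
            now = time.monotonic()
            if now - start >= duration_s:
                break
            if now < next_send:
                # Simple pacing: wait until the next scheduled send time.
                time.sleep(next_send - now)
            sock.sendto(payload, dest)
            sent += 1
            next_send += interval
    return sent
```

To use it, run one flow on the station with large datagrams (standing in for the TCP data going up) and, at the same time, one flow on the wired side with small datagrams (standing in for the ACKs coming down), for example send_udp_flow((host, port), 1400, rate, 30) and send_udp_flow((host, port), 64, rate, 30) with your own host, port, and rate. If the upstream flow still gets through well above the TCP rate, ACK airtime is unlikely to be the limit.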

Since the Windows driver works better, it probably does not have much to do with ACKs or
downstream traffic anyway.

		TCP_TSQ=200

Setting it to 200 is way excessive. In particular since you already get
the *8 from the default mac80211 behaviour, so now you effectively have
*1600, which means instead of 1ms you can have 1.6s worth of TCP data on
the queues ... way too much :)
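
The arithmetic above, written out as a small sketch. The three input figures (1 ms, *8, 200) come from the message; the function names and the 1 Gbit/s rate used to turn time into bytes are assumptions for illustration only.

```python
# Figures quoted in the message above.
BASE_TSQ_MS = 1        # "instead of 1ms"
MAC80211_FACTOR = 8    # "the *8 from the default mac80211 behaviour"
TCP_TSQ = 200          # the local setting under discussion


def queued_time_ms(base_ms, mac80211_factor, tcp_tsq):
    """Amount of TCP data allowed on the queues, expressed as time in ms."""
    return base_ms * mac80211_factor * tcp_tsq


def queued_bytes(rate_bits_per_s, time_ms):
    """Bytes that fit into time_ms at the given rate (integer arithmetic)."""
    return rate_bits_per_s * time_ms // 1000 // 8


total_ms = queued_time_ms(BASE_TSQ_MS, MAC80211_FACTOR, TCP_TSQ)
print(total_ms, "ms")  # 1600 ms, i.e. 1.6 s

# Assumed rate of 1 Gbit/s, purely to give a sense of scale.
print(queued_bytes(1_000_000_000, total_ms), "bytes")  # 200000000 bytes
```

At the assumed 1 Gbit/s, that is about 200 MB of data that could sit on the queues.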

Yes, this was hacked in back when the pacing did not work well with ath10k.
I'll do some tests to see how much this matters on modern kernels when I get
a chance.

This will allow huge congestion control windows....

Thanks,
Ben


--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc  http://www.candelatech.com



