Re: [PATCH 3/4] Bluetooth: Limit depth of the HCI TX queue with ERTM mode

Mat Martineau <mathewm@xxxxxxxxxxxxxx> · Tue, 14 Jun 2011 16:31:10 -0700 (PDT)

Hi Gustavo -

On Thu, 9 Jun 2011, Mat Martineau wrote:

Gustavo,

On Wed, 8 Jun 2011, Gustavo F. Padovan wrote:

Hi Mat,

* Mat Martineau <mathewm@xxxxxxxxxxxxxx> [2011-06-03 16:21:09 -0700]:

In order to provide timely responses to REJ, SREJ, and poll input from
the remote device, it helps to reduce the number of ERTM data frames
in the HCI TX queue at one time. If a full TX window of data is in the
HCI TX queue, any responses to REJ, SREJ, or polls must wait in line
behind all previously queued data. This latency leads to disconnects,
and will be more severe with extended window sizes.

I would prefer that we go with a hci_send_acl_prio() implementation. It
will have much less overhead than using a workqueue. As it will be
filled only by S-frames with a few bytes each, I don't think we will
have problems. So let's go with this approach and see what we can
get.

I considered that approach too, but it breaks some major assumptions and I 
don't think it complies with the ERTM spec.  I-frames contain reqseq fields 
and a final bit, so if S-frames and I-frames are delivered out-of-sequence, 
you can easily end up with a confusing series of reqseq values at the 
receiver.

Suppose the HCI tx queue is full of I-frames, and the oldest I-frame has 
reqseq set to 1.  Since that I-frame was queued, other incoming I-frames 
have been processed, so the last received I-frame had txseq 20.  The remote 
device sends a poll, and we reply with an RR (reqseq 21) using the priority 
queue.  HCI sends that RR first, then an I-frame from the normal queue with 
reqseq 1.  Now the remote side thinks it missed all of the frames from 21 to 
1 (having wrapped around).  The remote side then has to send REJ or SREJ 
frames, even though nothing is actually missing.
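
To make the ordering problem concrete, here is a rough userspace model of the
reqseq values the remote side would see in each case. It is only a sketch:
reqseq_advance() and max_advance() are made-up helper names, not kernel
functions, and it assumes the standard control field with 6-bit sequence
numbers (modulus 64) and that the last reqseq the remote saw from us was 1.

```c
#include <stddef.h>

/* Standard (non-extended) ERTM control field: 6-bit sequence numbers. */
#define SEQ_MODULUS 64u

/* Hypothetical helper: how far reqseq appears to move forward, modulo 64. */
static unsigned int reqseq_advance(unsigned int prev, unsigned int next)
{
	return (next + SEQ_MODULUS - prev) % SEQ_MODULUS;
}

/*
 * Hypothetical helper: walk the reqseq values in the order the remote
 * receives them and return the largest apparent forward jump.
 */
static unsigned int max_advance(unsigned int last_seen,
				const unsigned int *reqseq, size_t n)
{
	unsigned int worst = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int adv = reqseq_advance(last_seen, reqseq[i]);

		if (adv > worst)
			worst = adv;
		last_seen = reqseq[i];
	}
	return worst;
}
```

With frames sent in order (I-frame with reqseq 1, then RR with reqseq 21),
max_advance(1, ...) is 20.  With the RR jumping the queue (reqseq 21, then
reqseq 1), the second frame looks like a jump of 44, which is the wraparound
described above.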

So, I think we have two options:

* Use the skb_destructor mechanism to pull data for ERTM (which is what my 
patch does), and leave queuing for other modes alone
* Rearchitect HCI & L2CAP so that data is pulled from the L2CAP layer as 
num_comp_pkts events are received

I realize there is increased overhead to make the callbacks to get data out 
of the ERTM tx queue, but the skb destructor is very lightweight (since it 
uses an atomic_t counter).  The overhead is tunable using 
L2CAP_MAX_ERTM_QUEUED and L2CAP_MIN_ERTM_QUEUED to control how often the 
callback to l2cap_ertm_send() is actually made. With the current queuing 
behavior, things get unmanageable on AMP with extra latency from larger tx 
windows and much shorter timeouts.
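
The high/low watermark idea can be sketched in userspace roughly as follows.
This is not the patch itself: struct ertm_chan, ertm_chan_init(), ertm_send()
and frame_destructor() are stand-ins invented for illustration, C11 atomics
stand in for atomic_t, and the two limits are placeholder values rather than
the real L2CAP_MAX_ERTM_QUEUED and L2CAP_MIN_ERTM_QUEUED settings.

```c
#include <assert.h>
#include <stdatomic.h>

/* Placeholder limits, assumed for this sketch only. */
#define MAX_ERTM_QUEUED 5
#define MIN_ERTM_QUEUED 2

struct ertm_chan {
	atomic_int hci_queued;	/* frames handed to HCI and not yet freed */
	int pending;		/* frames still waiting in the ERTM tx queue */
	int refills;		/* times the destructor called back into send */
};

static void ertm_chan_init(struct ertm_chan *c, int pending)
{
	atomic_init(&c->hci_queued, 0);
	c->pending = pending;
	c->refills = 0;
}

/* Stand-in for the send path: top the HCI queue up to the high mark. */
static void ertm_send(struct ertm_chan *c)
{
	while (c->pending > 0 &&
	       atomic_load(&c->hci_queued) < MAX_ERTM_QUEUED) {
		c->pending--;
		atomic_fetch_add(&c->hci_queued, 1);
	}
}

/* Stand-in for the skb destructor: one cheap atomic decrement per frame. */
static void frame_destructor(struct ertm_chan *c)
{
	int left = atomic_fetch_sub(&c->hci_queued, 1) - 1;

	assert(left >= 0);

	/* Only call back into the send path at the low mark. */
	if (left <= MIN_ERTM_QUEUED && c->pending > 0) {
		c->refills++;
		ertm_send(c);
	}
}
```

With these placeholder limits, the send path runs once for every three frames
freed, and at most five data frames sit in front of any S-frame response.
Moving the two limits closer together or further apart trades callback
frequency against queue depth.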

Just pinging you regarding the ERTM tx queuing questions.  Please let 
me know what I can do!

--
Mat Martineau
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum