Hi Luiz,

On Wed, 2011-08-03 at 13:49 -0400, Luiz Augusto von Dentz wrote:
> Hi Peter,
>
> On Wed, Aug 3, 2011 at 7:25 PM, Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > I believe that the existing tx scheduler is already inefficient and
> > that this approach significantly compounds that inefficiency.
> >
> > Consider the following scenario with 4 ACL connections (keyboard,
> > mouse, phone, headset):
> >
> > For *one* scheduled ACL tx packet, the connection list will be
> > searched 35~42 times -- that's 140~168 connection list node visits
> > to find one packet! Here's the math:
> >   (7 - priority of tx packet) x 4 connections => found the packet
> >   7 pri levels x 4 connections => no more ACL tx packets
> >   7 pri levels x 4 connections => no SCO tx packets
> >   7 pri levels x 4 connections => no ESCO tx packets
> >   7 pri levels x 4 connections => no LE tx packets
> >   7 pri levels x 4 connections => recalc priority
> >
> > If the connection type doesn't match, it's not going to match at any
> > priority level.
>
> Obviously there are a few things that we might change to not traverse
> the list of connections over and over; remember this is an RFC.
> Besides, it is 8 priorities (0-7), but we can skip much earlier if the
> connection type doesn't match, as you said.

I'm sorry if my comments appeared harsh and I apologize if I offended
you. I get that it's an RFC :)

Personally, I like the idea of prioritizing the tx scheduler, and I
applaud the effort.

My point here (besides pointing out the massive iteration) was to
suggest that perhaps some scheduler state should be preserved so that
already-visited priority levels are not revisited more than once per
tx tasklet schedule.

> Also I guess for SCO/ESCO/LE it doesn't make much sense to have many
> queues/priorities; it is basically ACL only, and that simplifies a
> lot already.

LE should still have priorities. In fact, in a private test we've been
running here, we've merged the scheduler so that LE conns that share
ACL buffers are scheduled alongside other ACL conns (because they
share the same resource -- namely, the acl_cnt). We also merged
SCO/ESCO scheduling together as well.

> >> +
> >> +        if (c->state != BT_CONNECTED && c->state != BT_CONFIG)
> >> +            continue;
> >>
> >> -        num++;
> >> +        num++;
> >>
> >> -        if (c->sent < min) {
> >> -            min  = c->sent;
> >> -            conn = c;
> >> +        if (c->sent < min) {
> >> +            min  = c->sent;
> >> +            conn = c;
> >> +            *queue = &c->data_q[i];
> >> +        }
> >
> > Why preserve the fairness logic?
>
> It does need to be fair if there are 2 or more sockets with the same
> priority; otherwise the first connection in the list might get all
> the quota, and even if we promote the starving queues it may still
> happen at the topmost priority, since there it cannot be promoted
> anymore.

Right. What I meant here was this: if the tx scheduler is re-made as
priority-based, then it seems logical to discard the fairness logic
and design the priority scheme such that starvation is not possible.

For example, the pri levels could be like this:
  0 ~ 5  socket-programmable priorities
  6      special CAP_NET_ADMIN priority
  7      only possible via promotion

(Not that I'm suggesting that's the only or even best way to make
starvation not possible.)
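To make that split concrete, the socket-facing check could look
something like the following. This is just an untested sketch, not
working code: bt_sock_set_priority() and HCI_PRIO_ADMIN are names I
made up here, and HCI_PRIO_MAX is the value from your RFC.

#include <linux/capability.h>
#include <linux/errno.h>
#include <net/sock.h>

#define HCI_PRIO_MAX	8	/* from the RFC: priorities 0-7 */
#define HCI_PRIO_ADMIN	6	/* assumed CAP_NET_ADMIN-only level */

/* Hypothetical sockopt handler: 0-5 freely programmable, 6 requires
 * CAP_NET_ADMIN, 7 unreachable from userspace (promotion only). */
static int bt_sock_set_priority(struct sock *sk, int prio)
{
	if (prio < 0 || prio >= HCI_PRIO_MAX - 1)
		return -EINVAL;		/* level 7 only via promotion */

	if (prio == HCI_PRIO_ADMIN && !capable(CAP_NET_ADMIN))
		return -EPERM;

	sk->sk_priority = prio;
	return 0;
}

That way nothing a normal application can request ever lands on level
7, so the promotion level never has to compete with socket-set
priorities.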
> >> +static void hci_prio_recalculate(struct hci_dev *hdev)
> >> +{
> >> +    struct hci_conn_hash *h = &hdev->conn_hash;
> >> +    int i;
> >> +
> >> +    BT_DBG("%s", hdev->name);
> >> +
> >> +    for (i = HCI_PRIO_MAX - 1; i > 0; i--) {
> >> +        struct list_head *p;
> >> +
> >> +        list_for_each(p, &h->list) {
> >> +            struct hci_conn *c;
> >> +            c = list_entry(p, struct hci_conn, list);
> >> +
> >> +            if (c->sent || skb_queue_empty(&c->data_q[i - 1]))
> >> +                continue;
> >
> > The test for unacked packets does not allow the priority to advance
> > if the unacked packets are from a previously executed hci_tx_task
> > (i.e., a connection transmitted packets on an earlier scheduled
> > tx_task but those packets haven't been acked yet).
>
> IMO it does; why should it be promoted to a higher priority if it
> still has data pending from the last time? This is specifically to
> throttle lower priority in favor of higher. If throughput is
> important, the application should be able to set its priority
> accordingly, and it can do it at any time.
>
> > This would allow a higher priority connection to almost monopolize
> > transmission when the controller is near max load. A higher
> > priority link will receive a tx quota of all available tx buffers,
> > whereas a lower priority link will not advance until its previous
> > tx packets are acked. Worst-case scheduling could take a long time
> > to schedule a lower priority packet.
>
> You probably have never run this code, did you? Only priority 7 can
> really monopolize the connection, and that is on purpose. And yes,
> lower priorities are throttled so higher priorities can get lower
> latency; what is wrong with that?

With no offense meant, simply running this code would be insufficient
to qualify its appropriateness as a replacement scheduler. A
bare-minimum test matrix would ascertain comparative values for
minimum/mean/maximum latency and throughput for
best-case/nominal/worst-case tx loads.

My main concern here is that app-level socket settings can have
dramatic effects on tx scheduling behavior, especially at near max tx
loads.

What I meant about near monopolization is this: a priority 6 link that
retires 2 packets every 55 ms would mean that priority 0 links could
only xmit every 330 ms (1/3 sec.), assuming the controller was already
saturated by the priority 6 link. Even priority 3 links are only going
to be able to xmit every 165 ms under these conditions.

What if, instead, links that are being promoted because the controller
is saturated are promoted to the same (or maybe higher) priority level
than was last successful? (I've put a rough sketch of what I mean a
little further down.)

> >>  static void hci_tx_task(unsigned long arg)
> >>  {
> >>      struct hci_dev *hdev = (struct hci_dev *) arg;
> >> @@ -2253,6 +2316,8 @@ static void hci_tx_task(unsigned long arg)
> >>      while ((skb = skb_dequeue(&hdev->raw_q)))
> >>          hci_send_frame(skb);
> >>
> >> +    hci_prio_recalculate(hdev);
> >
> > The priority recalculation should only be performed if one of the
> > hci_sched_* functions was unable to schedule all outstanding
> > packets (i.e., a tx buffer count dropped to zero).
>
> Yep, gonna fix that too.
>
> > How does this scheduler + priority recalc behave if *no* packets
> > could be scheduled at all? For example, if acl_cnt == 0 when the
> > tx_task is scheduled?
>
> Right now it promotes starving queues, all of them, but maybe it
> shouldn't, so that we reduce the complexity a little bit.

My thinking here is that the tx tasklet shouldn't be scheduled at all
if no work can be performed.
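Coming back to the promotion question above, here is roughly what I
had in mind. Again, just an untested sketch: last_tx_prio is a field
that doesn't exist -- it would have to be recorded by the hci_sched_*
helpers whenever a frame from that connection is actually handed to
the controller -- and data_q[] / HCI_PRIO_MAX are taken from your RFC.

#include <net/bluetooth/bluetooth.h>
#include <net/bluetooth/hci_core.h>

/* Sketch only: instead of climbing one level per recalculation,
 * promote a connection's queued data straight to the level that last
 * made progress for it (capped below 7, which stays promotion-only). */
static void hci_prio_promote(struct hci_dev *hdev)
{
	struct hci_conn_hash *h = &hdev->conn_hash;
	struct hci_conn *c;

	list_for_each_entry(c, &h->list, list) {
		int i, target;

		if (c->state != BT_CONNECTED && c->state != BT_CONFIG)
			continue;

		/* last_tx_prio: hypothetical field, set by hci_sched_*()
		 * when a frame from this connection is sent */
		target = min(c->last_tx_prio, HCI_PRIO_MAX - 2);

		for (i = 0; i < target; i++)
			if (!skb_queue_empty(&c->data_q[i]))
				skb_queue_splice_tail_init(&c->data_q[i],
							   &c->data_q[target]);
	}
}

Whether the jump should be to exactly the last successful level or one
above it is precisely the kind of thing I'd want latency numbers for.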
Of course, as far as scheduling the tx tasklet when there is nothing
it can do, the existing code suffers from the same defect. Really, the
only reason the tx tasklet *needs* to be scheduled when the relevant
buffer counter is 0 is to catch tx timeouts.

Respectfully,
Peter Hurley
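P.S. Roughly the kind of guard I'm picturing is below -- an untested
sketch, not a patch. hci_tx_schedule() is a made-up wrapper around the
existing tasklet_schedule() call sites, and the 45 second figure just
mirrors the ACL tx timeout check already done in hci_sched_acl().

#include <linux/interrupt.h>
#include <linux/jiffies.h>
#include <net/bluetooth/bluetooth.h>
#include <net/bluetooth/hci_core.h>

/* Sketch: only kick the tx tasklet when it can actually send
 * something, or when the ACL tx timeout needs checking (a stalled
 * controller must still be noticed even though acl_cnt is 0).
 * A fuller check would also verify something is queued to send. */
static void hci_tx_schedule(struct hci_dev *hdev)
{
	bool can_send = hdev->acl_cnt || hdev->sco_cnt || hdev->le_cnt ||
			!skb_queue_empty(&hdev->raw_q);

	bool timeout_due = !hdev->acl_cnt &&
			time_after(jiffies, hdev->acl_last_tx + HZ * 45);

	if (can_send || timeout_due)
		tasklet_schedule(&hdev->tx_task);
}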