Pauli Virtanen wrote: > Hi, > > ke, 2025-03-19 kello 16:00 -0400, Willem de Bruijn kirjoitti: > > Pauli Virtanen wrote: > > > ke, 2025-03-19 kello 10:44 -0400, Willem de Bruijn kirjoitti: > > > > Jason Xing wrote: > > > > > On Wed, Mar 19, 2025 at 3:10 AM Pauli Virtanen <pav@xxxxxx> wrote: > > > > > > > > > > > > Support enabling TX timestamping for some skbs, and track them until > > > > > > packet completion. Generate software SCM_TSTAMP_COMPLETION when getting > > > > > > completion report from hardware. > > > > > > > > > > > > Generate software SCM_TSTAMP_SND before sending to driver. Sending from > > > > > > driver requires changes in the driver API, and drivers mostly are going > > > > > > to send the skb immediately. > > > > > > > > > > > > Make the default situation with no COMPLETION TX timestamping more > > > > > > efficient by only counting packets in the queue when there is nothing to > > > > > > track. When there is something to track, we need to make clones, since > > > > > > the driver may modify sent skbs. > > > > > > > > Why count packets at all? And if useful separate from completions, > > > > should that be a separate patch? > > > > > > This paragraph was commenting on the implementation of struct tx_queue, > > > and maybe how it works should be explicitly explained somewhere (code > > > comment?). Here's some explanation of it: > > > > > > 1) We have to hang on (clones of) skbs until completion reports for > > > them arrive, in order to emit COMPLETION timestamps. There's no > > > existing queue that does this in net/bluetooth (drivers may just copy > > > data & discard skbs, and they don't know about completion reports), so > > > something new needs to be added. > > > > > > 2) It is only needed for emitting COMPLETION timestamps. So it's better > > > to not do any extra work (clones etc.) when there are no such > > > timestamps to be emitted. > > > > > > 3) The new queue should work correctly when timestamping is turned on > > > or off, or only some packets are timestamped. It should also eventually > > > return to a state where no extra work is done, when new skbs don't > > > request COMPLETION timestamps. > > > > So far, fully understood. > > > > > struct tx_queue implements such queue that only "tracks" some skbs. > > > Logical structure: > > > > > > HEAD > > > <no stored skb> } > > > <no stored skb> } tx_queue::extra is the number of non-tracked > > > ... } logical items at queue head > > > <no stored skb> } > > > <tracked skb> } tx_queue::queue contains mixture of > > > <non-tracked skb> } tracked items (skb->sk != NULL) and > > > <non-tracked skb> } non-tracked items (skb->sk == NULL). > > > <tracked skb> } These are ordered after the "extra" items. > > > TAIL > > > > > > tx_queue::tracked is the number of tracked skbs in tx_queue::queue. > > > > > > hci_conn_tx_queue() determines whether skb is tracked (= COMPLETION > > > timestamp shall be emitted for it) and pushes a logical item to TAIL. > > > > > > hci_conn_tx_dequeue() pops a logical item from HEAD, and emits > > > timestamp if it corresponds to a tracked skb. > > > > > > When tracked == 0, queue() can just increment tx_queue::extra, and > > > dequeue() can remove any skb from tx_queue::queue, or if empty then > > > decrement tx_queue::extra. This allows it to return to a state with > > > empty tx_queue::queue when new skbs no longer request timestamps. > > > > > > When tracked != 0, the ordering of items in the queue needs to be > > > respected strictly, so queue() always pushes real skb (tracked or not) > > > to TAIL, and dequeue() has to decrement extra to zero, before it can > > > pop skb from queue head. > > > > Thanks. I did not understand why you need to queue or track any > > sbs aside from those that have SKBTX_COMPLETION_TSTAMP. > > > > If I follow correctly this is to be able to associate the tx > > completion with the right skb on the queue. > > Yes, it was done to maintain the queue/dequeue ordering. > > > The usual model in Ethernet drivers is that every tx descriptor (and > > completion descriptor) in the ring is associated with a pure software > > ring of metadata structures, which can point to an skb (or NULL). > > > > In a pinch, instead the skb on the queue itself could record the > > descriptor id that it is associated with. But hci_conn_tx_queue is > > too far removed from the HW, so has no direct access to that. And > > similarly hci_conn_tx_dequeue has no such low level details. > > > > So long story short you indeed have to track this out of band with > > a separate counter. I also don't immediately see a simpler way. > > > > Though you can perhaps replace the skb_clone (not the skb_clone_sk!) > > with some sentinel value that just helps count? > > It probably could be done a bit smarter, it could eg. use something > else than skb_queue. Or, I think we can clobber cb here as the clones > are only used for timestamping, so: > > > struct tx_queue { > unsigned int pre_items; > struct sk_buff_head queue; > }; > > struct tx_queue_cb { > unsigned int post_items; > }; > > static void hci_tx_queue_push(struct tx_queue *q, struct sk_buff *skb) > { > struct tx_queue_cb *cb; > > /* HEAD > * <non-tracked item> } > * ... } tx_queue::pre_items of these > * <non-tracked item> } > * <tracked skb1> <- tx_queue::queue first item > * <non-tracked item> } > * ... } ((struct tx_queue_cb *)skb1->cb)->post_items > * <non-tracked item> } > * ... > * <tracked skbn> <- tx_queue::queue n-th item > * <non-tracked item> } > * ... } ((struct tx_queue_cb *)skbn->cb)->post_items > * <non-tracked item> } > * TAIL > */ > if (skb) { > cb = (struct tx_queue_cb *)skb->cb; > cb->post_items = 0; > skb_queue_tail(&q->queue, skb); > } else { > skb = skb_peek_tail(&q->queue); > if (skb) { > cb = (struct tx_queue_cb *)skb->cb; > cb->post_items++; > } else { > q->pre_items++; > } > } > } > > static struct sk_buff *hci_tx_queue_pop(struct tx_queue *q) > { > struct sk_buff *skb; > struct tx_queue_cb *cb; > > if (q->pre_items) { > q->pre_items--; > return NULL; > } > > skb = skb_dequeue(&q->queue); > if (skb) { > cb = (struct tx_queue_cb *)skb->cb; > q->pre_items += cb->post_items; > } > > return skb; > } > > void hci_conn_tx_queue(struct hci_conn *conn, struct sk_buff *skb) > { > /* Emit SND now, ie. just before sending to driver */ > if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP) > __skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SND); > > /* COMPLETION tstamp is emitted for tracked skb later in Number of > * Completed Packets event. Available only for flow controlled cases. > * > * TODO: SCO support without flowctl (needs to be done in drivers) > */ > switch (conn->type) { > case ISO_LINK: > case ACL_LINK: > case LE_LINK: > break; > case SCO_LINK: > case ESCO_LINK: > if (!hci_dev_test_flag(conn->hdev, HCI_SCO_FLOWCTL)) > return; > break; > default: > return; > } > > if (skb->sk && (skb_shinfo(skb)->tx_flags & SKBTX_COMPLETION_TSTAMP)) > skb = skb_clone_sk(skb); > else > skb = NULL; > > hci_tx_queue_push(&conn->tx_q, skb); > return; > } > > void hci_conn_tx_dequeue(struct hci_conn *conn) > { > struct sk_buff *skb = hci_tx_queue_pop(&conn->tx_q); > > if (skb) { > __skb_tstamp_tx(skb, NULL, NULL, skb->sk, > SCM_TSTAMP_COMPLETION); > kfree_skb(skb); > } > } Neat. To be clear, your call. Just if the expectation is that timestamped packets are rare even when enabled (e.g., due to sampling, or enabling only for one of many sockets), then avoiding the skb_clone in the common case may be worthwhile.