Re: [RFC PATCH 00/17] xdp: Add packet queueing and scheduling capabilities

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 14, 2022 at 12:46:54PM +0200, Toke Høiland-Jørgensen wrote:
> Stanislav Fomichev <sdf@xxxxxxxxxx> writes:
> 
> > On Wed, Jul 13, 2022 at 2:52 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
> >>
> >> Stanislav Fomichev <sdf@xxxxxxxxxx> writes:
> >>
> >> > On Wed, Jul 13, 2022 at 4:14 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
> >> >>
> >> >> Packet forwarding is an important use case for XDP, which offers
> >> >> significant performance improvements compared to forwarding using the
> >> >> regular networking stack. However, XDP currently offers no mechanism to
> >> >> delay, queue or schedule packets, which limits the practical uses for
> >> >> XDP-based forwarding to those where the capacity of input and output links
> >> >> always match each other (i.e., no rate transitions or many-to-one
> >> >> forwarding). It also prevents an XDP-based router from doing any kind of
> >> >> traffic shaping or reordering to enforce policy.
> >> >>
> >> >> This series represents a first RFC of our attempt to remedy this lack. The
> >> >> code in these patches is functional, but needs additional testing and
> >> >> polishing before being considered for merging. I'm posting it here as an
> >> >> RFC to get some early feedback on the API and overall design of the
> >> >> feature.
> >> >>
> >> >> DESIGN
> >> >>
> >> >> The design consists of three components: A new map type for storing XDP
> >> >> frames, a new 'dequeue' program type that will run in the TX softirq to
> >> >> provide the stack with packets to transmit, and a set of helpers to dequeue
> >> >> packets from the map, optionally drop them, and to schedule an interface
> >> >> for transmission.
> >> >>
> >> >> The new map type is modelled on the PIFO data structure proposed in the
> >> >> literature[0][1]. It represents a priority queue where packets can be
> >> >> enqueued in any priority, but is always dequeued from the head. From the
> >> >> XDP side, the map is simply used as a target for the bpf_redirect_map()
> >> >> helper, where the target index is the desired priority.
> >> >
> >> > I have the same question I asked on the series from Cong:
> >> > Any considerations for existing carousel/edt-like models?
> >>
> >> Well, the reason for the addition in patch 5 (continuously increasing
> >> priorities) is exactly to be able to implement EDT-like behaviour, where
> >> the priority is used as time units to clock out packets.
> >
> > Ah, ok, I didn't read the patches closely enough. I saw some limits
> > for the ranges and assumed that it wasn't capable of efficiently
> > storing 64-bit timestamps..
> 
> The goal is definitely to support full 64-bit priorities. Right now you
> have to start out at 0 but can go on for a full 64 bits, but that's a
> bit of an API wart that I'd like to get rid of eventually...
> 
> >> > Can we make the map flexible enough to implement different qdisc
> >> > policies?
> >>
> >> That's one of the things we want to be absolutely sure about. We are
> >> starting out with the PIFO map type because the literature makes a good
> >> case that it is flexible enough to implement all conceivable policies.
> >> The goal of the test harness linked as note [4] is to actually examine
> >> this; Frey is our PhD student working on this bit.
> >>
> >> Thus far we haven't hit any limitations on this, but we'll need to add
> >> more policies before we are done with this. Another consideration is
> >> performance, of course, so we're also planning to do a comparison with a
> >> more traditional "bunch of FIFO queues" type data structure for at least
> >> a subset of the algorithms. Kartikeya also had an idea for an
> >> alternative way to implement a priority queue using (semi-)lockless
> >> skiplists, which may turn out to perform better.
> >>
> >> If there's any particular policy/algorithm you'd like to see included in
> >> this evaluation, please do let us know, BTW! :)
> >
> > I honestly am not sure what the bar for accepting this should be. But
> > on the Cong's series I mentioned Martin's CC bpf work as a great
> > example of what we should be trying to do for qdisc-like maps. Having
> > a bpf version of fq/fq_codel/whatever_other_complex_qdisc might be
> > very convincing :-)
> 
> Just doing flow queueing is quite straight forward with PIFOs. We're
> working on fq_codel. Personally I also want to implement something that
> has feature parity with sch_cake (which includes every feature and the
> kitchen sink already) :)

And how exactly would you plan to implement Least Slack Time First with
PIFOs?  See https://www.usenix.org/system/files/nsdi20-paper-sharma.pdf.
Can you be as specific as possible ideally with a pesudo code?

BTW, this is very easy to do with my approach as no FO limitations.

Thanks!



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux