Re: [PATCH net-next] net: introduce SO_INCOMING_CPU

On Fri, Nov 14, 2014 at 12:34 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Fri, Nov 14, 2014 at 12:25 PM, Tom Herbert <therbert@xxxxxxxxxx> wrote:
>> On Fri, Nov 14, 2014 at 12:16 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>> On Fri, Nov 14, 2014 at 11:52 AM, Tom Herbert <therbert@xxxxxxxxxx> wrote:
>>>> On Fri, Nov 14, 2014 at 11:33 AM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
>>>>> On Fri, 2014-11-14 at 09:17 -0800, Andy Lutomirski wrote:
>>>>>
>>>>>> As a heavy user of RFS (and finder of bugs in it, too), here's my
>>>>>> question about this API:
>>>>>>
>>>>>> How does an application tell whether the socket represents a
>>>>>> non-actively-steered flow?  If the flow is subject to RFS, then moving
>>>>>> the application handling to the socket's CPU seems problematic, as the
>>>>>> socket's CPU might move as well.  The current implementation in this
>>>>>> patch seems to tell me which CPU the most recent packet came in on,
>>>>>> which is not necessarily very useful.
>>>>>
>>>>> It's the CPU that hit the TCP stack, bringing dozens of cache lines into
>>>>> its cache. This is all that matters.
>>>>>
>>>>>>
>>>>>> Some possibilities:
>>>>>>
>>>>>> 1. Let SO_INCOMING_CPU fail if RFS or RPS are in play.
>>>>>
>>>>> Well, the idea is to not use RFS at all. Otherwise, it is useless.
>>>
>>> Sure, but how do I know that it'll be the same CPU next time?
>>>
>>>>>
>>>> Bear in mind this is only an interface to report the RX CPU; in itself it
>>>> doesn't provide any functionality for changing scheduling. Logic is
>>>> obviously needed in user space to act on the reported value.
>>>>
>>>> If we track the interrupting CPU in skb, the interface could be easily
>>>> extended to provide the interrupting CPU, the RPS CPU (calculated at
>>>> reported time), and the CPU processing transport (post steering which
>>>> is what is currently returned). That would provide the complete
>>>> picture to control scheduling a flow from userspace, and an interface
>>>> to selectively turn off RFS for a socket would make sense then.
>>>
>>> I think that a turn-off-RFS interface would also want a way to figure
>>> out where the flow would go without RFS.  Can the network stack do
>>> that (e.g. evaluate the rx indirection hash or whatever happens these
>>> days)?
>>>
>> Yes. We need the rxhash and the CPU that packets are received on from
>> the device for the socket. The former we already have; the latter
>> might be done by adding a field to skbuff to record the receiving CPU. Given
>> the L4 hash and interrupting CPU, we can calculate the RPS CPU, which
>> is where the packet would have landed with RFS off.
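For illustration only, here is a rough sketch of the calculation described above. It assumes the interrupting CPU has already been used to find the receive queue, and that the queue's RPS map (the CPUs enabled in its rps_cpus mask) is available as a plain array. The struct and function names are hypothetical and not kernel API. The multiply-and-shift scaling is, to my understanding, the same idea get_rps_cpu() uses to index the map, but the real code also handles offline CPUs and the RFS flow table, which this sketch leaves out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a receive queue's RPS map; this is NOT the
 * kernel's struct rps_map, just enough state for the sketch. */
struct rps_map_sketch {
    size_t len;            /* number of CPUs in the map */
    const uint16_t *cpus;  /* CPU ids, in mask order */
};

/* Scale a 32-bit hash into [0, len) without a modulo: (hash * len) >> 32. */
static uint32_t scale_hash(uint32_t hash, uint32_t len)
{
    return (uint32_t)(((uint64_t)hash * len) >> 32);
}

/* Return the CPU that plain RPS would pick for this flow hash, or -1 if
 * the map is missing or empty (RPS not configured for that queue). */
static int rps_cpu_for_hash(const struct rps_map_sketch *map, uint32_t hash)
{
    if (map == NULL || map->len == 0)
        return -1;
    return map->cpus[scale_hash(hash, (uint32_t)map->len)];
}
```

With a four-entry map, the hash space is split into four equal ranges, so the low quarter of hash values maps to the first CPU in the map and the top quarter to the last.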
>
> Hmm.  I think this would be useful for me.  It would *definitely* be
> useful for me if I could pin an RFS flow to a cpu of my choice.
>
Andy, can you elaborate a little more on your use case? I've thought
several times about an interface to program the flow table from
userspace, but I never quite came up with a compelling use case, and
there is the security concern that a user could "steal" cycles from
arbitrary CPUs.
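
For context, the userspace logic that would consume SO_INCOMING_CPU as proposed might look roughly like the sketch below. It reads the reported CPU with getsockopt() and pins the calling thread to it with sched_setaffinity(). The helper names are made up for illustration. The fallback value 49 for SO_INCOMING_CPU is an assumption that holds for the generic socket headers and differs on some architectures. Note that nothing here stops the reported CPU from changing later if RFS is active, which is the concern raised in this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE         /* for cpu_set_t, CPU_SET, sched_setaffinity */
#endif
#include <sched.h>
#include <sys/socket.h>

#ifndef SO_INCOMING_CPU
#define SO_INCOMING_CPU 49  /* assumed generic value; some arches differ */
#endif

/* Hypothetical helper: return the CPU reported for this socket, or -1 if
 * the option is unsupported, the call fails, or no CPU is recorded. */
static int socket_incoming_cpu(int fd)
{
    int cpu = -1;
    socklen_t len = sizeof(cpu);

    if (getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len) < 0)
        return -1;
    return cpu;
}

/* Hypothetical helper: pin the calling thread to the socket's reported RX
 * CPU. Returns 0 on success, -1 if the CPU is unknown or pinning fails. */
static int follow_socket_cpu(int fd)
{
    cpu_set_t set;
    int cpu = socket_incoming_cpu(fd);

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

A worker thread would call follow_socket_cpu(fd) once after accepting a connection, or periodically if it wants to track a flow that may move.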

> With SO_INCOMING_CPU as described, I'm worried that people will write
> programs that perform very well if RFS is off, but that once that code
> runs with RFS on, weird things could happen.
>
> (On a side note: the RFS flow hash stuff seems to be rather buggy.
> Some Solarflare engineers know about this, but a fix seems to be
> rather slow in the works.  I think that some of the bugs are in core
> code, though.)

Are these problems with accelerated RFS, or just with getting the flow hash for packets?

Thanks,
Tom

>
> --Andy
>
>>
>>>>
>>>>> RFS is the other way around: you want the flow to follow your thread.
>>>>>
>>>>> RPS won't be a problem if you have sensible RPS settings.
>>>>>
>>>>>>
>>>>>> 2. Change the interface a bit to report the socket's preferred CPU
>>>>>> (where it would go without RFS, for example) and then let the
>>>>>> application use setsockopt to tell the socket to stay put (i.e. turn
>>>>>> off RFS and RPS for that flow).
>>>>>>
>>>>>> 3. Report the preferred CPU as in (2) but let the application ask for
>>>>>> something different.
>>>>>>
>>>>>> For example, I have flows for which I know which CPU I want.  A nice
>>>>>> API to put the flow there would be quite useful.
>>>>>>
>>>>>>
>>>>>> Also, it may be worth changing the naming to indicate that these are
>>>>>> about the rx cpu (they are, right?).  For some applications (sparse,
>>>>>> low-latency flows, for example), it can be useful to keep the tx
>>>>>> completion handling on a different CPU.
>>>>>
>>>>> SO_INCOMING_CPU is rx, like incoming ;)
>>>>>
>>>>>
>>>
>>> Duh :)
>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>>
>>> --
>>> Andy Lutomirski
>>> AMA Capital Management, LLC
>
>
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC