Re: [very-RFC 0/8] TSN driver for the kernel

On Mon, Jun 13, 2016 at 09:32:10PM +0200, Richard Cochran wrote:
> On Mon, Jun 13, 2016 at 03:00:59PM +0200, Henrik Austad wrote:
> > On Mon, Jun 13, 2016 at 01:47:13PM +0200, Richard Cochran wrote:
> > > Which driver is that?
> > 
> > drivers/net/ethernet/renesas/
> 
> That driver is merely a PTP capable MAC driver, nothing more.
> Although AVB is in the device name, the driver doesn't implement
> anything beyond the PTP bits.

Yes, I think they do the rest from userspace, not sure though :)

> > What is the rationale for no new sockets? To avoid cluttering? or do 
> > sockets have a drawback I'm not aware of?
> 
> The current raw sockets will work just fine.  Again, there should be a
> application that sits in between with the network socket and the audio
> interface.

So loop data from kernel -> userspace -> kernel and finally back to 
userspace and the media application? I agree that you need a way to pipe 
the incoming data directly from the network to userspace for those TSN 
users that can handle it. But again, for media applications that don't know 
(or care) about AVB, the data should be fed to ALSA/V4L2 directly rather 
than taking an extra round-trip between kernel and userspace.

I get the point of not including every single audio/video encoder in the 
kernel, but raw audio should be piped directly to ALSA. V4L2 has a way of 
piping encoded video through the system and to the media application (in 
order to support cameras that do encoding). The same approach should be 
doable for AVB, no? (someone from ALSA/V4L2 should probably comment on 
this)

> > Why is configfs wrong?
> 
> Because the application will use the already existing network and
> audio interfaces to configure the system.

Configuring this via the audio interface is going to be a challenge, since 
you need to configure the stream through the network before you can create 
the audio interface. Otherwise you will have to either drop data or block 
the caller until the link has been fully configured.

This is actually the reason why configfs is used in the series now, as it 
allows userspace to figure out all the different attributes and configure 
the link before letting ALSA start pushing data.
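
To make that ordering concrete, the flow could look roughly like the 
fragment below. Note that the attribute names here are invented purely for 
illustration and are not necessarily those used in this series:

```
# Hypothetical configfs session: describe and reserve the stream first,
# and only then enable it, at which point an ALSA device can be created.
mkdir /config/tsn/eth0/stream0
echo 91:e0:f0:00:0e:80 > /config/tsn/eth0/stream0/dst_mac              # stream destination MAC
echo class_a           > /config/tsn/eth0/stream0/sr_class             # SR class (A: 125 us interval)
echo 6                 > /config/tsn/eth0/stream0/frames_per_interval
echo 1                 > /config/tsn/eth0/stream0/enable               # reserve bandwidth, expose ALSA device
```

Only after the final write does ALSA get a device to push data into.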

> > > Lets take a look at the big picture.  One aspect of TSN is already
> > > fully supported, namely the gPTP.  Using the linuxptp user stack and a
> > > modern kernel, you have a complete 802.1AS-2011 solution.
> > 
> > Yes, I thought so, which is also why I have put that to the side and why 
> > I'm using ktime_get() for timestamps at the moment. There's also the issue 
> > of hooking the time into ALSA/V4L2
> 
> So lets get that issue solved before anything else.  It is absolutely
> essential for TSN.  Without the synchronization, you are only playing
> audio over the network.  We already have software for that.

Yes, I agree, presentation time and local time need to be handled 
properly. The same goes for adjusting sample rate etc. This is a lot of work, so 
I hope you can understand why I started out with a simple approach to spark 
a discussion before moving on to the larger bits.
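
Just to make the talker/listener arithmetic concrete, here is a minimal 
sketch in plain Python, assuming the 802.1BA default of 2 ms maximum 
transit time for a class A stream and the wrapped 32-bit timestamp of 
IEEE 1722 (function names are mine, not from any existing API):

```python
# Sketch of AVTP-style presentation-time handling. Assumes the 802.1BA
# default max transit time of 2 ms for class A; the 32-bit wrap mirrors
# the IEEE 1722 avtp_timestamp field.

MAX_TRANSIT_NS = 2_000_000  # class A default per 802.1BA

def presentation_time(capture_ns: int) -> int:
    """Talker side: stamp = capture time + max transit time,
    truncated to 32 bits like the avtp_timestamp field."""
    return (capture_ns + MAX_TRANSIT_NS) & 0xFFFF_FFFF

def playout_delay(stamp: int, now_ns: int) -> int:
    """Listener side: nanoseconds until the sample should hit the DAC,
    reconstructed from the wrapped 32-bit timestamp."""
    return (stamp - now_ns) & 0xFFFF_FFFF

now = 1_000_000_000            # some gPTP time, in ns
stamp = presentation_time(now)
# after 500 us of network transit, 1.5 ms of headroom remains:
print(playout_delay(stamp, now + 500_000))  # -> 1500000
```

The point being: both ends need the shared gPTP clock for this arithmetic 
to mean anything, which is why the synchronization has to come first.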

> > > 2. A user space audio application that puts it all together, making
> > >    use of the services in #1, the linuxptp gPTP service, the ALSA
> > >    services, and the network connections.  This program will have all
> > >    the knowledge about packet formats, AV encodings, and the local HW
> > >    capabilities.  This program cannot yet be written, as we still need
> > >    some kernel work in the audio and networking subsystems.
> > 
> > Why?
> 
> Because user space is right place to place the knowledge of the myriad
> formats and options.

See the response above: better to let anything but uncompressed raw data 
trickle through to userspace.

> > the whole point should be to make it as easy for userspace as 
> > possible. If you need to tailor each individual media-appliation to use 
> > AVB, it is not going to be very useful outside pro-Audio. Sure, there will 
> > be challenges, but one key element here should be to *not* require 
> > upgrading every single media application.
> > 
> > Then, back to the suggestion of adding a TSN_SOCKET (which you didn't like, 
> > but can we agree on a term "raw interface to TSN", and mode of transport 
> > can be defined later? ), was to let those applications that are TSN-aware 
> > to do what they need to do, whether it is controlling robots or media 
> > streams.
> 
> First you say you don't want ot upgrade media applications, but then
> you invent a new socket type.  That is a contradiction in terms.

Hehe, no, bad phrasing on my part. I want *both* (hence the shim-interface) 
:)

> Audio apps already use networking, and they already use the audio
> subsystem.  We need to help them get their job done by providing the
> missing kernel interfaces.  They don't need extra magic buffering the
> kernel.  They already can buffer audio data by themselves.

Yes, I know some audio apps "use networking": I can stream netradio, I can 
use jack to connect devices using RTP, and probably a whole lot of other 
applications do similar things. However, AVB is more about using the 
network as a virtual sound-card. The media application should not 
have to care whether the device it is using is a soundcard inside the box 
or a set of AVB-capable speakers somewhere on the network.

> > > * Kernel Space
> > > 
> > > 1. Providing frames with a future transmit time.  For normal sockets,
> > >    this can be in the CMESG data.  For mmap'ed buffers, we will need a
> > >    new format.  (I think Arnd is working on a new layout.)
> > 
> > Ah, I was unaware of this, both CMESG and mmap buffers.
> > 
> > What is the accuracy of deferred transmit? If you have a class A stream, 
> > you push out a new frame every 125 us, you may end up with 
> > accuracy-constraints lower than that if you want to be able to state "send 
> > frame X at time Y".
> 
> I have no idea what you are asking here.

I assumed that with an mmap'd buffer you would have to specify the point 
in time at which each frame should be sent. And since a class A stream has 
a 125 us send interval, you have to be able to launch frames with enough 
accuracy to meet that. That is a pretty strict deadline coming from 
userspace. But as they say, never assume.
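
For reference, the numbers behind that deadline (class A observation 
interval as defined by SRP in 802.1Q; 48 kHz assumed here only as a 
typical audio rate):

```python
# Back-of-the-envelope numbers behind the 125 us figure: a class A
# stream sends 8000 packets per second, so a 48 kHz stream carries
# 6 samples per packet and launch deadlines are 125 us apart.

CLASS_A_PACKETS_PER_SEC = 8000  # SR class A observation interval

interval_us = 1_000_000 / CLASS_A_PACKETS_PER_SEC
samples_per_packet = 48_000 // CLASS_A_PACKETS_PER_SEC

print(interval_us)         # -> 125.0
print(samples_per_packet)  # -> 6
```

So any launch-time jitter has to stay well below 125 us, or frames start 
bleeding into the next interval.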

I have a lot to dig into, and I've gotten a lot of very useful pointers. I 
should be busy for a while.

-- 
Henrik Austad
