RE: [EXT] Re: [PATCH 1/3] dma-buf: heaps: add Linaro secure dmabuf heap support

Cyrille Fleury <cyrille.fleury@xxxxxxx> · Tue, 23 Aug 2022 15:58:16 +0000

Hi Nicolas, all

Nicolas, please see my inline replies below regarding your concerns about video decoding and HDMI receivers with secure memory.

I can also propose setting up calls to discuss this. We are almost done with a proposal to add support for Linaro secure dma-buf heaps in V4L2. Today it is functional with Linux 5.15 on the i.MX8MQ EVK board, meaning we can play back video using secure dma-buf heap memory that Linux can't read or write, but we still need some time to clean up the code and have something ready for code review.

So maybe two different calls could happen:
    1) A first one to define a generic mechanism for Linux to manage (allocate/free) secure memory. By secure memory I mean memory that Linux can't read or write with the CPU running in non-secure mode.
          - Linaro secure dma-buf heaps seem to be a reasonable approach and are available.
          - Having the secure OS in charge of the hardware management that protects the secure memory, without any action required from the Linux side, seems a reasonable approach. I mean that when the Linux kernel starts, a secure dma-buf heap is already protected.
          - The Linux kernel then needs to read the device tree to know that such secure heaps exist, and exposes them to user space.
          - For memory isolation/sandboxing use cases, we may need different secure heaps. For example, one secure heap that the video decoder is allowed to access (Secure Video Path like applications), and a second secure heap that the video decoder is not allowed to access (secure payment like applications). It is the responsibility of the secure OS to configure such memory security rules before the Linux kernel starts.

          - Using the current Linux CMA and dynamically asking the secure OS to allocate and secure memory from the existing CMA heap doesn't seem to be the right approach to me. I think we should be more or less all aligned on that.
              Why it is not the right approach:
                 - When you release secure memory, you need to "memset" it to 0, because this memory can potentially be reused by the non-secure world. This can take a long time with 4K video buffers, and during that time the Linux process calling the secure OS is stuck. Not good for real-time or smooth video playback.
                 - We need direct interaction between Linux and the secure OS for each allocate/free of secure memory:
                       - A different secure OS means a different API to be called from Linux.
                       - It takes time for the CPU to switch from non-secure Linux -> secure OS -> non-secure Linux. From what I have in mind, something like 1 ms with an Arm Cortex-A53 CPU running at 1.5 GHz, so at most 1000 alloc/free per second, and again during that time the Linux process calling the secure OS is stuck, waiting for the secure OS.
                       - Linux and the secure OS must stay in sync regarding memory allocation in CMA. That seems a very complex mechanism to maintain.

              -> This is why we need dedicated dma-buf heaps for secure memory, and why Linaro secure dmabuf heap support is needed in the Linux kernel.

    2) A second call to discuss V4L2 using the Linaro secure dma-buf heap, once we are ready for the code review.
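
To put rough numbers on the cost argument in item 1), here is a minimal Python sketch. The 1 ms world-switch figure is the estimate given above. The frame format (NV12, 8-bit) and the 2 GB/s scrub bandwidth are assumptions chosen only for illustration, not measurements.

```python
# Back-of-envelope costs of allocating secure memory dynamically.
# Assumptions (hypothetical, for illustration): NV12 8-bit frames and a
# 2 GB/s memset bandwidth. The 1 ms round trip is the estimate from the mail.

def nv12_frame_bytes(width, height):
    """Size of one NV12 8-bit frame: full-size luma plus half-size chroma."""
    return width * height * 3 // 2

def scrub_time_ms(nbytes, bandwidth_bytes_per_s):
    """Time to zero a buffer before handing it back to the non-secure world."""
    return nbytes / bandwidth_bytes_per_s * 1000.0

def max_calls_per_second(round_trip_ms):
    """Upper bound on alloc/free calls if each one costs a full world switch."""
    return 1000.0 / round_trip_ms

if __name__ == "__main__":
    frame = nv12_frame_bytes(3840, 2160)
    print("4K NV12 frame: %d bytes" % frame)
    print("scrub time at 2 GB/s: %.2f ms" % scrub_time_ms(frame, 2e9))
    print("max secure calls per second: %.0f" % max_calls_per_second(1.0))
```

With a pre-reserved heap that the secure OS protects before Linux boots, neither the per-call world switch nor the scrub on release sits on the allocation path, which is the point of the argument above.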

Please let me know if you agree with this proposal to set up two different calls, as these are two different topics to be addressed.

Regards.

-----Original Message-----
From: Nicolas Dufresne <nicolas@xxxxxxxxxxxx> 
Sent: Friday, August 19, 2022 5:14 PM
To: Cyrille Fleury <cyrille.fleury@xxxxxxx>; Olivier Masse <olivier.masse@xxxxxxx>; brian.starkey@xxxxxxx
Cc: sumit.semwal@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linaro-mm-sig@xxxxxxxxxxxxxxxx; christian.koenig@xxxxxxx; linux-media@xxxxxxxxxxxxxxx; nd@xxxxxxx; Clément Faure <clement.faure@xxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx; benjamin.gaignard@xxxxxxxxxxxxx
Subject: Re: [EXT] Re: [PATCH 1/3] dma-buf: heaps: add Linaro secure dmabuf heap support

Hi,

Thanks for the additional information; we are starting to have a (still partial) overview of your team's goals.

On Thursday, 18 August 2022 at 05:25 +0000, Cyrille Fleury wrote:
> Hi Nicolas, all,
>
>  The short reply:
>       - For DRM, GStreamer, FFmpeg, ... we no longer use the NXP VPU
> proprietary API
>       - We need secure dma-buf heaps to replace secure ION heaps
>
>   The more detailed reply to address concerns below in the thread:
>       - NXP doesn't design VPUs, but relies on third-party VPU hardware
> IP that we integrate in our SoCs. The NXP proprietary APIs are for legacy
> applications our customers wrote without using GStreamer or FFmpeg, but
> we now rely on the V4L2 API for WPE/GStreamer, Chromium/FFmpeg ...
>       - Even with the NXP legacy BSP, there was no API impact for WPE (or
> Chromium) due to the NXP VPU API. We used WPE/GStreamer, then a GStreamer
> plugin relying on the NXP VPU proprietary API. But now we use V4L2. So we
> can forget the NXP VPU proprietary API, and I'm very happy with that.
>       - We have moved from ION buffers to dma-buf for secure memory
> management. This is why we need secure dma-buf heaps, which we
> protect with NXP hardware as we did with the ION heaps in the presentation Olivier shared.
>       - For secure video playback, the changes we need to make are in
> user space (GStreamer, WPE, ...), to update our patches managing
> secure ION heaps to use secure dma-buf heaps. But dma-buf is file descriptor based, as ION heaps are.

Do you have links to these user-space changes that demonstrate the usage of this new heap in its real context?

[Cyrille] We have a proposal for V4L2 + secure dma-buf heaps. A pull request should be available soon. We added an allocator in drivers/media/common/videobuf2/videobuf2-dma-heap.c, following what has been done in videobuf2-dma-contig.c. With this new allocator, we can play back H.264 and HEVC streams with GStreamer and secure dma-buf heaps (memory that Linux can't read or write, protected by TZASC plus the NXP equivalent of Arm TZMP technology (RDC/TRDC for the i.MX8M family)). OP-TEE is in charge of protecting the memory, through a device tree and dedicated drivers in OP-TEE, but OP-TEE could be replaced by any other secure OS, as we don't rely on OP-TEE to allocate memory.

>       - What will change between platforms is how memory is
> protected. This is why we requested to have a dtb in OP-TEE for secure
> memory, to be able to provide a common API to secure DDR memory, and
> then to rely on proprietary code in OP-TEE to secure it.
>       - We don't have a display controller or VPU decoder running in
> the TEE. They remain under the full control of the Linux/REE world. If
> Linux/REE asks for something that breaks Widevine/PlayReady security rules,
> for example decoding secure memory to non-secure memory, reads from
> secure memory will return 0 and writes to secure memory will be ignored. Same with keys, certificates ...

Can you explain how you would manage to make VP9 stateless decoding work? On IMX8MQ you have a chip that will produce a feedback binary, which contains the probability data. The mainline driver will merge the forward probability to prepare the probability for the next decode.

This basically means at least 1 output of the decoder needs to be non-secure (for CPU read-back). That breaks the notion of secure memory domain, which is global to the HW. One could think you could just ask the TEE to copy it back for you, but to do that safely, the TEE would need to control the CODEC programming, hence have a CODEC driver in the secure OS.

I'm not familiar with it, but might that have an impact on HDMI receivers, which may need some buffers for CPU usage (perhaps HDR metadata, EDID, etc.)?

[Cyrille] We did indeed have issues with the VP9 codec on the i.MX 8M stateless VPU, but not with the VP9 continuity counters/feedback binary. There is really nothing secret in that feedback information (I mean you cannot build an image from it), so it can be exposed to Linux through non-secure memory. The issue we had is related to the amount of video metadata that Widevine encrypts: some metadata not supported by the i.MX8 stateless VPU (Hantro G2 decoder) must be parsed by the CPU, but it is encrypted by Widevine. Widevine is not following the CENC specification regarding this metadata. We informed Widevine, but they are afraid to change the encryption model they use for VP9. So to support VP9, we need to parse this VP9 metadata in OP-TEE, in a dedicated Trusted Application, to detect it and expose it to Linux. There is no secret in this metadata; the only risk is a bug in our parsing algorithm. I agree this is not very good at the secure video path level, but we have no other solution for the VP9 codec on the 8M family with the Hantro stateless VPU. H.264 and H.265 streams don't have such issues.

I don't think HDMI receivers are an issue: the data bitrate is just too high for a CPU, so for the kind of information you mentioned (metadata, audio codec, audio clock, number of audio channels, ...) we rely on interrupts and registers exposed by the HDMI receiver controller to notify changes in the HDMI stream, in order to reconfigure the hardware/software accordingly. EDID uses a slow I2C-like bus, but there is no sensitive data there from a DRM point of view. So the CPU can parse it if it is not already handled by the HDMI receiver hardware and shared through registers.

>       - i.MX8 SoCs have a stateless VPU and there is no VPU firmware.
> i.MX9 SoCs have a stateful VPU with firmware. In a secure memory
> context, at the software level, stateful VPUs are even
> simpler to manage -> fewer read/write operations performed by the Linux
> world to parse the stream, so fewer patches needed in the video
> framework. But for memory management, stateful or stateless, the
> concern is the same: we need to provide support for secure dma-buf heaps to Linux, to
> allocate secure memory for the VPU and the display controller, so it
> is just a different dma-buf heap, so a different file descriptor.

i.MX8 boards may have a stateless or a stateful CODEC (Hantro chips are used in stateless fashion, while Amphion chips are driven by a stateful firmware). I would have hoped NXP folks would know that, as this is what their users have to deal with on a day-to-day basis.

[Cyrille] Correct, I should have mentioned the i.MX 8M family (Hantro VPU only), and not i.MX 8, to avoid confusion.

May I interpret this as NXP giving up on i.MX8 memory protection (or perhaps your team only cares about i.MX9?), and this solution being usable only for stateful (less flexible) CODECs?

[Cyrille] We target both stateless and stateful VPUs. For i.MX 8, it is i.MX 8M Plus and 8MQ; for i.MX 9, it will depend on what customers request for DRM. It doesn't make sense to support Secure Video Path on all SoCs.

>       - i.MX9 VPU will support "Virtual Machine VPU". Till now I don't 
> see why it will not work. I'm not an expert in VM, but from what I 
> understood from my discussions with NXP VPU team integrating the new 
> VPU hardware IP, from outside world, VPU is seen as multiple VPUs, 
> with multiple register banks. So virtualized OS will continue to 
> read/write registers as today, and at software level, secure memory is 
> as non-secure memory, I mean at this end, it is physical DDR memory. 
> Of course hardware shall be able to read/write it, but this is not 
> software related, this is hardware concern. And even without VM, we 
> target to dedicate one virtual VPU to DRM,  so one register bank, to setup dedicated security rules for DRM.

What you wrote here is about as much as I have heard about the new security model coming in newer chips (this is not NXP specific). I think in order to push forward designs and APIs, it would be logical to first present these mechanisms, how they work and how they affect drivers and user space. It's not clear how this mechanism enforces the use of memory that cannot be mapped through the kernel MMU.
Providing an Open Source kernel and userland to demonstrate and use this feature is not only very helpful for reviewers and adopters, but also a requirement in the drm tree.

regards,
Nicolas

>
>   I'm on vacation until the end of this week. I can set up a call next week to discuss this topic if more clarification is needed.
>
> Regards.
>
> -----Original Message-----
> From: Olivier Masse <olivier.masse@xxxxxxx>
> Sent: Wednesday, August 17, 2022 4:52 PM
> To: nicolas@xxxxxxxxxxxx; Cyrille Fleury <cyrille.fleury@xxxxxxx>; 
> brian.starkey@xxxxxxx
> Cc: sumit.semwal@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; 
> linaro-mm-sig@xxxxxxxxxxxxxxxx; christian.koenig@xxxxxxx; 
> linux-media@xxxxxxxxxxxxxxx; nd@xxxxxxx; Clément Faure 
> <clement.faure@xxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx; 
> benjamin.gaignard@xxxxxxxxxxxxx
> Subject: Re: [EXT] Re: [PATCH 1/3] dma-buf: heaps: add Linaro secure 
> dmabuf heap support
>
> +Cyrille
>
> Hi Nicolas,
>
> On Wed, 2022-08-17 at 10:29 -0400, Nicolas Dufresne wrote:
> > Hi Folks,
> >
> > On Tuesday, 16 August 2022 at 11:20 +0000, Olivier Masse wrote:
> > > Hi Brian,
> > >
> > >
> > > On Fri, 2022-08-12 at 17:39 +0100, Brian Starkey wrote:
> > > >
> >
> > [...]
> >
> > > >
> > > > Interesting, that's not how the devices I've worked on operated.
> > > >
> > > > Are you saying that you have to have a display controller driver 
> > > > running in the TEE to display one of these buffers?
> > >
> > > In fact the display controller is managing 3 planes: UI, PiP and
> > > video. The video plane is protected as secure, as you can see on
> > > slide 11:
> > >
> > > https://static.linaro.org/connect/san19/presentations/san19-107.pdf
> >
> >
> >
> > I just wanted to highlight that all the WPE/GStreamer bits in this
> > presentation are based on the NXP Vendor Media CODEC design, which relies
> > on their own i.MX VPU API. I don't see any effort to extend this to
> > a wider audience. It does not explain how this can work with a
> > mainline kernel with v4l2 stateful or stateless drivers and generic
> > GStreamer/FFMPEG/Chromium support.
>
> Maybe Cyrille can explain what is currently done at NXP regarding the integration of v4l2 with the NXP VPU.
>
> >
> > I'm raising this, since I'm worried that no one cares about solving
> > that high level problem from a generic point of view. In that
> > context, any additions to the mainline Linux kernel can only be
> > flawed and will only serve specific vendors and not the larger audience.
> >
> > Another aspect is that this design might be bound to a specific
> > (NXP?) security design. I've learned recently that newer HW is going to use
> > multiple levels of MMU (like virtual machines do) to protect the
> > memory rather than marking pages. Will all this work for that too?
>
> Our firewalling hardware protects memory behind the MMU, and so relies on the physical memory layout.
> This work relies only on reserved physical memory.
>
> Regards,
> Olivier
>
> >
> > regards,
> > Nicolas