On Mon, Aug 29, 2022 at 11:54 PM Kevin Hilman <khilman@xxxxxxxxxxxx> wrote:
>
> Hi Oded (and sorry I misspelled your name last time),
>
> Oded Gabbay <oded.gabbay@xxxxxxxxx> writes:
>
> > On Tue, Aug 23, 2022 at 9:24 PM Kevin Hilman <khilman@xxxxxxxxxxxx> wrote:
> >>
> >> Hi Obed,
> >>
> >> Oded Gabbay <oded.gabbay@xxxxxxxxx> writes:
> >>
> >> [...]
> >>
> >> > I want to update that I'm currently in discussions with Dave to
> >> > figure out the best way to move forward. We are writing down a
> >> > proper comparison between the two paths (a new accel subsystem or
> >> > using drm). I guess it will take a week or so.
> >>
> >> Any update on the discussions with Dave? And/or are there any plans to
> >> discuss this further at LPC/ksummit yet?
> >
> > Hi Kevin.
> >
> > We are still discussing the details, since the habanalabs driver, at
> > least, is very complex and there are multiple parts where I need to
> > see if and how they can be mapped to drm.
> > Some of us will attend LPC, so we will probably take advantage of that
> > to talk more about this.
>
> OK, looking forward to some more conversations at LPC.
>
> >> We (BayLibre) are upstreaming support for APUs on Mediatek SoCs, and
> >> are using the DRM-based approach. I'll also be at LPC and happy to
> >> discuss in person.
> >>
> >> For some context on my/our interest: back in Sept 2020 we initially
> >> submitted an rpmsg-based driver for kernel communication[1]. After
> >> review comments, we rewrote it based on DRM[2] and are now using it
> >> for some MTK SoCs[3] and supporting our MTK customers with it.
> >>
> >> Hopefully we will get the kernel interfaces sorted out soon, but next
> >> there's the userspace side of things. To that end, we're also working
> >> on libAPU, a common, open userspace stack. Alex Bailon presented a
> >> proposal earlier this year at Embedded Recipes in Paris (video[4],
> >> slides[5]).
> >>
> >> libAPU would include abstractions of the kernel interfaces for DRM
> >> (using libdrm), remoteproc/rpmsg, virtio, etc., but it also goes
> >> further and proposes an open firmware for the accelerator side using
> >> libmetal/OpenAMP + rpmsg for communication with (most likely
> >> closed-source) vendor firmware. Think of it like Sound Open Firmware
> >> (SOF[6]), but for accelerators.
> >
> > I think your device and the habana device are very different in
> > nature, and part of what Dave and I discussed is whether these two
> > classes of devices can live together. I guess they can live together
> > in the kernel, but in userspace, not so much imo.
>
> Yeah, for now I think focusing on how to handle both classes of devices
> in the kernel is the most important thing.
>
> > The first class is the edge inference devices (usually part of some
> > SoC). I think your description of the APU on the MTK SoCs is a classic
> > example of such a device.
>
> Correct.
>
> > You usually have some firmware you load, you give it a graph and
> > pointers for input and output, and then you just execute the graph
> > again and again to perform inference, replacing only the inputs.
> >
> > The second class is the data-center training accelerators, and
> > habana's gaudi device is classified as such. These devices usually
> > have a number of different compute engines, a fabric for scaling out,
> > on-device memory, internal MMUs and RAS monitoring requirements. Those
> > devices are usually operated via command queues, either through their
> > kernel driver or directly from user-space. They have multiple APIs for
> > memory management, RAS, scaling-out and command submissions.
>
> OK, I see.
>
> >> We've been using this successfully for Mediatek SoCs (which have a
> >> Cadence VP6 APU) and have submitted/published the code, including the
> >> OpenAMP[7] and libmetal[8] parts in addition to the kernel parts
> >> already mentioned.
> >
> > What's the difference between libmetal and other open-source low-level
> > runtime drivers, such as oneAPI Level Zero?
>
> TBH, I'd never heard of oneAPI before, so I'm assuming it's mainly
> focused on the data center. libmetal/OpenAMP are widely used in the
> consumer and industrial embedded space, and heavily used with FPGAs in
> many market segments.
>
> > Currently we have our own runtime driver which is tightly coupled with
> > our h/w. For example, the way userspace "talks" to the data-plane
> > firmware is very proprietary, as it is hard-wired into the
> > architecture of the entire ASIC and how it performs deep-learning
> > training. Therefore, I don't see how this can be shared with other
> > vendors. Not because of secrecy, but because it is simply not relevant
> > to any other ASIC.
>
> OK, makes sense.
>
> Thanks for clarifying your use case in more detail.
>
> Kevin

Hi all,
I wanted to update on this issue for those of you who weren't at LPC.

We had a BoF session about this topic with most, if not all, of the
relevant people: DRM maintainers, Greg, and other subsystem and device
driver maintainers.

Dave Airlie summarized the session in his blog:
https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html

TL;DR: Accelerator drivers will use the DRM subsystem code, but they
will live in a separate directory (drivers/accel) and will be exposed
to userspace through a new major number and a new char device name
(/dev/accelX).

I'm currently working on some prerequisite patches for the DRM
subsystem to support the new subsystem (e.g., the new major number).
Once those are merged, I plan to move the habanalabs driver to the new
location and convert it to use the modified DRM framework code.
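
To give a rough idea of what the conversion could look like, here is a
minimal sketch of how a driver might register itself through the
modified DRM core. DRIVER_COMPUTE_ACCEL and the node naming are
placeholders I'm using for illustration only; settling the real names
and interfaces is exactly what the prerequisite patches are about. The
rest is today's standard DRM plumbing.

/*
 * Illustrative sketch only: DRIVER_COMPUTE_ACCEL is a hypothetical
 * feature flag telling the DRM core to register this device on the
 * new accel major as /dev/accelX instead of creating card/render
 * nodes. Everything else below is existing DRM infrastructure.
 */
#include <linux/module.h>
#include <linux/pci.h>
#include <drm/drm_device.h>
#include <drm/drm_drv.h>
#include <drm/drm_file.h>
#include <drm/drm_ioctl.h>

struct example_accel_device {
	struct drm_device drm;		/* embedded DRM device */
	/* compute queues, MMU, RAS state, etc. would go here */
};

static const struct file_operations example_accel_fops = {
	.owner		= THIS_MODULE,
	.open		= drm_open,
	.release	= drm_release,
	.unlocked_ioctl	= drm_ioctl,
};

static const struct drm_driver example_accel_driver = {
	.driver_features = DRIVER_COMPUTE_ACCEL,	/* hypothetical flag */
	.fops		 = &example_accel_fops,
	.name		 = "example_accel",
	.desc		 = "Example compute accelerator",
};

static int example_accel_probe(struct pci_dev *pdev,
			       const struct pci_device_id *id)
{
	struct example_accel_device *adev;

	/* allocate a drm_device embedded in our device structure */
	adev = devm_drm_dev_alloc(&pdev->dev, &example_accel_driver,
				  struct example_accel_device, drm);
	if (IS_ERR(adev))
		return PTR_ERR(adev);

	/* hardware init (queues, MMU, firmware load) would go here */

	/* with the flag above, this would create /dev/accelX */
	return drm_dev_register(&adev->drm, 0);
}

The point is that a converted driver keeps using the standard DRM
helpers; what changes is only the major number and where the char
device shows up.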
Thanks,
Oded