On Thu, Sep 28, 2023 at 11:16 AM Cancan Chang <Cancan.Chang@xxxxxxxxxxx> wrote:
>
> "What happens if you call this again without waiting for the previous
> inference to complete ?"
> --- There is a work queue in the driver to manage inference tasks.
> When two consecutive inference tasks occur, the second inference task is added to
> the "pending list". When the previous inference task ends, the second inference task
> moves to the "scheduled list" and is executed.
> Each inference task has an id; "inference" and "wait until finish" are paired.
>
> thanks
Thanks for the clarification.
I'll wait for your driver's code link. It doesn't have to be a patch
series at this point. A link to a git repo is enough. I just want to
do a quick pass.
Thanks,
Oded
>
> ________________________________________
> From: Oded Gabbay <ogabbay@xxxxxxxxxx>
> Sent: September 28, 2023 15:40
> To: Cancan Chang
> Cc: Jagan Teki; linux-media; linux-kernel; Dave Airlie; Daniel Vetter
> Subject: Re: kernel.org 6.5.4 , NPU driver, --not support (RFC)
>
> [ EXTERNAL EMAIL ]
>
> On Thu, Sep 28, 2023 at 10:25 AM Cancan Chang <Cancan.Chang@xxxxxxxxxxx> wrote:
> >
> > "Could you please post a link to the driver's source code ?
> > In addition, could you please elaborate which userspace libraries
> > exist that work with your driver ? Are any of them open-source ?"
> > --- We will prepare the adla driver link after the holiday on October 6th.
> > It's a pity that there is no open-source userspace library.
> > But you can probably understand it through the workflow, which can be simplified as:
> > 1. create model context
> > ret = ioctl(context->fd, ADLAK_IOCTL_REGISTER_NETWORK, &desc);
> > 2. set inputs
> > 3. inference
> > ret = ioctl(context->fd, ADLAK_IOCTL_INVOKE, &invoke_dec);
> What happens if you call this again without waiting for the previous
> inference to complete ?
> Oded
> > 4. wait for the inference to complete
> > ret = ioctl(context->fd, ADLAK_IOCTL_WAIT_UNTIL_FINISH, &stat_req_desc);
> > 5. destroy model context
> > ret = ioctl(context->fd, ADLAK_IOCTL_DESTROY_NETWORK, &submit_del);
> >
> > thanks
> >
> > ________________________________________
> > From: Oded Gabbay <ogabbay@xxxxxxxxxx>
> > Sent: September 28, 2023 13:28
> > To: Cancan Chang
> > Cc: Jagan Teki; linux-media; linux-kernel; Dave Airlie; Daniel Vetter
> > Subject: Re: kernel.org 6.5.4 , NPU driver, --not support (RFC)
> >
> > On Wed, Sep 27, 2023 at 10:01 AM Cancan Chang <Cancan.Chang@xxxxxxxxxxx> wrote:
> > >
> > > "Or do you handle one cmd at a time, where the user sends a cmd buffer
> > > to the driver and the driver then submits it by writing to a couple of
> > > registers and polls on some status register until it's done, or waits
> > > for an interrupt to mark it as done ?"
> > > --- Yes, the user sends a cmd buffer to the driver, the driver triggers the hardware by writing to a register,
> > > and then waits for an interrupt to mark it as done.
> > >
> > > My current driver is very different from drm, so I want to know if I have to switch to drm?
> > Could you please post a link to the driver's source code ?
> > In addition, could you please elaborate which userspace libraries
> > exist that work with your driver ? Are any of them open-source ?
> >
> > > Maybe I can refer to drivers/accel/habanalabs.
> > That's definitely a possibility.
> >
> > Oded
> >
> > > thanks
> > >
> > > ________________________________________
> > > From: Oded Gabbay <ogabbay@xxxxxxxxxx>
> > > Sent: September 26, 2023 20:54
> > > To: Cancan Chang
> > > Cc: Jagan Teki; linux-media; linux-kernel; Dave Airlie; Daniel Vetter
> > > Subject: Re: kernel.org 6.5.4 , NPU driver, --not support (RFC)
> > >
> > > On Mon, Sep 25, 2023 at 12:29 PM Cancan Chang <Cancan.Chang@xxxxxxxxxxx> wrote:
> > > >
> > > > Thank you, Jagan & Oded, for your replies.
> > > >
> > > > It is very appropriate for my driver to be placed in drivers/accel.
> > > >
> > > > My accelerator is named ADLA (Amlogic Deep Learning Accelerator).
> > > > It is an IP in the SOC, mainly used for neural network model acceleration.
> > > > It will split and compile the neural network model into a cmd buffer in a private format,
> > > > and submit this cmd buffer to the ADLA hardware. It is not a programmable device.
> > > What exactly does it mean to "submit this cmd buffer to ADLA hardware" ?
> > >
> > > Does your h/w provide queues for the user/driver to put their
> > > workloads/cmd-bufs on ? And does it provide some completion queue
> > > to notify when the work is completed ?
> > >
> > > Or do you handle one cmd at a time, where the user sends a cmd buffer
> > > to the driver and the driver then submits it by writing to a couple of
> > > registers and polls on some status register until it's done, or waits
> > > for an interrupt to mark it as done ?
> > >
> > > >
> > > > ADLA includes four hardware engines:
> > > > RS engines : working on the reshape operators
> > > > MAC engines : working on the convolution operators
> > > > DW engines : working on the planar & elementwise operators
> > > > Activation engines : working on the activation operators (ReLU, tanh, ...)
> > > >
> > > > By the way, my IP is mainly used in an SOC, and the current driver registration is through platform_driver;
> > > > is it necessary to switch to drm?
> > > This probably depends on the answer to my question above. btw, there
> > > are drivers in drm that handle IPs that are part of an SOC, so
> > > platform_driver is supported.
> > >
> > > Oded
> > >
> > > >
> > > > thanks.
> > > >
> > > > ________________________________________
> > > > From: Oded Gabbay <ogabbay@xxxxxxxxxx>
> > > > Sent: September 22, 2023 23:08
> > > > To: Jagan Teki
> > > > Cc: Cancan Chang; linux-media; linux-kernel; Dave Airlie; Daniel Vetter
> > > > Subject: Re: kernel.org 6.5.4 , NPU driver, --not support (RFC)
> > > >
> > > > On Fri, Sep 22, 2023 at 12:38 PM Jagan Teki <jagan@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Fri, 22 Sept 2023 at 15:04, Cancan Chang <Cancan.Chang@xxxxxxxxxxx> wrote:
> > > > > >
> > > > > > Dear Media Maintainers:
> > > > > > Thanks for your attention. Before describing my problem, let me introduce what I mean by NPU.
> > > > > > NPU is Neural Processing Unit. It is designed for deep learning acceleration; it is also called TPU, APU, etc.
> > > > > >
> > > > > > The real problems:
> > > > > > When I was about to upstream my NPU driver code to the linux mainline, I ran into two problems:
> > > > > > 1. According to my research, there is no NPU module path in linux (based on linux 6.5.4). I have searched all linux projects and found no organization or company that has submitted NPU code. Is there a path prepared for NPU drivers currently?
> > > > > > 2. If there is no NPU driver path currently, I am going to put my NPU driver code in drivers/media/platform/amlogic/, because my NPU driver belongs to amlogic, and the amlogic NPU is mainly used for AI vision applications. Is this plan suitable for you?
> > > > >
> > > > > If I'm correct about the earlier discussion with Oded Gabbay, I think
> > > > > drivers/accel/ is the proper place for AI accelerators, including NPUs.
> > > > >
> > > > > + Oded in case he can comment.
> > > > >
> > > > > Thanks,
> > > > > Jagan.
> > > > Thanks Jagan for adding me to this thread. Adding Dave & Daniel as well.
> > > >
> > > > Indeed, drivers/accel is the place for accelerators, mainly for
> > > > AI/deep-learning accelerators.
> > > > We currently have 3 drivers there already.
> > > >
> > > > The accel subsystem is part of the larger drm subsystem. Basically, to
> > > > get into accel, you need to integrate your driver with drm at the
> > > > basic level (registering a device, hooking up the proper
> > > > callbacks). Of course, the more code you use from drm, the better.
> > > > You can take a look at the drivers under accel for some examples of
> > > > how to do that.
> > > >
> > > > Could you please describe in a couple of sentences what your
> > > > accelerator does, which engines it contains, and how you program it, i.e.
> > > > is it a fixed-function device where you write to a couple of registers
> > > > to execute workloads, or is it a fully programmable device where you
> > > > load compiled code into it (GPU style) ?
> > > >
> > > > For better background on the accel subsystem, please read the following:
> > > > https://docs.kernel.org/accel/introduction.html
> > > > This introduction also contains links to other important email threads
> > > > and to Dave Airlie's BOF summary from LPC 2022.
> > > >
> > > > Thanks,
> > > > Oded