> -----Original Message-----
> From: Oded Gabbay <oded.gabbay@xxxxxxxxx>
> Sent: April 3, 2024 14:26
> To: Dejia Shang <Dejia.Shang@xxxxxxxxxxxx>
> Cc: ogabbay@xxxxxxxxxx; airlied@xxxxxxxxxx; daniel@xxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: About upstreaming ArmChina NPU driver
>
> On Thu, Mar 28, 2024 at 10:01 AM Dejia Shang <Dejia.Shang@xxxxxxxxxxxx>
> wrote:
> >
> > Dear Kernel Maintainers,
> >
> > I am a driver developer and would like to upstream the ArmChina Zhouyi
> > NPU driver ("Zhouyi" is the brand) to the accel subsystem.
> >
> > The driver is already open sourced (both UMD and KMD) and anyone can
> > find the code at https://github.com/Arm-China/Compass_NPU_Driver.git.
> >
> > This driver is responsible for scheduling AI inference tasks to the NPU
> > cores (V1/V2/V3). Specifically, a simplified end-to-end flow is:
> >
> >   1. A TFLite/ONNX model is transformed to an executable binary
> >      file in ELF format by the NN graph compiler (designed by ArmChina).
> >   2. An application loads the executable binary file to UMD and
> >      provides the input data.
> >   3. UMD parses the binary and sends ioctls to KMD (open device,
> >      do memory allocation/mmap/free, submit the job descriptor).
> >   4. KMD dispatches the job to NPU h/w, handles interrupts and
> >      updates the execution status.
> >   5. UMD polls the status of the pre-scheduled job.
> >   6. The application gets the output results.
> >
> > So...for the upstreaming,
> >
> > Q1: do you think our NPU driver is suitable for accel? If the answer is
> > yes, which tree & branch should the patches be based on?
>
> Hi Dejia,
> Yes, it definitely sounds like a good fit for the accel subsystem.
> Please base your patches on the "drm-misc-next" branch in the drm-misc repo:
> https://anongit.freedesktop.org/git/drm/drm-misc.git

Hi Oded,

Got it.

> >
> > Q2: in thread
> > https://lore.kernel.org/lkml/ec547d33-214f-4952-aa33-c271e9edad63@kernel.org/
> > showing a similar case, Oded mentioned that:
> >
> >   "If we would have upstreamed a new driver, the expectation
> >   would have been that we would use some drm mechanisms.", and
> >   "the minimal requirement is to use GEM/BOs for memory
> >   management operations".
> >
> > I guess those requirements are also applicable to the Zhouyi NPU KMD?
> > Currently, the memory management (MM) in KMD is based on the dma-mapping
> > APIs, which handle both reserved CMA region(s) and SMMU mapped buffers,
> > and supports the dma-buf framework. Maybe I should replace the
> > implementations with DRM APIs.
>
> Yes, those requirements definitely apply here.
>
> >
> > Q3: if you have looked at the KMD code, do you think I should make any
> > other major change before submitting the first patch series? Thank you!
>
> I took a quick glance. In general, it seems to be ok, but I noticed two
> things related to the integration with drm/accel:
>
> 1. You use a scheduler for the job submission, which provides the ability
>    to defer jobs. In that case, I suggest checking whether you can use
>    drm_sched instead of your own implementation. No point in re-inventing
>    the wheel.
> 2. You provide several memory zones for allocation of memory. I would
>    suggest here to look at using ttm as the memory manager instead of
>    re-implementing your own.

Thanks for your time! I will try to refactor the code as suggested and then
send the first patch series.

>
> And please remove the IMPORTANT NOTICE at the end of your emails. I
> would have to refrain from answering further emails if that notice remains.

Now fixed. I did not realize it was there because the mail server appends
the notice automatically. Sorry for the inconvenience.

Best Regards,
Dejia

>
> Thanks,
> Oded
>
> >
> > Thanks for your time and look forward to your reply~ 😊
> >
> > Best Regards,
> > Dejia
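
For readers following the thread, steps 3 to 5 of the flow quoted at the top
(submit, dispatch, interrupt, poll) can be sketched as a toy model. This is
only an illustration under stated assumptions: ToyKmd, JobState and every
method name below are hypothetical, nothing here is taken from the actual
Zhouyi KMD, and a real driver would do this through ioctls and interrupt
handlers (and, per the review above, possibly drm_sched) rather than plain
Python calls.

```python
from collections import deque
from enum import Enum


class JobState(Enum):
    QUEUED = "queued"    # accepted, waiting for a free core (a deferred job)
    RUNNING = "running"  # dispatched to the (simulated) NPU hardware
    DONE = "done"        # completion interrupt has been handled


class ToyKmd:
    """Hypothetical stand-in for a kernel-mode driver; not the real Zhouyi KMD."""

    def __init__(self, num_cores=1):
        self.free_cores = num_cores
        self.pending = deque()  # FIFO of job ids waiting for a core
        self.state = {}         # job id -> JobState
        self.next_id = 0

    def submit(self, descriptor):
        """Step 3: UMD submits a job descriptor (contents ignored in this toy)."""
        job_id = self.next_id
        self.next_id += 1
        self.state[job_id] = JobState.QUEUED
        self.pending.append(job_id)
        self._dispatch()
        return job_id

    def _dispatch(self):
        """Step 4: start queued jobs while a core is free; the rest stay queued."""
        while self.free_cores > 0 and self.pending:
            job_id = self.pending.popleft()
            self.state[job_id] = JobState.RUNNING
            self.free_cores -= 1

    def on_interrupt(self, job_id):
        """Step 4: a completion interrupt updates the status and frees the core."""
        if self.state.get(job_id) is not JobState.RUNNING:
            raise ValueError(f"job {job_id} is not running")
        self.state[job_id] = JobState.DONE
        self.free_cores += 1
        self._dispatch()

    def poll(self, job_id):
        """Step 5: UMD polls the status of a previously submitted job."""
        return self.state[job_id]


if __name__ == "__main__":
    kmd = ToyKmd(num_cores=1)
    first = kmd.submit({"graph": "model.bin"})
    second = kmd.submit({"graph": "model.bin"})
    print(kmd.poll(first), kmd.poll(second))
    kmd.on_interrupt(first)
    print(kmd.poll(first), kmd.poll(second))
```

With a single core, the second job stays queued until the first one's
completion interrupt arrives. That deferral is the behaviour the review
suggests delegating to drm_sched instead of a driver-private scheduler.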