26.01.2021 05:45, Mikko Perttunen wrote:
>> 6. We will need to allocate a host1x BO for a job's cmdstream and add a
>> restart command to the end of the job's stream. CDMA will jump into the
>> job's stream from push buffer.
>>
>> We could add a flag for that to drm_tegra_submit_cmd_gather, saying that
>> gather should be inlined into job's main cmdstream.
>>
>> This will remove a need to have a large push buffer that will easily
>> overflow, it's a real problem and upstream driver even has a bug where
>> it locks up on overflow.
>>
>> How it will look from CDMA perspective:
>>
>>     PUSHBUF
>>   -----------
>>   |  ...    |          JOB
>>   |         |      -----------         JOB GATHER
>>   | RESTART ------>| CMD     |       --------------
>>   |         |      | GATHER  ------->| DATA       |
>>   |  ...  <--------- RESTART |       |            |
>>   |         |      |         |       |            |
>>
>
> Let me check if I understood you correctly:
> - You would like to have the job's cmdbuf have further GATHER opcodes
> that jump into smaller gathers?

I want jobs to be self-contained. Instead of pushing commands to the
push buffer (PB) of the kernel driver, we'll push them to the job's own
cmdstream. This means that for each new job we'll need to allocate a
host1x buffer.

> I assume this is needed because currently WAITs are placed into the
> pushbuffer, so the job will take a lot of space in the pushbuffer if
> there are a lot of waits (and GATHERs in between these waits)?

Yes, and with drm-sched we will just need to limit the max number of
jobs in the h/w queue (i.e. the push buffer), and then the push buffer
won't ever overflow. Problem solved.

> If so, perhaps as a simpler alternative we could change the firewall to
> allow SETCLASS into HOST1X for waiting specifically, then userspace
> could just submit one big cmdbuf taking only little space in the
> pushbuffer? Although that would only allow direct ID/threshold waits.

My solution doesn't require changes to the firewall, though I'm not sure
whether it's easier.

> In any case, it seems that this can be added in a later patch, so we
> should omit it from this series for simplicity. If it is impossible for
> the userspace to deal with it, we could disable the firewall
> temporarily, or implement the above change in the firewall.

I won't be able to test the UAPI fully until all features are at least
on par with the experimental driver of the grate kernel.
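
For illustration, a rough sketch of how a self-contained job cmdstream
along the lines of the diagram above might be filled in. This is not
code from any driver: struct sketch_gather and
sketch_build_job_stream() are hypothetical names, and the opcode
encodings are assumed to follow the host1x_opcode_restart() and
host1x_opcode_gather() helpers of the upstream host1x driver. Each
userspace gather costs two words in the job's stream, and the final
RESTART returns CDMA to the push buffer, so the push buffer itself only
needs a fixed-size entry per job.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings, modeled after the upstream host1x helpers. */
static inline uint32_t host1x_opcode_restart(uint32_t address)
{
	/* RESTART takes a 16-byte aligned address, stored shifted by 4. */
	return (5u << 28) | (address >> 4);
}

static inline uint32_t host1x_opcode_gather(uint32_t count)
{
	/* GATHER is followed by one word holding the gather address. */
	return (6u << 28) | count;
}

/* Hypothetical description of one userspace gather. */
struct sketch_gather {
	uint32_t iova;	/* device address of the gather data */
	uint32_t words;	/* length of the gather in 32-bit words */
};

/*
 * Fill the job's own cmdstream: a GATHER opcode plus address for each
 * userspace gather, then a RESTART that jumps back to the push buffer.
 * Returns the number of words written, or 0 if the stream is too small.
 */
static size_t sketch_build_job_stream(uint32_t *stream, size_t max_words,
				      const struct sketch_gather *gathers,
				      size_t num_gathers,
				      uint32_t pushbuf_return)
{
	size_t n = 0, i;

	/* Assumption: the return address honours RESTART alignment. */
	assert((pushbuf_return & 0xf) == 0);

	if (max_words < num_gathers * 2 + 1)
		return 0;

	for (i = 0; i < num_gathers; i++) {
		stream[n++] = host1x_opcode_gather(gathers[i].words);
		stream[n++] = gathers[i].iova;
	}

	stream[n++] = host1x_opcode_restart(pushbuf_return);

	return n;
}
```

With a layout like this, the push buffer would hold only a single
RESTART into each job's stream, which is why capping the number of
in-flight jobs in drm-sched would be enough to bound push buffer usage.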