23.03.2021 21:21, Thierry Reding пишет: > On Sat, Feb 27, 2021 at 02:19:39PM +0300, Dmitry Osipenko wrote: >> 03.02.2021 14:18, Mikko Perttunen пишет: >> ... >>>> I'll need more time to think about it. >>>> >>> >>> How about something like this: >>> >>> Turn the syncpt_incr field back into an array of structs like >>> >>> #define DRM_TEGRA_SUBMIT_SYNCPT_INCR_REPLACE_SYNCOBJ (1<<0) >>> #define DRM_TEGRA_SUBMIT_SYNCPT_INCR_PATCH_DYNAMIC_SYNCPT (1<<1) >>> >>> struct drm_tegra_submit_syncpt_incr { >>> /* can be left as zero if using dynamic syncpt */ >>> __u32 syncpt_id; >>> __u32 flags; >>> >>> struct { >>> __u32 syncobj; >>> __u32 value; >>> } fence; >>> >>> /* patch word as such: >>> * *word = *word | (syncpt_id << shift) >>> */ >>> struct { >>> __u32 gather_offset_words; >>> __u32 shift; >>> } patch; >>> }; >>> >>> So this will work similarly to the buffer reloc system; the kernel >>> driver will allocate a job syncpoint and patch in the syncpoint ID if >>> requested, and allows outputting syncobjs for each increment. >> >> I haven't got any great ideas so far, but it feels that will be easier >> and cleaner if we could have separate job paths (and job IOCTLS) based >> on hardware generation since the workloads a too different. The needs of >> a newer h/w are too obscure for me and absence of userspace code, >> firmware sources and full h/w documentation do not help. >> >> There still should be quite a lot to share, but things like >> mapping-to-channel and VM sync points are too far away from older h/w, >> IMO. This means that code path before drm-sched and path for job-timeout >> handling should be separate. >> >> Maybe later on it will become cleaner that we actually could unify it >> all nicely, but for now it doesn't look like a good idea to me. > > Sorry for jumping in rather randomly here and elsewhere, but it's been a > long time since the discussion and I just want to share my thoughts on a > couple of topics in order to hopefully help move this forward somehow. > > For UAPI, "unifying it later" doesn't really work. Of course I meant a "later version of this series" :) Sorry for not making it clear. > So I think the only > realistic option is to make a best attempt at getting the UABI right so > that it works for all existing use-cases and ideally perhaps even as of > yet unknown use-cases in the future. As with all APIs this means that > there's going to be a need to abstract away some of the hardware details > so that we don't have to deal with too many low-level details in > userspace, because otherwise the UAPI is just going to be a useless > mess. > > I think a proposal such as the above to allow both implicit and explicit > syncpoints makes sense. For what it's worth, it's fairly similar to what > we had come up with last time we tried destaging the ABI, although back > at the time I'm not sure we had even considered explicit syncpoint usage > yet. I think where reasonably possible this kind of optional behaviour > is acceptable, but I don't think having two completely separate paths is > going to help in any way. If anything it's just going to make it more > difficult to maintain the code and get it to a usable state in the first > place. > > Like I said elsewhere, the programming model for host1x hasn't changed > since Tegra20. It's rather evolved and gained a couple more features, > but that doesn't change anything about how userspace uses it. Not having a complete control over sync points state is a radical change, IMO. I prefer not to use this legacy and error-prone way of sync points handling at least for older h/w where it's possible to do. This is what downstream driver did 10 years ago and what it still continues to do. I was very happy to move away from this unnecessary complication in the experimental grate-kernel driver and I think this will be great to do in the mainline as well. The need to map buffers explicitly is also a big difference. The need to map BO for each channel is a quite big over-complication as we already found out in the current version of UAPI. Alright, perhaps the mapping could improved for older userspace if we will move away from a per-channel contexts to a single DRM context like I already was suggesting to Mikko before. I.e. instead of mapping BO "to a channel", we will need to map BO "to h/w clients" within the DRM context. This should allow older userspace to create a single mapping for all channels/clients using a single IOCTL and then to have a single "mapping handle" to care about. Objections?