在 2019/8/8 22:34, Lionel Landwerlin 写道: > On 08/08/2019 17:16, Chunming Zhou wrote: >> 在 2019/8/8 22:01, Lionel Landwerlin 写道: >>> On 08/08/2019 16:42, Chunming Zhou wrote: >>>> No, I just see your comment "The next vkQueueSubmit() >>>> waiting on a the syncobj will read the sideband payload and wait for a >>>> fence chain element with a seqno superior or equal to the sideband >>>> payload value to be added into the fence chain and use that fence to >>>> trigger the submission on the kernel driver. ". >>> >>> That was meant to say wait on the chain fence to reach the sideband >>> payload value. >>> >>> It a bit confusing but I can't see any other way to word it. >>> >>> >>>> In that, you mentioned wait for sideband. >>>> So I want to know we how to use that, maybe I miss something in >>>> previous >>>> discussing thread. >>> >>> In QueueSubmit(), we start by reading the sideband payloads : >>> https://gitlab.freedesktop.org/llandwerlin/mesa/blob/review/anv-timeline_semaphore_prep/src/intel/vulkan/anv_queue.c#L655 >>> >>> >>> Then build everything for the submission and hand it over to the >>> submission thread. >>> >>> Instead of the just waiting on the timeline semaphore values, we also >>> wait on the binary semaphore sideband payload values. >> Waiting on timeline values is when finding fence in kernel. > > > Hmm aren't you waiting in a thread in userspace? Yes, For timeline case, we can use waitForSyncobj()..., At begin of QueueThread, I let it wait in cs_ioctl when drm_syncobj_find_fence. But I still didn't get how to wait on sideband for binary Syncobj. Ah, I see, you will compare it in your QueueThread, if sideband value is >= expected, then do submission, otherwise, loop QueueThread, right? That sounds the QueueThread will be always busy. -David > > >> >> But I don't see how to wait/block in kernel when finding fence for >> binary sideband payload values. > > > Essentially our driver now handles timelines & binary semaphore using > dma-fence-chain in both cases. > > Only with timelines we take the values submitted by the vulkan > application. > > The binary semaphore auto increment on vkQueueSubmit() and that is > managed by the userspace driver. > > > -Lionel > > >> >> >> -David >> >>> Finally before exiting the QueueSubmit() call, we bump the sideband >>> payloads of all the binary semaphores have have been signaled : >>> https://gitlab.freedesktop.org/llandwerlin/mesa/blob/review/anv-timeline_semaphore_prep/src/intel/vulkan/anv_queue.c#L806 >>> >>> >>> >>> Whoever calls QueueSubmit() after that will pickup the new sideband >>> payload values to wait on. >>> >>> >>> -Lionel >>> >>> >>> >>>> -DAvid >>>> >>>> >>>> 在 2019/8/8 21:38, Lionel Landwerlin 写道: >>>>> Interesting question :) >>>>> >>>>> I didn't see any usecase for that. >>>>> This sideband payload value is used for a wait so waiting on the wait >>>>> value for another wait is bit meta :) >>>>> >>>>> Do you have a use case for that? >>>>> >>>>> -Lionel >>>>> >>>>> On 08/08/2019 16:23, Chunming Zhou wrote: >>>>>> The propursal is fine with me. >>>>>> >>>>>> one question: >>>>>> >>>>>> how to wait sideband payload? Following patch will show that? >>>>>> >>>>>> -David >>>>>> >>>>>> 在 2019/8/8 21:14, Lionel Landwerlin 写道: >>>>>>> The Vulkan timeline semaphores allow signaling to happen on the >>>>>>> point >>>>>>> of the timeline without all of the its dependencies to be created. >>>>>>> >>>>>>> The current 2 implementations (AMD/Intel) of the Vulkan spec on >>>>>>> top of >>>>>>> the Linux kernel are using a thread to wait on the dependencies >>>>>>> of a >>>>>>> given point to materialize and delay actual submission to the >>>>>>> kernel >>>>>>> driver until the wait completes. >>>>>>> >>>>>>> If a binary semaphore is submitted for signaling along the side >>>>>>> of a >>>>>>> timeline semaphore waiting for completion that means that the drm >>>>>>> syncobj associated with that binary semaphore will not have a DMA >>>>>>> fence associated with it by the time vkQueueSubmit() returns. This >>>>>>> and >>>>>>> the fact that a binary semaphore can be signaled and unsignaled as >>>>>>> before its DMA fences materialize mean that we cannot just rely on >>>>>>> the >>>>>>> fence within the syncobj but we also need a sideband payload >>>>>>> verifying >>>>>>> that the fence in the syncobj matches the last submission from the >>>>>>> Vulkan API point of view. >>>>>>> >>>>>>> This change adds a sideband payload that is incremented with >>>>>>> signaled >>>>>>> syncobj when vkQueueSubmit() is called. The next vkQueueSubmit() >>>>>>> waiting on a the syncobj will read the sideband payload and wait >>>>>>> for a >>>>>>> fence chain element with a seqno superior or equal to the sideband >>>>>>> payload value to be added into the fence chain and use that >>>>>>> fence to >>>>>>> trigger the submission on the kernel driver. >>>>>>> >>>>>>> v2: Use a separate ioctl to get/set the sideband value (Christian) >>>>>>> >>>>>>> v3: Use 2 ioctls for get/set (Christian) >>>>>>> >>>>>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@xxxxxxxxx> >>>>>>> Cc: Christian Koenig <Christian.Koenig@xxxxxxx> >>>>>>> Cc: Jason Ekstrand <jason@xxxxxxxxxxxxxx> >>>>>>> Cc: David(ChunMing) Zhou <David1.Zhou@xxxxxxx> >>>>>>> --- >>>>>>> drivers/gpu/drm/drm_internal.h | 4 ++ >>>>>>> drivers/gpu/drm/drm_ioctl.c | 5 +++ >>>>>>> drivers/gpu/drm/drm_syncobj.c | 79 >>>>>>> +++++++++++++++++++++++++++++++++- >>>>>>> include/drm/drm_syncobj.h | 9 ++++ >>>>>>> include/uapi/drm/drm.h | 2 + >>>>>>> 5 files changed, 98 insertions(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/drm_internal.h >>>>>>> b/drivers/gpu/drm/drm_internal.h >>>>>>> index 51a2055c8f18..0c9736199df0 100644 >>>>>>> --- a/drivers/gpu/drm/drm_internal.h >>>>>>> +++ b/drivers/gpu/drm/drm_internal.h >>>>>>> @@ -208,6 +208,10 @@ int drm_syncobj_timeline_signal_ioctl(struct >>>>>>> drm_device *dev, void *data, >>>>>>> struct drm_file *file_private); >>>>>>> int drm_syncobj_query_ioctl(struct drm_device *dev, void >>>>>>> *data, >>>>>>> struct drm_file *file_private); >>>>>>> +int drm_syncobj_get_sideband_ioctl(struct drm_device *dev, void >>>>>>> *data, >>>>>>> + struct drm_file *file_private); >>>>>>> +int drm_syncobj_set_sideband_ioctl(struct drm_device *dev, void >>>>>>> *data, >>>>>>> + struct drm_file *file_private); >>>>>>> /* drm_framebuffer.c */ >>>>>>> void drm_framebuffer_print_info(struct drm_printer *p, >>>>>>> unsigned >>>>>>> int indent, >>>>>>> diff --git a/drivers/gpu/drm/drm_ioctl.c >>>>>>> b/drivers/gpu/drm/drm_ioctl.c >>>>>>> index f675a3bb2c88..48500bf62801 100644 >>>>>>> --- a/drivers/gpu/drm/drm_ioctl.c >>>>>>> +++ b/drivers/gpu/drm/drm_ioctl.c >>>>>>> @@ -703,6 +703,11 @@ static const struct drm_ioctl_desc >>>>>>> drm_ioctls[] >>>>>>> = { >>>>>>> DRM_RENDER_ALLOW), >>>>>>> DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, >>>>>>> drm_syncobj_query_ioctl, >>>>>>> DRM_RENDER_ALLOW), >>>>>>> + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_GET_SIDEBAND, >>>>>>> drm_syncobj_get_sideband_ioctl, >>>>>>> + DRM_RENDER_ALLOW), >>>>>>> + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SET_SIDEBAND, >>>>>>> drm_syncobj_set_sideband_ioctl, >>>>>>> + DRM_RENDER_ALLOW), >>>>>>> + >>>>>>> DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, >>>>>>> drm_crtc_get_sequence_ioctl, 0), >>>>>>> DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, >>>>>>> drm_crtc_queue_sequence_ioctl, 0), >>>>>>> DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, >>>>>>> drm_mode_create_lease_ioctl, DRM_MASTER), >>>>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c >>>>>>> b/drivers/gpu/drm/drm_syncobj.c >>>>>>> index b927e482e554..c90ef20b9242 100644 >>>>>>> --- a/drivers/gpu/drm/drm_syncobj.c >>>>>>> +++ b/drivers/gpu/drm/drm_syncobj.c >>>>>>> @@ -1150,8 +1150,10 @@ drm_syncobj_reset_ioctl(struct drm_device >>>>>>> *dev, void *data, >>>>>>> if (ret < 0) >>>>>>> return ret; >>>>>>> - for (i = 0; i < args->count_handles; i++) >>>>>>> + for (i = 0; i < args->count_handles; i++) { >>>>>>> drm_syncobj_replace_fence(syncobjs[i], NULL); >>>>>>> + syncobjs[i]->sideband_payload = 0; >>>>>>> + } >>>>>>> drm_syncobj_array_free(syncobjs, args->count_handles); >>>>>>> @@ -1321,6 +1323,81 @@ int drm_syncobj_query_ioctl(struct >>>>>>> drm_device *dev, void *data, >>>>>>> if (ret) >>>>>>> break; >>>>>>> } >>>>>>> + >>>>>>> + drm_syncobj_array_free(syncobjs, args->count_handles); >>>>>>> + >>>>>>> + return ret; >>>>>>> +} >>>>>>> + >>>>>>> +int drm_syncobj_get_sideband_ioctl(struct drm_device *dev, void >>>>>>> *data, >>>>>>> + struct drm_file *file_private) >>>>>>> +{ >>>>>>> + struct drm_syncobj_timeline_array *args = data; >>>>>>> + struct drm_syncobj **syncobjs; >>>>>>> + uint64_t __user *points = u64_to_user_ptr(args->points); >>>>>>> + uint32_t i; >>>>>>> + int ret; >>>>>>> + >>>>>>> + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) >>>>>>> + return -EOPNOTSUPP; >>>>>>> + >>>>>>> + if (args->pad != 0) >>>>>>> + return -EINVAL; >>>>>>> + >>>>>>> + if (args->count_handles == 0) >>>>>>> + return -EINVAL; >>>>>>> + >>>>>>> + ret = drm_syncobj_array_find(file_private, >>>>>>> + u64_to_user_ptr(args->handles), >>>>>>> + args->count_handles, >>>>>>> + &syncobjs); >>>>>>> + if (ret < 0) >>>>>>> + return ret; >>>>>>> + >>>>>>> + for (i = 0; i < args->count_handles; i++) { >>>>>>> + copy_to_user(&points[i], &syncobjs[i]->sideband_payload, >>>>>>> sizeof(uint64_t)); >>>>>>> + ret = ret ? -EFAULT : 0; >>>>>>> + if (ret) >>>>>>> + break; >>>>>>> + } >>>>>>> + >>>>>>> + drm_syncobj_array_free(syncobjs, args->count_handles); >>>>>>> + >>>>>>> + return ret; >>>>>>> +} >>>>>>> + >>>>>>> +int drm_syncobj_set_sideband_ioctl(struct drm_device *dev, void >>>>>>> *data, >>>>>>> + struct drm_file *file_private) >>>>>>> +{ >>>>>>> + struct drm_syncobj_timeline_array *args = data; >>>>>>> + struct drm_syncobj **syncobjs; >>>>>>> + uint64_t __user *points = u64_to_user_ptr(args->points); >>>>>>> + uint32_t i; >>>>>>> + int ret; >>>>>>> + >>>>>>> + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ_TIMELINE)) >>>>>>> + return -EOPNOTSUPP; >>>>>>> + >>>>>>> + if (args->pad != 0) >>>>>>> + return -EINVAL; >>>>>>> + >>>>>>> + if (args->count_handles == 0) >>>>>>> + return -EINVAL; >>>>>>> + >>>>>>> + ret = drm_syncobj_array_find(file_private, >>>>>>> + u64_to_user_ptr(args->handles), >>>>>>> + args->count_handles, >>>>>>> + &syncobjs); >>>>>>> + if (ret < 0) >>>>>>> + return ret; >>>>>>> + >>>>>>> + for (i = 0; i < args->count_handles; i++) { >>>>>>> + copy_from_user(&syncobjs[i]->sideband_payload, &points[i], >>>>>>> sizeof(uint64_t)); >>>>>>> + ret = ret ? -EFAULT : 0; >>>>>>> + if (ret) >>>>>>> + break; >>>>>>> + } >>>>>>> + >>>>>>> drm_syncobj_array_free(syncobjs, args->count_handles); >>>>>>> return ret; >>>>>>> diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h >>>>>>> index 6cf7243a1dc5..b587b8e07547 100644 >>>>>>> --- a/include/drm/drm_syncobj.h >>>>>>> +++ b/include/drm/drm_syncobj.h >>>>>>> @@ -61,6 +61,15 @@ struct drm_syncobj { >>>>>>> * @file: A file backing for this syncobj. >>>>>>> */ >>>>>>> struct file *file; >>>>>>> + /** >>>>>>> + * @sideband_payload: A 64bit side band payload. >>>>>>> + * >>>>>>> + * We use the sideband payload value to wait on binary syncobj >>>>>>> fences >>>>>>> + * to materialize. It is a reservation mechanism for the >>>>>>> signaler to >>>>>>> + * express that at some point in the future a dma fence with >>>>>>> the same >>>>>>> + * seqno will be put into the syncobj. >>>>>>> + */ >>>>>>> + u64 sideband_payload; >>>>>>> }; >>>>>>> void drm_syncobj_free(struct kref *kref); >>>>>>> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h >>>>>>> index 8a5b2f8f8eb9..cffdc6c9772c 100644 >>>>>>> --- a/include/uapi/drm/drm.h >>>>>>> +++ b/include/uapi/drm/drm.h >>>>>>> @@ -946,6 +946,8 @@ extern "C" { >>>>>>> #define DRM_IOCTL_SYNCOBJ_QUERY DRM_IOWR(0xCB, struct >>>>>>> drm_syncobj_timeline_array) >>>>>>> #define DRM_IOCTL_SYNCOBJ_TRANSFER DRM_IOWR(0xCC, struct >>>>>>> drm_syncobj_transfer) >>>>>>> #define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL DRM_IOWR(0xCD, >>>>>>> struct drm_syncobj_timeline_array) >>>>>>> +#define DRM_IOCTL_SYNCOBJ_GET_SIDEBAND DRM_IOR(0xCE, struct >>>>>>> drm_syncobj_timeline_array) >>>>>>> +#define DRM_IOCTL_SYNCOBJ_SET_SIDEBAND DRM_IOWR(0xCF, struct >>>>>>> drm_syncobj_timeline_array) >>>>>>> /** >>>>>>> * Device specific ioctls should only be in their respective >>>>>>> headers > > _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel