On Tue, Sep 14, 2021 at 10:53 AM Chia-I Wu <olvaffe@xxxxxxxxx> wrote:
On Mon, Sep 13, 2021 at 6:57 PM Gurchetan Singh
<gurchetansingh@xxxxxxxxxxxx> wrote:
>
> On Mon, Sep 13, 2021 at 11:52 AM Chia-I Wu <olvaffe@xxxxxxxxx> wrote:
>>
>>
>>
>> On Mon, Sep 13, 2021 at 10:48 AM Gurchetan Singh
>> <gurchetansingh@xxxxxxxxxxxx> wrote:
>> >
>> > On Fri, Sep 10, 2021 at 12:33 PM Chia-I Wu <olvaffe@xxxxxxxxx> wrote:
>> >>
>> >> On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh
>> >> <gurchetansingh@xxxxxxxxxxxx> wrote:
>> >> >
>> >> > We don't want fences from different 3D contexts (virgl, gfxstream,
>> >> > venus) to be on the same timeline. With explicit context creation,
>> >> > we can specify the number of rings each context wants.
>> >> >
>> >> > Execbuffer can specify which ring to use.
>> >> >
>> >> > Signed-off-by: Gurchetan Singh <gurchetansingh@xxxxxxxxxxxx>
>> >> > Acked-by: Lingfeng Yang <lfy@xxxxxxxxxx>
>> >> > ---
>> >> > drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++
>> >> > drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 ++++++++++++++++++++++++--
>> >> > 2 files changed, 35 insertions(+), 2 deletions(-)
>> >> >
>> >> > diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
>> >> > index a5142d60c2fa..cca9ab505deb 100644
>> >> > --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
>> >> > +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
>> >> > @@ -56,6 +56,7 @@
>> >> >  #define STATE_ERR 2
>> >> >
>> >> >  #define MAX_CAPSET_ID 63
>> >> > +#define MAX_RINGS 64
>> >> >
>> >> >  struct virtio_gpu_object_params {
>> >> >  	unsigned long size;
>> >> > @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv {
>> >> >  	uint32_t ctx_id;
>> >> >  	uint32_t context_init;
>> >> >  	bool context_created;
>> >> > +	uint32_t num_rings;
>> >> > +	uint64_t base_fence_ctx;
>> >> >  	struct mutex context_lock;
>> >> >  };
>> >> >
>> >> > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
>> >> > index f51f3393a194..262f79210283 100644
>> >> > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
>> >> > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
>> >> > @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
>> >> >  	int in_fence_fd = exbuf->fence_fd;
>> >> >  	int out_fence_fd = -1;
>> >> >  	void *buf;
>> >> > +	uint64_t fence_ctx;
>> >> > +	uint32_t ring_idx;
>> >> > +
>> >> > +	fence_ctx = vgdev->fence_drv.context;
>> >> > +	ring_idx = 0;
>> >> >
>> >> >  	if (vgdev->has_virgl_3d == false)
>> >> >  		return -ENOSYS;
>> >> > @@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
>> >> >  	if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS))
>> >> >  		return -EINVAL;
>> >> >
>> >> > +	if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
>> >> > +		if (exbuf->ring_idx >= vfpriv->num_rings)
>> >> > +			return -EINVAL;
>> >> > +
>> >> > +		if (!vfpriv->base_fence_ctx)
>> >> > +			return -EINVAL;
>> >> > +
>> >> > +		fence_ctx = vfpriv->base_fence_ctx;
>> >> > +		ring_idx = exbuf->ring_idx;
>> >> > +	}
>> >> > +
>> >> >  	exbuf->fence_fd = -1;
>> >> >
>> >> >  	virtio_gpu_create_context(dev, file);
>> >> > @@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
>> >> >  		goto out_memdup;
>> >> >  	}
>> >> >
>> >> > -	out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0);
>> >> > +	out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx);
>> >> >  	if(!out_fence) {
>> >> >  		ret = -ENOMEM;
>> >> >  		goto out_unresv;
>> >> > @@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev,
>> >> >  		return -EINVAL;
>> >> >
>> >> >  	/* Number of unique parameters supported at this time. */
>> >> > -	if (num_params > 1)
>> >> > +	if (num_params > 2)
>> >> >  		return -EINVAL;
>> >> >
>> >> >  	ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params),
>> >> > @@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev,
>> >> >
>> >> >  			vfpriv->context_init |= value;
>> >> >  			break;
>> >> > +		case VIRTGPU_CONTEXT_PARAM_NUM_RINGS:
>> >> > +			if (vfpriv->base_fence_ctx) {
>> >> > +				ret = -EINVAL;
>> >> > +				goto out_unlock;
>> >> > +			}
>> >> > +
>> >> > +			if (value > MAX_RINGS) {
>> >> > +				ret = -EINVAL;
>> >> > +				goto out_unlock;
>> >> > +			}
>> >> > +
>> >> > +			vfpriv->base_fence_ctx = dma_fence_context_alloc(value);
>> >> With multiple fence contexts, we should do something about implicit fencing.
>> >>
>> >> The classic example is Mesa and X server. When both use virgl and the
>> >> global fence context, no dma_fence_wait is fine. But when Mesa uses
>> >> venus and the ring fence context, dma_fence_wait should be inserted.
>> >
>> >
>> > If I read your comment correctly, the use case is:
>> >
>> > context A (venus)
>> >
>> > sharing a render target with
>> >
>> > context B (X server backed by virgl)
>> >
>> > ?
>> >
>> > In which function do you envisage dma_fence_wait(...) being inserted? Doesn't implicit synchronization mean there's no fence to share between contexts (only buffer objects)?
>>
>> Fences can be implicitly shared via reservation objects associated
>> with buffer objects.
>>
>> > It may be possible to wait on the reservation object associated with a buffer object from a different context (userspace can also do DRM_IOCTL_VIRTGPU_WAIT), but I'm not sure that's what you're looking for.
>>
>> Right, that's what I am looking for. Userspace expects implicit
>> fencing to work. While there is work underway to move userspace to
>> explicit fencing, it is not there yet in general, and we can't require
>> userspace to do explicit fencing or DRM_IOCTL_VIRTGPU_WAIT.
>
>
> Another option would be to use the upcoming DMA_BUF_IOCTL_EXPORT_SYNC_FILE + VIRTGPU_EXECBUF_FENCE_FD_IN (which checks the dma_fence context).
That requires the X server / compositors to be modified. For example,
venus works under Android (where there is explicit fencing) or under a
modified compositor (which does DMA_BUF_IOCTL_EXPORT_SYNC_FILE or
DRM_IOCTL_VIRTGPU_WAIT). But it does not work too well with an
unmodified X server.
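For reference, the client-side flow I had in mind is roughly the sketch
below. The struct and flag names follow the in-flight
DMA_BUF_IOCTL_EXPORT_SYNC_FILE proposal (so subject to change), and
dmabuf_fd / exbuf are assumed to come from the surrounding code:

	/* Import the shared BO's implicit fences as a sync file... */
	struct dma_buf_export_sync_file export = {
		.flags = DMA_BUF_SYNC_READ,	/* i.e., wait for writers */
		.fd = -1,
	};

	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &export))
		return -1;

	/* ...and feed it to execbuffer as an explicit in-fence. */
	exbuf.flags |= VIRTGPU_EXECBUF_FENCE_FD_IN;
	exbuf.fence_fd = export.fd;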
Some semi-recent virgl modifications will be needed regardless for
interop, such as VIRGL_CAP_V2_UNTYPED_RESOURCE (?). Not sure there are
too many virgl users anyway (mostly developers). Does the X server just
pick up the latest Mesa release (including virgl/venus)? Suppose
context types land in 5.16 and the userspace changes (for both venus
and virgl) land in the 21.2 stable releases.
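For concreteness, a rough sketch of the userspace side with this series
applied -- the ioctl and struct names follow the proposed virtgpu_drm.h
uapi, while the helper functions are made up for illustration:

#include <stdint.h>
#include <sys/ioctl.h>
#include "drm/virtgpu_drm.h"	/* proposed uapi from this series */

/* Create a context with its own base fence context and N rings.
 * Must be done before the first execbuffer on this fd. */
static int create_ctx_with_rings(int fd, uint64_t capset_id, uint64_t num_rings)
{
	struct drm_virtgpu_context_set_param params[2] = {
		{ VIRTGPU_CONTEXT_PARAM_CAPSET_ID, capset_id },
		{ VIRTGPU_CONTEXT_PARAM_NUM_RINGS, num_rings },
	};
	struct drm_virtgpu_context_init init = {
		.num_params = 2,
		.ctx_set_params = (uintptr_t)params,
	};

	return ioctl(fd, DRM_IOCTL_VIRTGPU_CONTEXT_INIT, &init);
}

/* Submit on a given ring; fences from different rings then live on
 * different dma_fence contexts (timelines). */
static int submit_on_ring(int fd, void *cmd, uint32_t size, uint32_t ring_idx)
{
	struct drm_virtgpu_execbuffer exbuf = {
		.flags = VIRTGPU_EXECBUF_RING_IDX,
		.size = size,
		.command = (uintptr_t)cmd,
		.ring_idx = ring_idx,	/* must be < the NUM_RINGS value above */
		.fence_fd = -1,
	};

	return ioctl(fd, DRM_IOCTL_VIRTGPU_EXECBUFFER, &exbuf);
}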
>
> Generally, if it only requires virgl changes, userspace changes are fine since OpenGL drivers implement implicit sync in many ways. Waiting on the reservation object in the kernel is fine too though.
I don't think we want to assume virgl to be the only consumer of
dma-bufs, despite that it is the most common use case.
>
> Though venus doesn't use the NUM_RINGS param yet. Getting all permutations of context type + display integration working would take some time (patchset mostly tested with wayland + gfxstream/Android [no implicit sync]).
>
> WDYT of someone figuring out virgl/venus interop later, independently of this patchset?
I think we should understand the implications of multiple fence
contexts better, even if some changes are not included in this
patchset.
From my view, we don't need implicit fencing in most cases and
implicit fencing should be considered a legacy path. But X server /
compositors today happen to require it. Other drivers seem to use a
flag to control whether implicit fences are set up or waited (e.g.,
AMDGPU_GEM_CREATE_EXPLICIT_SYNC, MSM_SUBMIT_NO_IMPLICIT, or
EXEC_OBJECT_WRITE). It seems to be the least surprising thing to do.
IMO, the easiest way is just to limit the change to userspace if possible, since implicit sync is legacy and something we want to deprecate over time.
Another option is to add something like VIRTGPU_EXECBUF_EXPLICIT_SYNC (similar to MSM_SUBMIT_NO_IMPLICIT), where the reservation objects are waited on / added to only when that flag is absent. Since explicit sync will need new hypercalls/params and is a major change, that feature is expected to be independent of context types.
With that option, waiting on the reservation object would just be another bug fix + addition for 5.16 (perhaps by you), so we can proceed in parallel faster. VIRTGPU_EXECBUF_EXPLICIT_SYNC (or an equivalent) would be added later.
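To make that concrete, here is a rough sketch (VIRTGPU_EXECBUF_EXPLICIT_SYNC
is a hypothetical name at this point, and this leans on the existing buflist
handling and labels in virtio_gpu_execbuffer_ioctl()):

	/* Sketch: unless userspace opts out of implicit sync, stall on
	 * any fences already attached to the BOs' reservation objects
	 * before queueing the submission. */
	if (buflist && !(exbuf->flags & VIRTGPU_EXECBUF_EXPLICIT_SYNC)) {
		int i;

		for (i = 0; i < exbuf->num_bo_handles; i++) {
			long r;

			r = dma_resv_wait_timeout(buflist->objs[i]->resv,
						  true /* wait_all */,
						  true /* intr */,
						  MAX_SCHEDULE_TIMEOUT);
			if (r <= 0) {
				ret = r ? r : -ETIME;
				goto out_unresv;
			}
		}
	}

That would keep unmodified X servers / compositors working, while Android
and modified compositors could set the flag and skip the stall.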
>
>>
>> >
>> >>
>> >>
>> >> > +			vfpriv->num_rings = value;
>> >> > +			break;
>> >> >  		default:
>> >> >  			ret = -EINVAL;
>> >> >  			goto out_unlock;
>> >> > --
>> >> > 2.33.0.153.gba50c8fa24-goog
>> >> >