Re: [RFC] drm/radeon: userfence IOCTL

Serguei Sagalovitch <serguei.sagalovitch@xxxxxxx> · Mon, 13 Apr 2015 11:37:42 -0400

    >  Another alternative would be to use the userspace mapping to
    check the BO value

    This is what I was thinking.  

    Sincerely yours,

    Serguei Sagalovitch

    On 15-04-13 11:35 AM, Christian König wrote:

      On 13.04.2015 17:25, Serguei Sagalovitch wrote:

        > the BO to be kept in the same place while it is mapped inside
        the kernel page table

        ...

        > So this requires that we pin down the BO for the duration
        of the wait IOCTL. 

        But my understanding is that it should not be the duration of
        the "wait" IOCTL but the duration of command buffer execution.

        BTW: I would assume that this is not a new scenario.

        This is the scenario:

            - User allocates a BO

            - User gets a CPU address for the BO

            - User submits a command buffer that writes to the BO

            - User can "poll" / "read" or "write" the BO data with the CPU

        So when TTM needs to move the BO to another location it should
        also update the CPU "mapping" correctly, so that the user will
        always read / write the correct data.

        Did I miss anything?

      The problem is that kernel mappings are not updated when TTM moves
      the buffer around. In the case of a swapped out buffer that
      wouldn't even be possible because kernel mappings aren't pageable.

      You just can't map the BO into kernel space unless you have it
      pinned down, so you can't check the current value written in the
      BO in your IOCTL.

      One alternative is to send all interrupts in question unfiltered
      to user space and let userspace do the check if the right value
      was written or not. But I assume that this would be rather bad for
      performance.

      Another alternative would be to use the userspace mapping to check
      the BO value, but this approach isn't compatible with a GPU
      scheduler. E.g. you can't really access memory across process
      address spaces in device drivers.

      Regards,

      Christian.

        Sincerely yours,

        Serguei Sagalovitch

        On 15-04-13 10:52 AM, Christian König wrote:

          Hello everyone,

we have a requirement for a slightly different kind of fence handling. Currently we handle fences completely inside the kernel, but in the future we would like to emit multiple fences inside the same IB as well.

This works by adding multiple fence commands into an IB which just write their value to a specific location inside a BO and trigger the appropriate hardware interrupt.

The user part of the driver stack should then be able to call an IOCTL to wait for the interrupt and block until the value (or something larger) has been written to the specific location.
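
As a rough illustration of the wait condition described above, here is a minimal sketch in C. It assumes the fence location holds a 64-bit value that only grows; the names userfence_signaled and userfence_poll are hypothetical and are not taken from the patches. The bounded poll merely stands in for "block until written"; a real wait would sleep until the interrupt arrives instead of spinning.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the radeon patches): true once the
 * fence location holds the awaited value or something larger.
 * Assumption: 64-bit, monotonically increasing fence values.
 */
static bool userfence_signaled(const volatile uint64_t *fence_addr,
                               uint64_t wait_value)
{
    return *fence_addr >= wait_value;
}

/*
 * Bounded poll standing in for the blocking wait. Re-reads the location
 * at most max_polls times and reports whether the condition was met.
 */
static bool userfence_poll(const volatile uint64_t *fence_addr,
                           uint64_t wait_value, unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if (userfence_signaled(fence_addr, wait_value))
            return true;
    }
    return false;
}
```

The ">=" comparison is what makes "or something larger" work: a waiter that shows up late still sees its fence as signaled after a later fence has overwritten the same location.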

This has the advantage that you can have multiple synchronization points in the same IB and don't need to split up your draw commands over several IBs so that the kernel can insert kernel fences in between.
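
To make the "multiple synchronization points in the same IB" idea concrete, the following is a toy model in plain C, not driver code. The command types, struct ib_cmd and ib_execute are all hypothetical names invented for this sketch; the only behaviour taken from the description above is that each fence command writes its value to a fixed location.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command types, for illustration only. */
enum cmd_type { CMD_DRAW, CMD_FENCE };

struct ib_cmd {
    enum cmd_type type;
    uint64_t fence_value;   /* only meaningful for CMD_FENCE */
};

/*
 * "Execute" the first n commands of a toy IB. Every fence command
 * writes its value to *fence_slot, which models the location inside
 * the BO. Returns the number of draw commands that were executed.
 */
static size_t ib_execute(const struct ib_cmd *ib, size_t n,
                         uint64_t *fence_slot)
{
    size_t draws = 0;

    for (size_t i = 0; i < n; i++) {
        if (ib[i].type == CMD_FENCE)
            *fence_slot = ib[i].fence_value;
        else
            draws++;
    }
    return draws;
}
```

With an IB such as draw, fence 1, draw, draw, fence 2, a waiter on value 1 can proceed as soon as the first fence command has run, while the rest of the same IB is still executing.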

The following set of patches tries to implement exactly this IOCTL. The big problem with that IOCTL is that TTM needs the BO to be kept in the same place while it is mapped inside the kernel page table. So this requires that we pin down the BO for the duration of the wait IOCTL.

This practically gives userspace a way of pinning down BOs for as long as it wants, without any ability for the kernel to intervene.

Any ideas on how to avoid those problems? Or better ideas on how to handle the new requirements?

Please note that the patches are only hacked together quick and dirty to demonstrate the problem and are not very well tested.

Best regards,
Christian.

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel