[RFC] Mechanism for high priority scheduling in amdgpu

Do you encounter the priority issue for the compute queue with the current driver?

If the compute queue is occupied only by you, I think the efficiency is the
same as setting the job queue to high priority.

Regards,
David Zhou

On 2016-12-19 13:29, Andres Rodriguez wrote:
> Yes, vulkan is available on all-open through the mesa radv UMD.
>
> I'm not sure if I'm asking for too much, but if we can coordinate a 
> similar interface in radv and amdgpu-pro at the vulkan level that 
> would be great.
>
> I'm not sure what that's going to be yet.
>
> - Andres
>
> On 12/19/2016 12:11 AM, zhoucm1 wrote:
>>
>>
>> On 2016-12-19 11:33, Pierre-Loup A. Griffais wrote:
>>> We're currently working with the open stack; I assume that a 
>>> mechanism could be exposed by both open and Pro Vulkan userspace 
>>> drivers and that the amdgpu kernel interface improvements we would 
>>> pursue following this discussion would let both drivers take 
>>> advantage of the feature, correct?
>> Of course.
>> Does open stack have Vulkan support?
>>
>> Regards,
>> David Zhou
>>>
>>> On 12/18/2016 07:26 PM, zhoucm1 wrote:
>>>> By the way, are you using all-open driver or amdgpu-pro driver?
>>>>
>>>> +David Mao, who is working on our Vulkan driver.
>>>>
>>>> Regards,
>>>> David Zhou
>>>>
>>>> On 2016-12-18 06:05, Pierre-Loup A. Griffais wrote:
>>>>> Hi Serguei,
>>>>>
>>>>> I'm also working on bringing up our VR runtime on top of amdgpu;
>>>>> see replies inline.
>>>>>
>>>>> On 12/16/2016 09:05 PM, Sagalovitch, Serguei wrote:
>>>>>> Andres,
>>>>>>
>>>>>>>  For current VR workloads we have 3 separate processes running
>>>>>>> actually:
>>>>>> So we could have potential memory overcommit case or do you do
>>>>>> partitioning
>>>>>> on your own?  I would think that there is need to avoid overcommit in
>>>>>> VR case to
>>>>>> prevent any BO migration.
>>>>>
>>>>> You're entirely correct; currently the VR runtime is setting up
>>>>> prioritized CPU scheduling for its VR compositor, we're working on
>>>>> prioritized GPU scheduling and pre-emption (eg. this thread), and in
>>>>> the future it will make sense to do work in order to make sure that
>>>>> its memory allocations do not get evicted, to prevent any unwelcome
>>>>> additional latency in the event of needing to perform just-in-time
>>>>> reprojection.
>>>>>
>>>>>> BTW: Do you mean __real__ processes or threads?
>>>>>> Based on my understanding sharing BOs between different processes
>>>>>> could introduce additional synchronization constraints. btw: I am not
>>>>>> sure
>>>>>> if we are able to share Vulkan sync. object cross-process boundary.
>>>>>
>>>>> They are different processes; it is important for the compositor that
>>>>> is responsible for quality-of-service features such as consistently
>>>>> presenting distorted frames with the right latency, reprojection, 
>>>>> etc,
>>>>> to be separate from the main application.
>>>>>
>>>>> Currently we are using unreleased cross-process memory and semaphore
>>>>> extensions to fetch updated eye images from the client application,
>>>>> but the just-in-time reprojection discussed here does not actually
>>>>> have any direct interactions with cross-process resource sharing,
>>>>> since it's achieved by using whatever is the latest, most up-to-date
>>>>> eye images that have already been sent by the client application,
>>>>> which are already available to use without additional 
>>>>> synchronization.
>>>>>
>>>>>>
>>>>>>>    3) System compositor (we are looking at approaches to remove 
>>>>>>> this
>>>>>>> overhead)
>>>>>> Yes,  IMHO the best is to run in  "full screen mode".
>>>>>
>>>>> Yes, we are working on mechanisms to present directly to the headset
>>>>> display without any intermediaries as a separate effort.
>>>>>
>>>>>>
>>>>>>>  The latency is our main concern,
>>>>>> I would assume that this is the known problem (at least for compute
>>>>>> usage).
>>>>>> It looks like that amdgpu / kernel submission is rather CPU 
>>>>>> intensive
>>>>>> (at least
>>>>>> in the default configuration).
>>>>>
>>>>> As long as it's a consistent cost, it shouldn't be an issue. However, if
>>>>> there are high degrees of variance then that would be troublesome and
>>>>> we would need to account for the worst case.
>>>>>
>>>>> Hopefully the requirements and approach we described make sense, 
>>>>> we're
>>>>> looking forward to your feedback and suggestions.
>>>>>
>>>>> Thanks!
>>>>>  - Pierre-Loup
>>>>>
>>>>>>
>>>>>> Sincerely yours,
>>>>>> Serguei Sagalovitch
>>>>>>
>>>>>>
>>>>>> From: Andres Rodriguez <andresr at valvesoftware.com>
>>>>>> Sent: December 16, 2016 10:00 PM
>>>>>> To: Sagalovitch, Serguei; amd-gfx at lists.freedesktop.org
>>>>>> Subject: RE: [RFC] Mechanism for high priority scheduling in amdgpu
>>>>>>
>>>>>> Hey Serguei,
>>>>>>
>>>>>>> [Serguei] No. I mean pipe :-) as MEC define it.  As far as I
>>>>>>> understand (by simplifying)
>>>>>>> some scheduling is per pipe.  I know about the current allocation
>>>>>>> scheme but I do not think
>>>>>>> that it is  ideal.  I would assume that we need  to switch to
>>>>>>> dynamical partition
>>>>>>> of resources  based on the workload otherwise we will have resource
>>>>>>> conflict
>>>>>>> between Vulkan compute and  OpenCL.
>>>>>>
>>>>>> I agree the partitioning isn't ideal. I'm hoping we can start with a
>>>>>> solution that assumes that
>>>>>> only pipe0 has any work and the other pipes are idle (no HSA/ROCm
>>>>>> running on the system).
>>>>>>
>>>>>> This should be more or less the use case we expect from VR users.
>>>>>>
>>>>>> I agree the split is currently not ideal, but I'd like to consider
>>>>>> that a separate task, because
>>>>>> making it dynamic is not straightforward :P
>>>>>>
>>>>>>> [Serguei] Vulkan works via amdgpu (kernel submissions) so amdkfd
>>>>>>> will be not
>>>>>>> involved.  I would assume that in the case of VR we will have 
>>>>>>> one main
>>>>>>> application ("console" mode(?)) so we could temporarily "ignore"
>>>>>>> OpenCL/ROCm needs when VR is running.
>>>>>>
>>>>>> Correct, this is why we want to enable the high priority compute
>>>>>> queue through
>>>>>> libdrm-amdgpu, so that we can expose it through Vulkan later.
>>>>>>
>>>>>> For current VR workloads we have 3 separate processes running 
>>>>>> actually:
>>>>>>     1) Game process
>>>>>>     2) VR Compositor (this is the process that will require high
>>>>>> priority queue)
>>>>>>     3) System compositor (we are looking at approaches to remove 
>>>>>> this
>>>>>> overhead)
>>>>>>
>>>>>> For now I think it is okay to assume no OpenCL/ROCm running
>>>>>> simultaneously, but
>>>>>> I would also like to be able to address this case in the future
>>>>>> (cross-pipe priorities).
>>>>>>
>>>>>>> [Serguei]  The problem with pre-emption of graphics task:  (a) it
>>>>>>> may take time so
>>>>>>> latency may suffer
>>>>>>
>>>>>> The latency is our main concern, we want something that is
>>>>>> predictable. A good
>>>>>> illustration of what the reprojection scheduling looks like can be
>>>>>> found here:
>>>>>> https://community.amd.com/servlet/JiveServlet/showImage/38-1310-104754/pastedImage_3.png 
>>>>>>
>>>>>>
>>>>>>
>>>>>>> (b) to preempt we need to have different "context" - we want
>>>>>>> to guarantee that submissions from the same context will be 
>>>>>>> executed
>>>>>>> in order.
>>>>>>
>>>>>> This is okay, as the reprojection work doesn't have dependencies on
>>>>>> the game context, and it
>>>>>> even happens in a separate process.
>>>>>>
>>>>>>> BTW: (a) Do you want  "preempt" and later resume or do you want
>>>>>>> "preempt" and
>>>>>>> "cancel/abort"
>>>>>>
>>>>>> Preempt the game with the compositor task and then resume it.
>>>>>>
>>>>>>> (b) Vulkan is generic API and could be used for graphics as well as
>>>>>>> for plain compute tasks (VK_QUEUE_COMPUTE_BIT).
>>>>>>
>>>>>> Yeah, the plan is to use vulkan compute. But if you figure out a way
>>>>>> for us to get
>>>>>> a guaranteed execution time using vulkan graphics, then I'll take 
>>>>>> you
>>>>>> out for a beer :)
>>>>>>
>>>>>> Regards,
>>>>>> Andres
>>>>>> ________________________________________
>>>>>> From: Sagalovitch, Serguei [Serguei.Sagalovitch at amd.com]
>>>>>> Sent: Friday, December 16, 2016 9:13 PM
>>>>>> To: Andres Rodriguez; amd-gfx at lists.freedesktop.org
>>>>>> Subject: Re: [RFC] Mechanism for high priority scheduling in amdgpu
>>>>>>
>>>>>> Hi Andres,
>>>>>>
>>>>>> Please see inline (as [Serguei])
>>>>>>
>>>>>> Sincerely yours,
>>>>>> Serguei Sagalovitch
>>>>>>
>>>>>>
>>>>>> From: Andres Rodriguez <andresr at valvesoftware.com>
>>>>>> Sent: December 16, 2016 8:29 PM
>>>>>> To: Sagalovitch, Serguei; amd-gfx at lists.freedesktop.org
>>>>>> Subject: RE: [RFC] Mechanism for high priority scheduling in amdgpu
>>>>>>
>>>>>> Hi Serguei,
>>>>>>
>>>>>> Thanks for the feedback. Answers inline as [AR].
>>>>>>
>>>>>> Regards,
>>>>>> Andres
>>>>>>
>>>>>> ________________________________________
>>>>>> From: Sagalovitch, Serguei [Serguei.Sagalovitch at amd.com]
>>>>>> Sent: Friday, December 16, 2016 8:15 PM
>>>>>> To: Andres Rodriguez; amd-gfx at lists.freedesktop.org
>>>>>> Subject: Re: [RFC] Mechanism for high priority scheduling in amdgpu
>>>>>>
>>>>>> Andres,
>>>>>>
>>>>>>
>>>>>> Quick comments:
>>>>>>
>>>>>> 1) To minimize "bubbles", etc. we need to "force" CU 
>>>>>> assignments/binding
>>>>>> to high-priority queue  when it will be in use and "free" them later
>>>>>> (we  do not want forever take CUs from e.g. graphic task to degrade
>>>>>> graphics
>>>>>> performance).
>>>>>>
>>>>>> Otherwise we could have scenario when long graphics task (or
>>>>>> low-priority
>>>>>> compute) will take all (extra) CUs and high-priority will wait for
>>>>>> needed resources.
>>>>>> It will not be visible on "NOP " but only when you submit "real"
>>>>>> compute task
>>>>>> so I would recommend  not to use "NOP" packets at all for testing.
>>>>>>
>>>>>> It (CU assignment) could be relatively easy done when everything is
>>>>>> going via kernel
>>>>>> (e.g. as part of frame submission) but I must admit that I am not 
>>>>>> sure
>>>>>> about the best way for user level submissions (amdkfd).
>>>>>>
>>>>>> [AR] I wasn't aware of this part of the programming sequence. Thanks
>>>>>> for the heads up!
>>>>>> Is this similar to the CU masking programming?
>>>>>> [Serguei] Yes. To simplify: the problem is that "scheduler" when
>>>>>> deciding which
>>>>>> queue to  run will check if there is enough resources and if not 
>>>>>> then
>>>>>> it will begin
>>>>>> to check other queues with lower priority.
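
The resource check described above can be illustrated with a minimal sketch. It is not the MEC firmware logic: `struct hw_queue`, its fields, and `pick_next_queue()` are hypothetical names used only to show why a high priority queue can be passed over when another task already holds the CUs it needs.

```c
#include <stddef.h>

/* Hypothetical model of a hardware queue; field names are illustrative only. */
struct hw_queue {
    int priority;    /* larger value = higher priority (assumption) */
    int cus_needed;  /* compute units the queue's next job requires */
    int has_work;    /* non-zero when a job is pending */
};

/*
 * Return the index of the queue that would run next: the highest priority
 * queue with pending work whose CU demand fits into free_cus.
 * Returns -1 when no pending queue fits.
 */
int pick_next_queue(const struct hw_queue *q, size_t n, int free_cus)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        /* Skip idle queues and queues that cannot get enough CUs. */
        if (!q[i].has_work || q[i].cus_needed > free_cus)
            continue;
        if (best < 0 || q[i].priority > q[best].priority)
            best = (int)i;
    }
    return best;
}
```

With only 4 CUs free, a high priority queue that needs 8 is skipped in favor of a lower priority queue that needs 2, which is the situation a NOP-only test would not reveal.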
>>>>>>
>>>>>> 2) I would recommend to dedicate the whole pipe to high-priority
>>>>>> queue and have
>>>>>> nothing there except it.
>>>>>>
>>>>>> [AR] I'm guessing in this context you mean pipe = queue? (as opposed
>>>>>> to the MEC definition
>>>>>> of pipe, which is a grouping of queues). I say this because amdgpu
>>>>>> only has access to 1 pipe,
>>>>>> and the rest are statically partitioned for amdkfd usage.
>>>>>>
>>>>>> [Serguei] No. I mean pipe :-)  as MEC define it.  As far as I
>>>>>> understand (by simplifying)
>>>>>> some scheduling is per pipe.  I know about the current allocation
>>>>>> scheme but I do not think
>>>>>> that it is  ideal.  I would assume that we need  to switch to
>>>>>> dynamical partition
>>>>>> of resources  based on the workload otherwise we will have resource
>>>>>> conflict
>>>>>> between Vulkan compute and  OpenCL.
>>>>>>
>>>>>>
>>>>>> BTW: Which user level API do you want to use for compute: Vulkan or
>>>>>> OpenCL?
>>>>>>
>>>>>> [AR] Vulkan
>>>>>>
>>>>>> [Serguei] Vulkan works via amdgpu (kernel submissions) so amdkfd 
>>>>>> will
>>>>>> be not
>>>>>> involved.  I would assume that in the case of VR we will have one 
>>>>>> main
>>>>>> application ("console" mode(?)) so we could temporarily "ignore"
>>>>>> OpenCL/ROCm needs when VR is running.
>>>>>>
>>>>>>>  we will not be able to provide a solution compatible with GFX
>>>>>>> workloads.
>>>>>> I assume that you are talking about graphics? Am I right?
>>>>>>
>>>>>> [AR] Yeah, my understanding is that pre-empting the currently 
>>>>>> running
>>>>>> graphics job and scheduling in
>>>>>> something else using mid-buffer pre-emption has some cases where it
>>>>>> doesn't work well. But if with
>>>>>> polaris10 it starts working well, it might be a better solution for
>>>>>> us (because the whole reprojection
>>>>>> work uses the vulkan graphics stack at the moment, and porting it to
>>>>>> compute is not trivial).
>>>>>>
>>>>>> [Serguei]  The problem with pre-emption of graphics task: (a) it may
>>>>>> take time so
>>>>>> latency may suffer (b) to preempt we need to have different 
>>>>>> "context"
>>>>>> - we want
>>>>>> to guarantee that submissions from the same context will be executed
>>>>>> in order.
>>>>>> BTW: (a) Do you want  "preempt" and later resume or do you want
>>>>>> "preempt" and
>>>>>> "cancel/abort"?  (b) Vulkan is generic API and could be used
>>>>>> for graphics as well as for plain compute tasks 
>>>>>> (VK_QUEUE_COMPUTE_BIT).
>>>>>>
>>>>>>
>>>>>> Sincerely yours,
>>>>>> Serguei Sagalovitch
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of
>>>>>> Andres Rodriguez <andresr at valvesoftware.com>
>>>>>> Sent: December 16, 2016 6:15 PM
>>>>>> To: amd-gfx at lists.freedesktop.org
>>>>>> Subject: [RFC] Mechanism for high priority scheduling in amdgpu
>>>>>>
>>>>>> Hi Everyone,
>>>>>>
>>>>>> This RFC is also available as a gist here:
>>>>>> https://gist.github.com/lostgoat/7000432cd6864265dbc2c3ab93204249
>>>>>>
>>>>>> We are interested in feedback for a mechanism to effectively 
>>>>>> schedule
>>>>>> high
>>>>>> priority VR reprojection tasks (also referred to as time-warping) 
>>>>>> for
>>>>>> Polaris10
>>>>>> running on the amdgpu kernel driver.
>>>>>>
>>>>>> Brief context:
>>>>>> --------------
>>>>>>
>>>>>> The main objective of reprojection is to avoid motion sickness 
>>>>>> for VR
>>>>>> users in
>>>>>> scenarios where the game or application would fail to finish
>>>>>> rendering a new
>>>>>> frame in time for the next VBLANK. When this happens, the user's 
>>>>>> head
>>>>>> movements
>>>>>> are not reflected on the Head Mounted Display (HMD) for the duration
>>>>>> of an
>>>>>> extra frame. This extended mismatch between the inner ear and the
>>>>>> eyes may
>>>>>> cause the user to experience motion sickness.
>>>>>>
>>>>>> The VR compositor deals with this problem by fabricating a new frame
>>>>>> using the
>>>>>> user's updated head position in combination with the previous 
>>>>>> frames.
>>>>>> This
>>>>>> avoids a prolonged mismatch between the HMD output and the inner 
>>>>>> ear.
>>>>>>
>>>>>> Because of the adverse effects on the user, we require high
>>>>>> confidence that the
>>>>>> reprojection task will complete before the VBLANK interval. Even if
>>>>>> the GFX pipe
>>>>>> is currently full of work from the game/application (which is most
>>>>>> likely the case).
>>>>>>
>>>>>> For more details and illustrations, please refer to the following
>>>>>> document:
>>>>>> https://community.amd.com/community/gaming/blog/2016/03/28/asynchronous-shaders-evolved 
>>>>>>
>>>>>> Requirements:
>>>>>> -------------
>>>>>>
>>>>>> The mechanism must expose the following functionality:
>>>>>>
>>>>>>     * Job round trip time must be predictable, from submission to
>>>>>> fence signal
>>>>>>
>>>>>>     * The mechanism must support compute workloads.
>>>>>>
>>>>>> Goals:
>>>>>> ------
>>>>>>
>>>>>>     * The mechanism should provide low submission latencies
>>>>>>
>>>>>> Test: submitting a NOP packet through the mechanism on busy hardware
>>>>>> should
>>>>>> be equivalent to submitting a NOP on idle hardware.
>>>>>>
>>>>>> Nice to have:
>>>>>> -------------
>>>>>>
>>>>>>     * The mechanism should also support GFX workloads.
>>>>>>
>>>>>> My understanding is that with the current hardware capabilities in
>>>>>> Polaris10 we
>>>>>> will not be able to provide a solution compatible with GFX workloads.
>>>>>>
>>>>>> But I would love to hear otherwise. So if anyone has an idea,
>>>>>> approach or
>>>>>> suggestion that will also be compatible with the GFX ring, please 
>>>>>> let
>>>>>> us know
>>>>>> about it.
>>>>>>
>>>>>>     * The above guarantees should also be respected by amdkfd 
>>>>>> workloads
>>>>>>
>>>>>> Would be good to have for consistency, but not strictly necessary as
>>>>>> users running
>>>>>> games are not traditionally running HPC workloads in the background.
>>>>>>
>>>>>> Proposed approach:
>>>>>> ------------------
>>>>>>
>>>>>> Similar to the windows driver, we could expose a high priority
>>>>>> compute queue to
>>>>>> userspace.
>>>>>>
>>>>>> Submissions to this compute queue will be scheduled with high
>>>>>> priority, and may
>>>>>> acquire hardware resources previously in use by other queues.
>>>>>>
>>>>>> This can be achieved by taking advantage of the 'priority' field in
>>>>>> the HQDs
>>>>>> and could be programmed by amdgpu or the amdgpu scheduler. The 
>>>>>> relevant
>>>>>> register fields are:
>>>>>>         * mmCP_HQD_PIPE_PRIORITY
>>>>>>         * mmCP_HQD_QUEUE_PRIORITY
>>>>>>
>>>>>> Implementation approach 1 - static partitioning:
>>>>>> ------------------------------------------------
>>>>>>
>>>>>> The amdgpu driver currently controls 8 compute queues from pipe0. 
>>>>>> We can
>>>>>> statically partition these as follows:
>>>>>>         * 7x regular
>>>>>>         * 1x high priority
>>>>>>
>>>>>> The relevant priorities can be set so that submissions to the high
>>>>>> priority
>>>>>> ring will starve the other compute rings and the GFX ring.
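
As a rough sketch of the 7 + 1 split, assuming the last queue on pipe0 is the reserved one (the RFC does not say which queue would be chosen), the static partition could be expressed as a simple classification function. `NUM_COMPUTE_QUEUES`, `enum queue_class`, and both helpers are hypothetical names, not existing amdgpu symbols.

```c
/* From the RFC: amdgpu currently controls 8 compute queues on pipe0. */
#define NUM_COMPUTE_QUEUES 8

enum queue_class {
    QUEUE_REGULAR = 0,
    QUEUE_HIGH_PRIORITY = 1
};

/* Assumption: the last queue is the one reserved for high priority work. */
enum queue_class classify_queue(int queue_index)
{
    return (queue_index == NUM_COMPUTE_QUEUES - 1) ? QUEUE_HIGH_PRIORITY
                                                   : QUEUE_REGULAR;
}

/* Count how many queues fall into a given class under this partition. */
int count_queues(enum queue_class cls)
{
    int n = 0;

    for (int i = 0; i < NUM_COMPUTE_QUEUES; i++)
        if (classify_queue(i) == cls)
            n++;
    return n;
}
```

Because the split is fixed at init time, no register reprogramming is needed on the submission path; the cost is that the reserved queue sits idle whenever no high priority context exists.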
>>>>>>
>>>>>> The amdgpu scheduler will only place jobs into the high priority
>>>>>> rings if the
>>>>>> context is marked as high priority. And a corresponding priority
>>>>>> should be
>>>>>> added to keep track of this information:
>>>>>>      * AMD_SCHED_PRIORITY_KERNEL
>>>>>>      * -> AMD_SCHED_PRIORITY_HIGH
>>>>>>      * AMD_SCHED_PRIORITY_NORMAL
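
A sketch of how the scheduler-side placement might look, under stated assumptions: the priority names come from the list above, but the numeric values, the ring indices, and the round-robin choice among regular rings are illustrative and not taken from the driver. `select_ring()` is a hypothetical helper.

```c
/* Names from the RFC; ordering and values here are assumptions. */
enum amd_sched_priority {
    AMD_SCHED_PRIORITY_KERNEL = 0,
    AMD_SCHED_PRIORITY_HIGH,    /* the proposed new level */
    AMD_SCHED_PRIORITY_NORMAL
};

#define SKETCH_HIGH_PRIORITY_RING 7  /* assumption: reserved ring index */
#define SKETCH_NUM_REGULAR_RINGS  7  /* rings 0..6 stay regular */

/*
 * Only contexts marked high priority are placed on the reserved ring.
 * Everything else is spread over the regular rings; round-robin by
 * submission sequence number is an arbitrary choice for this sketch.
 */
int select_ring(enum amd_sched_priority prio, unsigned int submission_seq)
{
    if (prio == AMD_SCHED_PRIORITY_HIGH)
        return SKETCH_HIGH_PRIORITY_RING;
    return (int)(submission_seq % SKETCH_NUM_REGULAR_RINGS);
}
```

Note that a regular submission can never land on ring 7 here, so the reserved ring only ever carries work from high priority contexts.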
>>>>>>
>>>>>> The user will request a high priority context by setting an
>>>>>> appropriate flag
>>>>>> in drm_amdgpu_ctx_in (AMDGPU_CTX_HIGH_PRIORITY or similar):
>>>>>> https://github.com/torvalds/linux/blob/master/include/uapi/drm/amdgpu_drm.h#L163 
>>>>>>
>>>>>>
>>>>>>
>>>>>> The setting is in a per context level so that we can:
>>>>>>     * Maintain a consistent FIFO ordering of all submissions to a
>>>>>> context
>>>>>>     * Create high priority and non-high priority contexts in the 
>>>>>> same
>>>>>> process
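
To make the per-context setting concrete, here is a minimal sketch that does not use the real uapi. `struct sketch_ctx_in` is a simplified stand-in for drm_amdgpu_ctx_in, and `SKETCH_CTX_HIGH_PRIORITY` stands in for the flag the RFC calls "AMDGPU_CTX_HIGH_PRIORITY or similar"; both are hypothetical, as is `sketch_ctx_create()`.

```c
#include <stdint.h>

/* Hypothetical flag bit; the RFC leaves the final name and value open. */
#define SKETCH_CTX_HIGH_PRIORITY (1u << 0)

/* Simplified stand-in for the context allocation request. */
struct sketch_ctx_in {
    uint32_t op;
    uint32_t flags;
};

/* Per-context state: priority is fixed when the context is created. */
struct sketch_ctx {
    int high_priority;
};

/* Returns 0 on success, -1 if unknown flag bits are set. */
int sketch_ctx_create(const struct sketch_ctx_in *in, struct sketch_ctx *out)
{
    if (in->flags & ~SKETCH_CTX_HIGH_PRIORITY)
        return -1;
    out->high_priority = (in->flags & SKETCH_CTX_HIGH_PRIORITY) != 0;
    return 0;
}
```

A single process can call this twice, once with the flag and once without, and end up with one high priority and one regular context, each keeping its own FIFO ordering.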
>>>>>>
>>>>>> Implementation approach 2 - dynamic priority programming:
>>>>>> ---------------------------------------------------------
>>>>>>
>>>>>> Similar to the above, but instead of programming the priorities at
>>>>>> amdgpu_init() time, the SW scheduler will reprogram the queue 
>>>>>> priorities
>>>>>> dynamically when scheduling a task.
>>>>>>
>>>>>> This would involve having a hardware specific callback from the
>>>>>> scheduler to
>>>>>> set the appropriate queue priority: set_priority(int ring, int 
>>>>>> index,
>>>>>> int priority)
>>>>>>
>>>>>> During this callback we would have to grab the SRBM mutex to perform
>>>>>> the appropriate
>>>>>> HW programming, and I'm not really sure if that is something we
>>>>>> should be doing from
>>>>>> the scheduler.
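
The callback and its locking concern can be sketched in userspace terms. This is only a model: a pthread mutex plays the role of the SRBM mutex, a plain array plays the role of the per-queue priority registers, and `SK_NUM_RINGS`, `sk_srbm_mutex`, `sk_queue_priority`, and `sk_get_priority()` are hypothetical. The RFC does not define what `index` selects, so the sketch ignores it.

```c
#include <pthread.h>

#define SK_NUM_RINGS 8

/* Stand-in for the SRBM mutex that guards banked register access. */
static pthread_mutex_t sk_srbm_mutex = PTHREAD_MUTEX_INITIALIZER;
/* Stand-in for the per-queue priority register contents. */
static int sk_queue_priority[SK_NUM_RINGS];

/*
 * Signature follows the RFC: set_priority(int ring, int index, int priority).
 * Returns 0 on success, -1 for an out-of-range ring.
 */
int set_priority(int ring, int index, int priority)
{
    (void)index; /* meaning not specified in the RFC; unused here */

    if (ring < 0 || ring >= SK_NUM_RINGS)
        return -1;

    /* Every reprogramming takes the lock, so the scheduler can block here. */
    pthread_mutex_lock(&sk_srbm_mutex);
    sk_queue_priority[ring] = priority;
    pthread_mutex_unlock(&sk_srbm_mutex);
    return 0;
}

/* Read back the programmed priority, or -1 for an out-of-range ring. */
int sk_get_priority(int ring)
{
    int value;

    if (ring < 0 || ring >= SK_NUM_RINGS)
        return -1;
    pthread_mutex_lock(&sk_srbm_mutex);
    value = sk_queue_priority[ring];
    pthread_mutex_unlock(&sk_srbm_mutex);
    return value;
}
```

The lock acquisition on every scheduling decision is the part the RFC is unsure about: it puts a potentially contended mutex on the submission path.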
>>>>>>
>>>>>> On the positive side, this approach would allow us to program a 
>>>>>> range of
>>>>>> priorities for jobs instead of a single "high priority" value,
>>>>>> achieving
>>>>>> something similar to the niceness API available for CPU scheduling.
>>>>>>
>>>>>> I'm not sure if this flexibility is something that we would need for
>>>>>> our use
>>>>>> case, but it might be useful in other scenarios (multiple users
>>>>>> sharing compute
>>>>>> time on a server).
>>>>>>
>>>>>> This approach would require a new int field in drm_amdgpu_ctx_in, or
>>>>>> repurposing
>>>>>> of the flags field.
>>>>>>
>>>>>> Known current obstacles:
>>>>>> ------------------------
>>>>>>
>>>>>> The SQ is currently programmed to disregard the HQD priorities, and
>>>>>> instead it picks
>>>>>> jobs at random. Settings from the shader itself are also disregarded
>>>>>> as this is
>>>>>> considered a privileged field.
>>>>>>
>>>>>> Effectively we can get our compute wavefront launched ASAP, but we
>>>>>> might not get the
>>>>>> time we need on the SQ.
>>>>>>
>>>>>> The current programming would have to be changed to allow priority
>>>>>> propagation
>>>>>> from the HQD into the SQ.
>>>>>>
>>>>>> Generic approach for all HW IPs:
>>>>>> --------------------------------
>>>>>>
>>>>>> For consistency purposes, the high priority context can be enabled
>>>>>> for all HW IPs
>>>>>> with support of the SW scheduler. This will function similarly to 
>>>>>> the
>>>>>> current
>>>>>> AMD_SCHED_PRIORITY_KERNEL priority, where the job can jump ahead of
>>>>>> anything not
>>>>>> committed to the HW queue.
>>>>>>
>>>>>> The benefits of requesting a high priority context for a non-compute
>>>>>> queue will
>>>>>> be lesser (e.g. up to 10s of wait time if a GFX command is stuck in
>>>>>> front of
>>>>>> you), but having the API in place will allow us to easily improve 
>>>>>> the
>>>>>> implementation
>>>>>> in the future as new features become available in new hardware.
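
The "jump ahead of anything not committed to the HW queue" behavior can be modeled with a small selection routine. This is a sketch under assumptions, not the amdgpu scheduler: `enum sk_prio`, `struct sk_job`, and `sk_pick_next()` are hypothetical, and a lower enum value is assumed to mean higher priority.

```c
#include <stddef.h>

/* Assumption: lower value = higher priority. */
enum sk_prio { SK_PRIO_KERNEL = 0, SK_PRIO_HIGH, SK_PRIO_NORMAL };

struct sk_job {
    enum sk_prio prio;
    int id;
    int committed; /* non-zero once the job sits in the HW queue */
};

/*
 * Pick the next job to commit: the highest priority job that is not yet
 * committed. Ties keep array (FIFO) order. Returns an index, or -1 if
 * every job is already committed.
 */
int sk_pick_next(const struct sk_job *jobs, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (jobs[i].committed)
            continue; /* already on hardware: cannot be overtaken */
        if (best < 0 || jobs[i].prio < jobs[best].prio)
            best = (int)i;
    }
    return best;
}
```

A committed job is never displaced, which is why the benefit on a non-compute ring is limited: the high priority job still waits behind whatever is already on the hardware.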
>>>>>>
>>>>>> Future steps:
>>>>>> -------------
>>>>>>
>>>>>> Once we have an approach settled, I can take care of the 
>>>>>> implementation.
>>>>>>
>>>>>> Also, once the interface is mostly decided, we can start thinking 
>>>>>> about
>>>>>> exposing the high priority queue through radv.
>>>>>>
>>>>>> Request for feedback:
>>>>>> ---------------------
>>>>>>
>>>>>> We aren't married to any of the approaches outlined above. Our goal
>>>>>> is to
>>>>>> obtain a mechanism that will allow us to complete the reprojection
>>>>>> job within a
>>>>>> predictable amount of time. So if anyone has any 
>>>>>> suggestions for
>>>>>> improvements or alternative strategies we are more than happy to 
>>>>>> hear
>>>>>> them.
>>>>>>
>>>>>> If any of the technical information above is also incorrect, feel
>>>>>> free to point
>>>>>> out my misunderstandings.
>>>>>>
>>>>>> Looking forward to hearing from you.
>>>>>>
>>>>>> Regards,
>>>>>> Andres
>>>>>>
>>>>>> _______________________________________________
>>>>>> amd-gfx mailing list
>>>>>> amd-gfx at lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


