Re: Questions on Maxwell 2nd Gen Compute Kernels/Shaders

On Mon, Jul 15, 2019 at 2:34 PM Fernando Sahmkow <fsahmkow27@xxxxxxxxx> wrote:
>
> So we have been busy implementing the compute engine lately but we have discovered a few issues with Compute Shaders. I hope you guys can answer some questions.
>
> 1st How do I determine the size of a Compute Shader/Kernel's Local Memory? In Pipeline shaders the size is included in the header, but Compute Kernels don't have a header, so how do I determine how much local memory a kernel uses? If I can't, is there a limit?

From the header :)

https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nve4_compute.h
https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nve4_compute.c#n775

You may also find this interesting:

https://nvidia.github.io/open-gpu-doc/classes/compute/

These docs appeared well after we had already RE'd this, and I don't
think we ever went back to check whether we'd missed anything substantial.

>
> 2nd I backtrack the addresses for LDG from the constbuffer that stores them. I then use these addresses to compute the address in my emulated SSBO. For fragment, geometry and vertex shaders I have no problems with these addresses. For compute shaders the addresses seem to be invalid; I imagine there's a base address that's added to them. Where can I obtain that base address?

I don't think so. Can you show me an instruction stream that suggests
this? I suspect you're misreading the code. Should work the same way
as everywhere, except there are only 8 constbufs total, and so
sometimes the actual constbuf data is also retrieved with LDG.

>
> 3rd The SUATOM instruction's CAS is similar to CompareAndSwap, except it may add 1 or 2 to the data register on store. How do I know when it adds 1 or 2?

Uhm... huh? CAS = compare and swap. The argument order is different
than the one in the API, as I recall, but there's no funny addition
that I'm aware of.
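
For reference, here is a minimal sketch in C of what I mean by plain
compare and swap on a single 32-bit location. The function name and
the parameter order are illustrative assumptions only, not the
hardware operand order (which, as noted, differs from the API):

```c
#include <stdint.h>

/* Reference model of compare-and-swap on one 32-bit location.
 * Hypothetical helper for illustration; not taken from any driver.
 * Returns the value that was in memory before the operation. */
static uint32_t cas_u32(uint32_t *mem, uint32_t compare, uint32_t swap)
{
    uint32_t old = *mem;

    /* The store happens only when memory matches the compare value,
     * and the stored value is exactly `swap`, with nothing added. */
    if (old == compare)
        *mem = swap;

    return old;
}
```

If an emulator needs a "+1" or "+2" fudge to make results match, a
likely cause is that the compare and swap operands are being read
from the wrong registers.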

Now, there is an IADD.PO mode (PO = plus one), which corresponds to
both arguments' neg bits being set, but that's the only such weirdness
I'm aware of.
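
A rough sketch of how an emulator might model that, assuming PO means
"a + b + 1" with neither operand negated. The function name and the
boolean flags are hypothetical, and the exact encoding should be
checked against real instruction streams:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of IADD operand negation; names are illustrative.
 * Assumption: both neg bits set selects the PO (plus one) mode instead
 * of negating both operands. */
static uint32_t iadd_model(uint32_t a, uint32_t b, bool neg_a, bool neg_b)
{
    if (neg_a && neg_b)
        return a + b + 1u;   /* PO mode: plain sum plus one */

    if (neg_a)
        a = 0u - a;          /* two's complement negation of a */
    if (neg_b)
        b = 0u - b;          /* two's complement negation of b */

    return a + b;            /* wraps modulo 2^32 */
}
```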

Cheers,

  -ilia