Hi, Steven! Thanks for such a detailed explanation of T628 peculiarities. On Wed, Jan 12, 2022 at 05:03:15PM +0000, Steven Price wrote: > On 10/01/2022 17:42, Alyssa Rosenzweig wrote: > >> Whether it's worth the effort depends on whether anyone really cares > >> about getting the full performance out of this particular GPU. > >> > >> At this stage I think the main UABI change would be to add the opposite > >> flag to kbase, (e.g. "PANFROST_JD_DOESNT_NEED_COHERENCY_ON_GPU"[1]) to > >> opt-in to allowing the job to run across all cores. > >> > >> The second change would be to allow compute jobs to be run on the second > >> core group, so another flag: PANFROST_RUN_ON_SECOND_CORE_GROUP. > >> > >> But clearly there's little point adding such flags until someone steps > >> up to do the Mesa work. > > > > I worry about the maintainence burden (both Mesa and kernel) of adding > > UABI only used by a piece of hardware none of us own, and only useful > > "sometimes" for that hardware. Doubly so for the second core group > > support; currently Mesa doesn't advertise any compute support on > > anything older than Mali T760 ... to the best of my knowledge, nobody > > has missed that support either... > > I agree there's no point adding the UABI support unless someone is > willing to step and be a maintainer for that hardware. And I suspect no > one cares enough about that hardware to do that. > > > To be clear I am in favour of merging the patches needed for GLES2 to > > work on all Malis, possibly at a performance cost on these dual-core > > systems. That's a far cry from the level of support the DDK gave these > > chips back in the day ... of course, the DDK doesn't support them at all > > anymore, so Panfrost wins there by default! ;) > > > > Agreed - I'm happy to merge a kernel series similar to this. I think the > remaining problems are: > > 1. Addressing Robin's concerns about the first patch. That looks like > it's probably just wrong. The first patch is wrong and I'll drop it. > 2. I think this patch is too complex for the basic support. There's some > parts like checking GROUPS_L2_COHERENT which also don't feature in kbase That part has been adapted from kbase_gpuprops_construct_coherent_groups, see https://github.com/hardkernel/linux/blob/2f0f4268209ddacc2cdea158104b87cedacbd0e3/drivers/gpu/arm/midgard/mali_kbase_gpuprops.c#L94 > so I don't believe are correct. I'm not sure if it's correct or not, however - it does not change anything for GPUs with coherent L2 caches - it appears to correctly figure out core groups for several SoCs with T628 GPU (BE-M1000, Exynos 5422). > 3. I don't think this blocks the change. But if we're not using the > second core group we could actually power it down. Indeed simply not > turning on the L2/shader cores should in theory work (jobs are not > scheduled to cores which are turned off even if they are included in the > affinity). Powering off unused GPU is would be nice, however 1) the userspace might power on those cores again (via sysfs or something), so I prefer to explicitly schedule jobs to the core group 0. 2) on BE-M1000 GPU seems to lock up in a few seconds after powering off some (GPU) cores. In fact I had to disable GPU devfreq to prevent GPU lockups. Therefore I consider powering off unused cores as a later optimization. (frankly speaking I'd better put the effort to *making use* of those cores instead of figuring out why they fail to power down properly). Best regards, Alexey