On Thursday, May 4, 2017 7:47:21 AM PDT David Weinehall wrote:
> On Thu, May 04, 2017 at 10:35:33AM +0200, Arkadiusz Hiler wrote:
> > Thanks for rephrasing - that's exactly what I am concerned with.
> >
> > Did you just use the MediaSDK as it is - meaning that MOCS entries
> > beyond the set of the 3 we have defined had been naively utilized?
> >
> > If that's the case it is probably the cause of the performance
> > difference - everything beyond "the 3" means UNCACHED.
> >
> > Can you try changing MediaSDK to only use entries that are already in?
> > How the performance differs in that case?
>
> We're benchmarking using upstream MediaSDK without changes, since that's
> the only thing that's relevant. Customising benchmarks to get better
> results isn't really an acceptable solution :)
>
> Obviously fixing MediaSDK upstream is a different story, in case one of
> the three pre-defined entries we have turns out to be the best possible
> MOCS-settings for that workload.

You're right about customizing benchmarks, but...

MediaSDK is not a benchmark. If I'm not mistaken, it's a userspace
driver produced by Intel engineers, one which Intel has the full
capability to change. What you're saying is that Intel's MediaSDK
engineers are unwilling to change their software to provide better
performance for their Linux users. That's pretty mental.

We don't warp the core operating system to work around userspace
software simply because they don't want to change it.

This isn't about open vs. closed or internal vs. public projects,
either. I work on a public userspace driver for Intel graphics. If I
sent a kernel patch, the kernel developers would ask me the exact same
questions, to justify my new additions:

1. Is your userspace actually using all these new additions? If not,
   which ones are you using?
   They would ask me to drop anything I wasn't actually using yet,
   because speculatively adding things to the kernel that we have to
   maintain backwards compatibility for has caused both kernel and
   userspace developers a lot of trouble.

2. Are you sure that you need them all? Is there a simpler solution -
   are some existing things good enough? What's the additional benefit
   of each new addition?

I would have to answer these questions to the satisfaction of the
kernel developers before they would even consider taking my patch.

You keep pointing to your large performance improvement, but all it's
shown is that actually using the GPU cache is faster than having a
broken userspace driver explicitly set everything to uncached. Many
people have pointed this out. Arek and Tvrtko have good suggestions.

I don't think you're going to get anywhere with this until you
demonstrate that the new MOCS entries provide some non-zero value over
using the existing WB entry.

Here are a couple more data points:

1. We likely can't implement the documented "MOCS Version 1" table as
   is. The kernel exposes existing entries with specific semantics.
   Changing their meaning would introduce a backwards-incompatible
   change that would likely regress the performance of existing
   userspace. This is almost certainly unacceptable - our customers,
   distro partners, users, and even people like Linus Torvalds will
   suffer and complain loudly.

   We could add the new entries at an offset - i.e. leave the existing
   3 entries, and append the rest after that. But that would require
   changing userspace that assumes the Windows tables, such as MediaSDK
   (they would have to add 3 to their MOCS indexes). At which point,
   we're changing them, so...the "runs unaltered" argument falls over.

2. The docs finally contain "recommended MOCS settings" - i.e. where
   to cache various types of objects, and at what age.
   However, I believe those recommendations can be implemented with
   1-2 new table entries and a PTE change to be eLLC-only by default.
   Most of the table is completely unnecessary to implement the
   recommendations.

   I personally would like to try implementing their recommended
   settings in my driver. I have not had time yet, but plan to try.

I'm very glad to see the Windows MOCS recommendations documented. I'd
been asking for that information for literally years. If we'd gotten
it earlier, a lot of mess could have been avoided.

For future platforms, we may want to coordinate and use the same
table. But Gen9 has been shipping for ages, and we don't have that
luxury.

--Ken
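
Illustration of the "add the new entries at an offset" idea from data point 1 above. This is a minimal Python sketch and is not taken from i915 or MediaSDK. The entry labels, the `NEW_A`/`NEW_B` placeholders, and the helper names `translate_index` and `effective_policy` are all hypothetical. The only facts taken from the thread are that 3 entries already exist, that indexes beyond the defined entries behave as uncached, and that userspace assuming the Windows table would have to add 3 to its indexes.

```python
# Hypothetical model of a MOCS table; labels are illustrative assumptions,
# not the real kernel definitions.
EXISTING_ENTRIES = ["UNCACHED", "PTE", "WB"]   # the 3 entries already exposed
KERNEL_ENTRY_COUNT = len(EXISTING_ENTRIES)     # 3, per the thread

# Placeholder names standing in for entries of a Windows-style table.
WINDOWS_STYLE_ENTRIES = ["NEW_A", "NEW_B"]

# "Leave the existing 3 entries, and append the rest after that."
APPENDED_TABLE = EXISTING_ENTRIES + WINDOWS_STYLE_ENTRIES


def translate_index(windows_index: int) -> int:
    """Index a Windows-table user would need after the append (add 3)."""
    if windows_index < 0:
        raise ValueError("MOCS index must be non-negative")
    return windows_index + KERNEL_ENTRY_COUNT


def effective_policy(table: list, index: int) -> str:
    """Policy seen for an index; anything beyond the table is uncached."""
    if 0 <= index < len(table):
        return table[index]
    return "UNCACHED"


if __name__ == "__main__":
    # Unaltered userspace using Windows index 1 against today's 3-entry
    # table gets whatever sits at index 1, not the entry it expected.
    print(effective_policy(EXISTING_ENTRIES, 1))
    # An index past the defined entries falls back to uncached.
    print(effective_policy(EXISTING_ENTRIES, 4))
    # With the appended table, userspace must translate its index first.
    print(effective_policy(APPENDED_TABLE, translate_index(1)))
```

The sketch only shows why unaltered userspace cannot benefit from an appended table: without the index translation it either hits one of the existing entries with different semantics or falls off the end of the table.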
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx