On Fri, Aug 7, 2015 at 2:56 PM, Toshi Kani <toshi.kani@xxxxxx> wrote: > On Fri, 2015-08-07 at 13:25 -0700, Luis R. Rodriguez wrote: >> On Thu, Aug 6, 2015 at 3:58 PM, Toshi Kani <toshi.kani@xxxxxx> wrote: >> > On Thu, 2015-08-06 at 12:53 -0700, Luis R. Rodriguez wrote: >> > > On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani <toshi.kani@xxxxxx> wrote: >> > > > On Fri, 2015-06-12 at 08:59 +0100, Jan Beulich wrote: >> > > > > > > > On 12.06.15 at 01:23, <toshi.kani@xxxxxx> wrote: >> > > > > > There are two usages on MTRRs: >> > > > > > 1) MTRR entries set by firmware >> > > > > > 2) MTRR entries set by OS drivers >> > > > > > >> > > > > > We can obsolete 2), but we have no control over 1). As UEFI >> > > > > > firmwares also set this up, this usage will continue to stay. >> > > > > > So, we should not get rid of the MTRR code that looks up the >> > > > > > MTRR entries, while we have no need to modify them. >> > > > > > >> > > > > > Such MTRR entries provide safe guard to /dev/mem, which allows >> > > > > > privileged user to access a range that may require UC mapping >> > > > > > while the /dev/mem driver blindly maps it with WB. MTRRs >> > > > > > converts WB to UC in such a case. >> > > > > >> > > > > But it wouldn't be impossible to simply read the MTRRs upon boot, >> > > > > store the information, disable MTRRs, and correctly use PAT to >> > > > > achieve the same effect (i.e. the "blindly maps" part of course >> > > > > would need fixing). >> > > > >> > > > It could be done, but I do not see much benefit of doing it. One of >> > > > the reasons platform vendors set MTRRs is so that a system won't hit >> > > > a machine check when an OS bug leads an access with a wrong cache >> > > > type. >> > > > >> > > > A machine check is hard to analyze and can be seen as a hardware >> > > > issue by customers. Emulating MTRRs with PAT won't protect from >> > > > such a bug. >> > > >> > > That's seems like a fair and valid concern. This could only happen if >> > > the OS would have code that would use MTRR, in the case of Linux we'll >> > > soon be able to vet that this cannot happen. >> > >> > No, there is no OS support necessary to use MTRR. After firmware sets >> > it up, CPUs continue to use it without any OS support. I think the >> > Linux change you are referring is to obsolete legacy interfaces that >> > modify the MTRR setup. I agree that Linux should not modify MTRR. >> >> Its a bit more than that though. Since you agree that the OS can live >> without MTRR code I was hoping to then see if we can fold out PAT >> Linux code from under the MTRR dependency on Linux and make PAT a >> first class citizen, maybe at least for x86-64. Right now you can only >> get PAT support on Linux if you have MTRR code, but I'd like to see if >> instead we can rip MTRR code out completely under its own Kconfig and >> let it start rotting away. >> >> Code-wise the only issue I saw was that PAT code also relies on >> mtrr_type_lookup(), see pat_x_mtrr_type(), but other than this I found >> no other obvious issues. > > We can rip of the MTTR code that modifies the MTRR setup, but not > mtrr_type_lookup(). This function provides necessary checks per documented > in commit 7f0431e3dc89 as follows. > > 1) reserve_memtype() tracks an effective memory type in case > a request type is WB (ex. /dev/mem blindly uses WB). Missing > to track with its effective type causes a subsequent request > to map the same range with the effective type to fail. > > 2) pud_set_huge() and pmd_set_huge() check if a requested range > has any overlap with MTRRs. Missing to detect an overlap may > cause a performance penalty or undefined behavior. > > mtrr_type_lookup() is still admittedly awkward, but I do not think we have > an immediate issue in PAT code calling it. I do not think it makes PAT code > a second class citizen. OK since we know that if MTRR set up code ends up disabled and would return MTRR_TYPE_INVALID what if we just static inline this for the no-MTRR Kconfig build option immediately, and only then have the full blown implementation for the case where MTRR Kconfig option is enabled? >> Platform firmware and SMIs seems to be the only other possible issue. >> More on this below. >> >> > > For those type of OSes... >> > > could it be possible to negotiate or hint to the platform through an >> > > attribute somehow that the OS has such capability to not use MTRR? >> > >> > The OS can disable MTRR. However, this can also cause a problem in >> > firmware, which may rely on MTRR. >> >> Can you describe what type of issues we could expect ? I tend to care >> more about this for 64-bit systems so if 32-bit platforms would be >> more of the ones which could cause an issue would restricting >> disabling MTRR only for 64-bit help? > > The SMI handler runs in real-mode and relies on MTRR being effective to > provide right cache types. It does not matter if it is 64-bit or not. I see... since I have no visibility to what goes under the hood, can you provide one example use case where an SMI handler would require getting a cache type through MTRR ? I realize this can vary, vendor by vendor, but any example would do just to satisfy my curiosity. >> > > Then, only if this bit is set, the platform could then avoid such MTRR >> > > settings, and if we have issues you can throw rocks at us. >> > >> > > And if that's not possible how about a new platform setting that would >> > > need to be set at the platform level to enable disabling this junk? >> > > Then only folks who know what they are doing would enable it, and if >> > > the customer set it, the issue would not be on the platform. >> > >> > > Could this also be used to prevent SMIs with MTRRs? >> > >> > ACPI _OSI could be used for firmware to implement some OS-specific >> > features, but it may be too late for firmware to make major changes and >> > is generally useless unless OS requirements are described in a spec >> > backed by logo certification. >> >> I see.. So there are no guarantees that platform firmware would not >> expect OS MTRR support. >> >> > SMIs are also used for platform management, such as fan >> > speed control. >> >> And its conceivable that some devices, or the platform itself, may >> trigger SMIs to have the platform firmware poke with MTRRs? > > SMIs are outside of OS control. SMI handler relies on MTRR being set. SMI > must be quick, so the handler should not be required to initialize MTRR or > page tables. Right makes sense. >> > Is there any issue for Linux to use MTRR set by firmware? >> >> Even though we don't have the Kconfig option right now to disable MTRR >> cod explicitly I'll note that there are a few other cases that could >> flip Linux to note use MTRR: >> >> a) Some BIOSes could let MTRR get disabled >> b) As of Xen 4.4, the hypervisor disables X86_FEATURE_MTRR which >> disables MTRR on Linux >> >> If these environments can exist it'd be good to understand possible >> issues that could creep up as a result of the OS not having MTRR >> enabled. If this is a reasonable thing for x86-64 I was hoping we >> could just let users opt-in to a similar build configuration through >> the OS by letting PAT not depend on MTRR. > > Case a) and b) do not cause any issue. They simply lead mtrr_type_lookup() > to return immediately with MTRR_TYPE_INVALID (i.e. MTRR disable), and the > callers handle this value properly. These cases are only problematic when > the OS tries to modify MTRR. OK if the OS returns MTRR_TYPE_INVALID, for folks who do not want MTRR code on their kernel, we should be OK? Luis -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html