Hi James,

Thanks a lot for your detailed comments.

CC Peter. Peter is a Qemu expert. Let us see his suggestion.

> Hi gengdongjiu,
>
> On 05/09/17 08:18, gengdongjiu wrote:
> > On 2017/9/1 2:04, James Morse wrote:
> >> On 28/08/17 11:38, Dongjiu Geng wrote:
> >>> Userspace will want to check if the CPU has the RAS extension.
> >>
> >> ... but user-space wants to know if it can inject SErrors with a
> >> specified ESR.
> >>
> >> What if we gain another way of doing this that isn't via the
> >> RAS-extensions, now user-space has to check for two capabilities.
> >>
> >>
> >>> If it has, it will specify the virtual SError syndrome value,
> >>> otherwise it will not be set. This patch adds support for querying
> >>> the availability of this extension.
> >>
> >> I'm against telling user-space what features the CPU has unless it
> >> can use them directly. In this case we are talking about a KVM API,
> >> so we should describe the API not the CPU.
> >
> > shenglong (zhaoshenglong@xxxxxxxxxx), who is a Qemu maintainer, suggested
> > checking the CPU RAS-extensions to decide whether to generate the APEI
> > table and record CPER for the guest OS in user space.
> > He means that if the host does not support RAS, user space may also not
> > support RAS.
>
> The code to signal memory-failure to user-space doesn't depend on the CPU's
> RAS-extensions.
>
> If Qemu supports notifying the guest about RAS errors using CPER records, it
> should generate a HEST describing firmware first. It can then choose the
> notification methods, some of which may require optional KVM APIs to support.
>
> Seattle has a HEST, it doesn't support the CPU RAS-extensions. The kernel can
> notify user-space about memory_failure() on this machine. I would expect Qemu
> to be able to receive signals and describe memory errors to a guest (1).
>
> The question should be: 'How can Qemu know it can use SEI as a firmware-first
> notification?' It needs a KVM API to trigger an SError in the guest with a
> specified ESR.
> The name of the KVM CAP needs to reflect the API (2).
>
> Just because this is the first KVM API that needs the CPU to have the RAS
> extensions doesn't mean we should call it 'has RAS' and be done with it.
>
> We will eventually need another KVM API to configure trapping and emulating
> values in the RAS ERR registers so that Qemu can emulate a machine without
> firmware-first. (This is likely to be a page of memory that backs the
> registers, there will need to be another KVM CAP to describe this
> support (3)).
>
>
> Exposing the CPU's support for RAS-extensions to support (2) means having
> per-platform support for (1). This is either creating extra work, or not
> supporting as many platforms as we could. Both are bad.
> Once we have (3) as well, any developer needs to know that 'has RAS' just
> meant the first API KVM implemented using RAS, and doesn't mean later APIs
> also using RAS are supported by the kernel.

Hi Peter/shenglong,

What do you think about this? We may need to consult with you about it.

>
> Thanks,
>
> James