Adding Paolo. On 11/12/24 04:09, Maksim Davydov wrote: > > > On 11/8/24 23:56, Moger, Babu wrote: >> Hi Maxim, >> >> Thanks for looking into this. I will fix the bits I mentioned below in >> upcoming Genoa/Turin model update. >> >> I have few comments below. >> >> On 11/8/2024 12:15 PM, Maksim Davydov wrote: >>> Hi! >>> I compared EPYC-Genoa CPU model with CPUID output from real EPYC Genoa >>> host. I found some mismatches that confused me. Could you help me to >>> understand them? >>> >>> On 5/4/23 23:53, Babu Moger wrote: >>>> Adds the support for AMD EPYC Genoa generation processors. The model >>>> display for the new processor will be EPYC-Genoa. >>>> >>>> Adds the following new feature bits on top of the feature bits from >>>> the previous generation EPYC models. >>>> >>>> avx512f : AVX-512 Foundation instruction >>>> avx512dq : AVX-512 Doubleword & Quadword Instruction >>>> avx512ifma : AVX-512 Integer Fused Multiply Add instruction >>>> avx512cd : AVX-512 Conflict Detection instruction >>>> avx512bw : AVX-512 Byte and Word Instructions >>>> avx512vl : AVX-512 Vector Length Extension Instructions >>>> avx512vbmi : AVX-512 Vector Byte Manipulation Instruction >>>> avx512_vbmi2 : AVX-512 Additional Vector Byte Manipulation Instruction >>>> gfni : AVX-512 Galois Field New Instructions >>>> avx512_vnni : AVX-512 Vector Neural Network Instructions >>>> avx512_bitalg : AVX-512 Bit Algorithms, add bit algorithms Instructions >>>> avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and >>>> Quadword Instructions >>>> avx512_bf16 : AVX-512 BFLOAT16 instructions >>>> la57 : 57-bit virtual address support (5-level Page Tables) >>>> vnmi : Virtual NMI (VNMI) allows the hypervisor to inject >>>> the NMI >>>> into the guest without using Event Injection mechanism >>>> meaning not required to track the guest NMI and >>>> intercepting >>>> the IRET. >>>> auto-ibrs : The AMD Zen4 core supports a new feature called >>>> Automatic IBRS. >>>> It is a "set-and-forget" feature that means that, >>>> unlike e.g., >>>> s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS >>>> mitigation >>>> resources automatically across CPL transitions. >>>> >>>> Signed-off-by: Babu Moger <babu.moger@xxxxxxx> >>>> --- >>>> target/i386/cpu.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++ >>>> 1 file changed, 122 insertions(+) >>>> >>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c >>>> index d50ace84bf..71fe1e02ee 100644 >>>> --- a/target/i386/cpu.c >>>> +++ b/target/i386/cpu.c >>>> @@ -1973,6 +1973,56 @@ static const CPUCaches epyc_milan_v2_cache_info >>>> = { >>>> }, >>>> }; >>>> +static const CPUCaches epyc_genoa_cache_info = { >>>> + .l1d_cache = &(CPUCacheInfo) { >>>> + .type = DATA_CACHE, >>>> + .level = 1, >>>> + .size = 32 * KiB, >>>> + .line_size = 64, >>>> + .associativity = 8, >>>> + .partitions = 1, >>>> + .sets = 64, >>>> + .lines_per_tag = 1, >>>> + .self_init = 1, >>>> + .no_invd_sharing = true, >>>> + }, >>>> + .l1i_cache = &(CPUCacheInfo) { >>>> + .type = INSTRUCTION_CACHE, >>>> + .level = 1, >>>> + .size = 32 * KiB, >>>> + .line_size = 64, >>>> + .associativity = 8, >>>> + .partitions = 1, >>>> + .sets = 64, >>>> + .lines_per_tag = 1, >>>> + .self_init = 1, >>>> + .no_invd_sharing = true, >>>> + }, >>>> + .l2_cache = &(CPUCacheInfo) { >>>> + .type = UNIFIED_CACHE, >>>> + .level = 2, >>>> + .size = 1 * MiB, >>>> + .line_size = 64, >>>> + .associativity = 8, >>>> + .partitions = 1, >>>> + .sets = 2048, >>>> + .lines_per_tag = 1, >>> >>> 1. Why L2 cache is not shown as inclusive and self-initializing? >>> >>> PPR for AMD Family 19h Model 11 says for L2 (0x8000001d): >>> * cache inclusive. Read-only. Reset: Fixed,1. >>> * cache is self-initializing. Read-only. Reset: Fixed,1. >> >> Yes. That is correct. This needs to be fixed. I Will fix it. >>> >>>> + }, >>>> + .l3_cache = &(CPUCacheInfo) { >>>> + .type = UNIFIED_CACHE, >>>> + .level = 3, >>>> + .size = 32 * MiB, >>>> + .line_size = 64, >>>> + .associativity = 16, >>>> + .partitions = 1, >>>> + .sets = 32768, >>>> + .lines_per_tag = 1, >>>> + .self_init = true, >>>> + .inclusive = true, >>>> + .complex_indexing = false, >>> >>> 2. Why L3 cache is shown as inclusive? Why is it not shown in L3 that >>> the WBINVD/INVD instruction is not guaranteed to invalidate all lower >>> level caches (0 bit)? >>> >>> PPR for AMD Family 19h Model 11 says for L2 (0x8000001d): >>> * cache inclusive. Read-only. Reset: Fixed,0. >>> * Write-Back Invalidate/Invalidate. Read-only. Reset: Fixed,1. >>> >> >> Yes. Both of this needs to be fixed. I Will fix it. >> >>> >>> >>> 3. Why the default stub is used for TLB, but not real values as for >>> other caches? >> >> Can you please eloberate on this? >> > > For L1i, L1d, L2 and L3 cache we provide the correct information about > characteristics. In contrast, for L1i TLB, L1d TLB, L2i TLB and L2d TLB > (0x80000005 and 0x80000006) we use the same value for all CPU models. > Sometimes it seems strange. For instance, the current default value in > QEMU for L2 TLB associativity for 4 KB pages is 4. But 4 is a reserved > value for Genoa (as PPR for Family 19h Model 11h says) > >>> >>>> + }, >>>> +}; >>>> + >>>> /* The following VMX features are not supported by KVM and are left >>>> out in the >>>> * CPU definitions: >>>> * >>>> @@ -4472,6 +4522,78 @@ static const X86CPUDefinition >>>> builtin_x86_defs[] = { >>>> { /* end of list */ } >>>> } >>>> }, >>>> + { >>>> + .name = "EPYC-Genoa", >>>> + .level = 0xd, >>>> + .vendor = CPUID_VENDOR_AMD, >>>> + .family = 25, >>>> + .model = 17, >>>> + .stepping = 0, >>>> + .features[FEAT_1_EDX] = >>>> + CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX | >>>> CPUID_CLFLUSH | >>>> + CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA | >>>> CPUID_PGE | >>>> + CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 | >>>> CPUID_MCE | >>>> + CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE | >>>> + CPUID_VME | CPUID_FP87, >>>> + .features[FEAT_1_ECX] = >>>> + CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX | >>>> + CPUID_EXT_XSAVE | CPUID_EXT_AES | CPUID_EXT_POPCNT | >>>> + CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 | >>>> + CPUID_EXT_PCID | CPUID_EXT_CX16 | CPUID_EXT_FMA | >>>> + CPUID_EXT_SSSE3 | CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ | >>>> + CPUID_EXT_SSE3, >>>> + .features[FEAT_8000_0001_EDX] = >>>> + CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB | >>>> + CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX | >>>> + CPUID_EXT2_SYSCALL, >>>> + .features[FEAT_8000_0001_ECX] = >>>> + CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH | >>>> + CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM | >>>> + CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM | >>>> + CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE, >>>> + .features[FEAT_8000_0008_EBX] = >>>> + CPUID_8000_0008_EBX_CLZERO | >>>> CPUID_8000_0008_EBX_XSAVEERPTR | >>>> + CPUID_8000_0008_EBX_WBNOINVD | CPUID_8000_0008_EBX_IBPB | >>>> + CPUID_8000_0008_EBX_IBRS | CPUID_8000_0008_EBX_STIBP | >>>> + CPUID_8000_0008_EBX_STIBP_ALWAYS_ON | >>>> + CPUID_8000_0008_EBX_AMD_SSBD | CPUID_8000_0008_EBX_AMD_PSFD, >>> >>> 4. Why 0x80000008_EBX features related to speculation vulnerabilities >>> (BTC_NO, IBPB_RET, IbrsPreferred, INT_WBINVD) are not set? >> >> KVM does not expose these bits to the guests yet. >> >> I normally check using the ioctl KVM_GET_SUPPORTED_CPUID. >> > > I'm not sure, but at least the first two of these features seem to be > helpful to choose the appropriate mitigation. Do you think that we should > add them to KVM? > >> >>> >>>> + .features[FEAT_8000_0021_EAX] = >>>> + CPUID_8000_0021_EAX_No_NESTED_DATA_BP | >>>> + CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING | >>>> + CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE | >>>> + CPUID_8000_0021_EAX_AUTO_IBRS, >>> >>> 5. Why some 0x80000021_EAX features are not set? >>> (FsGsKernelGsBaseNonSerializing, FSRC and FSRS) >> >> KVM does not expose FSRC and FSRS bits to the guests yet. > > But KVM exposes the same features (0x7 ecx=1, bits 10 and 11) for Intel > CPU models. Do we have to add these bits for AMD to KVM? > >> >> The KVM reports the bit FsGsKernelGsBaseNonSerializing. I will check if >> we can add this bit to the Genoa and Turin. >> >>> >>>> + .features[FEAT_7_0_EBX] = >>>> + CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | >>>> CPUID_7_0_EBX_AVX2 | >>>> + CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | >>>> CPUID_7_0_EBX_ERMS | >>>> + CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_AVX512F | >>>> + CPUID_7_0_EBX_AVX512DQ | CPUID_7_0_EBX_RDSEED | >>>> CPUID_7_0_EBX_ADX | >>>> + CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_AVX512IFMA | >>>> + CPUID_7_0_EBX_CLFLUSHOPT | CPUID_7_0_EBX_CLWB | >>>> + CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI | >>>> + CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL, >>>> + .features[FEAT_7_0_ECX] = >>>> + CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | >>>> CPUID_7_0_ECX_PKU | >>>> + CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI | >>>> + CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ | >>>> + CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG | >>>> + CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 | >>>> + CPUID_7_0_ECX_RDPID, >>>> + .features[FEAT_7_0_EDX] = >>>> + CPUID_7_0_EDX_FSRM, >>> >>> 6. Why L1D_FLUSH is not set? Because only vulnerable MMIO stale data >>> processors have to use it, am I right? >> >> KVM does not expose L1D_FLUSH to the guests. Not sure why. Need to >> investigate. >> > > It seems that KVM has exposed L1D_FLUSH since da3db168fb67 Paolo, I see L1D_FLUSH feature bit is masked for SEV-SNP guest. https://lore.kernel.org/qemu-devel/20240704095806.1780273-17-pbonzini@xxxxxxxxxx/ Any reason for this? Genoa and Turin hardware supports L1D_FLUSH feature. I tested commenting the masking line and SEV-SNP guest boots fine. diff --git a/target/i386/sev.c b/target/i386/sev.c index a0d271f898..f5cc37bcc7 100644 --- a/target/i386/sev.c +++ b/target/i386/sev.c @@ -962,7 +962,7 @@ sev_snp_mask_cpuid_features(X86ConfidentialGuest *cg, uint32_t feature, uint32_t if (index == 0 && reg == R_EDX) { return value & ~(CPUID_7_0_EDX_SPEC_CTRL | CPUID_7_0_EDX_STIBP | - CPUID_7_0_EDX_FLUSH_L1D | + //CPUID_7_0_EDX_FLUSH_L1D | CPUID_7_0_EDX_ARCH_CAPABILITIES | CPUID_7_0_EDX_CORE_CAPABILITY | CPUID_7_0_EDX_SPEC_CTRL_SSBD); If there are no objections, I can add this patch in the Turin series. Thanks -- Thanks Babu Moger