On Tue, Dec 5, 2023 at 11:34 PM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2023/12/5 22:23, Juhyung Park wrote:
> > Hi Gao,
> >
> > On Tue, Dec 5, 2023 at 4:32 PM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
> >>
> >> Hi Juhyung,
> >>
> >> On 2023/12/4 11:41, Juhyung Park wrote:
> >>
> >> ...
> >>>
> >>>>
> >>>> - Could you share the full message about the output of `lscpu`?
> >>>
> >>> Sure:
> >>>
> >>> Architecture: x86_64
> >>> CPU op-mode(s): 32-bit, 64-bit
> >>> Address sizes: 39 bits physical, 48 bits virtual
> >>> Byte Order: Little Endian
> >>> CPU(s): 8
> >>> On-line CPU(s) list: 0-7
> >>> Vendor ID: GenuineIntel
> >>> BIOS Vendor ID: Intel(R) Corporation
> >>> Model name: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
> >>> BIOS Model name: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz None CPU
> >>> @ 3.0GHz
> >>> BIOS CPU family: 198
> >>> CPU family: 6
> >>> Model: 140
> >>> Thread(s) per core: 2
> >>> Core(s) per socket: 4
> >>> Socket(s): 1
> >>> Stepping: 1
> >>> CPU(s) scaling MHz: 60%
> >>> CPU max MHz: 4800.0000
> >>> CPU min MHz: 400.0000
> >>> BogoMIPS: 5990.40
> >>> Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mc
> >>> a cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss
> >>> ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art
> >>> arch_perfmon pebs bts rep_good nopl xtopology nonstop_
> >>> tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes6
> >>> 4 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xt
> >>> pr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_dead
> >>> line_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowp
> >>> refetch cpuid_fault epb cat_l2 cdp_l2 ssbd ibrs ibpb st
> >>> ibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_
> >>> ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
> >>> rdt_a avx512f avx512dq rdseed adx smap avx512ifma clfl
> >>> ushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl
> >>> xsaveopt xsavec xgetbv1 xsaves split_lock_detect dtherm
> >>> ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
> >>> hwp_pkg_req vnmi avx512vbmi umip pku ospke avx512_vbmi
> >>> 2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme av
> >>> x512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2i
> >>
> >> Sigh, I've been thinking. Here FSRM is the most significant difference between
> >> our environments, could you only try the following diff to see if there's any
> >> difference anymore? (without the previous disable patch.)
> >>
> >> diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
> >> index 1b60ae81ecd8..1b52a913233c 100644
> >> --- a/arch/x86/lib/memmove_64.S
> >> +++ b/arch/x86/lib/memmove_64.S
> >> @@ -41,9 +41,7 @@ SYM_FUNC_START(__memmove)
> >>  #define CHECK_LEN cmp $0x20, %rdx; jb 1f
> >>  #define MEMMOVE_BYTES movq %rdx, %rcx; rep movsb; RET
> >>  .Lmemmove_begin_forward:
> >> -       ALTERNATIVE_2 __stringify(CHECK_LEN), \
> >> -                     __stringify(CHECK_LEN; MEMMOVE_BYTES), X86_FEATURE_ERMS, \
> >> -                     __stringify(MEMMOVE_BYTES), X86_FEATURE_FSRM
> >> +       CHECK_LEN
> >>
> >>  /*
> >>   * movsq instruction have many startup latency
> >
> > Yup, that also seems to fix it.
> > Are we looking at a potential memmove issue?
>
> I'm still analyzing this behavior as well as the root cause and
> I will also try to get a recent cloud server with FSRM myself
> to find more clues.

Down the rabbit hole we go...

Let me know if you have trouble getting an instance with FSRM.
I'll see what I can do.

>
> Thanks,
> Gao Xiang
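
For anyone checking whether a candidate machine or cloud instance actually exposes FSRM, a minimal sketch follows. It assumes the x86 Linux /proc/cpuinfo layout, where a "flags" line carries the same feature list that lscpu prints above. The helper names cpu_flags and has_fsrm are made up for illustration and are not part of any existing tool.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text.

    Assumption: x86 Linux format, i.e. a line of the form
    "flags : fpu vme ...". Only the first such line is read; every
    logical CPU normally reports the same list.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # "vmx flags" and similar keys must not match, so compare exactly.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_fsrm(cpuinfo_text):
    """True if the CPU advertises Fast Short REP MOVSB (the "fsrm" flag)."""
    return "fsrm" in cpu_flags(cpuinfo_text)
```

On a live system this would be called as has_fsrm(open("/proc/cpuinfo").read()); the same check with "erms" in place of "fsrm" covers the other feature named in the diff.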