On Wed, Oct 25, 2023 at 11:13:46PM +0100, Andrew Cooper wrote:
> On 25/10/2023 11:07 pm, Pawan Gupta wrote:
> > On Wed, Oct 25, 2023 at 10:10:41PM +0100, Andrew Cooper wrote:
> >>> +.align L1_CACHE_BYTES, 0xcc
> >>> +SYM_CODE_START_NOALIGN(mds_verw_sel)
> >>> +	UNWIND_HINT_UNDEFINED
> >>> +	ANNOTATE_NOENDBR
> >>> +	.word __KERNEL_DS
> >> You need another .align here. Otherwise subsequent code will still
> >> start in this cacheline and defeat the purpose of trying to keep it
> >> separate.
> > Right.
> >
> >>> +SYM_CODE_END(mds_verw_sel);
> >> Thinking about it, should this really be CODE and not a data entry?
> > Would that require adding a data equivalent of .entry.text and update
> > KPTI to keep it mapped? Or is there an easier option?
>
> Leave it right here in .entry.text , but try using SYM_DATA() and
> friends. See whether objtool vomits over the result or not.

objtool still complains when using SYM_DATA*() without the annotations:

  vmlinux.o: warning: objtool: mds_verw_sel+0x0: unreachable instruction
  vmlinux.o: warning: objtool: .altinstr_replacement+0x2c: relocation to !ENDBR: mds_verw_sel+0x0

> And if objtool does vomit over the result, then leaving it as it is in
> this patch with SYM_CODE() is good enough.

Settling on SYM_CODE().

On the bright side, I am seeing even better perf with the VERW operand
out-of-line:

Baseline: v6.6-rc5

| Test               | Configuration          | v1   | v3   |
| ------------------ | ---------------------- | ---- | ---- |
| build-linux-kernel | defconfig              | 1.00 | 1.00 |
| hackbench          | 32 - Process           | 1.02 | 1.06 |
| nginx              | Short Connection - 500 | 1.01 | 1.04 |

Disclaimer: These are collected by a stupid dev who knows nothing about
perf, please take this with a grain of salt.

I will be sending v4 soon.