Re: [PATCH v4 02/27] hardirq/nmi: Allow nested nmi_enter()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 25, 2020 at 04:41:11PM +0100, Peter Zijlstra wrote:
> On Tue, Feb 25, 2020 at 04:09:06AM +0100, Frederic Weisbecker wrote:
> > On Mon, Feb 24, 2020 at 05:13:18PM +0100, Peter Zijlstra wrote:
> 
> > > +#define arch_nmi_enter()						\
> > > +do {									\
> > > +	struct nmi_ctx *___ctx;						\
> > > +	unsigned int ___cnt;						\
> > > +									\
> > > +	if (!is_kernel_in_hyp_mode() || in_nmi())			\
> > > +		break;							\
> > > +									\
> > > +	___ctx = this_cpu_ptr(&nmi_contexts);				\
> > > +	___cnt = ___ctx->cnt;						\
> > > +	if (!(___cnt & 1) && __cnt) {					\
> > > +		___ctx->cnt += 2;					\
> > > +		break;							\
> > > +	}								\
> > > +									\
> > > +	___ctx->cnt |= 1;						\
> > > +	barrier();							\
> > > +	nmi_ctx->hcr = read_sysreg(hcr_el2);				\
> > > +	if (!(nmi_ctx->hcr & HCR_TGE)) {				\
> > > +		write_sysreg(nmi_ctx->hcr | HCR_TGE, hcr_el2);		\
> > > +		isb();							\
> > > +	}								\
> > > +	barrier();							\
> > 
> > Suppose the first NMI is interrupted here. nmi_ctx->hcr has HCR_TGE unset.
> > The new NMI is going to overwrite nmi_ctx->hcr with HCR_TGE set. Then the
> > first NMI will not restore the correct value upon arch_nmi_exit().
> > 
> > So perhaps the below, but I bet I overlooked something obvious.
> 
> Well, none of this is obvious :/
> 
> The basic idea was that the LSB signifies 'pending/in-progress' and when
> that is set, nobody else touches no nothing. Enter will unconditionally
> (re) write_sysreg(), exit will nothing.

So here is my previous proposal, based on a simple counter, this time
with comments and a few fixes:

#define arch_nmi_enter()						\
do {									\
	struct nmi_ctx *___ctx;						\
	u64 ___hcr;							\
									\
	if (!is_kernel_in_hyp_mode())					\
		break;							\
									\
	___ctx = this_cpu_ptr(&nmi_contexts);				\
	if (___ctx->cnt) {						\
		___ctx->cnt++;						\
		break;							\
	}								\
									\
	___hcr = read_sysreg(hcr_el2);					\
	if (!(___hcr & HCR_TGE)) {					\
		write_sysreg(___hcr | HCR_TGE, hcr_el2);		\
		isb();							\
	}								\
	/*								\
	 * Make sure the sysreg write is performed before ___ctx->cnt	\
	 * is set to 1. NMIs that see cnt == 1 will rely on us.		\
	 */								\
	barrier();							\
	___ctx->cnt = 1;                                                \
	/*								\
	 * Make sure ___ctx->cnt is set before we save ___hcr. We	\
	 * don't want ___ctx->hcr to be overwritten.			\
	 */								\
	barrier();							\
	___ctx->hcr = ___hcr;						\
} while (0)

#define arch_nmi_exit()							\
do {									\
	struct nmi_ctx *___ctx;						\
	u64 ___hcr;							\
									\
	if (!is_kernel_in_hyp_mode())					\
		break;							\
									\
	___ctx = this_cpu_ptr(&nmi_contexts);				\
	___hcr = ___ctx->hcr;						\
	/*								\
	 * Make sure we read ___ctx->hcr before we release		\
	 * ___ctx->cnt as it makes ___ctx->hcr updatable again.		\
	 */								\
	barrier();							\
	___ctx->cnt--;							\
	/*								\
	 * Make sure ___ctx->cnt release is visible before we		\
	 * restore the sysreg. Otherwise a new NMI occuring		\
	 * right after write_sysreg() can be fooled and think		\
	 * we secured things for it.					\
	 */								\
	barrier();							\
	if (!___ctx->cnt && !(___hcr & HCR_TGE))			\
            write_sysreg(___hcr, hcr_el2);				\
} while (0)



[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux