[PATCH] arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling

Punit Agrawal punit.agrawal at arm.com
Fri Feb 3 08:17:17 PST 2017


Tyler Baicar <tbaicar at codeaurora.org> writes:

> From: "Jonathan (Zhixiong) Zhang" <zjzhang at codeaurora.org>
>
> Add VM_FAULT_HWPOISON[_LARGE] handling to the arm64 page fault
> handler. Handling of VM_FAULT_HWPOISON[_LARGE] is very similar
> to VM_FAULT_OOM, the only difference is that a different si_code
> (BUS_MCEERR_AR) is passed to user space and si_addr_lsb field is
> initialized.
>
> Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang at codeaurora.org>
> Signed-off-by: Tyler Baicar <tbaicar at codeaurora.org>
> ---
>  arch/arm64/mm/fault.c | 31 +++++++++++++++++++++++++++----
>  1 file changed, 27 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c

[...]

> @@ -426,7 +439,17 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
>  		 */
>  		sig = SIGBUS;
>  		code = BUS_ADRERR;
> -	} else {
> +	}
> +#ifdef CONFIG_MEMORY_FAILURE
> +	else if (fault & (VM_FAULT_HWPOISON|VM_FAULT_HWPOISON_LARGE)) {

Please add spaces around '|'.

> +		pr_err(
> +	"Killing %s:%d due to hardware memory corruption fault at %lx\n",
> +			tsk->comm, tsk->pid, addr);

The message is misleading: we're not really killing the task but
delivering a signal (SIGBUS), which might not always lead to the
receiver being killed.

But considering that we don't print any message for the other faults,
I'd prefer that we drop this pr_err.
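
To illustrate why "Killing" is too strong, here is a minimal userspace
sketch of how a receiver might inspect the signal instead of dying. The
names decode_sigbus() and struct poison_info are hypothetical and not
part of the patch; the sketch assumes a Linux libc that exposes
BUS_MCEERR_AR and si_addr_lsb.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Fallback for libcs that do not provide the macro (value as on Linux). */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif

/* Hypothetical summary of a hardware memory error report. */
struct poison_info {
	int is_hwpoison;	/* 1 if this is a BUS_MCEERR_AR SIGBUS */
	void *addr;		/* faulting address from si_addr */
	size_t span;		/* bytes affected, derived from si_addr_lsb */
};

/* Decode the siginfo_t a SA_SIGINFO handler receives for SIGBUS. */
static struct poison_info decode_sigbus(const siginfo_t *info)
{
	struct poison_info out = { 0, NULL, 0 };

	assert(info != NULL);

	if (info->si_signo == SIGBUS && info->si_code == BUS_MCEERR_AR) {
		out.is_hwpoison = 1;
		out.addr = info->si_addr;
		/* si_addr_lsb is the log2 of the corrupted extent. */
		out.span = (size_t)1 << info->si_addr_lsb;
	}
	return out;
}
```

A handler installed with sigaction() and SA_SIGINFO could call this on
its siginfo_t argument and, for example, discard and re-create the
affected mapping rather than exit.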

> +		sig = SIGBUS;
> +		code = BUS_MCEERR_AR;
> +	}
> +#endif

Although a HWPOISON fault can only occur with CONFIG_MEMORY_FAILURE
enabled, the handling looks safe even when it is disabled. Can the
#ifdef be dropped?
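
As a rough sketch of what I have in mind, the selection logic without
the #ifdef could be modelled in userspace like this. The flag values are
illustrative stand-ins, pick_signal() and its access_error parameter are
hypothetical, and the sketch assumes the VM_FAULT_HWPOISON* flags and
BUS_MCEERR_AR are visible regardless of CONFIG_MEMORY_FAILURE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>

/* Fallback for libcs that do not provide the macro (value as on Linux). */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif

/* Stand-ins for the kernel fault flags; values are illustrative only. */
#define VM_FAULT_SIGBUS		0x0002u
#define VM_FAULT_HWPOISON	0x0010u
#define VM_FAULT_HWPOISON_LARGE	0x0020u

struct sig_choice {
	int sig;
	int code;
};

/*
 * Model of the if/else chain in do_page_fault(). access_error stands
 * in for the handler's own bad-access check.
 */
static struct sig_choice pick_signal(unsigned int fault, int access_error)
{
	struct sig_choice c;

	/* Only the error path reaches this point. */
	assert(fault != 0);

	if (fault & VM_FAULT_SIGBUS) {
		c.sig = SIGBUS;
		c.code = BUS_ADRERR;
	} else if (fault & (VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE)) {
		/* Never taken if nothing sets the flags, so no #ifdef. */
		c.sig = SIGBUS;
		c.code = BUS_MCEERR_AR;
	} else {
		c.sig = SIGSEGV;
		c.code = access_error ? SEGV_ACCERR : SEGV_MAPERR;
	}
	return c;
}
```

With memory failure handling disabled, nothing would set the HWPOISON
bits, so the middle branch is simply dead code and the chain stays a
single readable if/else.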

Also, how was this code tested? Did you by any chance try the
hwpoison-inject debugfs interface?

Thanks,
Punit

> +	else {
>  		/*
>  		 * Something tried to access memory that isn't in our memory
>  		 * map.
> @@ -436,7 +459,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
>  			SEGV_ACCERR : SEGV_MAPERR;
>  	}
>  
> -	__do_user_fault(tsk, addr, esr, sig, code, regs);
> +	__do_user_fault(tsk, addr, esr, sig, code, regs, fault);
>  	return 0;
>  
>  no_context:


