答复: [PATCH,v2] arm64: fix the illegal address access in some cases

Guodeqing (A) geffrey.guo at huawei.com
Wed Jul 29 03:05:09 EDT 2020



> -----邮件原件-----
> 发件人: Will Deacon [mailto:will at kernel.org]
> 发送时间: Tuesday, July 28, 2020 23:35
> 收件人: Robin Murphy <robin.murphy at arm.com>
> 抄送: catalin.marinas at arm.com; Guodeqing (A) <geffrey.guo at huawei.com>;
> kernel-team at android.com; linux-arm-kernel at lists.infradead.org;
> luke.starrett at broadcom.com
> 主题: Re: [PATCH,v2] arm64: fix the illegal address access in some cases
> 
> Hi Robin,
> 
> On Tue, Jul 28, 2020 at 03:30:50PM +0100, Robin Murphy wrote:
> > On 2020-07-28 14:03, Will Deacon wrote:
> > > On Sat, 25 Jul 2020 10:08:06 +0800, guodeqing wrote:
> > > > The ihl value of ip header is smaller than 5 in some cases, if the
> > > > ihl value is smaller than 5, then the next code will access the
> > > > illegal address, and the system will panic. ip_fast_csum() must be
> > > > able to handle any value that could fit in the ihl field of the ip protocol
> header.
> > > >
> > > > Here I add the check of the ihl value to solve this problem.
> > >
> > > Applied to arm64 (for-next/fixes), thanks!
> > >
> > > [1/1] arm64: csum: Reject IP headers with 'ihl' field smaller than five
> > >        https://git.kernel.org/arm64/c/09aaef1c5f50
> >
> > I'm not sure your commit message is entirely right there. AFAICS it's
> > not "the same way as x86" at all - x86 dereferences the first word of
> > iph and returns that as the sum if ihl <= 4 (and thus is still capable
> > of crashing given sufficiently bogus data). I'm not sure where "return
> > 1" came from - if we're going to return nonsense then the mildly more
> > efficient choice of 0 seems just as good.
> 
> Argh, yes, that's %1 not $1, so I don't know where the 1 comes from either.
> Geffrey?
> 

The return 1 is just the report of ip checksum error, the return value 0 means the ip
checksum correct. x86 dereferences the first word of iph and returns that as the sum,
this may be just the report of ip checksum error too.

> > Otherwise it would seem reasonable to jump straight into the
> > word-at-a-time loop if ip_fast_csum() is really expected to cope with
> > more than just genuine IP headers (which should be backed by at least
> > 20 bytes of valid memory regardless of what ihl says).
> 
> Either copying the x86 behaviour or WARN_ON_ONCE() and assuming and ihl of
> 5 would be my preference, because I agree with you that this feels like it
> shouldn't be happening to start with.

How about modify the patch like this?

static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
{
	__uint128_t tmp;
	u64 sum;

    if (unlikely(ihl < 5))
        ihl = 5;

	tmp = *(const __uint128_t *)iph;
	iph += 16;
	ihl -= 4;           
	tmp += ((tmp >> 64) | (tmp << 64));
	sum = tmp >> 64;
	do {
		sum += *(const u32 *)iph;
		iph += 4;
	} while (--ihl);

	sum += ((sum >> 32) | (sum << 32));
	return csum_fold((__force u32)(sum >> 32));
}


> 
> > I still think this smells of papering over some other bug that led to
> > a bogus skb getting that far into the transmit stack in the first
> > place - presumably it's all wasted effort anyway since a "header" with
> > no space for a destination address and a deliberately wrong checksum
> > seems unlikely to go very far...
> 
> Looking at the ipvlan_start_xmit() path from the backtrace, it looks to me like
> ipvlan_get_L3_hdr() returns NULL if the header length is invalid, but then
> ipvlan_xmit_mode_l3() ends up calling ipvlan_process_outbound() anyway.
> Hmm. I really don't know enough about VLANs to know what the right
> behaviour is here and I guess just returning NET_XMIT_DROP will break
> something.

The network maintainer has replied to me,
" ip_fast_csum() must be able to handle any value that could fit in the ihl field of the ip protocol header. That's not only the most correct logic, but also the most robust."

> 
> > On a quick look there appear to be quite a few implementations
> > dereferencing up to 5 words unconditionally, so it's not like this is arm64's
> own bug.
> 
> I'll drop the patch, but we are apparently open to a crash here, so if you have
> time to figure out what's going on, that would be great. The reproducer didn't
> work for me (I guess I'm missing some utils) and sending bogus header lengths
> with a raw socket worked fine (i.e. didn't crash either). I guess the vlan is an
> important piece of the picture.

This test needs the netem module and " ip netns exec ns1 tc qdisc add dev ip1 root netem corrupt 100%"
must execute successfully. then the panic can be reproduced.

This is a fault injection test, the corrupt function of netem is the emulation of random noise introducing an error 
in a random position for a chosen percent of packets to test the network module.the netem will modify the packet 
randomly,so the ihl value of ip header may be modified to 1.

Thanks.

> 
> Will


More information about the linux-arm-kernel mailing list