答复: 答复: [PATCH,v2] arm64: fix the illegal address access in some cases

Guodeqing (A) geffrey.guo at huawei.com
Thu Jul 30 06:49:47 EDT 2020



> -----邮件原件-----
> 发件人: Will Deacon [mailto:will at kernel.org]
> 发送时间: Thursday, July 30, 2020 16:44
> 收件人: Guodeqing (A) <geffrey.guo at huawei.com>
> 抄送: Robin Murphy <robin.murphy at arm.com>; catalin.marinas at arm.com;
> kernel-team at android.com; linux-arm-kernel at lists.infradead.org;
> luke.starrett at broadcom.com
> 主题: Re: 答复: [PATCH,v2] arm64: fix the illegal address access in some cases
> 
> On Wed, Jul 29, 2020 at 07:05:09AM +0000, Guodeqing (A) wrote:
> > > On Tue, Jul 28, 2020 at 03:30:50PM +0100, Robin Murphy wrote:
> > > > On 2020-07-28 14:03, Will Deacon wrote:
> > > > > Applied to arm64 (for-next/fixes), thanks!
> > > > >
> > > > > [1/1] arm64: csum: Reject IP headers with 'ihl' field smaller than five
> > > > >        https://git.kernel.org/arm64/c/09aaef1c5f50
> > > >
> > > > I'm not sure your commit message is entirely right there. AFAICS
> > > > it's not "the same way as x86" at all - x86 dereferences the first
> > > > word of iph and returns that as the sum if ihl <= 4 (and thus is
> > > > still capable of crashing given sufficiently bogus data). I'm not
> > > > sure where "return 1" came from - if we're going to return
> > > > nonsense then the mildly more efficient choice of 0 seems just as good.
> > >
> > > Argh, yes, that's %1 not $1, so I don't know where the 1 comes from either.
> > > Geffrey?
> > >
> >
> > The return 1 is just the report of ip checksum error, the return value
> > 0 means the ip checksum correct. x86 dereferences the first word of
> > iph and returns that as the sum, this may be just the report of ip checksum
> error too.
> 
> On the receive path, sure, but the crash was on the transmit path where we're
> computing the checksum to insert into the header, no?

[  388.873996] Call trace:
[  388.874280]  ip_send_check+0x40/0x78
[  388.874657]  __ip_local_out+0x78/0x160
[  388.875060]  ip_local_out+0x34/0x68
[  388.875432]  ipvlan_queue_xmit+0x608/0x7f0
[  388.875870]  ipvlan_start_xmit+0x2c/0x90
[  388.876293]  dev_hard_start_xmit+0xac/0x258
[  388.876734]  sch_direct_xmit+0x1a8/0x4c0
[  388.877149]  __qdisc_run+0x88/0xe0
[  388.877508]  __dev_queue_xmit+0x630/0x950
[  388.877942]  dev_queue_xmit+0x24/0x30
[  388.878356]  ip_finish_output2+0x26c/0x420
[  388.878791]  ip_finish_output+0x1c8/0x2a0

The midify action is in the qdisc module, because netem is a kernel module,
the kernel config is CONFIG_NET_SCH_NETEM.

> 
> > > > Otherwise it would seem reasonable to jump straight into the
> > > > word-at-a-time loop if ip_fast_csum() is really expected to cope
> > > > with more than just genuine IP headers (which should be backed by
> > > > at least
> > > > 20 bytes of valid memory regardless of what ihl says).
> > >
> > > Either copying the x86 behaviour or WARN_ON_ONCE() and assuming and
> > > ihl of
> > > 5 would be my preference, because I agree with you that this feels
> > > like it shouldn't be happening to start with.
> >
> > How about modify the patch like this?
> >
> > static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
> > {
> > 	__uint128_t tmp;
> > 	u64 sum;
> >
> >     if (unlikely(ihl < 5))
> >         ihl = 5;
> 
> I'd probably do:
> 
> 	/* Callers should really be checking this first */
> 	if (WARN_ON_ONCE(ihl < 5))
> 		ihl = 5;
> 
> because I'd still like to understand what the vlan code is up to.
> 

Maybe ipvlan has to drop the corrupted packet, but ipvlan also has the choice to let the corrupted
packet to be dropped by the network stack, the network stack will call the ip_fast_csum. so ip_fast_csum
must be robust.

> > > > I still think this smells of papering over some other bug that led
> > > > to a bogus skb getting that far into the transmit stack in the
> > > > first place - presumably it's all wasted effort anyway since a
> > > > "header" with no space for a destination address and a
> > > > deliberately wrong checksum seems unlikely to go very far...
> > >
> > > Looking at the ipvlan_start_xmit() path from the backtrace, it looks
> > > to me like
> > > ipvlan_get_L3_hdr() returns NULL if the header length is invalid,
> > > but then
> > > ipvlan_xmit_mode_l3() ends up calling ipvlan_process_outbound() anyway.
> > > Hmm. I really don't know enough about VLANs to know what the right
> > > behaviour is here and I guess just returning NET_XMIT_DROP will
> > > break something.
> >
> > The network maintainer has replied to me, " ip_fast_csum() must be
> > able to handle any value that could fit in the ihl field of the ip
> > protocol header. That's not only the most correct logic, but also the
> > most robust."
> 
> Is that on a public list somewhere? Would be a good link for the commit
> message.

https://www.spinics.net/lists/netdev/msg671867.html

> 
> > This is a fault injection test, the corrupt function of netem is the
> > emulation of random noise introducing an error in a random position
> > for a chosen percent of packets to test the network module.the netem
> > will modify the packet randomly,so the ihl value of ip header may be modified
> to 1.
> 
> Ok, but netem is running in userspace (right?) and so I still think the network
> layer can reject the invalid ihl before calling into the checksum code.
> 

netem is a kernel module, the kernel config is CONFIG_NET_SCH_NETEM.

https://www.mail-archive.com/netdev@vger.kernel.org/msg342331.html


> Will


More information about the linux-arm-kernel mailing list