alignment faults in 3.6

Thu Oct 11 11:23:30 EDT 2012

On 10/11/2012 08:47 AM, Eric Dumazet wrote:
> On Thu, 2012-10-11 at 08:20 -0500, Rob Herring wrote:
>> On 10/11/2012 07:40 AM, Eric Dumazet wrote:
>>> On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:
>>>
>>>>
>>>> Rob Herring as the original reporter has dropped off the Cc list, adding
>>>> him back.
>>>>
>>>> I assume that the calxeda xgmac driver is the culprit then. It uses
>>>> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
>>>> xgmac_rx_refill but it is not clear whether it does so intentionally
>>>> or by accident.
>>
>> This in fact does work and eliminates the unaligned traps. However, not
>> all h/w can do IP aligned DMA (i.MX FEC for example), so I still think
>> this is a questionable optimization by the compiler. We're saving 1 load
>> instruction here for data that is likely already in the cache. It may be
>> legal per the ABI, but the downside of this optimization is much greater
>> than the upside.
> 
> Compiler is asked to perform a 32bit load, it does it.

Not exactly. It is asked to perform two 32-bit loads, which are combined
into a single ldm (load multiple) that cannot handle unaligned
accesses. Here's a simple example that does the same thing:

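#include <stdio.h>

/* Two adjacent 32-bit loads through casted pointers; gcc may merge them
 * into one ldm, which traps if buf is not 4-byte aligned. */
void test(char *buf)
{
	printf("%u, %u\n", *((unsigned int *)&buf[0]), *((unsigned int *)&buf[4]));
}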

So I guess the only ABI legal unaligned access is in a packed struct.
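
For illustration, here is a minimal sketch of two ways to read a 32-bit
value from an address that may be unaligned without promising the
compiler an alignment it does not have. The helper names and the struct
are hypothetical, not from the kernel or any driver; in-kernel code
would normally use get_unaligned() instead. The packed attribute is a
GCC/Clang extension.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bytes out, so the compiler cannot
 * assume that p is 4-byte aligned. */
static inline uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Hypothetical packed wrapper: the packed attribute tells the compiler
 * that the member has alignment 1. */
struct unaligned_u32 {
	uint32_t v;
} __attribute__((packed));

static inline uint32_t load_u32_packed(const void *p)
{
	return ((const struct unaligned_u32 *)p)->v;
}
```

Both variants tell the compiler the access may be unaligned, so it
should not merge two such reads into an ldm.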

> There is no questionable optimization here. Really.
> Please stop pretending this, this makes no sense.

I'm not the one calling the networking stack bad code.

I can fix my h/w, so I'll stop caring about this. Others can all get
bitten by this new behavior in gcc 4.7.

Rob

> As I said, if some h/w cannot do IP aligned DMA, driver can use a
> workaround, or a plain memmove() (some drivers seem to do this to work
> around this h/w limitation, just grep for memmove() in drivers/net)
> 
>>
>>>
>>> Thanks Arnd
>>>
>>> It seems an accident, since driver doesn't check skb->data alignment at
>>> all (this can change with SLAB debug on/off)
>>>
>>> It also incorrectly adds 64 bytes to bfsize, there is no need for this.
>>
>> I'm pretty sure this was needed as the h/w writes out full bursts of
>> data, but I'll go back and check.
> 
> Maybe the ALIGN() was needed then. But the 64 + NET_IP_ALIGN sounds like
> the head room that we allocate/reserve in netdev_alloc_skb_ip_align()
> 
> So you allocate this extra room twice.
> 
> Thanks
> 
>
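
As a rough userspace sketch of the memmove() workaround mentioned in the
quoted text: if the hardware can only DMA to a 4-byte aligned address,
the received frame can be shifted by two bytes afterwards so that the IP
header following the 14-byte Ethernet header lands on a 4-byte boundary.
The function name and constants below are hypothetical and only model
the idea; they are not taken from any driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IP_ALIGN_PAD 2   /* assumed pad, the usual value of NET_IP_ALIGN */
#define ETH_HDR_LEN  14  /* Ethernet header length */

/* Shift a frame of len bytes that starts at buf forward by IP_ALIGN_PAD
 * bytes. buf must be 4-byte aligned and have room for
 * len + IP_ALIGN_PAD bytes. Returns the new start of the frame. */
static unsigned char *realign_frame(unsigned char *buf, size_t len)
{
	/* memmove() is required because source and destination overlap. */
	memmove(buf + IP_ALIGN_PAD, buf, len);
	return buf + IP_ALIGN_PAD;
}
```

The cost is one extra copy per packet, which is the trade-off against
doing IP aligned DMA in hardware.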