[PATCH v10] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests

Christophe Leroy christophe.leroy at csgroup.eu
Tue Feb 27 10:35:04 PST 2024



On 27/02/2024 at 19:21, Charlie Jenkins wrote:
> On Tue, Feb 27, 2024 at 06:11:24PM +0000, Christophe Leroy wrote:
>>
>>
>> On 27/02/2024 at 18:54, Charlie Jenkins wrote:
>>> On Tue, Feb 27, 2024 at 11:32:19AM +0000, Christophe Leroy wrote:
>>>>
>>>>
>>>> On 27/02/2024 at 11:28, Russell King (Oracle) wrote:
>>>>> On Tue, Feb 27, 2024 at 06:47:38AM +0000, Christophe Leroy wrote:
>>>>>>
>>>>>>
>>>>>> On 27/02/2024 at 00:48, Guenter Roeck wrote:
>>>>>>> On 2/26/24 15:17, Charlie Jenkins wrote:
>>>>>>>> On Mon, Feb 26, 2024 at 10:33:56PM +0000, David Laight wrote:
>>>>>>>>> ...
>>>>>>>>>> I think you misunderstand. "NET_IP_ALIGN offset is what the kernel
>>>>>>>>>> defines to be supported" is a gross misinterpretation. It is not
>>>>>>>>>> "defined to be supported" at all. It is the _preferred_ alignment
>>>>>>>>>> nothing more, nothing less.
>>>>>>>>
>>>>>>>> This distinction is arbitrary in practice, but I am open to being proven
>>>>>>>> wrong if you have data to back up this statement. If the driver chooses
>>>>>>>> to not follow this, then the driver might not work. ARM defines the
>>>>>>>> NET_IP_ALIGN to be 2 to pad out the header to be on the supported
>>>>>>>> alignment. If the driver chooses to pad with one byte instead of 2
>>>>>>>> bytes, the driver may fail to work as the CPU may stall after the
>>>>>>>> misaligned access.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm sure I've seen code that would realign IP headers to a 4 byte
>>>>>>>>> boundary before processing them - but that might not have been in
>>>>>>>>> Linux.
>>>>>>>>>
>>>>>>>>> I'm also sure there are cpu which will fault double length misaligned
>>>>>>>>> memory transfers - which might be used to marginally speed up code.
>>>>>>>>> Assuming more than 4 byte alignment for the IP header is likely
>>>>>>>>> 'wishful thinking'.
>>>>>>>>>
>>>>>>>>> There is plenty of ethernet hardware that can only write frames
>>>>>>>>> to even boundaries and plenty of cpu that fault misaligned accesses.
>>>>>>>>> There are even cases of both on the same silicon die.
>>>>>>>>>
>>>>>>>>> You also pretty much never want a fault handler to fixup misaligned
>>>>>>>>> ethernet frames (or really anything else for that matter).
>>>>>>>>> It is always going to be better to check in the code itself.
>>>>>>>>>
>>>>>>>>> x86 has just made people 'sloppy' :-)
>>>>>>>>>
>>>>>>>>>        David
>>>>>>>>>
>>>>>>>>> -
>>>>>>>>> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
>>>>>>>>> MK1 1PT, UK
>>>>>>>>> Registration No: 1397386 (Wales)
>>>>>>>>>
>>>>>>>>
>>>>>>>> If somebody has a solution they deem to be better, I am happy to change
>>>>>>>> this test case. Otherwise, I would appreciate a maintainer resolving
>>>>>>>> this discussion and apply this fix.
>>>>>>>>
>>>>>>> Agreed.
>>>>>>>
>>>>>>> I do have a couple of patches which add explicit unaligned tests as well as
>>>>>>> corner case tests (which are intended to trigger as many carry overflows
>>>>>>> as possible). Once I get those working reliably, I'll be happy to submit
>>>>>>> them as additional tests.
>>>>>>>
>>>>>>
>>>>>> The functions definitely have to work at least with and without VLAN,
>>>>>> which means the alignment cannot be greater than 4 bytes. That's also
>>>>>> the outcome of the discussion.
>>>>>
>>>>> Thanks for completely ignoring what I've said. No. The alignment ends up
>>>>> being commonly 2 bytes.
>>>>>
>>>>> As I've said several times, network drivers do _not_ have to respect
>>>>> NET_IP_ALIGN. There are 32-bit ARM drivers which have a DMA engine in
>>>>> them which can only DMA to a 32-bit aligned address. This means that
>>>>> the start of the ethernet header is placed at a 32-bit aligned address
>>>>> making the IP header misaligned to 32-bit.
>>>>>
>>>>> I don't see what is so difficult to understand about this... but it
>>>>> seems that my comments on this are being ignored time and time again,
>>>>> and I can only think that those who are ignoring my comments have
>>>>> some ulterior motive here.
>>>>>
>>>>
>>>> I'm sorry for this misunderstanding. I'm not ignoring what you said at
>>>> all. I understood that ARM is able to handle unaligned accesses, with
>>>> exception handlers in the worst case, and that DMA constraints may lead
>>>> to the IP header being on a 2-byte alignment only.
>>>>
>>>> However, I also understood from others that some architectures can't
>>>> handle such a 2-byte-only alignment.
>>>>
>>>> It's been suggested during the discussion that alignment tests should be
>>>> added later in a follow-up patch. So for the time being I'm trying to
>>>> find a compromise and get the existing tests working on all platforms
>>>> but with a smaller alignment than the 16-byte alignment brought by
>>>> Charlie's v10 patch. And a 4-byte alignment seemed to me to be a good
>>>> compromise for this fix. The idea is also to make the fix as minimal as
>>>> possible, unlike Charlie's patch that is churning up the tests quite
>>>> heavily.
>>>
>>> Do you have a list of platforms this is failing on? I haven't seen any
>>> reports that haven't been fixed.
>>
>> I don't have such a list, but I guess you do? If all platforms have
>> already been fixed, why are you sending this patch at all?
> 
> This patch is what is doing the "fixing". Over the course of 10 versions
> I have "fixed" the test cases to work on platforms that have various
> alignment and endianness constraints. The endianness changes were picked
> off of these patches and spun out into a different patch by you.
> 
> I originally introduced these two new test cases since I wrote the riscv
> checksum function implementations and these tests were helpful for me
> and I figured they may be helpful for somebody else too.

I see.

Then you misunderstood. I am not saying your patch leaves any platform
unfixed. I am saying that your patch seems bigger than required; it is
churn. In addition, your patch assumes an alignment of 16 bytes which, as
explained by Russell, is just wrong. At the very least, an alignment of
4 bytes must work on every platform because of VLANs.
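
To make the VLAN argument concrete, here is a minimal sketch of the
arithmetic. It is not kernel code. It assumes a receive buffer that
starts on a 16-byte boundary, a 14-byte Ethernet header, a single 4-byte
802.1Q tag, and a driver-chosen pad in front of the frame (2 when the
driver follows a NET_IP_ALIGN of 2, 0 when its DMA engine can only write
to a 32-bit aligned address). The helper names ip_hdr_offset() and
alignment_of() are made up for this illustration.

```c
/* Header sizes; these mirror the kernel's ETH_HLEN and VLAN_HLEN values. */
#define ETH_HLEN   14	/* destination MAC + source MAC + EtherType */
#define VLAN_HLEN   4	/* one 802.1Q tag */

/*
 * Hypothetical helper: offset of the IP header from the start of a
 * buffer, given the pad the driver places in front of the frame and
 * whether the frame carries a single VLAN tag.
 */
static unsigned int ip_hdr_offset(unsigned int pad, int vlan)
{
	return pad + ETH_HLEN + (vlan ? VLAN_HLEN : 0);
}

/*
 * Hypothetical helper: largest power of two, capped at 16, that divides
 * the offset. With a 16-byte aligned buffer this is the alignment the
 * IP header actually ends up with.
 */
static unsigned int alignment_of(unsigned int offset)
{
	unsigned int align = 16;

	while (align > 1 && (offset % align) != 0)
		align >>= 1;

	return align;
}
```

With a pad of 2, the IP header sits at offset 16 without a VLAN tag
(16-byte aligned) but at offset 20 with one (4-byte aligned only), so a
test that assumes 16 bytes does not cover the tagged case. With a pad of
0, as in the DMA case Russell describes, the offsets are 14 and 18, and
only 2-byte alignment is left.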

Christophe


More information about the linux-arm-kernel mailing list