[PATCH net V2 1/2] net: stmmac: Premature loop termination check was ignored on rx
Jochen Henneberg
jh at henneberg-systemdesign.com
Mon Mar 20 02:04:54 PDT 2023
Jakub Kicinski <kuba at kernel.org> writes:
> On Sat, 18 Mar 2023 09:38:12 +0100 Jochen Henneberg wrote:
>> > Are you sure? Can you provide more detailed analysis?
>> > Do you observe a problem / error in real life or is this theoretical?
>>
>> This is theoretical, I was hunting another bug and just stumbled over
>> the check which is, I think you agree, pointless right now. I did not
>> try to force execute that code path.
>
> If you have the HW it's definitely worth doing. There is a fault
> injection infra in Linus which allows to fail memory allocations.
> Or you can just make a little patch to the driver to fake failing
> every 1000th allocation.
>
I have the hardware available and will do the check.
>> > As far as I can tell only path which jumps to read_again after doing
>> > count++ is via the drain_data jump, but I can't tell how it's
>> > discarding subsequent segments in that case..
>> >
>> >> -read_again:
>> >> buf1_len = 0;
>> >> buf2_len = 0;
>> >> entry = next_entry;
>>
>> Correct. The read_again is triggered in case that the segment is not the
>> last segment of the frame:
>>
>> if (likely(status & rx_not_ls))
>> goto read_again;
>>
>> So in case there is no skb (queue error) it will keep increasing count
>> until the last segment has been found with released device DMA
>> ownership. So skb will not change while the goto loop is running, the
>> only thing that will change is that subsequent segments release device
>> DMA ownership. The dirty buffers are then cleaned up from
>> stmmac_rx_refill().
>
> To be clear - I'm only looking at stmmac_rx(), that ZC one is even more
> confusing.
>
> Your patch makes sense, but I think it's not enough to make this code
> work in case of memory allocation failure. AFAIU the device supports
> scatter - i.e. receiving a single frame in multiple chunks. Each time
> thru the loop we process one (or two?) chunks. But the code uses
> skb == NULL to decide whether it's the first chunk or not. So in case
> of memory allocation error it will treat the second chunk as the first
> (since skb will be NULL) and we'll get a malformed frame with missing
> chunks sent to the stack. The driver should discard the entire frame
> on failure..
>
Understood. However, this forces me to read the code and datahseet very
carefully to understand the details. I will do that, however, it will
take me some time.
For the ST and Synopsys people:
I could imagine that you would be able to fix this much faster than
I can, so if they want to work on this please let me know so I don't
waste my time on doing double work.
>> I think the driver code is really hard to read I have planned to cleanup
>> things later, however, this patch simply tries to prevent us from
>> returning a value greater than limit which could happen and would
>> definitely be wrong.
More information about the linux-arm-kernel
mailing list