[PATCH net V2 1/2] net: stmmac: Premature loop termination check was ignored on rx

Jochen Henneberg jh at henneberg-systemdesign.com
Mon Mar 20 02:04:54 PDT 2023


Jakub Kicinski <kuba at kernel.org> writes:

> On Sat, 18 Mar 2023 09:38:12 +0100 Jochen Henneberg wrote:
>> > Are you sure? Can you provide more detailed analysis?
>> > Do you observe a problem / error in real life or is this theoretical?  
>> 
>> This is theoretical, I was hunting another bug and just stumbled over
>> the check which is, I think you agree, pointless right now. I did not
>> try to force execute that code path.
>
> If you have the HW it's definitely worth doing. There is a fault
> injection infra in Linus which allows to fail memory allocations.
> Or you can just make a little patch to the driver to fake failing
> every 1000th allocation.
>

I have the hardware available and will do the check.

>> > As far as I can tell only path which jumps to read_again after doing
>> > count++ is via the drain_data jump, but I can't tell how it's
>> > discarding subsequent segments in that case..
>> >  
>> >> -read_again:
>> >>  		buf1_len = 0;
>> >>  		buf2_len = 0;
>> >>  		entry = next_entry;  
>> 
>> Correct. The read_again is triggered in case that the segment is not the
>> last segment of the frame:
>> 
>> 		if (likely(status & rx_not_ls))
>> 			goto read_again;
>> 
>> So in case there is no skb (queue error) it will keep increasing count
>> until the last segment has been found with released device DMA
>> ownership. So skb will not change while the goto loop is running, the
>> only thing that will change is that subsequent segments release device
>> DMA ownership. The dirty buffers are then cleaned up from
>> stmmac_rx_refill().
>
> To be clear - I'm only looking at stmmac_rx(), that ZC one is even more
> confusing.
>
> Your patch makes sense, but I think it's not enough to make this code
> work in case of memory allocation failure. AFAIU the device supports
> scatter - i.e. receiving a single frame in multiple chunks. Each time
> thru the loop we process one (or two?) chunks. But the code uses 
> skb == NULL to decide whether it's the first chunk or not. So in case
> of memory allocation error it will treat the second chunk as the first
> (since skb will be NULL) and we'll get a malformed frame with missing
> chunks sent to the stack. The driver should discard the entire frame
> on failure..
>

Understood. However, this forces me to read the code and datahseet very
carefully to understand the details. I will do that, however, it will
take me some time.

For the ST and Synopsys people:
I could imagine that you would be able to fix this much faster than
I can, so if they want to work on this please let me know so I don't
waste my time on doing double work.

>> I think the driver code is really hard to read I have planned to cleanup
>> things later, however, this patch simply tries to prevent us from
>> returning a value greater than limit which could happen and would
>> definitely be wrong.



More information about the linux-arm-kernel mailing list