[PATCH 2/3] tftp: implement UDP reorder cache using lists

Mon Sep 19 00:33:45 PDT 2022

Hi Enrico,

On Fri, Sep 16, 2022 at 04:12:07PM +0200, Enrico Scholz wrote:
> Sascha Hauer <s.hauer at pengutronix.de> writes:
> 
> > The UDP reorder cache can be much easier implemented using lists.
> > As a bonus the cache grows and shrinks on demand and no fixed size
> > has to be configured at compile time.
> 
> There are three variants of the cache
> 
> - small, fixed sized array with linear search (very first
>   implementation)  -->  O(n)
> - list --> O(n) on insert, O(1) on lookup
> - bitmap --> O(1)
> 
> 
> Performance wise, I think the list implementation is the slowest one
> (although the fixed sized array is O(n) on every operation, this should
> be still faster for small n than the list operations and the related
> memory management).
> 
> From code size, the list implementation is in the middle.
> 
> I am not sure whether dynamic shrinking/growing of the cache is so
> important or whether a small, fixed sized cache suffices.  In my
> experience, only the first couple of packets really matter regarding
> reordering.

I don't think that the performance of the cache limits the tftp
performance, given the low number of entries in the cache. You may prove
me wrong here, I don't have any setup here where I could test packet
reordering. What happens here instead is that packets get dropped
because of the RX queue of the network interface get overwhelmed by
packets, see my patch reducing the windowsize setting for i.MX. Packets
get cached then, but the missing packets come in later when retransmits
are triggered. Performance is very bad then anyway.

What counts for me here is the simplicity of the implementation and
the ability to check it for correctness. Here the list implementation
has clear advantages.

The dynamic shrinking/growing of the cache hasn't been a goal for me,
it's just easier to implement then having to maintain a fixed size
pool.

> 
> Of course, the 'kfifo' could be replaced completely by a custom buffer
> implementation where packets are inserted at the correct position.  O(1)
> for every operation + zero additional memory.

Yes, that could be a next step, although the kfifo makes it easy to
decouple the fixed sized blocks coming in from the read() call with
varying sizes.

> 
> 
> > -static inline bool is_block_before(uint16_t a, uint16_t b)
> > -{
> > -	return (int16_t)(b - a) > 0;
> > -}
> > -
> > ....
> >  static int tftp_window_cache_insert(struct tftp_cache *cache, uint16_t id,
> >  				    void const *data, size_t len)
> >  {
> > ...
> > +	list_for_each_entry(block, &cache->blocks, list) {
> 
> fwiw, iterating in the reverse direction will find the position probably
> faster
> 
> 
> > +		if (block->id == id)
> > +			return 0;
> > +		if (block->id < id)
> 
> This will break when wrapping at 65535; the removed 'is_block_before()'
> function was written for this case.

Oh, yes, I missed this. I'll fix it here and elsewhere. Thanks for
noting.

Sascha

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |