bug in nl_cache_refill?
Thomas Graf
tgraf at infradead.org
Thu Feb 16 04:49:19 EST 2012
On Wed, Feb 15, 2012 at 12:20:13PM -0500, Brett Ciphery wrote:
> > > err code refilling route cache: -25
> > > err code refilling route cache: 0
> > > err code refilling route cache: -25
> > > err code refilling route cache: 0
> > > err code refilling route cache: 0
> > > err code refilling route cache: -25
> > > err code refilling route cache: 0
> > > ...
> > >
> > > Which comes from lib/nl.c:719, indicating an NLE_BUSY, or, -EBUSY
> > > syserr. This leaves a previously populated cache empty for a
> > > (seemingly) non-critical error. What do you think about handling
> > > -NLE_BUSY inside nl_cache_refill, just as we do -NLE_DUMP_INTR?
> >
> > Good catch, Brett. I agree, we need to catch -NLE_BUSY and restart the
> > dump. Do you want to provide a patch or should I fix it?
>
> Sure.
>
> It was the frequency of these NLE_BUSY I found rather concerning,
> approaching roughly 50% for a system with essentially stable routing
> tables and link interfaces.

Do you have other applications that could be performing a dump in
parallel? -EBUSY is returned if the dumping lock is already taken,
and there is only one lock for all rtnetlink components.
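
To illustrate the restart idea discussed above, here is a rough,
self-contained sketch and not the actual libnl change. The names
refill_with_retry, dump_fn, fake_kernel and fake_dump are hypothetical
and exist only for this example, and the retry cap is an assumption
added to avoid spinning forever while another process holds the dump
lock. The two error values are the ones from libnl's errors.h (the -25
in the log above is -NLE_BUSY).

```c
#include <assert.h>
#include <stddef.h>

/* Error codes as in libnl's errors.h; returned negated on failure. */
#define NLE_BUSY      25
#define NLE_DUMP_INTR 33

/* Hypothetical stand-in for one complete dump request/receive pass. */
typedef int (*dump_fn)(void *ctx);

/*
 * Hypothetical helper: run the dump and restart it when it fails with
 * -NLE_BUSY or -NLE_DUMP_INTR. At most max_retries restarts are made,
 * so the dump runs no more than max_retries + 1 times in total.
 */
static int refill_with_retry(dump_fn dump, void *ctx, int max_retries)
{
    int attempts = 0;
    int err;

    assert(dump != NULL);

    do {
        err = dump(ctx);
    } while ((err == -NLE_BUSY || err == -NLE_DUMP_INTR) &&
             attempts++ < max_retries);

    return err;
}

/* Hypothetical fake that reports "busy" a fixed number of times. */
struct fake_kernel {
    int busy_left;  /* remaining dumps that will fail with -NLE_BUSY */
    int calls;      /* total number of dump attempts seen */
};

static int fake_dump(void *ctx)
{
    struct fake_kernel *k = ctx;

    k->calls++;
    if (k->busy_left > 0) {
        k->busy_left--;
        return -NLE_BUSY;
    }
    return 0;
}
```

In a real program the callback would wrap nl_cache_refill(sk, cache)
rather than the fake above. If the retries are exhausted, the last
error is still returned, so the caller should treat the cache contents
as stale or empty in that case.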