bug in nl_cache_refill?

Thomas Graf tgraf at infradead.org
Tue Feb 14 06:04:40 EST 2012


On Wed, Feb 08, 2012 at 04:51:54PM -0500, Brett Ciphery wrote:
> I am periodically calling nl_cache_refill to clean and update a cache,
> but after some time it starts throwing errors.  As an (ugly) example:
> 
> nl_sock = nl_socket_alloc();
> nl_cache_mngr_alloc(nl_sock, NETLINK_ROUTE, NL_AUTO_PROVIDE, &mngr);
> nl_cache_mngr_add(mngr, "route/route", NULL, NULL, &route_cache);
> 
> while (true) {
> 	err = nl_cache_refill(nl_sock, route_cache);
> 	printf("err code refilling route cache: %d\n", err);
> 	sleep(1);
> }
> 
> 
> err code refilling route cache: 0
> err code refilling route cache: 0
> ...
> err code refilling route cache: -25
> err code refilling route cache: 0
> err code refilling route cache: -25
> err code refilling route cache: 0
> err code refilling route cache: -25
> err code refilling route cache: 0
> err code refilling route cache: 0
> err code refilling route cache: -25
> err code refilling route cache: 0
> ...
> 
> Which comes from lib/nl.c:719, indicating an NLE_BUSY, or, -EBUSY
> syserr.  This leaves a previously populated cache empty for a
> (seemingly) non-critical error.  What do you think about handling
> -NLE_BUSY inside nl_cache_refill, just as we do -NLE_DUMP_INTR?

Good catch, Brett. I agree, we need to catch -NLE_BUSY and restart the
dump. Do you want to provide a patch or should I fix it?
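
Until the library handles this itself, a caller could retry on its own.
The following is only a rough sketch of that idea, not the library patch
discussed above. It assumes the busy error is 25 (matching the -25 in the
output quoted earlier); real code should use NLE_BUSY from the libnl
headers. The names SKETCH_NLE_BUSY, refill_fn, refill_retry_busy and
fake_refill are hypothetical and exist only for illustration. The callback
stands in for a call such as nl_cache_refill(nl_sock, route_cache).

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 25 mirrors the -25 seen in the report. Real code should
 * use the library's NLE_BUSY constant instead of this stand-in. */
#define SKETCH_NLE_BUSY 25

/* Hypothetical callback type standing in for a refill call such as
 * nl_cache_refill(sock, cache); ctx carries whatever the call needs. */
typedef int (*refill_fn)(void *ctx);

/*
 * Call refill() until it returns something other than -SKETCH_NLE_BUSY,
 * giving up after max_tries attempts. Returns the last result, which is
 * still -SKETCH_NLE_BUSY if every attempt was busy.
 */
static int refill_retry_busy(refill_fn refill, void *ctx, int max_tries)
{
	int err = -SKETCH_NLE_BUSY;
	int i;

	assert(refill != NULL);

	for (i = 0; i < max_tries && err == -SKETCH_NLE_BUSY; i++)
		err = refill(ctx);

	return err;
}

/* Demo stand-in for the real refill: reports busy until the counter
 * pointed to by ctx reaches zero, then reports success. */
static int fake_refill(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -SKETCH_NLE_BUSY;
	}
	return 0;
}
```

Bounding the number of attempts keeps the caller from spinning forever if
the kernel stays busy. Whether the cache contents survive a failed attempt
still depends on the library, so this only hides the symptom.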
