[PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
Martin Wilck
mwilck at suse.com
Wed Apr 28 08:06:31 BST 2021
On Wed, 2021-04-28 at 08:39 +0200, Maurizio Lombardi wrote:
> út 27. 4. 2021 v 22:02 odesílatel Martin Wilck <mwilck at suse.com>
> napsal:
> > The code doesn't use add_timer(), only mod_timer() and
> > del_timer_sync(). And we didn't observe a crash upon add_timer().
> > What
> > we observed was that a timer had been enqueued multiple times, and
> > the
> > kernel crashes in expire_timers()->detach_timer(), when it encounters
> > an already detached entry in the timer list.
>
> How can a timer end up enqueued multiple times?
I observed in a dump that this can happen. More precisely, I observed
the following:
[ 4389.978732] nvme nvme36: Successfully reconnected (1 attempt)
...
[ 4441.445655] nvme nvme36: Successfully reconnected (1 attempt)
...
[ 4510.891400] nvme nvme36: ANATT timeout, resetting controller.
...
[ 4517.035411] general protection fault: 0000 [#1] SMP PTI (kernel crash)
In the crash dump, I could see a anatt_timer belonging to nvme36 being
pending in the timer list, with a remaining expiry time of 44s,
suggesting it had been created around time 4441.4s. That means that at
the time the ANATT timeout was seen (which corresponds to a timer added
around 4389.9s), the timer must have been pending twice. (*)
> It's safe to call mod_timer() against both an active or an inactive
> timer
> and mod_timer() is thread-safe also.
>
> IMO the problem is due to the fact that timer_setup() could end up
> being called against
> an active, pending timer.
> timer_setup() doesn't take any lock and modifies the pprev pointer
> and
> the timer's flags
Yes, that's what I think has happened. timer_setup() doesn't clear any
pointers in the list of pending timers pointing to this entry. If the
newly-initialized timer is then added with mod_timer(), it becomes
linked in a second timer list. When the first one expires, the timer
will be detached, but only from one of the lists it's pending in. In a
scenario like the one we faced, this could actually happen multiple
times. If the detached timer remains linked into a timer list, once
that list is traversed, the kernel dereferences a pointer with value
LIST_POISON2, and crashes.
Anybody: correct me if this is nonsense.
Best,
Martin
(*) Note that the crash had been caused by another wrongly linked anatt
timer, not this one.
More information about the Linux-nvme
mailing list