arm64/v4.16-rc1: KASAN: use-after-free Read in finish_task_switch

Peter Zijlstra peterz at infradead.org
Thu Feb 15 08:47:54 PST 2018


On Thu, Feb 15, 2018 at 02:22:39PM +0000, Will Deacon wrote:
> Instead, we've come up with a more plausible sequence that can in theory
> happen on a single CPU:
> 
> <task foo calls exit()>
> 
> do_exit
> 	exit_mm

If this is the last task of the process, we would expect:

  mm_count == 1
  mm_users == 1

at this point.
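
For reference, both counters live in mm_struct and are set to 1 by
mm_init() in kernel/fork.c; roughly (the field comments are mine):

  struct mm_struct {
	...
	atomic_t mm_users;	/* users of the address space: ->mm etc. */
	atomic_t mm_count;	/* refs on the mm_struct itself:
				 * active_mm, plus one while mm_users > 0 */
	...
  };

  /* in mm_init(): */
  atomic_set(&mm->mm_users, 1);
  atomic_set(&mm->mm_count, 1);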

> 		mmgrab(mm);			// foo's mm has count +1
> 		BUG_ON(mm != current->active_mm);
> 		task_lock(current);
> 		current->mm = NULL;
> 		task_unlock(current);

So the whole active_mm thing basically tracks the last 'real' mm, and
its purpose is to avoid a switch_mm() when switching between user tasks
and kernel tasks.

A kernel task has !->mm; it borrows the last user mm through
->active_mm. We keep that borrowed mm alive by incrementing mm_count
when switching from a user task to a kernel task and decrementing it
when switching back from kernel to user.

What exit_mm() does is change a user task into a 'kernel' task. So it
should increment mm_count to mirror the context switch. I suspect this
is what the mmgrab() in exit_mm() is for.
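
And mmgrab()/mmdrop() are just the mm_count inc/dec, give or take debug
bits; roughly:

  static inline void mmgrab(struct mm_struct *mm)
  {
	atomic_inc(&mm->mm_count);
  }

  static inline void mmdrop(struct mm_struct *mm)
  {
	/* dropping the last mm_count reference frees the mm_struct */
	if (unlikely(atomic_dec_and_test(&mm->mm_count)))
		__mmdrop(mm);
  }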

> <irq and ctxsw to kthread>
> 
> context_switch(prev=foo, next=kthread)
> 	mm = next->mm;
> 	oldmm = prev->active_mm;
> 
> 	if (!mm) {				// True for kthread
> 		next->active_mm = oldmm;
> 		mmgrab(oldmm);			// foo's mm has count +2
> 	}
> 
> 	if (!prev->mm) {			// True for foo
> 		rq->prev_mm = oldmm;
> 	}
> 
> 	finish_task_switch
> 		mm = rq->prev_mm;
> 		if (mm) {			// True (foo's mm)
> 			mmdrop(mm);		// foo's mm has count +1
> 		}
> 
> 	[...]
> 
> <ctxsw to task bar>
> 
> context_switch(prev=kthread, next=bar)
> 	mm = next->mm;
> 	oldmm = prev->active_mm;		// foo's mm!
> 
> 	if (!prev->mm) {			// True for kthread
> 		rq->prev_mm = oldmm;
> 	}
> 
> 	finish_task_switch
> 		mm = rq->prev_mm;
> 		if (mm) {			// True (foo's mm)
> 			mmdrop(mm);		// foo's mm has count +0

The context switch into the next user task then does this decrement; at
that point foo no longer has a reference to its mm, except on the
stack.

> 		}
> 
> 	[...]
> 
> <ctxsw back to task foo>
> 
> context_switch(prev=bar, next=foo)
> 	mm = next->mm;
> 	oldmm = prev->active_mm;
> 
> 	if (!mm) {				// True for foo
> 		next->active_mm = oldmm;	// This is bar's mm
> 		mmgrab(oldmm);			// bar's mm has count +1
> 	}
> 
> 
> 	[return back to exit_mm]

Enter mm_users, which counts the number of tasks associated with the
mm. We start with 1 in mm_init(), and when it drops to 0 we decrement
mm_count. Since we also start with mm_count == 1, this would appear
consistent.

  mmput() // --mm_users == 0, which then results in:

> mmdrop(mm);					// foo's mm has count -1

In the above case, that's the very last reference to the mm, and since
we started out with mm_count == 1, this -1 makes 0 and we do the actual
free.
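
Schematically, that last chain looks like this (simplified from
kernel/fork.c):

  void mmput(struct mm_struct *mm)
  {
	might_sleep();
	if (atomic_dec_and_test(&mm->mm_users))
		__mmput(mm);
  }

  static void __mmput(struct mm_struct *mm)
  {
	/* exit_mmap() etc. tear down the address space here */
	...
	mmdrop(mm);	/* drop the mm_count ref held for mm_users */
  }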

> At this point, we've got an imbalanced count on the mm and could free it
> prematurely as seen in the KASAN log.

I'm not sure I see anything premature here. At this point
mm_users == 0, mm_count == 0, we freed the mm, there is no further use
of the on-stack mm pointer, and foo no longer has a pointer to it in
either ->mm or ->active_mm. It's well and properly dead.

> A subsequent context-switch away from foo would therefore result in a
> use-after-free.

At the above point, foo no longer has a reference to its mm: we cleared
->mm early, and the context switch to bar cleared ->active_mm. The
switch back into foo then results in foo->active_mm == bar->mm, which
is fine.
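
So, tallying absolute mm_count values through the whole sequence
(starting from mm_count == 1):

  1 -> 2   exit_mm(): mmgrab(mm)
  2 -> 3   ctxsw foo -> kthread: mmgrab(oldmm)
  3 -> 2   finish_task_switch(): mmdrop(rq->prev_mm)
  2 -> 1   ctxsw kthread -> bar: finish_task_switch(): mmdrop()
  1 -> 0   exit_mm(): mmput() -> __mmput() -> mmdrop(), actual free

Everything balances; I don't see where we'd free the mm while anything
can still reach it.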



