[PATCH 3/3] arm64/sve: KVM: Avoid dereference of dead task during guest entry
Dave Martin
Dave.Martin at arm.com
Tue Dec 5 04:40:28 PST 2017
On Tue, Dec 05, 2017 at 10:43:50AM +0100, Christoffer Dall wrote:
> Hi Dave,
>
> On Mon, Dec 04, 2017 at 03:36:50PM +0000, Dave Martin wrote:
> > On Mon, Dec 04, 2017 at 01:53:21PM +0000, Ard Biesheuvel wrote:
> > > On 1 December 2017 at 15:19, Dave Martin <Dave.Martin at arm.com> wrote:
> > > > When deciding whether to invalidate FPSIMD state cached in the cpu,
> > > > the backend function sve_flush_cpu_state() attempts to dereference
> > > > __this_cpu_read(fpsimd_last_state). However, this is not safe:
> > > > there is no guarantee that the pointer is still valid, because the
> > > > task could have exited in the meantime. For this reason, this
> > > > percpu pointer should only be assigned or compared, never
> > > > dereferenced.
> > > >
> > >
> > > Doesn't that mean the pointer could also be pointing to the
> > > fpsimd_state of a newly created task that is completely unrelated?
> > > IOW, are you sure comparison is safe?
> >
> > There are more conditions: the only place the determination is
> > made is for next, in fpsimd_thread_switch(next).
> >
> >
> > However, I can see your concern and I'm not sure how/if it is
> > resolved.
> >
> > For the worst case, let's assume that some child forks off but
> > doesn't enter userspace yet, while another task round-robins
> > across all CPUs, interspersed with tasks that don't enter userspace.
> >
> > So, we end up with
> >
> > All cpu < NR_CPUS . per_cpu(fpsimd_last_state, cpu) == T.
> >
> > Now, if T dies and a new task is allocated the same task_struct pointer,
> > then the _new_ T is guaranteed to get scheduled in on a CPU whose
> > per_cpu(fpsmid_last_state) == T.
> >
> > Thus, new T can pick up old T's regs _unless_ new T's fpsimd_state.cpu
> > is invalid (i.e., NR_CPUS).
> >
> > This is a separate bug from the one addressed by this patch though.
> > We can't go and harvest the bad pointers when old T exits, because
> > this might race new T being scheduled for real -- in any case it
> > would involve iterating over all CPUs which sounds racy and
> > inefficient.
> >
> >
> > So, I'd say we _must_ call fpsimd_flush_task_state() for every new
> > task. This may result in a redundant reload of the state, but this
> > is what would happen anyway if the pointers did not alias.
> >
> > Does this sound real to you? If so, I'll try to write something.
> >
> > And does this patch look reasonable to fix what it's trying to fix?
> >
> >
> > I wonder whether arch/arm has the same bug actually, since the kernel-
> > mode NEON logic was modelled from there IIUC (?)
> >
> Isn't this the common kernel problem of pid reuse?
It's a problem of <task identifier> reuse, so sort of.
> It seems holding a reference to a struct pid would solve your problems.
> See include/linux/pid.h.
>
> That might also make the code more intuitive and prevent future attempts
> of dereferencing potentially dead data structures.
If we want use the same mechanism to track fpsimd contexts that are not
user task contexts in the future, then that wouldn't work.
In particular, I'd like to track vcpu fpsimd contexts in the same way
as task fpsimd contexts in the future: having two different mechanisms
adds cruft and inefficiency and/or bugs.
I'll look at the struct pid thing and have a think.
Cheers
---Dave
More information about the linux-arm-kernel
mailing list