[syzbot] BUG: unable to handle kernel access to user memory in schedule_tail

Fri Mar 12 17:34:48 GMT 2021

On Fri, Mar 12, 2021 at 5:36 PM Ben Dooks <ben.dooks at codethink.co.uk> wrote:
>
> On 12/03/2021 16:34, Ben Dooks wrote:
> > On 12/03/2021 16:30, Ben Dooks wrote:
> >> On 12/03/2021 15:12, Dmitry Vyukov wrote:
> >>> On Fri, Mar 12, 2021 at 2:50 PM Ben Dooks <ben.dooks at codethink.co.uk>
> >>> wrote:
> >>>>
> >>>> On 10/03/2021 17:16, Dmitry Vyukov wrote:
> >>>>> On Wed, Mar 10, 2021 at 5:46 PM syzbot
> >>>>> <syzbot+e74b94fe601ab9552d69 at syzkaller.appspotmail.com> wrote:
> >>>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> syzbot found the following issue on:
> >>>>>>
> >>>>>> HEAD commit:    0d7588ab riscv: process: Fix no prototype for
> >>>>>> arch_dup_tas..
> >>>>>> git tree:
> >>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
> >>>>>> console output:
> >>>>>> https://syzkaller.appspot.com/x/log.txt?x=1212c6e6d00000
> >>>>>> kernel config:
> >>>>>> https://syzkaller.appspot.com/x/.config?x=e3c595255fb2d136
> >>>>>> dashboard link:
> >>>>>> https://syzkaller.appspot.com/bug?extid=e74b94fe601ab9552d69
> >>>>>> userspace arch: riscv64
> >>>>>>
> >>>>>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>>>>
> >>>>>> IMPORTANT: if you fix the issue, please add the following tag to
> >>>>>> the commit:
> >>>>>> Reported-by: syzbot+e74b94fe601ab9552d69 at syzkaller.appspotmail.com
> >>>>>
> >>>>> +riscv maintainers
> >>>>>
> >>>>> This is riscv64-specific.
> >>>>> I've seen similar crashes in put_user in other places. It looks like
> >>>>> put_user crashes if the user address is not mapped/protected (?).
> >>>>
> >>>> I've been having a look, and this seems to be down to access of the
> >>>> tsk->set_child_tid variable. I assume the fuzzing here is to pass a
> >>>> bad address to clone?
> >>>>
> >>>>   From looking at the code, the put_user() code should have set the
> >>>> relevant SR_SUM bit (the value for this, which is 1<<18 is in the
> >>>> s2 register in the crash report) and from looking at the compiler
> >>>> output from my gcc-10, the code looks to be doing the relevant csrs
> >>>> and then csrc around the put_user.
> >>>>
> >>>> So currently I do not understand how the above could have happened
> >>>> other than something re-tried the code sequence and ended up retrying
> >>>> the faulting instruction without the SR_SUM bit set.
> >>>
> >>> I would maybe blame qemu for randomly resetting SR_SUM, but it's
> >>> strange that 99% of these crashes are in schedule_tail. If it would be
> >>> qemu, then they would be more evenly distributed...
> >>>
> >>> Another observation: looking at a dozen crash logs, in none of
> >>> these cases was the fuzzer actually trying to fuzz clone with some
> >>> insane arguments. So it looks like completely normal clones (e.g. coming
> >>> from pthread_create) result in this crash.
> >>>
> >>> I also wonder why there is ret_from_exception, is it normal? I see
> >>> handle_exception disables SR_SUM:
> >>> https://elixir.bootlin.com/linux/v5.12-rc2/source/arch/riscv/kernel/entry.S#L73
> >>>
> >>
> >> So I think if SR_SUM is set, then it faults the access to user memory
> >> which the _user() routines clear to allow them access.
> >>
> >> I'm thinking there is at least one issue here:
> >>
> >> - the test in fault is the wrong way around for die kernel
> >> - the handler only catches this if the page has yet to be mapped.
> >>
> >> So I think the test should be:
> >>
> >>          if (!user_mode(regs) && addr < TASK_SIZE &&
> >>                          unlikely(regs->status & SR_SUM))
> >>
> >> This then should continue on and allow the rest of the handler to
> >> complete mapping the page if it is not there.
> >>
> >> I have been trying to create a very simple clone test, but so far it
> >> has yet to actually trigger anything.
> >
> > I should have added there doesn't seem to be a good way to use mmap()
> > to allocate memory but not insert a vm-mapping post the mmap().
> >
> How difficult is it to try building a branch with the above test
> modified?

I don't have access to hardware, I don't have other qemu versions ready to use.
But I can teach you how to run syzkaller locally :)
I am not sure anybody has run it on real riscv hardware at all. When
Tobias ported syzkaller, he also used qemu I think.

I am now building with an inverted check to test locally.

I don't fully understand this code, but does handle_exception
reset SR_SUM around do_page_fault? If so, then looking at SR_SUM in
do_page_fault won't work with either a positive or a negative check.
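
To make the concern concrete, here is a minimal userspace model of the two
variants of the check. It is only a sketch, not the kernel code: struct
model_regs, MODEL_TASK_SIZE and the function names are made up for
illustration, and the only value taken from this thread is SR_SUM = 1<<18.
It assumes the worst case asked about above, namely that the status value
do_page_fault looks at always has SUM cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SR_SUM          (UINT64_C(1) << 18) /* bit value quoted earlier in the thread */
#define MODEL_TASK_SIZE (UINT64_C(1) << 38) /* hypothetical user/kernel split, model only */

/* Hypothetical stand-in for the fields the check reads from pt_regs. */
struct model_regs {
	uint64_t status;    /* status value visible to the fault handler */
	bool     from_user; /* what user_mode(regs) would return */
};

/* Positive check: kernel fault on a user address is fatal when SUM is set. */
static bool fatal_if_sum_set(const struct model_regs *regs, uint64_t addr)
{
	return !regs->from_user && addr < MODEL_TASK_SIZE &&
	       (regs->status & SR_SUM) != 0;
}

/* Negative check: kernel fault on a user address is fatal when SUM is clear. */
static bool fatal_if_sum_clear(const struct model_regs *regs, uint64_t addr)
{
	return !regs->from_user && addr < MODEL_TASK_SIZE &&
	       (regs->status & SR_SUM) == 0;
}

/* Assumed scenario: the handler only ever sees a status with SUM cleared. */
static void model_demo(void)
{
	struct model_regs regs = { .status = 0, .from_user = false };
	uint64_t user_addr = 0x1000;

	/* The positive check can never fire in this scenario... */
	assert(!fatal_if_sum_set(&regs, user_addr));
	/* ...and the negative check fires on every kernel fault on a user address. */
	assert(fatal_if_sum_clear(&regs, user_addr));
}
```

Under that assumption neither variant tells a legitimate put_user apart
from a stray kernel access, because the bit carries no information by the
time it is tested. Whether the assumption holds depends on which status
value handle_exception saves into regs, which is exactly the open question.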