[syzbot] BUG: unable to handle kernel access to user memory in schedule_tail

Sat Mar 13 07:20:57 GMT 2021

On Fri, Mar 12, 2021 at 9:12 PM Ben Dooks <ben.dooks at codethink.co.uk> wrote:
>
> On 12/03/2021 16:25, Alex Ghiti wrote:
> >
> >
> > On 3/12/21 at 10:12 AM, Dmitry Vyukov wrote:
> >> On Fri, Mar 12, 2021 at 2:50 PM Ben Dooks <ben.dooks at codethink.co.uk>
> >> wrote:
> >>>
> >>> On 10/03/2021 17:16, Dmitry Vyukov wrote:
> >>>> On Wed, Mar 10, 2021 at 5:46 PM syzbot
> >>>> <syzbot+e74b94fe601ab9552d69 at syzkaller.appspotmail.com> wrote:
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> syzbot found the following issue on:
> >>>>>
> >>>>> HEAD commit:    0d7588ab riscv: process: Fix no prototype for
> >>>>> arch_dup_tas..
> >>>>> git tree:
> >>>>> git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
> >>>>> console output:
> >>>>> https://syzkaller.appspot.com/x/log.txt?x=1212c6e6d00000
> >>>>> kernel config:
> >>>>> https://syzkaller.appspot.com/x/.config?x=e3c595255fb2d136
> >>>>> dashboard link:
> >>>>> https://syzkaller.appspot.com/bug?extid=e74b94fe601ab9552d69
> >>>>> userspace arch: riscv64
> >>>>>
> >>>>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>>>
> >>>>> IMPORTANT: if you fix the issue, please add the following tag to
> >>>>> the commit:
> >>>>> Reported-by: syzbot+e74b94fe601ab9552d69 at syzkaller.appspotmail.com
> >>>>
> >>>> +riscv maintainers
> >>>>
> >>>> This is riscv64-specific.
> >>>> I've seen similar crashes in put_user in other places. It looks like
> >>>> put_user crashes if the user address is not mapped/protected (?).
> >>>
> >>> I've been having a look, and this seems to be down to access of the
> >>> tsk->set_child_tid variable. I assume the fuzzing here is to pass a
> >>> bad address to clone?
> >>>
> >>> From looking at the code, the put_user() code should have set the
> >>> relevant SR_SUM bit (the value for this, 1<<18, is in the
> >>> s2 register in the crash report) and from looking at the compiler
> >>> output from my gcc-10, the code looks to be doing the relevant csrs
> >>> and then csrc around the put_user.
> >>>
> >>> So currently I do not understand how the above could have happened
> >>> other than something re-tried the code sequence and ended up retrying
> >>> the faulting instruction without the SR_SUM bit set.
> >>
> >> I would maybe blame qemu for randomly resetting SR_SUM, but it's
> >> strange that 99% of these crashes are in schedule_tail. If it would be
> >> qemu, then they would be more evenly distributed...
> >>
> >> Another observation: looking at a dozen crash logs, in none of
> >> these cases was the fuzzer actually trying to fuzz clone with some
> >> insane arguments. So it looks like completely normal clone() calls
> >> (e.g. coming from pthread_create) result in this crash.
> >>
> >> I also wonder why there is ret_from_exception, is it normal? I see
> >> handle_exception disables SR_SUM:
> >
> > csrrc does the right thing: it clears the SR_SUM bit in status but saves
> > the previous value, which will get correctly restored.
> >
> > ("The CSRRC (Atomic Read and Clear Bits in CSR) instruction reads the
> > value of the CSR, zero-extends the value to XLEN bits, and writes it to
> > integer register rd. The initial value in integer register rs1 is treated
> > as a bit mask that specifies bit positions to be cleared in the CSR. Any
> > bit that is high in rs1 will cause the corresponding bit to be cleared in
> > the CSR, if that CSR bit is writable. Other bits in the CSR are
> > unaffected.")
>
> I think there may also be an understanding issue on what the SR_SUM
> bit does. I thought if it is set, M->U accesses would fault, which is
> why it gets set early on. But from reading the uaccess code it looks
> like the uaccess code sets it on entry and then clears on exit.
>
> I am very confused. Is there a master reference for rv64?
>
> https://people.eecs.berkeley.edu/~krste/papers/riscv-privileged-v1.9.pdf
> seems to state PUM is the SR_SUM bit, and that (if set) disabled
>
> Quote:
>   The PUM (Protect User Memory) bit modifies the privilege with which
> S-mode loads, stores, and instruction fetches access virtual memory.
> When PUM=0, translation and protection behave as normal. When PUM=1,
> S-mode memory accesses to pages that are accessible by U-mode (U=1 in
> Figure 4.19) will fault. PUM has no effect when executing in U-mode
>
>
> >> https://elixir.bootlin.com/linux/v5.12-rc2/source/arch/riscv/kernel/entry.S#L73
> >>
> >
> > Still no luck for the moment, can't reproduce it locally, my test is
> > maybe not that good (I created threads all day long in order to trigger
> > the put_user of schedule_tail).
>
> It may of course depend on memory and other stuff. I did try to see if
> it was possible to clone() with the child_tid address being a valid but
> not mapped page...
>
> > Given that the path you mention works most of the time, and that the
> > status register in the stack trace shows the SUM bit is not set whereas
> > it is set in put_user, I'm leaning toward some race condition (maybe an
> > interrupt that arrives at the "wrong" time) or a qemu issue as you
> > mentioned.
>
> I suppose this is possible. From what I read it should get to the
> point of being there with the SUM flag cleared, so either something
> went wrong in trying to fix the instruction up or there's some other
> error we're missing.
>
> > To eliminate qemu issues, do you have access to some HW ? Or to
> > different qemu versions ?
>
> I do have access to a Microchip Polarfire board. I just need the
> instructions on how to setup the test-code to make it work on the
> hardware.
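
As an aside on the CSRRC/CSRRS behaviour discussed above, here is a toy
C model of the read-and-clear / read-and-set semantics and of a trap taken
while SR_SUM is set. It is only a sketch: the status register is a plain
variable, trap_roundtrip() is a hypothetical helper and not kernel code,
and the other status bits that real trap entry/exit touch are ignored.
SR_SUM = 1<<18 is the value mentioned earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define SR_SUM (UINT64_C(1) << 18) /* value quoted earlier in the thread */

/* Toy model of CSRRS: return the old CSR value, then set the bits in mask. */
static uint64_t csrrs(uint64_t *csr, uint64_t mask)
{
	uint64_t old = *csr;

	*csr |= mask;
	return old;
}

/* Toy model of CSRRC: return the old CSR value, then clear the bits in mask. */
static uint64_t csrrc(uint64_t *csr, uint64_t mask)
{
	uint64_t old = *csr;

	*csr &= ~mask;
	return old;
}

/*
 * Hypothetical trap round trip (an assumption for illustration only):
 * entry clears SR_SUM but keeps the old status value, the handler runs
 * with SR_SUM clear, and exit writes the saved value back.
 * Returns the status value seen after the trap returns.
 */
static uint64_t trap_roundtrip(uint64_t *status)
{
	uint64_t saved = csrrc(status, SR_SUM); /* entry: SUM off, old value kept */

	assert((*status & SR_SUM) == 0);        /* handler body runs with SUM clear */
	*status = saved;                        /* exit: saved value restored */
	return *status;
}
```

In this model a trap that arrives between the csrs and the csrc around
put_user comes back with SR_SUM still set, which is why the save/restore
path on its own does not obviously explain the crash.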

For full syzkaller support, it would need to know how to reboot these
boards and get access to the console.
syzkaller has a stop-gap VM backend which just uses ssh to a physical
machine and expects the kernel to reboot on its own after any crashes.

But I actually managed to reproduce it in an even simpler setup.
Assuming you have Go 1.15 and a riscv64 cross-compiler gcc installed:

$ go get -u -d github.com/google/syzkaller/...
$ cd $GOPATH/src/github.com/google/syzkaller
$ make stress executor TARGETARCH=riscv64
$ scp bin/linux_riscv64/syz-execprog bin/linux_riscv64/syz-executor \
    your_machine:/

Then run ./syz-stress on the machine.
On the first run it crashed with some other bug; on the second run
I got the crash in schedule_tail.
With qemu tcg I also added the -slowdown=10 flag to syz-stress to scale
all timeouts; if native execution is faster, you don't need it.
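
If syz-stress is too heavy for a first try, a much smaller loop that only
creates processes may be enough to hit the same schedule_tail path. The
sketch below is not part of syzkaller; fork_many() is a hypothetical
helper. It assumes that the C library's fork() asks the kernel to write
the child's TID to user memory (CLONE_CHILD_SETTID), which as far as I
know is the case for glibc, so each child should go through the put_user
in schedule_tail before it exits.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork n children one after another; each child exits immediately.
 * Returns how many children were reaped with exit status 0.
 */
static int fork_many(int n)
{
	int ok = 0;

	for (int i = 0; i < n; i++) {
		int status;
		pid_t pid = fork();

		if (pid < 0)
			break;          /* fork failed: stop early */
		if (pid == 0)
			_exit(0);       /* child: nothing to do, just exit */

		/* parent: wait for this child and check that it exited cleanly */
		if (waitpid(pid, &status, 0) == pid &&
		    WIFEXITED(status) && WEXITSTATUS(status) == 0)
			ok++;
	}
	return ok;
}
```

Calling fork_many() with a large count in a loop from main() on the
target, ideally from several processes at once, is the intended use.
Whether this reproduces the crash is untested.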