[PATCH] rpmsg: virtio: Fix broken rpmsg_probe()

Jason Wang jasowang at redhat.com
Tue Jul 5 21:03:27 PDT 2022


On Mon, Jul 4, 2022 at 5:45 PM Arnaud POULIQUEN
<arnaud.pouliquen at foss.st.com> wrote:
>
> Hello Jason,
>
> On 7/4/22 06:35, Jason Wang wrote:
> > On Fri, Jul 1, 2022 at 2:16 PM Michael S. Tsirkin <mst at redhat.com> wrote:
> >>
> >> On Fri, Jul 01, 2022 at 09:22:15AM +0800, Jason Wang wrote:
> >>> On Fri, Jul 1, 2022 at 3:20 AM Michael S. Tsirkin <mst at redhat.com> wrote:
> >>>>
> >>>> On Thu, Jun 30, 2022 at 11:51:30AM -0600, Mathieu Poirier wrote:
> >>>>> + virtualization at lists.linux-foundation.org
> >>>>> + jasowang at redhat.com
> >>>>> + mst at redhat.com
> >>>>>
> >>>>> On Thu, 30 Jun 2022 at 10:20, Arnaud POULIQUEN
> >>>>> <arnaud.pouliquen at foss.st.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> On 6/29/22 19:43, Mathieu Poirier wrote:
> >>>>>>> Hi Anup,
> >>>>>>>
> >>>>>>> On Wed, Jun 08, 2022 at 10:43:34PM +0530, Anup Patel wrote:
> >>>>>>>> The rpmsg_probe() is broken at the moment because virtqueue_add_inbuf()
> >>>>>>>> fails due to both virtqueues (Rx and Tx) marked as broken by the
> >>>>>>>> __vring_new_virtqueue() function. To solve this, virtio_device_ready()
> >>>>>>>> (which unbreaks queues) should be called before virtqueue_add_inbuf().
> >>>>>>>>
> >>>>>>>> Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ")
> >>>>>>>> Signed-off-by: Anup Patel <apatel at ventanamicro.com>
> >>>>>>>> ---
> >>>>>>>>  drivers/rpmsg/virtio_rpmsg_bus.c | 6 +++---
> >>>>>>>>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/rpmsg/virtio_rpmsg_bus.c b/drivers/rpmsg/virtio_rpmsg_bus.c
> >>>>>>>> index 905ac7910c98..71a64d2c7644 100644
> >>>>>>>> --- a/drivers/rpmsg/virtio_rpmsg_bus.c
> >>>>>>>> +++ b/drivers/rpmsg/virtio_rpmsg_bus.c
> >>>>>>>> @@ -929,6 +929,9 @@ static int rpmsg_probe(struct virtio_device *vdev)
> >>>>>>>>      /* and half is dedicated for TX */
> >>>>>>>>      vrp->sbufs = bufs_va + total_buf_space / 2;
> >>>>>>>>
> >>>>>>>> +    /* From this point on, we can notify and get callbacks. */
> >>>>>>>> +    virtio_device_ready(vdev);
> >>>>>>>> +
> >>>>>>>
> >>>>>>> Calling virtio_device_ready() here means that virtqueue_get_buf_ctx_split() can
> >>>>>>> potentially be called (by way of rpmsg_recv_done()), which will race with
> >>>>>>> virtqueue_add_inbuf().  If buffers in the virtqueue aren't available then
> >>>>>>> rpmsg_recv_done() will fail, potentially breaking remote processors' state
> >>>>>>> machines that don't expect their initial name service to fail when the "device"
> >>>>>>> has been marked as ready.
> >>>>>>>
> >>>>>>> What does make me curious though is that nobody on the remoteproc mailing list
> >>>>>>> has complained about commit 8b4ec69d7e09 breaking their environment... By now,
> >>>>>>> i.e. rc4, that should have happened.  Anyone from TI, ST and Xilinx care to test this on
> >>>>>>> their rig?
> >>>>>>
> >>>>>> I tested on an STM32MP1 board using tag v5.19-rc4 (03c765b0e3b4).
> >>>>>> I confirm the issue!
> >>>>>>
> >>>>>> Concerning the solution, I share Mathieu's concern. This could break legacy.
> >>>>>> I made a short test and I would suggest using __virtio_unbreak_device
> >>>>>> instead, to unbreak the virtqueues without changing the init sequence.
> >>>>>>
> >>>>>> In this case the patch would be:
> >>>>>>
> >>>>>> +       /*
> >>>>>> +        * Unbreak the virtqueues to allow to add buffers before setting the vdev status
> >>>>>> +        * to ready
> >>>>>> +        */
> >>>>>> +       __virtio_unbreak_device(vdev);
> >>>>>> +
> >>>>>>
> >>>>>>         /* set up the receive buffers */
> >>>>>>         for (i = 0; i < vrp->num_bufs / 2; i++) {
> >>>>>>                 struct scatterlist sg;
> >>>>>>                 void *cpu_addr = vrp->rbufs + i * vrp->buf_size;
> >>>>>
> >>>>> This will indeed fix the problem.  On the flip side the kernel
> >>>>> documentation for __virtio_unbreak_device() puzzles me...
> >>>>> It clearly states that it should be used for probing and restoring but
> >>>>> _not_ directly by the driver.  Function rpmsg_probe() is part of
> >>>>> probing but also the entry point to a driver.
> >>>>>
> >>>>> Michael and virtualisation folks, is this the right way to move forward?
> >>>>
> >>>> I don't think it is, __virtio_unbreak_device is intended for core use.
> >>>
> >>> Can we fill the rx after virtio_device_ready() in this case?
> >>>
> >>> Btw, the driver sets DRIVER_OK after registering, so could we get an svq
> >>> kick before DRIVER_OK?
>
> By "registering" you mean calling rpmsg_virtio_add_ctrl_dev and
> rpmsg_ns_register_device?

Yes.

>
> The rpmsg_ns_register_device has to be called before. Because it has to be
> probed to handle the first message coming from the remote side to create
> associated rpmsg local device.

I couldn't find the code to do this, maybe you can give me some hint on this.

> It doesn't send message.

I see that the function registers the device on the bus. I wonder if this
means the device could be probed and used by the driver before
virtio_device_ready().

>
> The risk could be for the rpmsg_ctrl device. Registering it
> after the virtio_device_ready(vdev) call could make sense...

I see.

>
> >>>
> >>> Thanks
> >>
> >> Is this an ack for the original patch?
> >
> > Nope, I meant, instead of moving virtio_device_ready() a little bit
> > earlier, can we only move the rvq filling after virtio_device_ready()?
> >
> > Thanks
>
> Please find some concerns about this inversion here:
> https://lore.kernel.org/lkml/20220701053813-mutt-send-email-mst@kernel.org/
>
> Regarding __virtio_unbreak_device: its counterpart, virtio_break_device, is
> used by some virtio drivers.
> Could we consider that it makes sense to also have a
> virtio_unbreak_device interface?

We don't want to allow the driver to unbreak a device, since that makes
bugs more likely.

>
>
> I do not fully understand the reason for the commit:
> 8b4ec69d7e09 ("virtio: harden vring IRQ", 2022-05-27)

It tries to prevent virtqueue callbacks from being called before
virtio_device_ready(). This helps to prevent a malicious device from
attacking the driver.

But unfortunately, it breaks several drivers because:

1) some drivers have races in probe/remove
2) it tries to reuse vq->broken, which may break a driver that calls
virtqueue_add() before virtio_device_ready(), which is allowed by the
spec

There's an ongoing discussion about a better behavior that doesn't break
existing drivers. The IRQ hardening feature is marked as broken for
now, so rpmsg should be fine without any extra effort.
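
To make point 2) concrete, here is a minimal userspace sketch of the
ordering problem. It is a toy model, not kernel code: struct toy_vq and
all toy_* functions are hypothetical names invented for illustration.
It assumes only what is described above, namely that queues start out
broken, that adding a buffer to a broken queue fails, and that the
"ready" step unbreaks the queue.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a virtqueue (hypothetical, not a kernel structure). */
struct toy_vq {
	bool broken;   /* set at creation, cleared by the "ready" step */
	int num_free;  /* free descriptor slots */
};

/* Models queue creation with the hardening applied: starts out broken. */
void toy_vq_init(struct toy_vq *vq, int size)
{
	vq->broken = true;
	vq->num_free = size;
}

/* Models adding one receive buffer: refuses to touch a broken queue. */
int toy_add_inbuf(struct toy_vq *vq)
{
	if (vq->broken)
		return -EIO;
	if (vq->num_free == 0)
		return -ENOSPC;
	vq->num_free--;
	return 0;
}

/* Models the "device ready" step: unbreaks the queue. */
void toy_device_ready(struct toy_vq *vq)
{
	vq->broken = false;
}

/* Tries to queue n receive buffers; returns how many were accepted. */
int toy_fill_rx(struct toy_vq *vq, int n)
{
	int queued = 0;

	for (int i = 0; i < n; i++)
		if (toy_add_inbuf(vq) == 0)
			queued++;
	return queued;
}
```

In this model, calling toy_fill_rx() before toy_device_ready() queues
nothing, which mirrors the rpmsg_probe() failure; calling it afterwards
fills the queue.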

> So the following alternative is probably pretty naive:
> Could the use of virtqueue_disable_cb be an alternative to the
> vq->broken usage, allowing buffers to be registered while preventing
> virtqueue IRQs?

Probably not; there's no guarantee that the device will not send a
notification after virtqueue_disable_cb().
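
A small toy model of why a hint is not enough, again with hypothetical
names (struct hint_vq and the hint_* functions are not kernel APIs). It
assumes a disable_cb-style call only publishes an advisory flag that the
device is free to ignore, whereas a driver-side gate checked in the
interrupt path does not depend on the device's cooperation.

```c
#include <stdbool.h>

/* Toy model of one virtqueue's notification state (hypothetical). */
struct hint_vq {
	bool no_interrupt_hint; /* advisory flag set by a disable_cb-style call */
	bool broken;            /* driver-side gate checked on every interrupt */
	int callbacks_run;      /* how many times the driver callback ran */
};

/* Models a disable_cb-style call: it only publishes a hint. */
void hint_disable_cb(struct hint_vq *vq)
{
	vq->no_interrupt_hint = true;
}

/*
 * Models the device raising an interrupt. A well-behaved device
 * (honor_hint == true) respects the hint; a buggy or malicious one
 * ignores it. Returns true if the driver callback actually ran.
 */
bool hint_device_interrupt(struct hint_vq *vq, bool honor_hint)
{
	if (honor_hint && vq->no_interrupt_hint)
		return false;   /* device chose not to interrupt */
	if (vq->broken)
		return false;   /* driver-side gate drops the interrupt */
	vq->callbacks_run++;
	return true;
}
```

With only the hint set, a device that ignores it still gets the callback
to run; with the driver-side gate set, the callback is skipped no matter
what the device does.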

Thanks

>
> Thanks,
> Arnaud
>
> >
> >>
> >>>>
> >>>>>>
> >>>>>> Regards,
> >>>>>> Arnaud
> >>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Mathieu
> >>>>>>>
> >>>>>>>>      /* set up the receive buffers */
> >>>>>>>>      for (i = 0; i < vrp->num_bufs / 2; i++) {
> >>>>>>>>              struct scatterlist sg;
> >>>>>>>> @@ -983,9 +986,6 @@ static int rpmsg_probe(struct virtio_device *vdev)
> >>>>>>>>       */
> >>>>>>>>      notify = virtqueue_kick_prepare(vrp->rvq);
> >>>>>>>>
> >>>>>>>> -    /* From this point on, we can notify and get callbacks. */
> >>>>>>>> -    virtio_device_ready(vdev);
> >>>>>>>> -
> >>>>>>>>      /* tell the remote processor it can start sending messages */
> >>>>>>>>      /*
> >>>>>>>>       * this might be concurrent with callbacks, but we are only
> >>>>>>>> --
> >>>>>>>> 2.34.1
> >>>>>>>>
> >>>>
> >>
> >
>




More information about the kvm-riscv mailing list