[PATCH] rpmsg: virtio: Fix broken rpmsg_probe()
Arnaud POULIQUEN
arnaud.pouliquen at foss.st.com
Tue Jul 5 23:56:44 PDT 2022
On 7/6/22 06:03, Jason Wang wrote:
> On Mon, Jul 4, 2022 at 5:45 PM Arnaud POULIQUEN
> <arnaud.pouliquen at foss.st.com> wrote:
>>
>> Hello Jason,
>>
>> On 7/4/22 06:35, Jason Wang wrote:
>>> On Fri, Jul 1, 2022 at 2:16 PM Michael S. Tsirkin <mst at redhat.com> wrote:
>>>>
>>>> On Fri, Jul 01, 2022 at 09:22:15AM +0800, Jason Wang wrote:
>>>>> On Fri, Jul 1, 2022 at 3:20 AM Michael S. Tsirkin <mst at redhat.com> wrote:
>>>>>>
>>>>>> On Thu, Jun 30, 2022 at 11:51:30AM -0600, Mathieu Poirier wrote:
>>>>>>> + virtualization at lists.linux-foundation.org
>>>>>>> + jasowang at redhat.com
>>>>>>> + mst at redhat.com
>>>>>>>
>>>>>>> On Thu, 30 Jun 2022 at 10:20, Arnaud POULIQUEN
>>>>>>> <arnaud.pouliquen at foss.st.com> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 6/29/22 19:43, Mathieu Poirier wrote:
>>>>>>>>> Hi Anup,
>>>>>>>>>
>>>>>>>>> On Wed, Jun 08, 2022 at 10:43:34PM +0530, Anup Patel wrote:
>>>>>>>>>> The rpmsg_probe() is broken at the moment because virtqueue_add_inbuf()
>>>>>>>>>> fails due to both virtqueues (Rx and Tx) marked as broken by the
>>>>>>>>>> __vring_new_virtqueue() function. To solve this, virtio_device_ready()
>>>>>>>>>> (which unbreaks queues) should be called before virtqueue_add_inbuf().
>>>>>>>>>>
>>>>>>>>>> Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ")
>>>>>>>>>> Signed-off-by: Anup Patel <apatel at ventanamicro.com>
>>>>>>>>>> ---
>>>>>>>>>> drivers/rpmsg/virtio_rpmsg_bus.c | 6 +++---
>>>>>>>>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/rpmsg/virtio_rpmsg_bus.c b/drivers/rpmsg/virtio_rpmsg_bus.c
>>>>>>>>>> index 905ac7910c98..71a64d2c7644 100644
>>>>>>>>>> --- a/drivers/rpmsg/virtio_rpmsg_bus.c
>>>>>>>>>> +++ b/drivers/rpmsg/virtio_rpmsg_bus.c
>>>>>>>>>> @@ -929,6 +929,9 @@ static int rpmsg_probe(struct virtio_device *vdev)
>>>>>>>>>> /* and half is dedicated for TX */
>>>>>>>>>> vrp->sbufs = bufs_va + total_buf_space / 2;
>>>>>>>>>>
>>>>>>>>>> + /* From this point on, we can notify and get callbacks. */
>>>>>>>>>> + virtio_device_ready(vdev);
>>>>>>>>>> +
>>>>>>>>>
>>>>>>>>> Calling virtio_device_ready() here means that virtqueue_get_buf_ctx_split() can
>>>>>>>>> potentially be called (by way of rpmsg_recv_done()), which will race with
>>>>>>>>> virtqueue_add_inbuf(). If buffers in the virtqueue aren't available then
>>>>>>>>> rpmsg_recv_done() will fail, potentially breaking remote processors' state
>>>>>>>>> machines that don't expect their initial name service to fail when the "device"
>>>>>>>>> has been marked as ready.
>>>>>>>>>
>>>>>>>>> What does make me curious though is that nobody on the remoteproc mailing list
>>>>>>>>> has complained about commit 8b4ec69d7e09 breaking their environment... By now,
>>>>>>>>> i.e. rc4, that should have happened. Anyone from TI, ST and Xilinx care to test this on
>>>>>>>>> their rig?
>>>>>>>>
>>>>>>>> I tested on an STM32MP1 board using tag v5.19-rc4 (03c765b0e3b4).
>>>>>>>> I confirm the issue!
>>>>>>>>
>>>>>>>> Concerning the solution, I share Mathieu's concern. This could break legacy.
>>>>>>>> I made a short test and I would suggest using __virtio_unbreak_device instead,
>>>>>>>> to unbreak the virtqueues without changing the init sequence.
>>>>>>>>
>>>>>>>> In this case the patch would be:
>>>>>>>>
>>>>>>>> + /*
>>>>>>>> + * Unbreak the virtqueues to allow to add buffers before setting the vdev status
>>>>>>>> + * to ready
>>>>>>>> + */
>>>>>>>> + __virtio_unbreak_device(vdev);
>>>>>>>> +
>>>>>>>>
>>>>>>>> /* set up the receive buffers */
>>>>>>>> for (i = 0; i < vrp->num_bufs / 2; i++) {
>>>>>>>> struct scatterlist sg;
>>>>>>>> void *cpu_addr = vrp->rbufs + i * vrp->buf_size;
>>>>>>>
>>>>>>> This will indeed fix the problem. On the flip side the kernel
>>>>>>> documentation for __virtio_unbreak_device() puzzles me...
>>>>>>> It clearly states that it should be used for probing and restoring but
>>>>>>> _not_ directly by the driver. Function rpmsg_probe() is part of
>>>>>>> probing but also the entry point to a driver.
>>>>>>>
>>>>>>> Michael and virtualisation folks, is this the right way to move forward?
>>>>>>
>>>>>> I don't think it is, __virtio_unbreak_device is intended for core use.
>>>>>
>>>>> Can we fill the rx after virtio_device_ready() in this case?
>>>>>
>>>>> Btw, the driver sets DRIVER_OK after registering, so we could probably
>>>>> get an svq kick before DRIVER_OK?
>>
>> By "registering" you mean calling rpmsg_virtio_add_ctrl_dev and
>> rpmsg_ns_register_device?
>
> Yes.
>
>>
>> rpmsg_ns_register_device has to be called before, because the rpmsg_ns
>> device has to be probed to handle the first message coming from the remote
>> side and create the associated local rpmsg device.
>
> I couldn't find the code to do this, maybe you can give me some hint on this.
The rpmsg_ns driver is available here:
https://elixir.bootlin.com/linux/latest/source/drivers/rpmsg/rpmsg_ns.c
It is probed when rpmsg_ns_register_device() is called:
https://elixir.bootlin.com/linux/latest/source/drivers/rpmsg/virtio_rpmsg_bus.c#L974
>
>> It doesn't send any message.
>
> I see the function registers the device to the bus, I wonder if this
> means the device could be probed and used by the driver before
> virtio_device_ready().
>
>>
>> The risk could be for the rpmsg_ctrl device. Registering it
>> after the virtio_device_ready(vdev) call could make sense...
>
> I see.
>
>>
>>>>>
>>>>> Thanks
>>>>
>>>> Is this an ack for the original patch?
>>>
>>> Nope, I meant, instead of moving virtio_device_ready() a little bit
>>> earlier, can we only move the rvq filling after virtio_device_ready().
>>>
>>> Thanks
>>
>> Please find some concerns about this inversion here:
>> https://lore.kernel.org/lkml/20220701053813-mutt-send-email-mst@kernel.org/
>>
>> Regarding __virtio_unbreak_device: its counterpart, virtio_break_device,
>> is used by some virtio drivers.
>> Could we consider that it makes sense to also have a
>> virtio_unbreak_device interface?
>
> We don't want to allow the driver to unbreak a device since it's
> easier to have bugs.
>
>>
>>
>> I do not fully understand the reason for the commit:
>> 8b4ec69d7e09 ("virtio: harden vring IRQ", 2022-05-27)
>
> It tries to forbid virtqueue callbacks from being called before
> virtio_device_ready(). This helps to prevent a malicious device from
> attacking the driver.
>
> But unfortunately, it breaks several drivers because:
>
> 1) some drivers have races in probe/remove
> 2) it tries to reuse vq->broken, which may break drivers that call
> virtqueue_add() before virtio_device_ready(), which is allowed by the
> spec
>
> There's a discussion to have a better behavior that doesn't break the
> existing drivers. And the IRQ hardening feature is marked as broken
> now, so rpmsg should be fine without any extra effort.
Thanks for the explanations.
If the discussions are in a mail thread could you give me the reference?
Thanks,
Arnaud
>
>> So the following alternative is probably pretty naive:
>> Could virtqueue_disable_cb be used as an alternative to the
>> vq->broken usage, allowing buffers to be registered while preventing virtqueue IRQs?
>
> Probably not, there's no guarantee that the device will not send
> notification after virtqueue_disable_cb().
>
> Thanks
>
>>
>> Thanks,
>> Arnaud
>>
>>>
>>>>
>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Arnaud
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Mathieu
>>>>>>>>>
>>>>>>>>>> /* set up the receive buffers */
>>>>>>>>>> for (i = 0; i < vrp->num_bufs / 2; i++) {
>>>>>>>>>> struct scatterlist sg;
>>>>>>>>>> @@ -983,9 +986,6 @@ static int rpmsg_probe(struct virtio_device *vdev)
>>>>>>>>>> */
>>>>>>>>>> notify = virtqueue_kick_prepare(vrp->rvq);
>>>>>>>>>>
>>>>>>>>>> - /* From this point on, we can notify and get callbacks. */
>>>>>>>>>> - virtio_device_ready(vdev);
>>>>>>>>>> -
>>>>>>>>>> /* tell the remote processor it can start sending messages */
>>>>>>>>>> /*
>>>>>>>>>> * this might be concurrent with callbacks, but we are only
>>>>>>>>>> --
>>>>>>>>>> 2.34.1
>>>>>>>>>>
>>>>>>
>>>>
>>>
>>
>
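
For anyone following the thread, the ordering problem can be illustrated with a
small user-space model. This is only a sketch, not kernel code: every toy_*
name below is hypothetical, and the only behaviour borrowed from the thread is
that a hardened virtqueue starts out broken, that adding a buffer to a broken
queue fails, and that marking the device ready unbreaks it. The -EIO return
value is my assumption about what the add path reports for a broken queue.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a virtqueue; NOT the kernel's struct virtqueue. */
struct toy_vq {
	bool broken;	/* with IRQ hardening, queues are created broken */
	int num_added;	/* rx buffers successfully queued so far */
};

/* Toy stand-in for a virtio device with a single rx queue. */
struct toy_vdev {
	struct toy_vq rvq;
	bool driver_ok;
};

static void toy_vdev_init(struct toy_vdev *vdev)
{
	vdev->rvq.broken = true;
	vdev->rvq.num_added = 0;
	vdev->driver_ok = false;
}

/* Models adding an rx buffer: refused while the queue is marked broken. */
static int toy_add_inbuf(struct toy_vq *vq)
{
	if (vq->broken)
		return -EIO;	/* assumed error code for a broken queue */
	vq->num_added++;
	return 0;
}

/* Models marking the device ready: unbreaks the queue, sets DRIVER_OK. */
static void toy_device_ready(struct toy_vdev *vdev)
{
	vdev->rvq.broken = false;
	vdev->driver_ok = true;
}

/*
 * Models the two probe orderings discussed in the thread:
 * ready_first == false: fill rx buffers, then mark the device ready
 *                       (the legacy ordering, which fails with hardening);
 * ready_first == true:  mark the device ready, then fill rx buffers.
 */
static int toy_probe(struct toy_vdev *vdev, int nbufs, bool ready_first)
{
	int i, err;

	toy_vdev_init(vdev);

	if (ready_first)
		toy_device_ready(vdev);

	for (i = 0; i < nbufs; i++) {
		err = toy_add_inbuf(&vdev->rvq);
		if (err)
			return err;
	}

	if (!ready_first)
		toy_device_ready(vdev);

	return 0;
}
```

With this model, toy_probe(&vdev, 4, false) fails on the first buffer, while
toy_probe(&vdev, 4, true) queues all four. The model deliberately ignores the
race Mathieu pointed out (callbacks arriving while the rx queue is still being
filled), which is exactly why the second ordering is not an obvious fix.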