[PATCH v5 06/14] iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC
Jason Gunthorpe
jgg at nvidia.com
Fri Jan 10 11:49:50 PST 2025
On Fri, Jan 10, 2025 at 11:27:53AM -0800, Nicolin Chen wrote:
> On Fri, Jan 10, 2025 at 01:48:42PM -0400, Jason Gunthorpe wrote:
> > On Tue, Jan 07, 2025 at 09:10:09AM -0800, Nicolin Chen wrote:
> >
> > > +static ssize_t iommufd_veventq_fops_read(struct iommufd_eventq *eventq,
> > > + char __user *buf, size_t count,
> > > + loff_t *ppos)
> > > +{
> > > + size_t done = 0;
> > > + int rc = 0;
> > > +
> > > + if (*ppos)
> > > + return -ESPIPE;
> > > +
> > > + mutex_lock(&eventq->mutex);
> > > + while (!list_empty(&eventq->deliver) && count > done) {
> > > + struct iommufd_vevent *cur = list_first_entry(
> > > + &eventq->deliver, struct iommufd_vevent, node);
> > > +
> > > + if (cur->data_len > count - done)
> > > + break;
> > > +
> > > + if (copy_to_user(buf + done, cur->event_data, cur->data_len)) {
> > > + rc = -EFAULT;
> > > + break;
> > > + }
> >
> > Now that I look at this more closely, the fault path this is copied
> > from is not great.
> >
> > This copy_to_user() can block while waiting on a page fault, possibily
> > for a long time. While blocked the mutex is held and we can't add more
> > entries to the list.
> >
> > That will cause the shared IRQ handler in the iommu driver to back up,
> > which would cause a global DOS.
> >
> > This probably wants to be organized to look more like
> >
> > while (itm = eventq_get_next_item(eventq)) {
> > if (..) {
> > eventq_restore_failed_item(eventq);
> > return -1;
> > }
> > }
> >
> > Where the next_item would just be a simple spinlock across the linked
> > list manipulation.
>
> Would it be simpler to just limit it to one node per read(), i.e.
> drop the "while (!list_empty)" loop so there is no blocking?
>
> The report() adds one node at a time, and wakes up the poll()
> each time a node is added. And user space could read one event
> at a time too?
That doesn't really help; the issue is that it holds the lock over the
copy_to_user(), which it does because it doesn't want to pull the item
off the list and then have to handle the failure and put it back.
Jason