[RFC PATCH 0/9] vhost-nvme: new qemu nvme backend using nvme target
Ming Lin
mlin at kernel.org
Mon Nov 23 23:27:54 PST 2015
On Mon, 2015-11-23 at 15:14 +0100, Paolo Bonzini wrote:
>
> On 23/11/2015 09:17, Ming Lin wrote:
> > On Sat, 2015-11-21 at 14:11 +0100, Paolo Bonzini wrote:
> >>
> >> On 20/11/2015 01:20, Ming Lin wrote:
> >>> One improvement could be to use Google's NVMe vendor extension that
> >>> I sent in another thread, also here:
> >>> https://git.kernel.org/cgit/linux/kernel/git/mlin/linux.git/log/?h=nvme-google-ext
> >>>
> >>> Qemu side:
> >>> http://www.minggr.net/cgit/cgit.cgi/qemu/log/?h=vhost-nvme.0
> >>> Kernel side also here:
> >>> https://git.kernel.org/cgit/linux/kernel/git/mlin/linux.git/log/?h=vhost-nvme.0
> >>
> >> How much do you get with vhost-nvme plus vendor extension, compared to
> >> 190 MB/s for QEMU?
> >
> > There are still some bugs. I'll update.
>
> Sure.
>
> >> Note that in all likelihood, QEMU can actually do better than 190 MB/s,
> >> and gain more parallelism too, by moving the processing of the
> >> ioeventfds to a separate thread. This is similar to
> >> hw/block/dataplane/virtio-blk.c.
> >>
> >> It's actually pretty easy to do. Even though
> >> hw/block/dataplane/virtio-blk.c is still using some old APIs, all memory
> >> access in QEMU is now thread-safe. I have pending patches for 2.6 that
> >> cut that file down to a mere 200 lines of code, NVMe would probably be
> >> about the same.
> >
> > Is there a git tree for your patches?
>
> No, not yet. I'll post them today or tomorrow, will make sure to Cc you.
>
> > Did you mean something like the pseudo code below?
> > 1. need an iothread for each cq/sq?
> > 2. need an AioContext for each cq/sq?
> >
> > hw/block/nvme.c | 32 ++++++++++++++++++++++++++++++--
> > hw/block/nvme.h | 8 ++++++++
> > 2 files changed, 38 insertions(+), 2 deletions(-)
> >
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index f27fd35..fed4827 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -28,6 +28,8 @@
> >  #include "sysemu/sysemu.h"
> >  #include "qapi/visitor.h"
> >  #include "sysemu/block-backend.h"
> > +#include "sysemu/iothread.h"
> > +#include "qom/object_interfaces.h"
> >
> >  #include "nvme.h"
> >
> > @@ -558,9 +560,22 @@ static void nvme_init_cq_eventfd(NvmeCQueue *cq)
> >      uint16_t offset = (cq->cqid*2+1) * (4 << NVME_CAP_DSTRD(n->bar.cap));
> >
> >      event_notifier_init(&cq->notifier, 0);
> > -    event_notifier_set_handler(&cq->notifier, nvme_cq_notifier);
> >      memory_region_add_eventfd(&n->iomem,
> >          0x1000 + offset, 4, false, 0, &cq->notifier);
> > +
> > +    object_initialize(&cq->internal_iothread_obj,
> > +                      sizeof(cq->internal_iothread_obj),
> > +                      TYPE_IOTHREAD);
> > +    user_creatable_complete(OBJECT(&cq->internal_iothread_obj), &error_abort);
>
> For now, you have to use one iothread for all cq/sq of a single NVMe
> device; multiqueue block layer is planned for 2.7 or 2.8. Otherwise
> yes, it's very close to just these changes.
Here is the call stack of the iothread for virtio-blk-dataplane:
handle_notify (qemu/hw/block/dataplane/virtio-blk.c:126)
aio_dispatch (qemu/aio-posix.c:329)
aio_poll (qemu/aio-posix.c:474)
iothread_run (qemu/iothread.c:45)
start_thread (pthread_create.c:312)
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)
I think I'll have a "nvme_dev_notify" similar to "handle_notify":
static void nvme_dev_notify(EventNotifier *e)
{
    ....
}
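and then attach each queue's notifier to the shared iothread's AioContext
instead of the main loop, roughly like this (n->ctx is the AioContext from
the sketch above; the bool is the is_external flag that current master
takes, so the exact signature may differ in other trees):

    /* doorbell eventfds handled in the iothread, not the main loop */
    aio_set_event_notifier(n->ctx, &cq->notifier, false, nvme_dev_notify);
    aio_set_event_notifier(n->ctx, &sq->notifier, false, nvme_dev_notify);

That way aio_poll()/aio_dispatch() in iothread_run() would invoke
nvme_dev_notify() from the iothread, just like handle_notify() in the
call stack above.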
But then how can I know whether this notify is for a cq or an sq?
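Or maybe the answer is to not share the handler at all. A per-queue sketch
(assuming the sq side also embeds its notifier the way cq->notifier does,
and reusing the existing nvme_post_cqes()/nvme_process_sq() paths) could
recover the queue with container_of(), the same way handle_notify() gets at
its VirtIOBlockDataPlane:

static void nvme_cq_notifier_aio(EventNotifier *e)
{
    /* the notifier is embedded in NvmeCQueue, so container_of() tells
     * us which completion queue's doorbell fired */
    NvmeCQueue *cq = container_of(e, NvmeCQueue, notifier);

    event_notifier_test_and_clear(e);
    nvme_post_cqes(cq);     /* or whatever the existing nvme_cq_notifier() does */
}

static void nvme_sq_notifier_aio(EventNotifier *e)
{
    /* same trick for submission queues, assuming an sq->notifier field */
    NvmeSQueue *sq = container_of(e, NvmeSQueue, notifier);

    event_notifier_test_and_clear(e);
    nvme_process_sq(sq);    /* existing sq processing path (takes void *opaque) */
}

Each of these would then be passed to aio_set_event_notifier() for its own
queue, instead of registering a single nvme_dev_notify(). Does that look
right, or do you have something simpler in mind?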