[PATCH V2 1/2] nvme-core: fix nvme module ref count Oops

Christoph Hellwig hch at lst.de
Wed Sep 16 12:01:55 EDT 2020


On Wed, Sep 16, 2020 at 09:58:38AM -0600, Logan Gunthorpe wrote:
> 
> 
> On 2020-09-15 9:53 p.m., Chaitanya Kulkarni wrote:
> > Introduce a char dev release function and get/put the module reference, which
> > allows us to fix a potential Oops that can be easily reproduced with
> > an NVMeOF passthru ctrl:
> > 
> > Entering kdb (current=0xffff8887f8290000, pid 3128) on processor 30 Oops: (null)
> > due to oops @ 0xffffffffa01019ad
> > CPU: 30 PID: 3128 Comm: bash Tainted: G        W  OE     5.8.0-rc4nvme-5.9+ #35
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.4
> > RIP: 0010:nvme_free_ctrl+0x234/0x285 [nvme_core]
> > Code: 57 10 a0 e8 73 bf 02 e1 ba 3d 11 00 00 48 c7 c6 98 33 10 a0 48 c7 c7 1d 57 10 a0 e8 5b bf 02 e1 8
> > RSP: 0018:ffffc90001d63de0 EFLAGS: 00010246
> > RAX: ffffffffa05c0440 RBX: ffff8888119e45a0 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: ffff8888177e9550 RDI: ffff8888119e43b0
> > RBP: ffff8887d4768000 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: ffffc90001d63c90 R12: ffff8888119e43b0
> > R13: ffff8888119e5108 R14: dead000000000100 R15: ffff8888119e5108
> > FS:  00007f1ef27b0740(0000) GS:ffff888817600000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: ffffffffa05c0470 CR3: 00000007f6bee000 CR4: 00000000003406e0
> > Call Trace:
> >  device_release+0x27/0x80
> >  kobject_put+0x98/0x170
> >  nvmet_passthru_ctrl_disable+0x4a/0x70 [nvmet]
> >  nvmet_passthru_enable_store+0x4c/0x90 [nvmet]
> >  configfs_write_file+0xe6/0x150
> >  vfs_write+0xba/0x1e0
> >  ksys_write+0x5f/0xe0
> >  do_syscall_64+0x52/0xb0
> >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > RIP: 0033:0x7f1ef1eb2840
> > Code: Bad RIP value.
> > RSP: 002b:00007fffdbff0eb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> > RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1ef1eb2840
> > RDX: 0000000000000002 RSI: 00007f1ef27d2000 RDI: 0000000000000001
> > RBP: 00007f1ef27d2000 R08: 000000000000000a R09: 00007f1ef27b0740
> > R10: 0000000000000001 R11: 0000000000000246 R12: 00007f1ef2186400
> > R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
> > 
> > With this patch we take the module ref count in nvme_dev_open() and
> > release it in the newly introduced nvme_dev_release().
> > 
> > Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni at wdc.com>
> > ---
> >  drivers/nvme/host/core.c | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 8b75f6ca0b61..c5f9d64b2bec 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -3261,10 +3261,22 @@ static int nvme_dev_open(struct inode *inode, struct file *file)
> >  		return -EWOULDBLOCK;
> >  	}
> >  
> > +	if (!try_module_get(ctrl->ops->module))
> > +		return -EINVAL;
> 
> Aren't we also still missing the nvme_get_ctrl() here? We have a
> reference to the controller that's not counted, which was the original
> bug, and we need a reference to the module to be able to put that
> reference...

Yes, indeed.  Pulled from nvme-5.9 again..
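
To make the point above concrete, here is a minimal userspace sketch of the
pairing being asked for: open takes a counted controller reference and a
module reference, and release drops both. Every mock_* name is a hypothetical
stand-in invented for illustration, not a kernel API, and this is not the
patch that was eventually applied. The release ordering (controller first,
module last) is just one reading of the review comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel objects. */
struct mock_module { int refs; bool unloading; };
struct mock_ctrl   { int refs; struct mock_module *module; };
struct mock_file   { struct mock_ctrl *private_data; };

/* Mimics the idea of try_module_get(): fails once unload has begun. */
static bool mock_try_module_get(struct mock_module *m)
{
	if (m->unloading)
		return false;
	m->refs++;
	return true;
}

static void mock_module_put(struct mock_module *m)
{
	assert(m->refs > 0);
	m->refs--;
}

static void mock_get_ctrl(struct mock_ctrl *c)
{
	c->refs++;
}

static void mock_put_ctrl(struct mock_ctrl *c)
{
	assert(c->refs > 0);
	c->refs--;
}

/* Open: count the controller reference, then pin the module. */
static int mock_dev_open(struct mock_ctrl *ctrl, struct mock_file *file)
{
	mock_get_ctrl(ctrl);
	if (!mock_try_module_get(ctrl->module)) {
		mock_put_ctrl(ctrl);	/* undo on failure */
		return -1;
	}
	file->private_data = ctrl;
	return 0;
}

/* Release: drop the controller while the module is still pinned. */
static int mock_dev_release(struct mock_file *file)
{
	struct mock_ctrl *ctrl = file->private_data;
	struct mock_module *module = ctrl->module;	/* saved before the put */

	mock_put_ctrl(ctrl);
	mock_module_put(module);
	file->private_data = NULL;
	return 0;
}
```

In this sketch a failed open leaves both counts unchanged, so every
successful open is balanced by exactly one release.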


