[PATCH blktests v1 0/2] extend nvme/045 to reconnect with invalid key

Daniel Wagner dwagner at suse.de
Tue Mar 5 03:18:36 PST 2024


On Tue, Mar 05, 2024 at 09:44:45AM +0000, Shinichiro Kawasaki wrote:
> On Mar 04, 2024 / 17:13, Daniel Wagner wrote:
> > This is the test case for
> > 
> > https://lore.kernel.org/linux-nvme/20240304161006.19328-1-dwagner@suse.de/
> >
> > 
> > Daniel Wagner (2):
> >   nvme/rc: add reconnect-delay argument only for fabrics transports
> >   nvme/048: add reconnect after ctrl key change
> 
> I applied the kernel patches in the link above to v6.8-rc7, then ran nvme/045
> with the blktests patches in the series. And I observed failure of the test
> case with various transports [1]. Is this failure expected?

If you have these patches applied, the test should pass. But we might
still have some more stuff to unify between the transports. The nvme/045
test passes in my setup, though I have seen runs which hung for
some reason. I haven't figured out yet what's happening there. But I
haven't seen failures, IIRC.

I am not really surprised we are seeing some fallout though. We start to
test the error code paths with this test extension.

> Also, I observed a KASAN double-free [2]. Do you observe it in your environment?
> I created a quick fix [3], and it looks like it resolves the double-free.

No, I haven't seen this.

> sudo ./check nvme/045
> nvme/045 (Test re-authentication)                            [failed]
>     runtime  8.069s  ...  7.639s
>     --- tests/nvme/045.out      2024-03-05 18:09:07.267668493 +0900
>     +++ /home/shin/Blktests/blktests/results/nodev/nvme/045.out.bad     2024-03-05 18:10:07.735494384 +0900
>     @@ -9,5 +9,6 @@
>      Change hash to hmac(sha512)
>      Re-authenticate with changed hash
>      Renew host key on the controller and force reconnect
>     -disconnected 0 controller(s)
>     +controller "nvme1" not deleted within 5 seconds
>     +disconnected 1 controller(s)
>      Test complete

That means the host either successfully reconnected or never
disconnected. We have another test case just for the disconnect test
(number of queue changes), so if that test passes, it must be the
former... This shouldn't really happen; it would mean the auth code has
a bug.

> diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
> index f2832f70e7e0..4e161d3cd840 100644
> --- a/drivers/nvme/host/sysfs.c
> +++ b/drivers/nvme/host/sysfs.c
> @@ -221,14 +221,10 @@ static int ns_update_nuse(struct nvme_ns *ns)
>  
>  	ret = nvme_identify_ns(ns->ctrl, ns->head->ns_id, &id);
>  	if (ret)
> -		goto out_free_id;
> +		return ret;

Yes, this is correct.
>  
>  	ns->head->nuse = le64_to_cpu(id->nuse);
> -
> -out_free_id:
> -	kfree(id);
> -
> -	return ret;
> +	return 0;
>  }
>

I think you still need to free the 'id' on the normal exit path though.
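
Something like the following userspace sketch shows the ownership rule I
mean. It is not the kernel code: identify_ns(), update_nuse(), struct
id_ns and the live_allocs counter are made-up names for illustration, and
malloc()/free() stand in for the kernel allocator. It assumes the callee
frees the buffer itself on failure, which is what the double-free report
suggests, so the caller frees only on success.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the identify buffer; only one field is modeled. */
struct id_ns {
	unsigned long long nuse;
};

/* Illustration only: number of buffers allocated and not yet freed. */
static int live_allocs;

/*
 * Assumed contract: on success *id points to a buffer the caller must
 * free; on failure the buffer has already been freed by the callee.
 */
static int identify_ns(int fail, struct id_ns **id)
{
	*id = malloc(sizeof(**id));
	if (!*id)
		return -ENOMEM;
	live_allocs++;

	if (fail) {
		free(*id);	/* callee cleans up on error */
		live_allocs--;
		return -EIO;
	}

	(*id)->nuse = 42;	/* arbitrary value for the sketch */
	return 0;
}

static int update_nuse(int fail, unsigned long long *nuse)
{
	struct id_ns *id;
	int ret;

	assert(nuse != NULL);

	ret = identify_ns(fail, &id);
	if (ret)
		return ret;	/* no free here, that would be the double-free */

	*nuse = id->nuse;
	free(id);		/* still needed on the normal exit path */
	live_allocs--;
	return 0;
}
```

Calling update_nuse(0, &nuse) and update_nuse(1, &nuse) should both leave
live_allocs at zero. Dropping the free() on the success path would leak
one buffer per successful call.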
