[PATCH v2] nvme-tcp: Fix netns UAF introduced by commit 1be52169c348

Christoph Hellwig hch at lst.de
Thu Apr 3 23:17:31 PDT 2025


I'll do another minor fixup for the comment formatting, but otherwise
this looks good.  I'll queue it up.

On Thu, Apr 03, 2025 at 10:47:48PM +0800, shaopeijie at cestc.cn wrote:
> From: Peijie Shao <shaopeijie at cestc.cn>
> 
> This patch is for the nvme-tcp host side.
> 
> Commit 1be52169c348
> ("nvme-tcp: fix selinux denied when calling sock_sendmsg")
> uses sock_create_kern instead of sock_create to solve an SELinux
> problem. However, sock_create_kern does not take a reference on
> the given netns, which results in a use-after-free when a
> non-init_net netns is destroyed before sock_release.
> 
> For example: a container that does not share the host's network
> namespace does an 'nvme connect' and is then stopped without an
> 'nvme disconnect'.
> 
> The patch changes the parameter current->nsproxy->net_ns to init_net,
> so that the socket always belongs to the host. It also naturally
> avoids changing the sock's netns from the previous creator's netns
> to init_net when the sock is re-created by the nvme recovery path
> (the workqueue runs in the init_net namespace).
> 
> Signed-off-by: Peijie Shao <shaopeijie at cestc.cn>

> ---
> 
> Changes in v2:
>     1. Fix style problems pointed out by Christoph Hellwig, thanks!
>     2. Add the 'nvme-tcp:' prefix to the patch subject.
> 
> Version v1:
> Hi all,
> This is the v1 patch. Before this version, I tried to call
> get_net(current->nsproxy->net_ns) in nvme_tcp_alloc_queue() to
> fix the issue, but failed to find a suitable place to do
> put_net(), because the socket is released by fput() internally.
> I think code like below:
>     nvme_tcp_free_queue() {
>         fput()
>         put_net()
>     }
> cannot ensure the socket is released before put_net(), since
> someone may still be holding the file.
> 
> So I would like to use the 'init_net' net namespace.
> 
> ---
>  drivers/nvme/host/tcp.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 26c459f0198d..9b1d0ad18b77 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1789,8 +1789,14 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
>  		queue->cmnd_capsule_len = sizeof(struct nvme_command) +
>  						NVME_TCP_ADMIN_CCSZ;
>  
> -	ret = sock_create_kern(current->nsproxy->net_ns,
> -			ctrl->addr.ss_family, SOCK_STREAM,
> +	/*
> +	 * sock_create_kern() does not take a reference to
> +	 * current->nsproxy->net_ns, use init_net instead.
> +	 * This also avoid changing sock's netns from previous
> +	 * creator's netns to init_net when sock is re-created
> +	 * by nvme recovery path.
> +	 */
> +	ret = sock_create_kern(&init_net, ctrl->addr.ss_family, SOCK_STREAM,
>  			IPPROTO_TCP, &queue->sock);
>  	if (ret) {
>  		dev_err(nctrl->device,
> -- 
> 2.43.0
> 
> 
---end quoted text---
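
For readers following the lifetime argument in the quoted description,
here is a minimal userspace sketch of it. It is a toy model only, not
kernel code: every toy_* name is hypothetical, and a "freed" flag stands
in for the memory actually being released, so that the stale pointer can
be observed safely. It assumes, as the description states, that a
kernel-style socket stores the netns pointer without taking a reference,
while a user-style socket pins the netns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a network namespace: a refcount plus a "freed" flag. */
struct toy_net {
	int refs;
	bool freed;
};

/* Toy stand-in for a socket that remembers which namespace it lives in. */
struct toy_sock {
	struct toy_net *net;
	bool holds_ref;
};

static void toy_get_net(struct toy_net *net)
{
	net->refs++;
}

static void toy_put_net(struct toy_net *net)
{
	assert(net->refs > 0);
	if (--net->refs == 0)
		net->freed = true;	/* a real kernel would free the netns here */
}

/* Kernel-style creation: stores the pointer but takes no reference. */
static struct toy_sock toy_sock_create_kern(struct toy_net *net)
{
	struct toy_sock sk = { .net = net, .holds_ref = false };

	return sk;
}

/* User-style creation: pins the namespace for the socket's lifetime. */
static struct toy_sock toy_sock_create(struct toy_net *net)
{
	struct toy_sock sk = { .net = net, .holds_ref = true };

	toy_get_net(net);
	return sk;
}

/* Drops the namespace reference, if the socket holds one. */
static void toy_sock_release(struct toy_sock *sk)
{
	if (sk->holds_ref)
		toy_put_net(sk->net);
	sk->net = NULL;
}

/* True if the socket still points at a namespace that was torn down. */
static bool toy_sock_net_is_stale(const struct toy_sock *sk)
{
	return sk->net != NULL && sk->net->freed;
}
```

In this model, a socket made with toy_sock_create_kern() on a container
namespace goes stale as soon as the container's last reference is
dropped, which mirrors the reported use-after-free. Pointing the same
call at a namespace that is never torn down (the role init_net plays in
the patch) avoids the stale pointer without needing a matching put.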


