[RFC PATCH] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM

Fabio M. De Francesco fmdefrancesco at gmail.com
Tue Aug 16 02:18:08 PDT 2022


kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) it comes with an overhead,
because the mapping space is restricted and protected by a global lock
for synchronization, and (2) it requires a global TLB invalidation when
the kmap pool wraps around, and it might block until a slot becomes
available whenever the mapping space is fully utilized.
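
For reference, the pattern this kind of conversion moves away from looks
roughly like the sketch below (generic code, not taken from this driver;
'page' is only an illustrative variable):

  /*
   * Minimal sketch, not nvmet-tcp code: kmap() installs the page in a
   * global mapping pool, so the returned address stays valid in any
   * context until kunmap() releases the slot, possibly from a
   * different thread.
   */
  void *vaddr = kmap(page);

  /* ... vaddr may be used, and even unmapped, in another context ... */
  kunmap(page);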

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

However, there is a major constraint which might block some conversions
to kmap_local_page(): the kernel virtual address cannot be handed across
different threads. Ira pointed out to me that the kmap() and kunmap() in
this driver happen in two different workqueues; therefore, a straight
conversion would leave kunmap_local() trying to unmap an invalid address.
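
By contrast, a local mapping must be created, used, and destroyed within
one and the same context, roughly as in this generic sketch (again not
code from this driver; 'page', 'dst', 'offset' and 'len' are only
illustrative):

  /*
   * Minimal sketch, not nvmet-tcp code: the mapping returned by
   * kmap_local_page() is only valid in the context that created it,
   * so mapping, use and unmapping must all happen here. Handing
   * 'vaddr' to another thread or workqueue and calling kunmap_local()
   * there would unmap an invalid address.
   */
  void *vaddr = kmap_local_page(page);

  memcpy(dst, vaddr + offset, len);
  kunmap_local(vaddr);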

Let me explain why I'm sending an RFC. When I hit the above-mentioned
issue, I first tried to refactor the code so that mapping and unmapping
happen in a single thread (in order not to break the thread-locality rule).

However, while reading this code again, I think I noticed an important
precondition which may lead to a simpler solution... If I'm not wrong, it
looks like the pages are allocated in nvmet_tcp_map_data() using the
GFP_KERNEL flag.

This would ensure that those pages _cannot_ come from HIGHMEM. If I'm not
missing something (again!), a plain page_address() could replace the kmap()
of sg_page(sg); furthermore, we shouldn't need the unmappings any longer.
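
The reasoning, as a generic sketch (not code from this driver; the
variables are only illustrative):

  /*
   * Minimal sketch, not nvmet-tcp code: an allocation without
   * __GFP_HIGHMEM (e.g. plain GFP_KERNEL) never comes from
   * ZONE_HIGHMEM, so the page is permanently mapped in lowmem and
   * page_address() always returns a valid kernel virtual address.
   */
  struct page *page = alloc_page(GFP_KERNEL);

  if (page) {
          void *vaddr = page_address(page); /* no kmap() needed */

          memset(vaddr, 0, PAGE_SIZE);      /* use it directly */
          __free_page(page);                /* and no kunmap() either */
  }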

Unfortunately, I don't know this protocol, and I'm not experienced enough
with kernel development to understand this code properly.

Therefore, I have two questions: am I right in thinking that the pages
mapped in nvmet_tcp_map_pdu_iovec() are allocated with GFP_KERNEL? If so,
can anyone with more knowledge than mine please tell me whether these
changes make sense?

Suggested-by: Ira Weiny <ira.weiny at intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco at gmail.com>
---
 drivers/nvme/target/tcp.c | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index dc3b4dc8fe08..affba6d862fc 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -167,7 +167,6 @@ static const struct nvmet_fabrics_ops nvmet_tcp_ops;
 static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c);
 static void nvmet_tcp_finish_cmd(struct nvmet_tcp_cmd *cmd);
 static void nvmet_tcp_free_cmd_buffers(struct nvmet_tcp_cmd *cmd);
-static void nvmet_tcp_unmap_pdu_iovec(struct nvmet_tcp_cmd *cmd);
 
 static inline u16 nvmet_tcp_cmd_tag(struct nvmet_tcp_queue *queue,
 		struct nvmet_tcp_cmd *cmd)
@@ -309,19 +308,6 @@ static void nvmet_tcp_free_cmd_buffers(struct nvmet_tcp_cmd *cmd)
 	cmd->req.sg = NULL;
 }
 
-static void nvmet_tcp_unmap_pdu_iovec(struct nvmet_tcp_cmd *cmd)
-{
-	struct scatterlist *sg;
-	int i;
-
-	sg = &cmd->req.sg[cmd->sg_idx];
-
-	for (i = 0; i < cmd->nr_mapped; i++)
-		kunmap(sg_page(&sg[i]));
-
-	cmd->nr_mapped = 0;
-}
-
 static void nvmet_tcp_map_pdu_iovec(struct nvmet_tcp_cmd *cmd)
 {
 	struct kvec *iov = cmd->iov;
@@ -338,7 +324,7 @@ static void nvmet_tcp_map_pdu_iovec(struct nvmet_tcp_cmd *cmd)
 	while (length) {
 		u32 iov_len = min_t(u32, length, sg->length - sg_offset);
 
-		iov->iov_base = kmap(sg_page(sg)) + sg->offset + sg_offset;
+		iov->iov_base = page_address(sg_page(sg)) + sg->offset + sg_offset;
 		iov->iov_len = iov_len;
 
 		length -= iov_len;
@@ -1141,7 +1127,6 @@ static int nvmet_tcp_try_recv_data(struct nvmet_tcp_queue *queue)
 		cmd->rbytes_done += ret;
 	}
 
-	nvmet_tcp_unmap_pdu_iovec(cmd);
 	if (queue->data_digest) {
 		nvmet_tcp_prep_recv_ddgst(cmd);
 		return 0;
@@ -1411,7 +1396,6 @@ static void nvmet_tcp_restore_socket_callbacks(struct nvmet_tcp_queue *queue)
 static void nvmet_tcp_finish_cmd(struct nvmet_tcp_cmd *cmd)
 {
 	nvmet_req_uninit(&cmd->req);
-	nvmet_tcp_unmap_pdu_iovec(cmd);
 	nvmet_tcp_free_cmd_buffers(cmd);
 }
 
@@ -1424,7 +1408,6 @@ static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue)
 		if (nvmet_tcp_need_data_in(cmd))
 			nvmet_req_uninit(&cmd->req);
 
-		nvmet_tcp_unmap_pdu_iovec(cmd);
 		nvmet_tcp_free_cmd_buffers(cmd);
 	}
 
-- 
2.37.1



