[RFC PATCH] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM

Fabio M. De Francesco fmdefrancesco at gmail.com
Wed Aug 17 05:02:54 PDT 2022


On mercoledì 17 agosto 2022 11:44:09 CEST Sagi Grimberg wrote:
> >> Therefore, I have two questions: am I right about thinking that the pages
> >> mapped in nvmet_tcp_map_pdu_iovec() are allocated with GFP_KERNEL?
> > 
> > I think you are correct.
> 
> It is correct. It is the same model for the linux scsi target, sunrpc
> etc.

I'll try to address the comments from the two last messages from Keith and 
Sagi with this email (I replied yesterday to Chaitanya).

First of all: good to know that it is the same model for other subsystem. This 
is useful to know. Thanks!

> >> If so, can anyone with more knowledge than mine please say if my changes
> >> make
> >> any sense?
> > 
> > I think it does make sense.

Thanks, I'm glad I was not wrong :-)

> > I like the code simplification, though this use
> > was't really paying the kmap penalty since, as you mentioned, this is 
never
> > highmem.

Correct, however everybody like code simplification. I added a couple of 
sentences to kmap_local_page() documentation in highmem.rst. They clearly 
state that, when users know that pages cannot come from Highmem, they may 
better prefer page_address().

The changes to nvmet-tcp started with trying to convert kmap() / kunmap() to 
kmap_local_page() /kunmap_local(), but it ended up to code shortening and 
simplification with a plain use of page_address(). 

Obviously, due to my little experience with kernel developing and less than 
little knowledge of this protocol I had to ask whether or not I was right in 
identifying the site of the allocations.

The reasons why I had to use page_address() will be clearer reading what 
follows...

> Yes, its the same code-path. Would be great if we still had an
> abstraction that would do the right thing regardless of highmem or
> not like kmap provides though.

It would be great and it is already possible (this is why Thomas Gleixner 
created this kmap_local_page() API) but here we have a huge issue. kmap() and 
kmap_atomic() have recently been deprecated and they shouldn't any longer be 
used in new code: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com/ ("[PATCH] checkpatch: Add kmap and kmap_atomic to the 
deprecated list").

kmap_local_page() always does the right thing: users can call it with or 
without HIGHMEM enabled, in-atomic (also in interrupts) or in preemptible 
contexts, they can take page faults. 

It doesn't require global lock for synchronization and doesn't require global 
TLB invalidation when the kmap's pool wraps and doesn't block waiting for free 
slots. 

Nice, isn't it?

However, with nvmet-tcp we cannot easily use kmap_local_page() because it 
comes with a major problem: it's local to the thread. If users handed the 
kernel virtual addresses returned by this function to other threads, the 
pointers would be invalid.

Here kmap() and kunmap() call sites are in two different workqueues. 
Therefore, there is no way to convert kmap() to kmap_local_page(), unless this 
code is heavily refactored.

Knowing that the pages cannot come from Highmem avoids this refactoring and in 
the meantime it allows us to delete the kmap() and kunmap() calls sites.

> > You should also remove the cmd's 'nr_mapped' field while you're at it,
> > otherwise you'll hit the WARN in nvmet_tcp_free_cmd_buffers().
> 
> Not remove nr_mapped because we use it to know the iovec entries, but
> we can just remove the WARN statement.

Ah, OK. I'll take care of this too. That was not my first concern when I did 
the RFC. The "real" patch must also address this detail.

@Chaitanya:

Since this is a mere simplification and shorten of code, I suppose I can skip 
the performance tests. Ira and I have still hundreds of call sites with kmap() 
and kmap_atomic() which we should care of, therefore we prefer to leave alone 
everything that is not strictly necessary for the deprecated API deletions.

Thanks to you all,

Fabio





More information about the Linux-nvme mailing list