[PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page

Catalin Marinas catalin.marinas at arm.com
Fri May 3 06:02:42 EDT 2013


On Thu, May 02, 2013 at 08:38:36PM +0100, Simon Baatz wrote:
> On Thu, May 02, 2013 at 10:54:31AM +0100, Catalin Marinas wrote:
> > On Wed, May 01, 2013 at 08:04:41PM +0100, Simon Baatz wrote:
> > > On Wed, May 01, 2013 at 03:22:06PM +0100, Catalin Marinas wrote:
> > > > On Tue, Apr 30, 2013 at 10:04:03PM +0100, Simon Baatz wrote:
> > > > > On Tue, Apr 30, 2013 at 12:22:25PM +0100, Catalin Marinas wrote:
> > > > > > On Sun, Apr 21, 2013 at 11:06:30PM +0100, Simon Baatz wrote:
> > > > > > > On Thu, Apr 18, 2013 at 02:51:04PM +0100, Catalin Marinas wrote:
> > > > > > > > On Thu, Apr 18, 2013 at 12:40:16PM +0100, Jason Cooper wrote:
> > > > > ...
> > > > > > > 
> > > > > > > It is not the driver itself which is using the API, it is the
> > > > > > > generic scatterlist memory iterator. And I don't think that this is
> > > > > > > wrong, as I have tried to explain in [1].
> > > > > > 
> > > > > > Trying to remember what we've discussed over the past months on this
> > > > > > topic. It looks like sg_miter_stop() does the right thing in calling
> > > > > > flush_kernel_dcache_page(). Commit f8b63c1 (ARM: 6382/1: Remove
> > > > > > superfluous flush_kernel_dcache_page()) removed this function entirely.
> > > > > > The code previously had this comment - /* highmem pages are always
> > > > > > flushed upon kunmap already */ which I think it wasn't fully correct
> > > > > > either. The kunmap_atomic() flushes the caches but kunmap() doesn't, so
> > > > > > I suspect we only get the flushing if SG_MITER_ATOMIC.
> > > > > > 
> > > > > > So it looks to me like flush_kernel_dcache_page() should be implemented
> > > > > > even for highmem pages (with VIVT or aliasing VIPT, at least for non
> > > > > > kmap_atomic addresses by checking for FIXADDR_START). If highmem is
> > > > > > disabled, I suspect we still need this function since the calling code
> > > > > > doesn't care whether kmap/kunmap was a no-op. But can we keep it as a
> > > > > > simple call to __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE)?
> > > > > 
> > > > > My first version ([1]) had:
> > > > > 
> > > > > 	if ((cache_is_vivt() || cache_is_vipt_aliasing()) && !PageHighMem(page))
> > > > > 		__flush_kernel_dcache_page(page);
> > > > > 
> > > > > If I understand this correctly, you are proposing to remove the
> > > > > highmem exclusion.
> > > > 
> > > > The highmem exclusion may have been there originally because of a
> > > > comment suggesting that kunmap() does the flushing. This is the case
> > > > only for kunmap_atomic() AFAICT (and maybe we could remove that as well
> > > > and rely on flush_kernel_dcache_page() being called).
> > > > 
> > > > > And then in __flush_kernel_dcache_page():
> > > > > 
> > > > > 	mapping = page_mapping(page);
> > > > > 
> > > > > 	if (!mapping || mapping_mapped(mapping))
> > > > > 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> > > > > 
> > > > > I still prefer to have this condition here to avoid the flush when
> > > > > there is no user mapping at all.  This is handled by lazy flushing
> > > > > and is probably the most common case anyway (given how many people
> > > > > seem to be affected by this problem).
> > > > 
> > > > Looking at the old thread, you said there is a case when this condition
> > > > is not true (O_DIRECT case). If that's for a page cache page, then we
> > > > can handle it lazily (for anonymous pages I don't think we can rely on
> > > > lazy flushing since the kernel does not guarantee the clearing of the
> > > > PG_arch_1 bit).
> > > 
> > > As Russel pointed out in a comment to a later version of the patch,
> > > PG_arch_1 makes only sense for page cache pages. The condition above
> > > is ok from my point of view (it is based on what flush_dcache_page()
> > > uses):
> > > 
> > > - Page cache page without user space mapping:
> > > mapping != NULL and mapping_mapped() == 0
> > > 
> > > -> no flush here; lazy flush based on PG_arch_1 later if needed (we
> > >    rely on the proper initialization of the page to "dirty" here.)
> > 
> > Indeed.
> > 
> > > - Page cache page with user space mapping:
> > > mapping != NULL and mapping_mapped() != 0
> > > 
> > > -> kernel mapping flushed here (user mapping can be assumed to be clean)
> > 
> > We had similar thoughts for AArch64 here and decided it's not needed, it
> > can just clear the PG_arch_1 bit and do it lazily (patches not pushed
> > yet, need more testing). This assumes that even if mapping_mapped(), the
> > page is not actually mapped in user space and we eventually get a
> > set_pte_at() call. That's what powerpc is doing.
> 
> For arm64 only I-/D-cache coherency is relevant, right?

Yes. Same for 32-bit ARMv7 or non-aliasing D-cache on ARMv6.

> In this case,
> that may be right (but you may want to have a look at what
> xol_get_insn_slot() in kernel/events/uprobes.c does.  I don't know
> what kind of pages it handles, but this looks suspicious. And I think
> uprobes are also available for powerpc).

It is suspicious indeed and even the author is unsure about using the
right API. Should it rather use copy_to_user_page? I'm not familiar with
uprobes.

> However, if you can have aliases in D-cache this is needed.  See
> below to trigger this on pages that are already mapped
> to user space (direct I/O into a memory mapped file).

You are probably right, for direct I/O and aliases in the D-cache it
needs to flush. The documentation for flush_dcache_page() states that it
can also be called when the kernel is about to read from a page cache
page (potentially mapped into user space).

> > > - Anonymous page:
> > > mapping == NULL
> > > 
> > > -> kernel mapping flushed here (user mapping can be assumed to be clean)
> > 
> > I don't think it should care about anonymous pages at all. I put a
> > WARN_ON(!mapping) in flush_dcache_page() and it hasn't triggered yet,
> > though not intensive testing.
> 
> I assume that you inhibited the call to flush_dcache_page() in
> __get_user_pages() for anon pages.  Otherwise, you will be flooded
> with warnings.

I haven't done any stress testing so I don't think I hit this code path,
so no warning. But yes, it should have triggered. Anyway, in this case
flush_dcache_page() should have just ignored (clearing PG_arch_1 is
harmless anyway if we also ignore this bit in __sync_icache_dcache for
non-aliasing caches).

> DIY steps to produce both cases:
> (We will need software encryption, thus I need to remove the hardware
> encryption module mv_cesa on my hardware first)
> 
> # rmmod mv_cesa     
> # wget -O mapping_tests.c http://ubuntuone.com/0jF9fNsguN8BvI1Z8fsBVI
> # gcc -o mapping_tests mapping_tests.c
> # dd if=/dev/urandom of=map.out bs=4k count=10
> # dd if=/dev/urandom of=cdevice.img bs=4k count=10
> # losetup /dev/loop0 cdevice.img 
> # cryptsetup -c aes create c_sda2 /dev/loop0
> <enter any passphrase>
> # ./mapping_tests
> 
> cryptsetup will already trigger the warning and mapping_tests should
> flood your console.
>  
> Since flush_kernel_dcache_page() is supposed to handle user space
> pages, it must flush the kernel mapping in these cases.  Running the
> above on a device driver using an unpatched
> flush_kernel_dcache_page() yields:
> 
> # ./mapping_tests 
> Anonymous page: differs!
> User space mapped page: differs!

I haven't run the tests but I can see  why it fails without
flush_kernel_dcache_page(). So I think this function needs to be
implemented for aliasing VIPT or VIVT caches.

> It is even needed in flush_dcache_page() as long as everybody
> continues to use flush_dcache_page() instead of
> flush_kernel_dcache_page() when appropriate...
> (This is probably the main reason why the problem I reported is so
> uncommon: Everybody seems to use flush_dcache_page() and since it
> flushes the kernel mapping in these cases, everything is fine.)

That's why for non-aliasing VIPT we could make it just a clear_bit() and
let the callers fix their API usage.

> > > > > Additionally, although we can assume that the page is kmapped,
> > > > > page_address(page) can still be NULL for a highmem page, right?
> > > > 
> > > > It looks like kmap() always sets page_address(page) but I'm not sure
> > > > about kmap_atomic(), it doesn't seem to.
> > > 
> > > Hmm, in __flush_dcache_page() we have the following code to flush the
> > > kernel mapping:
> > > 
> > > void __flush_dcache_page(struct address_space *mapping, struct page *page)
> > > {
> > > 	/*
> > > 	 * Writeback any data associated with the kernel mapping of this
> > > 	 * page.  This ensures that data in the physical page is mutually
> > > 	 * coherent with the kernels mapping.
> > > 	 */
> > > 	if (!PageHighMem(page)) {
> > > 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> > > 	} else {
> > > 		void *addr = kmap_high_get(page);
> > > 		if (addr) {
> > > 			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> > > 			kunmap_high(page);
> > > 		} else if (cache_is_vipt()) {
> > > 			/* unmapped pages might still be cached */
> > > 			addr = kmap_atomic(page);
> > > 			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> > > 			kunmap_atomic(addr);
> > > 		}
> > > 	}
> > > ...
> > > 
> > > 
> > > We probably should reuse it in flush_kernel_dcache_page() to flush
> > > the kernel mapping. (The last else clause looks strange though)
> > 
> > I think it makes sense to reuse this logic in
> > flush_kernel_dcache_page(). If the page is already mapped with kmap,
> > then kmap_high_get() should return the actual address. If it was mapped
> > with kmap_atomic, kmap_high_get() would return NULL, hence the 'else'
> > clause and the additional kmap_atomic(). The cache_is_vipt() check is
> > useful because kunmap_atomic() would flush VIVT caches anyway.
> 
> Yes, but why don't we flush for VIPT _aliasing_ already in
> kunmap_atomic?

Some comment from Nico on the list, it seems that highmem does not work
with aliasing VIPT caches:

http://article.gmane.org/gmane.linux.ports.arm.kernel/85114

> Why do we need to do anything for non-aliasing VIPT here?

Do you mean __flush_dcache_page()? I don't think we need.

> > As for __flush_dcache_page() called from other places like
> > flush_dcache_page(), because of this 'else if' clause it looks like it
> > misses flushing unmapped highmem pages on VIVT cache.
> 
> How many users do we have with VIVT D-caches using highmem? (This is
> not a rhetorical questions, I have no idea.  However, I would assume
> that this use case is almost non-existent.)

I don't really know. Maybe Nico has more info.

So my proposal:

1. Non-aliasing VIPT - defer any D-cache flushing until
   __sync_icache_dcache() (and fix callers like uprobes)
2. Aliasing VIPT or VIVT - flush the aliases as before in
   flush_dcache_page()
3. Aliasing VIPT or VIVT - implement flush_kernel_dcache_page() for all
   pages (don't check for PageHighmem)
4. VIVT - remove cache flushing from kunmap_atomic and rely on
   flush_kernel_dcache_page()

I'm not entirely sure about 4 but at the moment kunmap() doesn't flush
caches, only kunmap_atomic() so it looks like an inconsistency.

-- 
Catalin



More information about the linux-arm-kernel mailing list