No subject


Fri Oct 22 17:57:35 EDT 2010


From the point of view of an efficient implementation, it does matter.

Take for example the read-ahead done on block devices.  We don't want to
flush all those pages that were read in when we don't know that they're
ever going to end up in a user mapping.  So what's commonly done (as
suggested by DaveM) is that flush_dcache_page() detects that it's a
dcache page, ensures that there are no user mappings, and sets a 'dirty'
flag.  This flag is guaranteed to be clear when new, clean, unread
pages enter the page cache.

When the page eventually ends up in a user mapping, that dirty flag is
checked and the necessary cache flushing done at that point.
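
To make the deferred scheme concrete, here is a minimal user-space sketch.
It is a model only, not kernel code: struct page_model, arch_flush_page()
and the model_* functions are hypothetical names invented for illustration,
and the "flush" is just a counter.

```c
#include <stdbool.h>

/* Hypothetical user-space model; this is not the kernel's struct page. */
struct page_model {
	bool dcache_dirty;	/* deferred-flush flag; clear for new pages */
	int user_mappings;	/* number of user mappings of this page */
	int flush_count;	/* number of simulated cache flushes */
};

/* Stand-in for the architecture's real cache maintenance. */
static void arch_flush_page(struct page_model *page)
{
	page->flush_count++;
}

/* Model of flush_dcache_page(): defer while no user mapping exists. */
static void model_flush_dcache_page(struct page_model *page)
{
	if (page->user_mappings == 0) {
		page->dcache_dirty = true;	/* remember a flush is owed */
		return;
	}
	arch_flush_page(page);		/* user mappings exist: flush now */
	page->dcache_dirty = false;
}

/* Model of the point where the page ends up in a user mapping. */
static void model_map_into_user(struct page_model *page)
{
	if (page->dcache_dirty) {
		arch_flush_page(page);	/* pay the deferred flush */
		page->dcache_dirty = false;
	}
	page->user_mappings++;
}
```

In this model a read-ahead page that never gets mapped never costs a flush,
while a page that is written and later mapped is flushed exactly once.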

Note that when there are user mappings, flush_dcache_page() has to flush
those mappings too, otherwise mmap() <-> read()/write() coherency breaks.
I believe this was what flush_dcache_page() was created to resolve.

flush_kernel_dcache_page() was to solve the problem of PIO drivers
writing to dcache pages, so that data written into the kernel mapping
would be visible to subsequent user mappings.

We chose a different overall approach - which had already been adopted by
PPC - where we invert the meaning of this 'dirty' bit to mean that it's
clean.  So every new page cache page starts out life as being marked
dirty and so nothing needs to be done at flush_kernel_dcache_page().
We continue to use DaveM's optimization with the changed meaning of the
bit but, as we now support SMP, we do the flushing at set_pte_at()
time.
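
A rough sketch of the inverted flag, again as a user-space model rather
than the real implementation.  struct inv_page and the inv_* functions are
hypothetical names, the flush is a counter, and the real set_pte_at() takes
different arguments.

```c
#include <stdbool.h>

/* Hypothetical user-space model; this is not the kernel's struct page. */
struct inv_page {
	bool dcache_clean;	/* inverted flag: false means a flush is owed */
	int user_mappings;	/* number of user mappings of this page */
	int flush_count;	/* number of simulated cache flushes */
};

/* Stand-in for the architecture's real cache maintenance. */
static void inv_arch_flush(struct inv_page *page)
{
	page->flush_count++;
}

/*
 * Model of flush_kernel_dcache_page(): nothing to do, because a new
 * page already has the flag clear, i.e. it is treated as dirty.
 */
static void inv_flush_kernel_dcache_page(struct inv_page *page)
{
	(void)page;
}

/* Model of flush_dcache_page() with the inverted flag. */
static void inv_flush_dcache_page(struct inv_page *page)
{
	if (page->user_mappings == 0) {
		page->dcache_clean = false;	/* defer the flush */
		return;
	}
	inv_arch_flush(page);
	page->dcache_clean = true;
}

/* Model of set_pte_at(): flush here if the page is not known clean. */
static void inv_set_pte_at(struct inv_page *page)
{
	if (!page->dcache_clean) {
		inv_arch_flush(page);
		page->dcache_clean = true;
	}
	page->user_mappings++;
}
```

A zero-initialised page in this model is "dirty", so a driver that forgets
the flush_kernel_dcache_page() call still gets the flush when the page is
mapped.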

This also means that we don't have to rely on the (endlessly) buggy PIO
drivers remembering to add flush_kernel_dcache_page() calls - something
which has been a constant source of pain for us.

The final piece of the jigsaw is flush_anon_page() which deals with
kernel<->user coherency for anonymous pages by flushing both the user
and kernel sides of the mapping.  This was to solve direct-io coherency
problems.

As the users of flush_anon_page() always do:

	flush_anon_page(vma, page, addr);
	flush_dcache_page(page);

and the documentation doesn't appear to imply that this will always be the
case, we restrict flush_dcache_page() to only work on page cache pages;
otherwise we end up flushing the kernel-side mapping multiple times in
succession.
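
The effect of that restriction can be modelled in user space as below.
This is a sketch under stated assumptions: struct coh_page and the coh_*
functions are hypothetical, the flushes are counters, and the model assumes
that flush_anon_page() does nothing for page cache pages.

```c
#include <stdbool.h>

/* Hypothetical user-space model; this is not the kernel's struct page. */
struct coh_page {
	bool in_page_cache;	/* false means an anonymous page */
	int kernel_flushes;	/* simulated flushes of the kernel mapping */
	int user_flushes;	/* simulated flushes of the user mapping */
};

/* Model of flush_anon_page(): flush both sides of an anonymous page. */
static void coh_flush_anon_page(struct coh_page *page)
{
	if (page->in_page_cache)
		return;			/* assumption: anonymous pages only */
	page->user_flushes++;
	page->kernel_flushes++;
}

/* Model of flush_dcache_page() restricted to page cache pages. */
static void coh_flush_dcache_page(struct coh_page *page)
{
	if (!page->in_page_cache)
		return;			/* already done by flush_anon_page() */
	page->kernel_flushes++;
}

/* The calling pattern quoted above. */
static void coh_flush_pair(struct coh_page *page)
{
	coh_flush_anon_page(page);
	coh_flush_dcache_page(page);
}
```

With the restriction in place, the pair of calls flushes the kernel-side
mapping once for either kind of page in this model.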

Maybe we should make flush_anon_page() only flush the user mapping, and
stipulate that it shall always be followed by flush_dcache_page(),
which shall flush the kernel-side mapping even for anonymous pages?
That sounds to me like a recipe for missing flush_dcache_page() calls
causing bugs.



More information about the linux-arm-kernel mailing list