[RESEND PATCH v2 50/53] mtd: nand: allocate aligned buffers if NAND_OWN_BUFFERS is unset
Boris Brezillon
boris.brezillon at free-electrons.com
Tue Mar 28 01:07:21 PDT 2017
On Tue, 28 Mar 2017 09:59:07 +0200
Boris Brezillon <boris.brezillon at free-electrons.com> wrote:
> +Russell to correct me if I'm wrong or give further information.
>
> On Tue, 28 Mar 2017 01:13:10 +0000
> <yamada.masahiro at socionext.com> wrote:
>
> > Hi Boris,
> >
> >
> > > -----Original Message-----
> > > From: Boris Brezillon [mailto:boris.brezillon at free-electrons.com]
> > > Sent: Monday, March 27, 2017 5:01 PM
> > > To: Yamada, Masahiro/山田 真弘 <yamada.masahiro at socionext.com>
> > > Cc: linux-mtd at lists.infradead.org; David Woodhouse
> > > <dwmw2 at infradead.org>; Marek Vasut <marek.vasut at gmail.com>; Brian Norris
> > > <computersforpeace at gmail.com>; thorsten.christiansson at idquantique.com;
> > > laurent.monat at idquantique.com; Dinh Nguyen <dinguyen at kernel.org>; Artem
> > > Bityutskiy <artem.bityutskiy at linux.intel.com>; Graham Moore
> > > <grmoore at opensource.altera.com>; Enrico Jorns <ejo at pengutronix.de>;
> > > Chuanxiao Dong <chuanxiao.dong at intel.com>; Masami Hiramatsu
> > > <mhiramat at kernel.org>; Jassi Brar <jaswinder.singh at linaro.org>; Rob
> > > Herring <robh at kernel.org>
> > > Subject: Re: [RESEND PATCH v2 50/53] mtd: nand: allocate aligned buffers
> > > if NAND_OWN_BUFFERS is unset
> > >
> > > Hi Masahiro,
> > >
> > > On Thu, 23 Mar 2017 09:17:59 +0900
> > > Masahiro Yamada <yamada.masahiro at socionext.com> wrote:
> > >
> > > > Commit 66507c7bc889 ("mtd: nand: Add support to use nand_base poi
> > > > databuf as bounce buffer") fixed the issue that drivers could be
> > > > passed a kmap()'d buffer. That addressed the use_bufpoi = 0 case.
> > > >
> > > > When use_bufpoi = 1, chip->buffers->databuf is used. The databuf
> > > > allocated by nand_scan_tail() is not suitable for DMA because of
> > > > its offset, sizeof(*chip->buffers), from the kmalloc'ed address.
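
To make this concrete, nand_scan_tail() currently does a single
allocation for the struct and all three buffers, roughly (simplified
sketch, not the exact code):

	nbuf = kzalloc(sizeof(*nbuf) + mtd->writesize
		       + mtd->oobsize * 3, GFP_KERNEL);
	nbuf->ecccalc = (uint8_t *)(nbuf + 1);
	nbuf->ecccode = nbuf->ecccalc + mtd->oobsize;
	nbuf->databuf = nbuf->ecccode + mtd->oobsize;
	chip->buffers = nbuf;

kzalloc() itself returns a suitably aligned pointer, but the buffers
start at non-zero offsets from it, so none of them is guaranteed to be
cache line aligned.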
> > >
> > > As said earlier, I'm fine with the patch content, but I'm not sure
> > > about your explanation. Alignment requirements are DMA controller
> > > dependent, so you're not exactly fixing a bug for all NAND controller
> > > drivers, just those that require more than 4-byte alignment.
> >
> >
> > We have two contexts when we talk about alignment for DMA:
> >
> > [A] Requirement by CPU architecture (cache alignment)
> > [B] Requirement by the controller's DMA engine
> >
> >
> > As I will state below, having the sizeof(*chip->buffers) bytes of
> > bookkeeping data share a cache line with the DMA buffer is no good.
> > This is the same reason devm_* is not recommended for DMA.
> > (https://lkml.org/lkml/2017/3/8/693)
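
To spell out the devm_* case from that thread: devm_kmalloc() hands out
the memory right after its bookkeeping header, something like

	struct devres {
		struct devres_node node;
		unsigned long long data[]; /* what devm_kmalloc() returns */
	};

(paraphrased from drivers/base/devres.c), so the returned buffer is
only 8-byte aligned relative to the allocation and shares a cache line
with list bookkeeping that the devres core may touch at any time.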
>
> Having non-cache line aligned buffers is definitely more dangerous,
> but, AFAIU, it's not impossible.
>
> Let's consider this case:
>
>
> | cache line           | cache line           | ...       |
> ------------------------------------------------------------
> | nand_buffers | data                                      |
>
>
> If you call dma_map_single(dev, data, size, DMA_TO_DEVICE), the first
> cache line will be flushed (content written back to memory), and
> assuming you don't touch nand_buffers content between dma_map_single()
> and dma_unmap_single() you shouldn't have any problem (the cache line
> cannot magically turn dirty and thus cannot be flushed in the
> background).
>
> For the DMA_FROM_DEVICE direction, the cache line is invalidated when
> dma_unmap_single() is called, which means your nand_buffers content
> might be updated with what is present in SDRAM, but it shouldn't have
> changed since nand_buffers is only touched at initialization time (when
> the buffer is created).
>
> So, for our use case where nand_buffers is never modified between
> dma_map_single() and dma_unmap_single(), it should be safe to have
> non-cache line aligned buffers.
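
To make the sequence explicit, here is a sketch (assuming a
non-coherent architecture and a streaming mapping of databuf; names are
illustrative):

	/* layout: | nand_buffers | databuf ... |, sharing a cache line */
	dma_addr = dma_map_single(dev, nbuf->databuf, len, DMA_FROM_DEVICE);
	/* the cache lines covering databuf are cleaned/invalidated here,
	 * including the one it shares with nand_buffers */

	/* DMA in flight: nothing writes to nand_buffers */

	dma_unmap_single(dev, dma_addr, len, DMA_FROM_DEVICE);
	/* lines invalidated again: nand_buffers may be re-read from
	 * SDRAM, which is harmless as long as it stayed clean */

The only way to corrupt data here is to dirty the shared cache line
while the mapping is live.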
>
> Russell, please let me know if my reasoning is incorrect.
> Note that I'm not arguing against the new approach where we allocate
> each buffer independently using kmalloc (having cache line aligned
> buffers is definitely safer and does not imply any significant
> overhead), I just want to make sure I understand things correctly.
>
> >
> >
> > The current situation violates [A].
>
> Do you have a real failure that is proven to be caused by cache line
> misalignment, or are you just speculating?
>
> >
> > Usually [B] is less strict than [A].
>
> Yep, it's likely the case.
>
> > So, if we meet [A] (by using kmalloc), [B] will be met as well.
>
> Sure.
>
> >
> >
> >
> > > Regarding the cache line alignment issue, in this case it shouldn't be
> > > a problem, because the driver and the core shouldn't touch the
> > > chip->buffers object during a DMA transfer, so dma_map/unmap_single()
> > > should work fine (they may flush/invalidate one cache line entry that
> > > contains non-payload data, but that's not a problem as long as nothing
> > > is modified during the DMA transfer).
> >
> >
> > This is related to 52/53:
> > http://patchwork.ozlabs.org/patch/742409/
> >
> > Can you also read this?
> > https://lkml.org/lkml/2017/3/8/693
> >
> > Your comment is very similar to what was discussed in the thread.
>
> I read it.
>
> >
> >
> >
> > > >
> > > > So, drivers using DMA are very likely to end up setting the
> > > > NAND_OWN_BUFFERS flag and allocating buffers by themselves. Drivers
> > > > tend to use devm_k*alloc() to simplify the probe failure path, but
> > > > devm_k*alloc() does not return a cache line aligned buffer.
> > >
> > > Oh, I didn't know that. I had a closer look at the code, and indeed,
> > > devm_kmalloc() does not guarantee any alignment.
> > >
> > > >
> > > > If used, it could violate the DMA-API requirement stated in
> > > > Documentation/DMA-API.txt:
> > > > Warnings: Memory coherency operates at a granularity called the
> > > > cache line width. In order for memory mapped by this API to
> > > > operate correctly, the mapped region must begin exactly on a cache
> > > > line boundary and end exactly on one (to prevent two separately
> > > > mapped regions from sharing a single cache line).
> > > >
> > > > To sum up, NAND_OWN_BUFFERS is not very convenient for drivers.
> > > > nand_scan_tail() can instead allocate a separate buffer for each
> > > > of ecccalc, ecccode, and databuf. kmalloc'ed memory is guaranteed
> > > > to be aligned to ARCH_DMA_MINALIGN.
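
For completeness, the allocation scheme after this patch boils down to
something like this (error handling omitted):

	nbuf = kzalloc(sizeof(*nbuf), GFP_KERNEL);
	nbuf->ecccalc = kmalloc(mtd->oobsize, GFP_KERNEL);
	nbuf->ecccode = kmalloc(mtd->oobsize, GFP_KERNEL);
	nbuf->databuf = kmalloc(mtd->writesize + mtd->oobsize, GFP_KERNEL);
	chip->buffers = nbuf;

so each buffer now starts at an address kmalloc() itself aligned.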
> > >
> > > Maybe I'm wrong, but AFAIU, kmalloc&co do not guarantee cache line
> > > alignment for small buffers (< cache line), so even kmalloc can return
> > > a buffer that is not cache line aligned.
> > > This being said, we should be good because most NAND controllers are
> > > only manipulating the page buffer (->databuf) which is large enough to
> > > be cache line aligned.
> >
> >
> > In my understanding, kmalloc() returns a cache-aligned address even
> > for a 1-byte allocation.
>
> After digging into the SLAB code I found the calculate_alignment()
> function [1] which is used to calculate the required alignment of
> objects provided by a SLAB cache. For kmalloc caches SLAB_HWCACHE_ALIGN
> is set, but if you look at the code, if the object size is smaller
> than half a cache line the alignment constraint is relaxed, meaning that
> elements smaller than cache_line/2 are not guaranteed to be aligned on
> a cache line.
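
The relevant part of that function looks roughly like this (quoting
mm/slab_common.c from memory):

	if (flags & SLAB_HWCACHE_ALIGN) {
		unsigned long ralign = cache_line_size();

		while (size <= ralign / 2)
			ralign /= 2;
		align = max(align, ralign);
	}

so with a 64-byte cache line, a 16-byte object only gets 16-byte
alignment out of SLAB_HWCACHE_ALIGN.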
Hm, it seems the alignment for kmalloc SLAB caches is actually set to
ARCH_KMALLOC_MINALIGN, which defaults to ARCH_DMA_MINALIGN when the
architecture defines it, and ARCH_DMA_MINALIGN is usually equal to the
L1 cache line size. So I was wrong here.
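
The relevant definitions are in include/linux/slab.h (paraphrasing, and
assuming I remember the condition correctly):

	#if defined(ARCH_DMA_MINALIGN) && ARCH_DMA_MINALIGN > 8
	#define ARCH_KMALLOC_MINALIGN ARCH_DMA_MINALIGN
	#else
	#define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long long)
	#endif

ARM defines ARCH_DMA_MINALIGN as L1_CACHE_BYTES, so on ARM kmalloc()
results are DMA-safe even for small objects.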