[EXT] Re: The problem about arm64: io: Relax implicit barriers in default I/O accessors

Frank Li frank.li at nxp.com
Wed Jun 23 08:48:10 PDT 2021



> -----Original Message-----
> From: Will Deacon <will at kernel.org>
> Sent: Tuesday, June 22, 2021 4:12 AM
> To: Frank Li <frank.li at nxp.com>
> Cc: Catalin Marinas <catalin.marinas at arm.com>; Zhi Li <lznuaa at gmail.com>;
> Shenwei Wang <shenwei.wang at nxp.com>; Han Xu <han.xu at nxp.com>; Nitin Garg
> <nitin.garg at nxp.com>; Jason Liu <jason.hui.liu at nxp.com>;
> linux-arm-kernel at lists.infradead.org
> Subject: Re: [EXT] Re: The problem about arm64: io: Relax implicit barriers
> in default I/O accessors
> 
> On Mon, Jun 21, 2021 at 09:32:22PM +0000, Frank Li wrote:
> >
> >
> > > -----Original Message-----
> > > From: Will Deacon <will at kernel.org>
> > > Sent: Monday, June 21, 2021 1:13 PM
> > > To: Frank Li <frank.li at nxp.com>
> > > Cc: Catalin Marinas <catalin.marinas at arm.com>; Zhi Li <lznuaa at gmail.com>;
> > > Shenwei Wang <shenwei.wang at nxp.com>; Han Xu <han.xu at nxp.com>; Nitin Garg
> > > <nitin.garg at nxp.com>; Jason Liu <jason.hui.liu at nxp.com>;
> > > linux-arm-kernel at lists.infradead.org
> > > Subject: Re: [EXT] Re: The problem about arm64: io: Relax implicit barriers
> > > in default I/O accessors
> > >
> > > On Mon, Jun 21, 2021 at 05:56:43PM +0000, Frank Li wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Will Deacon <will at kernel.org>
> > > > > Sent: Monday, June 21, 2021 12:00 PM
> > > > > To: Frank Li <frank.li at nxp.com>
> > > > > Cc: Catalin Marinas <catalin.marinas at arm.com>; Zhi Li <lznuaa at gmail.com>;
> > > > > Shenwei Wang <shenwei.wang at nxp.com>; Han Xu <han.xu at nxp.com>; Nitin Garg
> > > > > <nitin.garg at nxp.com>; Jason Liu <jason.hui.liu at nxp.com>;
> > > > > linux-arm-kernel at lists.infradead.org
> > > > > Subject: Re: [EXT] Re: The problem about arm64: io: Relax implicit barriers
> > > > > in default I/O accessors
> > > > >
> > > > > On Mon, Jun 21, 2021 at 05:26:41PM +0100, Will Deacon wrote:
> > > > > > On Mon, Jun 21, 2021 at 04:11:57PM +0000, Frank Li wrote:
> > > > > > > > Oh, interesting. Maybe this is a case where OSH vs SY actually makes a
> > > > > > > > difference. I'm not quite sure what it means for the coherency of normal,
> > > > > > > > non-cacheable accesses (which are outer-shareable) so that probably needs a
> > > > > > > > bit more thought.
> > > > > > > >
> > > > > > > > Can you confirm that the issue *does* still occur if you use dmb(osh)
> > > > > > > > instead of dmb(oshst), please?
> > > > > > >
> > > > > > > After getting support from ARM
> > > > > > > (https://services.arm.com/support/s/case/5003t00001RuJHw),
> > > > > > > this issue has made some progress.
> > > > > > >
> > > > > > > Our system is configured with SYSBARDISABLE = 0x0, so ARM core barriers
> > > > > > > propagate to the CCI-400.
> > > > > > >
> > > > > > > Our DMA and USB controllers sit downstream of the CCI-400, so they are in
> > > > > > > the system shareable domain. Only with dmb(st) does the CCI-400 wait for the
> > > > > > > previous transactions to complete. With dmb(osh), the barrier response is
> > > > > > > sent once snoop responses have been received for all earlier transactions;
> > > > > > > the CCI-400 does not wait for previous writes to finish.
> > > > > >
> > > > > > Thanks for following up. I'll cook a patch to fix this...
> > > > >
> > > > > ... and in doing so, I realised I still have a question about this.
> > > > >
> > > > > If a CPU is writing to a zero-initialised non-cacheable buffer in memory
> > > > > and does something like:
> > > > >
> > > > >         buffer[0] = 1;
> > > > >         dma_wmb();      // DMB OSHST
> > > > >         buffer[64] = 1;
> > > > >
> > > > > would a non-coherent device reading this be able to see buffer[64] == 1
> > > > > but buffer[0] == 0? In other words, do we need to upgrade the dmb_*
> > > > > barriers as well as the I/O accessors, or are they still ordered by the bus
> > > > > fabric because all of the accesses are going to the DDR?
> > > >
> > > > I think reordering is possible. As I understand it, once the CCI
> > > > acknowledges the dmb(oshst), the order of the following accesses to normal
> > > > memory is not guaranteed unless their addresses overlap.
> > >
> > > Hmm, so that's a bit rubbish because it means that
> > > load-acquire/store-release to non-cacheable memory will *not* create order
> > > for non-coherent devices, as the memory type is outer-shareable :/
> > >
> > > So rewriting the above as:
> > >
> > >         buffer[0] = 1;
> > >         smp_store_release(&buffer[64], 1);
> > >
> > > wouldn't be ordered either.
> > >
> > > Can you confirm that it is the case, please?
> >
> > I don't have a test case that can check this directly.
> > I suppose smp_mb does not work for a non-coherent DMA master.
> > If the DMA master needs to see the order, dma_wmb() is needed.
> 
> I think you had a support case open with Arm [1] which I'm not able to
> access -- please can you ask them about the two examples above?

I still have not received feedback from ARM.
But I found some relevant information:
https://developer.arm.com/documentation/den0024/a/CHDCJBGA

Unlike the data barrier instructions, which take a qualifier to control which shareability domains see the effect of the barrier, the LDAR and STLR instructions use the attribute of the address accessed.
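
As a rough userspace analogy (a sketch only, not the kernel's dma_wmb() or
smp_store_release()), the two forms can be written with C11 atomics. The
helper names and BUF_LEN are made up for illustration. On arm64, compilers
typically lower the release fence to a DMB with a fixed domain qualifier and
the release store to STLR, whose reach follows the attributes of the address
being written. C11 offers no way to choose a shareability domain, so this
only shows the shape of the two variants.

```c
#include <stdatomic.h>

#define BUF_LEN 128  /* hypothetical buffer size, large enough for index 64 */

/* Variant 1: plain stores separated by a standalone barrier.
 * The barrier instruction itself carries the domain qualifier. */
static inline void publish_with_fence(_Atomic int *buffer)
{
    atomic_store_explicit(&buffer[0], 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&buffer[64], 1, memory_order_relaxed);
}

/* Variant 2: the second store is itself a store-release.
 * There is no qualifier; ordering scope comes from the address attributes. */
static inline void publish_with_release(_Atomic int *buffer)
{
    atomic_store_explicit(&buffer[0], 1, memory_order_relaxed);
    atomic_store_explicit(&buffer[64], 1, memory_order_release);
}
```

Both helpers leave buffer[0] and buffer[64] set to 1; they differ only in
which instruction provides the ordering.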

The *address attribute* is controlled by the page table.

SH0 bits[13:12] Shareability     
  00            Non-shareable    
  01            UNPREDICTABLE
  10            Outer Shareable
  11            Inner Shareable

#define PTE_SHARED               (_AT(pteval_t, 3) << 8)         /* SH[1:0], inner shareable */

So I think smp_store_release only acts as a barrier for the inner shareable domain.
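
To make the encoding above concrete, here is a small userspace sketch that
decodes the SH[1:0] field from a descriptor value. It assumes the field sits
at bits [9:8] of the page table entry, as the PTE_SHARED definition implies.
pte_shareability() and the enum names are hypothetical, and pteval_t is
redefined locally rather than taken from the kernel headers.

```c
#include <stdint.h>

/* Local stand-in for the kernel type; assumption for this sketch only. */
typedef uint64_t pteval_t;

#define PTE_SH_SHIFT  8                              /* SH[1:0] at bits [9:8] */
#define PTE_SH_MASK   ((pteval_t)3 << PTE_SH_SHIFT)
#define PTE_SHARED    ((pteval_t)3 << PTE_SH_SHIFT)  /* inner shareable */

/* Encodings from the table quoted above. */
enum sh_domain {
    SH_NON_SHAREABLE = 0,  /* 00 */
    SH_UNPREDICTABLE = 1,  /* 01 */
    SH_OUTER         = 2,  /* 10 */
    SH_INNER         = 3   /* 11 */
};

/* Extract the shareability field from a descriptor value. */
static inline enum sh_domain pte_shareability(pteval_t pte)
{
    return (enum sh_domain)((pte & PTE_SH_MASK) >> PTE_SH_SHIFT);
}
```

With these definitions, pte_shareability(PTE_SHARED) evaluates to SH_INNER.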

Frank Li

> 
> Will
> 
> [1] https://services.arm.com/support/s/case/5003t00001RuJHw


