[EXT] Re: The problem about arm64: io: Relax implicit barriers in default I/O accessors

Will Deacon will at kernel.org
Mon Jun 21 11:13:26 PDT 2021


On Mon, Jun 21, 2021 at 05:56:43PM +0000, Frank Li wrote:
> 
> 
> > -----Original Message-----
> > From: Will Deacon <will at kernel.org>
> > Sent: Monday, June 21, 2021 12:00 PM
> > To: Frank Li <frank.li at nxp.com>
> > Cc: Catalin Marinas <catalin.marinas at arm.com>; Zhi Li <lznuaa at gmail.com>;
> > Shenwei Wang <shenwei.wang at nxp.com>; Han Xu <han.xu at nxp.com>; Nitin Garg
> > <nitin.garg at nxp.com>; Jason Liu <jason.hui.liu at nxp.com>; linux-arm-
> > kernel at lists.infradead.org
> > Subject: Re: [EXT] Re: The problem about arm64: io: Relax implicit barriers
> > in default I/O accessors
> > 
> > On Mon, Jun 21, 2021 at 05:26:41PM +0100, Will Deacon wrote:
> > > On Mon, Jun 21, 2021 at 04:11:57PM +0000, Frank Li wrote:
> > > > > > Oh, interesting. Maybe this is a case where OSH vs SY actually makes a
> > > > > > difference. I'm not quite sure what it means for the coherency of normal,
> > > > > > non-cacheable accesses (which are outer-shareable) so that probably needs a
> > > > > > bit more thought.
> > > > >
> > > > > Can you confirm that the issue *does* still occur if you use dmb(osh)
> > > > > instead of dmb(oshst), please?
> > > >
> > > > After getting ARM support
> > > > (https://services.arm.com/support/s/case/5003t00001RuJHw),
> > > > this issue has made some progress.
> > > >
> > > > Our system is configured with SYSBARDISABLE = 0x0, so ARM core barriers
> > > > propagate to the CCI-400.
> > > >
> > > > Our DMA and USB controllers are located downstream of the CCI-400, so USB
> > > > and DMA are in the system shared domain. Only with dmb(st) does the
> > > > CCI-400 wait for previous transactions to complete. With dmb(osh), the
> > > > response is sent once snoop responses have been received for all earlier
> > > > transactions; the CCI-400 doesn't wait for previous writes to finish.
> > >
> > > Thanks for following up. I'll cook a patch to fix this...
> > 
> > ... and in doing so, I realised I still have a question about this.
> > 
> > If a CPU is writing to a zero-initialised non-cacheable buffer in memory
> > and does something like:
> > 
> >         buffer[0] = 1;
> >         dma_wmb();      // DMB OSHST
> >         buffer[64] = 1;
> > 
> > would a non-coherent device reading this be able to see buffer[64] == 1
> > but buffer[0] == 0? In other words, do we need to upgrade the dmb_* barriers
> > as well as the I/O accessors, or are they still ordered by the bus fabric
> > because all of the accesses are going to the DDR?
> 
> I think reordering is possible. According to my understanding, if the CCI
> acks the dmb(oshst), the order of the following accesses is not guaranteed
> for normal memory when the addresses do not overlap.

Hmm, so that's a bit rubbish because it means that
load-acquire/store-release to non-cacheable memory will *not* create order
for non-coherent devices, as the memory type is outer-shareable :/

So rewriting the above as:

	buffer[0] = 1;
	smp_store_release(&buffer[64], 1);

wouldn't be ordered either.

Can you confirm that it is the case, please?

Will
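
For readers following along, the two publication patterns discussed above
can be sketched in userspace C11 as below. This is only an analogy and not
kernel code: atomic_thread_fence() stands in for dma_wmb(), a release store
stands in for smp_store_release(), and the buffer, its length and all
function names are hypothetical. C11 atomics only describe ordering between
CPU threads; they say nothing about what a non-coherent device behind the
CCI-400 observes, which is exactly the open question in this thread.

```c
#include <stdatomic.h>
#include <stddef.h>

#define BUF_LEN 128

/* Hypothetical stand-in for the zero-initialised buffer in the example. */
static _Atomic int buffer[BUF_LEN];

/* Pattern 1: two stores separated by a write barrier.
 * Assumption: a release fence plays the role of dma_wmb() here. */
static void publish_with_fence(void)
{
	atomic_store_explicit(&buffer[0], 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&buffer[64], 1, memory_order_relaxed);
}

/* Pattern 2: the second store is itself a store-release.
 * Assumption: this plays the role of smp_store_release(&buffer[64], 1). */
static void publish_with_release(void)
{
	atomic_store_explicit(&buffer[0], 1, memory_order_relaxed);
	atomic_store_explicit(&buffer[64], 1, memory_order_release);
}

/* Observer on another CPU thread: if buffer[64] is already 1, then
 * buffer[0] must be 1 as well. Returns 1 when the view is consistent. */
static int observer_sees_order(void)
{
	if (atomic_load_explicit(&buffer[64], memory_order_acquire) == 1)
		return atomic_load_explicit(&buffer[0], memory_order_relaxed) == 1;
	return 1; /* buffer[64] not published yet, nothing to check */
}

/* Return the buffer to its zero-initialised state between experiments. */
static void reset_buffer(void)
{
	for (size_t i = 0; i < BUF_LEN; i++)
		atomic_store_explicit(&buffer[i], 0, memory_order_relaxed);
}
```

Between coherent CPU threads, observer_sees_order() is guaranteed to return 1
for both patterns. The concern raised above is that a device outside the
outer-shareable domain may not get the same guarantee from either one.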


