The problem about arm64: io: Relax implicit barriers in default I/O accessors

Catalin Marinas catalin.marinas at arm.com
Thu Jun 17 02:27:44 PDT 2021


On Wed, Jun 16, 2021 at 02:24:39PM -0500, Zhi Li wrote:
> On Wed, Jun 16, 2021 at 2:18 PM Frank Li <frank.li at nxp.com> wrote:
> > Will Deacon wrote:
> > > It would also be helpful to know a bit more about the hardware:
> > >
> > >   - What is the "internal bus fabric"?
> 
> > It looks like what ARM calls an "Interconnect": multiple AXI masters and
> > multiple AXI slaves connected together.
> 
> I drew a simplified bus structure.
>  
>         ┌──────┐ ┌────┐
>         │ A53  │ │A72 │
>         └───┬──┘ └─┬──┘
>             │      │
>         ┌───▼──────▼──┐
>         │    CCI400   │
>         └─────┬───────┘
>               │   1 (a)write to ddr (normal uncached memory)
>               │   DMB OSHST
>               │   2 (b)write to usb register(device, nGnRE)
>         ┌─────▼───────────────────────┐       ┌───────────┐
>         │                             ◄───────┤   GPU     │
>         │     Bus fabric              │       │           │
>         └────────────────────────────┬┘       └───────────┘
> 3 (b) reach usb   ▲ 4 usb read   ▲   │ 6.(a)reach
>          │        │   ddr        │   │
>       ┌──▼────────┴─┐            │   │
>       │             │            │   │
>       │  USB        │      5.usb │   │
>       │             │      read  │   │
>       └─────────────┘            │   │
>                                ┌─┴───▼─┐
>                                │       │
>                                │ DDR   │
>                                │       │
>                                └───────┘

Since you sent an HTML message, it was rejected by the list server. The
above is a plain-text rendition by w3m (I also changed barrier() to DMB
OSHST).

Is the DMB propagated to the bus fabric? IIUC, our logic is that if the
write (b) to USB is observable by, let's say, the GPU, the same GPU
should also observe the write (a) to DDR. Since the write (a) to DDR is
globally observable, the USB device read at (4) should also observe it
(well, we may be wrong).

So while the bus fabric could ensure the ordering of the DDR write (a)
and the USB write (b) from the perspective of a third observer (the
GPU), I don't see how it can force it from the USB perspective as it
cannot observe the write (b) to its registers.

Replacing the DMB with the DSB forces the write (a) to reach the DDR on
your platform.
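
To make the two cases concrete, here is a minimal sketch of the sequence in
the diagram, not the actual driver or accessor code. The helper names
(kick_dma_dmb, kick_dma_dsb, the barrier_* macros) and the parameters are
hypothetical; on anything other than arm64 the macros fall back to a generic
fence only so that the sketch compiles, without the same hardware behaviour.

```c
#include <stdint.h>

/* Hypothetical barrier helpers, for illustration only. */
#if defined(__aarch64__)
#define barrier_dmb_oshst() __asm__ volatile("dmb oshst" ::: "memory")
#define barrier_dsb_oshst() __asm__ volatile("dsb oshst" ::: "memory")
#else
/* Assumption: stand-in fence so the sketch builds on other architectures. */
#define barrier_dmb_oshst() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define barrier_dsb_oshst() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* DMB variant: orders (a) before (b), but does not wait for (a) to complete. */
static inline void kick_dma_dmb(volatile uint32_t *ddr_desc, uint32_t desc_val,
                                volatile uint32_t *usb_reg, uint32_t reg_val)
{
    *ddr_desc = desc_val;   /* 1: (a) write to DDR (normal uncached memory) */
    barrier_dmb_oshst();    /* DMB OSHST */
    *usb_reg = reg_val;     /* 2: (b) write to USB register (device, nGnRE) */
}

/* DSB variant: the barrier does not complete until (a) has completed. */
static inline void kick_dma_dsb(volatile uint32_t *ddr_desc, uint32_t desc_val,
                                volatile uint32_t *usb_reg, uint32_t reg_val)
{
    *ddr_desc = desc_val;   /* 1: (a) write to DDR (normal uncached memory) */
    barrier_dsb_oshst();    /* DSB OSHST */
    *usb_reg = reg_val;     /* 2: (b) write to USB register (device, nGnRE) */
}
```

Both variants leave the same values in memory; the difference is only in
when the USB write may be issued relative to the completion of the DDR write.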

Will, any better idea of why it goes wrong?

-- 
Catalin


