Is it possible to run a linux on multiple clusters if CCI is missing
Li Chen
me at linux.beauty
Wed Jul 13 03:34:05 PDT 2022
Hi Arnd
---- On Wed, 13 Jul 2022 15:19:23 +0800 Arnd Bergmann <arnd at arndb.de> wrote ---
> On Wed, Jul 13, 2022 at 6:07 AM Li Chen <me at linux.beauty> wrote:
> >
> > Got it. I will come to #armlinux after fixing the IRC connection issue with my company network.
> > But I prefer the mailing list over IRC because most IRC channels don't have archives available, so they
> > are not searchable on Google.
>
> The easiest way is usually to pay for an irccloud.com account, which gives you
> access through normal https connections and a downloadable archive.
irccloud works perfectly for me!
>
> > > Regarding your question, I'm pretty sure that you cannot run Linux across
> > > multiple clusters without a CCI, as the kernel, among other things, relies on behavior
> > > documented in Documentation/memory-barriers.txt that is not guaranteed
> > > otherwise.
> >
> > Good point, but I don't know how CCI deals with the memory barrier, can you share more about it?
> >
> > Apart from memory barriers and coherence, would TLB invalidation also fail to work properly
> > across four clusters without CCI?
>
> The problem is more fundamental than this: without cache coherency, a CPU
> can keep an outdated copy of a cache line in its local cache indefinitely after
> another CPU writes to it, and the barriers that are meant to serialize access
> have no effect here. No idea what happens with TLB invalidation, I suppose that
> is similar, but you won't even get to this.
>
> > > The only way I can think of for using that kind of system would be to run
> > > a separate kernel on each cluster, and assigning each device to one of the
> > > instances, and use explicit cache management for a communication
> > > channel between them.
> >
> > Sorry for my three noob questions:
> > 1. How to assign each device to one of the instances? Can NIC do it?
>
> The easiest way would be to just have DT files that don't overlap. Since you
> cannot share memory or devices, you already need a separate set of devices
> (root file system, network, etc) for each instance.
>
> In a more sophisticated setup, you could have a small hypervisor that
> controls all instances and provides memory protection and communication
> between them.
>
> > 2. When you say "explicit cache management", do you mean flush/invalidate
> > cache in kernel manually with flush_cache* and flush_tlb*?
>
> Each instance would appear as a DMA device to the other ones, so this
> comes down to the normal dma-mapping.h interfaces. You could use
> uncached memory from dma_alloc_coherent() for simple shared memory
> channels, or streaming mappings using dma_map_single() or similar
> to perform cache flushes.
>
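To make the stale-line problem and the explicit cache management concrete, here is a toy user-space C model. It is only a sketch under stated assumptions: it is not kernel code and does not use the dma-mapping.h interfaces, and every name in it (shared_line, cluster_cache, cache_clean, cache_invalidate, demo) is hypothetical. It models one shared DRAM line and a private, non-coherent copy per cluster, and shows that the reader keeps seeing the old value until the writer cleans its line and the reader invalidates its own copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64

/* Toy model: one line of shared DRAM. */
struct shared_line {
    uint8_t dram[LINE_SIZE];
};

/* Private, non-coherent copy of that line held by one cluster. */
struct cluster_cache {
    uint8_t data[LINE_SIZE];
    int valid;   /* line is present in this cluster's cache */
    int dirty;   /* line was modified locally and not written back */
};

/* Fetch the line from DRAM on a miss; a hit never looks at DRAM again. */
static void cache_fill(struct cluster_cache *c, const struct shared_line *m)
{
    if (!c->valid) {
        memcpy(c->data, m->dram, LINE_SIZE);
        c->valid = 1;
        c->dirty = 0;
    }
}

static uint8_t cpu_read(struct cluster_cache *c, const struct shared_line *m,
                        size_t off)
{
    cache_fill(c, m);
    return c->data[off];
}

static void cpu_write(struct cluster_cache *c, const struct shared_line *m,
                      size_t off, uint8_t v)
{
    cache_fill(c, m);
    c->data[off] = v;   /* the write stays in the local cache */
    c->dirty = 1;
}

/* "Clean": write a dirty line back to DRAM (done by the producer). */
static void cache_clean(struct cluster_cache *c, struct shared_line *m)
{
    if (c->valid && c->dirty) {
        memcpy(m->dram, c->data, LINE_SIZE);
        c->dirty = 0;
    }
}

/* "Invalidate": drop the local copy so the next read refetches from DRAM. */
static void cache_invalidate(struct cluster_cache *c)
{
    c->valid = 0;
    c->dirty = 0;
}

/* Returns 1 if the model behaves as described in the text above. */
static int demo(void)
{
    struct shared_line mem = { {0} };
    struct cluster_cache a = { {0}, 0, 0 };
    struct cluster_cache b = { {0}, 0, 0 };

    if (cpu_read(&b, &mem, 0) != 0)   /* B caches the old value */
        return 0;
    cpu_write(&a, &mem, 0, 42);       /* A's write stays in A's cache */
    if (cpu_read(&b, &mem, 0) != 0)   /* B still sees the stale 0 */
        return 0;

    cache_clean(&a, &mem);            /* A writes the line back to DRAM */
    if (cpu_read(&b, &mem, 0) != 0)   /* B hits its own stale copy */
        return 0;

    cache_invalidate(&b);             /* B drops its copy */
    return cpu_read(&b, &mem, 0) == 42;   /* B refetches and sees 42 */
}
```

In this model nothing resembling a barrier would help, because the reader never goes back to DRAM on a hit. Only the clean on the writer side plus the invalidate on the reader side make the new value visible, which is roughly the job the streaming DMA mappings mentioned above take care of in a real kernel.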
> > 3. What kind of communication channel? share a region of memory as
> > communication and monitor it with PMU?
>
> A SoC design that is meant for running multiple OSs would typically have
> some hardware support for this, using a combination of mailbox,
> sram, doorbell or hwspinlock, which one can use to build higher-level
> abstractions for device drivers. This is obviously hardware specific.
>
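As an illustration of a higher-level abstraction built on such primitives, here is a minimal user-space sketch of a one-way message ring in shared memory with a doorbell. Everything here is an assumption made for illustration: the names (msg_ring, doorbell, ring_send, ring_recv), the slot count and the message size are hypothetical, the doorbell is just a counter standing in for a hardware mailbox or doorbell register, and on real non-coherent hardware the marked spots would need the explicit cache maintenance discussed earlier.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLOTS     4    /* hypothetical ring depth (power of two) */
#define MSG_BYTES 32   /* hypothetical fixed message size */

/* Lives in the memory region shared by both instances. */
struct msg_ring {
    uint32_t head;                  /* written only by the sender */
    uint32_t tail;                  /* written only by the receiver */
    uint8_t  slot[SLOTS][MSG_BYTES];
};

/* Stand-in for a hardware doorbell or mailbox: counts pending notifications. */
struct doorbell {
    unsigned pending;
};

/* Copy one message into the ring and ring the doorbell; -1 if full or too big. */
static int ring_send(struct msg_ring *r, struct doorbell *db,
                     const void *buf, size_t len)
{
    uint8_t *dst;

    if (len > MSG_BYTES || r->head - r->tail == SLOTS)
        return -1;

    dst = r->slot[r->head % SLOTS];
    memset(dst, 0, MSG_BYTES);
    memcpy(dst, buf, len);
    /* Real hardware: clean (write back) the slot before publishing it. */
    r->head++;
    db->pending++;      /* notify the other side */
    return 0;
}

/* Copy the oldest message out into buf (MSG_BYTES bytes); -1 if nothing pending. */
static int ring_recv(struct msg_ring *r, struct doorbell *db, void *buf)
{
    if (db->pending == 0 || r->head == r->tail)
        return -1;

    db->pending--;
    /* Real hardware: invalidate the slot before reading it. */
    memcpy(buf, r->slot[r->tail % SLOTS], MSG_BYTES);
    r->tail++;
    return 0;
}
```

Keeping head writable only by the sender and tail only by the receiver means neither side ever has to modify a word the other side owns, so a hwspinlock is not needed for this one-directional case; a second ring would be used for the opposite direction.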
> Most likely, the answer is that it's not worth trying to run Linux on
> more than one cluster given this type of hardware. A more useful
> model might be to have Linux on one cluster and run a single
> bare-metal application on the other ones, which is accessed from
> Linux using a device driver.
>
> Arnd
>
Thanks a lot for your answers.
Regards,
Li