[RFC PATCH 00/10] Add Fujitsu A64FX soc entry/hardware barrier driver
Arnd Bergmann
arnd at kernel.org
Fri Jan 8 09:23:23 EST 2021
On Fri, Jan 8, 2021 at 1:54 PM Mark Rutland <mark.rutland at arm.com> wrote:
> On Fri, Jan 08, 2021 at 07:52:31PM +0900, Misono Tomohiro wrote:
> > (Resend, as the cover letter title was missing the first time. Sorry for the noise.)
> >
> > This series adds Fujitsu A64FX SoC entry in drivers/soc and hardware
> > barrier driver for it.
> >
> > [Driver Description]
> > A64FX CPU has several functions for HPC workload and hardware barrier
> > is one of them. It is a mechanism to realize fast synchronization by
> > PEs belonging to the same L3 cache domain by using implementation
> > defined hardware registers.
> > For more details, see A64FX HPC extension specification in
> > https://github.com/fujitsu/A64FX
> >
> > The driver mainly offers a set of ioctls to manipulate related registers.
> > Patches 1-9 implement the driver code, and patch 10 finally adds the Kconfig,
> > Makefile and MAINTAINERS entries for the driver.
>
> I have a number of concerns here, and at a high level, I do not think
> that this is something Linux can reasonably support in its current form.
> Sorry if this comes across as harsh; I appreciate the work that has gone
> into this, and the effort to try to upstream support is great -- my
> concerns are with the overall picture.
>
> As a general rule, we avoid the use of IMPLEMENTATION DEFINED features
> in Linux, as they pose a number of correctness/safety challenges and
> come with a potentially significant long-term maintenance burden that is
> generally not justified by the features themselves. For example, such
> features are not usable under virtualization (where a hypervisor may set
> HCR_EL2.TIDCP, or fail to context-switch state that it is unaware of).

I am somewhat less concerned about the feature being implementation
defined than I am about adding a custom user interface for one
platform.

In the end, anything outside of the CPU core that ends up in a SoC
is implementation defined, and this is usually not a problem as long
as we have an abstraction in the kernel that hides the details from
the user, and the system is still functional if the implementation is
turned off for whatever reason.

> Secondly, the intended usage model appears to expose this to EL0 for
> direct access, and the code seems to depend on threads being pinned, but
> AFAICT this is not enforced and there is no provision for
> context-switch, thread migration, or interaction with ptrace. I fear
> this is going to be very fragile in practice, and that extending that
> support in future will require much more complexity than is currently
> apparent, with potentially invasive changes to arch code.

Right, this is the main problem I see, too. I had not even realized
that this will have to tie in with user space threads in some form, but
you are right that once this has to interact with the CPU scheduler,
it all breaks down.

One way I can imagine this working out is to tie into the cpuset
mechanism that is used for isolating threads to CPU cores, and
then provide a cpuset interface that has the desired behavior
but that can fall back to a generic implementation with the same
or stronger (but normally slower) semantics.
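
To illustrate the fallback idea above, here is a minimal user-space sketch in C. It is not kernel code, not the proposed cpuset interface, and not taken from the patch series: every name in it (sync_barrier, sync_barrier_ops, generic_wait, run_demo) is hypothetical. It only shows callers using one backend-neutral wait call, with a portable software barrier (a counter plus a generation number) standing in where no hardware backend is available.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical backend-neutral barrier; names are illustrative only. */
struct sync_barrier;

struct sync_barrier_ops {
    void (*wait)(struct sync_barrier *b);
};

struct sync_barrier {
    const struct sync_barrier_ops *ops;
    unsigned int nthreads;
    atomic_uint arrived;     /* threads that reached the barrier this round */
    atomic_uint generation;  /* bumped once per completed round */
};

/* Generic fallback: slower than a hardware barrier, but portable. */
static void generic_wait(struct sync_barrier *b)
{
    unsigned int gen = atomic_load(&b->generation);

    if (atomic_fetch_add(&b->arrived, 1) + 1 == b->nthreads) {
        /* Last arrival: reset the counter, then release the waiters. */
        atomic_store(&b->arrived, 0);
        atomic_fetch_add(&b->generation, 1);
    } else {
        while (atomic_load(&b->generation) == gen)
            sched_yield();
    }
}

static const struct sync_barrier_ops generic_ops = { .wait = generic_wait };

static void sync_barrier_init(struct sync_barrier *b, unsigned int nthreads)
{
    /* Assumption: a platform-specific backend would be chosen here if one
     * existed; this sketch only ever installs the generic fallback. */
    b->ops = &generic_ops;
    b->nthreads = nthreads;
    atomic_init(&b->arrived, 0);
    atomic_init(&b->generation, 0);
}

static void sync_barrier_wait(struct sync_barrier *b)
{
    b->ops->wait(b);
}

#define NTHREADS 4
#define ROUNDS 100

static struct sync_barrier demo_barrier;
static atomic_uint demo_counter;
static atomic_uint demo_errors;

static void *demo_worker(void *arg)
{
    (void)arg;
    for (unsigned int r = 1; r <= ROUNDS; r++) {
        atomic_fetch_add(&demo_counter, 1);
        sync_barrier_wait(&demo_barrier);
        /* After the barrier, every thread has contributed to round r. */
        if (atomic_load(&demo_counter) != r * NTHREADS)
            atomic_fetch_add(&demo_errors, 1);
        /* Second barrier keeps fast threads out of the next round. */
        sync_barrier_wait(&demo_barrier);
    }
    return NULL;
}

/* Runs the demo and returns the number of ordering violations observed. */
static unsigned int run_demo(void)
{
    pthread_t threads[NTHREADS];

    atomic_store(&demo_counter, 0);
    atomic_store(&demo_errors, 0);
    sync_barrier_init(&demo_barrier, NTHREADS);
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&threads[i], NULL, demo_worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], NULL);
    return atomic_load(&demo_errors);
}
```

Calling run_demo() from a main() and building with -pthread exercises the fallback. Because callers only ever go through sync_barrier_wait(), the same program would keep working if a faster backend were installed or removed, which is the property asked for above.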
> Thirdly, this requires userspace software to be intimately familiar with
> the HW platform that it is running on (both in terms of using IMP-DEF
> instructions and needing to know the physical layout), rather than being
> generic and portable, which I don't believe is something that we wish to
> encourage. I also think this is unlikely to be supported by generic
> software because of the lack of portability, and consequently I struggle
> to believe that this will see significant usage.

Agreed as well.

      Arnd