[RFC PATCH 00/10] Add Fujitsu A64FX soc entry/hardware barrier driver

misono.tomohiro at fujitsu.com misono.tomohiro at fujitsu.com
Thu Feb 18 04:49:07 EST 2021


> > > > Also, It is common usage that each running thread is bound to one PE in
> > > > multi-threaded HPC applications.
> > >
> > > I think the expectation that all threads are bound to a physical CPU
> > > makes sense for using this feature, but I think it would be necessary
> > > to enforce that, e.g. by allowing only threads to enable it after they
> > > are isolated to a non-shared CPU, and automatically disabling it
> > > if the CPU isolation is changed.
> > >
> > > For the user space interface, something based on process IDs
> > > seems to make more sense to me than something based on CPU
> > > numbers. All of the above does require some level of integration
> > > with the core kernel of course.
> > >
> > > I think the next step would be to try to come up with a high-level
> > > user interface design that has a chance to get merged, rather than
> > > addressing the review comments for the current implementation.

Hello,

Sorry for the late response. While thinking about new approaches, I came up
with a different idea and would like to hear your opinions. How about
offloading all control to user space, with the driver just offering
read/write access to the needed registers? Let me explain in detail.

I searched for similar functions in other products but could not find
any. Also, this hardware barrier performs intra-NUMA synchronization and
is hard to use as a general inter-process barrier. So I think
generalizing this feature in the kernel would not go well.

As I said, this is mainly for HPC applications. In the usual situation, the
user has full control of the PC nodes when running an HPC application and
thus bears full responsibility for the processes running on the machine.
Offloading all control of these registers to the user is acceptable in that
case (i.e. the driver just offers access to the registers and does not
control them). This is safe for kernel operation, as manipulating
barrier-related registers only affects the user application.

With this approach we could remove the ioctls and control logic from the
driver, but we still need some way to access the needed registers. I first
considered an approach like x86's MSR driver, but I know that idea was
rejected recently over security concerns:
 https://lore.kernel.org/linux-arm-kernel/20201130174833.41315-1-rongwei.wang@linux.alibaba.com/ 

Based on these observations, I currently have two ideas:
 1) make the driver expose only a sysfs interface for reading/writing
    A64FX's barrier registers
or
 2) generalize (1) in some way, i.e. add some mechanism to expose
    CPU-defined registers which can be safely accessed from user space
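
To make idea (1) a bit more concrete, here is a rough user-space sketch of
how an application might read and write one register through a sysfs-style
attribute file. Nothing here is an existing interface: the helper names
reg_attr_write()/reg_attr_read(), the one-value-per-file layout and the hex
text format are all assumptions for illustration, and the attribute path is
left to the caller because no such path is defined yet.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write one 64-bit register value to an attribute
 * file as hex text ("0x..."). Returns 0 on success, -1 on failure.
 * The text format is an assumption, not a defined ABI.
 */
static int reg_attr_write(const char *path, uint64_t val)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "0x%" PRIx64 "\n", val) > 0;
	/* fclose() flushes the buffer, so a failed write can show up here */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: read one 64-bit register value back from an
 * attribute file containing hex text. Returns 0 on success, -1 on failure.
 */
static int reg_attr_read(const char *path, uint64_t *val)
{
	FILE *f;
	int n;

	assert(path != NULL && val != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	/* %x-style conversions accept an optional "0x" prefix */
	n = fscanf(f, "%" SCNx64, val);
	fclose(f);
	return n == 1 ? 0 : -1;
}
```

With something like this, all barrier setup and teardown logic would live in
the user-space runtime, and the driver would only have to validate and
forward single register accesses.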

Are these ideas acceptable directions to explore for getting merged upstream?
I'd appreciate any criticism/comments.

Regards, 
Tomohiro

