[RFC PATCH 2/4] DRM: Add support of AI Processor Unit (APU)

Christian König christian.koenig at amd.com
Wed Sep 22 23:17:33 PDT 2021


Am 23.09.21 um 02:58 schrieb Dave Airlie:
> On Sat, 18 Sept 2021 at 07:57, Alexandre Bailon <abailon at baylibre.com> wrote:
>> Some Mediatek SoC provides hardware accelerator for AI / ML.
>> This driver provides the infrastructure to manage memory
>> shared between host CPU and the accelerator, and to submit
>> jobs to the accelerator.
>> The APU itself is managed by remoteproc so this drivers
>> relies on remoteproc to found the APU and get some important data
>> from it. But, the driver is quite generic and it should possible
>> to manage accelerator using another ways.
>> This driver doesn't manage itself the data transmitions.
>> It must be registered by another driver implementing the transmitions.
>>
>> Signed-off-by: Alexandre Bailon <abailon at baylibre.com>
>> [SNIP]

>> Please refer to
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.kernel.org%2Fdoc%2FDocumentation%2Fioctl%2Fbotching-up-ioctls.rst&data=04%7C01%7Cchristian.koenig%40amd.com%7C53a0ef2630404ddc4d9408d97e2d409c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637679555123878415%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=6oVXAAOjQX%2FnDzJxZIAALqjDourHdrdGF6QVQKR58KI%3D&reserved=0
>>
>> here and below in many places.
>>
>> There's a lot of missing padding/alignment here.

There is also the pahole utility which show you nicely where you need 
padding for your IOCTL structures.

For example "pahole drivers/gpu/drm/amd/amdgpu/amdgpu.ko -C 
drm_amdgpu_gem_va" gives you:

struct drm_amdgpu_gem_va {
     __u32                      handle;               /*     0     4 */
     __u32                      _pad;                 /*     4     4 */
     __u32                      operation;            /*     8     4 */
     __u32                      flags;                /*    12     4 */
     __u64                      va_address;           /*    16     8 */
     __u64                      offset_in_bo;         /*    24     8 */
     __u64                      map_size;             /*    32     8 */

     /* size: 40, cachelines: 1, members: 7 */
     /* last cacheline: 40 bytes */
};

And as you can see we have added the _pad field to our IOCTL parameter 
structure to properly align the 64bit members.

Regards,
Christian.

>>
>> I'm trying to find the time to review this stack in full, any writeups
>> on how this is used from userspace would be useful (not just the code
>> repo, but some sort of how do I get at it) it reads as kinda generic
>> (calling it apu), but then has some specifics around device binding.
>>
>> Dave.




More information about the linux-arm-kernel mailing list