[PATCH v1 3/7] DCE/DSE: Add a new scripts/Makefile.syscalls

Zhangjin Wu falcon at tinylab.org
Tue Oct 3 01:16:51 PDT 2023


Hi, Arnd

> On Tue, Sep 26, 2023, at 00:38, Zhangjin Wu wrote:
> > When CONFIG_TRIM_UNUSED_SYSCALLS is enabled, get used syscalls from
> > CONFIG_USED_SYSCALLS. CONFIG_USED_SYSCALLS may be a list of used
> > syscalls or a file to store such a list.
> >
> > If CONFIG_USED_SYSCALLS is configured as a list of the used syscalls,
> > directly record them in a used_syscalls variable, if it is a file to
> > store the list, record the file name to the used_syscalls_file variable
> > and put its content to the used_syscalls variable.
> >
> > Signed-off-by: Zhangjin Wu <falcon at tinylab.org>
> 
> I like the idea of configuring the set of syscalls more, but we
> should probably discuss the implementation of this here. You
> introduce two new ways of doing this, on top of the existing
> coarse-grained method (per syscall class Kconfig symbols).
> 
> Both methods seem a little awkward to me, but are doable
> in principle if we can't come up with a better way. However,
> I'd much prefer to not add both the Kconfig symbol and the
> extra file here, since at least one of them is redundant.
>

OK, it is reasonable to remove this extra scripts/Makefile.syscalls if
we only support configuring the set of syscalls via the Kconfig symbol.

> Do you have automatic tooling to generate these lists from
> a profile, or do you require manually writing them? Do you
> have an example list?
>

Yes, we have some tooling, but no generic tool that works across
architectures, libcs and systems, and I plan to defer this work to the
part 3 patchset of the whole series.

I have tried two methods: one is objdumping the binary, the other is
recording the used syscalls at the libc linking stage. Both of them are
limited to single-application systems; for more complicated systems,
the libraryopt [1] tool may be required to find the used libcs first.

The objdump method simply greps for the 'syscall' instructions and
extracts the syscall number from the instruction that writes the 'num'
register. I have encountered some exceptions, though: the nearest write
to the 'num' register is not always the one that belongs to the
'syscall' instruction, so we may get a wrong syscall number. I do have a
demo script, but it is currently very ugly [2]. I have also found that
Tim has prepared such a Python script [3]; I am not sure whether it can
get a wrong syscall number too. If not, it should be a good base for our
future development.
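
A minimal sketch of the objdump idea, not the script from [2] or [3].
It assumes x86-64 `objdump -d` output in the default AT&T syntax, where
the syscall number is passed in %eax/%rax. The function name
find_syscall_numbers and the regular expressions are hypothetical. To
avoid the wrong-number problem described above, it only accepts an
immediate load into %eax/%rax and gives up on a call site as soon as it
meets anything else that targets that register:

```python
import re

# Immediate load into the x86-64 syscall-number register,
# e.g. "mov $0x3c,%eax" (objdump prints immediates in hex).
MOV_IMM = re.compile(r"\bmov[lq]?\s+\$0x([0-9a-f]+),%[er]ax\b")
# Any instruction whose last operand is %eax/%rax. This is treated
# conservatively as a write, even for compares that only read it.
TARGETS_AX = re.compile(r",%[er]ax\s*$")
SYSCALL = re.compile(r"\bsyscall\s*$")
# Symbol header such as "0000000000401000 <_start>:".
LABEL = re.compile(r"^[0-9a-f]+ <.*>:$")


def find_syscall_numbers(disasm):
    """Return (set of resolved syscall numbers, count of unresolved sites)."""
    lines = [line.rstrip() for line in disasm.splitlines()]
    numbers, unresolved = set(), 0
    for i, line in enumerate(lines):
        if not SYSCALL.search(line):
            continue
        found = None
        # Walk backwards from the syscall instruction.
        for prev in reversed(lines[:i]):
            # Stop at a symbol boundary or at an earlier syscall,
            # which overwrites the register with its return value.
            if LABEL.match(prev.strip()) or SYSCALL.search(prev):
                break
            m = MOV_IMM.search(prev)
            if m:
                found = int(m.group(1), 16)
                break
            if TARGETS_AX.search(prev):
                break  # number comes from somewhere we cannot follow
        if found is None:
            unresolved += 1
        else:
            numbers.add(found)
    return numbers, unresolved
```

The second return value counts the call sites that could not be
resolved, so they can be checked by hand instead of being silently
guessed. The backward scan is purely textual and does not follow
branches, so it remains a heuristic.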

After getting the syscall numbers, it is not that hard to convert them
to syscall names with the help of <unistd.h>, or to let the kernel
directly accept syscall numbers via the Kconfig symbol (I have tried
that, but it makes the setting hard to read, so the support has been
removed in this patchset).
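
As a rough illustration of the number-to-name conversion, the sketch
below parses "#define __NR_<name> <number>" lines from the text of a
unistd header. The helper names parse_unistd and numbers_to_names are
hypothetical, and which header file to read depends on the architecture
and is left to the caller. Defines whose value is not a plain decimal
number are skipped, and pseudo-entries that follow the same pattern are
not filtered out:

```python
import re

# One "#define __NR_<name> <decimal number>" per line.
NR_DEFINE = re.compile(
    r"^[ \t]*#[ \t]*define[ \t]+__NR_(\w+)[ \t]+(\d+)[ \t]*$",
    re.MULTILINE,
)


def parse_unistd(header_text):
    """Build a {syscall number: syscall name} table from header text."""
    table = {}
    for name, num in NR_DEFINE.findall(header_text):
        # Keep the first name seen for a number if there are aliases.
        table.setdefault(int(num), name)
    return table


def numbers_to_names(numbers, table):
    """Map numbers to names; unknown numbers stay visible in the result."""
    return sorted(table.get(n, "unknown_%d" % n) for n in numbers)
```

The resulting names could then be written out as the list that
CONFIG_USED_SYSCALLS expects.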

Another method we have tried is recording the used syscalls during the
linking stage of the libc (only for Nolibc currently) with the help of
sections-gc [4].

It uses the following logic:

- Rename all of the my_syscall<N> macros to _my_syscall<N>
- The new my_syscall<N> macros use .pushsection to add empty and not actually used .rodata.syscall.__NR_<syscall_name> sections
- Enable -ffunction-sections -fdata-sections, -Wl,--gc-sections and -Wl,--print-gc-sections
- The unused sys_<syscall_name> functions will be eliminated; record that syscall list as group1
- All of the .rodata.syscall.__NR_<syscall_name> sections will be eliminated; record that syscall list as group2
- "group2 - group1" should be the used syscalls

Basic testing shows that this works well with Nolibc, but it requires
modifying the libc source code.
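
The "group2 - group1" step can be sketched as a small post-processing
pass over the linker output. This is only an illustration under several
assumptions: that the -Wl,--print-gc-sections messages have the GNU ld
form "removing unused section '<section>' in file '<file>'", that
-ffunction-sections places each function in a .text.sys_<syscall_name>
section, and that sys_<syscall_name> maps one-to-one to
__NR_<syscall_name>. The function name used_syscalls is hypothetical:

```python
import re

# Assumed GNU ld message printed by --print-gc-sections.
REMOVED = re.compile(r"removing unused section '([^']+)'")
FUNC_PREFIX = ".text.sys_"              # assumed -ffunction-sections naming
MARK_PREFIX = ".rodata.syscall.__NR_"   # marker sections from my_syscall<N>


def used_syscalls(gc_log):
    """Return the sorted names in group2 that are not in group1."""
    group1, group2 = set(), set()
    for section in REMOVED.findall(gc_log):
        if section.startswith(FUNC_PREFIX):
            # Eliminated sys_<name> function: the syscall is unused.
            group1.add(section[len(FUNC_PREFIX):])
        elif section.startswith(MARK_PREFIX):
            # Eliminated marker section: the syscall exists in the libc.
            group2.add(section[len(MARK_PREFIX):])
    return sorted(group2 - group1)
```

Where a wrapper is implemented with a differently named syscall, the
one-to-one naming assumption breaks and a mapping table would be needed
on top of this.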

Thanks,
Zhangjin Wu

---
[1]: http://libraryopt.sourceforge.net/
[2]: https://github.com/tinyclub/linux-lab/blob/master/tools/nolibc/dump.sh
[3]: https://github.com/tbird20d/auto-reduce/blob/master/programs/find-syscalls.py
[4]: https://lore.kernel.org/lkml/cbcbfbb37cabfd9aed6088c75515e4ea86006cff.1676594211.git.falcon@tinylab.org/

>       Arnd


