linux uml segfault

Anton Ivanov anton.ivanov at kot-begemot.co.uk
Thu Mar 4 07:45:45 GMT 2021


On 04/03/2021 05:38, Hajime Tazaki wrote:
> 
> On Thu, 04 Mar 2021 07:40:00 +0900,
> Johannes Berg wrote:
>>
>> I think the problem is here:
>>
>>> #24 0x000000006080f234 in ipc_init_ids (ids=0x60c60de8 <init_ipc_ns+8>)
>>> at ipc/util.c:119
>>> #25 0x0000000060813c6d in sem_init_ns (ns=0x60d895bb <textbuf+91>) at
>>> ipc/sem.c:254
>>> #26 0x0000000060015b5d in sem_init () at ipc/sem.c:268
>>> #27 0x00007f89906d92f7 in ?? () from /lib/x86_64-linux-
>>> gnu/libcom_err.so.2
>>
>> You're in the init of libcom_err.so.2, which is loaded by
>>
>>> "libnss_nis.so.2"
>>
>> which is loaded by normal NSS code (getgrnam):
>>
>>> #40 0x00007f89909bf3a6 in nss_load_library (ni=ni@entry=0x61497db0) at
>>> nsswitch.c:359
>>> #41 0x00007f89909bfc39 in __GI___nss_lookup_function (ni=0x61497db0,
>>> fct_name=<optimized out>, fct_name@entry=0x7f899089b020 "setgrent") at
>>> nsswitch.c:467
>>> #42 0x00007f899089554b in init_nss_interface () at nss_compat/compat-
>>> grp.c:83
>>> #43 init_nss_interface () at nss_compat/compat-grp.c:79
>>> #44 0x00007f8990895e35 in _nss_compat_getgrnam_r (name=0x7f8990a2a1e0
>>> "tty", grp=0x7ffe3e7a2910, buffer=0x7ffe3e7a24e0 "", buflen=1024,
>>> errnop=0x7f899089eb00) at nss_compat/compat-grp.c:486
>>> #45 0x00007f8990968b85 in __getgrnam_r (name=name@entry=0x7f8990a2a1e0
>>> "tty", resbuf=resbuf@entry=0x7ffe3e7a2910,
>>> buffer=buffer@entry=0x7ffe3e7a24e0 "", buflen=1024,
>>> result=result@entry=0x7ffe3e7a2908)
>>>      at ../nss/getXXbyYY_r.c:315
>>
>>
>> You have a strange nsswitch configuration that causes all of this
>> (libnss_nis.so.2 -> libcom_err.so.2) to get loaded.
>>
>> Now libcom_err.so.2 is trying to call sem_init(), and that gets ... tada
>> ... Linux's sem_init() instead of libpthread's.
>>
>> And then the crash.
>>
>> Now, I don't know how to fix it (short of changing your nsswitch
>> configuration) - maybe we could somehow rename sem_init()? Or maybe we
>> can somehow give the kernel binary a lower symbol resolution than the
>> libc/libpthread.
> 
> objcopy (from binutils) can localize symbols (i.e., objcopy -L
> sem_init $orig_file $new_file).  It can also rename symbols.  But I am
> not sure this is the ideal solution.
> 
> How does UML handle symbol conflicts between userspace code and the
> Linux kernel (like sem_init in this case)?  AFAIK, libnl has the same
> symbol as the Linux kernel (genlmsg_put), and others may as well.

It used to handle them. I do not think it does now - something broke, and 
it broke fairly recently.

I actually have something which confirms this.

I worked on a patch around 5.8-5.9 which gave the option of picking up 
the libc equivalents of the string.h functions, and there was a clear 
performance difference of ~20%+. This is because UML has no means of 
optimizing them and picks up the worst-case generic x86 versions.

I parked that for a while because I had to look at other stuff at work.

I restarted working on it after 5.10. My first observation was that, 
despite my not having changed anything in the patches, the gain was no 
longer there: the performance was the same as if the kernel were already 
picking up the libc equivalents.

I can either try to reproduce the nss config which causes the sem_init 
issue, or use my own libc patchset to dissect this. The problem commit 
should be roughly around the point where the performance difference from 
applying the "switch to libc" patches goes away.

Brgds,

A.
> 
> 
> -- Hajime
> 
> _______________________________________________
> linux-um mailing list
> linux-um at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-um
> 


-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/
