[FS#766] Intermittent SIGSEGV crash of dnsmasq-full

LEDE Bugs lede-bugs at lists.infradead.org
Tue May 9 00:47:58 PDT 2017


The following task has a new comment added:

FS#766 - Intermittent SIGSEGV crash of dnsmasq-full 
User who did this - guidosarducci (guidosarducci)

----------
Hehe, very optimistic of you to close this...

I saw the update from Simon Kelley (thank you!) on the Dnsmasq-discuss mailing list and built an updated LEDE dnsmasq-2.77rc1 package to test it (the required patch is attached).

The prior minimal test case now passes, but the original production config file triggers a horrible SIGSEGV crash loop (log attached):
Mon May  8 22:59:46 2017 kern.info kernel: [1738736.539480] do_page_fault(): sending SIGSEGV to dnsmasq for invalid read access from 00000000
Mon May  8 22:59:46 2017 kern.info kernel: [1738736.548375] epc = 0040e79b in dnsmasq[400000+2d000]
Mon May  8 22:59:46 2017 kern.info kernel: [1738736.553564] ra  = 0040e773 in dnsmasq[400000+2d000]


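The stack trace points at the logging call in search_servers(): forward.c:222 dereferences addrpp, which is NULL (0x0) here, matching the kernel's "invalid read access from 00000000":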
(gdb) core-file dnsmasq.18906.11.1494309586.core
[New LWP 18906]
...
Core was generated by `dnsmasq -C /var/etc/dnsmasq.conf.cfg02411c --no-daemon'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0040e79b in search_servers (now=now@entry=1494309586,
    addrpp=addrpp@entry=0x0, qtype=qtype@entry=32768, qdomain=<optimized out>,
    type=type@entry=0x7fd02c74, domain=domain@entry=0x7fd02c78,
    norebind=norebind@entry=0x0) at forward.c:222
222           log_query(logflags | flags | F_CONFIG | F_FORWARD, qdomain, *addrpp, NULL);
(gdb) bt
#0  0x0040e79b in search_servers (now=now@entry=1494309586,
    addrpp=addrpp@entry=0x0, qtype=qtype@entry=32768, qdomain=<optimized out>,
    type=type@entry=0x7fd02c74, domain=domain@entry=0x7fd02c78,
    norebind=norebind@entry=0x0) at forward.c:222
#1  0x00410759 in reply_query (fd=<optimized out>, family=<optimized out>,
    now=now@entry=1494309586) at forward.c:938
#2  0x004127dd in check_dns_listeners (now=now@entry=1494309586)
    at dnsmasq.c:1560
#3  0x004047db in main (argc=<optimized out>, argv=<optimized out>)
    at dnsmasq.c:1044
(gdb) print logflags
$1 = 32800
(gdb) print flags
$2 = <optimized out>
(gdb) print *qdomain
value has been optimized out
(gdb) print addrpp
$3 = (struct all_addr **) 0x0
(gdb)
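
To illustrate the failure mode the backtrace suggests, here is a minimal sketch in C. It is not the dnsmasq source and not the upstream fix: struct all_addr is reduced to a placeholder, and log_query_sketch() and log_config_answer() are hypothetical names made up for this example. The only point is that a caller may legitimately pass addrpp as NULL, so the pointer has to be checked before it is dereferenced for logging.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Placeholder for dnsmasq's address type; the real definition differs. */
struct all_addr {
    unsigned char bytes[16];
};

/* Hypothetical logger that tolerates a missing address.
 * Returns 1 if an address was supplied, 0 otherwise. */
int log_query_sketch(unsigned int flags, const char *name,
                     const struct all_addr *addr)
{
    printf("flags=0x%x name=%s addr=%s\n",
           flags, name, addr ? "present" : "none");
    return addr != NULL;
}

/* Hypothetical caller: addrpp may be NULL (as in frame #0 above), so it is
 * only dereferenced after a check. An unguarded "*addrpp" here would read
 * from address 0 and fault. */
int log_config_answer(unsigned int flags, const char *qdomain,
                      struct all_addr **addrpp)
{
    assert(qdomain != NULL);
    const struct all_addr *addr = addrpp ? *addrpp : NULL;
    return log_query_sketch(flags, qdomain, addr);
}
```

Calling log_config_answer(flags, name, NULL) then logs the query without an address instead of crashing; whether that is the right behaviour for dnsmasq itself is for upstream to decide.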

This turns out to be easy to reproduce: simply add domain-needed to the prior standalone config file.
Then trigger the crash from a client with:
$ nslookup -port=55553 google.com 192.168.1.1
;; connection timed out; no servers could be reached

I attached all the relevant logs, configs and patches.

  
----------

One or more files have been attached.

More information can be found at the following URL:
https://bugs.lede-project.org/index.php?do=details&task_id=766#comment2589


