XO -> XO ping problem
dcbw at redhat.com
Thu Jun 28 01:30:15 EDT 2007
We're observing an interesting problem here, replicated in at least 3
different locations (Waltham (me), Toronto (tamtam), Arlington
(dilinger)) on a bunch of different builds (406.x, 432, 451) with
different hardware (B1, B2, B3) and different firmware (5.220.10.p5,
5.220.11.p5). In all 3 locations, nothing is doing mesh networking at
The problem is that when connected in infrastructure mode, two XOs
cannot ping each _other_, but other normal laptops connected to the same
AP can ping each XO, and each XO can ping both the normal laptop and
external sites like google.
For my investigation of this problem, I turned off NetworkManager
completely and booted up in runlevel 3.
On both laptops I have, I did:
    chkconfig --level 345 NetworkManager off
    chkconfig --level 345 dhcdbd off
    chkconfig --level 345 network off
    nano /etc/inittab    (set runlevel 3 as default)
    iwconfig eth0 essid foobar key <blah> mode managed
    dhclient -1 eth0
and then tried to ping each XO from the other, and to ping google and
my ThinkPad T42 from both. The XOs always failed to ping each other,
but both could ping google and my T42.
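For anyone trying to reproduce this, the reachability check can be
sketched roughly as follows. The function name, the PING override and
the example addresses are illustrative assumptions, not the exact
commands used here; run it on each XO with the other XO, a normal
laptop, and an outside host as targets.

```shell
#!/bin/sh
# Rough reachability check to run on each XO.
# PING can be overridden (e.g. for a dry run); defaults to the real ping.
PING=${PING:-ping}

check_hosts() {
    for host in "$@"; do
        # -c 3: send three echo requests; -W 2: wait up to 2s for a reply
        if $PING -c 3 -W 2 "$host" >/dev/null 2>&1; then
            echo "$host: reachable"
        else
            echo "$host: NO REPLY"
        fi
    done
}

# Example with hypothetical addresses (other XO, ThinkPad, outside host):
# check_hosts 192.168.1.102 192.168.1.50 www.google.com
```

With the symptom described above, the other XO should be the only
target reported as NO REPLY.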
Digging further into the issue, I found via wireshark/ethereal on my
ThinkPad that I can see the ARP request from XO1 -> XO2, but no ARP
reply from XO2 ever shows up in the capture. Turning on RX & TX
debugging in the driver on the XO2 shows the "SendSinglePacket succeeds"
message for (apparently) each ARP request XO2 receives, which I
interpret to mean that XO2 is actually trying to send the ARP reply, but
that the reply gets lost between the host_to_card() function and the
radio. I can actually try to get the hexdump of the outgoing tx packets
if that would help.
However, running tcpdump on both machines has interesting results. The
dump from XO1 (the one pinging) shows the pings, but the dump from XO2
(the one being pinged) shows only LLC frames and no ICMP frames, so
tcpdump on XO2 never sees the pings. If you want the tcpdumps I can
send you a link.
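To narrow the captures down to the interesting traffic, a filter along
these lines may help. This is only a sketch: the helper names and the
addresses are hypothetical, and eth0 is assumed to be the wireless
interface as in the commands above.

```shell
#!/bin/sh
# Build a pcap filter matching ARP and ICMP traffic between two hosts.
arp_icmp_filter() {
    echo "(arp or icmp) and (host $1 or host $2)"
}

# Run the capture on a given interface.
# -n: do not resolve names; -e: print link-level (MAC) headers
capture() {
    iface=$1
    shift
    tcpdump -n -e -i "$iface" "$(arp_icmp_filter "$1" "$2")"
}

# Example with hypothetical addresses for XO1 and XO2:
# capture eth0 192.168.1.101 192.168.1.102
```

Running the same capture on XO1, XO2 and a third machine on the same AP
makes it easier to see where the ARP reply goes missing.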
Thoughts? Can you try to replicate with, say, build 451 and debug the
issue as well?