[FS#449] relayd can cause dropped packets if client has /proc/sys/net/ipv4/conf/*/rp_filter=1

LEDE Bugs lede-bugs at lists.infradead.org
Wed Feb 1 08:34:32 PST 2017


A new Flyspray task has been opened.  Details are below. 

User who did this - Theodore A. Roth (troth) 

Attached to Project - LEDE Project
Summary - relayd can cause dropped packets if client has /proc/sys/net/ipv4/conf/*/rp_filter=1
Task Type - Bug Report
Category - Base system
Status - Unconfirmed
Assigned To - 
Operating System - All
Severity - Low
Priority - Very Low
Reported Version - Trunk
Due in Version - Undecided
Due Date - Undecided
Details - DEVICE: N/A (can be reproduced on any system running relayd)
LEDE version: N/A (can be reproduced on a VM running debian with relayd installed)

Steps to reproduce:

* Run relayd on a system (aka router) with two interfaces.
* Run a stock Ubuntu-16.04 system (aka client) connected to the managed interface.
* Run another system (aka server) on the other interface of the relayd system.
* Have server ping client and watch connectivity drop out periodically.
* Have client ping server and watch connectivity drop out periodically.
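
To confirm that a client is affected, it can help to list the rp_filter value of every interface on it. The following is a minimal sketch, not part of the original report; the helper name read_rp_filter is made up for illustration. A value of 1 means strict reverse-path filtering, and the kernel applies the larger of the "all" and per-interface values.

```python
import glob
import os


def read_rp_filter(conf_dir="/proc/sys/net/ipv4/conf"):
    """Return {interface: rp_filter value} for every entry under conf_dir.

    read_rp_filter is a hypothetical helper, not something relayd provides.
    """
    values = {}
    for path in sorted(glob.glob(os.path.join(conf_dir, "*", "rp_filter"))):
        iface = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            values[iface] = int(f.read().strip())
    return values


if __name__ == "__main__":
    for iface, value in read_rp_filter().items():
        # 0 = off, 1 = strict, 2 = loose
        note = " (strict)" if value == 1 else ""
        print(f"{iface}: rp_filter={value}{note}")
```

Run on the client, any interface reported as strict is a candidate for the dropped ARP replies described in this report.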

Our fix for the problem needs two changes:

* Add arptables rules to the system so that kernel-level ARP requests are handled properly, by mangling the source address in those requests.
* Modify relayd to send the correct source address in the ARP requests that it generates.
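
The exact arptables rules are not included in this report. As a rough sketch of what the first change could look like, the snippet below only builds (and does not run) an arptables command that rewrites the sender IP of outgoing ARP requests using the mangle target. The interface name eth1 is an assumption, and 192.168.1.100 is borrowed from the example in the commit message below.

```python
def arptables_mangle_cmd(iface, src_addr):
    """Build an arptables command that rewrites the ARP sender IP.

    This is an assumed rule shape, not the rule used by the reporter.
    """
    return [
        "arptables", "-A", "OUTPUT",
        "-o", iface,                 # managed interface (hypothetical name)
        "--opcode", "Request",       # only touch ARP requests
        "-j", "mangle",
        "--mangle-ip-s", src_addr,   # address the client will see in "tell ..."
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(" ".join(arptables_mangle_cmd("eth1", "192.168.1.100")))
```

Printing the command rather than executing it keeps the sketch safe to run on any machine; the rule itself would need root and a system with arptables installed.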

Our changes to relayd are here:

* https://github.com/troth/relayd/commit/c8d895ee71be59262f01c3fdf50f307ebf1593e7

From the commit message for my fix:


    Add option to set arp src addr for managed interfaces.
    
    Relayd will send arp requests out a managed interface like this:
    
        Who has 192.168.1.40, tell 192.168.2.1
    
    In most cases, this works, but some clients will not send a reply (on
    linux, client will not reply if /proc/sys/net/ipv4/conf/*/rp_filter is
    set to 1, which happens to be the default on ubuntu-16.04).
    
    Add '-s' option to tell relayd to use the specified addr as the arp src
    addr for managed interfaces. The arp requests would then look like:
    
        Who has 192.168.1.40, tell 192.168.1.100
    
    for which the client properly sends a reply.
    
    The symptoms of the problem manifest as dropped packets due to the
    kernel marking the arp entry for the client as FAILED due to lack of
    responses to the arp requests. Eventually (10-30 seconds later), the arp
    table is updated and connectivity is restored.
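
To make the difference between the two requests concrete, here is a small sketch (not taken from relayd) that builds the 28-byte ARP request payload for IPv4 over Ethernet and reads back the sender address field, which is the "tell" address above. The MAC address is a made-up placeholder and the function names are hypothetical.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"  # htype, ptype, hlen, plen, oper, sha, spa, tha, tpa


def build_arp_request(sender_mac, sender_ip, target_ip):
    """Return an ARP request payload (no Ethernet header)."""
    return struct.pack(
        ARP_FORMAT,
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6, 4,                         # MAC and IPv4 address lengths
        1,                            # opcode 1 = request
        sender_mac,
        socket.inet_aton(sender_ip),  # the "tell" address
        b"\x00" * 6,                  # target MAC is unknown in a request
        socket.inet_aton(target_ip),  # the "who has" address
    )


def describe(packet):
    """Render the request the way it is written in the commit message."""
    fields = struct.unpack(ARP_FORMAT, packet)
    spa, tpa = socket.inet_ntoa(fields[6]), socket.inet_ntoa(fields[8])
    return f"Who has {tpa}, tell {spa}"


if __name__ == "__main__":
    mac = bytes.fromhex("020000000001")  # placeholder MAC, an assumption
    print(describe(build_arp_request(mac, "192.168.2.1", "192.168.1.40")))
    print(describe(build_arp_request(mac, "192.168.1.100", "192.168.1.40")))
```

The only field that differs between the two packets is the sender protocol address, which is what the '-s' option is meant to control.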


More information can be found at the following URL:
https://bugs.lede-project.org/index.php?do=details&task_id=449


