Bug caused by multicast

Mon Apr 1 13:09:51 EDT 2013

Hi,

I get the following error when trying to send a netlink multicast message:
[15185.963673] could not multicast packet: -3
my_timer_callback called (4298697424).
[15205.976244] sending request for eid (4).
[15205.976250] BUG: scheduling while atomic: swapper/5/0/0x10000100
[15205.976253] Modules linked in: lig_module(O) ip6table_filter ip6_tables
iptable_filter ip_tables ebtable_nat ebtables x_tables dm_crypt arc4 ath9k
mac80211 binfmt_misc uvcvideo snd_hda_codec_hdmi videobuf2_core
ath9k_common ath9k_hw videodev snd_hda_codec_realtek videobuf2_vmalloc
videobuf2_memops snd_hda_intel snd_hda_codec snd_hwdep snd_pcm ath
fglrx(PO) snd_seq_midi snd_rawmidi joydev cfg80211 kvm_intel
snd_seq_midi_event snd_seq kvm snd_timer psmouse snd_seq_device snd
i7core_edac mei asus_laptop edac_core lpc_ich soundcore microcode
sparse_keymap snd_page_alloc input_polldev serio_raw amd_iommu_v2 coretemp
hid_generic usbhid hid atl1c video
[15205.976335] Pid: 0, comm: swapper/5 Tainted: P        W  O
3.5.0mptcpbymatt+ #5
[15205.976338] Call Trace:
[15205.976340]  <IRQ>  [<ffffffff816630b3>] __schedule_bug+0x4d/0x59
[15205.976360]  [<ffffffff8166c904>] __schedule+0x6e4/0x7c0
[15205.976368]  [<ffffffff81084b3a>] __cond_resched+0x2a/0x40
[15205.976375]  [<ffffffff8166ca60>] _cond_resched+0x30/0x40
[15205.976382]  [<ffffffff8116c198>] kmem_cache_alloc_node+0x38/0x150
[15205.976390]  [<ffffffff8155a12b>] ? __alloc_skb+0x4b/0x230
[15205.976398]  [<ffffffffa0277230>] ? send_request_for_eid+0x130/0x130
[lig_module]
[15205.976404]  [<ffffffff8155a12b>] __alloc_skb+0x4b/0x230
[15205.976411]  [<ffffffffa0277230>] ? send_request_for_eid+0x130/0x130
[lig_module]
[15205.976418]  [<ffffffffa027713b>] send_request_for_eid+0x3b/0x130
[lig_module]
[15205.976424]  [<ffffffffa0277230>] ? send_request_for_eid+0x130/0x130
[lig_module]
[15205.976431]  [<ffffffffa0277230>] ? send_request_for_eid+0x130/0x130
[lig_module]
[15205.976437]  [<ffffffffa027727d>] my_timer_callback+0x4d/0x60
[lig_module]
[15205.976447]  [<ffffffff810630fd>] run_timer_softirq+0x13d/0x340
[15205.976453]  [<ffffffff8101a2e9>] ? read_tsc+0x9/0x20
[15205.976460]  [<ffffffff8105a6f6>] __do_softirq+0xb6/0x1d0
[15205.976467]  [<ffffffff810a7c06>] ? clockevents_program_event+0x76/0x120
[15205.976474]  [<ffffffff810a9154>] ? tick_program_event+0x24/0x30
[15205.976481]  [<ffffffff8167729c>] call_softirq+0x1c/0x30
[15205.976489]  [<ffffffff81015115>] do_softirq+0x75/0xb0
[15205.976494]  [<ffffffff8105aac5>] irq_exit+0xa5/0xb0
[15205.976500]  [<ffffffff81677bde>] smp_apic_timer_interrupt+0x6e/0x99
[15205.976507]  [<ffffffff8167694a>] apic_timer_interrupt+0x6a/0x70
[15205.976510]  <EOI>  [<ffffffff8138ad5a>] ? intel_idle+0xea/0x150
[15205.976525]  [<ffffffff8138ad3b>] ? intel_idle+0xcb/0x150
[15205.976532]  [<ffffffff8151c239>] cpuidle_enter+0x19/0x20
[15205.976539]  [<ffffffff8151c869>] cpuidle_idle_call+0xa9/0x240
[15205.976545]  [<ffffffff8101c40f>] cpu_idle+0xaf/0x120
[15205.976551]  [<ffffffff81658919>] start_secondary+0x1de/0x1e5

The error is triggered by the following code. The family and ops are
registered correctly, and I created the multicast group beforehand. Sending
seems to work when a client is registered in this multicast group, but when
no client is registered, genlmsg_multicast() returns -3 (-ESRCH). That return
code is fine with me, but I don't understand why the stack trace appears.

/**
send the request
**/
int send_request_for_eid(u32 eid)
{
    //!
    struct sk_buff *skb;
    int rc = 0;
    void *msg_head;

    printk( "sending request for eid %u.%u.%u.%u.\n", NIPQUAD(eid) );

    /* send a message back*/
    /* allocate some memory, since the size is not yet known use
NLMSG_GOODSIZE*/
    skb = genlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);

    if (skb == NULL)
    {
        printk(KERN_ERR "could not allocate space for new msg");
        return -ENOMEM;
    }
    /* create the message headers
         arguments of genlmsg_put:
           struct sk_buff *,
           int (sending) pid,
           int sequence number,
           struct genl_family *,
           int flags,
           u8 command index (why do we need this?)
*/

 // TODO: pass NL_AUTO values instead?
    msg_head = genlmsg_put(
                        skb,
                        0,            /*   */
                        0, /* sequence number (NL_AUTO_SEQ does not work) */
                        &lig_gnl_family,
                        LIG_GENL_HDRLEN,    /* header length (to check) */
                        ELC_REQUEST_RLOCS_FOR_EID   /* command */
                        );

    if (msg_head == NULL) {
        printk( KERN_ERR "could not create generic header");

        return -ENOMEM;

    }

    /* puts EID we are looking RLOCs for */
    rc = nla_put_u32( skb, ELA_EID, eid);
    // rc = nla_put_string(skb, ELA_EID, "hello world from kernel space\n");
    if (rc != 0)
    {
        printk( KERN_ERR "could not add payload");
        return rc;
    }

    /* finalize the message */
    genlmsg_end(skb, msg_head);

    /* returns -ESRCH  (= -3) => no such process */
        // TODO: send as broadcast
    //struct sk_buff *skb, u32 pid,unsigned int group, gfp_t flags)
    rc = genlmsg_multicast(
        skb,
        0,  /* set own pid to not receive */
        lig_multicast_group.id,
        GFP_KERNEL
         );

    if(rc != 0)
    {
        printk( KERN_ERR "could not multicast packet: %d", rc);
    }
    return 0;
}
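
A side note on the BUG line itself: its last field, 0x10000100, is the
preempt count at the time of the warning. The sketch below decodes that value
in user space. The mask values are my assumption of the x86 layout in 3.5-era
kernels (preempt depth in bits 0-7, softirq count in bits 8-15, hardirq count
in bits 16-25, PREEMPT_ACTIVE at 0x10000000), and the struct and function
names are made up for illustration, so please check them against the headers
of the kernel you actually run.

```c
#include <stdint.h>

/* Assumed field layout (x86, 3.5-era kernels); verify against your tree. */
#define PC_PREEMPT_MASK   0x000000ffu  /* preempt_disable() nesting depth */
#define PC_SOFTIRQ_MASK   0x0000ff00u  /* softirq count, 0x100 per level */
#define PC_HARDIRQ_MASK   0x03ff0000u  /* hardirq nesting */
#define PC_PREEMPT_ACTIVE 0x10000000u  /* flag set around a forced reschedule */

/* Hypothetical container for the decoded fields. */
struct preempt_fields {
    unsigned preempt_depth;
    unsigned softirq_count;
    unsigned hardirq_count;
    int preempt_active;
};

/* Split a raw preempt count into its fields using the masks above. */
struct preempt_fields decode_preempt_count(uint32_t count)
{
    struct preempt_fields f;

    f.preempt_depth  = count & PC_PREEMPT_MASK;
    f.softirq_count  = (count & PC_SOFTIRQ_MASK) >> 8;
    f.hardirq_count  = (count & PC_HARDIRQ_MASK) >> 16;
    f.preempt_active = (count & PC_PREEMPT_ACTIVE) != 0;
    return f;
}

/* Non-zero when any field other than PREEMPT_ACTIVE is set,
 * i.e. the context must not sleep. */
int is_atomic(uint32_t count)
{
    return (count & ~PC_PREEMPT_ACTIVE) != 0;
}
```

With the value from the trace, decode_preempt_count(0x10000100) gives a
softirq count of 1 and is_atomic() returns non-zero. That would be consistent
with the run_timer_softirq frame in the call trace: the timer callback runs in
softirq context, and the allocation path under genlmsg_new(..., GFP_KERNEL)
reaches _cond_resched(), which is what trips the check.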

Maybe this is more of a netlink error than a libgenl error, so don't hesitate
to point me to a better-suited mailing list if that's the case.

Thanks