RxRPC: 3 issues found in my example code

Leon Liu liuyang at neusoft.com
Thu Apr 24 18:51:35 PDT 2014


Hi,

I wrote an example to try performance of RxRPC and met 3 issues. I've
contacted to David Howells for these issues and he recommended me to send a
mail to you guys. This is for your information and thanks in advance if
someone can help on these issues.

====================================
Test environment:
====================================
Kernel 3.3.8 with Ubuntu 12.04 32bit based on vmware virtual machine, MTU
set "auto". I've ported the "tag rxrpc-20140226" code back to kernel 3.3.8
to make a try but no luck. In fact, the new code did fixed about storm I
observed.

My example runs only on kernel space. My code is a "server to server"
(assign address in each call) communication. Basically, my example just like
what AFS does in kernel.

ret = socket->ops->setsockopt(socket, SOL_RXRPC,
RXRPC_MIN_SECURITY_LEVEL,(char *)&sec, sizeof(sec));
	ret = kernel_bind(socket, (struct sockaddr *) &srx, sizeof(srx));
	
	if (ret < 0) {
		_err("bind socket error: %d", ret);
		goto out3;
	}

	rxrpc_kernel_intercept_rx_messages(socket, my_rx_interceptor);

I did NOT invoke kernel_listen(), kernel_connect(). And key = NULL with
RXRPC_SECURITY_PLAIN.

I did NOT invoke a send RXRPC_USER_CALL_ID/ RXRPC_ACCEPT control message
like user space listen.c does. I invoke rxrpc_kernel_accept_call() instead.
I don't see rxrpc_kernel_accept_call() do a send RXRPC_USER_CALL_ID/
RXRPC_ACCEPT control message stuff.

====================================
Issues 1:
====================================
when I send out a call data larger than 1500(aka: MTU size), the first data
snippet was received by receiver successfully but failed for following
snippets. No problem if call data is less than MTU size.

====================================
Log I saw:
====================================

sender:
rsyslo] ==> rxrpc_fast_process_packet(ca1c8000,ca240c00)
[rsyslo]     abort
[rsyslo] ### Rx ABORT %0 { fffffffe }

receiver:
[swappe] ==> rxrpc_data_ready(cdbb1c00, 1480)
[swappe] @@@ recv skb cc807240
[swappe] @@@ Rx UDP packet from c0a83296:14001
[swappe] @@@ Rx RxRPC ToServer ep=534cd637 call=4:2
[swappe] ==> rxrpc_find_call_hash()
[swappe] ==> rxrpc_call_hashfunc()
[swappe] <== rxrpc_call_hashfunc() key = 0x68b36c9
[swappe] <== rxrpc_find_call_hash() =   (null)
[swappe]     can't route call
[swappe]     first packet
[swappe] ==> rxrpc_put_local(ccb3a480{u=3})
[swappe] <== rxrpc_put_local()
[swappe] <== rxrpc_data_ready() [incoming]
[swappe] ==> rxrpc_data_ready(cdbb1c00, 404)
[swappe] @@@ recv skb cc807c00
[swappe] @@@ Rx UDP packet from c0a83296:14001
[swappe] @@@ Rx RxRPC ToServer ep=534cd637 call=4:2
[swappe] ==> rxrpc_find_call_hash()
[swappe] ==> rxrpc_call_hashfunc()
[swappe] <== rxrpc_call_hashfunc() key = 0x68b36c9
[swappe] <== rxrpc_find_call_hash() =   (null)
[swappe]     can't route call
[swappe]     reject type 1
[swappe] ==> rxrpc_put_local(ccb3a480{u=4})
[swappe] <== rxrpc_put_local()
[swappe] <== rxrpc_data_ready() [no call]
[kworke] ==> rxrpc_accept_incoming_calls(1) 

====================================
What I think:
====================================
The problem is the second data snippet was sent out before the receiver
begins "rxrpc_accept_incoming_calls" and receiver says "can't route call"
and send out "abort".

The first data snippet acts as an accepting "ping". The receiver does NOT
send out any "accepted" to sender. The sender does NOT wait any event of
"accepted" but send out the second data snippet.

====================================
MY code example(most the same with AFS's code.  Just example):
====================================
int make_call(struct in_addr *addr, struct my_call *call, gfp_t gfp,
                                  const struct my_wait_mode *wait_mode)
{
                struct sockaddr_rxrpc srx;
                struct rxrpc_call *rxcall;
                struct msghdr msg;
                struct kvec iov[1];
                int ret;
                struct sk_buff *skb;

                call->wait_mode = wait_mode;
                INIT_WORK(&call->async_work, my_process_async_call);

                memset(&srx, 0, sizeof(srx));
                srx.srx_family = AF_RXRPC;
                srx.srx_service = call->service_id;
                srx.transport_type = SOCK_DGRAM;
                srx.transport_len = sizeof(srx.transport.sin);
                srx.transport.sin.sin_family = AF_INET;
                srx.transport.sin.sin_port = call->port;
                memcpy(&srx.transport.sin.sin_addr, addr, 4);

                /* create a call */
                rxcall = rxrpc_kernel_begin_call(pcramfs_socket, &srx,
call->key,
 
(unsigned long) call, gfp);
                call->key = NULL;
                if (IS_ERR(rxcall)) {
                                ret = PTR_ERR(rxcall);
                                goto error_kill_call;
                }

                call->rxcall = rxcall;

                /* send the request */
                iov[0].iov_base = call->request;
                iov[0].iov_len    = call->request_size;

                msg.msg_name                                = NULL;
                msg.msg_namelen                         = 0;
                msg.msg_iov                     = (struct iovec *) iov;
                msg.msg_iovlen                               = 1;
                msg.msg_control                             = NULL;
                msg.msg_controllen       = 0;
                msg.msg_flags                  = (call->send_page ? MSG_MORE
: 0);

                call->state = PCRAMFS_CALL_AWAIT_REPLY;
                ret = rxrpc_kernel_send_data(rxcall, &msg,
call->request_size);
                if (ret < 0)
                                goto error_do_abort;    

                /* at this point, an async call may no longer exist as it
may have
                * already completed */
                return wait_mode->wait(call);
                return ret;
}    

static void my_rx_interceptor(struct sock *sk, unsigned long user_call_ID,
                                                       struct sk_buff *skb)
{
                struct my_call *call = (struct my_call *) user_call_ID;
                
                if (!call) {
                                /* its an incoming call to be accepted */
                                skb_queue_tail(&my_incoming_calls, skb);
                                queue_work(my_accept_calls_wq,
&my_collect_incoming_call_work
                } else {
                                /* route the messages directly to the
appropriate call */
                                skb_queue_tail(&call->rx_queue, skb);
                                call->wait_mode->rx_wakeup(call); 
                }
} 

====================================
log attached in case you need this:
====================================
right after reboot the virtual pc, the first test is successful:

kern_client_OK: kernel space server (send out) successful
kern_server_OK: kernel space server (receive) successful

All following test got error:

kern_client_ERR: kernel space server (send out) error
kern_server_ERR: kernel space server (receive) error

looks like it is a racing condition (this is may not true as I could write
incorrect code in kernel space) while the first test make new connection
which take time, following tests re-use old connection.  

====================================
issues 2&3:
====================================
I observed two issues yesterday, just want you to know these issues for your
information (maybe they are not issues). I changed my test code to send less
than MTU data per call, so I can run a long test.

One issue is that a very small chance(maybe 1/100000), rxrpc stack return
"local error 103, rx busy". This is maybe a lower layer issue not a rxrpc
specific issue? A simple retry fixed the problem. See attachment " rx_busy"
for detail.

Another is that all calls were not destroyed by invoke
rxrpc_kernel_end_call() but later by an event of rxrpc_dead_call_expired().
This is maybe for reusing old calls. However I saw all calls were NOT reused
by "rxrpc_get_client_call()",  but a new allocation: "<==
rxrpc_get_client_call() = cc0fa580 [new]". So the two search in
"rxrpc_get_client_call()" are not efficient?  See attachment
"dead_call_expired" for detail.       

Thanks!
Leon

-----Original Message-----
From: David Howells [mailto:dhowells at redhat.com] 
Sent: Thursday, April 24, 2014 9:16 PM
To: Leon Liu
Cc: dhowells at redhat.com
Subject: Re: Please help on RxRPC issues


Sorry about this, but I'm on paternity leave this week and next week.  Can
you try sending what you've found to:

	linux-afs at lists.infradead.org

and cc:


	Tim Smith <tim at electronghost.co.uk>

Thanks!

David
---------------------------------------------------------------------------------------------------
Confidentiality Notice: The information contained in this e-mail and any accompanying attachment(s) 
is intended only for the use of the intended recipient and may be confidential and/or privileged of 
Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader of this communication is 
not the intended recipient, unauthorized use, forwarding, printing,  storing, disclosure or copying 
is strictly prohibited, and may be unlawful.If you have received this communication in error,please 
immediately notify the sender by return e-mail, and delete the original message and all copies from 
your system. Thank you. 
---------------------------------------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kern_client_ERR
Type: application/octet-stream
Size: 6610 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-afs/attachments/20140425/9d6bb50c/attachment-0006.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kern_client_OK
Type: application/octet-stream
Size: 7438 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-afs/attachments/20140425/9d6bb50c/attachment-0007.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kern_server_ERR
Type: application/octet-stream
Size: 5764 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-afs/attachments/20140425/9d6bb50c/attachment-0008.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kern_server_OK
Type: application/octet-stream
Size: 10564 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-afs/attachments/20140425/9d6bb50c/attachment-0009.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rx_busy
Type: application/octet-stream
Size: 15577 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-afs/attachments/20140425/9d6bb50c/attachment-0010.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dead_call_expired
Type: application/octet-stream
Size: 413820 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-afs/attachments/20140425/9d6bb50c/attachment-0011.obj>


More information about the linux-afs mailing list