lockdep warning: fs_reclaim_acquire vs tcp_sendpage

Wed Oct 19 06:09:39 PDT 2022

On 10/19/22 14:35, Daniel Wagner wrote:
> On Wed, Oct 19, 2022 at 11:37:13AM +0200, Daniel Wagner wrote:
>>>>     Possible unsafe locking scenario:
>>>>
>>>>           CPU0                    CPU1
>>>>           ----                    ----
>>>>      lock(fs_reclaim);
>>>>                                   lock(sk_lock-AF_INET-NVME);
>>>>                                   lock(fs_reclaim);
>>>>      lock(sk_lock-AF_INET-NVME);
>>>
>>> Indeed. I see the issue.
>>> kswapd is trying to swap out pages, but if someone were to delete
>>> the controller (like in this case), sock_release -> tcp_disconnect
>>> will alloc skb that may need to reclaim pages.
>>>
>>> Two questions, the stack trace suggests that you are not using
>>> nvme-mpath? is that the case?
>>
>> This is with a multipath setup. The fio settings are pushing the limits
>> of the VM (memory size) hence the kswap process kicking in.
>>
>>> Given that we fail all inflight requests before we free the socket,
>>> I don't expect for this to be truly circular...
>>>
>>> I'm assuming that we'll need the below similar to nbd/iscsi:
>>
>> Let me try this.
> 
> Still able to trigger though I figured out how I am able to
> reproduce it:
> 
>   VM 4M memory, 8 vCPUs

That's small...

What is vm.min_free_kbytes (via sysctl)?
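
For reference, `sysctl vm.min_free_kbytes` prints the value, and the same number is exposed in procfs. Below is a minimal sketch that reads it and prints it next to MemTotal, which makes it easy to see how large the reserve is relative to the VM's memory. The helper names are hypothetical and purely illustrative; only the two procfs paths are real.

```python
from pathlib import Path


def read_min_free_kbytes(path="/proc/sys/vm/min_free_kbytes"):
    """Return vm.min_free_kbytes as an int (value is in kB)."""
    return int(Path(path).read_text().strip())


def parse_mem_total_kb(meminfo_text):
    """Extract the MemTotal value (in kB) from /proc/meminfo contents."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line looks like: "MemTotal:        4028440 kB"
            return int(line.split()[1])
    raise ValueError("MemTotal not found in meminfo text")


if __name__ == "__main__":
    min_free = read_min_free_kbytes()
    total = parse_mem_total_kb(Path("/proc/meminfo").read_text())
    share = 100.0 * min_free / total
    print(f"vm.min_free_kbytes = {min_free} kB")
    print(f"MemTotal           = {total} kB ({share:.2f}% of MemTotal)")
```

Running it inside the VM while fio is active, alongside the lockdep splat, should give enough context to judge how quickly kswapd gets involved.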