WQ_UNBOUND workqueue warnings from multiple drivers

Sagi Grimberg sagi at grimberg.me
Sun Apr 7 13:08:23 PDT 2024



On 03/04/2024 2:50, Kamaljit Singh wrote:
> Sagi, Chaitanya,
>   
> Sorry for the delay, found your replies in the junk folder :(
>   
>>   Was the test you were running read-heavy?
> No, most of the failing fio tests were doing heavy writes. All were with 8 controllers and 32 NS each. IO specs are below.
>
> [1] bs=16k, iodepth=16, rwmixread=0, numjobs=16
> Failed in ~1 min
>
> Some others were:
> [2] bs=8k, iodepth=16, rwmixread=5, numjobs=16
> [3] bs=8k, iodepth=16, rwmixread=50, numjobs=16

Interesting, that is the opposite of what I would expect (I thought
the workload would be read-only or read-mostly).

Does this happen with a 90%-100% read workload?

If we look at nvme_tcp_io_work(), it is essentially looping
doing send() and recv(), and every iteration checks whether a 1ms
deadline has elapsed. The workqueue warning fires when a single
work item runs for more than 10ms (the default
wq_cpu_intensive_thresh_us), so the fact that this happens on a
100% write workload leads me to conclude that the only way it can
happen is if sending a single 16K request to a controller on its
own takes more than 10ms, which is unexpected...
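
For reference, the loop looks roughly like this (a simplified
sketch of nvme_tcp_io_work() from drivers/nvme/host/tcp.c; error
handling is elided and details vary by kernel version):

static void nvme_tcp_io_work(struct work_struct *w)
{
        struct nvme_tcp_queue *queue =
                container_of(w, struct nvme_tcp_queue, io_work);
        unsigned long deadline = jiffies + msecs_to_jiffies(1);

        do {
                bool pending = false;
                int result;

                /* send as much as the socket will accept */
                if (mutex_trylock(&queue->send_mutex)) {
                        result = nvme_tcp_try_send(queue);
                        mutex_unlock(&queue->send_mutex);
                        if (result > 0)
                                pending = true;
                }

                /* reap whatever the controller sent back */
                result = nvme_tcp_try_recv(queue);
                if (result > 0)
                        pending = true;

                /* nothing left to do, let the work item finish */
                if (!pending)
                        return;

        /* otherwise loop until the 1ms budget is exhausted... */
        } while (!time_after(jiffies, deadline));

        /* ...and requeue ourselves to continue later */
        queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
}

Both nvme_tcp_try_send() and nvme_tcp_try_recv() operate on the
socket in non-blocking fashion, so the work item is expected to
yield and requeue itself long before it can trip the 10ms
CPU-intensive threshold - unless a single send takes that long.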

Question: are you working with a Linux controller? What is the
ctrl ioccsz (the I/O command capsule size, which bounds how much
write data is sent in-capsule)?
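
If it helps, ioccsz is reported by nvme-cli ("nvme id-ctrl
/dev/nvme0"). The hypothetical helper below reads the same field
directly via the admin passthrough ioctl; it assumes a fabrics
controller at /dev/nvme0 and is just an illustration:

/* check-ioccsz.c: read the controller's IOCCSZ via Identify.
 * Build: gcc -o check-ioccsz check-ioccsz.c
 */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/nvme_ioctl.h>

int main(int argc, char **argv)
{
        uint8_t id[4096];
        struct nvme_admin_cmd cmd;
        int fd = open(argc > 1 ? argv[1] : "/dev/nvme0", O_RDONLY);

        if (fd < 0) {
                perror("open");
                return 1;
        }

        memset(&cmd, 0, sizeof(cmd));
        cmd.opcode = 0x06;      /* Identify */
        cmd.addr = (uintptr_t)id;
        cmd.data_len = sizeof(id);
        cmd.cdw10 = 1;          /* CNS 01h: Identify Controller */

        if (ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd) < 0) {
                perror("NVME_IOCTL_ADMIN_CMD");
                return 1;
        }

        /* IOCCSZ lives at bytes 1795:1792, in units of 16 bytes */
        uint32_t ioccsz;
        memcpy(&ioccsz, &id[1792], sizeof(ioccsz));
        printf("ioccsz: %u (%u bytes)\n", ioccsz, ioccsz * 16);
        return 0;
}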


