[LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] : blktests: status, expansion plan for the storage stack test framework

Shin'ichiro Kawasaki shinichiro.kawasaki at wdc.com
Mon Apr 20 23:19:12 PDT 2026


On Feb 16, 2026 / 00:08, Nilay Shroff wrote:
> 
> 
> On 2/13/26 4:53 PM, Shinichiro Kawasaki wrote:
> > On Feb 12, 2026 / 08:52, Daniel Wagner wrote:
> >> On Wed, Feb 11, 2026 at 08:35:30PM +0000, Chaitanya Kulkarni wrote:
> >>>    For the storage track at LSFMMBPF2026, I propose a session dedicated to
> >>>    blktests to discuss the expansion plan and CI integration progress.
> >>
> >> Thanks for proposing this topic.
> > 
> > Chaitanya, my thanks also go to you.
> > 
> Yes, thanks for proposing this!
> 
> >> Just a few random topics which come to mind we could discuss:
> >>
> >> - blktests has gained a bit of traction and some folks run these tests on
> >>   a regular basis. Can we gather feedback from them on what is working
> >>   well and what is not? Are there feature wishes?
> > 
> > Good topic, I also would like to hear about it.
> > 
> One improvement I’d like to highlight is related to how blktests are executed
> today. So far, we’ve been running blktests serially, but would it be possible
> to run tests in parallel to improve test turnaround time and make large-scale
> or CI-based testing more efficient? For instance, we could add a parallel_safe
> tag, marking tests that don't modify global kernel state so they can be safely
> offloaded to parallel workers. Such a tag would allow the runner to distinguish:
> 
> Safe Tests: Tests that only perform I/O on a specific, non-shared device or 
> check static kernel parameters.
> 
> Unsafe Tests: Tests that reload kernel modules, modify global /sys or /proc entries,
> or require exclusive access to specific hardware addresses.
> 
> Yes, adding parallel execution support would require framework/design changes.

Hi Nilay, thanks for the idea. I understand that a shorter test run time will
make CI cycles faster and improve development efficiency.

Having said that, the safe/unsafe classification alone may not be enough. I
think the majority of test cases set up kernel modules using null_blk,
scsi_debug, or the nvme target drivers, so I foresee that most of the test
cases will be "unsafe" and cannot be run in parallel.
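
Just to make the scheduling part of the idea concrete, a runner could look
roughly like the Python sketch below. This is purely illustrative and not
blktests code: the parallel_safe list, the test names, and the per-test ./check
invocation are assumptions for the sake of the example.

#!/usr/bin/env python3
# Purely illustrative sketch: split tests into a hypothetical parallel-safe
# group and a serial group, then run them via ./check. blktests has no
# parallel_safe tag today; the lists below are made up.
import subprocess
from concurrent.futures import ThreadPoolExecutor

ALL_TESTS = ["block/001", "block/002", "loop/001", "nvme/005", "srp/001"]
PARALLEL_SAFE = {"block/001", "loop/001"}   # hypothetical tagging

def run_test(test):
    # Assumes ./check accepts a single test name and returns non-zero on failure.
    return test, subprocess.run(["./check", test]).returncode

def main():
    safe = [t for t in ALL_TESTS if t in PARALLEL_SAFE]
    unsafe = [t for t in ALL_TESTS if t not in PARALLEL_SAFE]

    results = []
    # Parallel-safe tests run concurrently ...
    with ThreadPoolExecutor(max_workers=4) as pool:
        results += list(pool.map(run_test, safe))
    # ... everything that touches global kernel state runs strictly serially.
    for test in unsafe:
        results.append(run_test(test))

    for test, rc in results:
        print(f"{test}: {'pass' if rc == 0 else 'FAIL'}")

if __name__ == "__main__":
    main()

In practice, most of our tests would probably end up on the serial side as
noted above, so the gain on a single machine may be limited.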

Also, parallel runs on a single system will affect dmesg and kmemleak checking:
we cannot tell which run caused a dmesg message or a memory leak.

To reduce runtime through parallel runs, I guess running blktests on VMs might
be a good approach, as Haris pointed out. Anyway, this topic will need more
discussion.
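
For reference, such a per-VM split could be driven by a small dispatcher like
the sketch below. Again, this is only illustrative Python, not an existing
tool: the VM host names and the group assignment are made up, and it assumes
blktests is already set up in each guest.

#!/usr/bin/env python3
# Purely illustrative sketch: run one blktests group per VM over ssh so that
# each guest keeps its own dmesg and kmemleak state. Host names and the
# group-to-VM assignment are made up.
import subprocess
from concurrent.futures import ThreadPoolExecutor

VM_GROUPS = {
    "blktests-vm1": "block",
    "blktests-vm2": "nvme",
    "blktests-vm3": "loop",
}

def run_group(vm, group):
    # Assumes blktests is already installed and configured on each guest.
    cmd = ["ssh", vm, f"cd blktests && ./check {group}"]
    return vm, group, subprocess.run(cmd).returncode

with ThreadPoolExecutor(max_workers=len(VM_GROUPS)) as pool:
    futures = [pool.submit(run_group, vm, grp) for vm, grp in VM_GROUPS.items()]
    for fut in futures:
        vm, group, rc = fut.result()
        print(f"{vm} ({group}): {'pass' if rc == 0 else 'FAIL'}")

Since every group runs in its own VM, dmesg messages and kmemleak reports stay
attributable to a single run.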

[...]

> >  4. Long-standing failures make test result reports dirty
> >     - I feel lockdep WARNs tend to be left unfixed for a rather long period.
> >       How can we gather effort to fix them?
> 
> I agree regarding lockdep; recently we did see quite a few lockdep splats.
> That said, I believe the number has dropped significantly and only a small
> set remains. From what I can tell, most of the outstanding lockdep issues
> are related to fs-reclaim paths recursing into the block layer while the
> queue is frozen. We should be able to resolve most of these soon, or at
> least before the conference. If anything is still outstanding after that,
> we can discuss it during the conference and work toward addressing it as
> quickly as possible.

Taking this chance, I'd like to express my appreciation for the effort to
resolve the lockdep issues. It is great that a number of lockdep splats have
already been fixed. Having said that, two lockdep issues are still observed
with the v7.0 kernel at nvme/005 and nbd/002 [1]. I would like to draw
attention to these failures.

[1] https://lore.kernel.org/linux-block/ynmi72x5wt5ooljjafebhcarit3pvu6axkslqenikb2p5txe57@ldytqa2t4i2x/

