Fix potential data loss and corruption due to Incorrect BIO Chain Handling
Ming Lei
ming.lei at redhat.com
Sun Nov 23 05:48:56 PST 2025
On Sat, Nov 22, 2025 at 03:56:58PM +0100, Andreas Gruenbacher wrote:
> On Sat, Nov 22, 2025 at 1:07 PM Ming Lei <ming.lei at redhat.com> wrote:
> > > static void bio_chain_endio(struct bio *bio)
> > > {
> > > 	bio_endio(__bio_chain_endio(bio));
> > > }
> >
> > bio_chain_endio() never really gets called; it can be thought of as a `flag`,
>
> That's probably where this stops being relevant for the problem
> reported by Stephen Zhang.
>
> > and it should have been defined as `WARN_ON_ONCE(1);` to avoid confusing people.
>
> But shouldn't bio_chain_endio() still be fixed to do the right thing
> if called directly, or alternatively, just BUG()? Warning and still
> doing the wrong thing seems a bit bizarre.

IMO calling ->bi_end_io() directly shouldn't be encouraged.

The only in-tree user that calls it directly could be bcache, so is the
reported issue triggered on bcache?

If bcache can't call bio_endio(), I think it is fine to fix
bio_chain_endio().

>
> I also see direct bi_end_io calls in erofs_fileio_ki_complete(),
> erofs_fscache_bio_endio(), and erofs_fscache_submit_bio(), so those
> are at least confusing.

Those all look like FS bios (non-chained), so bio_chain_endio() shouldn't be
involved in the erofs code base.
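
For readers following the `flag` remark above, here is a minimal userspace
sketch of how a completion routine can treat a chain callback purely as a
marker. It is a simplified model, not the kernel's block/bio.c: every
`model_*` name is hypothetical, reference counting and the BIO_CHAIN flag
are left out, and the counter handling is deliberately naive.

```c
/* Userspace model only: simplified stand-ins, not kernel code. */
#include <assert.h>
#include <stddef.h>

struct model_bio {
	int remaining;				/* pending completions */
	int status;				/* 0 means success */
	void (*end_io)(struct model_bio *);	/* completion callback */
	void *private;				/* parent, for a chained bio */
	int completions;			/* times the owner callback ran */
};

static void model_bio_endio(struct model_bio *bio);

/* Propagate the first error to the parent and return the parent. */
static struct model_bio *model_chain_step(struct model_bio *bio)
{
	struct model_bio *parent = bio->private;

	if (bio->status && !parent->status)
		parent->status = bio->status;
	return parent;
}

/* Marker callback: model_bio_endio() compares against it, never calls it. */
static void model_chain_endio(struct model_bio *bio)
{
	model_bio_endio(model_chain_step(bio));
}

static void model_bio_endio(struct model_bio *bio)
{
	for (;;) {
		if (--bio->remaining > 0)
			return;		/* other completions still pending */
		if (bio->end_io != model_chain_endio)
			break;		/* ordinary bio: run its callback */
		bio = model_chain_step(bio);	/* chained: move to the parent */
	}
	if (bio->end_io)
		bio->end_io(bio);
}

static void model_bio_init(struct model_bio *bio,
			   void (*end_io)(struct model_bio *))
{
	bio->remaining = 1;
	bio->status = 0;
	bio->end_io = end_io;
	bio->private = NULL;
	bio->completions = 0;
}

/* The parent now also waits for the child to complete. */
static void model_chain(struct model_bio *child, struct model_bio *parent)
{
	child->private = parent;
	child->end_io = model_chain_endio;
	parent->remaining++;
}

static void model_owner_done(struct model_bio *bio)
{
	bio->completions++;
}

/* Returns how often the owner callback ran; stores the parent status. */
int model_demo(int *status)
{
	struct model_bio parent, child;

	model_bio_init(&parent, model_owner_done);
	model_bio_init(&child, NULL);
	model_chain(&child, &parent);

	model_bio_endio(&parent);	/* child still pending */
	assert(parent.completions == 0);

	child.status = -5;		/* pretend the child failed */
	model_bio_endio(&child);	/* completes the child, then the parent */

	*status = parent.status;
	return parent.completions;
}
```

Calling `model_demo()` from a `main()` should run the owner's callback
exactly once, only after both completions have arrived, with the child's
error propagated to the parent. `model_chain_endio()` itself is never
invoked on that path.
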
Thanks,
Ming