Fix potential data loss and corruption due to Incorrect BIO Chain Handling

Sun Nov 23 05:48:56 PST 2025

On Sat, Nov 22, 2025 at 03:56:58PM +0100, Andreas Gruenbacher wrote:
> On Sat, Nov 22, 2025 at 1:07 PM Ming Lei <ming.lei at redhat.com> wrote:
> > > static void bio_chain_endio(struct bio *bio)
> > > {
> > >         bio_endio(__bio_chain_endio(bio));
> > > }
> >
> > bio_chain_endio() never really gets called; it can be thought of as a `flag`,
> 
> That's probably where this stops being relevant for the problem
> reported by Stephen Zhang.
> 
> > and it should have been defined as `WARN_ON_ONCE(1);` to avoid confusing people.
> 
> But shouldn't bio_chain_endio() still be fixed to do the right thing
> if called directly, or alternatively, just BUG()? Warning and still
> doing the wrong thing seems a bit bizarre.

IMO calling ->bi_end_io() directly shouldn't be encouraged.

The only in-tree user that calls it directly could be bcache, so was the
reported issue triggered on bcache?

If bcache can't call bio_endio(), I think it is fine to fix
bio_chain_endio().
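
To make the difference concrete, here is a rough userspace model of the
chaining logic as I understand it. It is only a sketch, not the kernel code:
the struct is a cut-down stand-in for struct bio, the atomic __bi_remaining
counter and BIO_CHAIN flag are reduced to a plain int, bio_put() is left out,
and count_completion() and parent_done_before_child() are made-up helpers
for the demonstration.

```c
#include <assert.h>

/* Userspace model only: fields mirror struct bio loosely, nothing more. */
struct bio {
	int remaining;                  /* models __bi_remaining */
	int status;                     /* models bi_status, 0 means OK */
	void (*end_io)(struct bio *);   /* models bi_end_io */
	void *private;                  /* models bi_private (parent if chained) */
	int completed;                  /* demo hook: times the final end_io ran */
};

static void bio_endio(struct bio *bio);

/* Propagate the first error to the parent and return the parent. */
static struct bio *__bio_chain_endio(struct bio *bio)
{
	struct bio *parent = bio->private;

	if (bio->status && !parent->status)
		parent->status = bio->status;
	return parent;
}

/* Note: no check of this bio's own remaining count before moving on. */
static void bio_chain_endio(struct bio *bio)
{
	bio_endio(__bio_chain_endio(bio));
}

/* Drop one reference; true only when it was the last one. */
static int bio_remaining_done(struct bio *bio)
{
	assert(bio->remaining > 0);     /* stands in for BUG_ON() */
	return --bio->remaining == 0;
}

static void bio_endio(struct bio *bio)
{
again:
	if (!bio_remaining_done(bio))
		return;
	/* bio_chain_endio is used as a marker here and is never called. */
	if (bio->end_io == bio_chain_endio) {
		bio = __bio_chain_endio(bio);
		goto again;
	}
	if (bio->end_io)
		bio->end_io(bio);
}

static void bio_chain(struct bio *bio, struct bio *parent)
{
	bio->private = parent;
	bio->end_io = bio_chain_endio;
	parent->remaining++;
}

/* Hypothetical final completion handler for the demo. */
static void count_completion(struct bio *bio)
{
	bio->completed++;
}

/*
 * Chain child -> mid -> parent, complete mid and parent, and report
 * whether the parent's handler ran while the child was still in flight.
 */
static int parent_done_before_child(int call_end_io_directly)
{
	struct bio parent = { .remaining = 1, .end_io = count_completion };
	struct bio mid = { .remaining = 1 };
	struct bio child = { .remaining = 1 };

	bio_chain(&mid, &parent);
	bio_chain(&child, &mid);

	if (call_end_io_directly)
		mid.end_io(&mid);       /* skips mid's own remaining check */
	else
		bio_endio(&mid);        /* waits for child before moving on */
	bio_endio(&parent);             /* parent's own I/O finishes */

	return parent.completed;        /* child has not completed yet */
}
```

In this model parent_done_before_child(0) returns 0 and
parent_done_before_child(1) returns 1: going through bio_endio() keeps the
parent pending until the child finishes, while the direct ->bi_end_io() call
drops the parent's reference early, so the parent completes with the child
still outstanding.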

> 
> I also see direct bi_end_io calls in erofs_fileio_ki_complete(),
> erofs_fscache_bio_endio(), and erofs_fscache_submit_bio(), so those
> are at least confusing.

Those all look like FS bios (non-chained), so bio_chain_endio() shouldn't be
involved in the erofs code base.

Thanks,
Ming