Out of order read() completion and buffer filling beyond returned amount

David Howells dhowells at redhat.com
Mon Jan 17 01:57:17 PST 2022


Hi Al, Linus,

Do you have an opinion on whether it's permissible for a filesystem to write
into the read() buffer beyond the amount it claims to return, though still
within the specified size of the buffer?

I'm working on common DIO routines for 9p, afs, ceph and cifs in netfs lib,
and I can see that at least three of those four filesystems either can or must
split a read, possibly being required to distribute across multiple servers.

If a filesystem was to emit multiple read RPCs in parallel, there is the
possibility that they would complete out of order - particularly if they go to
multiple servers.

Would it be a violation of the way the read() family of syscalls work to write
the data into the buffers out of order, and then abandon the extra data
written at the end if one of the RPCs returned a short read?  We would have
clobbered some of the buffer that we haven't said we've modified.

For buffered reads, it's not a problem as we can fill the pagecache out of
order with no issue.

David




More information about the linux-afs mailing list