afs_fsync

Mon Apr 19 07:54:09 EDT 2010

Christoph Hellwig <hch at lst.de> wrote:

> I've been looking at afs_fsync a bit lately and don't quite
> understand what's going on there.  As of 2.6.32 we always
> write out all data before calling into ->fsync.  From my very
> unscientific exploration into afs_fsync it's doing exactly that
> data writeout again, just in a rather complicated way, and
> then marks the inode as having dirty pages again, which is not
> very helpful inside ->fsync.  Any chance you could explain
> what's really going on there?

kAFS maintains a queue of outstanding writebacks for each inode/vnode, which
afs_writepages() attempts to write back in order when there are multiple
elements in the queue for the range being written.  The afs_writeback struct
that forms the elements in that queue maps a written region in the file to a
key struct, which defines the security details to use for the AFS StoreData op.

These afs_writeback structs also make it easier to write back multiple pages
with one network op.
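
As a rough illustration of that queue, here is a small userspace model in C.
It is only a sketch: the struct and function names (wb_key, wb_record,
wb_queue, wb_enqueue, wb_pages, wb_oldest_overlap) are hypothetical and are not
the kernel's definitions.  It assumes each record covers an inclusive page
range and carries a pointer to the key to use for the store.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the key struct holding security details. */
struct wb_key { int id; };

/* Hypothetical model of one queue element: a written region plus its key. */
struct wb_record {
    struct wb_record *next;      /* next (newer) record in the queue */
    const struct wb_key *key;    /* credentials for the store op */
    unsigned long first, last;   /* inclusive page range covered */
};

/* Per-inode queue, oldest record at the head. */
struct wb_queue { struct wb_record *head, *tail; };

/* Append a record at the tail so that queue order is submission order. */
static void wb_enqueue(struct wb_queue *q, struct wb_record *r)
{
    assert(r != NULL);
    r->next = NULL;
    if (q->tail)
        q->tail->next = r;
    else
        q->head = r;
    q->tail = r;
}

/* Number of pages a single store op for this record would carry. */
static unsigned long wb_pages(const struct wb_record *r)
{
    assert(r->last >= r->first);
    return r->last - r->first + 1;
}

/* Oldest record overlapping [first, last], or NULL if there is none. */
static struct wb_record *wb_oldest_overlap(struct wb_queue *q,
                                           unsigned long first,
                                           unsigned long last)
{
    struct wb_record *r;

    for (r = q->head; r; r = r->next)
        if (r->first <= last && r->last >= first)
            return r;
    return NULL;
}
```

Walking from the head means that, when several records overlap the range being
written, the oldest one is picked first, which is the in-order behaviour
described above.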

Now, afs_fsync() sticks a null record at the tail of the queue, so that it will
get woken up when everything before it in the queue at that point is gone.  It
then invokes afs_writepages() to write out the contents of the queue and waits
for the null record to be processed.
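
The marker idea can be sketched in a few lines of single-threaded userspace C.
This is a model under stated assumptions, not the kernel implementation: the
names (rec, queue, enqueue, flush, model_fsync) are hypothetical, the "wake-up"
is modelled as setting a flag, and the flush runs synchronously rather than
sleeping on a wait queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue element; is_null marks the record fsync adds. */
struct rec {
    struct rec *next;
    bool is_null;   /* true for the marker record, which carries no data */
    bool done;      /* set when processed; models the wake-up */
};

struct queue {
    struct rec *head, *tail;
    int stores;     /* counts simulated store operations */
};

static void enqueue(struct queue *q, struct rec *r)
{
    assert(r != NULL);
    r->next = NULL;
    if (q->tail)
        q->tail->next = r;
    else
        q->head = r;
    q->tail = r;
}

/* Process records strictly in queue order, oldest first. */
static void flush(struct queue *q)
{
    while (q->head) {
        struct rec *r = q->head;

        q->head = r->next;
        if (!q->head)
            q->tail = NULL;
        if (!r->is_null)
            q->stores++;    /* a real record would be written to the server */
        r->done = true;     /* whoever waits on this record may now proceed */
    }
}

/* Add a marker at the tail, flush, and check that the marker was reached. */
static int model_fsync(struct queue *q)
{
    struct rec marker = { NULL, true, false };

    enqueue(q, &marker);
    flush(q);
    return marker.done ? 0 : -1;
}
```

Because the marker sits behind every record that was queued when model_fsync()
was called, seeing it marked done implies all of those earlier records have
been dealt with.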

I don't recall why I put __mark_inode_dirty() in there with I_DIRTY_PAGES.  I
would suspect I copied it from somewhere, but I don't remember where.

I should probably optimise afs_fsync() to not do anything if vnode->writebacks
is empty once it has taken vnode->writeback_lock.
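
A minimal userspace sketch of that early exit might look like the following.
Everything here is hypothetical: vnode_model, its fields and fsync_model() are
illustrative names, the lock is modelled as a plain flag, and the actual
marker-and-flush work is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct wb { struct wb *next; };

/* Hypothetical model of the per-vnode state relevant to fsync. */
struct vnode_model {
    bool locked;            /* stand-in for vnode->writeback_lock */
    struct wb *writebacks;  /* NULL when the queue is empty */
    int flushes;            /* counts how often a flush was needed */
};

static int fsync_model(struct vnode_model *v)
{
    assert(!v->locked);
    v->locked = true;               /* take the lock */

    if (v->writebacks == NULL) {    /* nothing outstanding: skip the work */
        v->locked = false;
        return 0;
    }

    v->locked = false;              /* drop the lock before flushing */
    v->flushes++;                   /* placeholder for marker + flush */
    v->writebacks = NULL;           /* model: the queue has drained */
    return 0;
}
```

The emptiness check is made only after the lock is taken, so that a record
queued concurrently cannot be missed.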

David