[PATCH 3/3] fs: Fix remaining filesystems to wait for stable page writeback

Jeff Layton jlayton at samba.org
Thu Nov 1 16:22:54 EDT 2012


On Thu, 1 Nov 2012 11:43:26 -0700
Boaz Harrosh <bharrosh at panasas.com> wrote:

> On 11/01/2012 12:58 AM, Darrick J. Wong wrote:
> > Fix up the filesystems that provide their own ->page_mkwrite handlers to
> > provide stable page writes if necessary.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong at oracle.com>
> > ---
> >  fs/9p/vfs_file.c |    1 +
> >  fs/afs/write.c   |    4 ++--
> >  fs/ceph/addr.c   |    1 +
> >  fs/cifs/file.c   |    1 +
> >  fs/ocfs2/mmap.c  |    1 +
> >  fs/ubifs/file.c  |    4 ++--
> >  6 files changed, 8 insertions(+), 4 deletions(-)
> > 
> > 
> > diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
> > index c2483e9..aa253f0 100644
> > --- a/fs/9p/vfs_file.c
> > +++ b/fs/9p/vfs_file.c
> > @@ -620,6 +620,7 @@ v9fs_vm_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  	lock_page(page);
> >  	if (page->mapping != inode->i_mapping)
> >  		goto out_unlock;
> > +	wait_on_stable_page_write(page);
> >  
> 
> Good god thanks, yes please ;-)
> 
> >  	return VM_FAULT_LOCKED;
> >  out_unlock:
> > diff --git a/fs/afs/write.c b/fs/afs/write.c
> > index 9aa52d9..39eb2a4 100644
> > --- a/fs/afs/write.c
> > +++ b/fs/afs/write.c
> > @@ -758,7 +758,7 @@ int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page)
> 
> AFS is a network filesystem, is it not? That means it has its own emulated non-block-device
> BDI, registered internally. So if you do need stable pages, someone should call
> bdi_require_stable_pages().
> 
> But again, since it is a network filesystem, I don't see how it is needed, and/or it might be
> taken care of already.
> 
> >  #ifdef CONFIG_AFS_FSCACHE
> >  	fscache_wait_on_page_write(vnode->cache, page);
> >  #endif
> > -
> > +	wait_on_stable_page_write(page);
> >  	_leave(" = 0");
> > -	return 0;
> > +	return VM_FAULT_LOCKED;
> >  }
> > diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> 
> CEPH for sure has its own "emulated non-block-device BDI". This one is also
> a pure networking filesystem.
> 
> And it already does what it needs to do with wait_on_writeback().
> 
> So I do not think you should touch CEPH.
> 
> > index 6690269..e9734bf 100644
> > --- a/fs/ceph/addr.c
> > +++ b/fs/ceph/addr.c
> > @@ -1208,6 +1208,7 @@ static int ceph_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  		set_page_dirty(page);
> >  		up_read(&mdsc->snap_rwsem);
> >  		ret = VM_FAULT_LOCKED;
> > +		wait_on_stable_page_write(page);
> >  	} else {
> >  		if (ret == -ENOMEM)
> >  			ret = VM_FAULT_OOM;
> > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> 
> CIFS is also a self-BDI network filesystem, but
> 
> > index edb25b4..a8770bf 100644
> > --- a/fs/cifs/file.c
> > +++ b/fs/cifs/file.c
> > @@ -2997,6 +2997,7 @@ cifs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  	struct page *page = vmf->page;
> >  
> >  	lock_page(page);
> 
> It waits by locking the page; that's the naive way cifs waits for writeback.
> 
> > +	wait_on_stable_page_write(page);
> 
> Instead, it could do better and not override page_mkwrite at all; all it needs
> to do is call bdi_require_stable_pages() on its own registered BDI.
> 

Hmm...I don't know...

I've never been crazy about using the page lock for this, but in the
absence of a better way to guarantee stable pages, it was what I ended
up with at the time. cifs_writepages will hold the page lock until
kernel_sendmsg returns. At that point the TCP layer will have copied
off the page data so it's safe to release it.

With this change though, we're going to end up blocking until the
writeback flag clears, right? And I think that will happen when the
reply comes in? So, we'll end up blocking for much longer than is
really necessary in page_mkwrite with this change.
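
A rough way to picture the difference is the toy userspace model
below. It is only a sketch, not kernel code: the struct, the function
names and the timestamps are all hypothetical, and it assumes (as
guessed above, not verified) that the writeback flag clears only when
the server's reply arrives.

```c
/* Toy userspace model of the two waiting strategies; NOT kernel code.
 * Every name here is hypothetical and exists only for illustration. */

struct wb_timeline {
	long send_start_us; /* writeback starts: page locked, writeback flag set */
	long send_done_us;  /* kernel_sendmsg() returns: TCP has copied the data */
	long reply_us;      /* reply arrives: ASSUMED point where the flag clears */
};

/* Time left until an event, or 0 if the event has already happened. */
static long wait_until(long now_us, long event_us)
{
	return event_us > now_us ? event_us - now_us : 0;
}

/* Current cifs behaviour: the fault blocks only on the page lock, which
 * is dropped once the send has returned. */
long fault_wait_page_lock(const struct wb_timeline *tl, long fault_us)
{
	if (fault_us < tl->send_start_us)
		return 0; /* fault arrives before writeback begins */
	return wait_until(fault_us, tl->send_done_us);
}

/* With the patch: the fault also blocks until the writeback flag clears,
 * which this model places at the arrival of the reply. */
long fault_wait_writeback(const struct wb_timeline *tl, long fault_us)
{
	if (fault_us < tl->send_start_us)
		return 0; /* fault arrives before writeback begins */
	return wait_until(fault_us, tl->reply_us);
}
```

With made-up numbers (send returns at 50us, reply arrives at 2000us), a
write fault at 10us waits 40us on the page lock alone but 1990us if it
also has to wait for the writeback flag.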

-- 
Jeff Layton <jlayton at samba.org>


