[PATCH 10/21] block: Add fops atomic write support

Bart Van Assche bvanassche at acm.org
Mon Oct 2 12:12:49 PDT 2023


On 10/2/23 03:10, John Garry wrote:
> On 29/09/2023 18:51, Bart Van Assche wrote:
>> On 9/29/23 03:27, John Garry wrote:
>  > +    if (pos % atomic_write_unit_min_bytes)
>  > +        return false;
> 
> See later rules.

Is atomic_write_unit_min_bytes always equal to the logical block size?
If so, can the above test be left out?

>  > +    if (iov_iter_count(iter) % atomic_write_unit_min_bytes)
>  > +        return false;
> 
> For SCSI, there is an atomic write granularity, which dictates 
> atomic_write_unit_min_bytes. So here we need to ensure that the length 
> is a multiple of this value.

Are there any SCSI devices that we care about that report an ATOMIC 
TRANSFER LENGTH GRANULARITY that is larger than a single logical block?
I'm wondering whether we really have to support such devices.

>  > +    if (!is_power_of_2(iov_iter_count(iter)))
>  > +        return false;
> 
> This rule comes from FS block alignment and NVMe atomic boundary.
> 
> FSes (XFS) have discontiguous extents. We need to ensure that an atomic 
> write does not cross discontiguous extents. To do this we ensure extent 
> length and alignment and limit atomic_write_unit_max_bytes to that.
> 
> For NVMe, an atomic write boundary is a boundary in LBA space which an 
> atomic write should not cross. We limit atomic_write_unit_max_bytes such 
> that it is evenly divisible into this atomic write boundary.
> 
> To ensure that the write does not cross these alignment boundaries we 
> say that it must be naturally aligned and a power-of-2 in length.
> 
> We may be able to relax this rule but I am not sure it buys us anything 
> - typically we want to be writing a 64KB block aligned to 64KB, for 
> example.

It seems to me that the requirement is_power_of_2(iov_iter_count(iter))
is necessary for some filesystems but not for all filesystems. 
Restrictions that are specific to a single filesystem (XFS) should not 
occur in code that is intended to be used by all filesystems 
(blkdev_atomic_write_valid()).

Thanks,

Bart.




More information about the Linux-nvme mailing list