[PATCH] um: Fix WRITE_ZEROES in the UBD Driver

Frédéric Danis frederic.danis at collabora.com
Wed Jan 26 02:02:51 PST 2022


Hi Anton,

On 26/01/2022 10:47, Anton Ivanov wrote:
>
>
> On 25/01/2022 18:50, Frédéric Danis wrote:
>> Hi Anton,
>>
>> On 25/01/2022 18:56, Anton Ivanov wrote:
>>> On 25/01/2022 17:14, Frédéric Danis wrote:
>>>> Call to fallocate with FALLOC_FL_PUNCH_HOLE on a device backed by a 
>>>> sparse
>>>> file can end up by missing data, zeroes data range, if the 
>>>> underlying file
>>>> is used with a tool like bmaptool which will referenced only used 
>>>> spaces.
>>>
>>> Can you elaborate on when do you miss data.
>>>
>>> A file with a hole created using FALLOC_FL_PUNCH_HOLE when reading 
>>> from the hole should return zeroes.
>>>
>>> If it returns anything else or if the zeroed range exceeds the one 
>>> specified in falloc - that is an underlying fs bug.
>>
>> I use bmaptool to create a block map of the used part of a raw system 
>> image file, then compress the file and at the end use again bmaptool 
>> to flash the file to a SDCard.
>>
>> iiuc fallocate() call with FALLOC_FL_PUNCH_HOLE doesn't "register" 
>> the space, as it is zeroed, so when bmaptool creates the block map it 
>> doesn't reference the "used holes", and when I flash the SDCard those 
>> parts are missing, so the filesystem on the SDCard is corrupted.
>
> I can see what you are doing and I will OK the patch.
>
> However, I still do not understand why it does not work.
>
> bmaptool creates a map of the used blocks.
>
> When copying, it should create the "missing" as sparse, right?

It depends:
- if you're creating a file, in this case there's no problem as bmaptool 
will create holes which will be seen as zeroed spaces,
- or if you're flashing a device like a SDCard, in this case it only 
writes the referenced blocks and let untouched the other parts which are 
certainly not full of zeroes, this is why it needs that intentional 
zeroed parts are not just holes.

I'm not sure to be clear :(, bmaptool is better explained at 
https://github.com/intel/bmap-tools/

>
> A sparse "hole" when read should be always read as zero. If it is not 
> being read as zero when compressing - bug. Ditto at the next stage 
> when flashing.

Compression is optional, I have the same problem if I use bmaptool to 
flash the uncompressed file as it will copy to device only the 
referenced blocks in the block map.

>
> IMHO there is an underlying bug outside UML here somewhere.
>
> A

Regards,

Fred
>
>>
>>>
>>>>
>>>> Signed-off-by: Frédéric Danis <frederic.danis at collabora.com>
>>>> ---
>>>>   arch/um/drivers/ubd_kern.c  | 8 +++++++-
>>>>   arch/um/include/shared/os.h | 1 +
>>>>   arch/um/os-Linux/file.c     | 9 +++++++++
>>>>   3 files changed, 17 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
>>>> index 69d2d0049a61..b03269faef71 100644
>>>> --- a/arch/um/drivers/ubd_kern.c
>>>> +++ b/arch/um/drivers/ubd_kern.c
>>>> @@ -1526,13 +1526,19 @@ static void do_io(struct io_thread_req 
>>>> *req, struct io_desc *desc)
>>>>               }
>>>>               break;
>>>>           case REQ_OP_DISCARD:
>>>> -        case REQ_OP_WRITE_ZEROES:
>>>>               n = os_falloc_punch(req->fds[bit], off, len);
>>>>               if (n) {
>>>>                   req->error = map_error(-n);
>>>>                   return;
>>>>               }
>>>>               break;
>>>> +        case REQ_OP_WRITE_ZEROES:
>>>> +            n = os_falloc_zeroes(req->fds[bit], off, len);
>>>> +            if (n) {
>>>> +                req->error = map_error(-n);
>>>> +                return;
>>>> +            }
>>>> +            break;
>>>>           default:
>>>>               WARN_ON_ONCE(1);
>>>>               req->error = BLK_STS_NOTSUPP;
>>>> diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
>>>> index 00214059d9ec..fafde1d5416e 100644
>>>> --- a/arch/um/include/shared/os.h
>>>> +++ b/arch/um/include/shared/os.h
>>>> @@ -168,6 +168,7 @@ extern unsigned os_major(unsigned long long dev);
>>>>   extern unsigned os_minor(unsigned long long dev);
>>>>   extern unsigned long long os_makedev(unsigned major, unsigned 
>>>> minor);
>>>>   extern int os_falloc_punch(int fd, unsigned long long offset, int 
>>>> count);
>>>> +extern int os_falloc_zeroes(int fd, unsigned long long offset, int 
>>>> count);
>>>>   extern int os_eventfd(unsigned int initval, int flags);
>>>>   extern int os_sendmsg_fds(int fd, const void *buf, unsigned int len,
>>>>                 const int *fds, unsigned int fds_num);
>>>> diff --git a/arch/um/os-Linux/file.c b/arch/um/os-Linux/file.c
>>>> index e4421dbc4c36..fc4450db59bd 100644
>>>> --- a/arch/um/os-Linux/file.c
>>>> +++ b/arch/um/os-Linux/file.c
>>>> @@ -625,6 +625,15 @@ int os_falloc_punch(int fd, unsigned long long 
>>>> offset, int len)
>>>>       return n;
>>>>   }
>>>>   +int os_falloc_zeroes(int fd, unsigned long long offset, int len)
>>>> +{
>>>> +    int n = fallocate(fd, 
>>>> FALLOC_FL_ZERO_RANGE|FALLOC_FL_KEEP_SIZE, offset, len);
>>>> +
>>>> +    if (n < 0)
>>>> +        return -errno;
>>>> +    return n;
>>>> +}
>>>> +
>>>>   int os_eventfd(unsigned int initval, int flags)
>>>>   {
>>>>       int fd = eventfd(initval, flags);
>>>
>>>
>>
>

-- 
Frédéric Danis
Senior Software Engineer

Collabora Ltd.
Platinum Building, St John's Innovation Park, Cambridge CB4 0DS, United Kingdom
Registered in England & Wales, no. 5513718




More information about the linux-um mailing list