[LSF/MM/BPF TOPIC] : Flexible Data Placement (FDP) availability for kernel space file systems
Javier González
javier.gonz at samsung.com
Wed Jan 17 03:58:12 PST 2024
On 16.01.2024 11:39, Viacheslav Dubeyko wrote:
>
>
>> On Jan 15, 2024, at 8:54 PM, Javier González <javier.gonz at samsung.com> wrote:
>>
>> On 15.01.2024 11:46, Viacheslav Dubeyko wrote:
>>> Hi Javier,
>>>
>>> Samsung introduced Flexible Data Placement (FDP) technology
>>> pretty recently. As far as I know, currently, this technology
>>> is available for user-space solutions only. I assume it would be
>>> good to have a discussion about how kernel-space file systems could
>>> work with SSDs that support FDP technology by employing
>>> its benefits.
>>
>> Slava,
>>
>> Thanks for bringing this up.
>>
>> First, this is not a Samsung technology. Several vendors are building
>> FDP, and several customers are already deploying their first products.
>>
>> We enabled FDP through I/O Passthru to avoid unnecessary noise in the
>> block layer until we had a clear idea of the use-cases. We have been
>> following and reviewing Bart's write hint series, and it covers all the
>> block-layer plumbing and interfaces needed to support FDP. Currently, we have
>> patches with small changes to wire up the NVMe driver. We plan to submit
>> them after Bart's patches are applied. Now is a good time since we
>> have LSF, and there are also 2 customers using FDP on block and file.
>>
>>>
>>> How soon will the FDP API be available for kernel-space file systems?
>>
>> The work is done. We will submit it once Bart's patches are applied.
>>
>> Kanchan is doing this work.
>>
>>> How can kernel-space file systems adopt FDP technology?
>>
>> It is based on write hints. There are no FS-specific placement decisions;
>> all the responsibility is in the application.
>>
>> Kanchan: Can you comment a bit more on this?
>>
>>> How can FDP technology improve the efficiency and reliability of
>>> kernel-space file systems?
>>
>> This is an open problem. Our experience is that making data placement
>> decisions in the FS is tricky (beyond the obvious data / metadata). If
>> someone has a good use-case for this, I think it is worth exploring.
>> F2FS is a good candidate, but I am not sure FDP is of interest for
>> mobile - here ZUFS seems to be the current dominant technology.
>>
>
>If I understand the FDP technology correctly, I can see the benefits for
>file systems. :)
>
>For example, SSDFS is based on a segment concept and it has multiple
>types of segments (superblock, mapping table, segment bitmap, b-tree
>nodes, user data). So, as a first point, I can use hints to place different segment
>types into different reclaim units.

Yes. This is what I meant by data / metadata. We have also looked into
using 1 RUH for metadata and making the rest available to applications.
We decided to start with a simple solution and complete it as we see
users.

For SSDFS it makes sense.
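
To make the data / metadata split concrete, here is a rough sketch (not
taken from the actual patches) of how a file system could give its own
metadata inodes a distinct lifetime hint so that metadata lands in a
dedicated RUH, while data inodes keep whatever hint the application
sets. It assumes that the per-inode i_write_hint is propagated to the
bio and mapped to an FDP placement identifier by the NVMe wiring
mentioned above; the helper name is made up:

/*
 * Sketch only: steer FS metadata into its own reclaim unit handle by
 * giving metadata inodes a distinct lifetime hint. Data inodes keep
 * WRITE_LIFE_NOT_SET unless userspace sets a hint via F_SET_RW_HINT.
 */
#include <linux/fs.h>

static void example_fs_set_inode_hint(struct inode *inode, bool is_metadata)
{
        if (is_metadata)
                inode->i_write_hint = WRITE_LIFE_EXTREME; /* long-lived FS metadata */
        /* otherwise leave WRITE_LIFE_NOT_SET and honor application hints */
}
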
>The first point is clear: I can place different
>types of data/metadata (with different “hotness”) into different reclaim units.
>The second point may be less obvious. SSDFS provides a way to define
>the size of the erase block. In the case of a ZNS SSD, the mkfs tool uses the size of the zone
>that the storage device exposes. However, for the case of a conventional
>SSD, the size of the erase block is defined by the user. Technically speaking, this size
>could be smaller or bigger than the real erase block inside the SSD. Also, the FTL could
>use a tricky mapping scheme that combines LBAs in a way that makes
>FS activity inefficient even when using an erase block or segment concept. I can see
>how FDP can help here. First of all, a reclaim unit guarantees that erase
>blocks or segments on the file system side will match erase blocks (reclaim units)
>on the SSD side. Also, I can use various sizes of logical erase blocks, but the logical
>erase blocks of the same segment type will be placed into the same reclaim unit.
>This could guarantee decreased write amplification and predictable reclaiming on
>the SSD side. The flexibility to use various logical erase block sizes provides
>better file system efficiency because various workloads could require
>different logical erase block sizes.

Sounds good. I see you sent a proposal on SSDFS specifically. It makes
sense to cover these specific uses there.
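
As an illustration of what that could look like, below is one possible
mapping of the segment types you list onto the existing lifetime hints,
so that all logical erase blocks of a given segment type end up in the
same reclaim unit. The enum and the mapping are invented for the sketch;
they are not from SSDFS or from the FDP patches:

/* Illustrative mapping of (hypothetical) SSDFS segment types to write hints. */
#include <linux/fs.h>   /* enum rw_hint / WRITE_LIFE_* */

enum example_seg_type {
        EXAMPLE_SEG_SUPERBLOCK,
        EXAMPLE_SEG_MAPPING_TABLE,
        EXAMPLE_SEG_SEGMENT_BITMAP,
        EXAMPLE_SEG_BTREE_NODE,
        EXAMPLE_SEG_USER_DATA,
};

static enum rw_hint example_seg_to_hint(enum example_seg_type type)
{
        switch (type) {
        case EXAMPLE_SEG_SUPERBLOCK:
        case EXAMPLE_SEG_MAPPING_TABLE:
                return WRITE_LIFE_EXTREME;      /* rarely rewritten metadata */
        case EXAMPLE_SEG_SEGMENT_BITMAP:
        case EXAMPLE_SEG_BTREE_NODE:
                return WRITE_LIFE_MEDIUM;       /* hotter metadata */
        case EXAMPLE_SEG_USER_DATA:
        default:
                return WRITE_LIFE_NOT_SET;      /* defer to application hints */
        }
}
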
>
>Technically speaking, any file system can place different types of metadata in
>different reclaim units. However, user data is a slightly more tricky case. Potentially,
>file system logic can track the “hotness” or update frequency of some user data
>and try to direct the different types of user data into different reclaim units.
>But, from another point of view, we have folders in the file system namespace.
>If an application can place different types of data in different folders, then, technically
>speaking, file system logic can place the content of different folders into different
>reclaim units. But the application needs to follow some “discipline” to store different
>types of user data (with different “hotness”, for example) in different folders.

Exactly. This is why I think it makes sense to look at specific FSs, as
there are real deployments that we can use to argue for changes that
cover a large percentage of use-cases.
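
For the folder “discipline” on the application side, the interface
already exists: F_SET_RW_HINT has been in the kernel since 4.13. Below
is a minimal userspace sketch; the directory-to-hint policy and paths
are made up, and whether the hint ultimately selects an FDP reclaim
unit depends on the file system and on the NVMe wiring discussed above:

/* Minimal sketch: tag files with a lifetime hint based on their directory. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#ifndef F_SET_RW_HINT
#define F_SET_RW_HINT           1036    /* F_LINUX_SPECIFIC_BASE + 12 */
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT    2
#define RWH_WRITE_LIFE_LONG     4
#endif

/* Hypothetical policy: the scratch directory gets short-lived hints. */
static uint64_t hint_for_path(const char *path)
{
        if (strncmp(path, "/data/tmp/", 10) == 0)
                return RWH_WRITE_LIFE_SHORT;
        return RWH_WRITE_LIFE_LONG;
}

int main(void)
{
        const char *path = "/data/tmp/scratch.bin";
        uint64_t hint = hint_for_path(path);
        int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);

        if (fd < 0)
                return 1;
        if (fcntl(fd, F_SET_RW_HINT, &hint) < 0)
                perror("F_SET_RW_HINT");        /* fs may not support hints */
        if (write(fd, "payload", 7) < 0)
                perror("write");
        close(fd);
        return 0;
}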