[LSF/MM/BPF TOPIC] : Flexible Data Placement (FDP) availability for kernel space file systems
Viacheslav Dubeyko
slava at dubeyko.com
Tue Jan 16 00:39:16 PST 2024
> On Jan 15, 2024, at 8:54 PM, Javier González <javier.gonz at samsung.com> wrote:
>
> On 15.01.2024 11:46, Viacheslav Dubeyko wrote:
>> Hi Javier,
>>
>> Samsung introduced Flexible Data Placement (FDP) technology
>> pretty recently. As far as I know, currently, this technology
>> is available for user-space solutions only. I think it would be
>> good to discuss how kernel-space file systems could
>> work with SSDs that support FDP technology and take advantage of
>> FDP benefits.
>
> Slava,
>
> Thanks for bringing this up.
>
> First, this is not a Samsung technology. Several vendors are building
> FDP and several customers are already deploying first products.
>
> We enabled FDP through I/O Passthru to avoid unnecessary noise in the
> block layer until we had a clear idea on use-cases. We have been
> following and reviewing Bart's write hint series and it covers all the
> block layer and interface needed to support FDP. Currently, we have
> patches with small changes to wire the NVMe driver. We plan to submit
> them after Bart's patches are applied. Now is a good time since we
> have LSF and there are also 2 customers using FDP on block and file.
>
>>
>> How soon will the FDP API be available for kernel-space file systems?
>
> The work is done. We will submit as Bart's patches are applied.
>
> Kanchan is doing this work.
>
>> How can kernel-space file systems adopt FDP technology?
>
> It is based on write hints. There are no FS-specific placement decisions.
> All the responsibility is in the application.
>
> Kanchan: Can you comment a bit more on this?
>
>> How can FDP technology improve the efficiency and reliability of
>> kernel-space file systems?
>
> This is an open problem. Our experience is that making data placement
> decisions on the FS is tricky (beyond the obvious data / metadata). If
> someone has a good use-case for this, I think it is worth exploring.
> F2FS is a good candidate, but I am not sure FDP is of interest for
> mobile - here ZUFS seems to be the current dominant technology.
>
If I understand the FDP technology correctly, I can see the benefits for
file systems. :)

For example, SSDFS is based on the segment concept and it has multiple
types of segments (superblock, mapping table, segment bitmap, b-tree
nodes, user data). So, first, I can use hints to place different segment
types into different reclaim units. This first point is clear: I can place different
types of data/metadata (with different “hotness”) into different reclaim units.
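
The first point can be sketched roughly as a lookup from segment type to a
placement hint. This is only an illustration, not SSDFS code: the enum names,
the table, and seg_type_to_hint() are hypothetical, the hint classes are merely
modeled on the kernel's write-lifetime hints, and the particular assignment of
hints to segment types is an arbitrary assumption. The only thing it tries to
show is that every segment type gets its own hint, and so could end up in its
own reclaim unit.

```c
#include <assert.h>
#include <stddef.h>

/* Segment types listed in the text above. */
enum seg_type {
    SEG_SUPERBLOCK,
    SEG_MAPPING_TABLE,
    SEG_SEGMENT_BITMAP,
    SEG_BTREE_NODE,
    SEG_USER_DATA,
    SEG_TYPE_COUNT
};

/* Hypothetical hint classes, modeled on write-lifetime hints. */
enum placement_hint {
    HINT_NOT_SET = 0,
    HINT_NONE,
    HINT_SHORT,
    HINT_MEDIUM,
    HINT_LONG,
    HINT_EXTREME
};

/*
 * ASSUMPTION: this assignment is arbitrary and for illustration only.
 * A real file system would choose it from measured update frequency.
 */
static const enum placement_hint seg_hint_table[] = {
    [SEG_SUPERBLOCK]     = HINT_EXTREME,
    [SEG_MAPPING_TABLE]  = HINT_MEDIUM,
    [SEG_SEGMENT_BITMAP] = HINT_SHORT,
    [SEG_BTREE_NODE]     = HINT_LONG,
    [SEG_USER_DATA]      = HINT_NONE,
};

/* Compile-time check: exactly one hint per segment type. */
static_assert(sizeof(seg_hint_table) / sizeof(seg_hint_table[0]) ==
              SEG_TYPE_COUNT, "one hint per segment type");

/* Return the hint for a segment type, or HINT_NOT_SET if out of range. */
enum placement_hint seg_type_to_hint(enum seg_type type)
{
    if ((unsigned int)type >= SEG_TYPE_COUNT)
        return HINT_NOT_SET;
    return seg_hint_table[type];
}
```

With this table, seg_type_to_hint(SEG_SEGMENT_BITMAP) and
seg_type_to_hint(SEG_USER_DATA) return different hints, so writes to those
segment types would not be mixed, provided the device maps distinct hints to
distinct reclaim units.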

The second point may be less obvious. SSDFS provides a way to define
the size of the erase block. For a ZNS SSD, the mkfs tool uses the zone size
that the storage device exposes. However, for a conventional
SSD, the size of the erase block is defined by the user. Technically speaking, this size
could be smaller or bigger than the real erase block inside the SSD. Also, the FTL could
use a tricky mapping scheme that combines LBAs in a way that makes
FS activity inefficient even with the erase block or segment concept. I can see
how FDP can help here. First of all, the reclaim unit guarantees that erase
blocks or segments on the file system side will match erase blocks (reclaim units)
on the SSD side. Also, I can use various sizes of logical erase blocks, and the logical
erase blocks of the same segment type will be placed into the same reclaim unit.
This could guarantee lower write amplification and predictable reclaiming on
the SSD side. The flexibility to use various logical erase block sizes improves
the efficiency of the file system because different workloads could require
different logical erase block sizes.
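
As a small worked example of the size relationship, a mkfs-style tool could
check whether a user-chosen logical erase block size packs evenly into a
reclaim unit. This is a sketch under assumptions: lebs_per_reclaim_unit() is a
hypothetical helper, it assumes the reclaim unit size is known to the caller,
and it only covers logical erase blocks that are no bigger than the reclaim
unit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many logical erase blocks (LEBs) of leb_size
 * bytes fit into one reclaim unit of ru_size bytes.
 *
 * Returns 0 when the LEB is bigger than the reclaim unit or does not
 * divide it evenly, i.e. when LEBs would straddle a reclaim unit boundary.
 */
uint64_t lebs_per_reclaim_unit(uint64_t ru_size, uint64_t leb_size)
{
    /* Both sizes must be non-zero; callers are expected to validate input. */
    assert(ru_size > 0);
    assert(leb_size > 0);

    if (leb_size > ru_size || ru_size % leb_size != 0)
        return 0;

    return ru_size / leb_size;
}
```

With an assumed 64 MiB reclaim unit, an 8 MiB logical erase block gives 8
blocks per reclaim unit, while a 6 MiB logical erase block is rejected because
64 is not a multiple of 6.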

Technically speaking, any file system can place different types of metadata in
different reclaim units. However, user data is a slightly trickier case. Potentially,
file system logic can track the “hotness”, or update frequency, of user data
and try to direct different types of user data to different reclaim units.
But, from another point of view, we have folders in the file system namespace.
If an application can place different types of data in different folders, then, technically
speaking, file system logic can place the content of different folders into different
reclaim units. But the application needs to follow some “discipline” and store different
types of user data (different “hotness”, for example) in different folders.
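
The folder idea could look roughly like the sketch below: a per-directory
policy table and a longest-prefix match that picks a hint for a file path.
Everything here is hypothetical (the struct, the hint names, the directory
names, and path_to_hint()); it does not describe an existing file system
interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hint classes for user data. */
enum data_hint { DATA_HINT_DEFAULT = 0, DATA_HINT_HOT, DATA_HINT_COLD };

/* One policy entry: everything under 'prefix' gets 'hint'. */
struct dir_policy {
    const char *prefix;
    enum data_hint hint;
};

/* ASSUMPTION: example directories chosen only for illustration. */
static const struct dir_policy dir_policies[] = {
    { "/logs/",        DATA_HINT_HOT  },
    { "/archive/",     DATA_HINT_COLD },
    { "/archive/tmp/", DATA_HINT_HOT  },
};

/*
 * Pick the hint of the longest policy prefix that matches 'path'.
 * Paths that match no policy fall back to DATA_HINT_DEFAULT.
 */
enum data_hint path_to_hint(const char *path)
{
    size_t n = sizeof(dir_policies) / sizeof(dir_policies[0]);
    size_t best_len = 0;
    enum data_hint best = DATA_HINT_DEFAULT;

    assert(path != NULL);

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(dir_policies[i].prefix);

        if (len > best_len &&
            strncmp(path, dir_policies[i].prefix, len) == 0) {
            best_len = len;
            best = dir_policies[i].hint;
        }
    }
    return best;
}
```

Here a file under /archive/tmp/ is treated as hot even though /archive/ is
cold, because the longer prefix wins. This only works if the application
really keeps data of different “hotness” in different folders.
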
Thanks,
Slava.