Re: [LSF/MM/BPF TOPIC] : Flexible Data Placement (FDP) availability for kernel space file systems

On 16.01.2024 11:39, Viacheslav Dubeyko wrote:


On Jan 15, 2024, at 8:54 PM, Javier González <javier.gonz@xxxxxxxxxxx> wrote:

On 15.01.2024 11:46, Viacheslav Dubeyko wrote:
Hi Javier,

Samsung introduced Flexible Data Placement (FDP) technology
pretty recently. As far as I know, this technology is currently
available for user-space solutions only. I assume it would be
good to have a discussion about how kernel-space file systems
could work with SSDs that support FDP and employ its benefits.

Slava,

Thanks for bringing this up.

First, this is not a Samsung technology. Several vendors are building
FDP, and several customers are already deploying first products.

We enabled FDP through I/O Passthru to avoid unnecessary noise in the
block layer until we had a clear idea of the use-cases. We have been
following and reviewing Bart's write hint series, and it covers all of
the block layer and interface changes needed to support FDP. Currently,
we have patches with small changes to wire up the NVMe driver. We plan
to submit them after Bart's patches are applied. Now is a good time,
since we have LSF and there are also 2 customers using FDP on block
and file.


How soon will the FDP API be available for kernel-space file systems?

The work is done. We will submit it as soon as Bart's patches are applied.

Kanchan is doing this work.

How can kernel-space file systems adopt FDP technology?

It is based on write hints. There are no FS-specific placement
decisions; all the responsibility is in the application.

Kanchan: Can you comment a bit more on this?
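
To make this concrete, the application-side interface is the existing
per-file lifetime hint. A minimal sketch (file name and hint value are
arbitrary; how the hint maps to an FDP reclaim unit handle is left to
the block layer / NVMe driver):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#ifndef F_SET_RW_HINT
#define F_SET_RW_HINT		1036	/* F_LINUX_SPECIFIC_BASE + 12 */
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT	2
#endif

int main(void)
{
	uint64_t hint = RWH_WRITE_LIFE_SHORT;
	int fd = open("data.tmp", O_CREAT | O_WRONLY, 0644);

	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* The fcntl argument is a pointer to a 64-bit hint value. */
	if (fcntl(fd, F_SET_RW_HINT, &hint) < 0)
		perror("F_SET_RW_HINT");

	/* Subsequent writes to this fd carry the hint. */
	if (write(fd, "hello\n", 6) != 6)
		perror("write");

	close(fd);
	return 0;
}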

How can FDP technology improve the efficiency and reliability of
kernel-space file systems?

This is an open problem. Our experience is that making data placement
decisions in the FS is tricky (beyond the obvious data / metadata
split). If someone has a good use-case for this, I think it is worth
exploring. F2FS is a good candidate, but I am not sure FDP is of
interest for mobile - here ZUFS seems to be the current dominant
technology.


If I understand the FDP technology correctly, I can see the benefits for
file systems. :)

For example, SSDFS is based on a segment concept and it has multiple
types of segments (superblock, mapping table, segment bitmap, b-tree
nodes, user data). So, as a first step, I can use hints to place
different segment types into different reclaim units.

Yes. This is what I meant by data / metadata. We have also looked into
using one RUH for metadata and making the rest available to
applications. We decided to start with a simple solution and complete
it as we see users.

For SSDFS it makes sense.
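
For illustration only (this is neither SSDFS code nor our patches), the
FS side could be as simple as tagging per-segment-type bios with a
lifetime hint, assuming bio->bi_write_hint as restored by Bart's
series. The segment-type names and the hint mapping below are made up:

/* Sketch: map hypothetical segment types to write hints on the bio. */
#include <linux/bio.h>
#include <linux/blkdev.h>

enum demo_seg_type {			/* made-up segment types */
	DEMO_SEG_SUPERBLOCK,
	DEMO_SEG_MAPPING_TBL,
	DEMO_SEG_BTREE_NODE,
	DEMO_SEG_USER_DATA,
};

static enum rw_hint demo_hint_for_seg(enum demo_seg_type type)
{
	switch (type) {
	case DEMO_SEG_SUPERBLOCK:
	case DEMO_SEG_MAPPING_TBL:
		return WRITE_LIFE_EXTREME;	/* long-lived metadata */
	case DEMO_SEG_BTREE_NODE:
		return WRITE_LIFE_LONG;
	case DEMO_SEG_USER_DATA:
	default:
		return WRITE_LIFE_MEDIUM;
	}
}

static void demo_submit_seg_bio(struct bio *bio, enum demo_seg_type type)
{
	/* The hint travels with the bio down to the NVMe driver. */
	bio->bi_write_hint = demo_hint_for_seg(type);
	submit_bio(bio);
}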

The first point is clear: I can place different types of data/metadata
(with different “hotness”) into different reclaim units. The second
point may be less clear. SSDFS provides a way to define the size of the
erase block. For a ZNS SSD, the mkfs tool uses the zone size that the
storage device exposes. However, for a conventional SSD, the erase
block size is defined by the user. Technically speaking, this size
could be smaller or bigger than the real erase block inside the SSD.
Also, the FTL could use a tricky mapping scheme that combines LBAs in a
way that makes FS activity inefficient even with an erase block or
segment concept. I can see how FDP can help here. First of all, the
reclaim unit guarantees that erase blocks or segments on the file
system side will match erase blocks (reclaim units) on the SSD side.
Also, I can use various sizes of logical erase blocks, but the logical
erase blocks of the same segment type will be placed into the same
reclaim unit. This could guarantee lower write amplification and
predictable reclaiming on the SSD side. The flexibility to use various
logical erase block sizes improves file system efficiency, because
different workloads could require different logical erase block sizes.

Sounds good. I see you sent a proposal on SSDFS specifically. It makes
sense to cover these specific uses there.
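
As a concrete illustration of the sizing point, mkfs could simply pick
a logical erase block size that packs evenly into a reclaim unit. The
reclaim unit size would come from the device's FDP configuration; the
numbers below are made up:

/*
 * Sketch: choose a logical erase block size so that a whole number of
 * logical erase blocks fits in one reclaim unit. The reclaim unit size
 * is a parameter here; a real mkfs would query it from the device.
 */
#include <stdint.h>
#include <stdio.h>

static uint64_t pick_logical_eb_size(uint64_t ru_bytes, uint64_t requested)
{
	uint64_t eb = requested;

	/* Halve the requested size until it divides the reclaim unit. */
	while (eb > 4096 && ru_bytes % eb)
		eb >>= 1;
	return eb;
}

int main(void)
{
	uint64_t ru = 64ULL << 20;	/* example: 64 MiB reclaim unit */
	uint64_t eb = pick_logical_eb_size(ru, 8ULL << 20);

	printf("logical erase block: %llu KiB, %llu per reclaim unit\n",
	       (unsigned long long)(eb >> 10),
	       (unsigned long long)(ru / eb));
	return 0;
}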

Technically speaking, any file system can place different types of
metadata in different reclaim units. However, user data is a slightly
trickier case. Potentially, file system logic can track the “hotness”
or update frequency of some user data and try to direct the different
types of user data into different reclaim units. But, from another
point of view, we have folders in the file system namespace. If an
application places different types of data in different folders, then,
technically speaking, the file system logic can place the contents of
different folders into different reclaim units. But the application
needs to follow some “discipline” to store different types of user
data (with different “hotness”, for example) in different folders.

Exactly. This is why I think it makes sense to look at specific FSs,
as there are real deployments that we can use to argue for changes
that cover a large percentage of use-cases.
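
A sketch of what such an application-side “discipline” could look
like, reusing the same F_SET_RW_HINT interface as above; the directory
names and the mapping are made up for illustration:

/* Sketch: derive the per-file hint from the directory a file lives in. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#ifndef F_SET_RW_HINT
#define F_SET_RW_HINT		1036	/* F_LINUX_SPECIFIC_BASE + 12 */
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT	2
#define RWH_WRITE_LIFE_MEDIUM	3
#define RWH_WRITE_LIFE_LONG	4
#endif

static uint64_t hint_for_dir(const char *path)
{
	if (!strncmp(path, "/data/tmp/", 10))
		return RWH_WRITE_LIFE_SHORT;	/* frequently rewritten */
	if (!strncmp(path, "/data/logs/", 11))
		return RWH_WRITE_LIFE_MEDIUM;
	return RWH_WRITE_LIFE_LONG;		/* mostly cold data */
}

static int open_with_hint(const char *path)
{
	int fd = open(path, O_CREAT | O_WRONLY, 0644);
	uint64_t hint = hint_for_dir(path);

	if (fd >= 0 && fcntl(fd, F_SET_RW_HINT, &hint) < 0)
		perror("F_SET_RW_HINT");
	return fd;
}

int main(void)
{
	int fd = open_with_hint("/data/tmp/scratch.0");

	return fd < 0;
}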



