Re: [PATCH 2/3] ext4: Context support

Saugata Das <saugata.das@xxxxxxxxxx> · Tue, 12 Jun 2012 19:56:58 +0530

On 12 June 2012 18:59, Arnd Bergmann <arnd.bergmann@xxxxxxxxxx> wrote:
> On Tuesday 12 June 2012, Saugata Das wrote:
>> On 11 June 2012 17:57, Ted Ts'o <tytso@xxxxxxx> wrote:
>> > On Mon, Jun 11, 2012 at 02:41:31PM +0300, Artem Bityutskiy wrote:
>> > The proof-of-concept patches seem to use the inode number as a way of
>> > trying to group related writes, but what about at a larger level than
>> > that?  For example, if we install a RPM or deb package where all of
>> > the files will likely be replaced together, should that be given the
>> > same context?
>>
>> In this patch, context is used at file level based on inode number.
>> So, in the above example, multiple contexts will be used for the
>> directory, file updates during RPM installation.
>>
>> >
>> > How likely does it have to be that related blocks written under the
>> > same context must be deleted at the same time for this concept to be
>> > helpful?
>>
>> There is no restriction that related blocks within the MMC context
>> needs to be deleted together
>
> I don't think that is correct. The most obvious implementation in eMMC
> hardware for this would be to group all data from one context to be
> written into the same erase block, in order to reduce the amount
> of garbage collection that needs to happen at erase time. AFAICT,
> the main interest here is, as Ted is guessing correctly, to make sure
> that all data which gets written into one context has roughly the
> same life time before it gets erased or overwritten.
>

The restriction is there on "large unit" context, which prevents
trim/erase of the blocks till the context is active. But we do not
enable "large unit". On non-"large unit" context, the specification
does not restrict the trim/erase of blocks based on context.

>> > If we have a context where it is the context assumption does
>> > not hold (example: a database where you have a random access
>> > read/write pattern with blocks updated in place) how harm will it be
>> > to the device format if those blocks are written under the same
>> > context?
>> >
>>
>> MMC context allows the data blocks to be overwritten or randomly accessed
>
> That is of course the defined behavior of a block device that does
> not change with the use of contexts. To get the best performance,
> a random-write database file would always reside in a context by itself
> and not get mixed with long-lived write-once data. If we have a way
> in the file system to tell whether a file is written linearly or randomly
> (e.g. by looking at the O_APPEND or O_CREAT flag), it might make sense
> to split the context space accordingly.
>
>> > The next set of questions we need to ask is how generalizable is this
>> > concept to devices that might be more sophisticated than simple eMMC
>> > devices.  If we're going to expose something all the way out to the
>> > file system layer, it would be nice if it worked on more than just
>> > low-end flash devices, but also on more sophisticated devices as well.
>> >
>>
>> This context mechanism will be used on both UFS and MMC devices. If
>> there are some alternate suggestions on what can be used as context
>> from file system perspective, then please  suggest.
>
> One suggestion that has been made before was to base the context on
> the process ID rather than the inode number, but that has many other
> problems, e.g. when the same file gets written by multiple processes.
>
>        Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html