On Sat, Jan 2, 2016 at 6:50 AM, Vojtech Pavlik <vojtech@xxxxxxxx> wrote:
> On Fri, Jan 01, 2016 at 08:28:39PM -0500, Denis Bychkov wrote:
>
>> > We certainly can do all that, but: since new bcache can't read the old
>> > bcache format (I can go into why that's impractical, if people are
>> > curious) - that means there's a pretty high cost to switching to the
>> > new format:
>> > - people have to manually upgrade
>> > - the kernel would have to carry around both the old and the new
>> >   implementations of bcache for as long as people are using the old
>> >   format - we can't force people to upgrade
>>
>> Not that hard technically: you could leave the existing bcache module
>> as-is to avoid regressions, since the existing code does not require a
>> lot of maintenance, and add a new module that recognizes only the new
>> superblock. Another question is how easy it would be to convince Linus
>> and the top maintainers to keep two modules with a lot of duplication,
>> with the intention of eventually retiring the old code, but something
>> tells me you know ways around this problem.
>
> We had that with the UHCI drivers, and it is there with ext3/ext4; I
> don't think that is a real problem.
>
>> > So this isn't something we want to do more than once, which means we
>> > need to make sure the new on-disk format is 100% done. And it's not
>> > quite done - the main thing left for it to really be considered done
>> > is big-endian support and endian compatibility (writing the code so a
>> > little-endian machine can read a big-endian layout and vice versa;
>> > due to the way bkeys work it's not possible to just have an
>> > endian-agnostic layout, we'll have to do swabbing).
>>
>> But this problem is not unique to bcache at all. AFAIK, almost any FS
>> Linux supports would be unable to work with a different endianness
>> than the one it was created with.
>
> On the contrary, all modern filesystems cope with endianness
> portability.
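The two-module approach described above could hinge on a version field in the superblock: each driver binds only to the format it understands. A minimal sketch in C - the struct layout, magic value, and version cutoff below are my own illustrative assumptions, not the actual bcache on-disk format:

```c
#include <stdint.h>

/* Hypothetical superblock header; the field names, magic value, and
 * version cutoff are illustrative, not the real bcache layout. */
struct cache_sb_hdr {
	uint64_t magic;
	uint64_t version;
};

#define CACHE_MAGIC        0xc0ffee00c0ffee00ULL /* made-up magic value */
#define SB_VERSION_OLD_MAX 5                     /* assumed legacy cutoff */

/* Decide which module should claim the device: the legacy module binds
 * only to old-format superblocks, the new module only to new ones, so
 * both can coexist in one kernel, much like ext3 and ext4 did. */
const char *probe_module(const struct cache_sb_hdr *sb)
{
	if (sb->magic != CACHE_MAGIC)
		return "none";
	return sb->version <= SB_VERSION_OLD_MAX ? "bcache-legacy"
	                                         : "bcache-new";
}
```

With something like this, each module's probe path simply refuses devices of the other format, so the old code can stay frozen while development continues in the new module.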
> The only major filesystem in use where endianness is not handled is,
> as far as I know, UFS.
>
> At the same time, I don't see endianness portability - the ability to
> create a cache on a machine with one endianness and then mount it on a
> machine with the opposite endianness - as a real use case.
>
> Unlike filesystems, which can be used to transfer valuable data between
> machines, the cache only contains ephemeral data, which can easily be
> recreated from the backing device.
>
> Hence I believe it is reasonable to require the user to nuke the
> contents of the cache when moving the cache set between machines of
> different endianness.
>
> Ideally this would happen automatically and error out if the cache
> isn't clean.
>
> Actually, the same would be fine for format version changes.

Yeah, I totally agree with you here. I just think that the dirty-cache
situation might be much more common and less avoidable, which means it
requires a lot of dancing around in terms of tooling, documentation,
testing, etc. But it can easily be solved; it's not a hard problem,
it's just time-consuming, and this is something Kent could use some
help with.

>
>> > And this isn't a trivial amount of work - besides finishing the
>> > on-disk format, there's a fair amount of work on tooling and related
>> > stuff to make sure everything is ready for the switch.
>> >
>> > And, I can't work for free, so somehow funding has to be secured.
>> > Given the number of companies that are using bcache, and the fact
>> > that Canonical and SUSE are both apparently putting in at least a
>> > little bit of engineering time into supporting bcache, you'd think
>> > it should be possible, but offers have not been forthcoming.
>>
>> I don't know. IMHO, bcache was hurt a lot by a host of small problems
>> that nobody was able to address for quite some time.
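The "detect the mismatch and error out unless clean" policy sketched above could look roughly like this in C: a magic number stored in the creator's native byte order reveals a foreign-endian superblock, and attach is refused if the cache holds dirty data. The names and the magic value are my own illustration, not bcache's actual scheme:

```c
#include <stdint.h>

#define SB_MAGIC 0x00c0ffeeULL /* illustrative magic, stored native-endian */

/* Byte-swap a 64-bit value (what the kernel's swab64() does). */
uint64_t swab64(uint64_t x)
{
	return ((x & 0x00000000000000ffULL) << 56) |
	       ((x & 0x000000000000ff00ULL) << 40) |
	       ((x & 0x0000000000ff0000ULL) << 24) |
	       ((x & 0x00000000ff000000ULL) <<  8) |
	       ((x & 0x000000ff00000000ULL) >>  8) |
	       ((x & 0x0000ff0000000000ULL) >> 24) |
	       ((x & 0x00ff000000000000ULL) >> 40) |
	       ((x & 0xff00000000000000ULL) >> 56);
}

/* Returns 0 = native endianness, 1 = foreign (swab needed), -1 = not a
 * cache superblock at all. */
int sb_endianness(uint64_t magic_on_disk)
{
	if (magic_on_disk == SB_MAGIC)
		return 0;
	if (swab64(magic_on_disk) == SB_MAGIC)
		return 1;
	return -1;
}

/* Policy from the thread: a foreign-endian cache may only be attached
 * (after reinitialisation) if it is clean - dirty data would be lost. */
int may_attach(uint64_t magic_on_disk, int cache_is_dirty)
{
	int e = sb_endianness(magic_on_disk);

	if (e < 0)
		return 0; /* not ours */
	if (e == 1 && cache_is_dirty)
		return 0; /* would lose dirty data: error out */
	return 1;
}
```

This only covers the easy half; actually reading a foreign-endian cache would still require swabbing every bkey, which is the part Kent describes as the remaining work.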
>> It gained a bad reputation as a production system, unfortunately,
>> which means not much interest from the enterprise world, which means
>> Canonical & co. did not want to invest in it. Don't get me wrong, I am
>> not blaming you. Of all people, I might understand pretty well what
>> was going on - just explaining why RH, Canonical, or SUSE did not
>> fight for the privilege of financially supporting this project.
>
> SUSE had plans for bcache; however, since upstream stable-branch
> maintenance has been more than unreliable, we postponed most of them
> and are building knowledge in-house to be able to fully support it
> before we deploy.

Yeah, I was able to deduce that much. Not good.

> The structure of the code doesn't really help, either.
>
> --
> Vojtech Pavlik
> Director SUSE Labs

--
Denis