Re: Best way to contribute bunch of miscellaneous fixes? (32 bit builds)

Gentlemen,


I am happy to announce that with all my patches applied I have managed to build Ceph for armv7l and it passes the full test suite! :)


@Kefu: thank you for letting us know the current state of affairs regarding 32-bit systems. I continuously build Ceph for 32-bit systems and can send in PRs to keep the codebase working on them. The problems I generally encounter are:

1. Use of size_t where uint64_t should be used.

2. Buggy malloc calls requesting an extreme amount of memory which is simply not addressable on a 32-bit system.

3. 32-bit systems may have at best 3GB of memory for userspace (I am not speaking about physical memory but about the 3/1GB split between userspace and kernel, CONFIG_VMSPLIT_3G in the kernel config), while the Ceph OSD defaults to 4GB. It also appears that the rest of the settings are geared towards 4GB of memory.

4. Places where a higher-performance system is assumed (premature timeouts, OOMs).

5. The worst armv7l-specific issue I encountered was buggy TLS (Thread Local Storage) in GCC 9.1 for ARMv7. It took me a whole month of debugging and digging to figure out how to avoid the issue on this platform. It was not a problem with Ceph as such, but rather Ceph code triggering an obscure TLS bug in GCC. I worked around it with GCC settings that avoid the broken TLS (that TLS bug has been there for the best part of two decades and no one has been able to fix it).
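
To illustrate item 1, here is a minimal sketch (not Ceph code) of what happens when a byte count is stored in a 32-bit size type. It uses std::uint32_t as a stand-in for std::size_t on a 32-bit build, so it behaves the same on any machine. The names size32_t, store_as_size32, store_as_u64 and kGiB are made up for illustration. The 100G figure is the OSD size from the smoke.sh test discussed further down in this thread.

```cpp
#include <cstdint>

// Stand-in for std::size_t on a 32-bit target (an assumption for
// illustration, so the sketch behaves the same on a 64-bit machine).
using size32_t = std::uint32_t;

// Hypothetical helper: store a 64-bit byte count in a 32-bit size type.
// The conversion keeps only the low 32 bits of the value.
constexpr size32_t store_as_size32(std::uint64_t bytes) {
    return static_cast<size32_t>(bytes);
}

constexpr std::uint64_t kGiB = 1ULL << 30;

// 100G = 100 * 2^30 = 25 * 2^32, so the low 32 bits are all zero.
static_assert(store_as_size32(100 * kGiB) == 0, "100G truncates to 0");

// 3G still fits, which is why small sizes appear to work.
static_assert(store_as_size32(3 * kGiB) == 3 * kGiB, "3G fits in 32 bits");

// Keeping the value in a fixed 64-bit type avoids the truncation.
constexpr std::uint64_t store_as_u64(std::uint64_t bytes) { return bytes; }
static_assert(store_as_u64(100 * kGiB) == 107374182400ULL, "no truncation");
```

The checks run at compile time, so the snippet can be verified simply by compiling it as C++11 or later.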

Unfortunately this is my side project and, as I guess is the case for many of us, I don't have much time to contribute. On top of that, it takes two days for my Orange Pi Plus 2E to build Ceph from scratch, so please don't expect me to be very quick with delivery.


@Tony: over the last decade a bunch of cheap ARM-based devices have been developed, and it would be a shame to ignore them in low-cost settings (I use Ceph as a home cluster: a cheap system with very high stability). They may also serve as cheap hardware for a huge cluster with slow-moving data. I used it to keep radio astronomy data: it does not change, comes in large quantities and must be protected by good redundancy. There is no need for fast access, but it needs a good scrub once a week and should recover automatically if one of 180 HDDs unexpectedly gives up the ghost. A perfect scenario for Ceph on cheap ARMs.

I have a C/C++ development background and I am not afraid to pull out gdb and sink my teeth into the thick of it, so I can take a look at failures like #44197. I will provide PRs whenever I can to keep the armv7 build alive. I have my own build farm at home, so I am good on resources, just short on time.


@Duncan: since I sent you my patches I have found one last issue, with cephadm trying to run up to 10 threads simultaneously in Python. As I mentioned above, 32-bit systems are constrained to 3GB of memory in the best case, and because Python can be memory hungry only 3 simultaneous threads may be used: src/pybind/mgr/cephadm/module.py@287 needs to change to self._worker_pool = multiprocessing.pool.ThreadPool(3) .

I will use your idea of conditionals in the code so that 64-bit builds are not disrupted.


Regards,

Vladimir

On 24/11/20 8:30 pm, Duncan Bellamy wrote:
Hi,
Thanks, I have Vlad's patches, as he kindly emailed them to me, and have started adding them to Alpine to get 32-bit working here (currently not building):
https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/14888

Vlad says he only applies the "arm32bit" prefix patches on armv7. He is also working on merge requests to Ceph.

So it needs a way to detect 64/32-bit in CMake, or I saw a portable way with <cstdint>, so that the 32-bit-only fixes that are not needed for 64-bit can be applied with "#ifdef".
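
A minimal sketch of the <cstdint> approach, assuming pointer width is the property that matters. UINTPTR_MAX is a standard macro from <cstdint>. The macro name SKETCH_32BIT, the function worker_threads and the constant kIs32Bit are hypothetical and not part of Ceph. The thread counts 3 and 10 are the ones Vladimir mentions for cephadm at the top of this thread.

```cpp
#include <cstdint>

// UINTPTR_MAX is the largest value of uintptr_t, so it reflects the
// pointer width of the target the code is being compiled for.
#if UINTPTR_MAX == 0xFFFFFFFFu
#define SKETCH_32BIT 1   // hypothetical macro name, not defined by Ceph
#else
#define SKETCH_32BIT 0
#endif

// Example of a 32-bit-only adjustment that leaves 64-bit builds untouched.
constexpr unsigned worker_threads() {
#if SKETCH_32BIT
    return 3;    // value suggested in this thread for 32-bit systems
#else
    return 10;   // thread count reported in this thread for cephadm today
#endif
}

// Sanity check: the macro agrees with the actual pointer size.
constexpr bool kIs32Bit = (SKETCH_32BIT == 1);
static_assert(kIs32Bit == (sizeof(void*) == 4), "macro matches pointer width");
```

On the CMake side, the analogous check is usually made against CMAKE_SIZEOF_VOID_P, which is 4 on 32-bit targets and 8 on 64-bit ones.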

Thanks,

Duncan

On 24 Nov 2020, at 09:22, Anthony Davies <anthony.t.davies@xxxxxxxxx> wrote:

Hi All,

I had a similar idea to Vlad's, but using the Olimex Lime2 as a platform. Great minds think alike. My interest was mainly to use it with Rook for my low-power Kubernetes cluster.

The last successful build I had running using Rook-ceph was 15.2.0. I doubled my kids around that time (1 to 2) and my free time disappeared.

Having said that, I had an issue, #44197, which Brad was planning to take a look at before his priorities shifted.

I would love to see this up and working but unfortunately have very little time to spend on it at the moment. It sounds like Vlad is more of a dev than I am and can contribute PRs, which I struggle with when it comes to C.

Armv7 code runs on armv8 hardware that supports 32-bit (AArch32) execution, so such armv8 machines can be used to compile. I personally used a 32-bit armv7 Alpine image on high-powered armv8 infrastructure.

Vlad, I was using cloud.drone.io to automate my builds; that may be an option for you too. They offer it free for open source projects.

Happy to help out however I can with the limited time I have.

Cheers,

Tony

On Tue, 24 Nov 2020 at 19:50, kefu chai <tchaikov@xxxxxxxxx> wrote:
hi Duncan and Vladimir,

we don't have the hardware or dedicated resources for supporting armv7,
i586, i686 or other 32-bit architectures. but patches enabling us to
support more architectures are always welcome.

cypress is only used for testing the dashboard. would be better if we
could disable it conditionally based on platform / archs.

+Tony Davies, as he's been testing Ceph on armv7 and aarch64 recently.

On Mon, Nov 23, 2020 at 8:33 PM Duncan Bellamy <a.16bit.sysop@xxxxxxxxxx> wrote:
>
> Yes please. If you could email them to me as well as opening PRs, I can see if I can get it to build for Alpine Linux.
>
> On 23 Nov 2020, at 12:20, Vladimir Bashkirtsev <vladimir@xxxxxxxxxxxxxxx> wrote:
>
> I have checked their patches and they are not enough to get everything up and running. They have patched two files which I patched as well, but I also patched a number of other files which are vital for correct operation. I'll do PRs soon, or if you'd like I can email my patches directly.
>
> On 23/11/20 11:09 pm, Duncan Bellamy wrote:
>
> Thanks for that. Ubuntu has it building for all arches; they have patches for 32-bit and ARM in the patches directory here:
>
> http://archive.ubuntu.com/ubuntu/pool/main/c/ceph/ceph_15.2.5-0ubuntu1.1.debian.tar.xz
>
>
> On 23 Nov 2020, at 08:48, Vladimir Bashkirtsev <vladimir@xxxxxxxxxxxxxxx> wrote:
>
> I doubt that Cypress will build the necessary stuff. It is much faster to drop it:
>
> diff -uNr ceph-15.2.4/src/pybind/mgr/dashboard/frontend/package.json ceph-15.2.4-arm32_fix/src/pybind/mgr/dashboard/frontend/package.json
> --- ceph-15.2.4/src/pybind/mgr/dashboard/frontend/package.json    2020-07-01 01:10:51.000000000 +0930
> +++ ceph-15.2.4-arm32_fix/src/pybind/mgr/dashboard/frontend/package.json    2020-11-21 22:11:16.065796889 +1030
> @@ -122,7 +122,6 @@
>      "@types/node": "12.12.34",
>      "@types/simplebar": "5.1.1",
>      "codelyzer": "5.2.2",
> -    "cypress": "4.4.0",
>      "html-linter": "1.1.1",
>      "htmllint-cli": "0.0.7",
>      "jest": "25.2.4",
>
>
> On 23/11/20 7:37 pm, Duncan Bellamy wrote:
>
> Thanks for the info, I have opened an issue about it on cypress github:
> https://github.com/cypress-io/cypress/issues/9272
>
> I don't think the Alpine build is running the test suite. Cypress is installed, and for the test step it does a cd into the build directory and runs "ctest", but searching the build logs for "ctest" returns 0 results.
>
> Duncan
>
> On 22 Nov 2020, at 23:18, Vladimir Bashkirtsev <vladimir@xxxxxxxxxxxxxxx> wrote:
>
> There is no Cypress for 32-bit ARM at all. As it is used just for testing, I removed it from package.json and then dropped the relevant test to avoid a test failure. Not the best solution, but because I build for 64 bits as well I can sleep relatively easy knowing that it should work. Node is used to create the mgr frontend and it should generate pretty much the same JS regardless of architecture.
>
> The explicit use of Cypress also tells me that 32-bit support was silently dropped by Ceph. But I can say that after some patching Ceph does pass the full test suite on armv7l. I will provide the patches as PRs soon; hopefully they will be accepted and Ceph will run on 32 bits again.
>
> On 23 November 2020 4:01:18 am AEDT, Duncan Bellamy <a.16bit.sysop@xxxxxxxxxx> wrote:
> I have been updating the Alpine Linux version to 15.2.6 and it fails on armv7 and x86 because the cypress install fails with "not found". How did you get cypress to work on 32-bit ARM?
>
> Duncan
>
> On 22 Nov 2020, at 08:49, Vladimir Bashkirtsev <vladimir@xxxxxxxxxxxxxxx> wrote:
>
> 
> Yeah, it is exactly because of this document that I am wondering. This document is the latest dev version, so one may expect that 32-bit builds are still a thing. But having gone through considerable effort building it on 32-bit ARM, I am sure that the current Ceph release is not buildable out of the box. For example, in options.h Options::size_t (the notion of a SIZE option) uses std::size_t (the largest counter available, 32 bits on 32-bit arches) to store the value. It follows that the smoke.sh test, which tries to create a 100G OSD, fails because the 100G value does not fit into 32 bits and is truncated to just 0 (zero). The same goes for other SIZE options which may exceed 2^32. So Ceph in its current form clearly cannot pass testing on 32-bit systems, which makes me think that the 32-bit target was silently dropped at some point.
>
> On 22/11/20 7:39 pm, Yuval Lifshitz wrote:
>
> I don't know if it is built and tested regularly on 32bit arch, but according to this:
> https://docs.ceph.com/en/latest/dev/release-process/
> 32bit arch is a valid target for ceph.
>
> On Sun, Nov 22, 2020 at 10:30 AM Vladimir Bashkirtsev <vladimir@xxxxxxxxxxxxxxx> wrote:
> Hi Yuval,
>
>
>
> Thank you for pointing me in the right direction. I am planning to provide PRs as I already have the relevant patches.
>
>
>
> BTW: am I right in thinking that Ceph is no longer built and tested on 32-bit architectures?
>
>
>
> Regards,
>
> Vladimir
>
> On 22/11/20 7:18 pm, Yuval Lifshitz wrote:
>
> Hi Vladimir,
> We use PRs to github for code contributions to Ceph. For more details see here: [1].
> If you find issues that need fixing, but you don't plan on fixing right away you can open tracker issues here [2].
>
> Yuval
>
> [1] https://github.com/ceph/ceph/blob/master/SubmittingPatches.rst
> [2] https://tracker.ceph.com/projects
>
> On Sun, Nov 22, 2020 at 5:45 AM Vladimir Bashkirtsev <vladimir@xxxxxxxxxxxxxxx> wrote:
> Hi all,
>
>
> Over the last few weeks I have been working on getting Ceph 15.2.4 to
> build and run on a 32-bit ARM system. It appears that Ceph no longer
> supports 32-bit systems. But I wanted to run it on an Odroid HC-2, which
> makes a very good cheap OSD, basically turning a SATA disk into an
> ethernet-enabled disk. So I sank my teeth into getting current Ceph up
> and running on a 32-bit architecture. While doing so I encountered a
> number of bugs in the Ceph codebase, and I now want to contribute the
> fixes back for everyone to use.
>
> For example, I found a bug in the EC code which attempts to malloc
> around 4GB. This bug is not noticeable on 64-bit systems, as malloc
> happily allocates 4GB, but it of course failed on a 32-bit system.
> Digging into the issue showed that this malloc was simply a bug: it
> should not be happening at all in the first place.
>
> Or a wrong include (.cc instead of .h) which caused a hard compilation
> error. Or a Python threading timeout specified as an int while it should
> be a float, so the timeout was basically not happening. And so on: small
> inaccuracies peppered around the code base. It certainly should not be
> done as one huge patch, and I am happy to submit these patches one by
> one. But which way should it be done? As a number of PRs on GitHub? Or
> as patch files emailed to someone?
>
> Could someone on the list enlighten me on the acceptable procedure I
> need to follow?
>
>
> Regards,
>
> Vladimir
> _______________________________________________
> Dev mailing list -- dev@xxxxxxx
> To unsubscribe send an email to dev-leave@xxxxxxx



--
Regards
Kefu Chai
