Okay, another day, another nightmare ;-)

So far we discussed the pools as bundles of:
- pool 1) 15 HDD-OSDs (a total of 25 actual HDDs: 5 single HDDs and five raid0 pairs, as mentioned before)
- pool 2) 6 SSD-OSDs

Unfortunately (well), on the "physical" pool 1 there live two "logical" pools (my wording here is maybe not cephish?).
Now I wonder about the real free space on "the pool" ...

ceph df tells me:

GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    52806G     17457G     35349G       66.94
POOLS:
    NAME           ID     USED       %USED     MAX AVAIL     OBJECTS
    pool-1-HDD      9       995G     13.34     3232G          262134
    pool-2-HDD     10     14986G     69.86     3232G         3892481
    pool-3-SDD     12      1318G     55.94      519G          372618

Now how do I read this?
The sum of "MAX AVAIL" in the POOLS section is 3232 + 3232 + 519 = 6983G.
6983 * 2 (since all three pools have a size of 2) is 13966G.
The GLOBAL section on the other hand tells me I still have 17457G available.
17457 - 13966 = 3491 - so where are the missing 3491 GB?
Or am I missing something (other than space and a sane setup, I mean :-)

AND (!) if the 3232G of "MAX AVAIL" reported (twice) for the "physical" HDD pool is real, then in this setup (two hosts) there is only 3232G free on each host.
Given that the HDD-OSDs are 4TB in size: if one dies and the host tries to restore the data (as I learned yesterday, in this setup the data will ONLY be restored on the host on which the OSD died), then ... it doesn't work, right?
Except I could hope that - due to too few placement groups and the resulting imbalance of space usage across the OSDs - the dead OSD was only 60% full and not 85%, and that only the real data gets rewritten (restored).
But even that seems impossible: given the imbalanced OSDs, the fuller ones will hit total saturation, and - at least as I understand it now - once the first OSD is 100% full I can't use the space left on the other OSDs. Right?

If all of that is true (and PLEASE point out any mistake in my thinking), then at the moment I have 25 hard disks here of which NONE must fail, or the pool will at least stop accepting writes.
Am I right? (Feels like a reciprocal Russian roulette ... ONE chamber WITHOUT a bullet ;-)

Now - sorry, we are not finished yet (and yes, this is true, I'm not trying to make fun of you).
On top of all this I see a rapid decrease in the available space, which is consistent neither with the growth of the data inside the rbds living in this cluster nor with a growing number of rbds (we ONLY use rbds).
BUT someone is running snapshots.
How do I sum up the amount of space each snapshot is using?
Is it the sum of the USED column in the output of "rbd du --snap"? (See the quick hack in the P.S. below.)
And what is the philosophy of snapshots in ceph? An object is 4MB in size - if a bit in that object changes, is the whole object copied?
(The cluster is luminous, upgraded from jewel, so we use filestore on xfs, not bluestore.)

TIA
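P.S. In case it helps to see what I mean: below is the quick and dirty Python hack I'm currently using to add up the per-snapshot USED values from "rbd du". I'm assuming that "rbd du --format json" on luminous returns an "images" list whose snapshot rows carry "snapshot" and "used_size" fields - I'm guessing at those key names (and I'm not even sure summing USED is the right way to account for snapshot space), so please treat it as a sketch and correct me.

#!/usr/bin/env python
# Rough sketch: add up the space "rbd du" attributes to snapshots in one pool.
# ASSUMPTION: "rbd du --format json" prints an "images" array whose snapshot
# rows have a "snapshot" name and a "used_size" in bytes - the field names are
# guessed, adjust them to whatever your rbd version actually emits.
import json
import subprocess
import sys

def rbd_du_json(pool):
    """Run 'rbd du' for a whole pool and parse the JSON output."""
    out = subprocess.check_output(["rbd", "du", "--pool", pool, "--format", "json"])
    return json.loads(out.decode())

def snapshot_used_bytes(du):
    """Sum 'used_size' over rows that belong to a snapshot (skip the image HEAD rows)."""
    total = 0
    for row in du.get("images", []):
        if row.get("snapshot"):
            total += row.get("used_size", 0)
    return total

if __name__ == "__main__":
    pool = sys.argv[1] if len(sys.argv) > 1 else "pool-2-HDD"  # default to my big HDD pool
    used = snapshot_used_bytes(rbd_du_json(pool))
    print("%s: %.1f GiB reported as USED by snapshots" % (pool, used / 1024.0 ** 3))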
On Tue, Dec 5, 2017 at 11:10 AM, Stefan Kooman <stefan@xxxxxx> wrote:
> Quoting tim taler (robur314@xxxxxxxxx):
>> And I'm still puzzled about the implication of the cluster size on the
>> amount of OSD failures.
>> With size=2 min_size=1 one host could die and (if by chance there is
>> NO read error on any bit on the living host) I could (theoretically)
>> recover, is that right?
> True.
>> OR is it that if any two disks in the cluster fail at the same time
>> (or while one is still being rebuilt) all my data would be gone?
> Only the objects that are located on those disks. So for example obj1
> on disk1,host1 and obj1 on disk2,host2 ... you will lose data, yes.
>
> Gr.
> Stefan
>
> --
> | BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
> | GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com