Re: How can I use not-replicated pool (replication 1 or raid-0)

Janne Johansson <icepic.dz@xxxxxxxxx> · Wed, 12 Apr 2023 09:10:27 +0200

On Mon, 10 Apr 2023 at 22:31, mhnx <morphinwithyou@xxxxxxxxx> wrote:
> Hello.
> I have a 10 node cluster. I want to create a non-replicated pool
> (replication 1) and I want to ask some questions about it:
>
> Let me tell you my use case:
> - I don't care about losing data,
> - All of my data is JUNK and these junk files are usually between 1KB and 32MB.
> - These files will be deleted in 5 days.
> - Writable space and I/O speed are more important.
> - I have high Write/Read/Delete operations, minimum 200GB a day.

That is "only" about 2.3 MB/s on average (roughly 18.5 Mbit/s), which
should easily be doable even with repl=2, 3 or 4, or with EC. This of
course depends on the speed of drives, network, CPUs and all that, but
in itself it doesn't seem too hard to achieve in terms of average
speeds. We have EC 8+3 RGW backed by some 12-14 OSD hosts with HDD and
NVMe (for WAL+DB) that can ingest over 1 GB/s if you parallelize the
RGW streams, so a few MB/s seems totally doable with 10 decent
machines. Even with replication.
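
As a quick sanity check on the average rate, here is a small sketch of
the arithmetic. It assumes decimal units (1 GB = 10^9 bytes) and that
the 200 GB is spread evenly over 24 hours; real traffic will be
burstier, so treat the result as a floor, not a sizing target. The
function name is only illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate(bytes_per_day):
    """Return (MB/s, Mbit/s) for a daily volume, using decimal units."""
    bytes_per_second = bytes_per_day / SECONDS_PER_DAY
    return bytes_per_second / 1e6, bytes_per_second * 8 / 1e6


# 200 GB/day is the minimum volume given in the question.
mb_s, mbit_s = average_rate(200e9)
print(f"{mb_s:.2f} MB/s, {mbit_s:.1f} Mbit/s")  # 2.31 MB/s, 18.5 Mbit/s
```

So 200 GB/day is about 2.3 MB/s, or about 18.5 Mbit/s on the wire
before any replication or EC overhead.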

> I'm afraid that, in any failure, I won't be able to access the whole
> cluster. Losing data is okay but I have to ignore missing files,

Even with repl=1, in case of a failure, the cluster will still try to
recover the lost data rather than ignore it and move on, so any
solution that involves "forgetting" about lost data would need a Ceph
operator telling the cluster to give up on the missing parts and to
recreate the broken PGs. This would not be automatic.
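
To give an idea of what that manual step could look like, here is a
rough sketch that only builds and prints the commands, without running
anything. It assumes the operator has already decided the data is gone
for good. The OSD id and PG ids are made up for illustration, and the
helper name is hypothetical. "ceph osd lost" and
"ceph osd force-create-pg" are existing ceph CLI commands, but check
the exact flags against the documentation for your release before
using them.

```python
import shlex


def abandon_pg_commands(pg_ids, lost_osd_id=None):
    """Build (but do not run) ceph CLI calls for giving up on lost PGs."""
    cmds = []
    if lost_osd_id is not None:
        # Declare the dead OSD permanently lost so peering stops waiting for it.
        cmds.append(["ceph", "osd", "lost", str(lost_osd_id),
                     "--yes-i-really-mean-it"])
    for pg_id in pg_ids:
        # Recreate the PG as empty; whatever objects it held are discarded.
        cmds.append(["ceph", "osd", "force-create-pg", pg_id,
                     "--yes-i-really-mean-it"])
    return cmds


# Hypothetical example: osd.12 died and took PGs 7.1a and 7.2f with it.
for cmd in abandon_pg_commands(["7.1a", "7.2f"], lost_osd_id=12):
    print(shlex.join(cmd))
```

Printing instead of executing is deliberate: these commands throw data
away, so a human should look at each one before it is run.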

--
May the most significant bit of your life be positive.