Hi John,

Sorry, I'm not sure what the largest file is on our systems.

We have lots of data sets that are ~8TB uncompressed, which typically
compress 3:1. Thus if a user wants a single file, we hit 3TB.

I'm rsyncing 360TB of data from an Isilon to cephfs, it'll be
interesting to see how cephfs copes with 400 million files...

thanks again for your help,

Jake

On 24/05/17 20:30, John Spray wrote:
> On Wed, May 24, 2017 at 8:17 PM, Jake Grimmett <jog@xxxxxxxxxxxxxxxxx> wrote:
>> Hi John,
>> That's great, thank you so much for the advice.
>> Some of our users have massive files so this would have been a big block.
>>
>> Is there any particular reason for having a file size limit?
>
> Without the size limit, a user can create a file of arbitrary size
> (without necessarily writing any data to it), such that when the MDS
> came to e.g. delete it, it would have to do a ridiculously large
> number of operations to check if any of the objects within the range
> that could exist (according to the file size) really existed.
>
> The idea is that we don't want to prevent users creating files big
> enough to hold their data, but we don't want to let them just tell the
> system "oh hey this file that I never wrote anything to is totally an
> exabyte in size, have fun enumerating the objects when you try to
> delete it lol".
>
> 1TB is a bit conservative these days -- that limit was probably set
> circa 10 years ago and maybe we should revisit it. As a data point,
> what's your largest file?
>
>> Would setting
>> max_file_size to 0 remove all limits?
>
> Nope, it would limit you to only creating empty files :-)
>
> It's a 64 bit field, so you can set it to something huge if you like.
>
> John
>
>>
>> Thanks again,
>>
>> Jake
>>
>> On 24 May 2017 19:45:52 BST, John Spray <jspray@xxxxxxxxxx> wrote:
>>>
>>> On Wed, May 24, 2017 at 7:41 PM, Brady Deetz <bdeetz@xxxxxxxxx> wrote:
>>>>
>>>> Are there any repercussions to configuring this on an existing large fs?
>>>
>>>
>>> No.
>>> It's just a limit that's enforced at the point of appending to
>>> files or setting their size, it doesn't affect how anything is stored.
>>>
>>> John
>>>
>>>> On Wed, May 24, 2017 at 1:36 PM, John Spray <jspray@xxxxxxxxxx> wrote:
>>>>>
>>>>>
>>>>> On Wed, May 24, 2017 at 7:19 PM, Jake Grimmett <jog@xxxxxxxxxxxxxxxxx>
>>>>> wrote:
>>>>>>
>>>>>> Dear All,
>>>>>>
>>>>>> I've been testing out cephfs, and bumped into what appears to be an
>>>>>> upper file size limit of ~1.1TB
>>>>>>
>>>>>> e.g:
>>>>>>
>>>>>> [root@cephfs1 ~]# time rsync --progress -av /ssd/isilon_melis.tar /ceph/isilon_melis.tar
>>>>>> sending incremental file list
>>>>>> isilon_melis.tar
>>>>>> 1099341824000 54% 237.51MB/s 1:02:05
>>>>>> rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
>>>>>> rsync: write failed on "/ceph/isilon_melis.tar": File too large (27)
>>>>>> rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
>>>>>> rsync: connection unexpectedly closed (28 bytes received so far) [sender]
>>>>>> rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
>>>>>>
>>>>>> Firstly, is this expected?
>>>>>
>>>>>
>>>>> CephFS has a configurable maximum file size, it's 1TB by default.
>>>>>
>>>>> Change it with:
>>>>> ceph fs set <fs name> max_file_size <size in bytes>
>>>>>
>>>>> John
>>>>>
>>>>>>
>>>>>> If not, then does anyone have any suggestions on where to start
>>>>>> digging?
>>>>>>
>>>>>> I'm using erasure encoding (4+1, 50 x 8TB drives over 5 servers), with
>>>>>> an nvme hot pool of 4 drives (2 x replication).
>>>>>>
>>>>>> I've tried both Kraken (release), and the latest Luminous Dev.
>>>>>>
>>>>>> many thanks,
>>>>>>
>>>>>> Jake
>>>>>> --
>>>>>>
>>>>>> ________________________________
>>>>>>
>>>>>> ceph-users mailing list
>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>>
>> --
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.

--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.
Phone 01223 267019
Mobile 0776 9886539
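
For anyone picking a value for the command quoted in this thread
(ceph fs set <fs name> max_file_size <size in bytes>), here is a minimal
sketch of the arithmetic. It only builds the command string and does not
run anything. The file system name "cephfs" and the 4 TiB target are
placeholders chosen for illustration, not values from the thread, and the
helper name is hypothetical. The rsync failure at 1099341824000 bytes sits
just under 2^40 bytes, which is consistent with a 1TB default, although
the thread does not state the exact default value.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def max_file_size_command(fs_name: str, size_tib: int) -> str:
    """Build the `ceph fs set ... max_file_size` command for a limit in TiB.

    Hypothetical helper for illustration; it only formats a string.
    """
    if size_tib <= 0:
        # Per the thread, 0 would restrict users to creating empty files.
        raise ValueError("size_tib must be positive")
    return f"ceph fs set {fs_name} max_file_size {size_tib * TIB}"


if __name__ == "__main__":
    # "cephfs" and 4 TiB are assumed example values, picked to sit above
    # the ~3TB compressed files mentioned in the thread.
    print(max_file_size_command("cephfs", 4))
```

As noted in the thread, the limit is only enforced when appending to a
file or setting its size, so raising it should not change how existing
data is stored.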