Re: Bareos and libradosstriper works only for 4M stripe_unit size

Hi, Gregory!

It turns out that this error is internal Ceph behaviour. I wrote a standalone program that creates a 132M object in striper mode. It works only with a 4M stripe unit. If you set stripe_unit = 2M, it still ends up using a 4M stripe_unit. Anything bigger than 4M causes a crash here:


  __u32 object_size = layout->object_size;
  __u32 su = layout->stripe_unit;
  __u32 stripe_count = layout->stripe_count;
  assert(object_size >= su);   /* <-- fails here */

I'm curious where it gets layout->object_size for an object that has just been created.
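
A rough sketch of why that assert matters, based on my reading of the striping arithmetic and not a copy of Striper.cc: the number of stripe units per object is object_size / su, and it is later used as a divisor, so object_size < su would make it zero. The names map_offset and struct extent are made up for illustration. If the striper keeps a default object_size (4M seems likely, given that 4M works) while only stripe_unit is raised to 8M, this is the precondition that would fail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: which backing object, and where inside it. */
struct extent {
    uint64_t object_no;
    uint64_t object_off;
};

/*
 * Map a logical byte offset to (object number, offset in object).
 * Illustrative only; not the libradosstriper implementation.
 */
static struct extent map_offset(uint64_t off, uint32_t object_size,
                                uint32_t su, uint32_t stripe_count)
{
    assert(su > 0 && stripe_count > 0);
    assert(object_size >= su);          /* same condition as the failing assert */

    uint64_t stripes_per_object = object_size / su;  /* 0 if object_size < su */
    uint64_t blockno     = off / su;                 /* which stripe unit */
    uint64_t stripeno    = blockno / stripe_count;   /* which stripe (row) */
    uint64_t stripepos   = blockno % stripe_count;   /* which object in the set */
    uint64_t objectsetno = stripeno / stripes_per_object;

    struct extent e;
    e.object_no  = objectsetno * stripe_count + stripepos;
    e.object_off = (stripeno % stripes_per_object) * su + off % su;
    return e;
}
```

With object_size = 4M, su = 2M and stripe_count = 1, every offset below 4M lands in object 0, which would be consistent with seeing 4M objects even though stripe_unit was set to 2M.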

As I understand it, striper mode was created by the CERN guys. In their document they recommend an 8M stripe_unit, but that does not work in Luminous.

Created I/O context.
Connected to pool backup with rados_striper_create 
Stripe unit OK 8388608 
Stripe count OK 1 
/build/ceph-12.2.0/src/osdc/Striper.cc: In function 'static void Striper::file_to_extents(CephContext*, const char*, const file_layout_t*, uint64_t, uint64_t, uint64_t, std::map<object_t, std::vector<ObjectExtent> >&, uint64_t)' thread 7f13bd5c1e00 time 2017-10-07 21:44:58.654778
/build/ceph-12.2.0/src/osdc/Striper.cc: 64: FAILED assert(object_size >= su)
 ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f13b3f3b332]
 2: (Striper::file_to_extents(CephContext*, char const*, file_layout_t const*, unsigned long, unsigned long, unsigned long, std::map<object_t, std::vector<ObjectExtent, std::allocator<ObjectExtent> >, std::less<object_t>, std::allocator<std::pair<object_t const, std::vector<ObjectExtent, std::allocator<ObjectExtent> > > > >&, unsigned long)+0x1e1e) [0x7f13bce235ee]
 3: (Striper::file_to_extents(CephContext*, char const*, file_layout_t const*, unsigned long, unsigned long, unsigned long, std::vector<ObjectExtent, std::allocator<ObjectExtent> >&, unsigned long)+0x51) [0x7f13bce23691]
 4: (libradosstriper::RadosStriperImpl::internal_aio_write(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, boost::intrusive_ptr<libradosstriper::MultiAioCompletionImpl>, ceph::buffer::list const&, unsigned long, unsigned long, ceph_file_layout const&)+0x224) [0x7f13bcda4184]
 5: (libradosstriper::RadosStriperImpl::write_in_open_object(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph_file_layout const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::list const&, unsigned long, unsigned long)+0x13c) [0x7f13bcda476c]
 6: (libradosstriper::RadosStriperImpl::write(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::list const&, unsigned long, unsigned long)+0xd5) [0x7f13bcda4bd5]
 7: (rados_striper_write()+0xdb) [0x7f13bcd9ba0b]
 8: (()+0x10fb) [0x55dd87b410fb]
 9: (__libc_start_main()+0xf1) [0x7f13bc9d72b1]
 10: (()+0xbca) [0x55dd87b40bca]


On Fri, Sep 29, 2017 at 11:46 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
I haven't used the striper, but it appears to make you specify sizes, stripe units, and stripe counts. I would expect you need to make sure that the size is an integer multiple of the stripe unit. And it probably defaults to a 4MB object if you don't specify one?
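
The two expectations above (object size at least as large as the stripe unit, and an integer multiple of it) can be written as a small pre-flight check. This is only a sketch: striper_layout_ok is a hypothetical helper, not part of libradosstriper, and the 4MB default is assumed here, not confirmed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if the layout satisfies
 * object_size >= stripe_unit and object_size % stripe_unit == 0.
 */
static bool striper_layout_ok(uint32_t object_size, uint32_t stripe_unit,
                              uint32_t stripe_count)
{
    if (stripe_unit == 0 || stripe_count == 0)
        return false;
    if (object_size < stripe_unit)      /* the condition the assert enforces */
        return false;
    return object_size % stripe_unit == 0;
}
```

Under the assumed 4MB default object size, a stripe unit of 8388608 fails this check while 4194304 and 2097152 pass. libradosstriper.h also declares rados_striper_set_object_layout_object_size(), so raising the object size to at least the stripe unit before writing may be what is missing; I have not verified that against 12.2.0.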

On Fri, Sep 29, 2017 at 2:09 AM Alexander Kushnirenko <kushnirenko@xxxxxxxxx> wrote:
Hi,

I'm trying to use CEPH-12.2.0 as storage for Bareos-16.2.4 backups with libradosstriper1 support.

Libradosstriper was suggested on this list to solve the problem that current CEPH-12 discourages users from using very large objects (>128MB). Bareos treats a RADOS object as a volume, and with CEPH-10 it created very large objects (10G and more). CEPH-10 allowed such behaviour, but recovery did indeed take a very long time. So striping objects seems to be the right thing to do.

Bareos supports libradosstriper and the code seems to work. But for some reason it runs only with stripe_unit=4194304, which seems to be a typical value, for RadosGW for example. I tried several other values for stripe_unit, but the code exits with an error.

Is there a particular reason why only 4M size works?  Can one use some CLI to test different stripe sizes?

The basic flow for creating an object in Bareos is the following:
rados_ioctx_create(m_cluster, m_rados_poolname, &m_ctx);
rados_striper_create(m_ctx, &m_striper);
rados_striper_set_object_layout_stripe_unit(m_striper, m_stripe_unit);
rados_striper_set_object_layout_stripe_count(m_striper, m_stripe_count);
.....
status = rados_striper_write(m_striper, m_virtual_filename, buffer, count, offset);

Alexander

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <stdlib.h>   /* provides exit() and EXIT_FAILURE */
#include <rados/librados.h>
#include <radosstriper/libradosstriper.h>

int main (int argc, const char * argv[])
{


	/* Declare the cluster handle and required arguments. */
        rados_t cluster;
        char cluster_name[] = "ceph";
        char user_name[] = "client.admin";
        uint64_t flags = 0;   /* no flags; must be initialized before rados_create2() */
        rados_striper_t m_striper;
        
        
        /* Initialize the cluster handle with the "ceph" cluster name and the "client.admin" user */
        int err;
        err = rados_create2(&cluster, cluster_name, user_name, flags);

        if (err < 0) {
                fprintf(stderr, "%s: Couldn't create the cluster handle! %s\n", argv[0], strerror(-err));
                exit(EXIT_FAILURE);
        } else {
                printf("\nCreated a cluster handle.\n");
        }


        /* Read a Ceph configuration file to configure the cluster handle. */
        err = rados_conf_read_file(cluster, "/etc/ceph/ceph.conf");
        if (err < 0) {
                fprintf(stderr, "%s: cannot read config file: %s\n", argv[0], strerror(-err));
                exit(EXIT_FAILURE);
        } else {
                printf("\nRead the config file.\n");
        }

        /* Read command line arguments */
        err = rados_conf_parse_argv(cluster, argc, argv);
        if (err < 0) {
                fprintf(stderr, "%s: cannot parse command line arguments: %s\n", argv[0], strerror(-err));
                exit(EXIT_FAILURE);
        } else {
                printf("\nRead the command line arguments.\n");
        }

        /* Connect to the cluster */
        err = rados_connect(cluster);
        if (err < 0) {
                fprintf(stderr, "%s: cannot connect to cluster: %s\n", argv[0], strerror(-err));
                exit(EXIT_FAILURE);
        } else {
                printf("\nConnected to the cluster.\n");
        }

        /* Create an I/O context for the target pool. */
        rados_ioctx_t io;
        const char *poolname = "backup";

        err = rados_ioctx_create(cluster, poolname, &io);
        if (err < 0) {
                fprintf(stderr, "%s: cannot open rados pool %s: %s\n", argv[0], poolname, strerror(-err));
                rados_shutdown(cluster);
                exit(EXIT_FAILURE);
        } else {
                printf("\nCreated I/O context.\n");
        }


        err = rados_striper_create(io, &m_striper);
        if (err < 0) {
            fprintf (stderr, "Unable to create RADOS striper object for pool %s: ERR=%s\n", poolname, strerror(-err));
            rados_shutdown(cluster);
            exit(EXIT_FAILURE);
        } else {
            fprintf (stderr, "Connected to pool %s with rados_striper_create \n", poolname);
        }

        //int stripe_unit = 2097152;
        int stripe_unit = 4194304;
        //int stripe_unit = 16777216;
        //int stripe_unit = 8388608;
        err = rados_striper_set_object_layout_stripe_unit(m_striper, stripe_unit);
        if (err < 0) {
            fprintf (stderr, "Unable to set RADOS striper unit size to %d for pool %s: ERR=%s\n", stripe_unit, poolname, strerror(-err));
            rados_shutdown(cluster);
            exit(EXIT_FAILURE);
        } else {
            fprintf (stderr, "Stripe unit OK %d \n", stripe_unit); 
        }

        int stripe_count = 1;
        err = rados_striper_set_object_layout_stripe_count(m_striper, stripe_count);
        if (err < 0) {
            fprintf (stderr, "Unable to set RADOS striper stripe count to %d for pool %s: ERR=%s\n", stripe_count, poolname, strerror(-err));
            rados_shutdown(cluster);
            exit(EXIT_FAILURE);
        } else {
           fprintf (stderr, "Stripe count OK %d \n", stripe_count); 
        }
      



        /* Write data to the cluster synchronously. */
        char read_res[1048576];
        memset(read_res, 0xcc, 1048576);
        
        for (int i=0; i<2048; i++) {
            err = rados_striper_write(m_striper, "striper", read_res, 65536, i*65536);
            if (err < 0) {
                /* Report both the pool name and the error text. */
                fprintf(stderr, "Cannot write object striper to pool %s: %s\n", poolname, strerror(-err));
                rados_striper_destroy(m_striper);
                rados_ioctx_destroy(io);
                rados_shutdown(cluster);
                exit(EXIT_FAILURE);
            }
        }

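        /* Release the striper before the I/O context it was created from. */
        rados_striper_destroy(m_striper);
        rados_ioctx_destroy(io);
        rados_shutdown(cluster);

        return 0;
}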