Re: query about glusterfs 3.12-3 write-behind.c coredump


 



Hi,

 

Yes, thanks very much for your quick response.

 

 

I have attached the whole file; it is not very big.

 

 

Br,

Li Deqian

 

From: Raghavendra Gowdappa <rgowdapp@xxxxxxxxxx>
Sent: Wednesday, January 30, 2019 11:00 AM
To: Li, Deqian (NSB - CN/Hangzhou) <deqian.li@xxxxxxxxxxxxxxx>
Cc: gluster-users <gluster-users@xxxxxxxxxxx>
Subject: Re: query about glusterfs 3.12-3 write-behind.c coredump

 

 

 

On Wed, Jan 30, 2019 at 7:35 AM Li, Deqian (NSB - CN/Hangzhou) <deqian.li@xxxxxxxxxxxxxxx> wrote:

Hi,

 

Could you help check this coredump?

We are using glusterfs 3.12-3 (a solution with 3 replicated bricks) for stability testing under high CPU load (about 80%, generated with stress) while doing I/O.

After several hours, a coredump happened on the glusterfs side.

 

[Current thread is 1 (Thread 0x7ffff37d2700 (LWP 3696))]

Missing separate debuginfos, use: dnf debuginfo-install rcp-pack-glusterfs-1.8.1_11_g99e9ca6-RCP2.wf28.x86_64

(gdb) bt

#0  0x00007ffff0d5c845 in wb_fulfill (wb_inode=0x7fffd406b3b0, liabilities=0x7fffdc234b50) at write-behind.c:1148

#1  0x00007ffff0d5e4d5 in wb_process_queue (wb_inode=0x7fffd406b3b0) at write-behind.c:1718

#2  0x00007ffff0d5eda7 in wb_writev (frame=0x7fffe0086290, this=0x7fffec014b00, fd=0x7fffe4034070, vector=0x7fffdc445720, count=1, offset=67108863, flags=32770, iobref=0x7fffdc00d550, xdata=0x0)

    at write-behind.c:1825

#3  0x00007ffff0b51fcb in du_writev_resume (ret=0, frame=0x7fffdc0305a0, opaque=0x7fffdc0305a0) at disk-usage.c:490

#4  0x00007ffff7b3510d in synctask_wrap () at syncop.c:377

#5  0x00007ffff60d0660 in ?? () from /lib64/libc.so.6

#6  0x0000000000000000 in ?? ()

(gdb) p wb_inode

$1 = (wb_inode_t *) 0x7fffd406b3b0

(gdb) frame 2

#2  0x00007ffff0d5eda7 in wb_writev (frame=0x7fffe0086290, this=0x7fffec014b00, fd=0x7fffe4034070, vector=0x7fffdc445720, count=1, offset=67108863, flags=32770, iobref=0x7fffdc00d550, xdata=0x0)

    at write-behind.c:1825

1825         in write-behind.c

(gdb) p *fd

$2 = {pid = 18154, flags = 32962, refcount = 0, inode_list = {next = 0x7fffe4034080, prev = 0x7fffe4034080}, inode = 0x0, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0,

        __nusers = 0, __kind = -1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\377\377\377\377", '\000' <repeats 19 times>, __align = 0}},

  _ctx = 0x7fffe4022930, xl_count = 17, lk_ctx = 0x7fffe40350e0, anonymous = _gf_false}

(gdb) p fd

$3 = (fd_t *) 0x7fffe4034070

 

(gdb) p wb_inode->this

$1 = (xlator_t *) 0xffffffffffffff00

 

After adding test logs, I found that the FOP sequence on the write-behind xlator side was out of order, as shown below. On the FUSE side the FLUSH comes after write2, but on the WB side the FLUSH arrives between write2's wb_do_unwinds and wb_fulfill.

So I think this is a problem. It is possible that the FLUSH and the later RELEASE operation destroy the fd, which would leave wb_inode->this pointing to garbage (0xffffffffffffff00). Do you think so?

I also think the synctask_new call in our newly added disk-usage xlator delays the write operation, while the FLUSH operation is not delayed (because the disk-usage xlator is not invoked for it).

Do you agree with my speculation? And how can we fix it? (We don't want to move the disk-usage xlator.)
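
To make the suspected lifetime problem concrete, here is a minimal sketch of the general rule involved: an object that a deferred operation will touch stays valid only if that operation holds its own reference. This is a toy model, not GlusterFS code and not a diagnosis of the crash; toy_fd, toy_fd_ref, toy_fd_unref and deferred_write_sees_live_fd are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted handle; NOT the GlusterFS fd_t. */
typedef struct toy_fd {
    int  refcount;
    int *destroyed_flag;   /* set to 1 when the last reference is dropped */
} toy_fd;

static toy_fd *toy_fd_new(int *destroyed_flag)
{
    toy_fd *fd = calloc(1, sizeof(*fd));
    if (!fd)
        return NULL;
    fd->refcount = 1;                  /* the opener's reference */
    fd->destroyed_flag = destroyed_flag;
    *destroyed_flag = 0;
    return fd;
}

static toy_fd *toy_fd_ref(toy_fd *fd)
{
    fd->refcount++;
    return fd;
}

static void toy_fd_unref(toy_fd *fd)
{
    if (--fd->refcount == 0) {
        *fd->destroyed_flag = 1;
        free(fd);
    }
}

/*
 * Models "RELEASE arrives before the queued write runs".
 * Returns 1 if the handle is still alive when the write runs,
 * 0 if it was already destroyed, -1 on allocation failure.
 */
int deferred_write_sees_live_fd(int take_ref_before_deferring)
{
    int destroyed = 0;
    toy_fd *fd = toy_fd_new(&destroyed);
    if (!fd)
        return -1;

    if (take_ref_before_deferring)
        toy_fd_ref(fd);        /* the queued write owns a reference */

    toy_fd_unref(fd);          /* RELEASE drops the opener's reference */

    int alive = !destroyed;    /* checked via the flag, never via freed memory */
    if (alive)
        toy_fd_unref(fd);      /* the queued write finishes and drops its reference */
    return alive;
}
```

Calling deferred_write_sees_live_fd(0) models a queued write that holds no reference of its own, and deferred_write_sees_live_fd(1) models one that does. Whether any layer in the real graph is missing such a reference is exactly the open question in this thread.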

 

 

Problematic FOP sequence:

FUSE side:        WB side:

Write 1           write1
                  write2 do unwind
Write 2           FLUSH
                  Release (destroy fd)
FLUSH             write2 (wb_fulfill), then coredump
Release
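
The reordering described above can be reproduced in a toy model: if a layer defers only writes and passes every other fop straight through, a later RELEASE can reach the lower layer before an earlier write. The sketch below is an illustration under that assumption only; struct layer, submit, drain and release_overtakes_write are hypothetical names, and nothing here uses GlusterFS APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum op { OP_WRITE, OP_FLUSH, OP_RELEASE };

#define MAX_OPS 16

/* Hypothetical layer that may defer operations before passing them down. */
struct layer {
    enum op pending[MAX_OPS];    /* deferred, not yet delivered */
    size_t  npending;
    enum op delivered[MAX_OPS];  /* order seen by the lower layer */
    size_t  ndelivered;
    int     defer_all;           /* 0: only writes are deferred; 1: every op is */
};

static void submit(struct layer *l, enum op op)
{
    if (op == OP_WRITE || l->defer_all) {
        if (l->npending < MAX_OPS)
            l->pending[l->npending++] = op;
    } else if (l->ndelivered < MAX_OPS) {
        l->delivered[l->ndelivered++] = op;   /* passes straight through */
    }
}

/* Deliver all deferred operations in FIFO order. */
static void drain(struct layer *l)
{
    for (size_t i = 0; i < l->npending && l->ndelivered < MAX_OPS; i++)
        l->delivered[l->ndelivered++] = l->pending[i];
    l->npending = 0;
}

static int index_of(const struct layer *l, enum op op)
{
    for (size_t i = 0; i < l->ndelivered; i++)
        if (l->delivered[i] == op)
            return (int)i;
    return -1;
}

/* Returns 1 if RELEASE reaches the lower layer before WRITE, else 0. */
int release_overtakes_write(int defer_all)
{
    struct layer l;
    memset(&l, 0, sizeof(l));
    l.defer_all = defer_all;

    submit(&l, OP_WRITE);     /* issued first by the application */
    submit(&l, OP_FLUSH);
    submit(&l, OP_RELEASE);
    drain(&l);                /* the deferred work finally runs */

    return index_of(&l, OP_RELEASE) < index_of(&l, OP_WRITE);
}
```

With defer_all set to 0 the function returns 1 (RELEASE overtakes the write); with defer_all set to 1 it returns 0 because every fop goes through the same queue. Routing all fops through one ordered path is only one possible way to preserve ordering, and the thread does not establish that it is the right fix here.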

 

 

int

wb_fulfill (wb_inode_t *wb_inode, list_head_t *liabilities)

{

         wb_request_t  *req     = NULL;

         wb_request_t  *head    = NULL;

         wb_request_t  *tmp     = NULL;

         wb_conf_t     *conf    = NULL;

         off_t          expected_offset = 0;

         size_t         curr_aggregate = 0;

         size_t         vector_count = 0;

        int            ret          = 0;

 

         conf = wb_inode->this->private;   /* <-- this line coredumps */

 

         list_for_each_entry_safe (req, tmp, liabilities, winds) {

                  list_del_init (&req->winds);

 

.

 

 

volume ccs-write-behind
    type performance/write-behind
    subvolumes ccs-dht
end-volume

volume ccs-disk-usage        <-- we added a new xlator here for write ops, just to check whether the disk is full; it uses synctask_new for writes
    type performance/disk-usage
    subvolumes ccs-write-behind
end-volume

volume ccs-read-ahead
    type performance/read-ahead
    subvolumes ccs-disk-usage
end-volume
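
For readers unfamiliar with what a "disk is full" check involves, here is a minimal sketch using POSIX statvfs(). It is not the disk-usage xlator's code: avail_percent and disk_is_full are hypothetical names, and the threshold is assumed to be a percentage of free space.

```c
#include <assert.h>
#include <sys/statvfs.h>

/* Percentage of blocks available to unprivileged users, or -1.0 on error. */
double avail_percent(const char *path)
{
    struct statvfs st;

    if (statvfs(path, &st) != 0 || st.f_blocks == 0)
        return -1.0;
    /* 100.0 forces floating-point division so the fraction is kept. */
    return (st.f_bavail * 100.0) / (double)st.f_blocks;
}

/*
 * Returns 1 if free space is below min_free_percent, 0 if it is not,
 * and -1 if the filesystem could not be queried.
 */
int disk_is_full(const char *path, double min_free_percent)
{
    double avail = avail_percent(path);

    if (avail < 0.0)
        return -1;
    return avail < min_free_percent;
}
```

A caller would refuse a write with ENOSPC when disk_is_full() returns 1, and decide separately how to treat the "unknown" result of -1.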

 

 

 

P.S. Part of our new translator code:

 

int

du_writev (call_frame_t *frame, xlator_t *this, fd_t *fd,

            struct iovec *vector, int count, off_t off, uint32_t flags,

            struct iobref *iobref, dict_t *xdata)

{

    int           op_errno = -1;

    int           ret = -1;

    du_local_t  *local = NULL;

    loc_t          tmp_loc      = {0,};

 

    VALIDATE_OR_GOTO (frame, err);

    VALIDATE_OR_GOTO (this, err);

    VALIDATE_OR_GOTO (fd, err);

 

    tmp_loc.gfid[15] = 1;

    tmp_loc.inode = fd->inode;

    tmp_loc.parent = fd->inode;

    local = du_local_init (frame, &tmp_loc, fd, GF_FOP_WRITE);

    if (!local) {

 

            op_errno = ENOMEM;

            goto err;

    }

    local->vector = iov_dup (vector, count);

    local->offset = off;

    local->count = count;

    local->flags = flags;

    local->iobref = iobref_ref (iobref);

   

    ret = synctask_new(this->ctx->env, du_get_du_info,du_writev_resume,frame,frame);

 

Can you paste the code of,

* du_get_du_info

* du_writev_resume

 

    if(ret)

    {

            op_errno = -1;

            gf_log (this->name, GF_LOG_WARNING,"synctask_new return failure ret(%d)  ",ret);

            goto err;

    }

    return 0;

err:

    op_errno = (op_errno == -1) ? errno : op_errno;

    DU_STACK_UNWIND (writev, frame, -1, op_errno, NULL, NULL, NULL);

    return 0;

}

 

Br,

Li Deqian

#ifndef _CONFIG_H
#define _CONFIG_H
#include "config.h"
#endif
#include "glusterfs.h"
#include "xlator.h"
#include "defaults.h"
#include <sys/time.h>
#include "disk-usage.h"
#include "disk-usage-mem-types.h"
#include "statedump.h"
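#include <unistd.h>
#include <math.h>   /* fabs() */
#define DOUBLE_ZERO 1e-10
/* Use fabs(): abs() takes an int and would truncate a double argument. */
#define IS_DOUBLE_ZERO(d) (fabs(d) < DOUBLE_ZERO)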

static gf_boolean_t
du_is_loc_filled (xlator_t *this)
{

    du_conf_t *conf = NULL;
    gf_boolean_t subvol_filled_inodes = _gf_false;
    gf_boolean_t subvol_filled_space = _gf_false;
    gf_boolean_t is_subvol_filled = _gf_false;

    conf = this->private;

    /* Check for values above specified percent or free disk */
    LOCK (&conf->subvolume_lock);
    {
        if (conf->disk_unit == 'p') 
        {
            if(conf->du_stats->avail_percent < 0)
            {
                gf_log (this->name, GF_LOG_TRACE,"subvol avail percent is initialized value");
            }
            else if(conf->du_stats->avail_percent <conf->min_free_disk)
            {
                subvol_filled_space = _gf_true;
            }
        }else
        {
            /* Note: avail_space is logged with PRIu64 elsewhere; if it is
             * unsigned, this test can never be true. */
            if(conf->du_stats->avail_space <0)
            {
                gf_log (this->name, GF_LOG_TRACE,"subvol avail space is initialized value");
            }
            else if (conf->du_stats->avail_space <conf->min_free_disk)
            {
                subvol_filled_space = _gf_true;
            }
        }
        if ((conf->du_stats->avail_inodes > 0.0f) && (conf->du_stats->avail_inodes <conf->min_free_inodes))
        {
            subvol_filled_inodes = _gf_true;
        }
    }
    UNLOCK (&conf->subvolume_lock);
    
    if (subvol_filled_space )
    {
        if (!(conf->du_stats->log++ % (GF_UNIVERSAL_ANSWER * 10)))
        {
            gf_log (this->name, GF_LOG_WARNING,
                "disk space on subvolume is getting "
                "full (%.2f %%), consider adding more nodes",
                (100 - conf->du_stats->avail_percent));
        }
    }

    if (subvol_filled_inodes )
    {
        if (!(conf->du_stats->log++ % (GF_UNIVERSAL_ANSWER * 10)))
        {
            gf_log (this->name, GF_LOG_CRITICAL,
                "inodes on subvolume are at "
                "(%.2f %%), consider adding more nodes",
                (100 - conf->du_stats->avail_inodes));
        }
    }
    is_subvol_filled = (subvol_filled_space || subvol_filled_inodes);
    return is_subvol_filled;
}

static int du_update_stats( struct statvfs *statvfs,du_conf_t      *conf)
{
    double         percent = 0;
    double         percent_inodes = 0;
    uint64_t       bytes = 0;
    uint32_t       bpc;     /* blocks per chunk */
    uint32_t       chunks   = 0;
    
    if (statvfs && statvfs->f_blocks)
    {
        /* 100.0 forces floating-point division; integer division would drop the fraction. */
        percent = (statvfs->f_bavail * 100.0) / statvfs->f_blocks;
        bytes = (statvfs->f_bavail * statvfs->f_frsize);
        /*
         * A 32-bit count of 1MB chunks allows a maximum brick size of
         * ~4PB.  It's possible that we could see a single local FS
         * bigger than that some day, but this code is likely to be
         * irrelevant by then.  Meanwhile, it's more important to keep
         * the chunk size small so the layout-calculation code that
         * uses this value can be tested on normal machines.
         */
        bpc = (1 << 20) / statvfs->f_bsize;
        chunks = (statvfs->f_blocks + bpc - 1) / bpc;
    }

    if (statvfs && statvfs->f_files)
    {
        /* 100.0 forces floating-point division; integer division would drop the fraction. */
        percent_inodes = (statvfs->f_ffree * 100.0) / statvfs->f_files;
    }
    else
    {
    /*
     * Set percent inodes to 100 for dynamically allocated inode
     * filesystems. The rationale is that distribute need not
     * worry about total inodes; rather, let the 'create()' be
     * scheduled on the hashed subvol regardless of the total
     * inodes.
     */
        percent_inodes = 100;
    }
    
    LOCK (&conf->subvolume_lock);
    {
        conf->du_stats->avail_percent = percent;
        conf->du_stats->avail_space   = bytes;
        conf->du_stats->avail_inodes  = percent_inodes;
        conf->du_stats->chunks        = chunks;
        

        gf_log ("du_update_stats", GF_LOG_TRACE,
                  "avail_percent "
                  "is: %.2f and avail_space "
                  "is: %" PRIu64" and avail_inodes"
                  " is: %.2f,",
                  conf->du_stats->avail_percent,
                  conf->du_stats->avail_space,
                  conf->du_stats->avail_inodes);
    }
    UNLOCK (&conf->subvolume_lock);
    
    return 0;
}

static
gf_boolean_t is_refresh_interval_arrived(xlator_t *this,struct timeval tv)
{
    du_conf_t      *conf         = this->private;
    if(tv.tv_sec !=  conf->last_stat_fetch.tv_sec)
        return _gf_true;
    if( (tv.tv_sec == conf->last_stat_fetch.tv_sec) && (tv.tv_usec - conf->last_stat_fetch.tv_usec) > conf->refresh_interval)
        return _gf_true;
    return _gf_false;
}

static int
du_get_du_info (void * data)
{
    int            ret          = -1;
    du_conf_t      *conf         = NULL;
    call_frame_t * frame = data;
    du_local_t     *local = NULL;
    loc_t loc ={0,};
    struct timeval tv           = {0,};
    loc_t          tmp_loc      = {0,};
    xlator_t *this = THIS;
    struct statvfs  dst_statfs = {0,};
    VALIDATE_OR_GOTO (frame, err);
    VALIDATE_OR_GOTO (this, err);
    /* Dereference frame and this only after they have been validated. */
    conf  = this->private;
    local = frame->local;
    VALIDATE_OR_GOTO (conf, err);
    VALIDATE_OR_GOTO (local, err);

    loc = local->loc;
    
    /* make it root gfid, should be enough to get the proper info back */
    tmp_loc.gfid[15] = 1;
    tmp_loc.inode=loc.parent;
    gettimeofday(&tv,NULL);
    if ( is_refresh_interval_arrived(this,tv))
    {
        conf->last_stat_fetch.tv_sec = tv.tv_sec;
        conf->last_stat_fetch.tv_usec = tv.tv_usec;
        /* this is sync op which means when this function returns, statfs result is already in dst_statfs*/
        ret = syncop_statfs (FIRST_CHILD(this), &tmp_loc, &dst_statfs,NULL,NULL);
        if (ret) {
                gf_log (this->name, GF_LOG_ERROR,
                        "failed to get statfs of %s on %s (%s)",
                        loc.path, this->name, strerror (-ret));
                ret = -1;
                goto err;
        }
        du_update_stats(&dst_statfs,conf);
    }
    return 0;
err:
    /* this may be NULL if its validation above failed. */
    gf_log (this ? this->name : "disk-usage", GF_LOG_WARNING, "failed to get disk usage info!");
    return -1;
}

void
du_local_wipe (xlator_t *this, du_local_t *local)
{
    if (!local)
            return;
    loc_wipe (&local->loc);
    if (local->fd) {
            fd_unref (local->fd);
            local->fd = NULL;
    }
    if (local->params) {
            dict_unref (local->params);
            local->params = NULL;
    }
    GF_FREE (local->vector);

    if (local->iobref)
            iobref_unref (local->iobref);
    
    if (local->xdata) {
            dict_unref (local->xdata);
            local->xdata = NULL;
    }
    mem_put (local);
    
}

du_local_t *
du_local_init (call_frame_t *frame, loc_t *loc, fd_t *fd, glusterfs_fop_t fop)
{
    du_local_t *local = NULL;
    inode_t     *inode = NULL;
    int          ret   = 0;
    
    local = mem_get0 (THIS->local_pool);
    if (!local)
    {
         gf_log ("disk-usage", GF_LOG_WARNING,"mem_get local failure!"); 
          goto out;
    }
    if (loc)
    {
            ret = loc_copy (&local->loc, loc);
            if (ret)
            {
               gf_log ("disk-usage", GF_LOG_WARNING,"loc_copy failure!"); 
               goto out;
            }
            inode = loc->inode;
    }

    if (fd)
    {
            local->fd = fd_ref (fd);
            if (!inode)
            {
                inode = fd->inode;
            }
    }

    local->op_ret   = -1;
    local->op_errno = EUCLEAN;
    local->fop      = fop;
    frame->local = local;
out:
    if (ret) {
            if (local)
                    mem_put (local);
            local = NULL;
    }
    return local;
}

int du_change_statfs(xlator_t *this,struct statvfs *statvfs)
{
    du_conf_t  *conf    = this->private;
    double      percent = 0;
    uint64_t    bytes   = 0;

    /* statvfs may be NULL when the statfs fop failed; nothing to adjust then. */
    if (!statvfs)
        return 0;

    if (statvfs->f_blocks)
    {
        percent = (statvfs->f_bavail * 100.0) / statvfs->f_blocks;
        bytes = (statvfs->f_bavail * statvfs->f_frsize);
    }
    if (conf->disk_unit == 'p')
    {
        if (conf->min_free_disk >= percent)
        {
            statvfs->f_bavail = 0;
            gf_log (this->name, GF_LOG_TRACE,"modify statfs result f_statvfs->f_bavail to zero");
        }
    }
    else
    {
        if (conf->min_free_disk >= bytes)
        {
            statvfs->f_bavail = 0;
            gf_log (this->name, GF_LOG_TRACE,"modify statfs result f_statvfs->f_bavail to zero");
        }
        else
        {
            /* Hide the reserved space from the caller. */
            if (statvfs->f_bavail - (conf->min_free_disk / statvfs->f_frsize) > 0)
                statvfs->f_bavail = statvfs->f_bavail - (conf->min_free_disk / statvfs->f_frsize);
            else
                statvfs->f_bavail = 0;
        }
    }
    return 0;
}

static int
du_statfs_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                int op_ret, int op_errno, struct statvfs *statvfs,
                dict_t *xdata)
{
     du_change_statfs(this,statvfs);
     DU_STACK_UNWIND(statfs, frame, op_ret, op_errno,
                                     statvfs, xdata);
     return 0;
}

int
du_statfs (call_frame_t *frame, xlator_t *this, loc_t *loc, dict_t *xdata)
{
    int           op_errno = -1;
    du_local_t *local = NULL;
    
    VALIDATE_OR_GOTO (frame, err);
    VALIDATE_OR_GOTO (this, err);
    VALIDATE_OR_GOTO (loc, err);
    VALIDATE_OR_GOTO (this->private, err);
    local = du_local_init (frame, loc, NULL, GF_FOP_STATFS);
    if (!local) {
            op_errno = ENOMEM;
            gf_log (this->name, GF_LOG_WARNING,"allocation local failure  ");
            goto err;
    }

    STACK_WIND (frame, du_statfs_cbk,
                FIRST_CHILD(this),
                FIRST_CHILD(this)->fops->statfs,
                loc, xdata);
     return 0;
err:
    op_errno = (op_errno == -1) ? errno : op_errno;
    DU_STACK_UNWIND (statfs, frame, -1, op_errno, NULL, NULL);
    return 0;
}

static int
du_mkdir_cbk (call_frame_t *frame, void *cookie,
                      xlator_t *this, int op_ret, int op_errno,
                      inode_t *inode, struct iatt *stbuf,
                      struct iatt *preparent, struct iatt *postparent,
                      dict_t *xdata)
{
    if (op_ret != -1)
        DU_STACK_UNWIND (mkdir, frame, op_ret, op_errno, inode, stbuf, preparent,
                         postparent, NULL);
    else
        DU_STACK_UNWIND (mkdir, frame, -1, op_errno, NULL, NULL, NULL,
                          NULL, NULL);
    return 0;
}

 int
du_mkdir_resume (int ret, call_frame_t *frame, void *opaque)
{
    int         op_errno = -1;
    du_local_t     *local = NULL;
    loc_t loc ={0,};
    xlator_t *this = NULL;   
    mode_t mode;
    mode_t umask;
    dict_t *params;
    VALIDATE_OR_GOTO (frame, err);
    this = frame->this;
    VALIDATE_OR_GOTO (this, err);
    local = frame->local;
    VALIDATE_OR_GOTO (local, err);
    loc = local->loc;

    mode = local->mode;
    umask= local->umask;
    params=local->params;

    if (!du_is_loc_filled (this))
    {
        STACK_WIND (frame, du_mkdir_cbk,
                FIRST_CHILD(this),
                FIRST_CHILD(this)->fops->mkdir,
                &loc, mode, umask, params);
    }
    else
    {
        gf_log (this->name, GF_LOG_WARNING,"%s: disk space is full",loc.path);
        op_errno=ENOSPC;
        goto err;
    }
    return 0;
err:
    op_errno = (op_errno == -1) ? errno : op_errno;
    DU_STACK_UNWIND (mkdir, frame, -1, op_errno, NULL, NULL, NULL,
                          NULL, NULL);
    return op_errno;
}

int
du_mkdir (call_frame_t *frame, xlator_t *this,
           loc_t *loc, mode_t mode, mode_t umask, dict_t *params)
{
    du_local_t  *local  = NULL;
    int           op_errno = -1;
    int           ret = -1;

    VALIDATE_OR_GOTO (frame, err);
    VALIDATE_OR_GOTO (this, err);
    VALIDATE_OR_GOTO (loc, err);
    VALIDATE_OR_GOTO (loc->inode, err);
    VALIDATE_OR_GOTO (loc->path, err);
    VALIDATE_OR_GOTO (this->private, err);

    local = du_local_init (frame, loc, NULL, GF_FOP_MKDIR);
    if (!local) {
            op_errno = ENOMEM;
            goto err;
    }

    local->mode = mode;
    local->umask = umask;
    local->params=  dict_ref (params);
    ret = synctask_new(this->ctx->env, du_get_du_info,du_mkdir_resume,frame,frame);
    if(ret)
    {
            op_errno = -1;
            gf_log (this->name, GF_LOG_WARNING,"synctask_new return failure ret(%d)  ",ret);
            goto err;
    }
    return 0;
err:
    op_errno = (op_errno == -1) ? errno : op_errno;
    DU_STACK_UNWIND (mkdir, frame, -1, op_errno, NULL, NULL, NULL,
                     NULL, NULL);
    return op_errno;
}

static int
du_writev_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                int op_ret, int op_errno, struct iatt *prebuf,
                struct iatt *postbuf, dict_t *xdata)
{
    DU_STACK_UNWIND (writev, frame, op_ret, op_errno, prebuf, postbuf,
                     xdata);
    return 0;
}


int
du_writev_resume (int ret, call_frame_t *frame, void *opaque)
{
    int         op_errno = -1;
    du_local_t     *local = NULL;
    xlator_t *this = NULL;   

    fd_t *fd=NULL;
    struct iovec *vector = NULL;
    off_t off = 0;
    uint32_t flags = 0;
    struct iobref *iobref = NULL;
    dict_t *xdata =NULL;
    int count = 0;
    VALIDATE_OR_GOTO (frame, err);
    this = frame->this;
    VALIDATE_OR_GOTO (this, err);
    local = frame->local;
    VALIDATE_OR_GOTO (local, err);

     flags=local->flags;
     fd = local->fd;
     off = local->offset;
     xdata = local->xdata;
     iobref = local->iobref;
     vector = local->vector;
     count = local->count;
     

    if (!du_is_loc_filled (this))
    {
        STACK_WIND (frame, du_writev_cbk,
                FIRST_CHILD(this), FIRST_CHILD(this)->fops->writev,
                fd, vector, count, off, flags, iobref, xdata);
    }
    else
    {
        gf_log (this->name, GF_LOG_WARNING," disk space is full");
        op_errno=ENOSPC;
        goto err;
    }
    return 0;
err:
    op_errno = (op_errno == -1) ? errno : op_errno;
    DU_STACK_UNWIND (writev, frame, -1, op_errno, NULL, NULL, NULL);
    return op_errno;
}

int
du_writev (call_frame_t *frame, xlator_t *this, fd_t *fd,
            struct iovec *vector, int count, off_t off, uint32_t flags,
            struct iobref *iobref, dict_t *xdata)
{
    int           op_errno = -1;
    int           ret = -1;
    du_local_t  *local = NULL;
    loc_t          tmp_loc      = {0,};

    VALIDATE_OR_GOTO (frame, err);
    VALIDATE_OR_GOTO (this, err);
    VALIDATE_OR_GOTO (fd, err);

    tmp_loc.gfid[15] = 1;
    tmp_loc.inode = fd->inode;
    tmp_loc.parent = fd->inode;
    local = du_local_init (frame, &tmp_loc, fd, GF_FOP_WRITE);
    if (!local) {

            op_errno = ENOMEM;
            goto err;
    }
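    local->vector = iov_dup (vector, count);
    if (!local->vector) {
            /* iov_dup allocates; do not wind a NULL vector later. */
            op_errno = ENOMEM;
            goto err;
    }
    local->offset = off;
    local->count = count;
    local->flags = flags;
    local->iobref = iobref_ref (iobref);
    /* Keep xdata for du_writev_resume, which reads local->xdata. */
    if (xdata)
            local->xdata = dict_ref (xdata);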
    
    ret = synctask_new(this->ctx->env, du_get_du_info,du_writev_resume,frame,frame);
    if(ret)
    {
            op_errno = -1;
            gf_log (this->name, GF_LOG_WARNING,"synctask_new return failure ret(%d)  ",ret);
            goto err;
    }
    return 0;
err:
    op_errno = (op_errno == -1) ? errno : op_errno;
    DU_STACK_UNWIND (writev, frame, -1, op_errno, NULL, NULL, NULL);
    return 0;
}

static int
du_create_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                int op_ret, int op_errno,
                fd_t *fd, inode_t *inode, struct iatt *stbuf,
                struct iatt *preparent, struct iatt *postparent, dict_t *xdata)
{
    DU_STACK_UNWIND(create, frame, op_ret, op_errno, fd, inode, stbuf, preparent,
                    postparent, xdata);
    return 0;
}


int
du_create_resume (int ret, call_frame_t *frame, void *opaque)
{
    int         op_errno = -1;
    du_local_t     *local = NULL;
    loc_t loc ={0,};
    xlator_t *this = NULL;   
    int32_t flags;
    mode_t mode;
    mode_t umask;
    fd_t *fd=NULL;
    dict_t *params;
    VALIDATE_OR_GOTO (frame, err);
    this = frame->this;
    VALIDATE_OR_GOTO (this, err);
    local = frame->local;
    VALIDATE_OR_GOTO (local, err);
    loc = local->loc;

     flags=local->flags;
     mode = local->mode;
     umask= local->umask;
     fd = local->fd;
     params=local->params;

     if (!du_is_loc_filled (this))
    {
           STACK_WIND (frame, du_create_cbk,
                       FIRST_CHILD(this), FIRST_CHILD(this)->fops->create,
                       &loc, flags, mode, umask, fd, params);
    }
    else
    {
        gf_log (this->name, GF_LOG_WARNING,"%s: disk space is full",loc.path);
        op_errno=ENOSPC;
        goto err;
    }
    return 0;
err:
    op_errno = (op_errno == -1) ? errno : op_errno;
    DU_STACK_UNWIND (create, frame, -1, op_errno, NULL, NULL, NULL,
                     NULL, NULL, NULL);
    return op_errno;
}



int
du_create (call_frame_t *frame, xlator_t *this,
            loc_t *loc, int32_t flags, mode_t mode,
            mode_t umask, fd_t *fd, dict_t *params)
{
    int         op_errno = -1;
    du_local_t *local = NULL;
    int ret = -1;
    VALIDATE_OR_GOTO (frame, err);
    VALIDATE_OR_GOTO (this, err);
    VALIDATE_OR_GOTO (loc, err);
    local = du_local_init (frame, loc, fd, GF_FOP_CREATE);
    if (!local) {
            op_errno = ENOMEM;
            gf_log (this->name, GF_LOG_WARNING,"allocation local failure  ");
            goto err;
    }
    local->flags = flags;
    local->mode = mode;
    local->umask = umask;
    local->params=  dict_ref (params);
    ret = synctask_new(this->ctx->env, du_get_du_info,du_create_resume,frame,frame);
    if(ret)
    {
            op_errno = -1;
            gf_log (this->name, GF_LOG_WARNING,"create synctask return failure ret(%d)  ",ret);
            goto err;
    }
    return 0;
err:
    op_errno = (op_errno == -1) ? errno : op_errno;
    DU_STACK_UNWIND (create, frame, -1, op_errno, NULL, NULL, NULL,
                     NULL, NULL, NULL);
    return op_errno;
}

int
reconfigure (xlator_t *this, dict_t *options)
{
    du_conf_t      *conf = NULL;

    GF_VALIDATE_OR_GOTO ("du", this, out);
    GF_VALIDATE_OR_GOTO ("du", options, out);
    gf_log (this->name, GF_LOG_TRACE,"reconfigure disk usage parameters!");
    conf = this->private;
    if (!conf)
        return 0;

    GF_OPTION_RECONF ("min-free-disk", conf->min_free_disk, options,
                      percent_or_size, out);
    /* option can be any one of percent or bytes */
    conf->disk_unit = 0;
    if (conf->min_free_disk < 100.0)
            conf->disk_unit = 'p';
    GF_OPTION_RECONF ("min-free-inodes", conf->min_free_inodes, options,
                       percent, out);
out:
        return 0;
}

int32_t
mem_acct_init (xlator_t *this)
{
    int     ret = -1;

    GF_VALIDATE_OR_GOTO ("du", this, out);
    ret = xlator_mem_acct_init (this, gf_du_mt_end + 1);

    if (ret != 0) {
            gf_log (this->name, GF_LOG_ERROR, "Memory accounting init "
                    "failed");
            return ret;
    }
out:
    return ret;
}

int
init(xlator_t * this)
{
    du_conf_t                      *conf           = NULL;
    GF_VALIDATE_OR_GOTO ("du", this, err);
    if (!this->children) {
            gf_log(this->name, GF_LOG_CRITICAL,
                    "DiskUsage needs at least one subvolume");
            return -1;
    }
    if (!this->parents)
    {
        gf_log(this->name, GF_LOG_WARNING,
                "dangling volume. check volfile");
    }
    conf = GF_CALLOC (1, sizeof (*conf),gf_du_mt_du_conf_t);
    if (!conf)
    {
        gf_log(this->name, GF_LOG_WARNING,
                "du_conf allocation failure!");
        goto err;
    }
    this->local_pool = mem_pool_new (du_local_t, 512);
    if (!this->local_pool) {
        gf_log(this->name, GF_LOG_ERROR,
                " DU initialisation failed. "
                "failed to create local_t's memory pool");
        goto err;
    }

    GF_OPTION_INIT ("min-free-disk", conf->min_free_disk, percent_or_size,err);
    /* option can be any one of percent or bytes */
    conf->disk_unit = 0;
    if (conf->min_free_disk < 100)
        conf->disk_unit = 'p';
    conf->refresh_interval = 300000;
    conf->du_stats = GF_CALLOC (1, sizeof (du_info_t), gf_du_mt_du_stat_t);
    if (!conf->du_stats) {
        gf_log(this->name, GF_LOG_WARNING,
                "du_stats allocation failure!");
            goto err;
    }
    conf->du_stats->avail_percent = -1.0f;
    conf->du_stats->avail_inodes = -1.0f;
    GF_OPTION_INIT ("min-free-inodes", conf->min_free_inodes, percent, err);
    LOCK_INIT (&conf->subvolume_lock);
    this->private = conf;
    return 0;
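err:
    /* Do not leak the local pool if a later init step failed. */
    if (this && this->local_pool)
    {
        mem_pool_destroy (this->local_pool);
        this->local_pool = NULL;
    }
    if (conf)
    {
        GF_FREE (conf->du_stats);
        GF_FREE (conf);
    }
    return -1;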
}


int32_t
du_priv_dump (xlator_t *this)
{
    char            key_prefix[GF_DUMP_MAX_BUF_LEN];
    char            key[GF_DUMP_MAX_BUF_LEN];
    du_conf_t      *conf = NULL;
    int             ret = -1;

    if (!this)
            goto out;

    conf = this->private;
    if (!conf)
            goto out;

    ret = TRY_LOCK(&conf->subvolume_lock);
    if (ret != 0) {
            return ret;
    }
    gf_proc_dump_add_section("xlator.performance.%s.priv", this->name);
    gf_proc_dump_build_key(key_prefix,"xlator.performance.disk-usage","%s.priv",this->name);
    gf_proc_dump_write("gen", "%d", conf->gen);
    gf_proc_dump_write("min_free_disk", "%lf", conf->min_free_disk);
    gf_proc_dump_write("min_free_inodes", "%lf", conf->min_free_inodes);
    gf_proc_dump_write("disk_unit", "%c", conf->disk_unit);
    gf_proc_dump_write("refresh_interval", "%d", conf->refresh_interval);
    if (conf->du_stats)
    {
        snprintf (key, sizeof (key),"du_stats.avail_percent");
        gf_proc_dump_write (key, "%lf",conf->du_stats->avail_percent);
        snprintf (key, sizeof (key), "du_stats.avail_space");
        gf_proc_dump_write (key, "%lu",conf->du_stats->avail_space);
        snprintf (key, sizeof (key),"du_stats.avail_inodes");
        gf_proc_dump_write (key, "%lf",conf->du_stats->avail_inodes);
        snprintf (key, sizeof (key), "du_stats.log");
        gf_proc_dump_write (key, "%lu",conf->du_stats->log);
    }
    if (conf->last_stat_fetch.tv_sec)
            gf_proc_dump_write("last_stat_fetch", "%s",ctime(&conf->last_stat_fetch.tv_sec));
    UNLOCK(&conf->subvolume_lock);
out:
        return ret;
}
void
fini (xlator_t *this)
{
    du_conf_t *conf = NULL;
    GF_VALIDATE_OR_GOTO ("du", this, out);
    conf = this->private;
    this->private = NULL;
    if (conf) 
    {
        GF_FREE (conf->du_stats);
        GF_FREE (conf);
    }
out:
    return;
}


struct xlator_fops fops = {
        .create      = du_create,
        .mkdir       = du_mkdir,
        .writev      = du_writev,
        .statfs      = du_statfs,
};


struct xlator_cbks cbks = {

};

struct xlator_dumpops dumpops = {
    .priv = du_priv_dump
};

struct volume_options options[] = {
        { .key  = {"min-free-disk"},
          .type = GF_OPTION_TYPE_PERCENT_OR_SIZET,
          .default_value = "104857600",
          .description = "Percentage/Size of free disk space below which "
          "glusterfs will refuse create/mkdir/write on the volume and log "
          "warnings in the log files",
        },
        { .key  = {"min-free-inodes"},
          .type = GF_OPTION_TYPE_PERCENT,
          .default_value = "5%",
          .description = "once only N% of inodes remain free, warnings "
          "start to appear in the log files",
        },
        { .key  = {NULL} },
};





