Various file lengths while uploading the same file

On Mon, May 19, 2014 at 12:19 AM, Arthur Tumanyan
<arthurtumanyan at gmail.com> wrote:
> Hi,
> I'm writing a FastCGI application that just reads data from stdin and
> appends it to Ceph.
> Please have a look:
>
>     while (0 != (written = fread(buffer, sizeof (char), sizeof buffer, stdin))) {
>         if (written < 0) {
>             logger("Error: %s\n", strerror(errno));
>             break;
>         }
>         err = rados_aio_append(io, filename, comp, (const char *) buffer, written);
>         if (err < 0) {
>             logger("Could not schedule aio append: %s\n", strerror(-err));
>             rados_aio_release(comp);
>             rados_ioctx_destroy(io);
>             rados_shutdown(cluster);
>             break;
>         } else {
>             if (len >= (100 * MB)) {
>                 if (0 == (i % (50 * MB)) && i != 0) {
>                     logger("Uploaded %d bytes", i);
>                 }
>             }
>             i += written;
>         }
>     }
>
> I test like this:
> curl -XPUT -T guide.pdf "http://95.211.192.129:81/upload?pics|guide.pdf" -i -v
>
> In most cases everything works fine, but sometimes, for big files, Ceph or
> the application corrupts the file, and the same file can end up with
> different sizes:
>
> ceph1 07:14:46 UploadToCeph # rados -p pics stat 1024.img
> pics/1024.img mtime 1400335784, size 1019215872
>
> As you can see, the size is smaller than it should be. Can anyone point out
> what's wrong, please?

In your code here you don't wait for the aio completions, so you don't
know whether the RADOS requests even succeeded. A useful tool is the
Ceph messenger debug log, which will help you see what's actually
going to the OSDs and the return status of each request. You can turn
it on by setting 'debug ms = 1'.
I'd also replace the append operation with a write to a specific
offset. It'll help with debugging, and you'd avoid any potential
ordering issues.
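
As a rough sketch of the two points above (confirm each write before
moving on, and write at an explicit offset rather than appending), the
read loop could look something like the C below. The names chunk_writer,
upload_stream, mem_sink and mem_sink_write are hypothetical and not part
of librados. In a librados-backed writer, the callback would presumably
call rados_aio_write() with the given offset, then
rados_aio_wait_for_complete() and rados_aio_get_return_value() on the
completion before returning. The sketch also uses ferror() to detect read
errors, since fread() returns a size_t and a 'written < 0' check can
never fire.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical callback: write len bytes at byte offset off and block until
 * the store has acknowledged them. Returns 0 or a negative errno value. */
typedef int (*chunk_writer)(void *ctx, const char *buf, size_t len, uint64_t off);

/* Copy everything from `in` to the store in chunk-sized writes at explicit
 * offsets. Returns the number of bytes stored, or a negative errno value. */
int64_t upload_stream(FILE *in, chunk_writer write_chunk, void *ctx)
{
    char buffer[64 * 1024];
    uint64_t off = 0;
    size_t n;

    assert(in != NULL && write_chunk != NULL);

    /* fread returns size_t, so it is never negative; a zero count means
     * EOF or error, and ferror() tells the two apart afterwards. */
    while ((n = fread(buffer, 1, sizeof buffer, in)) > 0) {
        int err = write_chunk(ctx, buffer, n, off);
        if (err < 0)
            return err;      /* stop at the first failed write */
        off += n;            /* advance only after a confirmed write */
    }
    if (ferror(in))
        return -EIO;         /* assumption: report any read error as EIO */
    return (int64_t)off;
}

/* Stand-in writer for trying the loop without a cluster: it only checks
 * that chunks arrive at contiguous offsets and counts the bytes. */
struct mem_sink { uint64_t next_off; };

int mem_sink_write(void *ctx, const char *buf, size_t len, uint64_t off)
{
    struct mem_sink *s = ctx;
    (void)buf;
    if (off != s->next_off)
        return -EINVAL;
    s->next_off += len;
    return 0;
}
```

Waiting after every chunk serializes the upload, so this trades
throughput for knowing exactly which write failed and how many bytes
were really stored.
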
Finally, you shouldn't really create raw RADOS objects with unbounded
sizes. The right way to do it is to stripe everything larger than a
specific size (we usually use 4MB stripes). Besides causing problems
with data distribution, large objects are not handled very well by
certain operations (e.g., deep scrub).
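
To illustrate the striping idea, here is a minimal sketch that maps a
byte offset in the uploaded file to a 4MB piece and an offset inside that
piece. It is plain fixed-size chunking under assumed names (stripe_pos,
stripe_locate, stripe_object_name), and the "<base>.<index>" object
naming is a made-up convention, not necessarily the layout that any Ceph
component uses.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define STRIPE_SIZE (4ULL * 1024 * 1024)  /* 4MB pieces, as suggested above */

struct stripe_pos {
    uint64_t index;   /* which piece (object) holds this byte */
    uint64_t offset;  /* byte offset inside that piece */
    uint64_t room;    /* bytes left before the piece is full */
};

/* Translate an offset in the logical file into a piece and local offset. */
struct stripe_pos stripe_locate(uint64_t file_off)
{
    struct stripe_pos p;
    p.index  = file_off / STRIPE_SIZE;
    p.offset = file_off % STRIPE_SIZE;
    p.room   = STRIPE_SIZE - p.offset;
    return p;
}

/* Hypothetical naming scheme: "<base>.<16 hex digits of the index>".
 * Returns 0 on success, -1 if the name does not fit in out. */
int stripe_object_name(char *out, size_t outlen, const char *base, uint64_t index)
{
    int n;
    assert(out != NULL && base != NULL);
    n = snprintf(out, outlen, "%s.%016" PRIx64, base, index);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A writer would call stripe_locate() for the current file offset, write at
most room bytes to the named object at offset, and continue with the next
piece for whatever is left in the buffer.
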

Yehuda

