Various file lengths while uploading the same file

yehuda@xxxxxxxxxxx (Yehuda Sadeh) · Mon, 19 May 2014 01:02:03 -0700

On Mon, May 19, 2014 at 12:19 AM, Arthur Tumanyan
<arthurtumanyan at gmail.com> wrote:
> Hi,
> I'm writing a FastCGI application which just reads data from stdin and
> appends it to ceph.
> Please have a look:
>
>                 while (0 != (written = fread(buffer, sizeof (char), sizeof buffer, stdin))) {
>                     if (written < 0) {
>                         logger("Error: %s\n", strerror(errno));
>                         break;
>                     }
>                     err = rados_aio_append(io, filename, comp, (const char *) buffer, written);
>                     if (err < 0) {
>                         logger("Could not schedule aio append: %s\n", strerror(-err));
>                         rados_aio_release(comp);
>                         rados_ioctx_destroy(io);
>                         rados_shutdown(cluster);
>                         break;
>                     } else {
>                         if (len >= (100 * MB)) {
>                             if (0 == (i % (50 * MB)) && i != 0) {
>                                 logger("Uploaded %d bytes", i);
>                             }
>                         }
>                         i += written;
>                     }
>                 }
>
> I test like this:
> curl -XPUT -T guide.pdf  "http://95.211.192.129:81/upload?pics|guide.pdf" -i -v
>
> In most cases everything works fine, but sometimes (for big files) ceph or the
> application corrupts the file, and the same file may end up with various sizes:
>
> ceph1 07:14:46 UploadToCeph # rados -p pics stat 1024.img
> pics/1024.img mtime 1400335784, size 1019215872
>
> As you can see, the size is smaller than it should be. Can anyone point out
> what's wrong, please?

Your code never waits for the aio completions, so you don't know
whether the rados requests even succeeded. A useful tool is the ceph
messenger debug log, which shows what is actually being sent to the
osds and the return status of each request. You can turn it on by
setting 'debug ms = 1'.
I'd also replace the append operation with a write to a specific
offset. That will help with debugging, and it avoids any potential
ordering issues.
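
To make the first two points concrete, here is a rough sketch of the read loop with an explicit running offset. It is not tied to librados: chunk_writer, upload_stream, mem_sink and mem_write are hypothetical names made up for illustration. In a real uploader the callback is where I would expect rados_aio_write() to be issued with the given offset, followed by rados_aio_wait_for_complete() and a check of rados_aio_get_return_value() on that completion before reporting success. Note also that fread() returns a size_t, so it can never be negative; read errors have to be detected with ferror().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: "write len bytes of buf at absolute offset off".
 * Returns 0 on success or a negative errno-style value on failure. */
typedef int (*chunk_writer)(void *ctx, const char *buf, size_t len, uint64_t off);

/* Read `in` until EOF in fixed-size chunks and pass each chunk to `wr`
 * together with its absolute offset. *total receives the number of bytes
 * handed to the writer successfully. Returns 0 or a negative error. */
int upload_stream(FILE *in, chunk_writer wr, void *ctx, uint64_t *total)
{
    char buffer[4096];
    uint64_t off = 0;
    size_t n;
    int err = 0;

    assert(in != NULL && wr != NULL && total != NULL);

    while ((n = fread(buffer, 1, sizeof buffer, in)) > 0) {
        err = wr(ctx, buffer, n, off);
        if (err < 0)
            break;          /* stop at the first failed write */
        off += n;           /* the offset only advances after success */
    }
    if (err >= 0)
        err = ferror(in) ? -EIO : 0;   /* fread signals errors via ferror() */

    *total = off;
    return err;
}

/* Illustrative in-memory sink standing in for the storage backend. */
struct mem_sink {
    char data[16384];
    uint64_t size;
};

int mem_write(void *ctx, const char *buf, size_t len, uint64_t off)
{
    struct mem_sink *s = ctx;

    if (off > sizeof s->data || len > sizeof s->data - off)
        return -EFBIG;      /* chunk would not fit in the sink */
    memcpy(s->data + off, buf, len);
    if (off + len > s->size)
        s->size = off + len;
    return 0;
}
```

Because every chunk carries its own offset, the final object size no longer depends on the order in which the writes are applied, and the byte count returned in *total can be compared directly against what 'rados stat' reports.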
Finally, you shouldn't create raw rados objects of unbounded size.
The right way to do it is to stripe anything larger than a specific
size across multiple objects (we usually use 4MB stripes). Besides
causing problems with data distribution, large objects are not
handled very well by certain operations (e.g., deep scrub).
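
As a rough illustration of the striping idea, the sketch below maps a logical file offset to a stripe index and an offset inside that stripe, using the 4MB size mentioned above. This is a naive fixed-size chunking scheme written for illustration, not Ceph's own striping layout; the function names and the "<base>.<index>" object naming are assumptions of mine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define STRIPE_SIZE (4ULL * 1024 * 1024)   /* 4MB per object */

struct stripe_loc {
    uint64_t index;    /* which stripe object the byte lives in */
    uint64_t offset;   /* offset of the byte inside that object */
};

/* Map a logical file offset to its stripe object and inner offset. */
struct stripe_loc stripe_locate(uint64_t logical_off)
{
    struct stripe_loc loc;

    loc.index = logical_off / STRIPE_SIZE;
    loc.offset = logical_off % STRIPE_SIZE;
    return loc;
}

/* How many of `len` bytes starting at logical_off fit into the current
 * stripe before the write has to continue in the next object. */
uint64_t stripe_span(uint64_t logical_off, uint64_t len)
{
    uint64_t room = STRIPE_SIZE - (logical_off % STRIPE_SIZE);

    return len < room ? len : room;
}

/* Build an object name for a stripe, e.g. "1024.img.0000000000000001".
 * The naming scheme is hypothetical. Returns the snprintf() result. */
int stripe_name(char *out, size_t outlen, const char *base, uint64_t index)
{
    assert(out != NULL && base != NULL);
    return snprintf(out, outlen, "%s.%016llx", base,
                    (unsigned long long)index);
}
```

A write loop would call stripe_span() to split each buffer at stripe boundaries, then write each piece to the object named by stripe_name() at the offset given by stripe_locate(), so that no single object grows past 4MB.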

Yehuda