Re: How to solve data fixity

I've cc'ed Matt, who's working on the S3 object integrity feature
(https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html).
With it, rgw compares the checksum it generates with the client's on ingest,
then stores it with the object so clients can read it back for later
integrity checks. You can track the progress in
https://tracker.ceph.com/issues/63951
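
For anyone who wants to precompute the value a client would supply, here is a minimal sketch. It assumes the SHA-256 variant of the S3 checksum, which S3 represents as the base64 encoding of the raw digest (not the hex string). The helper name sha256_b64 is made up for illustration and is not part of rgw or any SDK.

```python
import base64
import hashlib


def sha256_b64(stream, chunk_size=1024 * 1024):
    """Return the base64-encoded SHA-256 digest of a binary file-like object.

    Hypothetical helper: reads in chunks so large objects need not fit in memory.
    """
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    # S3-style checksum values are base64 of the raw digest bytes.
    return base64.b64encode(digest.digest()).decode("ascii")
```

The same function can be run over the downloaded data later and compared with the stored value for a whole-object check.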

On Fri, Feb 9, 2024 at 8:49 AM Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote:
>
> MPU etags are an MD5-of-MD5s, FWIW. If the user knows how the parts were
> uploaded, then the etag can be used to verify contents, both just after upload and
> then at download time (both need to be validated if you want end-to-end
> validation - but then you're trusting the system to not change the etag
> underneath you).
>
> Josh
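
For reference, a minimal sketch of the MD5-of-MD5s calculation mentioned above. It assumes the commonly observed S3-style layout: the MD5 of the concatenated binary MD5 digests of each part, hex-encoded, followed by "-" and the part count. It also assumes every part except the last has the same size. The function name multipart_etag is hypothetical.

```python
import hashlib


def multipart_etag(data: bytes, part_size: int) -> str:
    """Compute an S3-style multipart etag for data split into fixed-size parts.

    Assumption: parts were uploaded in order with a uniform part_size
    (the last part may be shorter).
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    # A multipart upload has at least one part, even for empty data.
    offsets = range(0, max(len(data), 1), part_size)
    part_digests = [hashlib.md5(data[i:i + part_size]).digest() for i in offsets]
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

This only reproduces the etag if the part size used at upload time is known, which is the caveat raised above.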
>
> On Fri, Feb 9, 2024, 6:16 a.m. Michal Strnad <michal.strnad@xxxxxxxxx>
> wrote:
>
> > Thank you for your response.
> >
> > We have already done some Lua scripting in the past, and it wasn't
> > entirely enjoyable :-), but we may have to do it again. Scrubbing is
> > still enabled, and turning it off definitely won't be an option.
> > However, due to the project requirements, it would be great if
> > Ceph could, on upload completion, compute a hash (md5, sha256)
> > and store it in the object's metadata, so that a user could later
> > validate that the downloaded data are correct.
> >
> > We can't use the ETag for that, as it does not contain the md5 of the
> > object in the case of a multipart upload.
> >
> > Michal
> >
> >
> > On 2/9/24 13:53, Anthony D'Atri wrote:
> > > You could use Lua scripting perhaps to do this at ingest, but I'm very
> > > curious about scrubs -- you have them turned off completely?
> > >
> > >
> > >> On Feb 9, 2024, at 04:18, Michal Strnad <michal.strnad@xxxxxxxxx> wrote:
> > >>
> > >> Hi all!
> > >>
> > >> In the context of a repository-type project, we need to address a
> > >> situation where we cannot use periodic checks in Ceph (scrubbing) due to
> > >> the project's nature. Instead, we need the ability to write a checksum into
> > >> the metadata of the uploaded file via API. In this context, we are not
> > >> concerned about individual file parts, but rather the file as a whole.
> > >> Users will calculate the checksum and write it. Based on this hash, we
> > >> should be able to trigger a check of the given files. We are aware that
> > >> tools like s3cmd can write MD5 hashes to file metadata, but is there a more
> > >> general approach? Does anyone have experience with this, or can you suggest
> > >> a tool that can accomplish this?
> > >>
> > >> Thx
> > >> Michal
> > >> _______________________________________________
> > >> ceph-users mailing list -- ceph-users@xxxxxxx
> > >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> > >
> >



