Re: K/V interface buffer transaction

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 12, 2015 at 3:26 PM, Somnath Roy <Somnath.Roy@xxxxxxxxxxx> wrote:
> Haomai,
>
> << KeyValueStore will only write one for duplicate entry in ordering
>
> I saw K/v store (keyvaluestore.cc) itself is not removing the duplicates , are you saying the shim layer like leveldbstore/rocksdbstore is removing the duplicates or the leveldb/rocksdb ?

Oh no, sorry. That's just I want to do in mind. I forget I haven't impl it.

Each ObjectStore::Transaction in KeyValueStore has corresponding
BufferTransaction will store all kvs needed to store. We could let
submit_transaction do it at last instead of calling backend each op.

Yeah, we could resolve it in KeyValueStore clearly.
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: Haomai Wang [mailto:haomaiwang@xxxxxxxxx]
> Sent: Wednesday, February 11, 2015 7:36 PM
> To: Somnath Roy
> Cc: sjust@xxxxxxxxxx; Sage Weil; Gregory Farnum; Ceph Development
> Subject: Re: K/V interface buffer transaction
>
> On Thu, Feb 12, 2015 at 6:53 AM, Somnath Roy <Somnath.Roy@xxxxxxxxxxx> wrote:
>> Yeah, thanks!
>> Not sure if level-db is handling duplicate entries within a transaction properly or not, if not, in case of filestore (and also for K/V stores) we are having an extra (redundant) OMAP write in the Write-Path.
>
> KeyValueStore will only write one for duplicate entry in ordering.
>
> But FileStore will write redundant omap.
>
> And from dump log, the duplicate entry looks like from pglog
>
>>
>> Regards
>> Somnath
>>
>> -----Original Message-----
>> From: Samuel Just [mailto:sam.just@xxxxxxxxxxx]
>> Sent: Wednesday, February 11, 2015 2:36 PM
>> To: Somnath Roy
>> Cc: Sage Weil; Gregory Farnum; Haomai Wang (haomaiwang@xxxxxxxxx);
>> Ceph Development
>> Subject: Re: K/V interface buffer transaction
>>
>> Well, the transaction is atomic, so if the key is set twice, you can certainly ignore the first one.
>> -Sam
>>
>> On Wed, Feb 11, 2015 at 2:20 PM, Somnath Roy <Somnath.Roy@xxxxxxxxxxx> wrote:
>>> Hi,
>>> My code had a bug during printing log. I was using map to store the
>>> attribute keys in sorted order and that was discarding the duplicates
>>> :-)
>>>
>>> This is what I found out coming during transaction.
>>>
>>> 2015-02-05 15:58:12.311738 7f27b5429700  0 queue_transactions ::
>>> before _do_transactions
>>> 2015-02-05 15:58:12.311754 7f27b5429700  0 _do_transactions::before
>>> _do_transaction
>>> 2015-02-05 15:58:12.311770 7f27b5429700  0 Transaction::OP_WRITE::cid
>>> = 1.a3_head oid =
>>> 680256a3/rbd_data.100974b0dc51.0000000000000631/head//1 offset =
>>> 3997696 len = 65536
>>> 2015-02-05 15:58:12.311800 7f27b5429700  0
>>> Transaction::OP_SETATTR::cid = 1.a3_head oid =
>>> 680256a3/rbd_data.100974b0dc51.0000000000000631/head//1 attr_name = _
>>> attr_value_len = 273
>>> 2015-02-05 15:58:12.311822 7f27b5429700  0
>>> Transaction::OP_SETATTR::cid = 1.a3_head oid =
>>> 680256a3/rbd_data.100974b0dc51.0000000000000631/head//1 attr_name =
>>> snapset attr_value_len = 31
>>> 2015-02-05 15:58:12.311840 7f27b5429700  0
>>> Transaction::OP_OMAP_SETKEYS::cid = 1.a3_head oid = a3//head//1
>>> 2015-02-05 15:58:12.311845 7f27b5429700  0 OMAP_KEY = 0000000102.00000000000000001592 Value = buffer::list(len=178,
>>>         buffer::ptr(0~4 0x3efc21000 in raw 0x3efc21000 len 4096 nref 6),
>>>         buffer::ptr(0~170 0x3d74840 in raw 0x3d74840 len 688 nref 3),
>>>         buffer::ptr(4~4 0x3efc21004 in raw 0x3efc21000 len 4096 nref
>>> 6)
>>> )
>>> 2015-02-05 15:58:12.311931 7f27b5429700  0
>>> Transaction::OP_OMAP_SETKEYS::cid = 1.a3_head oid = a3//head//1
>>> 2015-02-05 15:58:12.311938 7f27b5429700  0 OMAP_KEY = _epoch Value = buffer::list(len=4,
>>>         buffer::ptr(0~4 0x3efc1f000 in raw 0x3efc1f000 len 4096 nref
>>> 3)
>>> )
>>> 2015-02-05 15:58:12.311943 7f27b5429700  0 OMAP_KEY = _info Value = buffer::list(len=713,
>>>         buffer::ptr(0~713 0x3efc1e000 in raw 0x3efc1e000 len 4096
>>> nref
>>> 3)
>>> )
>>> 2015-02-05 15:58:12.311965 7f27b5429700  0
>>> Transaction::OP_OMAP_SETKEYS::cid = 1.a3_head oid = a3//head//1
>>> 2015-02-05 15:58:12.311969 7f27b5429700  0 OMAP_KEY = 0000000102.00000000000000001592 Value = buffer::list(len=178,
>>>         buffer::ptr(0~4 0x3d75e40 in raw 0x3d75e40 len 688 nref 6),
>>>         buffer::ptr(0~170 0x3d75b80 in raw 0x3d75b80 len 688 nref 3),
>>>         buffer::ptr(4~4 0x3d75e44 in raw 0x3d75e40 len 688 nref 6)
>>> )
>>> 2015-02-05 15:58:12.311980 7f27b5429700  0 OMAP_KEY = can_rollback_to Value = buffer::list(len=12,
>>>         buffer::ptr(0~12 0x3efc25000 in raw 0x3efc25000 len 4096 nref
>>> 3)
>>> )
>>> 2015-02-05 15:58:12.311985 7f27b5429700  0 OMAP_KEY = rollback_info_trimmed_to Value = buffer::list(len=12,
>>>         buffer::ptr(0~12 0x3efc24000 in raw 0x3efc24000 len 4096 nref
>>> 3)
>>> )
>>>
>>>
>>>
>>> So, the OMAP_KEY = 0000000102.00000000000000001592 is coming twice !
>>>
>>> Is there any reason, why ? What is this attribute by the way ?
>>> Can we safely discard the first OP_OMAP_SETKEYS call for the same key ?
>>>
>>> Thanks & Regards
>>> Somnath
>>>
>>> -----Original Message-----
>>> From: Somnath Roy
>>> Sent: Tuesday, February 10, 2015 4:36 PM
>>> To: 'Sage Weil'; Gregory Farnum
>>> Cc: sjust@xxxxxxxxxx; Haomai Wang (haomaiwang@xxxxxxxxx); Ceph
>>> Development
>>> Subject: RE: K/V interface buffer transaction
>>>
>>> Thanks Greg/Sam/Sage !
>>> For now, we will be doing our testing by sorting the keys and will keep an eye on the duplicates.
>>> Another point, why do we need the K/V store thread pool for processing transactions anymore ?
>>> I got rid of that and calling _do_transaction() directly from the ::queue_trasaction , this is giving me ~3X performance improvement.
>>>
>>> Regards
>>> Somnath
>>>
>>> -----Original Message-----
>>> From: Sage Weil [mailto:sweil@xxxxxxxxxx]
>>> Sent: Tuesday, February 10, 2015 10:44 AM
>>> To: Gregory Farnum
>>> Cc: Somnath Roy; sjust@xxxxxxxxxx; Haomai Wang
>>> (haomaiwang@xxxxxxxxx); Ceph Development
>>> Subject: Re: K/V interface buffer transaction
>>>
>>> On Tue, 10 Feb 2015, Gregory Farnum wrote:
>>>> On Tue, Feb 10, 2015 at 10:26 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>>>> > On Tue, 10 Feb 2015, Somnath Roy wrote:
>>>> >> Thanks Sam !
>>>> >> So, is it safe to do ordering if in a transaction *no* remove/truncate/create/add call ?
>>>> >> For example, do we need to preserve ordering in case of the below transaction ?
>>>> >> It will be helpful if you can give some insight in what scenario preserving order is *must*.
>>>> >
>>>> > If I'm not mistaken teh only time ordering would matter at all in
>>>> > an transaction is when the same key is updated twice, right?  The
>>>> > whole thing is committed atomically.  If there *are* dups, then
>>>> > the order there obviously should be preserved.
>>>> >
>>>> > Maybe a first pass would be add an assert or something that there
>>>> > are no dup keys and see if anything every falls out of that...
>>>> > hopefully there are none!
>>>>
>>>> I'm pretty sure some of the transaction analysis discussions people
>>>> have had say that we do double-updates at times. IIRC it might have
>>>> been the pglog head getting set twice in most transactions?
>>>
>>> Oh yeah, could be.  There was the snapset xattr update, but that was resetting it to an existing value (not the same value inside the same txn).  I forget if there were others.
>>>
>>> sage
>>>
>>> ________________________________
>>>
>>> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).
>>>
>
>
>
> --
> Best Regards,
>
> Wheat



-- 
Best Regards,

Wheat
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux