Re: [PATCH 04/39] mds: make sure table request id unique

On 03/20/2013 02:15 PM, Sage Weil wrote:
> On Wed, 20 Mar 2013, Yan, Zheng wrote:
>> On 03/20/2013 07:09 AM, Greg Farnum wrote:
>>> Hmm, this is definitely narrowing the race (probably enough to never hit it), but it's not actually eliminating it (if the restart happens after 4 billion requests?). More importantly this kind of symptom makes me worry that we might be papering over more serious issues with colliding states in the Table on restart.
>>> I don't have the MDSTable semantics in my head so I'll need to look into this later unless somebody else volunteers to do so?
>>
>> Not just 4 billion requests: an MDS restart has several stages, and the
>> mdsmap epoch increases for each stage. I don't think there are any more
>> colliding states in the table. The table client/server use two-phase
>> commit. It's similar to a client request that involves multiple MDSes.
>> The reqid is analogous to the client request ID. The difference is that a
>> client request ID is unique because a new client always gets a unique
>> session ID.
> 
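
For reference, the seeding scheme from the patch quoted further down can be sketched roughly as below. This is only an illustration: ReqidAllocator and ranges_overlap are hypothetical names, not part of the MDS code, and the sketch assumes the mdsmap epoch fits in 32 bits and that request ids are handed out by pre-increment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the table client's request-id counter.
struct ReqidAllocator {
  uint64_t last_reqid = 0;

  // Mirrors the patch's init(): put the mdsmap epoch in the high 32 bits.
  void init(uint32_t epoch) {
    last_reqid = static_cast<uint64_t>(epoch) << 32;
  }

  // Assumption: ids are pre-incremented, so the first id is seed + 1.
  uint64_t next() { return ++last_reqid; }
};

// Returns true if the largest id issued under old_epoch (after `issued`
// requests) reaches the first id a fresh allocator would issue at new_epoch.
inline bool ranges_overlap(uint32_t old_epoch, uint64_t issued,
                           uint32_t new_epoch) {
  uint64_t old_max = (static_cast<uint64_t>(old_epoch) << 32) + issued;
  uint64_t new_min = (static_cast<uint64_t>(new_epoch) << 32) + 1;
  return old_max >= new_min;
}
```

Under these assumptions, ids from two consecutive epochs only meet if more than 2^32 requests were issued in the earlier epoch, which is the residual window discussed above.
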
> Each time a tid is consumed (at least for an update) it is journaled in 
> the EMetaBlob::table_tids list, right?  So we could actually take a max 
> from journal replay and pick up where we left off?  That seems like the 
> cleanest.
> 
> I'm not too worried about 2^32 tids, I guess, but it would be nicer to 
> avoid that possibility.
> 
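
The "take a max from journal replay" idea could look something like the sketch below. JournalEntry and max_replayed_id are hypothetical; the real EMetaBlob layout is not shown in this thread, and whether the journaled tids can stand in for client-side request ids is an open assumption here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical journal entry holding only the ids recorded for table updates.
struct JournalEntry {
  std::vector<uint64_t> table_tids;
};

// Returns the largest id seen across all replayed entries,
// or 0 if the journal contains no ids.
inline uint64_t max_replayed_id(const std::vector<JournalEntry>& journal) {
  uint64_t max_id = 0;
  for (const auto& entry : journal) {
    for (uint64_t tid : entry.table_tids) {
      max_id = std::max(max_id, tid);
    }
  }
  return max_id;
}
```

After replay, the counter would be set to the returned value so that the next id handed out is strictly greater than anything already journaled.
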

Can we reuse the client request ID as the table client request ID?

Regards
Yan, Zheng

> sage
> 
>>
>> Thanks
>> Yan, Zheng
>>
>>> -Greg
>>>
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>>
>>> On Sunday, March 17, 2013 at 7:51 AM, Yan, Zheng wrote:
>>>
>>>> From: "Yan, Zheng" <zheng.z.yan@xxxxxxxxx>
>>>>  
>>>> When an MDS becomes active, the table server re-sends 'agree' messages
>>>> for old prepared requests. If the recovered MDS starts a new table request
>>>> at the same time, the new request's ID can happen to be the same as an old
>>>> prepared request's ID, because the current table client assigns request IDs
>>>> starting from zero after the MDS restarts.
>>>>  
>>>> Signed-off-by: Yan, Zheng <zheng.z.yan@xxxxxxxxx>
>>>> ---
>>>> src/mds/MDS.cc | 3 +++
>>>> src/mds/MDSTableClient.cc | 5 +++++
>>>> src/mds/MDSTableClient.h | 2 ++
>>>> 3 files changed, 10 insertions(+)
>>>>  
>>>> diff --git a/src/mds/MDS.cc b/src/mds/MDS.cc
>>>> index bb1c833..859782a 100644
>>>> --- a/src/mds/MDS.cc
>>>> +++ b/src/mds/MDS.cc
>>>> @@ -1212,6 +1212,9 @@ void MDS::boot_start(int step, int r)
>>>> dout(2) << "boot_start " << step << ": opening snap table" << dendl;  
>>>> snapserver->load(gather.new_sub());
>>>> }
>>>> +
>>>> + anchorclient->init();
>>>> + snapclient->init();
>>>>  
>>>> dout(2) << "boot_start " << step << ": opening mds log" << dendl;
>>>> mdlog->open(gather.new_sub());
>>>> diff --git a/src/mds/MDSTableClient.cc b/src/mds/MDSTableClient.cc
>>>> index ea021f5..beba0a3 100644
>>>> --- a/src/mds/MDSTableClient.cc
>>>> +++ b/src/mds/MDSTableClient.cc
>>>> @@ -34,6 +34,11 @@
>>>> #undef dout_prefix
>>>> #define dout_prefix *_dout << "mds." << mds->get_nodeid() << ".tableclient(" << get_mdstable_name(table) << ") "
>>>>  
>>>> +void MDSTableClient::init()
>>>> +{
>>>> + // make reqid unique between MDS restarts
>>>> + last_reqid = (uint64_t)mds->mdsmap->get_epoch() << 32;
>>>> +}
>>>>  
>>>> void MDSTableClient::handle_request(class MMDSTableRequest *m)
>>>> {
>>>> diff --git a/src/mds/MDSTableClient.h b/src/mds/MDSTableClient.h
>>>> index e15837f..78035db 100644
>>>> --- a/src/mds/MDSTableClient.h
>>>> +++ b/src/mds/MDSTableClient.h
>>>> @@ -63,6 +63,8 @@ public:
>>>> MDSTableClient(MDS *m, int tab) : mds(m), table(tab), last_reqid(0) {}
>>>> virtual ~MDSTableClient() {}
>>>>  
>>>> + void init();
>>>> +
>>>> void handle_request(MMDSTableRequest *m);
>>>>  
>>>> void _prepare(bufferlist& mutation, version_t *ptid, bufferlist *pbl, Context *onfinish);
>>>> --  
>>>> 1.7.11.7
>>>
>>>
>>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>


