On Wed, 20 Mar 2013, Yan, Zheng wrote:
> On 03/20/2013 07:09 AM, Greg Farnum wrote:
> > Hmm, this is definitely narrowing the race (probably enough to never
> > hit it), but it's not actually eliminating it (if the restart happens
> > after 4 billion requests?). More importantly, this kind of symptom
> > makes me worry that we might be papering over more serious issues
> > with colliding states in the Table on restart.
> > I don't have the MDSTable semantics in my head so I'll need to look
> > into this later unless somebody else volunteers to do so?
>
> Not just 4 billion requests: an MDS restart has several stages, and the
> mdsmap epoch increases at each stage. I don't think there are any more
> colliding states in the table. The table client and server use
> two-phase commit, similar to a client request that involves multiple
> MDSes. The reqid is analogous to a client request ID. The difference is
> that a client request ID is unique because a new client always gets a
> unique session ID.

Each time a tid is consumed (at least for an update) it is journaled in
the EMetaBlob::table_tids list, right? So we could actually take a max
from journal replay and pick up where we left off? That seems like the
cleanest approach.

I'm not too worried about 2^32 tids, I guess, but it would be nicer to
avoid that possibility.

sage

>
> Thanks
> Yan, Zheng
>
> > -Greg
> >
> > Software Engineer #42 @ http://inktank.com | http://ceph.com
> >
> >
> > On Sunday, March 17, 2013 at 7:51 AM, Yan, Zheng wrote:
> >
> >> From: "Yan, Zheng" <zheng.z.yan@xxxxxxxxx>
> >>
> >> When an MDS becomes active, the table server re-sends 'agree' messages
> >> for old prepared requests. If the recovered MDS starts a new table
> >> request at the same time, the new request's ID can happen to be the
> >> same as an old prepared request's ID, because the current table client
> >> assigns request IDs from zero after the MDS restarts.
> >>
> >> Signed-off-by: Yan, Zheng <zheng.z.yan@xxxxxxxxx>
> >> ---
> >> src/mds/MDS.cc | 3 +++
> >> src/mds/MDSTableClient.cc | 5 +++++
> >> src/mds/MDSTableClient.h | 2 ++
> >> 3 files changed, 10 insertions(+)
> >>
> >> diff --git a/src/mds/MDS.cc b/src/mds/MDS.cc
> >> index bb1c833..859782a 100644
> >> --- a/src/mds/MDS.cc
> >> +++ b/src/mds/MDS.cc
> >> @@ -1212,6 +1212,9 @@ void MDS::boot_start(int step, int r)
> >> dout(2) << "boot_start " << step << ": opening snap table" << dendl;
> >> snapserver->load(gather.new_sub());
> >> }
> >> +
> >> + anchorclient->init();
> >> + snapclient->init();
> >>
> >> dout(2) << "boot_start " << step << ": opening mds log" << dendl;
> >> mdlog->open(gather.new_sub());
> >> diff --git a/src/mds/MDSTableClient.cc b/src/mds/MDSTableClient.cc
> >> index ea021f5..beba0a3 100644
> >> --- a/src/mds/MDSTableClient.cc
> >> +++ b/src/mds/MDSTableClient.cc
> >> @@ -34,6 +34,11 @@
> >> #undef dout_prefix
> >> #define dout_prefix *_dout << "mds." << mds->get_nodeid() << ".tableclient(" << get_mdstable_name(table) << ") "
> >>
> >> +void MDSTableClient::init()
> >> +{
> >> + // make reqid unique between MDS restarts
> >> + last_reqid = (uint64_t)mds->mdsmap->get_epoch() << 32;
> >> +}
> >>
> >> void MDSTableClient::handle_request(class MMDSTableRequest *m)
> >> {
> >> diff --git a/src/mds/MDSTableClient.h b/src/mds/MDSTableClient.h
> >> index e15837f..78035db 100644
> >> --- a/src/mds/MDSTableClient.h
> >> +++ b/src/mds/MDSTableClient.h
> >> @@ -63,6 +63,8 @@ public:
> >> MDSTableClient(MDS *m, int tab) : mds(m), table(tab), last_reqid(0) {}
> >> virtual ~MDSTableClient() {}
> >>
> >> + void init();
> >> +
> >> void handle_request(MMDSTableRequest *m);
> >>
> >> void _prepare(bufferlist& mutation, version_t *ptid, bufferlist *pbl, Context *onfinish);
> >> --
> >> 1.7.11.7
> >
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
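
For reference, the two ideas discussed in this thread can be sketched side by side. This is only an illustration, not code from the Ceph tree: ReqidAllocator, init_from_epoch(), init_from_replay(), next() and epoch_of() are hypothetical names, the epoch is assumed to fit in 32 bits, and IDs are assumed to be handed out by incrementing last_reqid. Whether the IDs needed for the replay variant can really be recovered from the journal is the open question raised above, so that part is a guess at the shape of the idea.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the request-ID state kept by a table client.
// Not the real MDSTableClient; all names here are illustrative only.
struct ReqidAllocator {
  uint64_t last_reqid = 0;

  // Approach from the patch: put the mdsmap epoch in the high 32 bits so
  // that each restart (new epoch) starts from a different base value.
  void init_from_epoch(uint32_t epoch) {
    last_reqid = static_cast<uint64_t>(epoch) << 32;
  }

  // Alternative raised in the reply: resume after the largest ID seen
  // during journal replay. An empty journal leaves the counter at zero.
  void init_from_replay(const std::vector<uint64_t>& journaled_ids) {
    last_reqid = journaled_ids.empty()
        ? 0
        : *std::max_element(journaled_ids.begin(), journaled_ids.end());
  }

  // Assumption: IDs are handed out by pre-incrementing the counter.
  uint64_t next() {
    assert(last_reqid != UINT64_MAX);  // guard against wrapping to zero
    return ++last_reqid;
  }
};

// Upper 32 bits of an ID produced after init_from_epoch().
inline uint32_t epoch_of(uint64_t reqid) {
  return static_cast<uint32_t>(reqid >> 32);
}
```

With the epoch variant, IDs issued in different epochs stay disjoint only while fewer than 2^32 requests are issued within one epoch. Past that point the low half carries into the epoch bits, which is the residual window mentioned above. The replay variant has no such limit, but it depends on every consumed ID being journaled.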