On Tue, Jul 2, 2013 at 9:07 PM, Alex Elder <alex.elder@xxxxxxxxxx> wrote: > On 06/24/2013 01:41 AM, Yan, Zheng wrote: >> From: "Yan, Zheng" <zheng.z.yan@xxxxxxxxx> > > Sorry it took so long, I intended to take a look at this > for you sooner. > > I would also like to thank you for this nice clear > description. It made it very easy to understand > why you were proposing the change, and to focus in > on exactly which parts of the design it's affecting. > >> We can't use !req->r_sent to check if OSD request is sent for the >> first time, this is because __cancel_request() zeros req->r_sent >> when OSD map changes. Rather than adding a new variable to struct > > You're right. > >> ceph_osd_request to indicate if it's sent for the first time, We >> can call the unsafe callback only when unsafe OSD reply is received. >> If OSD's first reply is safe, just skip calling the unsafe callback. > > This seems reasonable, but it's different from the way I > thought about what constituted "unsafe." But I may be > wrong, and the way this is used by the file system might > do something that addresses my concern. > > The way I interpreted "unsafe" was simply that it was possible > a write *could* have been made persistent, even if the client > doesn't know about it. A request could have made it to its > target osd, been written, and the response could be in flight > at the point something (maybe a router?) crashes and the response > gets lost. During that time window, the stored data may not be > in a state that's consistent with the client's view of it. > > So I thought of "unsafe" as meaning that a write is in flight, > and until we get a successful response, the storage might > contain the old data or it might contain the new data; the > client has no way of knowing which. > > With that interpretation, a request becomes unsafe the > instant it leaves the client, and becomes safe again > the instant a response arrives. > > If my interpretation is correct, this change is wrong. > > But I may be wrong, and there may really be no need to > worry about a possible modification of data until after > an acknowledgement response is received. In that case, > I've looked at your patch and it looks good. > > Can you explain why I'm wrong about what is "unsafe?" I didn't say you are wrong. the reason I changed the meaning of the unsafe callback is that the "unsafe' callback is only used for fsync(2). I think it's OK to change it as long as the change does not break fsync(2). regards yan, zheng > > -Alex > >> The purpose of unsafe callback is adding unsafe request to a list, >> so that fsync(2) can wait for the safe reply. fsync(2) doesn't need >> to wait for a write(2) that hasn't returned yet. So it's OK to add >> request to the unsafe list when the first OSD reply is received. >> (ceph_sync_write() returns after receiving the first OSD reply) >> >> Signed-off-by: Yan, Zheng <zheng.z.yan@xxxxxxxxx> >> --- >> net/ceph/osd_client.c | 14 +++++++------- >> 1 file changed, 7 insertions(+), 7 deletions(-) >> >> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c >> index 540dd29..dd47889 100644 >> --- a/net/ceph/osd_client.c >> +++ b/net/ceph/osd_client.c >> @@ -1337,10 +1337,6 @@ static void __send_request(struct ceph_osd_client *osdc, >> >> ceph_msg_get(req->r_request); /* send consumes a ref */ >> >> - /* Mark the request unsafe if this is the first timet's being sent. */ >> - >> - if (!req->r_sent && req->r_unsafe_callback) >> - req->r_unsafe_callback(req, true); >> req->r_sent = req->r_osd->o_incarnation; >> >> ceph_con_send(&req->r_osd->o_con, req->r_request); >> @@ -1431,8 +1427,6 @@ static void handle_osds_timeout(struct work_struct *work) >> >> static void complete_request(struct ceph_osd_request *req) >> { >> - if (req->r_unsafe_callback) >> - req->r_unsafe_callback(req, false); >> complete_all(&req->r_safe_completion); /* fsync waiter */ >> } >> >> @@ -1559,14 +1553,20 @@ static void handle_reply(struct ceph_osd_client *osdc, struct ceph_msg *msg, >> mutex_unlock(&osdc->request_mutex); >> >> if (!already_completed) { >> + if (req->r_unsafe_callback && >> + result >= 0 && !(flags & CEPH_OSD_FLAG_ONDISK)) >> + req->r_unsafe_callback(req, true); >> if (req->r_callback) >> req->r_callback(req, msg); >> else >> complete_all(&req->r_completion); >> } >> >> - if (flags & CEPH_OSD_FLAG_ONDISK) >> + if (flags & CEPH_OSD_FLAG_ONDISK) { >> + if (req->r_unsafe_callback && already_completed) >> + req->r_unsafe_callback(req, false); >> complete_request(req); >> + } >> >> done: >> dout("req=%p req->r_linger=%d\n", req, req->r_linger); >> > > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html