Dan Smith wrote:
> OL> @@ -86,46 +132,44 @@ static int cr_read_file(struct cr_ctx *ctx, int objref)
> OL>  		goto out;
>
> OL>  	ret = -EINVAL;
> OL> +	if (hh->fd_objref < 0)
> OL> +		goto out;
>
> As far as I can tell, hh->fd_objref never gets set anywhere.  On my
> system, this causes restart to always fail because there is garbage in
> that field, thus triggering the above check.  If I remove this,
> restart completes successfully.  The following grep tells me that
> maybe this check isn't valid:
>
>   % grep fd_objref checkpoint/*.c include/linux/checkpoint*.h
>   checkpoint/rstr_file.c:	file = cr_obj_get_by_ref(ctx, hh->fd_objref, CR_OBJ_FILE);
>   checkpoint/rstr_file.c:	file = cr_obj_add_file(ctx, fds[1-which], hh->fd_objref);
>   checkpoint/rstr_file.c:static int cr_read_fd_objref(struct cr_ctx *ctx, struct cr_hdr_file *hh)
>   checkpoint/rstr_file.c:	file = cr_obj_get_by_ref(ctx, hh->fd_objref, CR_OBJ_FILE);
>   checkpoint/rstr_file.c:	if (hh->fd_objref < 0)
>   checkpoint/rstr_file.c:	fd = cr_read_fd_objref(ctx, hh);
>   include/linux/checkpoint_hdr.h:	__s32 fd_objref;

hh->fd_objref is set, for pipes, in fs/pipe.c (outcome of the move to
f_ops). So the problem is that the field isn't explicitly zeroed otherwise.
I'll fix that for the next round.

Meanwhile, you can add:

	hh->fd_objref = 0;

in cr_write_file() before the call to file->f_ops->checkpoint().

Thanks,

Oren.

>
> I haven't looked into the surrounding bits yet, so maybe I'm missing
> something, but this seems to be causing a spurious failure on s390 at
> least.
>
> I'm doing this on a clone of your repository's ckpt-v14-rc2 branch.
> Perhaps that repo is missing a patch?
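
For anyone who wants to see the failure mode in isolation, here is a
minimal userspace sketch of the workaround described above. It is not the
kernel code: struct sketch_hdr_file, the two sketch_checkpoint_*() hooks,
sketch_write_file() and the value 7 are all hypothetical stand-ins, and
the 0xFF fill only simulates stale buffer contents. The one thing taken
from the thread is the idea of zeroing fd_objref before the per-file
checkpoint hook runs, so that only hooks that care about the field (pipes)
overwrite it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct cr_hdr_file; only fd_objref is from the thread. */
struct sketch_hdr_file {
	int32_t fd_objref;
};

/* Hypothetical hook type, standing in for the per-file checkpoint operation. */
typedef void (*sketch_checkpoint_fn)(struct sketch_hdr_file *hh);

/* Regular file: the hook never touches fd_objref. */
void sketch_checkpoint_regular(struct sketch_hdr_file *hh)
{
	(void)hh;
}

/* Pipe: the hook records an object reference (7 is an arbitrary example). */
void sketch_checkpoint_pipe(struct sketch_hdr_file *hh)
{
	hh->fd_objref = 7;
}

/*
 * Write-side sketch.  The header starts out filled with 0xFF bytes to
 * mimic garbage left in the buffer, so fd_objref reads as -1.  When
 * zero_first is set, the field is cleared before the hook is called,
 * which is the suggested workaround.  Returns the value that the
 * restart side would later read back.
 */
int32_t sketch_write_file(sketch_checkpoint_fn checkpoint, int zero_first)
{
	struct sketch_hdr_file hh;

	assert(checkpoint != NULL);

	memset(&hh, 0xFF, sizeof(hh));	/* simulated stale contents */
	if (zero_first)
		hh.fd_objref = 0;	/* explicit zeroing before the hook */
	checkpoint(&hh);

	return hh.fd_objref;
}
```

In this sketch, a regular file written without the zeroing step comes back
with a negative fd_objref, which is exactly what the "hh->fd_objref < 0"
check in the quoted hunk rejects. With the zeroing step, the regular file
reads back 0 and the pipe still reads back the value its hook stored.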