> -----Original Message-----
> From: fio-owner@xxxxxxxxxxxxxxx [mailto:fio-owner@xxxxxxxxxxxxxxx] On
> Behalf Of Jens Axboe
> Sent: Friday, 22 August, 2014 2:11 PM
> To: scameron@xxxxxxxxxxxxxxxxxx
...
> On 2014-08-22 14:09, scameron@xxxxxxxxxxxxxxxxxx wrote:
> > On Fri, Aug 22, 2014 at 02:04:34PM -0500, Jens Axboe wrote:
> >> On 2014-08-11 11:04, scameron@xxxxxxxxxxxxxxxxxx wrote:
> >>> On Mon, Aug 11, 2014 at 10:44:23AM -0500, scameron@xxxxxxxxxxxxxxxxxx
> >>> wrote:
> >>>> ...
> >>>
> >> >from eta.c:
> >>>
> >>> void print_thread_status(void)
> >>> {
> >>> 	struct jobs_eta *je;
> >>> 	size_t size;
> >>>
> >>> 	je = get_jobs_eta(0, &size);
> >>> 	if (je)
> >>> 		display_thread_status(je);
> >>>
> >>> 	free(je);
> >>> }
> >>>
> >>> Maybe that je is coming back false? which is
> >>> probably the return value of calc_thread_status() which, well,
> >>> at a glance, I'm not sure what calc_thread_status() is doing.
> >>
> >> I'll take a look at this next week, been away at a conference since
> >> last weekend.
> >
> > Ok. Meantime, I had to reclaim the machine for testing, so I no longer
> > have it just sitting there to debug, and I have not seen the problem
> > again that I know of.
>
> Clearly a hardware issue :-)
>
> --
> Jens Axboe

Rerunning a multi-day job to test out the 64-bit counter fixes,
I just saw the same thing after about 2 days - eta updates stop,
although IO is still running.

Jobs: 210 (f=210): [r(98),X(14),r(112)] [31.5% done] [2388MB/0KB/0KB /s] [4891K/0/0 iops] [eta 01d:17h:05m:24s]

I notice that get_jobs_eta makes a malloc() call without
checking for NULL - maybe that happened?

commit 4b9d6e40b90029f42c378ee82b130af9ceafffd7
Author: Robert Elliott <elliott@xxxxxx>
Date:   Fri Dec 12 14:04:11 2014 -0600

    eta.c: check malloc return code

    Check the malloc return code in get_jobs_eta.

    Signed-off-by: Robert Elliott <elliott@xxxxxx>

diff --git a/eta.c b/eta.c
index a90f1fb47637..167bf5f62b21 100644
--- a/eta.c
+++ b/eta.c
@@ -572,6 +572,8 @@ struct jobs_eta *get_jobs_eta(int force, size_t *size)
 
 	*size = sizeof(*je) + THREAD_RUNSTR_SZ;
 	je = malloc(*size);
+	if (!je)
+		return NULL;
 	memset(je, 0, *size);
 
 	if (!calc_thread_status(je, force)) {

---
Rob Elliott    HP Server Storage
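
For anyone who wants to see the pattern outside of fio, here is a minimal
standalone sketch of the same allocate/check/zero sequence. The struct
eta_sketch, RUNSTR_SZ and eta_sketch_alloc() names are made up for
illustration and are not fio's real definitions; only malloc(), memset()
and free() are real library calls.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for THREAD_RUNSTR_SZ; the value is arbitrary. */
#define RUNSTR_SZ 256

/* Hypothetical stand-in for struct jobs_eta; fields are illustrative. */
struct eta_sketch {
	unsigned int nr_running;
	unsigned long long eta_sec;
	char run_str[];		/* flexible array member, RUNSTR_SZ bytes */
};

/*
 * Allocate and zero an eta_sketch, reporting the allocation size
 * through *size. Returns NULL if malloc() fails, so the caller has
 * to check the result before using it.
 */
struct eta_sketch *eta_sketch_alloc(size_t *size)
{
	struct eta_sketch *je;

	*size = sizeof(*je) + RUNSTR_SZ;
	je = malloc(*size);
	if (!je)
		return NULL;	/* bail out before memset() touches NULL */

	memset(je, 0, *size);
	return je;
}
```

A caller shaped like print_thread_status() above needs no further change
for this case: it already tests je before using it, and free(NULL) is
defined to do nothing.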