RE: fio main thread got stuck over the weekend

> -----Original Message-----
> From: fio-owner@xxxxxxxxxxxxxxx [mailto:fio-owner@xxxxxxxxxxxxxxx] On
> Behalf Of Jens Axboe
> Sent: Friday, 22 August, 2014 2:11 PM
> To: scameron@xxxxxxxxxxxxxxxxxx
...
> On 2014-08-22 14:09, scameron@xxxxxxxxxxxxxxxxxx wrote:
> > On Fri, Aug 22, 2014 at 02:04:34PM -0500, Jens Axboe wrote:
> >> On 2014-08-11 11:04, scameron@xxxxxxxxxxxxxxxxxx wrote:
> >>> On Mon, Aug 11, 2014 at 10:44:23AM -0500, scameron@xxxxxxxxxxxxxxxxxx
> >>> wrote:
> >>>>
...
> >>>
> >>> from eta.c:
> >>>
> >>> void print_thread_status(void)
> >>> {
> >>>          struct jobs_eta *je;
> >>>          size_t size;
> >>>
> >>>          je = get_jobs_eta(0, &size);
> >>>          if (je)
> >>>                  display_thread_status(je);
> >>>
> >>>          free(je);
> >>> }
> >>>
> >>> Maybe je is coming back NULL?  That probably depends on the
> >>> return value of calc_thread_status(), and at a glance I'm not
> >>> sure what calc_thread_status() is doing.
> >>
> >> I'll take a look at this next week, been away at a conference
> >> since last weekend.
> >
> > Ok.  Meantime, I had to reclaim the machine for testing, so I no longer
> > have it just sitting there to debug, and I have not seen the problem
> > again that I know of.
> 
> Clearly a hardware issue :-)
> 
> --
> Jens Axboe

Rerunning a multi-day job to test out the 64-bit counter fixes,
I just saw the same thing after about 2 days: ETA updates stop,
although I/O is still running.

Jobs: 210 (f=210): [r(98),X(14),r(112)] [31.5% done] [2388MB/0KB/0KB /s] [4891K/0/0 iops] [eta 01d:17h:05m:24s]

I notice that get_jobs_eta() calls malloc() without checking the
result for NULL - maybe that happened?
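
For illustration, here is a minimal standalone sketch of the checked
allocation pattern I have in mind. It is not fio code: struct eta_buf,
RUNSTR_SZ and alloc_eta_buf() are hypothetical stand-ins for
struct jobs_eta, THREAD_RUNSTR_SZ and the allocation step inside
get_jobs_eta(), and the real structure layout and size differ.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for fio's struct jobs_eta; the real layout differs. */
struct eta_buf {
	unsigned int nr_running;
	char run_str[];		/* flexible array member for the status string */
};

/* Assumed size for illustration only; fio defines its own THREAD_RUNSTR_SZ. */
#define RUNSTR_SZ 256

/*
 * Allocate and zero a status buffer. On allocation failure, set *size
 * to 0 and return NULL instead of handing a NULL pointer to memset().
 */
struct eta_buf *alloc_eta_buf(size_t *size)
{
	struct eta_buf *eb;

	*size = sizeof(*eb) + RUNSTR_SZ;
	eb = malloc(*size);
	if (!eb) {
		*size = 0;
		return NULL;
	}
	memset(eb, 0, *size);
	return eb;
}
```

The print_thread_status() caller quoted above already tests je before
using it, and free(NULL) is a no-op, so returning NULL early should be
safe for that path.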



commit 4b9d6e40b90029f42c378ee82b130af9ceafffd7
Author: Robert Elliott <elliott@xxxxxx>
Date:   Fri Dec 12 14:04:11 2014 -0600

    eta.c: check malloc return code
    
    Check the malloc return code in get_jobs_eta.
    
    Signed-off-by: Robert Elliott <elliott@xxxxxx>

diff --git a/eta.c b/eta.c
index a90f1fb47637..167bf5f62b21 100644
--- a/eta.c
+++ b/eta.c
@@ -572,6 +572,8 @@ struct jobs_eta *get_jobs_eta(int force, size_t *size)
 
 	*size = sizeof(*je) + THREAD_RUNSTR_SZ;
 	je = malloc(*size);
+	if (!je)
+		return NULL;
 	memset(je, 0, *size);
 
 	if (!calc_thread_status(je, force)) {


---
Rob Elliott    HP Server Storage








