On Thu, Mar 16, 2017 at 4:15 PM, nokia ceph <nokiacephusers@xxxxxxxxx> wrote:
> Hello,
>
> We are running latest kernel - 3.10.0-514.2.2.el7.x86_64 { RHEL 7.3 }
>
> Sure I will try to alter this directive - bdev_aio_max_queue_depth and will
> share our results.
>
> Could you please explain how this calculation happens?

What calculation are you referring to?

> Thanks
>
>
> On Wed, Mar 15, 2017 at 7:54 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>
>> On Wed, 15 Mar 2017, Brad Hubbard wrote:
>> > +ceph-devel
>> >
>> > On Wed, Mar 15, 2017 at 5:25 PM, nokia ceph <nokiacephusers@xxxxxxxxx>
>> > wrote:
>> > > Hello,
>> > >
>> > > We suspect these messages not only at the time of OSD creation. But in idle
>> > > conditions also. May I know what is the impact of these error? Can we safely
>> > > ignore this? Or is there any way/config to fix this problem
>> > >
>> > > Few occurrence for these events as follows:---
>> > >
>> > > ====
>> > > 2017-03-14 17:16:09.500370 7fedeba61700 4 rocksdb: (Original Log Time
>> > > 2017/03/14-17:16:09.453130) [default] Level-0 commit table #60 started
>> > > 2017-03-14 17:16:09.500374 7fedeba61700 4 rocksdb: (Original Log Time
>> > > 2017/03/14-17:16:09.500273) [default] Level-0 commit table #60: memtable #1
>> > > done
>> > > 2017-03-14 17:16:09.500376 7fedeba61700 4 rocksdb: (Original Log Time
>> > > 2017/03/14-17:16:09.500297) EVENT_LOG_v1 {"time_micros": 1489511769500289,
>> > > "job": 17, "event": "flush_finished", "lsm_state": [2, 4, 6, 0, 0, 0, 0],
>> > > "immutable_memtables": 0}
>> > > 2017-03-14 17:16:09.500382 7fedeba61700 4 rocksdb: (Original Log Time
>> > > 2017/03/14-17:16:09.500330) [default] Level summary: base level 1 max bytes
>> > > base 268435456 files[2 4 6 0 0 0 0] max score 0.76
>> > >
>> > > 2017-03-14 17:16:09.500390 7fedeba61700 4 rocksdb: [JOB 17] Try to delete
>> > > WAL files size 244090350, prev total WAL file size 247331500, number of live
>> > > WAL files
>> > > 2.
>> > >
>> > > 2017-03-14 17:34:11.610513 7fedf3a71700 -1
>> > > bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 6
>> >
>> > These errors come from here.
>> >
>> > void KernelDevice::aio_submit(IOContext *ioc)
>> > {
>> > ...
>> >   int r = aio_queue.submit(*cur, &retries);
>> >   if (retries)
>> >     derr << __func__ << " retries " << retries << dendl;
>> >
>> > The submit function is this one which calls libaio's io_submit
>> > function directly and increments retries if it receives EAGAIN.
>> >
>> > #if defined(HAVE_LIBAIO)
>> > int FS::aio_queue_t::submit(aio_t &aio, int *retries)
>> > {
>> >   // 2^16 * 125us = ~8 seconds, so max sleep is ~16 seconds
>> >   int attempts = 16;
>> >   int delay = 125;
>> >   iocb *piocb = &aio.iocb;
>> >   while (true) {
>> >     int r = io_submit(ctx, 1, &piocb); <-------------NOTE
>> >     if (r < 0) {
>> >       if (r == -EAGAIN && attempts-- > 0) { <-------------NOTE
>> >         usleep(delay);
>> >         delay *= 2;
>> >         (*retries)++;
>> >         continue;
>> >       }
>> >       return r;
>> >     }
>> >     assert(r == 1);
>> >     break;
>> >   }
>> >   return 0;
>> > }
>> >
>> >
>> > From the man page.
>> >
>> > IO_SUBMIT(2)          Linux Programmer's Manual          IO_SUBMIT(2)
>> >
>> > NAME
>> >        io_submit - submit asynchronous I/O blocks for processing
>> > ...
>> > RETURN VALUE
>> >        On success, io_submit() returns the number of iocbs submitted
>> >        (which may be 0 if nr is zero). For the failure return, see NOTES.
>> >
>> > ERRORS
>> >        EAGAIN Insufficient resources are available to queue any iocbs.
>> >
>> > I suspect increasing bdev_aio_max_queue_depth may help here but some
>> > of the other devs may have more/better ideas.
>>
>> Yes--try increasing bdev_aio_max_queue_depth. It defaults to 32; try
>> changing it to 128, 1024, or 4096 and see if these errors go away.
>>
>> I've never been able to trigger this on my test boxes, but I put in the
>> warning to help ensure we pick a good default.
>>
>> What kernel version are you running?
>>
>> Thanks!
>> sage
>
>

--
Cheers,
Brad