Help with understanding throttle.finish()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I'm running a rather simple setup on my FreeBSD port.

function TEST_simple() {
    # run the most simple config, and run a bechmark on it.
    local dir=$1

    run_mon $dir a || return 1
    run_osd $dir 0 || return 1

    #
    # default values should work
    #
    ceph tell osd.0 bench || return 1

}

This in the end crashes with:
    8059eec00 -1 FileStore: sync_entry timed out after 600 seconds.
exactly 10 minutes after startup.
This trhread does just about exactly nothing, it initialises the time, and then traps after 10 minutes.
# grep 8059eec00  testdir/osd-bench/osd.0.log
2016-04-28 19:51:44.444689 8059eec00 -1 FileStore: sync_entry timed out after 600 seconds. 2016-04-28 19:51:44.487104 8059eec00 -1 os/filestore/FileStore.cc: In function 'virtual void SyncEntryTimeout::finish(int)' thread 8059eec00 time

I've augmented just about every line of code in src/os/filestore/File*.cc

And I get as far as
	throttle.flush()
It seems to work the first time it gets called. That is, it returns.

However the second time is "stalls" in the call
	 throttle.put(to_put_bytes);
after a number of itterations thru the while-loop.

Below the code of Throttle::Flush and an sequence of the thread doing the blocking call
from the log where I've added mode debugging code.

Does anybody have a clue as to continue from here?
  Why does throttle.put not return?
  Is it just throttleing too hard and not reorganising the queue?

--WjW

Code:
std::pair<uint64_t, uint64_t> JournalThrottle::flush(uint64_t mono_id)
{
  uint64_t to_put_bytes = 0;
  uint64_t to_put_ops = 0;
  std::pair<uint64_t, uint64_t> ret;
  {
    locker l(lock);
    while (!journaled_ops.empty() &&
           journaled_ops.front().first <= mono_id) {
      to_put_bytes += journaled_ops.front().second;
      to_put_ops++;
      journaled_ops.pop_front();
    }
  }
  throttle.put(to_put_bytes);
  ret = make_pair(to_put_ops, to_put_bytes);
  return ret;
}

Thread log:
2016-04-28 19:41:43.231410 8059ec900 20 filestore(testdir/osd-bench/0) sync_entry(1:1) waiting for max_interval 5.000000 2016-04-28 19:41:43.231438 8059ec900 20 filestore(testdir/osd-bench/0) sync_entry(1:1) started at: 2016-04-28 19:41:43.231436 2016-04-28 19:41:44.395763 8059ec900 20 filestore(testdir/osd-bench/0) sync_entry(1:1) ended at: 2016-04-28 19:41:44.395758 2016-04-28 19:41:44.395781 8059ec900 20 filestore(testdir/osd-bench/0) sync_entry(1:1) waited for 1 2016-04-28 19:41:44.395789 8059ec900 20 filestore(testdir/osd-bench/0) sync_entry(1:1) woke after 1.164433 2016-04-28 19:41:44.395795 8059ec900 10 filestore(testdir/osd-bench/0) sync_entry(1:1) 3726 2016-04-28 19:41:44.395801 8059ec900 10 filestore(testdir/osd-bench/0) sync_entry(1:1) 3730 2016-04-28 19:41:44.395806 8059ec900 10 filestore(testdir/osd-bench/0) sync_entry(1:1) 3733 2016-04-28 19:41:44.400570 8059ec900 10 filestore(testdir/osd-bench/0) sync_entry(1:1) 3735 starting apply_manager.commit_start() 2016-04-28 19:41:44.400584 8059ec900 10 journal commit_start max_applied_seq 32, open_ops 0 2016-04-28 19:41:44.400588 8059ec900 10 journal commit_start blocked, all open_ops have completed 2016-04-28 19:41:44.400591 8059ec900 10 journal commit_start committing 32, still blocked
2016-04-28 19:41:44.400596 8059ec900 10 journal FileJournal commit_start
2016-04-28 19:41:44.400599 8059ec900 10 journal commit_start FULL_NOTFULL. All good. 2016-04-28 19:41:44.400602 8059ec900 10 journal apply_manager.commit_start() returning 1 2016-04-28 19:41:44.400605 8059ec900 10 filestore(testdir/osd-bench/0) sync_entry(1:1) 3738 done(true) apply_manager.commit_start() 2016-04-28 19:41:44.400637 8059ec900 10 filestore(testdir/osd-bench/0) sync_entry(1:1) 3743 2016-04-28 19:41:44.400643 8059ec900 10 filestore(testdir/osd-bench/0) sync_entry(1:1) 3745 2016-04-28 19:41:44.400653 8059ec900 10 filestore(testdir/osd-bench/0) sync_entry(1:1) 3748 2016-04-28 19:41:44.400673 8059ec900 10 filestore(testdir/osd-bench/0) sync_entry(1:1) 3750 2016-04-28 19:41:44.400680 8059ec900 10 filestore(testdir/osd-bench/0) sync_entry(1:1) 3752 2016-04-28 19:41:44.400685 8059ec900 15 filestore(testdir/osd-bench/0) sync_entry(1:1) committing 32 2016-04-28 19:41:44.400701 8059ec900 10 journal commit_started committing 32, unblocking
2016-04-28 19:41:44.400728 8059ec900 20 filestore dbobjectmap: seq is 5
2016-04-28 19:41:44.517112 8059ec900 15 genericfilestorebackend(testdir/osd-bench/0) syncfs: doing a full sync (syncfs(2) if possible) 2016-04-28 19:41:45.471870 8059ec900 10 filestore(testdir/osd-bench/0) sync_entry(1:1) commit took 1.071257, interval was 2.240513 2016-04-28 19:41:45.471885 8059ec900 10 journal commit_finish 238 thru 32 journal: 0x8058e5400
2016-04-28 19:41:45.473683 8059ec900 15 journal committed_thru:1804 locked.
2016-04-28 19:41:45.473697 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473701 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473705 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473707 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473710 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473747 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473750 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473753 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473756 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473758 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473761 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473763 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473767 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473772 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473775 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473777 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473779 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473781 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473784 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473786 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473788 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473797 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473800 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473803 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473806 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473809 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473811 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473814 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473817 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473819 8059ec900 10 throttle flush while !journaled_ops.empty() 2016-04-28 19:41:45.473822 8059ec900 10 throttle flush:63 calling throttle.put()

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux