Re: failed erasure code pool creation after client upgrade

Hi Yuri,

https://github.com/ceph/ceph-qa-suite/commit/3b4442a2014200222764e7fce0cb9c343d97efde

points to 

https://github.com/ceph/ceph/blob/dumpling/qa/workunits/rados/test-upgrade-firefly.sh

However, because the client was upgraded, the ceph_test_rados_api_aio binary that actually runs is the firefly one (only the workunit script is pulled from the repository, if I'm not mistaken), and it tries to create an erasure coded pool, which the dumpling cluster does not support.

2014-10-26T05:27:18.338 INFO:tasks.workunit:Running workunits matching rados/test-upgrade-firefly.sh on client.0...
2014-10-26T05:27:18.338 INFO:tasks.workunit:Running workunit rados/test-upgrade-firefly.sh...
2014-10-26T05:27:18.339 INFO:teuthology.orchestra.run.plana67:Running: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=dumpling TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/rados/test-upgrade-firefly.sh'
2014-10-26T05:27:18.350 INFO:tasks.workunit.client.0.plana67.stderr:+ ceph_test_rados_api_aio --gtest_filter=-LibRadosAio.OmapPP
2014-10-26T05:27:18.356 INFO:tasks.workunit.client.0.plana67.stdout:Running main() from gtest_main.cc
2014-10-26T05:27:18.356 INFO:tasks.workunit.client.0.plana67.stdout:Note: Google Test filter = -LibRadosAio.OmapPP
...
2014-10-26T05:29:14.364 INFO:tasks.workunit.client.0.plana67.stdout:[ RUN      ] LibRadosAioEC.SimpleWrite
2014-10-26T05:29:20.002 INFO:tasks.workunit.client.0.plana67.stdout:test/librados/aio.cc:1634: Failure
2014-10-26T05:29:20.002 INFO:tasks.workunit.client.0.plana67.stdout:Value of: test_data.init()
2014-10-26T05:29:20.002 INFO:tasks.workunit.client.0.plana67.stdout:  Actual: "create_one_ec_pool(test-rados-api-plana67-12901-32) failed: error rados_mon_command erasure-code-profile set name:testprofile failed with error -95"
2014-10-26T05:29:20.003 INFO:tasks.workunit.client.0.plana67.stdout:Expected: ""

What would probably make sense is to make sure the firefly tests can run successfully against a dumpling cluster, or to silently skip tests that cannot run on a cluster lacking the required features. In any case, I can't think of a solution that would run what you want just by juggling binaries from various branches. But someone else may have an idea; it is entirely possible that I'm missing something simple ;-)
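
To illustrate the "silently skip" option, here is a minimal sketch of how a test wrapper could recognise this failure. It assumes the error text keeps the "failed with error -95" form shown in the log above; error_code and should_skip are hypothetical helpers, not part of Ceph or teuthology, and 95 is the Linux value of EOPNOTSUPP quoted further down in this thread.

```python
import re

# Linux value of EOPNOTSUPP, as quoted in this thread (assumption: Linux hosts).
EOPNOTSUPP = 95

# Matches the tail of messages such as
# "... erasure-code-profile set name:testprofile failed with error -95"
_ERROR_RE = re.compile(r"failed with error (-?\d+)\s*$")


def error_code(message):
    """Return the integer error code at the end of message, or None."""
    match = _ERROR_RE.search(message)
    return int(match.group(1)) if match else None


def should_skip(message):
    """True when the failure means the cluster lacks the feature."""
    return error_code(message) == -EOPNOTSUPP
```

A wrapper could call should_skip() on the init() failure message and report the test as skipped rather than failed; any other error code would still count as a real failure.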

Cheers

On 26/10/2014 08:33, Yuri Weinstein wrote:
> So far the change did not help; still having issues in this run:
> http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-25_17:05:01-upgrade:firefly:singleton-firefly-distro-basic-multi/571576/teuthology.log
> 
> On Sat, Oct 25, 2014 at 3:49 PM, Tamil Muthamizhan <tamil.muthamizhan@xxxxxxxxxxx <mailto:tamil.muthamizhan@xxxxxxxxxxx>> wrote:
> 
>     ok, so it looks like we are running the wrong version of rados/test.sh in this test.
> 
>     we are actually upgrading from dumpling to firefly in this failing test and we should have used the dumpling version of rados/test-upgrade-firefly.sh [which is used exclusively when upgrading the cluster from dumpling].
> 
>     Yuri is working on fixing this in the suite.
> 
>     Thanks,
>     Tamil
> 
> 
>     On Sat, Oct 25, 2014 at 10:29 AM, Loic Dachary <loic@xxxxxxxxxxx <mailto:loic@xxxxxxxxxxx>> wrote:
> 
>         [cc'ing ceph-devel for archive]
> 
>         Hi,
> 
>         I see a lot of errors with
> 
>         #define EOPNOTSUPP      95      /* Operation not supported on transport endpoint */
> 
>         2014-10-24T20:26:54.335 INFO:tasks.workunit.client.0.plana63.stdout:[ RUN      ] LibRadosAioEC.SimpleWrite
>         2014-10-24T20:26:56.737 INFO:tasks.workunit.client.0.plana63.stdout:test/librados/aio.cc:1634: Failure
>         2014-10-24T20:26:56.737 INFO:tasks.workunit.client.0.plana63.stdout:Value of: test_data.init()
>         2014-10-24T20:26:56.737 INFO:tasks.workunit.client.0.plana63.stdout:  Actual: "create_one_ec_pool(test-rados-api-plana63-14645-33) failed: error rados_mon_command erasure-code-profile set name:testprofile failed with error -95"
>         2014-10-24T20:26:56.738 INFO:tasks.workunit.client.0.plana63.stdout:Expected: ""
>         2014-10-24T20:26:56.738 INFO:tasks.workunit.client.0.plana63.stdout:[  FAILED  ] LibRadosAioEC.SimpleWrite (2403 ms)
>         2014-10-24T20:26:56.738 INFO:tasks.workunit.client.0.plana63.stdout:[ RUN      ] LibRadosAioEC.SimpleWritePP
>         2014-10-24T20:26:59.141 INFO:tasks.workunit.client.0.plana63.stdout:test/librados/aio.cc:1669: Failure
>         2014-10-24T20:26:59.142 INFO:tasks.workunit.client.0.plana63.stdout:Value of: test_data.init()
>         2014-10-24T20:26:59.142 INFO:tasks.workunit.client.0.plana63.stdout:  Actual: "create_one_ec_pool(test-rados-api-plana63-14645-
> 
>         which indeed suggests that the client is trying to create an erasure coded pool in a cluster that does not support it. But since it looks like it's upgrading from firefly to a later version, I don't understand why that would be a problem.
> 
>         How did that get scheduled ?
> 
>         Cheers
> 
>         On 25/10/2014 08:37, Yuri Weinstein wrote:
>         > Not sure what's going on with it, thx.
>         >
>         > It's unusual in that it upgrades a client first.
>         >
>         > http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-24_17:05:01-upgrade:firefly:singleton-firefly-distro-basic-multi/569532/teuthology.log
> 
>         --
>         Loïc Dachary, Artisan Logiciel Libre
> 
> 
> 
> 
>     -- 
>     Regards,
>     Tamil
> 
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
