Re: 17.2.5 snap_schedule module error (cephsqlite: cannot open temporary database)

Milind Changire <mchangir@xxxxxxxxxx> · Thu, 17 Nov 2022 18:49:57 +0530

On Thu, Nov 17, 2022 at 6:02 PM phandaal <phandaal@xxxxxxxxxxxx> wrote:

> On 2022-11-17 12:58, Milind Changire wrote:
> > Christian,
> > Some obvious questions ...
> >
> >    1. What Linux distribution have you deployed Ceph on ?
>
> Gentoo Linux, using python 3.10.
> Ceph is only used for CephFS, data pool using EC8+3 on spinners,
> metadata using replication on SSDs.
>
> >    2. The snap_schedule db has indeed been moved to an SQLite DB in rados
> >    in Quincy.
> >    So, is there ample storage space in your metadata pool to move this DB
> >    to ?
>
> Should be :
>
> # ceph df
> --- RAW STORAGE ---
> CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
> hdd    160 TiB   46 TiB  114 TiB   114 TiB      71.20
> ssd    3.5 TiB  3.5 TiB   15 GiB    15 GiB       0.42
> TOTAL  164 TiB   50 TiB  114 TiB   114 TiB      69.69
>
> --- POOLS ---
> POOL             ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
> .mgr              1    1  195 MiB       33  585 MiB   0.02    1.1 TiB
> cephfs-metadata   4   32  3.8 GiB  415.03k   11 GiB   0.33    1.1 TiB
> cephfs-data       5  128   83 TiB   26.75M  113 TiB  77.77     24 TiB
>
> 1 TiB should be enough to store some snap schedules...
> I suppose the snap_schedule module doesn't find the sqlite rados object.
> Is there a way I can verify its existence (and create it if needed) ?
>
> The error occurs when trying to restart old schedules
> (schedule_client.py line 169) and trying to find the old store, which
> does not exist; the schedules were created in Pacific. Can I just
> wipe them out to recreate the schedules from scratch ?
>

The error does show up when trying to restart old schedules, but the
ioctx.stat() call at schedule_client.py:201 should have thrown a
rados.ObjectNotFound exception, which would then have been caught at
line 205. That doesn't seem to be the case, which implies that the
backing rados object for the DB dump was found, but there was a
libcephsqlite error, as per your original email:
2022-11-17T09:50:25.769+0100 7f7be20db6c0 -1 cephsqlite:
Open: (client.444215)  cannot open temporary database

Hence the error at line 203:
                    db.executescript(dump)

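The control flow described above can be sketched roughly as follows. This
is not the actual schedule_client.py code: ObjectNotFound and FakeIoctx are
hypothetical stand-ins for rados.ObjectNotFound and a RADOS I/O context, and
restore_old_schedules is an illustrative name. It only shows why a failure
inside executescript() means the stat() call succeeded, i.e. the dump object
exists.

```python
import sqlite3


class ObjectNotFound(Exception):
    """Hypothetical stand-in for rados.ObjectNotFound."""


class FakeIoctx:
    """Hypothetical in-memory stand-in for a RADOS I/O context."""

    def __init__(self, objects):
        self.objects = dict(objects)

    def stat(self, name):
        # Mirrors the described behaviour: a missing object raises.
        if name not in self.objects:
            raise ObjectNotFound(name)
        return len(self.objects[name]), None

    def read(self, name, length):
        return self.objects[name][:length]


def restore_old_schedules(ioctx, db, dump_name="snap_db_v0"):
    """Load an old SQL dump into db if the dump object exists."""
    try:
        size, _mtime = ioctx.stat(dump_name)   # raises if the dump is absent
        dump = ioctx.read(dump_name, size).decode("utf-8")
        db.executescript(dump)                 # only reached if stat() succeeded
        return "restored"
    except ObjectNotFound:
        return "no old dump"
```

With an empty FakeIoctx the function returns "no old dump" without touching
the database; any exception raised by executescript() is not caught here and
propagates to the caller, which matches the traceback in the original email.
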
You could try cleaning up the old db dumps and restarting the cluster to
start with a clean slate. However, I'd recommend backing up the db dump
object first. The db dump object in your metadata pool should be named
snap_db_v0, so you would rename snap_db_v0 to snap_db_v0.orig for every
file system.

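A minimal sketch of such a backup, assuming an I/O context object that offers
stat(), read(), write_full() and remove_object() the way the Python rados
bindings' Ioctx does. The function name backup_object and the remove_src
parameter are hypothetical. The "rename" is done here as a copy followed by
an optional removal of the source.

```python
def backup_object(ioctx, src="snap_db_v0", dst="snap_db_v0.orig",
                  remove_src=False):
    """Copy object src to dst in the pool behind ioctx.

    Returns the number of bytes copied. The source is only removed when
    remove_src is True and the copy has been written.
    """
    size, _mtime = ioctx.stat(src)     # raises if src does not exist
    data = ioctx.read(src, size)       # read the whole object
    if len(data) != size:
        raise IOError(f"short read on {src}: {len(data)} of {size} bytes")
    ioctx.write_full(dst, data)        # create or overwrite the backup
    if remove_src:
        ioctx.remove_object(src)
    return len(data)
```

On a real cluster the ioctx would come from something like
rados.Rados(conffile=...).open_ioctx("cephfs-metadata"), using the metadata
pool name of the file system in question. Treat this as untested against a
live cluster.
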
Having said all this, I wouldn't recommend doing this at all.

The problem seems to be due to these missing bits:
pybind/mgr: use memory temp_store #48449
<https://github.com/ceph/ceph/pull/48449>

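For illustration only, and going by the PR title alone rather than its diff:
SQLite's temp_store pragma controls whether temporary tables and indices live
in a file or in memory. The sketch below uses the standard sqlite3 module on
an ordinary in-memory database, not libcephsqlite, and the function name
open_with_memory_temp_store is hypothetical.

```python
import sqlite3


def open_with_memory_temp_store(path=":memory:"):
    """Open a SQLite connection that keeps temporary data in memory."""
    conn = sqlite3.connect(path)
    # Avoid opening a separate temporary database file for temp objects.
    conn.execute("PRAGMA temp_store = MEMORY")
    return conn


conn = open_with_memory_temp_store()
# PRAGMA temp_store reports 0 (default), 1 (file) or 2 (memory).
mode = conn.execute("PRAGMA temp_store").fetchone()[0]
```

With the pragma set, temporary tables no longer need a temporary database
file, which is presumably what the "cannot open temporary database" message
in the log refers to.
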
Unfortunately, integration testing is stalled due to infrastructure
problems. However, things are returning to normal, and this fix will get
into a release as soon as possible.

> Christian.
>
> >
> >
> >
> > On Thu, Nov 17, 2022 at 2:53 PM phandaal <phandaal@xxxxxxxxxxxx> wrote:
> >
> >> Hi all,
> >>
> >> After upgrading from 16.2.10 to 17.2.5, the snap_schedule dashboard
> >> module does not start anymore (everything else is just fine).
> >> I had snapshots scheduled with this module in my cephfs, working
> >> perfectly on 16.2.10, but I couldn't find them anymore after the
> >> upgrade, due to the module being unavailable :
> >> # ceph fs snap-schedule status
> >> Error ENOENT: Module 'snap_schedule' is not available
> >>
> >> In the mgr startup logs I can find an error related to the sqlite
> >> database containing the schedules :
> >>
> >> 2022-11-17T09:50:23.489+0100 7f7bbfc976c0  0 [dashboard INFO request]
> >> [192.168.69.20:8696] [GET] [200] [0.011s] [phandaal] [107.0B]
> >> /ceph/api/mgr/module/snap_schedule
> >> 2022-11-17T09:50:23.499+0100 7f7be20db6c0 -1 client.444215:
> >> SimpleRADOSStriper: lock: snap_db_v0.db: waiting for locks:  lockers
> >> exclusive=1 tag=
> >> lockers=[client.444152:35ac7693-032d-47a8-9d5c-4b71291a8158:v1:
> >> 192.168.69.20:0/937503739]
> >> 2022-11-17T09:50:24.189+0100 7f7be20db6c0 -1 client.444215:
> >> SimpleRADOSStriper: lock: snap_db_v0.db: waiting for locks:  lockers
> >> exclusive=1 tag=
> >> lockers=[client.444152:35ac7693-032d-47a8-9d5c-4b71291a8158:v1:
> >> 192.168.69.20:0/937503739]
> >> 2022-11-17T09:50:24.859+0100 7f7be20db6c0 -1 client.444215:
> >> SimpleRADOSStriper: lock: snap_db_v0.db: waiting for locks:  lockers
> >> exclusive=1 tag=
> >> lockers=[client.444152:35ac7693-032d-47a8-9d5c-4b71291a8158:v1:
> >> 192.168.69.20:0/937503739]
> >> 2022-11-17T09:50:25.769+0100 7f7be20db6c0 -1 cephsqlite: Open:
> >> (client.444215)  cannot open temporary database
> >> 2022-11-17T09:50:25.769+0100 7f7be20db6c0 -1 mgr load Failed to
> >> construct class in 'snap_schedule'
> >> 2022-11-17T09:50:25.769+0100 7f7be20db6c0 -1 mgr load Traceback (most
> >> recent call last):
> >>    File "/usr/share/ceph/mgr/snap_schedule/module.py", line 38, in __init__
> >>      self.client = SnapSchedClient(self)
> >>    File "/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py", line 169, in __init__
> >>      with self.get_schedule_db(fs_name) as conn_mgr:
> >>    File "/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py", line 203, in get_schedule_db
> >>      db.executescript(dump)
> >> sqlite3.OperationalError: unable to open database file
> >>
> >> 2022-11-17T09:50:25.769+0100 7f7be20db6c0 -1 mgr operator() Failed to
> >> run module in active mode ('snap_schedule')
> >>
> >> I think the snap_schedule database has been moved into rados in
> >> Quincy. Is there any way to manually create the database (empty) ?
> >>
> >> Regards,
> >> Christian.
> >>
> >> --
> >> Christian Vilhelm : phandaal@xxxxxxxxxxxx
> >> Reality is for people who lack imagination
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@xxxxxxx
> >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >>
> >>

-- 
Milind