On Wed, 30 Aug 2017, Xiaoxi Chen wrote: > Also upgraded our pre-production to Luminous from Jewel. > > Some nits: > > 1. There is no clear explanation of what happens if a pool is not associated > with an application, or of the difference between "rbd init <pool>" and > "ceph osd pool application enable <pool> rbd". In practice, not > associating a pool with an application does not block IO, it only raises a warning in the health > status, and "rbd init" is identical to "application enable rbd". But > it really scares users who are planning an upgrade. Hmm, we can add a final step to the upgrade instructions discussing what to do with untagged pools? > 2. When no mgr is deployed, which is pretty common for jewel > deployments, then after the upgrade is complete and "ceph osd require-osd-release > luminous" is set, all PG-related operations (e.g. ceph pg stat) go to the > mgr; but since the mgr does not exist, they wait indefinitely > instead of erroring out. This is why restarting or deploying a mgr is step 7 (before the require-osd-release at step 11). Perhaps a bold warning that the mgr is required? Thanks! sage > > 2017-08-30 14:05 GMT+08:00 Mark Kirkwood <mark.kirkwood@xxxxxxxxxxxxxxx>: > > Very nice! > > > > I tested an upgrade from Jewel, pretty painless. However we forgot to merge: > > > > http://tracker.ceph.com/issues/20950 > > > > So the mgr creation requires surgery still :-( > > > > regards > > > > Mark > > > > > > > > On 30/08/17 06:20, Abhishek Lekshmanan wrote: > >> > >> We're glad to announce the first release of the Luminous v12.2.x long term > >> stable release series. There have been major changes since Kraken > >> (v11.2.z) and Jewel (v10.2.z), and the upgrade process is non-trivial. > >> Please read the release notes carefully. > >> > >> For more details, links & changelog, please refer to the > >> complete release notes entry at the Ceph blog: > >> http://ceph.com/releases/v12-2-0-luminous-released/ > >> > >> > >> Major Changes from Kraken > >> ------------------------- > >> > >> - *General*: > >> * Ceph now has a simple, built-in web-based dashboard for monitoring > >> cluster > >> status. > >> > >> - *RADOS*: > >> * *BlueStore*: > >> - The new *BlueStore* backend for *ceph-osd* is now stable and the > >> new default for newly created OSDs. BlueStore manages data > >> stored by each OSD by directly managing the physical HDDs or > >> SSDs without the use of an intervening file system like XFS. > >> This provides greater performance and features. > >> - BlueStore supports full data and metadata checksums > >> of all data stored by Ceph. > >> - BlueStore supports inline compression using zlib, snappy, or LZ4. > >> (Ceph > >> also supports zstd for RGW compression but zstd is not recommended > >> for > >> BlueStore for performance reasons.) > >> > >> * *Erasure coded* pools now have full support for overwrites, > >> allowing them to be used with RBD and CephFS. > >> > >> * *ceph-mgr*: > >> - There is a new daemon, *ceph-mgr*, which is a required part of > >> any Ceph deployment. Although IO can continue when *ceph-mgr* > >> is down, metrics will not refresh and some metrics-related calls > >> (e.g., `ceph df`) may block. We recommend deploying several > >> instances of *ceph-mgr* for reliability. See the notes on > >> Upgrading below. > >> - The *ceph-mgr* daemon includes a REST-based management API. > >> The API is still experimental and somewhat limited but > >> will form the basis for API-based management of Ceph going forward.
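> >> - As a quick sketch, the dashboard and the experimental REST API ship as
> >> ceph-mgr modules and can be toggled with the module commands described
> >> later in these notes (the module names here are assumed from a default
> >> Luminous install)::
> >>
> >>      # ceph mgr module enable dashboard
> >>      # ceph mgr module enable restful
> >>      # ceph mgr services    # list the URLs the enabled modules serve on
> >>
> >> Additional setup (a certificate and an API user) is typically needed
> >> before the restful module will serve requests; see the ceph-mgr
> >> documentation.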
> >> - ceph-mgr also includes a Prometheus exporter plugin, which can > >> provide Ceph > >> perfcounters to Prometheus. > >> - ceph-mgr now has a Zabbix plugin. Using zabbix_sender it sends > >> trapper > >> events to a Zabbix server containing high-level information of the > >> Ceph > >> cluster. This makes it easy to monitor a Ceph cluster's status and > >> send > >> out notifications in case of a malfunction. > >> > >> * The overall *scalability* of the cluster has improved. We have > >> successfully tested clusters with up to 10,000 OSDs. > >> * Each OSD can now have a device class associated with > >> it (e.g., `hdd` or `ssd`), allowing CRUSH rules to trivially map > >> data to a subset of devices in the system. Manually writing CRUSH > >> rules or manually editing the CRUSH map is normally not required. > >> * There is a new upmap exception mechanism that allows individual PGs > >> to be moved around to achieve > >> a *perfect distribution* (this requires luminous clients). > >> * Each OSD now adjusts its default configuration based on whether the > >> backing device is an HDD or SSD. Manual tuning is generally not > >> required. > >> * The prototype mClock QoS queueing algorithm is now available. > >> * There is now a *backoff* mechanism that prevents OSDs from being > >> overloaded by requests to objects or PGs that are not currently able > >> to > >> process IO. > >> * There is a simplified OSD replacement process that is more robust. > >> * You can query the supported features and (apparent) releases of > >> all connected daemons and clients with `ceph features`. > >> * You can configure the oldest Ceph client version you wish to allow to > >> connect to the cluster via `ceph osd set-require-min-compat-client` > >> and > >> Ceph will prevent you from enabling features that will break > >> compatibility > >> with those clients. > >> * Several `sleep` settings, including `osd_recovery_sleep`, > >> `osd_snap_trim_sleep`, and `osd_scrub_sleep`, have been > >> reimplemented to work efficiently. (These are used in some cases > >> to work around issues throttling background work.) > >> * Pools are now expected to be associated with the application using > >> them. > >> Upon completing the upgrade to Luminous, the cluster will attempt to > >> associate > >> existing pools to known applications (i.e. CephFS, RBD, and RGW). > >> In-use pools > >> that are not associated to an application will generate a health > >> warning. Any > >> unassociated pools can be manually associated using the new > >> `ceph osd pool application enable` command (see the example sketch below). For more details see > >> `associate pool to application` in the documentation. > >> > >> - *RGW*: > >> > >> * RGW *metadata search* backed by ElasticSearch now supports end-user > >> request servicing via RGW itself, and also supports custom > >> metadata fields. A query language and a set of RESTful APIs were > >> created for users to be able to search objects by their > >> metadata. New APIs that allow control of custom metadata fields > >> were also added. > >> * RGW now supports *dynamic bucket index sharding*. This has to be > >> enabled via > >> the `rgw dynamic resharding` configuration option. As the number of objects in > >> a > >> bucket grows, RGW will automatically reshard the bucket index in > >> response. > >> No user intervention or bucket size capacity planning is required.
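> >> Returning to the pool association change above: a pool that is still
> >> untagged after the upgrade can be associated by hand. A minimal sketch
> >> (the pool name is illustrative; the relevant warning shows up in
> >> `ceph health detail`)::
> >>
> >>      # ceph health detail        # look for pools flagged as having no application enabled
> >>      # ceph osd pool application enable mypool rbd
> >>
> >> For RBD pools this is equivalent to the "rbd init" step discussed at
> >> the top of this thread.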
> >> * RGW introduces *server side encryption* of uploaded objects with > >> three options for the management of encryption keys: automatic > >> encryption (only recommended for test setups), customer provided > >> keys similar to Amazon SSE-C specification, and through the use of > >> an external key management service (Openstack Barbican) similar > >> to Amazon SSE-KMS specification. > >> * RGW now has preliminary AWS-like bucket policy API support. For > >> now, policy is a means to express a range of new authorization > >> concepts. In the future it will be the foundation for additional > >> auth capabilities such as STS and group policy. > >> * RGW has consolidated the several metadata index pools via the use of > >> rados > >> namespaces. > >> * S3 Object Tagging API has been added; while APIs are > >> supported for GET/PUT/DELETE object tags and in PUT object > >> API, there is no support for tags on Policies & Lifecycle yet > >> * RGW multisite now supports for enabling or disabling sync at a > >> bucket level. > >> > >> - *RBD*: > >> > >> * RBD now has full, stable support for *erasure coded pools* via the > >> new > >> `--data-pool` option to `rbd create`. > >> * RBD mirroring's rbd-mirror daemon is now highly available. We > >> recommend deploying several instances of rbd-mirror for > >> reliability. > >> * RBD mirroring's rbd-mirror daemon should utilize unique Ceph user > >> IDs per instance to support the new mirroring dashboard. > >> * The default 'rbd' pool is no longer created automatically during > >> cluster creation. Additionally, the name of the default pool used > >> by the rbd CLI when no pool is specified can be overridden via a > >> new `rbd default pool = <pool name>` configuration option. > >> * Initial support for deferred image deletion via new `rbd > >> trash` CLI commands. Images, even ones actively in-use by > >> clones, can be moved to the trash and deleted at a later time. > >> * New pool-level `rbd mirror pool promote` and `rbd mirror pool > >> demote` commands to batch promote/demote all mirrored images > >> within a pool. > >> * Mirroring now optionally supports a configurable replication delay > >> via the `rbd mirroring replay delay = <seconds>` configuration > >> option. > >> * Improved discard handling when the object map feature is enabled. > >> * rbd CLI `import` and `copy` commands now detect sparse and > >> preserve sparse regions. > >> * Images and Snapshots will now include a creation timestamp. > >> * Specifying user authorization capabilities for RBD clients has been > >> simplified. The general syntax for using RBD capability profiles is > >> "mon 'profile rbd' osd 'profile rbd[-read-only][ pool={pool-name}[, > >> ...]]'". > >> For more details see "User Management" in the documentation. > >> > >> - *CephFS*: > >> > >> * *Multiple active MDS daemons* is now considered stable. The number > >> of active MDS servers may be adjusted up or down on an active CephFS > >> file > >> system. > >> * CephFS *directory fragmentation* is now stable and enabled by > >> default on new filesystems. To enable it on existing filesystems > >> use "ceph fs set <fs_name> allow_dirfrags". Large or very busy > >> directories are sharded and (potentially) distributed across > >> multiple MDS daemons automatically. > >> * Directory subtrees can be explicitly pinned to specific MDS daemons > >> in > >> cases where the automatic load balancing is not desired or effective. 
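> >> A minimal sketch of enabling these on an existing filesystem (the fs
> >> name, mount path, and rank are illustrative; on Luminous, `allow_multimds`
> >> is assumed to be required before `max_mds` can be raised)::
> >>
> >>      # ceph fs set cephfs allow_multimds true
> >>      # ceph fs set cephfs max_mds 2
> >>      # ceph fs set cephfs allow_dirfrags true
> >>      # setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/busy-dir    # pin a subtree to MDS rank 1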
> >> * Client keys can now be created using the new `ceph fs authorize` > >> command > >> to create keys with access to the given CephFS file system and all of > >> its > >> data pools. > >> * When running 'df' on a CephFS filesystem comprising exactly one data > >> pool, > >> the result now reflects the file storage space used and available in > >> that > >> data pool (fuse client only). > >> > >> - *Miscellaneous*: > >> > >> * Release packages are now being built for *Debian Stretch*. Note > >> that QA is limited to CentOS and Ubuntu (xenial and trusty). The > >> distributions we build for now include: > >> > >> - CentOS 7 (x86_64 and aarch64) > >> - Debian 8 Jessie (x86_64) > >> - Debian 9 Stretch (x86_64) > >> - Ubuntu 16.04 Xenial (x86_64 and aarch64) > >> - Ubuntu 14.04 Trusty (x86_64) > >> > >> * A first release of Ceph for FreeBSD is available which contains a > >> full set > >> of features, other than Bluestore. It will run everything needed to > >> build a > >> storage cluster. For clients, all access methods are available, > >> albeit > >> CephFS is only accessible through a Fuse implementation. RBD images > >> can be > >> mounted on FreeBSD systems through rbd-ggate > >> Ceph versions are released through the regular FreeBSD ports and > >> packages > >> system. The most current version is available as: net/ceph-devel. > >> Once > >> Luminous goes into official release, this version will be available > >> as > >> net/ceph. Future development releases will be available via > >> net/ceph-devel > >> > >> * *CLI changes*: > >> > >> - The `ceph -s` or `ceph status` command has a fresh look. > >> - `ceph mgr metadata` will dump metadata associated with each mgr > >> daemon. > >> - `ceph versions` or `ceph {osd,mds,mon,mgr} versions` > >> summarize versions of running daemons. > >> - `ceph {osd,mds,mon,mgr} count-metadata <property>` similarly > >> tabulates any other daemon metadata visible via the `ceph > >> {osd,mds,mon,mgr} metadata` commands. > >> - `ceph features` summarizes features and releases of connected > >> clients and daemons. > >> - `ceph osd require-osd-release <release>` replaces the old > >> `require_RELEASE_osds` flags. > >> - `ceph osd pg-upmap`, `ceph osd rm-pg-upmap`, `ceph osd > >> pg-upmap-items`, `ceph osd rm-pg-upmap-items` can explicitly > >> manage `upmap` items > >> - `ceph osd getcrushmap` returns a crush map version number on > >> stderr, and `ceph osd setcrushmap [version]` will only inject > >> an updated crush map if the version matches. This allows crush > >> maps to be updated offline and then reinjected into the cluster > >> without fear of clobbering racing changes (e.g., by newly added > >> osds or changes by other administrators). > >> - `ceph osd create` has been replaced by `ceph osd new`. This > >> should be hidden from most users by user-facing tools like > >> `ceph-disk`. > >> - `ceph osd destroy` will mark an OSD destroyed and remove its > >> cephx and lockbox keys. However, the OSD id and CRUSH map entry > >> will remain in place, allowing the id to be reused by a > >> replacement device with minimal data rebalancing. > >> - `ceph osd purge` will remove all traces of an OSD from the > >> cluster, including its cephx encryption keys, dm-crypt lockbox > >> keys, OSD id, and crush map entry. > >> - `ceph osd ls-tree <name>` will output a list of OSD ids under > >> the given CRUSH name (like a host or rack name). This is useful > >> for applying changes to entire subtrees. For example, `ceph > >> osd down `ceph osd ls-tree rack1``. 
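> >> - As a sketch of the simplified replacement flow built on the `destroy`
> >> and `purge` commands above (osd id 12 is illustrative; you would
> >> normally pick one of the two)::
> >>
> >>      # ceph osd out 12
> >>      # ceph osd destroy 12 --yes-i-really-mean-it   # keep the id and CRUSH entry for the replacement device
> >>      # ceph osd purge 12 --yes-i-really-mean-it     # ...or remove every trace of the OSD instead
> >>
> >> Re-provisioning the replacement device with the same OSD id keeps data
> >> rebalancing to a minimum.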
> >> - `ceph osd {add,rm}-{noout,noin,nodown,noup}` allow the > >> `noout`, `noin`, `nodown`, and `noup` flags to be applied to > >> specific OSDs. > >> - `ceph osd safe-to-destroy <osd(s)>` will report whether it is safe > >> to > >> remove or destroy OSD(s) without reducing data durability or > >> redundancy. > >> - `ceph osd ok-to-stop <osd(s)>` will report whether it is okay to > >> stop > >> OSD(s) without immediately compromising availability (i.e., all PGs > >> should remain active but may be degraded). > >> - `ceph log last [n]` will output the last *n* lines of the cluster > >> log. > >> - `ceph mgr dump` will dump the MgrMap, including the currently > >> active > >> ceph-mgr daemon and any standbys. > >> - `ceph mgr module ls` will list active ceph-mgr modules. > >> - `ceph mgr module {enable,disable} <name>` will enable or > >> disable the named mgr module. The module must be present in the > >> configured `mgr_module_path` on the host(s) where `ceph-mgr` is > >> running. > >> - `ceph osd crush ls <node>` will list items (OSDs or other CRUSH > >> nodes) > >> directly beneath a given CRUSH node. > >> - `ceph osd crush swap-bucket <src> <dest>` will swap the > >> contents of two CRUSH buckets in the hierarchy while preserving > >> the buckets' ids. This allows an entire subtree of devices to > >> be replaced (e.g., to replace an entire host of FileStore OSDs > >> with newly-imaged BlueStore OSDs) without disrupting the > >> distribution of data across neighboring devices. > >> - `ceph osd set-require-min-compat-client <release>` configures > >> the oldest client release the cluster is required to support. > >> Other changes, like CRUSH tunables, will fail with an error if > >> they would violate this setting. Changing this setting also > >> fails if clients older than the specified release are currently > >> connected to the cluster. > >> - `ceph config-key dump` dumps config-key entries and their > >> contents. (The existing `ceph config-key list` only dumps the key > >> names, not the values.) > >> - `ceph config-key list` is deprecated in favor of `ceph config-key > >> ls`. > >> - `ceph config-key put` is deprecated in favor of `ceph config-key > >> set`. > >> - `ceph auth list` is deprecated in favor of `ceph auth ls`. > >> - `ceph osd crush rule list` is deprecated in favor of `ceph osd > >> crush rule ls`. > >> - `ceph osd set-{full,nearfull,backfillfull}-ratio` sets the > >> cluster-wide ratio for various full thresholds (when the cluster > >> refuses IO, when the cluster warns about being close to full, > >> when an OSD will defer rebalancing a PG to itself, > >> respectively). > >> - `ceph osd reweightn` will specify the `reweight` values for > >> multiple OSDs in a single command. This is equivalent to a series > >> of > >> `ceph osd reweight` commands. > >> - `ceph osd crush {set,rm}-device-class` manage the new > >> CRUSH *device class* feature. Note that manually creating or > >> deleting > >> a device class name is generally not necessary as it will be smart > >> enough to be self-managed. `ceph osd crush class ls` and > >> `ceph osd crush class ls-osd` will output all existing device > >> classes > >> and a list of OSD ids under the given device class respectively. > >> - `ceph osd crush rule create-replicated` replaces the old > >> `ceph osd crush rule create-simple` command to create a CRUSH > >> rule for a replicated pool. Notably it takes a `class` argument > >> for the *device class* the rule should target (e.g., `ssd` or > >> `hdd`). 
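> >> - For example, combining the device class and rule commands above to
> >> steer a pool onto SSDs only (the rule and pool names are illustrative)::
> >>
> >>      # ceph osd crush rule create-replicated fast default host ssd
> >>      # ceph osd pool set mypool crush_rule fast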
> >> - `ceph mon feature ls` will list monitor features recorded in the > >> MonMap. `ceph mon feature set` will set an optional feature (none > >> of > >> these exist yet). > >> - `ceph tell <daemon> help` will now return a usage summary. > >> - `ceph fs authorize` creates a new client key with caps > >> automatically > >> set to access the given CephFS file system. > >> - The `ceph health` structured output (JSON or XML) no longer > >> contains > >> a 'timechecks' section describing the time sync status. This > >> information is now available via the 'ceph time-sync-status' > >> command. > >> - Certain extra fields in the `ceph health` structured output that > >> used to appear if the mons were low on disk space (which duplicated > >> the information in the normal health warning messages) are now > >> gone. > >> - The `ceph -w` output no longer contains audit log entries by > >> default. > >> Add a `--watch-channel=audit` or `--watch-channel=*` to see them. > >> - New "ceph -w" behavior - the "ceph -w" output no longer contains > >> I/O rates, available space, pg info, etc. because these are no > >> longer logged to the central log (which is what `ceph -w` > >> shows). The same information can be obtained by running `ceph pg > >> stat`; alternatively, I/O rates per pool can be determined using > >> `ceph osd pool stats`. Although these commands do not > >> self-update like `ceph -w` did, they do have the ability to > >> return formatted output by providing a `--format=<format>` > >> option. > >> - Added new commands `pg force-recovery` and > >> `pg force-backfill`. Use them to boost the recovery or backfill > >> priority of specified pgs, so they're recovered/backfilled > >> before any others. Note that these commands don't interrupt > >> ongoing recovery/backfill, but merely queue specified pgs before > >> others so they're recovered/backfilled as soon as possible. New > >> commands `pg cancel-force-recovery` and `pg > >> cancel-force-backfill` restore the default recovery/backfill > >> priority of previously forced pgs. > >> > >> Major Changes from Jewel > >> ------------------------ > >> > >> - *RADOS*: > >> > >> * We now default to the AsyncMessenger (`ms type = async`) instead > >> of the legacy SimpleMessenger. The most noticeable difference is > >> that we now use a fixed sized thread pool for network connections > >> (instead of two threads per socket with SimpleMessenger). > >> * Some OSD failures are now detected almost immediately, whereas > >> previously the heartbeat timeout (which defaults to 20 seconds) > >> had to expire. This prevents IO from blocking for an extended > >> period for failures where the host remains up but the ceph-osd > >> process is no longer running. > >> * The size of encoded OSDMaps has been reduced. > >> * The OSDs now quiesce scrubbing when recovery or rebalancing is in > >> progress. > >> > >> - *RGW*: > >> > >> * RGW now supports the S3 multipart object copy-part API. > >> * It is now possible to reshard an existing bucket offline (see the > >> sketch below). Offline > >> bucket resharding currently requires that all IO (especially > >> writes) to the specific bucket is quiesced. (For automatic online > >> resharding, see the new feature in Luminous above.) > >> * RGW now supports data compression for objects. > >> * The Civetweb version has been upgraded to 1.8. > >> * The Swift static website API is now supported (S3 support was > >> added > >> previously). > >> * The S3 bucket lifecycle API has been added. Note that currently it only > >> supports > >> object expiration.
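> >> As a sketch of the offline resharding mentioned above (the bucket name
> >> and shard count are illustrative; stop writes to the bucket for the
> >> duration, as noted)::
> >>
> >>      # radosgw-admin bucket reshard --bucket=mybucket --num-shards=64
> >>
> >> Check the radosgw-admin documentation for how to clean up the old
> >> bucket index instance afterwards.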
> >> * Support for custom search filters has been added to the LDAP auth > >> implementation. > >> * Support for NFS version 3 has been added to the RGW NFS gateway. > >> * A Python binding has been created for librgw. > >> > >> - *RBD*: > >> > >> * The rbd-mirror daemon now supports replicating dynamic image > >> feature updates and image metadata key/value pairs from the > >> primary image to the non-primary image. > >> * The number of image snapshots can be optionally restricted to a > >> configurable maximum. > >> * The rbd Python API now supports asynchronous IO operations. > >> > >> - *CephFS*: > >> > >> * libcephfs function definitions have been changed to enable proper > >> uid/gid control. The library version has been increased to reflect > >> the > >> interface change. > >> * Standby replay MDS daemons now consume less memory on workloads > >> doing deletions. > >> * Scrub now repairs backtrace, and populates `damage ls` with > >> discovered errors. > >> * A new `pg_files` subcommand to `cephfs-data-scan` can identify > >> files affected by a damaged or lost RADOS PG. > >> * The false-positive "failing to respond to cache pressure" warnings > >> have > >> been fixed. > >> > >> > >> Upgrade from Jewel or Kraken > >> ---------------------------- > >> #. Ensure that the `sortbitwise` flag is enabled:: > >> # ceph osd set sortbitwise > >> #. Make sure your cluster is stable and healthy (no down or > >> recovering OSDs). (Optional, but recommended.) > >> #. Do not create any new erasure-code pools while upgrading the monitors. > >> #. You can monitor the progress of your upgrade at each stage with the > >> `ceph versions` command, which will tell you what ceph version is > >> running for each type of daemon. > >> #. Set the `noout` flag for the duration of the upgrade. (Optional > >> but recommended.):: > >> # ceph osd set noout > >> #. Upgrade monitors by installing the new packages and restarting the > >> monitor daemons. Note that, unlike prior releases, the ceph-mon > >> daemons *must* be upgraded first:: > >> # systemctl restart ceph-mon.target > >> Verify the monitor upgrade is complete once all monitors are up by > >> looking for the `luminous` feature string in the mon map. For > >> example:: > >> # ceph mon feature ls > >> should include `luminous` under persistent features:: > >> on current monmap (epoch NNN) > >> persistent: [kraken,luminous] > >> required: [kraken,luminous] > >> #. Add or restart `ceph-mgr` daemons. Note that a running ceph-mgr is *required*: once `ceph osd require-osd-release luminous` is set in the final steps below, PG-related commands such as `ceph pg stat` are served by ceph-mgr and will hang if no mgr daemon is available. If you are upgrading from > >> kraken, upgrade packages and restart ceph-mgr daemons with:: > >> # systemctl restart ceph-mgr.target > >> If you are upgrading from kraken, you may already have ceph-mgr > >> daemons deployed. If not, or if you are upgrading from jewel, you > >> can deploy new daemons with tools like ceph-deploy or ceph-ansible. > >> For example:: > >> # ceph-deploy mgr create HOST > >> Verify the ceph-mgr daemons are running by checking `ceph -s`:: > >> # ceph -s > >> ... > >> services: > >> mon: 3 daemons, quorum foo,bar,baz > >> mgr: foo(active), standbys: bar, baz > >> ... > >> #. Upgrade all OSDs by installing the new packages and restarting the > >> ceph-osd daemons on all hosts:: > >> # systemctl restart ceph-osd.target > >> You can monitor the progress of the OSD upgrades with the new > >> `ceph versions` or `ceph osd versions` command:: > >> # ceph osd versions > >> { > >> "ceph version 12.2.0 (...) luminous (stable)": 12, > >> "ceph version 10.2.6 (...)": 3, > >> } > >> #.
Upgrade all CephFS daemons by upgrading packages and restarting > >> daemons on all hosts:: > >> # systemctl restart ceph-mds.target > >> #. Upgrade all radosgw daemons by upgrading packages and restarting > >> daemons on all hosts:: > >> # systemctl restart radosgw.target > >> #. Complete the upgrade by disallowing pre-luminous OSDs and enabling > >> all new Luminous-only functionality:: > >> # ceph osd require-osd-release luminous > >> If you set `noout` at the beginning, be sure to clear it with:: > >> # ceph osd unset noout > >> #. Verify the cluster is healthy with `ceph health`. If any in-use pools > >> are reported as not being associated with an application, tag them with the > >> `ceph osd pool application enable` command described above (or `rbd init` > >> for RBD pools); the warning does not block IO. > >> > >> > >> Upgrading from pre-Jewel releases (like Hammer) > >> ----------------------------------------------- > >> > >> You *must* first upgrade to Jewel (10.2.z) before attempting an > >> upgrade to Luminous. > >> > >> > >> Upgrade compatibility notes, Kraken to Luminous > >> ----------------------------------------------- > >> > >> * The configuration option `osd pool erasure code stripe width` has > >> been replaced by `osd pool erasure code stripe unit`, and given > >> the ability to be overridden by the erasure code profile setting > >> `stripe_unit`. For more details see > >> :ref:`erasure-code-profiles`. > >> > >> * rbd and cephfs can use erasure coding with bluestore. This may be > >> enabled by setting `allow_ec_overwrites` to `true` for a pool. Since > >> this relies on bluestore's checksumming to do deep scrubbing, > >> enabling this on a pool stored on filestore is not allowed. > >> > >> * The `rados df` JSON output now prints numeric values as numbers instead > >> of > >> strings. > >> > >> * The `mon_osd_max_op_age` option has been renamed to > >> `mon_osd_warn_op_age` (default: 32 seconds), to indicate we > >> generate a warning at this age. There is also a new > >> `mon_osd_err_op_age_ratio` that is expressed as a multiple of > >> `mon_osd_warn_op_age` (default: 128, for roughly 60 minutes) to > >> control when an error is generated. > >> > >> * The default maximum size for a single RADOS object has been reduced from > >> 100GB to 128MB. The 100GB limit was completely impractical in practice > >> while the 128MB limit is a bit high but not unreasonable. If you have > >> an > >> application written directly to librados that is using objects larger > >> than > >> 128MB you may need to adjust `osd_max_object_size`. > >> > >> * The semantics of the `rados ls` and librados object listing > >> operations have always been a bit confusing in that "whiteout" > >> objects (which logically don't exist and will return ENOENT if you > >> try to access them) are included in the results. Previously > >> whiteouts only occurred in cache tier pools. In luminous, logically > >> deleted but snapshotted objects now result in a whiteout object, and > >> as a result they will appear in `rados ls` results, even though > >> trying to read such an object will result in ENOENT. The `rados > >> listsnaps` operation can be used in such a case to enumerate which > >> snapshots are present. > >> This may seem a bit strange, but is less strange than having a > >> deleted-but-snapshotted object not appear at all and be completely > >> hidden from librados's ability to enumerate objects. Future > >> versions of Ceph will likely include an alternative object > >> enumeration interface that makes it more natural and efficient to > >> enumerate all objects along with their snapshot and clone metadata.
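> >> For example (pool and object names are illustrative), a
> >> snapshotted-then-deleted object still appears in the listing but returns
> >> ENOENT on read, and its remaining snapshots can be enumerated with
> >> listsnaps::
> >>
> >>      # rados -p mypool ls | grep myobject
> >>      # rados -p mypool get myobject /dev/null    # fails with ENOENT
> >>      # rados -p mypool listsnaps myobject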
> >> * The deprecated `crush_ruleset` property has finally been removed; > >> please use `crush_rule` instead for the `osd pool get ...` and `osd > >> pool set ...` commands. > >> > >> * The `osd pool default crush replicated ruleset` option has been > >> removed and replaced by the `osd pool default crush rule` option. > >> By default it is -1, which means the mon will pick the first > >> replicated rule in the CRUSH map for replicated pools. Erasure > >> coded pools have rules that are automatically created for them if > >> they are not specified at pool creation time. > >> > >> * We no longer test the FileStore ceph-osd backend in combination with > >> btrfs. We recommend against using btrfs. If you are using > >> btrfs-based OSDs and want to upgrade to luminous you will need to > >> add the following to your ceph.conf:: > >> > >> enable experimental unrecoverable data corrupting features = btrfs > >> > >> The code is mature and unlikely to change, but we are only > >> continuing to test the Jewel stable branch against btrfs. We > >> recommend moving these OSDs to FileStore with XFS or BlueStore. > >> * The `ruleset-*` properties for the erasure code profiles have been > >> renamed to `crush-*` to (1) move away from the obsolete 'ruleset' > >> term and (2) be more clear about their purpose. There is also a new > >> optional `crush-device-class` property to specify a CRUSH device > >> class to use for the erasure coded pool. Existing erasure code > >> profiles will be converted automatically when the upgrade completes > >> (when the `ceph osd require-osd-release luminous` command is run) > >> but any provisioning tools that create erasure coded pools may need > >> to be updated. > >> * The structure of the XML output for `osd crush tree` has changed > >> slightly to better match the `osd tree` output. The top level > >> structure is now `nodes` instead of `crush_map_roots`. > >> * When assigning a network to the public network and not to > >> the cluster network, the network specification of the public > >> network will be used for the cluster network as well. > >> In older versions this would lead to cluster services > >> being bound to 0.0.0.0:<port>, thus making the > >> cluster service even more publicly available than the > >> public services. When only specifying a cluster network it > >> will still result in the public services binding to 0.0.0.0. > >> > >> * In previous versions, if a client sent an op to the wrong OSD, the OSD > >> would reply with ENXIO. The rationale here is that the client or OSD > >> is > >> clearly buggy and we want to surface the error as clearly as possible. > >> We now only send the ENXIO reply if the osd_enxio_on_misdirected_op > >> option > >> is enabled (it's off by default). This means that a VM using librbd > >> that > >> previously would have gotten an EIO and gone read-only will now see a > >> blocked/hung IO instead. > >> > >> * The "journaler allow split entries" config setting has been removed. > >> > >> * The 'mon_warn_osd_usage_min_max_delta' config option has been > >> removed and the associated health warning has been disabled because > >> it does not address clusters undergoing recovery or CRUSH rules that do > >> not target all devices in the cluster. > >> > >> * Added a new configuration option "public bind addr" to support dynamic > >> environments like Kubernetes. When set, the Ceph MON daemon binds > >> locally to that IP address while advertising a different IP address, > >> `public addr`, on the network.
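> >> A minimal ceph.conf sketch (addresses are illustrative): the monitor
> >> binds to its pod-local address while clients are told to reach it on the
> >> routable one::
> >>
> >>      [mon.a]
> >>      public addr = 10.0.0.10          # address advertised in the monmap
> >>      public bind addr = 192.168.0.5   # address the monitor actually binds to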
> >> > >> * The crush `choose_args` encoding has been changed to make it > >> architecture-independent. If you deployed Luminous dev releases or > >> 12.1.0 rc release and made use of the CRUSH choose_args feature, you > >> need to remove all choose_args mappings from your CRUSH map before > >> starting the upgrade. > >> > >> > >> - *librados*: > >> > >> * Some variants of the omap_get_keys and omap_get_vals librados > >> functions have been deprecated in favor of omap_get_vals2 and > >> omap_get_keys2. The new methods include an output argument > >> indicating whether there are additional keys left to fetch. > >> Previously this had to be inferred from the requested key count vs > >> the number of keys returned, but this breaks with new OSD-side > >> limits on the number of keys or bytes that can be returned by a > >> single omap request. These limits were introduced by kraken but > >> are effectively disabled by default (by setting a very large limit > >> of 1 GB) because users of the newly deprecated interface cannot > >> tell whether they should fetch more keys or not. In the case of > >> the standalone calls in the C++ interface > >> (IoCtx::get_omap_{keys,vals}), librados has been updated to loop on > >> the client side to provide a correct result via multiple calls to > >> the OSD. In the case of the methods used for building > >> multi-operation transactions, however, client-side looping is not > >> practical, and the methods have been deprecated. Note that use of > >> either the IoCtx methods on older librados versions or the > >> deprecated methods on any version of librados will lead to > >> incomplete results if/when the new OSD limits are enabled. > >> > >> * The original librados rados_objects_list_open (C) and objects_begin > >> (C++) object listing API, deprecated in Hammer, has finally been > >> removed. Users of this interface must update their software to use > >> either the rados_nobjects_list_open (C) and nobjects_begin (C++) API > >> or > >> the new rados_object_list_begin (C) and object_list_begin (C++) API > >> before updating the client-side librados library to Luminous. > >> Object enumeration (via any API) with the latest librados version > >> and pre-Hammer OSDs is no longer supported. Note that no in-tree > >> Ceph services rely on object enumeration via the deprecated APIs, so > >> only external librados users might be affected. > >> The newest (and recommended) rados_object_list_begin (C) and > >> object_list_begin (C++) API is only usable on clusters with the > >> SORTBITWISE flag enabled (Jewel and later). (Note that this flag is > >> required to be set before upgrading beyond Jewel.) > >> > >> - *CephFS*: > >> > >> * When configuring ceph-fuse mounts in /etc/fstab, a new syntax is > >> available that uses "ceph.<arg>=<val>" in the options column, instead > >> of putting configuration in the device column. The old style syntax > >> still works. See the documentation page "Mount CephFS in your > >> file systems table" for details. > >> * CephFS clients without the 'p' flag in their authentication > >> capability > >> string will no longer be able to set quotas or any layout fields. > >> This > >> flag previously only restricted modification of the pool and > >> namespace > >> fields in layouts. > >> * CephFS will generate a health warning if you have fewer standby > >> daemons > >> than it thinks you wanted. By default this will be 1 if you ever had > >> a standby, and 0 if you did not. 
You can customize this using > >> `ceph fs set <fs> standby_count_wanted <number>`. Setting it > >> to zero will effectively disable the health check. > >> * The "ceph mds tell ..." command has been removed. It is superseded > >> by "ceph tell mds.<id> ...". > >> * The `apply` mode of cephfs-journal-tool has been removed. > >> > >> Getting Ceph > >> ------------ > >> > >> * Git at git://github.com/ceph/ceph.git > >> * Tarball at http://download.ceph.com/tarballs/ceph-12.2.0.tar.gz > >> * For packages, see http://docs.ceph.com/docs/master/install/get-packages/ > >> * For ceph-deploy, see > >> http://docs.ceph.com/docs/master/install/install-ceph-deploy > >> * Release git sha1: 32ce2a3ae5239ee33d6150705cdb24d43bab910c > >> > >> -- > >> Abhishek Lekshmanan > >> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, > >> HRB 21284 (AG Nürnberg)