Re: v12.1.0 Luminous RC released

On Fri, 23 Jun 2017, Abhishek L wrote:
> This is the first release candidate for Luminous, the next long term
> stable release.

I just want to reiterate that this is a release candidate, not the final
luminous release.  We're still squashing bugs and merging a few last 
items.  Testing is welcome, but you probably should not deploy this in any 
production environments.

Thanks!
sage


> Ceph Luminous will be the foundation for the next long-term
> stable release series.  There have been major changes since Kraken
> (v11.2.z) and Jewel (v10.2.z).
> 
> Major Changes from Kraken
> -------------------------
> 
> - *General*:
> 
>   * Ceph now has a simple, built-in web-based dashboard for monitoring
>     cluster status.
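> 
>     As a quick illustration (a sketch based on the current docs; details may
>     still change before the final release), the dashboard is implemented as a
>     ceph-mgr module and can be enabled with::
> 
>       # turn on the built-in web dashboard on the active ceph-mgr
>       ceph mgr module enable dashboard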
> 
> - *RADOS*:
> 
>   * *BlueStore*:
> 
>     - The new *BlueStore* backend for *ceph-osd* is now stable and the new
>       default for newly created OSDs.  BlueStore manages data stored by each OSD
>       by directly managing the physical HDDs or SSDs without the use of an
>       intervening file system like XFS.  This provides greater performance
>       and features.
>     - BlueStore supports *full data and metadata checksums* of all
>       data stored by Ceph.
>     - BlueStore supports inline compression using zlib, snappy, or LZ4.  (Ceph
>       also supports zstd for RGW compression but zstd is not recommended for
>       BlueStore for performance reasons.)
> 
>   * *Erasure coded* pools now have full support for *overwrites*,
>     allowing them to be used with RBD and CephFS.
> 
>   * *ceph-mgr*:
> 
>     - There is a new daemon, *ceph-mgr*, which is a required part of any
>       Ceph deployment.  Although IO can continue when *ceph-mgr* is
>       down, metrics will not refresh and some metrics-related calls
>       (e.g., ``ceph df``) may block.  We recommend deploying several instances of
>       *ceph-mgr* for reliability.  See the notes on `Upgrading`_ below.
>     - The *ceph-mgr* daemon includes a REST-based management API.  The
>       API is still experimental and somewhat limited but will form the basis
>       for API-based management of Ceph going forward.
> 
>   * The overall *scalability* of the cluster has improved. We have
>     successfully tested clusters with up to 10,000 OSDs.
>   * Each OSD can now have a *device class* associated with it (e.g., `hdd` or
>     `ssd`), allowing CRUSH rules to trivially map data to a subset of devices
>     in the system.  Manually writing CRUSH rules or manually editing the CRUSH
>     map is normally not required (see the example commands at the end of this
>     section).
>   * *CRUSH weights* can now be optimized to maintain a *near-perfect
>     distribution of data* across OSDs.
>   * There is also a new `upmap` exception mechanism that allows
>     individual PGs to be moved around to achieve a *perfect
>     distribution* (this requires luminous clients).
>   * Each OSD now adjusts its default configuration based on whether the
>     backing device is an HDD or SSD.  Manual tuning is generally not required.
>   * The prototype *mclock QoS queueing algorithm* is now available.
>   * There is now a *backoff* mechanism that prevents OSDs from being
>     overloaded by requests to objects or PGs that are not currently able to
>     process IO.
>   * There is a *simplified OSD replacement process* that is more robust.
>   * You can query the supported features and (apparent) releases of
>     all connected daemons and clients with ``ceph features``.
>   * You can configure the oldest Ceph client version you wish to allow to
>     connect to the cluster via ``ceph osd set-require-min-compat-client`` and
>     Ceph will prevent you from enabling features that will break compatibility
>     with those clients.
>   * Several `sleep` settings, including ``osd_recovery_sleep``,
>     ``osd_snap_trim_sleep``, and ``osd_scrub_sleep``, have been
>     reimplemented to work efficiently.  (These are used in some cases
>     to work around issues throttling background work.)
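> 
>   A few illustrative commands for some of the items above (pool, rule, and
>   host/class names are made up, and exact syntax may still change before the
>   final luminous release)::
> 
>     # enable inline compression on a BlueStore-backed pool
>     ceph osd pool set mypool compression_algorithm snappy
>     ceph osd pool set mypool compression_mode aggressive
> 
>     # allow overwrites on an erasure coded pool so RBD/CephFS can use it
>     ceph osd pool set my-ec-pool allow_ec_overwrites true
> 
>     # create a replicated CRUSH rule restricted to the 'ssd' device class
>     ceph osd crush rule create-replicated fast default host ssd
> 
>     # refuse clients older than jewel (and features that would break them)
>     ceph osd set-require-min-compat-client jewel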
> 
> - *RGW*:
> 
>   * RGW *metadata search* backed by ElasticSearch now supports end
>     user requests serviced via RGW itself, and also supports custom
>     metadata fields. A query language and a set of RESTful APIs were
>     created for users to be able to search objects by their
>     metadata. New APIs that allow control of custom metadata fields
>     were also added.
>   * RGW now supports *dynamic bucket index sharding*.  As the number
>     of objects in a bucket grows, RGW will automatically reshard the
>     bucket index in response.  No user intervention or bucket size
>     capacity planning is required.
>   * RGW introduces *server side encryption* of uploaded objects with
>     three options for the management of encryption keys: automatic
>     encryption (only recommended for test setups), customer provided
>     keys similar to the Amazon SSE-C specification, and through the
>     use of an external key management service (OpenStack Barbican)
>     similar to the Amazon SSE-KMS specification.  A sample test-only
>     configuration for the first option follows this list.
>   * RGW now has preliminary AWS-like bucket policy API support.  For
>     now, policy is a means to express a range of new authorization
>     concepts.  In the future it will be the foundation for additional
>     auth capabilities such as STS and group policy.
>   * RGW has consolidated several metadata index pools via the use of
>     RADOS namespaces.
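> 
>   As a sample of the "automatic encryption" option above (for test setups
>   only, since the key lives in the config; the ``gateway`` instance name is
>   made up and the option name is taken from the current docs, so it may still
>   change)::
> 
>     [client.rgw.gateway]
>     # test-only: every uploaded object is encrypted with this one static key
>     rgw crypt default encryption key = <base64-encoded 256-bit key>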
> 
> - *RBD*:
> 
>   * RBD now has full, stable support for *erasure coded pools* via the new
>     ``--data-pool`` option to ``rbd create``.
>   * RBD mirroring's rbd-mirror daemon is now highly available. We
>     recommend deploying several instances of rbd-mirror for
>     reliability.
>   * The default 'rbd' pool is no longer created automatically during
>     cluster creation. Additionally, the name of the default pool used
>     by the rbd CLI when no pool is specified can be overridden via a
>     new ``rbd default pool = <pool name>`` configuration option.
>   * Initial support for deferred image deletion has been added via the
>     new ``rbd trash`` CLI commands. Images, even ones actively in use
>     by clones, can be moved to the trash and deleted at a later time
>     (see the example after this list).
>   * New pool-level ``rbd mirror pool promote`` and ``rbd mirror pool
>     demote`` commands to batch promote/demote all mirrored images
>     within a pool.
>   * Mirroring now optionally supports a configurable replication delay
>     via the ``rbd mirroring replay delay = <seconds>`` configuration
>     option.
>   * Improved discard handling when the object map feature is enabled.
>   * The rbd CLI ``import`` and ``copy`` commands now detect and
>     preserve sparse regions.
>   * Snapshots now include a creation timestamp.
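> 
>   A few illustrative rbd commands for the items above (pool and image names
>   are made up, and exact syntax may still change before the final release)::
> 
>     # create an image whose data objects live in an erasure coded pool
>     rbd create --size 1G --data-pool my-ec-pool mypool/myimage
> 
>     # defer deletion: move an image to the trash, list the trash, purge later
>     rbd trash mv mypool/myimage
>     rbd trash ls mypool
>     rbd trash rm mypool/<image-id>
> 
>     # batch-demote/promote all mirrored images in a pool
>     rbd mirror pool demote mypool
>     rbd mirror pool promote mypool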
> 
> - *CephFS*:
> 
>   * *Multiple active MDS daemons* are now considered stable.  The number
>     of active MDS servers may be adjusted up or down on an active CephFS file
>     system.
>   * CephFS *directory fragmentation* is now stable and enabled by
>     default on new filesystems.  To enable it on existing filesystems,
>     use ``ceph fs set <fs_name> allow_dirfrags``.  Large or very busy
>     directories are sharded and (potentially) distributed across
>     multiple MDS daemons automatically.
>   * Directory subtrees can be explicitly pinned to specific MDS daemons in
>     cases where the automatic load balancing is not desired or effective.
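> 
>   For example (the file system name and mount point are made up, and this
>   assumes the existing ``allow_multimds`` flag is still required before
>   raising ``max_mds``)::
> 
>     # allow and activate a second active MDS daemon
>     ceph fs set cephfs allow_multimds true
>     ceph fs set cephfs max_mds 2
> 
>     # pin a directory subtree to MDS rank 0 via a virtual xattr
>     setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/some/dir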
> 
> - *Miscellaneous*:
> 
>   * Release packages are now being built for *Debian Stretch*.  The
>     distributions we now build for include:
> 
>     - CentOS 7 (x86_64 and aarch64)
>     - Debian 8 Jessie (x86_64)
>     - Debian 9 Stretch (x86_64)
>     - Ubuntu 16.04 Xenial (x86_64 and aarch64)
>     - Ubuntu 14.04 Trusty (x86_64)
> 
>     Note that QA is limited to CentOS and Ubuntu (xenial and trusty).
> 
>   * *CLI changes*:
> 
>     - The ``ceph -s`` or ``ceph status`` command has a fresh look.
>     - ``ceph {osd,mds,mon} versions`` summarizes versions of running daemons.
>     - ``ceph {osd,mds,mon} count-metadata <property>`` similarly
>       tabulates any other daemon metadata visible via the ``ceph
>       {osd,mds,mon} metadata`` commands.
>     - ``ceph features`` summarizes features and releases of connected
>       clients and daemons.
>     - ``ceph osd require-osd-release <release>`` replaces the old
>       ``require_RELEASE_osds`` flags.
>     - ``ceph osd pg-upmap``, ``ceph osd rm-pg-upmap``, ``ceph osd
>       pg-upmap-items``, ``ceph osd rm-pg-upmap-items`` can explicitly
>       manage `upmap` items.
>     - ``ceph osd getcrushmap`` returns a crush map version number on
>       stderr, and ``ceph osd setcrushmap [version]`` will only inject
>       an updated crush map if the version matches.  This allows crush
>       maps to be updated offline and then reinjected into the cluster
>       without fear of clobbering racing changes (e.g., by newly added
>       osds or changes by other administrators).
>     - ``ceph osd create`` has been replaced by ``ceph osd new``.  This
>       should be hidden from most users by user-facing tools like
>       `ceph-disk`.
>     - ``ceph osd destroy`` will mark an OSD destroyed and remove its
>       cephx and lockbox keys.  However, the OSD id and CRUSH map entry
>       will remain in place, allowing the id to be reused by a
>       replacement device with minimal data rebalancing.
>     - ``ceph osd purge`` will remove all traces of an OSD from the
>       cluster, including its cephx encryption keys, dm-crypt lockbox
>       keys, OSD id, and crush map entry.
>     - ``ceph osd ls-tree <name>`` will output a list of OSD ids under
>       the given CRUSH name (like a host or rack name).  This is useful
>       for applying changes to entire subtrees.  For example, ``ceph
>       osd down `ceph osd ls-tree rack1```.
>     - ``ceph osd {add,rm}-{noout,noin,nodown,noup}`` allow the
>       `noout`, `nodown`, `noin`, and `noup` flags to be applied to
>       specific OSDs.
>     - ``ceph log last [n]`` will output the last *n* lines of the cluster
>       log.
>     - ``ceph mgr dump`` will dump the MgrMap, including the currently active
>       ceph-mgr daemon and any standbys.
>     - ``ceph osd crush swap-bucket <src> <dest>`` will swap the
>       contents of two CRUSH buckets in the hierarchy while preserving
>       the buckets' ids.  This allows an entire subtree of devices to
>       be replaced (e.g., to replace an entire host of FileStore OSDs
>       with newly-imaged BlueStore OSDs) without disrupting the
>       distribution of data across neighboring devices.
>     - ``ceph osd set-require-min-compat-client <release>`` configures
>       the oldest client release the cluster is required to support.
>       Other changes, like CRUSH tunables, will fail with an error if
>       they would violate this setting.  Changing this setting also
>       fails if clients older than the specified release are currently
>       connected to the cluster.
>     - ``ceph config-key dump`` dumps config-key entries and their
>       contents.  (The existing ``ceph config-key ls`` only dumps the key
>       names, not the values.)
>     - ``ceph osd set-{full,nearfull,backfillfull}-ratio`` sets the
>       cluster-wide ratio for the various full thresholds (when the
>       cluster refuses IO, when the cluster warns about being close to
>       full, and when an OSD will defer rebalancing a PG to itself,
>       respectively).  Example values are shown after this list.
>     - ``ceph osd reweightn`` will specify the `reweight` values for
>       multiple OSDs in a single command.  This is equivalent to a series of
>       ``ceph osd reweight`` commands.
>     - ``ceph crush class {create,rm,ls}`` manage the new CRUSH *device
>       class* feature.  ``ceph crush set-device-class <osd> <class>``
>       will set the class for a particular device.
>     - ``ceph mon feature ls`` will list monitor features recorded in the
>       MonMap.  ``ceph mon feature set`` will set an optional feature (none of
>       these exist yet).
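> 
>     A couple of concrete examples for the commands above (the ratios shown
>     are just the existing defaults, and the OSD id is made up)::
> 
>       # adjust the cluster-wide full thresholds
>       ceph osd set-nearfull-ratio 0.85
>       ceph osd set-backfillfull-ratio 0.90
>       ceph osd set-full-ratio 0.95
> 
>       # apply and later remove the noout flag on a single OSD
>       ceph osd add-noout osd.5
>       ceph osd rm-noout osd.5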
> 
> Major Changes from Jewel
> ------------------------
> 
> - *RADOS*:
> 
>   * We now default to the AsyncMessenger (``ms type = async``) instead
>     of the legacy SimpleMessenger.  The most noticeable difference is
>     that we now use a fixed sized thread pool for network connections
>     (instead of two threads per socket with SimpleMessenger).
>   * Some OSD failures are now detected almost immediately, whereas
>     previously the heartbeat timeout (which defaults to 20 seconds)
>     had to expire.  This prevents IO from blocking for an extended
>     period for failures where the host remains up but the ceph-osd
>     process is no longer running.
>   * The size of encoded OSDMaps has been reduced.
>   * The OSDs now quiesce scrubbing when recovery or rebalancing is in progress.
> 
> - *RGW*:
> 
>   * RGW now supports the S3 multipart object copy-part API.
>   * It is now possible to reshard an existing bucket offline. Offline
>     bucket resharding currently requires that all IO (especially
>     writes) to the specific bucket is quiesced.  (For automatic online
>     resharding, see the new feature in Luminous above.)  An example
>     invocation follows this list.
>   * RGW now supports data compression for objects.
>   * The Civetweb version has been upgraded to 1.8.
>   * The Swift static website API is now supported (S3 support was added
>     previously).
>   * S3 bucket lifecycle API has been added. Note that currently it only supports
>     object expiration.
>   * Support for custom search filters has been added to the LDAP auth
>     implementation.
>   * Support for NFS version 3 has been added to the RGW NFS gateway.
>   * A Python binding has been created for librgw.
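> 
>   For example, offline resharding and object compression are both driven by
>   ``radosgw-admin`` (bucket, zone, and placement names are made up; check
>   the docs for the exact invocation on your version)::
> 
>     # reshard an existing bucket's index offline (quiesce writes first)
>     radosgw-admin bucket reshard --bucket=mybucket --num-shards=64
> 
>     # enable zlib compression for newly written objects in a placement target
>     radosgw-admin zone placement modify --rgw-zone=default \
>         --placement-id=default-placement --compression=zlib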
> 
> - *RBD*:
> 
>   * The rbd-mirror daemon now supports replicating dynamic image
>     feature updates and image metadata key/value pairs from the
>     primary image to the non-primary image.
>   * The number of image snapshots can be optionally restricted to a
>     configurable maximum (see the example after this list).
>   * The rbd Python API now supports asynchronous IO operations.
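> 
>   For example, a per-image snapshot limit can be set and cleared with the
>   ``rbd snap limit`` commands (the image name is made up, and the exact
>   flags may differ; see ``rbd help`` on your version)::
> 
>     # cap the number of snapshots that may be created for this image
>     rbd snap limit set mypool/myimage --limit 10
>     # remove the cap again
>     rbd snap limit clear mypool/myimage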
> 
> - *CephFS*:
> 
>   * libcephfs function definitions have been changed to enable proper
>     uid/gid control.  The library version has been increased to reflect the
>     interface change.
>   * Standby replay MDS daemons now consume less memory on workloads
>     doing deletions.
>   * Scrub now repairs backtraces and populates `damage ls` with
>     discovered errors.
>   * A new `pg_files` subcommand to `cephfs-data-scan` can identify
>     files affected by a damaged or lost RADOS PG (see the example
>     after this list).
>   * The false-positive "failing to respond to cache pressure" warnings have
>     been fixed.
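> 
>   For example, given a damaged or lost data pool PG, the affected files under
>   a path within the file system can be listed with (the path and PG id are
>   made up)::
> 
>     cephfs-data-scan pg_files /home/bob 2.4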
> 
> For more details refer to the detailed blog entry at
> http://ceph.com/releases/v12-1-0-luminous-rc-released/
> 
> * Git at git://github.com/ceph/ceph.git
> * Tarball at http://download.ceph.com/tarballs/ceph-12.1.0.tar.gz
> * For packages, see http://docs.ceph.com/docs/master/install/get-packages/
> * For ceph-deploy, see http://docs.ceph.com/docs/master/install/install-ceph-deploy
> * Release sha1: 262617c9f16c55e863693258061c5b25dea5b086
> 
> --
> Abhishek Lekshmanan
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
