Hi Stefan,

Could you describe the linger ops bug in more detail? I'm running Firefly, which as you say still has this bug. Thanks!

On Wed, Aug 5, 2015 at 12:51 AM, Stefan Priebe <s.priebe@xxxxxxxxxxxx> wrote:
> We've done the splitting several times. The most important thing is to run a
> Ceph version which does not have the linger ops bug.
>
> That means the latest dumpling release, giant, or hammer. The latest firefly
> release still has this bug, which results in wrong watchers and no working
> snapshots.
>
> Stefan
>
> On 04.08.2015 at 18:46, Samuel Just wrote:
>>
>> It will cause a large amount of data movement. Each new PG after the
>> split will relocate. It might be OK if you do it slowly. Experiment
>> on a test cluster.
>> -Sam
>>
>> On Mon, Aug 3, 2015 at 12:57 AM, 乔建峰 <scaleqiao@xxxxxxxxx> wrote:
>>>
>>> Hi Cephers,
>>>
>>> This is a greeting from Jevon. I'm currently experiencing an issue that
>>> has been troubling me a lot, so I'm writing to ask for your
>>> comments/help/suggestions. More details are provided below.
>>>
>>> Issue:
>>> I set up a cluster with 24 OSDs and created one pool with 1024 placement
>>> groups on it for a small startup company. The number 1024 was calculated
>>> from the equation (OSDs * 100) / pool size. The cluster has been running
>>> quite well for a long time, but recently our monitoring system keeps
>>> complaining that some disks' usage exceeds 85%. When I log into the
>>> system, I find that some disks' usage really is very high, while others
>>> are below 60%. Each time the issue happens, I have to manually rebalance
>>> the distribution. This is only a short-term workaround; I'm not willing
>>> to do it all the time.
>>>
>>> Two long-term solutions come to mind:
>>> 1) Ask the customers to expand their clusters by adding more OSDs. But I
>>> think they will ask me to explain the reason for the imbalanced data
>>> distribution. We've already done some analysis on the environment and
>>> learned that the most imbalanced part of the CRUSH placement is the
>>> mapping between objects and PGs. The biggest PG has 613 objects, while
>>> the smallest PG has only 226.
>>>
>>> 2) Increase the number of placement groups. That can be of great help for
>>> statistically uniform data distribution, but it can also incur
>>> significant data movement as PGs are effectively being split. I just
>>> cannot do it in our customers' environment before we understand the
>>> consequences 100%. Has anyone done this in a production environment? How
>>> much does the operation affect client performance?
>>>
>>> Any comments/help/suggestions will be highly appreciated.
>>>
>>> --
>>> Best Regards
>>> Jevon
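
For reference, the 1024 figure above is consistent with the common
(OSDs * 100) / pool-size rule of thumb, rounded up to the next power of two.
A minimal sketch of that arithmetic in Python; the pool size (replica count)
of 3 is an assumption, since the thread does not state it:

def recommended_pg_count(num_osds: int, pool_size: int,
                         target_pgs_per_osd: int = 100) -> int:
    """Rule-of-thumb PG count: (OSDs * target per OSD) / replica count,
    rounded up to the next power of two."""
    raw = (num_osds * target_pgs_per_osd) / pool_size
    power = 1
    while power < raw:
        power *= 2
    return power

# 24 OSDs with an assumed pool size of 3 gives 800, rounded up to 1024,
# which matches the PG count mentioned in the thread.
print(recommended_pg_count(24, 3))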
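
To put numbers on the object-count imbalance Jevon describes (613 vs. 226
objects per PG), the per-PG statistics from 'ceph pg dump' can be summarized.
A rough sketch, assuming the JSON layout of releases from that era, where
pg_stats[*].stat_sum.num_objects holds the per-PG object count; the exact
field names may differ between versions:

import json
import subprocess

# Dump per-PG statistics as JSON; treat the field names below as assumptions,
# since the schema has changed across Ceph releases.
dump = json.loads(subprocess.run(
    ["ceph", "pg", "dump", "--format", "json"],
    check=True, capture_output=True, text=True).stdout)

counts = sorted(pg["stat_sum"]["num_objects"] for pg in dump["pg_stats"])
print(f"PGs: {len(counts)}, min objects: {counts[0]}, "
      f"max objects: {counts[-1]}, avg: {sum(counts) / len(counts):.1f}")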
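
If the PG count is eventually increased, the usual way to keep the data
movement Sam warns about under control is to raise pg_num (and then pgp_num)
in small steps and let the cluster settle in between. A sketch of that
procedure, assuming the standard 'ceph osd pool set' and 'ceph health'
commands; the pool name, target, and step size are placeholders, not values
from this thread:

import subprocess
import time

POOL = "rbd"            # hypothetical pool name
TARGET_PG_NUM = 2048    # hypothetical target
STEP = 128              # illustrative step size

def ceph(*args: str) -> str:
    """Run a ceph CLI command and return its stdout."""
    return subprocess.run(["ceph", *args], check=True,
                          capture_output=True, text=True).stdout.strip()

def wait_until_healthy(poll_seconds: int = 30) -> None:
    """Block until 'ceph health' reports HEALTH_OK."""
    while "HEALTH_OK" not in ceph("health"):
        time.sleep(poll_seconds)

current = 1024  # starting pg_num from the thread
while current < TARGET_PG_NUM:
    current = min(current + STEP, TARGET_PG_NUM)
    # Raising pg_num splits existing PGs in place; raising pgp_num is what
    # actually rebalances data across OSDs, so do it second and let the
    # cluster settle after each change.
    ceph("osd", "pool", "set", POOL, "pg_num", str(current))
    wait_until_healthy()
    ceph("osd", "pool", "set", POOL, "pgp_num", str(current))
    wait_until_healthy()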
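
Regarding the "wrong watchers" symptom of the linger ops bug: one way to see
which clients are watching an RBD image is to list the watchers on its header
object with 'rados listwatchers'. A sketch, assuming a format-1 image whose
header object is named <image>.rbd (format-2 images use an rbd_header.<id>
object instead); the pool and image names are placeholders:

import subprocess

POOL = "rbd"          # hypothetical pool name
IMAGE = "vm-disk-1"   # hypothetical image name

# Format-1 RBD images store their header in an object called "<image>.rbd";
# the watchers on that object are the clients currently holding a watch.
header_object = f"{IMAGE}.rbd"
out = subprocess.run(["rados", "-p", POOL, "listwatchers", header_object],
                     check=True, capture_output=True, text=True).stdout
print(out or "no watchers registered")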