Re: [PATCH 7/9] spaceman/defrag: sleeps between segments

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Jul 9, 2024, at 1:46 PM, Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> 
> On Tue, Jul 09, 2024 at 12:10:26PM -0700, Wengang Wang wrote:
>> Let user contol the time to sleep between segments (file unlocked) to
>> balance defrag performance and file IO servicing time.
>> 
>> Signed-off-by: Wengang Wang <wen.gang.wang@xxxxxxxxxx>
>> ---
>> spaceman/defrag.c | 26 ++++++++++++++++++++++++--
>> 1 file changed, 24 insertions(+), 2 deletions(-)
>> 
>> diff --git a/spaceman/defrag.c b/spaceman/defrag.c
>> index b5c5b187..415fe9c2 100644
>> --- a/spaceman/defrag.c
>> +++ b/spaceman/defrag.c
>> @@ -311,6 +311,9 @@ void defrag_sigint_handler(int dummy)
>>  */
>> static long g_limit_free_bytes = 1024 * 1024 * 1024;
>> 
>> +/* sleep time in us between segments, overwritten by paramter */
>> +static int g_idle_time = 250 * 1000;
>> +
>> /*
>>  * check if the free space in the FS is less than the _limit_
>>  * return true if so, false otherwise
>> @@ -487,6 +490,7 @@ defrag_xfs_defrag(char *file_path) {
>> int scratch_fd = -1, defrag_fd = -1;
>> char tmp_file_path[PATH_MAX+1];
>> struct file_clone_range clone;
>> + int sleep_time_us = 0;
>> char *defrag_dir;
>> struct fsxattr fsx;
>> int ret = 0;
>> @@ -574,6 +578,9 @@ defrag_xfs_defrag(char *file_path) {
>> 
>> /* checks for EoF and fix up clone */
>> stop = defrag_clone_eof(&clone);
>> + if (sleep_time_us > 0)
>> + usleep(sleep_time_us);
>> +
>> gettimeofday(&t_clone, NULL);
>> ret = ioctl(scratch_fd, FICLONERANGE, &clone);
>> if (ret != 0) {
>> @@ -587,6 +594,10 @@ defrag_xfs_defrag(char *file_path) {
>> if (time_delta > max_clone_us)
>> max_clone_us = time_delta;
>> 
>> + /* sleeps if clone cost more than 500ms, slow FS */
> 
> Why half a second?  I sense that what you're getting at is that you want
> to limit file io latency spikes in other programs by relaxing the defrag
> program, right?  But the help screen doesn't say anything about "only if
> the clone lasts more than 500ms".

This is an optional sleep for very slow FS with CLONE takes long.
Actually, we don’t fall into this case in our local tests.

The main sleep is at above:

 40 +               if (sleep_time_us > 0)
 41 +                       usleep(sleep_time_us);

> 
>> + if (time_delta >= 500000 && g_idle_time > 0)
>> + usleep(g_idle_time);
> 
> These days, I wonder if it makes more sense to provide a CPU utilization
> target and let the kernel figure out how much sleeping that is:
> 
> $ systemd-run -p 'CPUQuota=60%' xfs_spaceman -c 'defrag' /path/to/file
> 
> The tradeoff here is that we as application writers no longer have to
> implement these clunky sleeps ourselves, but then one has to turn on cpu
> accounting in systemd (if there even /is/ a systemd).  Also I suppose we
> don't want this program getting throttled while it's holding a file
> lock.
> 

Yes, we hope less locking time as possible.

This slowness is mainly coming from slow disks. As we tested, when page cache
Is empty, CPU usages is only at about 6% on my VM (the real physical would be spindle disk).
It would be higher for NVMEs.  
I’d like to provide user a way to make balance, say longer sleep time between segments to make IO serviced better.

Thanks,
Wengang

> --D
> 
>> +
>> /* for defrag stats */
>> nr_ext_defrag += segment.ds_nr;
>> 
>> @@ -641,6 +652,12 @@ defrag_xfs_defrag(char *file_path) {
>> 
>> if (stop || usedKilled)
>> break;
>> +
>> + /*
>> +  * no lock on target file when punching hole from scratch file,
>> +  * so minus the time used for punching hole
>> +  */
>> + sleep_time_us = g_idle_time - time_delta;
>> } while (true);
>> out:
>> if (scratch_fd != -1) {
>> @@ -678,6 +695,7 @@ static void defrag_help(void)
>> " -f free_space      -- specify shrethod of the XFS free space in MiB, when\n"
>> "                       XFS free space is lower than that, shared segments \n"
>> "                       are excluded from defragmentation, 1024 by default\n"
>> +" -i idle_time       -- time in ms to be idle between segments, 250ms by default\n"
>> " -n                 -- disable the \"share first extent\" featue, it's\n"
>> "                       enabled by default to speed up\n"
>> ));
>> @@ -691,7 +709,7 @@ defrag_f(int argc, char **argv)
>> int i;
>> int c;
>> 
>> - while ((c = getopt(argc, argv, "s:f:n")) != EOF) {
>> + while ((c = getopt(argc, argv, "s:f:ni")) != EOF) {
>> switch(c) {
>> case 's':
>> g_segment_size_lmt = atoi(optarg) * 1024 * 1024 / 512;
>> @@ -709,6 +727,10 @@ defrag_f(int argc, char **argv)
>> g_enable_first_ext_share = false;
>> break;
>> 
>> + case 'i':
>> + g_idle_time = atoi(optarg) * 1000;
> 
> Should we complain if optarg is non-integer garbage?  Or if g_idle_time
> is larger than 1s?
> 
> --D
> 
>> + break;
>> +
>> default:
>> command_usage(&defrag_cmd);
>> return 1;
>> @@ -726,7 +748,7 @@ void defrag_init(void)
>> defrag_cmd.cfunc = defrag_f;
>> defrag_cmd.argmin = 0;
>> defrag_cmd.argmax = 4;
>> - defrag_cmd.args = "[-s segment_size] [-f free_space] [-n]";
>> + defrag_cmd.args = "[-s segment_size] [-f free_space] [-i idle_time] [-n]";
>> defrag_cmd.flags = CMD_FLAG_ONESHOT;
>> defrag_cmd.oneline = _("Defragment XFS files");
>> defrag_cmd.help = defrag_help;
>> -- 
>> 2.39.3 (Apple Git-146)






[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux