Chris Li <chrisl@xxxxxxxxxx> writes: > On Tue, Jun 18, 2024 at 12:56 AM Huang, Ying <ying.huang@xxxxxxxxx> wrote: >> >> Chris Li <chrisl@xxxxxxxxxx> writes: >> >> > On Sun, Jun 16, 2024 at 11:21 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote: >> >> >> >> Hi, Chris, >> >> >> >> Chris Li <chrisl@xxxxxxxxxx> writes: [snip] >> >> > diff --git a/include/linux/swap.h b/include/linux/swap.h >> >> > index 3df75d62a835..cd9154a3e934 100644 >> >> > --- a/include/linux/swap.h >> >> > +++ b/include/linux/swap.h >> >> > @@ -242,23 +242,22 @@ enum { >> >> > * space with SWAPFILE_CLUSTER pages long and naturally aligns in disk. All >> >> > * free clusters are organized into a list. We fetch an entry from the list to >> >> > * get a free cluster. >> >> > - * >> >> > - * The data field stores next cluster if the cluster is free or cluster usage >> >> > - * counter otherwise. The flags field determines if a cluster is free. This is >> >> > - * protected by swap_info_struct.lock. >> >> > */ >> >> > struct swap_cluster_info { >> >> > spinlock_t lock; /* >> >> > - * Protect swap_cluster_info fields >> >> > - * and swap_info_struct->swap_map >> >> > + * Protect swap_cluster_info count and state >> >> >> >> Protect swap_cluster_info fields except 'list' ? >> > >> > I change it to protect the swap_cluster_info bitfields in the second patch. >> >> Although I still prefer my version, I will not insist on that. > > Sure, I actually don't have a strong preference about that. It is just comments. > >> >> >> >> >> > + * field and swap_info_struct->swap_map >> >> > * elements correspond to the swap >> >> > * cluster >> >> > */ >> >> > - unsigned int data:24; >> >> > - unsigned int flags:8; >> >> > + unsigned int count:12; >> >> > + unsigned int state:3; >> >> >> >> I still prefer normal data type over bit fields. How about >> >> >> >> u16 usage; >> >> u8 state; >> > >> > I don't mind the "count" rename to "usage". That is probably a better >> > name. However I have another patch intended to add more bit fields in >> > the cluster info struct. The second patch adds "order" and the later >> > patch will add more. That is why I choose bitfield to be more condense >> > with bits. >> >> We still have space for another "u8" for "order". It appears trivial to >> change it to bit fields when necessary in the future. > > We can, I don't see it necessary to change from bit field to u8 and > back to bit field in the future. It is more of a personal preference > issue. I have to say that I don't think that it's just a personal preference. IMO, if it's unnecessary, we shouldn't use bit fields. You cannot guarantee that your future changes will be merged in its current state. So, I still think that it's better to avoid bit fields for now. >> >> >> >> And, how about use 'usage' instead of 'count'? Personally I think that >> >> it is more clear. But I don't have strong opinions on this. >> >> >> >> > + struct list_head list; /* Protected by swap_info_struct->lock */ >> >> > }; >> >> > -#define CLUSTER_FLAG_FREE 1 /* This cluster is free */ >> >> > -#define CLUSTER_FLAG_NEXT_NULL 2 /* This cluster has no next cluster */ >> >> > + >> >> > +#define CLUSTER_STATE_FREE 1 /* This cluster is free */ >> >> [snip] >> >> > /* >> >> > @@ -481,21 +371,22 @@ static void __free_cluster(struct swap_info_struct *si, unsigned long idx) >> >> > */ >> >> > static void swap_do_scheduled_discard(struct swap_info_struct *si) >> >> > { >> >> > - struct swap_cluster_info *info, *ci; >> >> > + struct swap_cluster_info *ci; >> >> > unsigned int idx; >> >> > >> >> > - info = si->cluster_info; >> >> > - >> >> > - while (!cluster_list_empty(&si->discard_clusters)) { >> >> > - idx = cluster_list_del_first(&si->discard_clusters, info); >> >> > + while (!list_empty(&si->discard_clusters)) { >> >> > + ci = list_first_entry(&si->discard_clusters, struct swap_cluster_info, list); >> >> > + list_del(&ci->list); >> >> > + idx = ci - si->cluster_info; >> >> > spin_unlock(&si->lock); >> >> > >> >> > discard_swap_cluster(si, idx * SWAPFILE_CLUSTER, >> >> > SWAPFILE_CLUSTER); >> >> > >> >> > spin_lock(&si->lock); >> >> > - ci = lock_cluster(si, idx * SWAPFILE_CLUSTER); >> >> > - __free_cluster(si, idx); >> >> > + >> >> > + spin_lock(&ci->lock); >> >> >> >> Personally, I still prefer to use lock_cluster(), which is more readable >> >> and matches unlock_cluster() below. >> > >> > lock_cluster() uses an index which is not matching unlock_cluster() >> > which is using a pointer to cluster. >> >> lock_cluster()/unlock_cluster() are pair and fit original design >> well. They use different parameter because swap cluster is optional. >> >> > When you get the cluster from the list, you have a cluster pointer. I >> > feel it is unnecessary to convert to index then back convert to >> > cluster pointer inside lock_cluster(). I actually feel using indexes >> > to refer to the cluster is error prone because we also have offset. >> >> I don't think so, it's common to use swap offset. > > Swap offset is not an issue, it is all over the place. The cluster > index(offset/512) is the one I try to avoid. > I have made some mistakes myself on offset vs index. Yes. That's not good, but it's hard to be avoided too. Can we make the variable name more consistent? index: cluster index, offset: swap offset. And, in fact, swap offset is the parameter of lock_cluster() instead of cluster index. >> > >> >> >> >> > + __free_cluster(si, ci); >> >> > memset(si->swap_map + idx * SWAPFILE_CLUSTER, >> >> > 0, SWAPFILE_CLUSTER); >> >> > unlock_cluster(ci); >> >> > @@ -521,20 +412,19 @@ static void swap_users_ref_free(struct percpu_ref *ref) >> >> > complete(&si->comp); >> >> > } >> >> > [snip] >> >> > @@ -611,10 +497,10 @@ scan_swap_map_ssd_cluster_conflict(struct swap_info_struct *si, >> >> > { >> >> > struct percpu_cluster *percpu_cluster; >> >> > bool conflict; >> >> > - >> >> >> >> Usually we use one blank line after local variable declaration. >> > Ack. >> > >> >> >> >> > + struct swap_cluster_info *first = list_first_entry(&si->free_clusters, struct swap_cluster_info, list); >> >> > offset /= SWAPFILE_CLUSTER; >> >> > - conflict = !cluster_list_empty(&si->free_clusters) && >> >> > - offset != cluster_list_first(&si->free_clusters) && >> >> > + conflict = !list_empty(&si->free_clusters) && >> >> > + offset != first - si->cluster_info && >> >> > cluster_is_free(&si->cluster_info[offset]); >> >> > >> >> > if (!conflict) >> >> > @@ -655,10 +541,14 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si, >> >> > cluster = this_cpu_ptr(si->percpu_cluster); >> >> > tmp = cluster->next[order]; >> >> > if (tmp == SWAP_NEXT_INVALID) { >> >> > - if (!cluster_list_empty(&si->free_clusters)) { >> >> > - tmp = cluster_next(&si->free_clusters.head) * >> >> > - SWAPFILE_CLUSTER; >> >> > - } else if (!cluster_list_empty(&si->discard_clusters)) { >> >> > + if (!list_empty(&si->free_clusters)) { >> >> > + ci = list_first_entry(&si->free_clusters, struct swap_cluster_info, list); >> >> > + list_del(&ci->list); >> >> >> >> The free cluster is deleted from si->free_clusters now. But later you >> >> will call scan_swap_map_ssd_cluster_conflict() and may abandon the >> >> cluster. And in alloc_cluster() later, it may be deleted again. >> > >> > Yes, that is a bug. Thanks for catching that. >> > >> >> >> >> > + spin_lock(&ci->lock); >> >> > + ci->state = CLUSTER_STATE_PER_CPU; >> >> >> >> Need to change ci->state when move a cluster off the percpu_cluster. >> > >> > In the next patch. This patch does not use the off state yet. >> >> But that is confusing to use wrong state name, the really meaning is >> something like CLUSTER_STATE_NON_FREE. But as I suggested above, we can > > It can be FREE and on the per cpu pointer as well. That is the tricky part. > It can happen on the current code as well. cluster_set_count_flag(0, 0) is called in alloc_cluster(). So, it's not an issue in current code. If you need more, that shouldn't be done in this patch. >> keep swap_cluster_info.flags and CLUSTER_FLAG_FREE in this patch. > > Maybe. Will consider that. > >> >> >> >> >> > + spin_unlock(&ci->lock); >> >> > + tmp = (ci - si->cluster_info) * SWAPFILE_CLUSTER; >> >> > + } else if (!list_empty(&si->discard_clusters)) { >> >> > /* >> >> > * we don't have free cluster but have some clusters in >> >> > * discarding, do discard now and reclaim them, then >> >> > @@ -1062,8 +952,8 @@ static void swap_free_cluster(struct swap_info_struct *si, unsigned long idx) >> >> > >> >> > ci = lock_cluster(si, offset); >> >> > memset(si->swap_map + offset, 0, SWAPFILE_CLUSTER); >> >> > - cluster_set_count_flag(ci, 0, 0); >> >> > - free_cluster(si, idx); >> >> > + ci->count = 0; >> >> > + free_cluster(si, ci); >> >> > unlock_cluster(ci); >> >> > swap_range_free(si, offset, SWAPFILE_CLUSTER); >> >> > } [snip] -- Best Regards, Huang, Ying