On Wed, Jun 19, 2024 at 12:53 AM Huang, Ying <ying.huang@xxxxxxxxx> wrote: > > Chris Li <chrisl@xxxxxxxxxx> writes: > > > On Tue, Jun 18, 2024 at 12:56 AM Huang, Ying <ying.huang@xxxxxxxxx> wrote: > >> > >> Chris Li <chrisl@xxxxxxxxxx> writes: > >> > >> > On Sun, Jun 16, 2024 at 11:21 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote: > >> >> > >> >> Hi, Chris, > >> >> > >> >> Chris Li <chrisl@xxxxxxxxxx> writes: > > [snip] > > >> >> > diff --git a/include/linux/swap.h b/include/linux/swap.h > >> >> > index 3df75d62a835..cd9154a3e934 100644 > >> >> > --- a/include/linux/swap.h > >> >> > +++ b/include/linux/swap.h > >> >> > @@ -242,23 +242,22 @@ enum { > >> >> > * space with SWAPFILE_CLUSTER pages long and naturally aligns in disk. All > >> >> > * free clusters are organized into a list. We fetch an entry from the list to > >> >> > * get a free cluster. > >> >> > - * > >> >> > - * The data field stores next cluster if the cluster is free or cluster usage > >> >> > - * counter otherwise. The flags field determines if a cluster is free. This is > >> >> > - * protected by swap_info_struct.lock. > >> >> > */ > >> >> > struct swap_cluster_info { > >> >> > spinlock_t lock; /* > >> >> > - * Protect swap_cluster_info fields > >> >> > - * and swap_info_struct->swap_map > >> >> > + * Protect swap_cluster_info count and state > >> >> > >> >> Protect swap_cluster_info fields except 'list' ? > >> > > >> > I change it to protect the swap_cluster_info bitfields in the second patch. > >> > >> Although I still prefer my version, I will not insist on that. > > > > Sure, I actually don't have a strong preference about that. It is just comments. > > > >> > >> >> > >> >> > + * field and swap_info_struct->swap_map > >> >> > * elements correspond to the swap > >> >> > * cluster > >> >> > */ > >> >> > - unsigned int data:24; > >> >> > - unsigned int flags:8; > >> >> > + unsigned int count:12; > >> >> > + unsigned int state:3; > >> >> > >> >> I still prefer normal data type over bit fields. How about > >> >> > >> >> u16 usage; > >> >> u8 state; > >> > > >> > I don't mind the "count" rename to "usage". That is probably a better > >> > name. However I have another patch intended to add more bit fields in > >> > the cluster info struct. The second patch adds "order" and the later > >> > patch will add more. That is why I choose bitfield to be more condense > >> > with bits. > >> > >> We still have space for another "u8" for "order". It appears trivial to > >> change it to bit fields when necessary in the future. > > > > We can, I don't see it necessary to change from bit field to u8 and > > back to bit field in the future. It is more of a personal preference > > issue. > > I have to say that I don't think that it's just a personal preference. > IMO, if it's unnecessary, we shouldn't use bit fields. You cannot > guarantee that your future changes will be merged in its current state. > So, I still think that it's better to avoid bit fields for now. That is surprising to hear, I am not dependent on any hardware physical bit location. Anyway, not too big a deal for me. I changed it to u16/u8. > >> > When you get the cluster from the list, you have a cluster pointer. I > >> > feel it is unnecessary to convert to index then back convert to > >> > cluster pointer inside lock_cluster(). I actually feel using indexes > >> > to refer to the cluster is error prone because we also have offset. > >> > >> I don't think so, it's common to use swap offset. > > > > Swap offset is not an issue, it is all over the place. The cluster > > index(offset/512) is the one I try to avoid. > > I have made some mistakes myself on offset vs index. > > Yes. That's not good, but it's hard to be avoided too. Can we make the > variable name more consistent? index: cluster index, offset: swap > offset. > > And, in fact, swap offset is the parameter of lock_cluster() instead of > cluster index. Right, when you get the cluster pointer from the list, it can't directly use lock_cluster(). > > >> > > >> >> > >> >> > + __free_cluster(si, ci); > >> >> > memset(si->swap_map + idx * SWAPFILE_CLUSTER, > >> >> > 0, SWAPFILE_CLUSTER); > >> >> > unlock_cluster(ci); > >> >> > @@ -521,20 +412,19 @@ static void swap_users_ref_free(struct percpu_ref *ref) > >> >> > complete(&si->comp); > >> >> > } > >> >> > > > [snip] > > >> >> > @@ -611,10 +497,10 @@ scan_swap_map_ssd_cluster_conflict(struct swap_info_struct *si, > >> >> > { > >> >> > struct percpu_cluster *percpu_cluster; > >> >> > bool conflict; > >> >> > - > >> >> > >> >> Usually we use one blank line after local variable declaration. > >> > Ack. > >> > > >> >> > >> >> > + struct swap_cluster_info *first = list_first_entry(&si->free_clusters, struct swap_cluster_info, list); > >> >> > offset /= SWAPFILE_CLUSTER; > >> >> > - conflict = !cluster_list_empty(&si->free_clusters) && > >> >> > - offset != cluster_list_first(&si->free_clusters) && > >> >> > + conflict = !list_empty(&si->free_clusters) && > >> >> > + offset != first - si->cluster_info && > >> >> > cluster_is_free(&si->cluster_info[offset]); > >> >> > > >> >> > if (!conflict) > >> >> > @@ -655,10 +541,14 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si, > >> >> > cluster = this_cpu_ptr(si->percpu_cluster); > >> >> > tmp = cluster->next[order]; > >> >> > if (tmp == SWAP_NEXT_INVALID) { > >> >> > - if (!cluster_list_empty(&si->free_clusters)) { > >> >> > - tmp = cluster_next(&si->free_clusters.head) * > >> >> > - SWAPFILE_CLUSTER; > >> >> > - } else if (!cluster_list_empty(&si->discard_clusters)) { > >> >> > + if (!list_empty(&si->free_clusters)) { > >> >> > + ci = list_first_entry(&si->free_clusters, struct swap_cluster_info, list); > >> >> > + list_del(&ci->list); > >> >> > >> >> The free cluster is deleted from si->free_clusters now. But later you > >> >> will call scan_swap_map_ssd_cluster_conflict() and may abandon the > >> >> cluster. And in alloc_cluster() later, it may be deleted again. > >> > > >> > Yes, that is a bug. Thanks for catching that. > >> > > >> >> > >> >> > + spin_lock(&ci->lock); > >> >> > + ci->state = CLUSTER_STATE_PER_CPU; > >> >> > >> >> Need to change ci->state when move a cluster off the percpu_cluster. > >> > > >> > In the next patch. This patch does not use the off state yet. > >> > >> But that is confusing to use wrong state name, the really meaning is > >> something like CLUSTER_STATE_NON_FREE. But as I suggested above, we can > > > > It can be FREE and on the per cpu pointer as well. That is the tricky part. > > It can happen on the current code as well. > > cluster_set_count_flag(0, 0) is called in alloc_cluster(). So, it's not > an issue in current code. If you need more, that shouldn't be done in > this patch. Revert to V1 like using the flags. Chris