Re: [PATCH v9 4/5] Input: cs40l50 - Add support for the CS40L50 haptic driver

James Ogletree <James.Ogletree@xxxxxxxxxx> · Fri, 15 Mar 2024 20:23:33 +0000

Hi Jeff,

> On Mar 14, 2024, at 10:54 AM, Jeff LaBundy <jeff@xxxxxxxxxxx> wrote:
> 
> Hi James,
> 
> On Fri, Mar 08, 2024 at 10:24:20PM +0000, James Ogletree wrote:
>> 
>> +static void vibra_start_worker(struct work_struct *work)
>> +{
>> + struct vibra_work *work_data = container_of(work, struct vibra_work, work);
>> + struct vibra_info *info = work_data->info;
>> + struct vibra_effect *start_effect;
>> +
>> + if (pm_runtime_resume_and_get(info->dev) < 0)
>> + goto err_free;
>> +
>> + mutex_lock(&info->lock);
>> +
>> + start_effect = vibra_find_effect(work_data->effect->id, &info->effect_head);
>> + if (start_effect) {
>> + while (--work_data->count >= 0) {
>> + info->dsp.write(info->dev, info->regmap, start_effect->index);
>> + usleep_range(work_data->effect->replay.length,
>> +     work_data->effect->replay.length + 100);
>> + }
>> + }
>> +
>> + mutex_unlock(&info->lock);
>> +
>> + if (!start_effect)
>> + dev_err(info->dev, "Effect to play not found\n");
>> +
>> + pm_runtime_mark_last_busy(info->dev);
>> + pm_runtime_put_autosuspend(info->dev);
>> +err_free:
>> + kfree(work_data);
> 
> This business of allocating memory in one thread and freeing it in another
> got pretty hard to follow, which means it will be hard to maintain. I know
> there are some restrictions around writing into the HALO core, but I'd like
> to see something simpler; the da7280 driver is nowhere near as complex.
> 
> How many s16 words do you realistically need for custom_data? Is it possible
> to simply include an array in the cs40l50_vibra_info struct, and bleat if
> user space tries to upload greater than the maximum size? This seems to be
> the approach of the much simpler da7280 driver as well.
> 
> This array would go on the heap just the same, but you do not have to manage
> it so carefully. Opinions may vary, but mine is that memory allocation and
> freeing should be done in the same scope where possible.

The original issue was that the old implementation of playback effect had
a race condition which Dmitry pointed out. Consider when cs40l50_playback
is called: it populates effect data (including the index, say 1) and
schedules a trigger work to be run in the future. Before this work is
executed, a second call to cs40l50_playback is made, and again effect is
populated (say, index 2), and trigger work is scheduled again. If effect
is in shared data, then it will be overwritten by the second call and index
2 will be triggered twice.

So, I needed a separate work item for each playback to hold its individual
data so that effect cannot be overwritten by other playback calls. This
meant that I needed to take the playback work item and work-related data
out of shared memory (cs40l50_vibra_info), and create a structure
containing all the work data tied to that specific queue_work invocation,
as well as the work_struct itself so we can access that unique work data
via container_of() in the worker function.

Of course to have individuated work data, we need to allocate and populate
a new cs40l50_work with data in each input callback. Sometimes this
entails dynamic allocation. Why? Because sometimes we have no way of
knowing when we can let go of the work data. Contrast upload and erase:
these callbacks are not called in atomic context, so we can use flush_work
to guarantee the worker function returns, and therefore we can free the
work data in the callback. cs40l50_playback on the other hand is called
in atomic context, so we cannot use flush_work, and so we have to
"hold on" to the work data until such time known only by the worker
function itself. Hence we must free it there. So I think the complexity
here falls into the necessary bucket.

I don't think the above applies to the custom data memory, since that
particular memory is freed in the same function it is allocated in.
Looking at the possible use cases, the ceiling on the size of the data
is quite high if the customer wishes to upload a long sequence of
effects. So I prefer to keep that as it currently is.

It might be possible to use the "individuated work item" design for
playbacks only, and use the old "shared work item" for the other
callbacks. However, I decided that having one uniform implementation
would be better than mixing two schemes.

Best,
James