Re: Run queue corruption issue

On Wed, May 18, 2016 at 01:29:55AM -0400, Jerrin Shaji George wrote:
> Hi Greg,
> 
> Thanks for your response.
> 
> On Tue, May 17, 2016 at 7:20 PM, Greg KH <greg@xxxxxxxxx> wrote:
> > On Tue, May 17, 2016 at 06:55:07PM -0400, Jerrin Shaji George wrote:
> >> Hi All,
> >>
> >> I wanted help with a piece of code that I have been working on.
> >>
> >> Please see -
> >>
> >> https://gist.github.com/jerrinsg/333e584d1f65dc95b9f13b61dcebdaa7
> >>
> >> I have written two functions, migrate_to and migrate_back. migrate_to is used
> >> to remove a process from the run queue, and migrate_back is used to insert this
> >> process back into the run queue.
> >>
> >> The gist is taken from a larger project, where we are working on building
> >> a mechanism to support thread migration across heterogeneous processors.
> >> migrate_to_call() will be called by a thread which wants to remove itself from
> >> the run queue (hence, it will pass the current task struct as the migration
> >> argument). Once the other processor completes execution of the assigned task, it
> >> will interrupt the main processor, which runs an interrupt handler, which in
> >> turn calls the migrate_back_call() function. It passes the task struct of the
> >> process that was removed from the run queue earlier to this function.
> >>
> >> This mechanism works fine the first few times, but when this process is repeated
> >> many times in a loop, I am seeing a run queue corruption:
> >> https://gist.github.com/jerrinsg/0ab09cd435d8d2cb6ae692c7e6f4f26b
> >>
> >> Is there anything wrong in the process dequeue or enqueue function that I have
> >> written? Please help!
> >
> > volatile doesn't mean what you think it does, please don't use it in the
> > kernel.
> >
> 
> This flag was to be used for synchronization. I will change this.
> 
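To make the volatile point concrete, here is a minimal userspace sketch, not
kernel code. It assumes C11 atomics and pthreads as stand-ins for the kernel's
own locking and barrier primitives, and every name in it (payload, ready,
producer, run_handoff) is made up for illustration. A volatile flag only stops
the compiler from caching the flag itself. It does not order the other memory
accesses around it, which is what a synchronization flag actually needs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical example data: a payload guarded by a "ready" flag. */
static int payload;
static atomic_bool ready;

/* Writer: fill in the payload, then publish it with a release store. */
static void *producer(void *arg)
{
	(void)arg;
	payload = 42;	/* plain store, ordered before the release below */
	atomic_store_explicit(&ready, true, memory_order_release);
	return NULL;
}

/*
 * Reader: spin until the flag is set, using an acquire load so that the
 * payload store made before the release is guaranteed to be visible.
 * Returns the payload value that the reader observed.
 */
int run_handoff(void)
{
	pthread_t t;
	int rc, seen;

	payload = 0;
	atomic_store_explicit(&ready, false, memory_order_relaxed);

	rc = pthread_create(&t, NULL, producer, NULL);
	assert(rc == 0);

	while (!atomic_load_explicit(&ready, memory_order_acquire))
		;	/* busy-wait, for illustration only */

	seen = payload;
	pthread_join(t, NULL);
	return seen;
}
```

With the release/acquire pair, run_handoff() can only return the value the
producer stored. With a plain volatile flag, that ordering is not guaranteed
by the language. In the kernel you would reach for a lock, a completion, or
the documented barrier helpers rather than anything like this spin loop.
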
> > And why are you using "raw_spin_lock()"?
> 
> I used this after seeing it used in sched/core.c. Can you please let me know if I
> should instead use a different function to lock the run queue?

Ah, I don't know. I don't mess with the scheduler, thankfully :)

> >> Kernel used: Linux 3.13
> >
> > Wow that's obsolete and buggy, why use such an old thing?
> 
> This is the codebase that I inherited. Once I get the basic prototype working, I
> will be working to port it to a newer version of the kernel.

Try porting it to a modern kernel and then posting your real patch for
review, that would make things a bit more obvious and probably show your
bug better.

good luck,

greg k-h

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies