Re: [PATCH v4 1/2] remoteproc: Add remoteproc character device interface

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 21, 2020 at 01:56:35PM -0700, Bjorn Andersson wrote:
> On Tue 21 Jul 12:16 PDT 2020, Siddharth Gupta wrote:
> > On 7/15/2020 2:51 PM, Mathieu Poirier wrote:
> > > On Wed, Jul 15, 2020 at 02:18:39PM -0600, Mathieu Poirier wrote:
> > > > On Tue, Jul 07, 2020 at 12:07:49PM -0700, Siddharth Gupta wrote:
> [..]
> > > > > diff --git a/drivers/remoteproc/remoteproc_cdev.c b/drivers/remoteproc/remoteproc_cdev.c
> [..]
> > > > > +int rproc_char_device_add(struct rproc *rproc)
> > > > > +{
> > > > > +	int ret;
> > > > > +	dev_t cdevt;
> > > > > +
> > > > > +	cdev_init(&rproc->char_dev, &rproc_fops);
> > > > > +	rproc->char_dev.owner = THIS_MODULE;
> > > > > +
> > > > > +	cdevt = MKDEV(rproc_major, rproc->index);
> > > > > +	ret = cdev_add(&rproc->char_dev, cdevt, 1);
> > > Trying this patchset on my side gave me the following splat[1].  After finding
> > > the root case I can't understand how you haven't see it on your side when you
> > > tested the feature.
> > > 
> > > [1]. https://pastebin.com/aYTUUCdQ
> 
> Mathieu, I've looked at this back and forth. Afaict this implies that
> rproc_major is still 0. Could it be that either alloc_chrdev_region()
> failed or somehow has yet to be called when you hit this point?

That is exacly what I thought when I first stumbled on this but instrumenting
the code showed otherwise.

After function rproc_init_cdev() has been called @rproc_major contains the
dynamically allocated major number in the upper 12 bits and the base minor
number in the lower 20 bits.

In rproc_char_device_add() we find this line:

        cdevt = MKDEV(rproc_major, rproc->index);

Macro MKDEV() builds a device number by shifting @rproc_major by 20 bits to the
left and OR'ing that with @rproc->index.  But the device's major number is
already occupying the upper 12bits, so shifthing another 20 bits to the left
makes the major portion of the device number '0'.  That is causing cdev_add() to
complain bitterly.

The right way to do this is:

        cdevt = MKDEV(MAJOR(rproc_major), rproc->index);

Once I found the problem I thought about 32/64 bit issues.  Since Siddharth is
using a 64bit application processor shifting another 20 bits would still have
yielded a non-zero value.  But that can't be since dev_t is a u32 in
linux/types.h.

As such I can't see how it is possible to not hit that problem on a 64bit
platform.

> 
> > Hey Mathieu,
> > 
> > We aren't able to reproduce the error that you are seeing, the splat is
> > coming
> > from the check for whiteout device[1] - which shouldn't happen because of
> > the
> > find_dynamic_major call[2], right?
> > 
> > We are successfully seeing all our character device files and able to
> > successfully boot remoteprocs. From what I read and understood about
> > whiteout
> > devices they will be hidden in the fs.
> > 
> > Could you provide more details about your configuration and testing?
> > 
> > [1]: https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L486
> > <https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L123>
> > [2]: https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L123
> > 
> > <https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L486>
> > > > > +	if (ret < 0)
> > > > > +		goto out;
> > > > > +
> > > > > +	rproc->dev.devt = cdevt;
> > > > > +out:
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +void rproc_char_device_remove(struct rproc *rproc)
> > > > > +{
> > > > > +	__unregister_chrdev(rproc_major, rproc->index, 1, "remoteproc");
> > > > > +}
> > > > > +
> > > > > +void __init rproc_init_cdev(void)
> > > > > +{
> > > > > +	int ret;
> > > > > +
> > > > > +	ret = alloc_chrdev_region(&rproc_major, 0, NUM_RPROC_DEVICES, "remoteproc");
> > > > > +	if (ret < 0)
> > > > > +		pr_err("Failed to alloc rproc_cdev region, err %d\n", ret);
> > > > > +}
> > > > > +
> > > > > +void __exit rproc_exit_cdev(void)
> > > > > +{
> > > > > +	unregister_chrdev_region(MKDEV(rproc_major, 0), NUM_RPROC_DEVICES);
> > > > Please go back to the comment I made on this during my last review and respin.
> > > After digging in the code while debugging the above problem, I don't see how
> > > unregistering the chrdev region the way it is done here would have worked.
> > Since this is compiled statically and not built as a module, we will never
> > exercise the code path, so I will remove it in the next patchset.
> > 
> 
> You're right Siddharth, since we changed CONFIG_REMOTEPROC to bool it's no longer
> possible to hit remoteproc_exit(), so you can omit this function
> entirely. (And we should clean up the rest of that as well)
> 
> [..]
> > > > > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> [..]
> > > > > @@ -488,6 +489,8 @@ struct rproc_dump_segment {
> > > > >    * @auto_boot: flag to indicate if remote processor should be auto-started
> > > > >    * @dump_segments: list of segments in the firmware
> > > > >    * @nb_vdev: number of vdev currently handled by rproc
> > > > > + * @char_dev: character device of the rproc
> > > > > + * @cdev_put_on_release: flag to indicate if remoteproc should be shutdown on @char_dev release
> > > > >    */
> > > > >   struct rproc {
> > > > >   	struct list_head node;
> > > > > @@ -523,6 +526,8 @@ struct rproc {
> > > > >   	int nb_vdev;
> > > > >   	u8 elf_class;
> > > > >   	u16 elf_machine;
> > > > > +	struct cdev char_dev;
> 
> As stated privately, I assumed based on this name that this is a struct
> device related to that character device. So please rename this cdev to
> save me from doing this mistake again.
> 
> Thanks,
> Bjorn



[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux