On Thu, Jul 07, 2016 at 07:18:34PM +0900, Alexandre Courbot wrote: > On Thu, Jul 7, 2016 at 5:17 PM, Joseph Lo <josephl@xxxxxxxxxx> wrote: > > On 07/06/2016 07:39 PM, Alexandre Courbot wrote: > >> > >> Sorry, I will probably need to do several passes on this one to > >> understand everything, but here is what I can say after a first look: > >> > >> On Tue, Jul 5, 2016 at 6:04 PM, Joseph Lo <josephl@xxxxxxxxxx> wrote: > >>> > >>> The Tegra BPMP (Boot and Power Management Processor) is designed for the > >>> booting process handling, offloading the power management tasks and > >>> some system control services from the CPU. It can be clock, DVFS, > >>> thermal/EDP, power gating operation and system suspend/resume handling. > >>> So the CPU and the drivers of these modules can base on the service that > >>> the BPMP firmware driver provided to signal the event for the specific PM > >>> action to BPMP and receive the status update from BPMP. > >>> > >>> Comparing to the ARM SCPI, the service provided by BPMP is message-based > >>> communication but not method-based. The BPMP firmware driver provides the > >>> send/receive service for the users, when the user concerns the response > >>> time. If the user needs to get the event or update from the firmware, it > >>> can request the MRQ service as well. The user needs to take care of the > >>> message format, which we call BPMP ABI. > >>> > >>> The BPMP ABI defines the message format for different modules or usages. > >>> For example, the clock operation needs an MRQ service code called > >>> MRQ_CLK with specific message format which includes different sub > >>> commands for various clock operations. This is the message format that > >>> BPMP can recognize. > >>> > >>> So the user needs two things to initiate IPC between BPMP. Get the > >>> service from the bpmp_ops structure and maintain the message format as > >>> the BPMP ABI defined. > >>> > >>> Based-on-the-work-by: > >>> Sivaram Nair <sivaramn@xxxxxxxxxx> > >>> > >>> Signed-off-by: Joseph Lo <josephl@xxxxxxxxxx> > >>> --- > >>> Changes in V2: > >>> - None > >>> --- > >>> drivers/firmware/tegra/Kconfig | 12 + > >>> drivers/firmware/tegra/Makefile | 1 + > >>> drivers/firmware/tegra/bpmp.c | 713 +++++++++++++++++ > >>> include/soc/tegra/bpmp.h | 29 + > >>> include/soc/tegra/bpmp_abi.h | 1601 > >>> +++++++++++++++++++++++++++++++++++++++ > >>> 5 files changed, 2356 insertions(+) > >>> create mode 100644 drivers/firmware/tegra/bpmp.c > >>> create mode 100644 include/soc/tegra/bpmp.h > >>> create mode 100644 include/soc/tegra/bpmp_abi.h > >>> > >>> diff --git a/drivers/firmware/tegra/Kconfig > >>> b/drivers/firmware/tegra/Kconfig > >>> index 1fa3e4e136a5..ff2730d5c468 100644 > >>> --- a/drivers/firmware/tegra/Kconfig > >>> +++ b/drivers/firmware/tegra/Kconfig > >>> @@ -10,4 +10,16 @@ config TEGRA_IVC > >>> keeps the content is synchronization between host CPU and > >>> remote > >>> processors. > >>> > >>> +config TEGRA_BPMP > >>> + bool "Tegra BPMP driver" > >>> + depends on ARCH_TEGRA && TEGRA_HSP_MBOX && TEGRA_IVC > >>> + help > >>> + BPMP (Boot and Power Management Processor) is designed to > >>> off-loading > >> > >> > >> s/off-loading/off-load > >> > >>> + the PM functions which include clock/DVFS/thermal/power from > >>> the CPU. > >>> + It needs HSP as the HW synchronization and notification module > >>> and > >>> + IVC module as the message communication protocol. > >>> + > >>> + This driver manages the IPC interface between host CPU and the > >>> + firmware running on BPMP. > >>> + > >>> endmenu > >>> diff --git a/drivers/firmware/tegra/Makefile > >>> b/drivers/firmware/tegra/Makefile > >>> index 92e2153e8173..e34a2f79e1ad 100644 > >>> --- a/drivers/firmware/tegra/Makefile > >>> +++ b/drivers/firmware/tegra/Makefile > >>> @@ -1 +1,2 @@ > >>> +obj-$(CONFIG_TEGRA_BPMP) += bpmp.o > >>> obj-$(CONFIG_TEGRA_IVC) += ivc.o > >>> diff --git a/drivers/firmware/tegra/bpmp.c > >>> b/drivers/firmware/tegra/bpmp.c > >>> new file mode 100644 > >>> index 000000000000..24fda626610e > >>> --- /dev/null > >>> +++ b/drivers/firmware/tegra/bpmp.c > >>> @@ -0,0 +1,713 @@ > >>> +/* > >>> + * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved. > >>> + * > >>> + * This program is free software; you can redistribute it and/or modify > >>> it > >>> + * under the terms and conditions of the GNU General Public License, > >>> + * version 2, as published by the Free Software Foundation. > >>> + * > >>> + * This program is distributed in the hope it will be useful, but > >>> WITHOUT > >>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > >>> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License > >>> for > >>> + * more details. > >>> + */ > >>> + > >>> +#include <linux/mailbox_client.h> > >>> +#include <linux/of.h> > >>> +#include <linux/of_address.h> > >>> +#include <linux/of_device.h> > >>> +#include <linux/platform_device.h> > >>> +#include <linux/semaphore.h> > >>> + > >>> +#include <soc/tegra/bpmp.h> > >>> +#include <soc/tegra/bpmp_abi.h> > >>> +#include <soc/tegra/ivc.h> > >>> + > >>> +#define BPMP_MSG_SZ 128 > >>> +#define BPMP_MSG_DATA_SZ 120 > >>> + > >>> +#define __MRQ_ATTRS 0xff000000 > >>> +#define __MRQ_INDEX(id) ((id) & ~__MRQ_ATTRS) > >>> + > >>> +#define DO_ACK BIT(0) > >>> +#define RING_DOORBELL BIT(1) > >>> + > >>> +struct tegra_bpmp_soc_data { > >>> + u32 ch_index; /* channel index */ > >>> + u32 thread_ch_index; /* thread channel index */ > >>> + u32 cpu_rx_ch_index; /* CPU Rx channel index */ > >>> + u32 nr_ch; /* number of total channels */ > >>> + u32 nr_thread_ch; /* number of thread channels */ > >>> + u32 ch_timeout; /* channel timeout */ > >>> + u32 thread_ch_timeout; /* thread channel timeout */ > >>> +}; > >> > >> > >> With just these comments it is not clear what everything in this > >> structure does. Maybe a file-level comment explaining how BPMP > >> basically works and what the different channels are allocated to would > >> help understanding the code. > > > > > > We have two kinds of TX channels (channel & thread channel above) for the > > BPMP clients (clock, thermal, reset, power mgmt control, etc.) to use. > > > > The channel means an atomic channel that could be used when the client needs > > the response immediately. e.g. setting clock rate, re-parent the clock > > source. Each CPUs have it's own atomic for the usage. The client can acquire > > one of them, and the ch_index means the first channel they are able to use > > in the channel array. > > > > The response of thread channel can be postponed later. And the client allows > > getting the response after BPMP finished the service and response to them by > > IRQ. The thread_ch_index means the same the first channel that the clients > > are available to use. > > > > And the CPU RX channel is designed for the client to register some specific > > services (We call MRQ in the bpmp_abi.) listen to some update from the BPMP > > firmware. > > > > Because we might have different numbers of these channels, using this > > structure as the bpmp_soc_data to get different configuration according to > > different SoC. > > Thanks, that clarifies things. This explanation deserves to in the C > file as well IMHO. > > So IIUC the first 13 channels (6 bound to a specific CPU core and 7 > threaded, allocated dynamically) are all used to initiate a > communication to the BPMP, while the cpu_rx channel is used as a sort > of IRQ (hence the name MRQ). Is this correct? This would be valuable > to state too. Maybe cpu_rx_ch_index can even be renamed to something > like mrq_ch_index to stress that fact. > > A few additional comments follow below as I did a second pass on the code. > > > > > > >> > >>> + > >>> +struct channel_info { > >>> + u32 tch_free; > >>> + u32 tch_to_complete; > >>> + struct semaphore tch_sem; > >>> +}; > >>> + > >>> +struct mb_data { > >>> + s32 code; > >>> + s32 flags; > >>> + u8 data[BPMP_MSG_DATA_SZ]; > >>> +} __packed; > >>> + > >>> +struct channel_data { > >>> + struct mb_data *ib; > >>> + struct mb_data *ob; > >>> +}; > >>> + > >>> +struct mrq { > >>> + struct list_head list; > >>> + u32 mrq_code; > >>> + bpmp_mrq_handler handler; > >>> + void *data; > >>> +}; > >>> + > >>> +struct tegra_bpmp { > >>> + struct device *dev; > >>> + const struct tegra_bpmp_soc_data *soc_data; > >>> + void __iomem *tx_base; > >>> + void __iomem *rx_base; > >>> + struct mbox_client cl; > >>> + struct mbox_chan *chan; > >>> + struct ivc *ivc_channels; > >>> + struct channel_data *ch_area; > >>> + struct channel_info ch_info; > >>> + struct completion *ch_completion; > >>> + struct list_head mrq_list; > >>> + struct tegra_bpmp_ops *ops; > >>> + spinlock_t lock; > >>> + bool init_done; > >>> +}; > >>> + > >>> +static struct tegra_bpmp *bpmp; > >> > >> > >> static? Ok, we only need one... for now. How about a private member in > >> your ivc structure that allows you to retrieve the bpmp and going > >> dynamic? This will require an extra argument in many functions, but is > >> cleaner design IMHO. > > > > > > IVC is designed as a generic IPC protocol among different modules (We have > > not introduced some other usages of the IVC right now.). Maybe don't churn > > some other stuff into IVC is better. > > Anything is fine if you can get rid of that static. > > > > >> > >>> + > >>> +static int bpmp_get_thread_ch(int idx) > >>> +{ > >>> + return bpmp->soc_data->thread_ch_index + idx; > >>> +} > >>> + > >>> +static int bpmp_get_thread_ch_index(int ch) > >>> +{ > >>> + if (ch < bpmp->soc_data->thread_ch_index || > >>> + ch >= bpmp->soc_data->cpu_rx_ch_index) > >> > >> > >> Shouldn't that be ch >= bpmp->soc_data->cpu_rx_ch_index + > >> bpmp->soc_data->nr_thread_ch? > >> > >> Either rx_ch_index indicates the upper bound of the threaded channels, > >> and in that case you don't need tegra_bpmp_soc_data::nr_thread_ch, or > >> it can be anywhere else and you should use the correct member. > > > > > > According the to the table below, we have 14 channels. > > atomic ch: 0 ~ 5, 6 chanls > > thread ch: 6 ~ 17, 7 chanls > > CPU RX ch: 13 ~ 14, 2 chanls Or, did you mean thread ch: 6 -> 12 cpu rx ch: 13 (1 channel) > > > > +static const struct tegra_bpmp_soc_data soc_data_tegra186 = { > > + .ch_index = 0, > > + .thread_ch_index = 6, > > + .cpu_rx_ch_index = 13, > > + .nr_ch = 14, > > + .nr_thread_ch = 7, > > + .ch_timeout = 60 * USEC_PER_SEC, > > + .thread_ch_timeout = 600 * USEC_PER_SEC, > > +}; > > > > We use the index to check channel violation and nr_thread_ch for other usage > > to avoid redundant channel number calculation elsewhere. > > Sorry, my comment had a mistake. I meant that > > ch >= bpmp->soc_data->cpu_rx_ch_index > > Should maybe be > > ch >= bpmp->soc_data->cpu_rx_ch_index + bpmp->soc_data->nr_thread_ch Or did you mean ch >= bpmp->soc_data->thread_ch_index + bpmp->soc_data->nr_thread_ch ? > > According to the description you gave of these fields, there is no > guarantee that cpu_rx_ch_index will always be the first channel after > the threaded channels. I second Alex's concerns. It would better not to depend on the adjacency of the channels. Also I think this data should come from the device tree. > > > > > > >> > >>> + return -1; > >>> + return ch - bpmp->soc_data->thread_ch_index; > >>> +} > >>> + > >>> +static int bpmp_get_ob_channel(void) > >>> +{ > >>> + return smp_processor_id() + bpmp->soc_data->ch_index; > >>> +} > >>> + > >>> +static struct completion *bpmp_get_completion_obj(int ch) > >>> +{ > >>> + int i = bpmp_get_thread_ch_index(ch); > >>> + > >>> + return i < 0 ? NULL : bpmp->ch_completion + i; > >>> +} > >>> + > >>> +static int bpmp_valid_txfer(void *ob_data, int ob_sz, void *ib_data, int > >>> ib_sz) > >>> +{ > >>> + return ob_sz >= 0 && ob_sz <= BPMP_MSG_DATA_SZ && > >>> + ib_sz >= 0 && ib_sz <= BPMP_MSG_DATA_SZ && > >>> + (!ob_sz || ob_data) && (!ib_sz || ib_data); > >>> +} > >>> + > >>> +static bool bpmp_master_acked(int ch) > >>> +{ > >>> + struct ivc *ivc_chan; > >>> + void *frame; > >>> + bool ready; > >>> + > >>> + ivc_chan = bpmp->ivc_channels + ch; > >>> + frame = tegra_ivc_read_get_next_frame(ivc_chan); > >>> + ready = !IS_ERR_OR_NULL(frame); > >>> + bpmp->ch_area[ch].ib = ready ? frame : NULL; > >>> + > >>> + return ready; > >>> +} > >>> + > >>> +static int bpmp_wait_ack(int ch) > > Shouldn't this be bpmp_wait_master_ack ? Looking at the two next > functions makes me think it should (or bpmp_wait_master_free should be > renamed to bpmp_wait_free). > > >>> +{ > >>> + ktime_t t; > >>> + > >>> + t = ns_to_ktime(local_clock()); > >>> + > >>> + do { > >>> + if (bpmp_master_acked(ch)) > >>> + return 0; > >>> + } while (ktime_us_delta(ns_to_ktime(local_clock()), t) < > >>> + bpmp->soc_data->ch_timeout); > >>> + > >>> + return -ETIMEDOUT; > >>> +} > >>> + > >>> +static bool bpmp_master_free(int ch) > >>> +{ > >>> + struct ivc *ivc_chan; > >>> + void *frame; > >>> + bool ready; > >>> + > >>> + ivc_chan = bpmp->ivc_channels + ch; > >>> + frame = tegra_ivc_write_get_next_frame(ivc_chan); > >>> + ready = !IS_ERR_OR_NULL(frame); > >>> + bpmp->ch_area[ch].ob = ready ? frame : NULL; > >>> + > >>> + return ready; > >>> +} > >>> + > >>> +static int bpmp_wait_master_free(int ch) > >>> +{ > >>> + ktime_t t; > >>> + > >>> + t = ns_to_ktime(local_clock()); > >>> + > >>> + do { > >>> + if (bpmp_master_free(ch)) > >>> + return 0; > >>> + } while (ktime_us_delta(ns_to_ktime(local_clock()), t) > >>> + < bpmp->soc_data->ch_timeout); > >>> + > >>> + return -ETIMEDOUT; > >>> +} > >>> + > >>> +static int __read_ch(int ch, void *data, int sz) > >>> +{ > >>> + struct ivc *ivc_chan; > >>> + struct mb_data *p; > >>> + > >>> + ivc_chan = bpmp->ivc_channels + ch; > >>> + p = bpmp->ch_area[ch].ib; > >>> + if (data) > >>> + memcpy_fromio(data, p->data, sz); > >>> + > >>> + return tegra_ivc_read_advance(ivc_chan); > >>> +} > >>> + > >>> +static int bpmp_read_ch(int ch, void *data, int sz) > > bpmp_read_threaded_ch maybe? we have bpmp_write_threaded_ch below, as > this function is clearly dealing with threaded channels only. > > >>> +{ > >>> + unsigned long flags; > >>> + int i, ret; > >>> + > >>> + i = bpmp_get_thread_ch_index(ch); > >> > >> > >> i is not a very good name for this variable. > >> Also note that bpmp_get_thread_ch_index() can return -1, this case is > >> not handled. > > > > Okay, will fix this. > > > > > >> > >>> + > >>> + spin_lock_irqsave(&bpmp->lock, flags); > >>> + ret = __read_ch(ch, data, sz); > >>> + bpmp->ch_info.tch_free |= (1 << i); > >>> + spin_unlock_irqrestore(&bpmp->lock, flags); > >>> + > >>> + up(&bpmp->ch_info.tch_sem); > >>> + > >>> + return ret; > >>> +} > >>> + > >>> +static int __write_ch(int ch, int mrq_code, int flags, void *data, int > >>> sz) > >>> +{ > >>> + struct ivc *ivc_chan; > >>> + struct mb_data *p; > >>> + > >>> + ivc_chan = bpmp->ivc_channels + ch; > >>> + p = bpmp->ch_area[ch].ob; > >>> + > >>> + p->code = mrq_code; > >>> + p->flags = flags; > >>> + if (data) > >>> + memcpy_toio(p->data, data, sz); > >>> + > >>> + return tegra_ivc_write_advance(ivc_chan); > >>> +} > >>> + > >>> +static int bpmp_write_threaded_ch(int *ch, int mrq_code, void *data, int > >>> sz) > >>> +{ > >>> + unsigned long flags; > >>> + int ret, i; > >>> + > >>> + ret = down_timeout(&bpmp->ch_info.tch_sem, > >>> + > >>> usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout)); > >>> + if (ret) > >>> + return ret; > >>> + > >>> + spin_lock_irqsave(&bpmp->lock, flags); > >>> + > >>> + i = __ffs(bpmp->ch_info.tch_free); > >>> + *ch = bpmp_get_thread_ch(i); > >>> + ret = bpmp_master_free(*ch) ? 0 : -EFAULT; > >>> + if (!ret) { > > Style nit: I prefer to make the error case the exception, and normal > runtime the norm. This is where a goto statement can actually make > your code easier to follow. Have an err: label before the spin_unlock, > and jump to it if ret != 0. Then you can have the next three lines at > the lower indentation level, and not looking like as if they were an > error themselves. > > Or if you really don't like the goto, check for ret != 0 and do the > spin_unlock and return in that block. > > >>> + bpmp->ch_info.tch_free &= ~(1 << i); > >>> + __write_ch(*ch, mrq_code, DO_ACK | RING_DOORBELL, data, > >>> sz); > >>> + bpmp->ch_info.tch_to_complete |= (1 << *ch); > >>> + } > >>> + > >>> + spin_unlock_irqrestore(&bpmp->lock, flags); > >>> + > >>> + return ret; > >>> +} > >>> + > >>> +static int bpmp_write_ch(int ch, int mrq_code, int flags, void *data, > >>> int sz) > >>> +{ > >>> + int ret; > >>> + > >>> + ret = bpmp_wait_master_free(ch); > >>> + if (ret) > >>> + return ret; > >>> + > >>> + return __write_ch(ch, mrq_code, flags, data, sz); > >>> +} > >>> + > >>> +static int bpmp_send_receive_atomic(int mrq_code, void *ob_data, int > >>> ob_sz, > >>> + void *ib_data, int ib_sz) > >>> +{ > >>> + int ch, ret; > >>> + > >>> + if (WARN_ON(!irqs_disabled())) > >>> + return -EPERM; > >>> + > >>> + if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz)) > >>> + return -EINVAL; > >>> + > >>> + if (!bpmp->init_done) > >>> + return -ENODEV; > >>> + > >>> + ch = bpmp_get_ob_channel(); > >>> + ret = bpmp_write_ch(ch, mrq_code, DO_ACK, ob_data, ob_sz); > >>> + if (ret) > >>> + return ret; > >>> + > >>> + ret = mbox_send_message(bpmp->chan, NULL); > >>> + if (ret < 0) > >>> + return ret; > >>> + mbox_client_txdone(bpmp->chan, 0); > >>> + > >>> + ret = bpmp_wait_ack(ch); > >>> + if (ret) > >>> + return ret; > >>> + > >>> + return __read_ch(ch, ib_data, ib_sz); > >>> +} > >>> + > >>> +static int bpmp_send_receive(int mrq_code, void *ob_data, int ob_sz, > >>> + void *ib_data, int ib_sz) > >>> +{ > >>> + struct completion *comp_obj; > >>> + unsigned long timeout; > >>> + int ch, ret; > >>> + > >>> + if (WARN_ON(irqs_disabled())) > >>> + return -EPERM; > >>> + > >>> + if (!bpmp_valid_txfer(ob_data, ob_sz, ib_data, ib_sz)) > >>> + return -EINVAL; > >>> + > >>> + if (!bpmp->init_done) > >>> + return -ENODEV; > >>> + > >>> + ret = bpmp_write_threaded_ch(&ch, mrq_code, ob_data, ob_sz); > >>> + if (ret) > >>> + return ret; > >>> + > >>> + ret = mbox_send_message(bpmp->chan, NULL); > >>> + if (ret < 0) > >>> + return ret; > >>> + mbox_client_txdone(bpmp->chan, 0); > >>> + > >>> + comp_obj = bpmp_get_completion_obj(ch); > >>> + timeout = usecs_to_jiffies(bpmp->soc_data->thread_ch_timeout); > >>> + if (!wait_for_completion_timeout(comp_obj, timeout)) > >>> + return -ETIMEDOUT; > >>> + > >>> + return bpmp_read_ch(ch, ib_data, ib_sz); > >>> +} > >>> + > >>> +static struct mrq *bpmp_find_mrq(u32 mrq_code) > >>> +{ > >>> + struct mrq *mrq; > >>> + > >>> + list_for_each_entry(mrq, &bpmp->mrq_list, list) { > >>> + if (mrq_code == mrq->mrq_code) > >>> + return mrq; > >>> + } > >>> + > >>> + return NULL; > >>> +} > >>> + > >>> +static void bpmp_mrq_return_data(int ch, int code, void *data, int sz) > >>> +{ > >>> + int flags = bpmp->ch_area[ch].ib->flags; > >>> + struct ivc *ivc_chan; > >>> + struct mb_data *frame; > >>> + int ret; > >>> + > >>> + if (WARN_ON(sz > BPMP_MSG_DATA_SZ)) > >>> + return; > >>> + > >>> + ivc_chan = bpmp->ivc_channels + ch; > >>> + ret = tegra_ivc_read_advance(ivc_chan); > >>> + WARN_ON(ret); > >>> + > >>> + if (!(flags & DO_ACK)) > >>> + return; > >>> + > >>> + frame = tegra_ivc_write_get_next_frame(ivc_chan); > >>> + if (IS_ERR_OR_NULL(frame)) { > >>> + WARN_ON(1); > >>> + return; > >>> + } > >>> + > >>> + frame->code = code; > >>> + if (data != NULL) > >>> + memcpy_toio(frame->data, data, sz); > >>> + ret = tegra_ivc_write_advance(ivc_chan); > >>> + WARN_ON(ret); > >>> + > >>> + if (flags & RING_DOORBELL) { > >>> + ret = mbox_send_message(bpmp->chan, NULL); > >>> + if (ret < 0) { > >>> + WARN_ON(1); > >>> + return; > >>> + } > >>> + mbox_client_txdone(bpmp->chan, 0); > >>> + } > >>> +} > >>> + > >>> +static void bpmp_mail_return(int ch, int ret_code, int val) > >>> +{ > >>> + bpmp_mrq_return_data(ch, ret_code, &val, sizeof(val)); > >>> +} > >>> + > >>> +static void bpmp_handle_mrq(int mrq_code, int ch) > >>> +{ > >>> + struct mrq *mrq; > >>> + > >>> + spin_lock(&bpmp->lock); > >>> + > >>> + mrq = bpmp_find_mrq(mrq_code); > >>> + if (!mrq) { > >>> + spin_unlock(&bpmp->lock); > >>> + bpmp_mail_return(ch, -EINVAL, 0); > >>> + return; > >>> + } > >>> + > >>> + mrq->handler(mrq_code, mrq->data, ch); > >>> + > >>> + spin_unlock(&bpmp->lock); > >>> +} > >>> + > >>> +static int bpmp_request_mrq(int mrq_code, bpmp_mrq_handler handler, void > >>> *data) > >>> +{ > >>> + struct mrq *mrq; > >>> + unsigned long flags; > >>> + > >>> + if (!handler) > >>> + return -EINVAL; > >>> + > >>> + mrq = devm_kzalloc(bpmp->dev, sizeof(*mrq), GFP_KERNEL); > >>> + if (!mrq) > >>> + return -ENOMEM; > >>> + > >>> + spin_lock_irqsave(&bpmp->lock, flags); > >>> + > >>> + mrq->mrq_code = __MRQ_INDEX(mrq_code); > >>> + mrq->handler = handler; > >>> + mrq->data = data; > >>> + list_add(&mrq->list, &bpmp->mrq_list); > >>> + > >>> + spin_unlock_irqrestore(&bpmp->lock, flags); > >>> + > >>> + return 0; > >>> +} > >>> + > >>> +static void bpmp_mrq_handle_ping(int mrq_code, void *data, int ch) > >>> +{ > >>> + int challenge; > >>> + int reply; > >>> + > >>> + challenge = *(int *)bpmp->ch_area[ch].ib->data; > >>> + reply = challenge << (smp_processor_id() + 1); > >>> + bpmp_mail_return(ch, 0, reply); > >>> +} > >>> + > >>> +static int bpmp_mailman_init(void) > >>> +{ > >>> + return bpmp_request_mrq(MRQ_PING, bpmp_mrq_handle_ping, NULL); > >>> +} > >>> + > >>> +static int bpmp_ping(void) > >>> +{ > >>> + unsigned long flags; > >>> + ktime_t t; > >>> + int challenge = 1; > >> > >> > >> Mmmm, shouldn't use a mrq_ping_request instead of an parameter which > >> size may vary depending on the architecture? On a 64-bit big endian > >> architecture, your messages would be corrupted. > > > > > > Clarify one thig first. The mrq_ping_request and mrq_handle_ping above are > > used for the ping form BPMP to CPU. Like I said above, it's among CPU RX > > channel to get some information from BPMP firmware. > > Ok, so mrq_handle_ping *should* use these data structures at the very least. > > > > > Here is the ping request from CPU to BPMP to make sure we can IPC with BPMP > > during the probe stage. > > > > About the endian issue, I think we don't consider that in the message format > > right now. So I think we only support little endian for the IPC messages > > right now. > > Any code in the kernel should function correctly regardless of > endianness. And the problem is not so much with endianness as it is > with the operand size - is the BPMP expecting a 64-bit challenge here? > Considering that the equivalent MRQ uses a 32-bit integer, I'd bet > not. So please use u32/u64 as needed as well as cpu_to_leXX (and > leXX_to_cpu for the opposite) to make your code solid. I second this. > > I understand that you don't want to use the MRQ structures because we > are not handling a MRQ here, but if they are relevant I think this > would still be safer that constructing messages from scalar data. That > or we should introduce a proper structure for these messages, but here > using the MRQ structure looks acceptable to me. Maybe they should not > be named MRQ at all, but that's not for us to decide. We should be using the mrq request structures from the ABI header. > > > > >> > >>> + int reply = 0; > >> > >> > >> And this should probably be a mrq_ping_response. These remarks may > >> also apply to bpmp_mrq_handle_ping(). > > > > That is for receiving the ping request from BPMP. > > > >> > >>> + int ret; > >>> + > >>> + t = ktime_get(); > >>> + local_irq_save(flags); > >>> + ret = bpmp_send_receive_atomic(MRQ_PING, &challenge, > >>> sizeof(challenge), > >>> + &reply, sizeof(reply)); > >>> + local_irq_restore(flags); > >>> + t = ktime_sub(ktime_get(), t); > >>> + > >>> + if (!ret) > >>> + dev_info(bpmp->dev, > >>> + "ping ok: challenge: %d, reply: %d, time: > >>> %lld\n", > >>> + challenge, reply, ktime_to_us(t)); > >>> + > >>> + return ret; > >>> +} > >>> + > >>> +static int bpmp_get_fwtag(void) > >>> +{ > >>> + unsigned long flags; > >>> + void *vaddr; > >>> + dma_addr_t paddr; > >>> + u32 addr; > >> > >> > >> Here also we should use a mrq_query_tag_request. > > > > The is one-way request from CPU to BPMP. So we don't request an MRQ for > > that. It is not clear to me what you mean by 'one-way request' here. We are sending a request to get the tag and we are getting the tag back via the same message's response. Anyway, we should be using the 'struct mrq_query_tag_request' here to be consistent. > > > >> > >>> + int ret; > >>> + > >>> + vaddr = dma_alloc_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, &paddr, > >>> + GFP_KERNEL); > >> > >> > >> dma_addr_t may be 64 bit here, and you may get an address higher than > >> the 32 bits allowed by mrq_query_tag_request! I guess you want to add > >> GFP_DMA32 as flag to your call to dma_alloc_coherent. > > > > BPMP should able to handle the address above 32 bits, but I am not sure does > > it configure to support that? Either way, since this specific MRQ takes in only 32 bit address, I think we should follow Alex's recommendation to use the GFP_DMA32 flag. > > If the message you pass only contains a 32-bit address, then I'm > afraid the protocol is the limiting factor here until it is updated. > > Can't wait for the day when we will have to manage several versions of > this protocol! >_< If we need to pass a larger-than-32 bit address for this MRQ (or for anything that takes in a 32-bit address now), the agreed upon process is to define a new MRQ (i.e one with a different integer id) that takes in new address type (and deprecating the 32-bit MRQ version). > > > > > Will fix this. > > > > > >> > >>> + if (!vaddr) > >>> + return -ENOMEM; > >>> + addr = paddr; > >>> + > >>> + local_irq_save(flags); > >>> + ret = bpmp_send_receive_atomic(MRQ_QUERY_TAG, &addr, > >>> sizeof(addr), > >>> + NULL, 0); > >>> + local_irq_restore(flags); > >>> + > >>> + if (!ret) > >>> + dev_info(bpmp->dev, "fwtag: %s\n", (char *)vaddr); > >>> + > >>> + dma_free_coherent(bpmp->dev, BPMP_MSG_DATA_SZ, vaddr, paddr); > >>> + > >>> + return ret; > >>> +} > >>> + > >>> +static void bpmp_signal_thread(int ch) > >>> +{ > >>> + int flags = bpmp->ch_area[ch].ob->flags; > >>> + struct completion *comp_obj; > >>> + > >>> + if (!(flags & RING_DOORBELL)) > >>> + return; > >>> + > >>> + comp_obj = bpmp_get_completion_obj(ch); > >>> + if (!comp_obj) { > >>> + WARN_ON(1); > >>> + return; > >>> + } > >>> + > >>> + complete(comp_obj); > >>> +} > >>> + > >>> +static void bpmp_handle_rx(struct mbox_client *cl, void *data) > >>> +{ > >>> + int i, rx_ch; > >>> + > >>> + rx_ch = bpmp->soc_data->cpu_rx_ch_index; > >>> + > >>> + if (bpmp_master_acked(rx_ch)) > >>> + bpmp_handle_mrq(bpmp->ch_area[rx_ch].ib->code, rx_ch); > >>> + > >>> + spin_lock(&bpmp->lock); > >>> + > >>> + for (i = 0; i < bpmp->soc_data->nr_thread_ch && > >>> + bpmp->ch_info.tch_to_complete; i++) { > > for_each_set_bit(bpmp->ch_info.tch_to_complete, &i, > bpmp->soc_data->nr_thread_ch) ? > > This will reduce the number of iterations and you won't have to do the > bpmp->ch_info.tch_to_complete & (1 << ch) check below. > > >>> + int ch = bpmp_get_thread_ch(i); > >>> + > >>> + if ((bpmp->ch_info.tch_to_complete & (1 << ch)) && > >>> + bpmp_master_acked(ch)) { > >>> + bpmp->ch_info.tch_to_complete &= ~(1 << ch); > >>> + bpmp_signal_thread(ch); > >>> + } > >>> + } > >>> + > >>> + spin_unlock(&bpmp->lock); > >>> +} > >>> + > >>> +static void bpmp_ivc_notify(struct ivc *ivc) > >>> +{ > >>> + int ret; > >>> + > >>> + ret = mbox_send_message(bpmp->chan, NULL); > >>> + if (ret < 0) > >>> + return; > >>> + > >>> + mbox_send_message(bpmp->chan, NULL); > >> > >> > >> Why the second call to mbox_send_message? May to useful to add a > >> comment explaining it. > > > > Ah!! It should be mbox_client_txdone(). Good catch. > > That makes more sense. :) But did this code work even with that typo? It should have --- mbox_client_txdone() essentilly does nothing now. -- To unsubscribe from this list: send the line "unsubscribe linux-tegra" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html