[PATCH 1/1] diag: fix userspace read-write freeze during high logging rates

Kasidit Yusuf <kasidit@xxxxxxxxxxx> · Mon, 20 May 2013 15:17:39 +0700

At very high diagchar logging rates (a HSPA+ ftp download with many
instantaneous diag log response messages enabled) on quad-core phones,
it was observed that after about 2 to 20 minutes (depends on logging rate)
- the userspace read to /dev/diag would block, causing a freeze effect
to the userspace program - and further attempts to log would randomly
freeze at an earlier stage then a phone restart was required to do any
meaningful logging. You can test this with "diag_mdlog", use a diag.cfg
with high logging rate log_response_messages like 0x4222, 0x4179 and other
fast logs.

The patched kernel was tested on a "Google Nexus 4 with Android 4.2.2"
(mako) phone - and had no more freezes even after 13 Hours of logging diag
for over-night ftp tests, an older sister-kernel (essentially the same -
but with many more pr_debug() to identify the causes) was also tested
sucessfully on two other "Google Nexus 4" phones, one of them logged
successfully through about 10 hours of fast ftp downloads - no freezes too.

This fix consists of 4 main parts:

1. There were cases that diag_device_write() completed the "enque" and sets
the data_ready flag during an ongoing diagchar_read() which would then
clear the flag on completion - resulting in a lost data_ready notification
and diagchar_read() would block by waiting for the lost wakeup.
The extra flag (array) "data_ready_while_busy" was introduced to
handle that situation.

2. It was observed that zero length buffers in
diag_hsic_read_complete_callback() was causing problems too
- sending zero length entries to the hsic_buf_tbl queue could cause it to
never be properly dequeued and never "diagmem_free()". This is because
diagfwd.c's diag_device_write() would successfully "enqueue" it but then
diagchar_core.c's diagchar_read() would omit it because
"if (driver->hsic_buf_tbl[i].length > 0)".
So it won't be "dequeued" and also diagmem_free would not be called on it.
This is fixed by not allowing zero-length entries into diag_device_write()
- and then calling diagmem_free() on it instead.

3. Consecutive writes (try 10+ conf writes before reading) to diagchar
from userspace would block, although normal steps and ioctl were done.
This was because the diagchar_write() was blocked in
wait_event_interruptible() but the diag_hsic_write_complete_callback()
didn't wake it up. I simply added a wake_up_interruptible() in that
callback and now the consecutive writes complete sucessfully.

4. At very high logging rates (on the Nexus 4 quad-core phone), it was
observed that "count_hsic_write_pool" faced read/write race conditions
(to increment and decrement) and became incorrect - even negative - and
"diag_read_hsic_work_fn" would use it to "determine the current number of
available buffers for writing after reading from the HSIC has completed"
- so the code was modified to determine the number of entries available by
iterating through "hsic_buf_tbl" instead.

Also, added a "driver->num_clients--" to diagchar_close() to update
num_clints on close() from userspace.

Signed-off-by: Kasidit Yusuf <kasidit@xxxxxxxxxxx>
---
 drivers/char/diag/diagchar.h      |  1 +
 drivers/char/diag/diagchar_core.c | 21 ++++++++++-
 drivers/char/diag/diagfwd.c       | 16 ++++++++
 drivers/char/diag/diagfwd_hsic.c  | 79 ++++++++++++++++++++++++++++++++-------
 4 files changed, 103 insertions(+), 14 deletions(-)

diff --git a/drivers/char/diag/diagchar.h b/drivers/char/diag/diagchar.h
index 8d87cf9..4281912 100644
--- a/drivers/char/diag/diagchar.h
+++ b/drivers/char/diag/diagchar.h
@@ -142,6 +142,7 @@ struct diagchar_dev {
 	wait_queue_head_t wait_q;
 	struct diag_client_map *client_map;
 	int *data_ready;
+	int *data_ready_while_busy;
 	int num_clients;
 	int polling_reg_flag;
 	struct diag_write_device *buf_tbl;
diff --git a/drivers/char/diag/diagchar_core.c b/drivers/char/diag/diagchar_core.c
index 2c572ed..9132fcb 100644
--- a/drivers/char/diag/diagchar_core.c
+++ b/drivers/char/diag/diagchar_core.c
@@ -182,6 +182,13 @@ static int diagchar_open(struct inode *inode, struct file *file)
 					goto fail;
 				else
 					driver->data_ready = temp;
+				temp = krealloc(driver->data_ready_while_busy
+					, (driver->num_clients) * sizeof(int),
+							GFP_KERNEL);
+				if (!temp)
+					goto fail;
+				else
+					driver->data_ready_while_busy = temp;
 				diag_add_client(i, file);
 			} else {
 				mutex_unlock(&driver->diagchar_mutex);
@@ -221,6 +228,8 @@ static int diagchar_close(struct inode *inode, struct file *file)
 	struct diagchar_priv *diagpriv_data = file->private_data;
 
 	pr_debug("diag: process exit %s\n", current->comm);
+	driver->num_clients--;
+
 	if (!(file->private_data)) {
 		pr_alert("diag: Invalid file pointer");
 		return -ENOMEM;
@@ -804,6 +813,7 @@ static int diagchar_read(struct file *file, char __user *buf, size_t count,
 
 	wait_event_interruptible(driver->wait_q,
 				  driver->data_ready[index]);
+	driver->data_ready_while_busy[index] = 0;
 	mutex_lock(&driver->diagchar_mutex);
 
 	if ((driver->data_ready[index] & USER_SPACE_LOG_TYPE) && (driver->
@@ -981,7 +991,16 @@ drop_hsic:
 		/* copy number of data fields */
 		COPY_USER_SPACE_OR_EXIT(buf+4, num_data, 4);
 		ret -= 4;
-		driver->data_ready[index] ^= USER_SPACE_LOG_TYPE;
+
+		/* At very high logging rates, it was observed that
+		 * sometimes diag_device_write() completed the "enqueue"
+		 * and sets the data_ready flag during an ongoing read
+		 * in the lines above - so we should not clear the flag
+		 * in these cases
+		 */
+		if (driver->data_ready_while_busy[index] == 0)
+			driver->data_ready[index] ^= USER_SPACE_LOG_TYPE;
+
 		if (driver->ch)
 			queue_work(driver->diag_wq,
 					 &(driver->diag_read_smd_work));
diff --git a/drivers/char/diag/diagfwd.c b/drivers/char/diag/diagfwd.c
index 4419309..7f12a3f 100644
--- a/drivers/char/diag/diagfwd.c
+++ b/drivers/char/diag/diagfwd.c
@@ -332,6 +332,12 @@ int diag_device_write(void *buf, int proc_num, struct diag_request *write_ptr)
 						 driver->logging_process_id)
 				break;
 		if (i < driver->num_clients) {
+			/* Set a "data_ready_while_busy" flag to
+			 * prevent diagchar_read() from clearing new
+			 * data_ready sets that came during its ongoing read.
+			 */
+			driver->data_ready_while_busy[i] |= USER_SPACE_LOG_TYPE;
+
 			driver->data_ready[i] |= USER_SPACE_LOG_TYPE;
 			pr_debug("diag: wake up logging process\n");
 			wake_up_interruptible(&driver->wait_q);
@@ -2056,6 +2062,14 @@ void diagfwd_init(void)
 							, GFP_KERNEL)) == NULL)
 		goto err;
 	kmemleak_not_leak(driver->data_ready);
+	if (driver->data_ready_while_busy == NULL) {
+		driver->data_ready_while_busy = kzalloc(driver->num_clients
+							* sizeof(int)
+							, GFP_KERNEL);
+		if (driver->data_ready_while_busy == NULL)
+			goto err;
+	}
+	kmemleak_not_leak(driver->data_ready_while_busy);
 	if (driver->table == NULL &&
 	     (driver->table = kzalloc(diag_max_reg*
 		      sizeof(struct diag_master_table),
@@ -2166,6 +2180,7 @@ err:
 		kfree(driver->client_map);
 		kfree(driver->buf_tbl);
 		kfree(driver->data_ready);
+		kfree(driver->data_ready_while_busy);
 		kfree(driver->table);
 		kfree(driver->pkt_buf);
 		kfree(driver->write_ptr_1);
@@ -2217,6 +2232,7 @@ void diagfwd_exit(void)
 	kfree(driver->client_map);
 	kfree(driver->buf_tbl);
 	kfree(driver->data_ready);
+	kfree(driver->data_ready_while_busy);
 	kfree(driver->table);
 	kfree(driver->pkt_buf);
 	kfree(driver->write_ptr_1);
diff --git a/drivers/char/diag/diagfwd_hsic.c b/drivers/char/diag/diagfwd_hsic.c
index bd8c52b..94d8b9d 100644
--- a/drivers/char/diag/diagfwd_hsic.c
+++ b/drivers/char/diag/diagfwd_hsic.c
@@ -45,6 +45,7 @@ static void diag_read_hsic_work_fn(struct work_struct *work)
 	int num_reads_submitted = 0;
 	int err = 0;
 	int write_ptrs_available;
+	int i;
 
 	if (!driver->hsic_ch) {
 		pr_err("DIAG in %s: driver->hsic_ch == 0\n", __func__);
@@ -55,12 +56,30 @@ static void diag_read_hsic_work_fn(struct work_struct *work)
 	 * Determine the current number of available buffers for writing after
 	 * reading from the HSIC has completed.
 	 */
-	if (driver->logging_mode == MEMORY_DEVICE_MODE)
-		write_ptrs_available = driver->poolsize_hsic_write -
-					driver->num_hsic_buf_tbl_entries;
-	else
+	if (driver->logging_mode == MEMORY_DEVICE_MODE) {
+		/*
+		 * At very high logging rates on multi-core phones,
+		 * it was observed that "count_hsic_write_pool" faced read/write
+		 * race conditions (to increment and decrement) and became
+		 * incorrect - so don't use it to determine the number of
+		 * entries available - iterate through "hsic_buf_tbl" instead.
+		 */
+		write_ptrs_available = 0;
+		i = (driver->poolsize_hsic_write) - 1;
+		for (; i >= 0; i--)
+			if (driver->hsic_buf_tbl[i].length == 0)
+				write_ptrs_available++;
+	} else {
 		write_ptrs_available = driver->poolsize_hsic_write -
 					driver->count_hsic_write_pool;
+	}
+
+	if (driver->logging_mode == MEMORY_DEVICE_MODE &&
+		(write_ptrs_available <= 0 ||
+			write_ptrs_available > driver->poolsize_hsic_write)) {
+		pr_debug("diag: warning - probe write_ptrs_available: %d"
+			, write_ptrs_available);
+	}
 
 	/*
 	 * Queue up a read on the HSIC for all available buffers in the
@@ -138,17 +157,50 @@ static void diag_hsic_read_complete_callback(void *ctxt, char *buf,
 		if (!buf) {
 			pr_err("diag: Out of diagmem for HSIC\n");
 		} else {
-			/*
-			 * Send data in buf to be written on the
-			 * appropriate device, e.g. USB MDM channel
+
+			/* Contradicting with the original comment above to send
+			 * zero length entries, sending 0 length entries to the
+			 * hsic_buf_tbl queue could cause it to  never be
+			 * properly dequeued and never "diagmem_free()".
+			 *
+			 * This is because diagfwd.c's diag_device_write() would
+			 * successfully "enqueue" it but then diagchar_core.c's
+			 * diagchar_read() would omit it -
+			 * because "if (driver->hsic_buf_tbl[i].length > 0)"...
+			 * So it won't be "dequeued" and also diagmem_free would
+			 * not be called on it.
+			 *
+			 * Also, other diag_device_write() calls with non-0
+			 * length entries would also overwrite this 0-length
+			 * entry because of the
+			 * "if (driver->hsic_buf_tbl[i].length == 0)" check.
+			 *
+			 * Threrefore, if the actual_size is > 0, send it,
+			 * but if the size is 0 then call diagmem_free
+			 * on it and don't send it.
 			 */
-			driver->write_len_mdm = actual_size;
-			err = diag_device_write((void *)buf, HSIC_DATA, NULL);
-			/* If an error, return buffer to the pool */
-			if (err) {
+
+			if (actual_size > 0) {
+				/*
+				 * Send data in buf to be written on the
+				 * appropriate device, e.g. USB MDM channel
+				 */
+				driver->write_len_mdm = actual_size;
+				err = diag_device_write((void *)buf,
+							HSIC_DATA, NULL);
+				/* If an error, return buffer to the pool */
+				if (err) {
+					diagmem_free(driver, buf,
+						POOL_TYPE_HSIC);
+					pr_err("diag: In %s, error calling diag_device_write, err: %d\n",
+						__func__, err);
+				}
+			} else {
+				/* if we reach here this means actual_size is 0
+				 * then free and omit it.
+				 */
+				err = 0;
 				diagmem_free(driver, buf, POOL_TYPE_HSIC);
-				pr_err("diag: In %s, error calling diag_device_write, err: %d\n",
-					__func__, err);
 			}
 		}
 	} else {
@@ -179,6 +231,7 @@ static void diag_hsic_write_complete_callback(void *ctxt, char *buf,
 {
 	/* The write of the data to the HSIC bridge is complete */
 	driver->in_busy_hsic_write = 0;
+	wake_up_interruptible(&driver->wait_q);
 
 	if (!driver->hsic_ch) {
 		pr_err("DIAG in %s: driver->hsic_ch == 0\n", __func__);
-- 
1.8.1.5

--
To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html