On 18.03.2019 20:21, Javier González wrote:
On 18 Mar 2019, at 14.28, Igor Konopko <igor.j.konopko@xxxxxxxxx> wrote:
On 18.03.2019 08:42, Javier González wrote:
On 14 Mar 2019, at 17.04, Igor Konopko <igor.j.konopko@xxxxxxxxx> wrote:
In case when mapping fails (called from writer thread) due to lack of
lines, currently we are calling pblk_pipeline_stop(), which waits
for pending write IOs, so it will lead to the deadlock. Switching
to __pblk_pipeline_stop() in that case instead will fix that.
Signed-off-by: Igor Konopko <igor.j.konopko@xxxxxxxxx>
---
drivers/lightnvm/pblk-map.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/lightnvm/pblk-map.c b/drivers/lightnvm/pblk-map.c
index 5408e32..afc10306 100644
--- a/drivers/lightnvm/pblk-map.c
+++ b/drivers/lightnvm/pblk-map.c
@@ -46,7 +46,7 @@ static int pblk_map_page_data(struct pblk *pblk, unsigned int sentry,
pblk_line_close_meta(pblk, prev_line);
if (!line) {
- pblk_pipeline_stop(pblk);
+ __pblk_pipeline_stop(pblk);
return -ENOSPC;
}
--
2.9.5
Have you seeing this problem?
Before checking if there is a line, we are closing metadata for the
previous line, so all inflight I/Os should be clear. Can you develop on
the case in which this would happen?
So we have following sequence: pblk_pipeline_stop() -> __pblk_pipeline_flush() -> pblk_flush_writer() -> wait for emptying round buffer.
This will never complete, since we still have some RB entries, which cannot be written since writer thread is blocked with waiting inside pblk_flush_writer().
Am I missing sth?
So this will be the case in which we are in the last line and
pblk_flush_writer() needs to allocate an extra line persist the write
buffer? Shouldn’t the rate-limiter take care of this? As I recall, Hans
implemented some logic to guarantee that at least one line is always
available for GC, which in turn will free a line for user data. When we
hit this limit, performance will drop dramatically, but it should not
stall.
The reason I want to understand the real case behind this fix is that by
calling __pblk_pipeline_stop() we are basically stopping all other
inflight I/Os; we should be able to serve all inflight I/Os before a
mapping error triggers the pipeline to get into read-only mode.
Javier,
my understanding was that if we hit that particular case, we are simply
"done" with pblk and there is no way to recover it, so I made this
changes based on code analysis, if it is not true, than this patch does
not make sense anymore.
Hans,
could you help me to understand how rate limiter ensure that what Javier
mentioned about? Thanks