> On 18 Mar 2019, at 14.02, Igor Konopko <igor.j.konopko@xxxxxxxxx> wrote:
>
> On 17.03.2019 20:44, Matias Bjørling wrote:
>> On 3/14/19 9:04 AM, Igor Konopko wrote:
>>> When we are trying to switch to a new line, we need to ensure that
>>> emeta for the n-2 line is already written. Otherwise we can end up in
>>> a deadlock scenario, where the writer has no more requests to write
>>> and thus there is no way to trigger emeta writes from the writer
>>> thread. This is a corner-case scenario which occurs in the case of
>>> multiple write errors and thus a kind of early line close due to
>>> lack of line space.
>>>
>>> Signed-off-by: Igor Konopko <igor.j.konopko@xxxxxxxxx>
>>> ---
>>>  drivers/lightnvm/pblk-core.c  |  2 ++
>>>  drivers/lightnvm/pblk-write.c | 24 ++++++++++++++++++++++++
>>>  drivers/lightnvm/pblk.h       |  1 +
>>>  3 files changed, 27 insertions(+)
>>>
>>> diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
>>> index 38e26fe..a683d1f 100644
>>> --- a/drivers/lightnvm/pblk-core.c
>>> +++ b/drivers/lightnvm/pblk-core.c
>>> @@ -1001,6 +1001,7 @@ static void pblk_line_setup_metadata(struct pblk_line *line,
>>>  					    struct pblk_line_mgmt *l_mg,
>>>  					    struct pblk_line_meta *lm)
>>>  {
>>> +	struct pblk *pblk = container_of(l_mg, struct pblk, l_mg);
>>>  	int meta_line;
>>>
>>>  	lockdep_assert_held(&l_mg->free_lock);
>>> @@ -1009,6 +1010,7 @@ static void pblk_line_setup_metadata(struct pblk_line *line,
>>>  	meta_line = find_first_zero_bit(&l_mg->meta_bitmap, PBLK_DATA_LINES);
>>>  	if (meta_line == PBLK_DATA_LINES) {
>>>  		spin_unlock(&l_mg->free_lock);
>>> +		pblk_write_emeta_force(pblk);
>>>  		io_schedule();
>>>  		spin_lock(&l_mg->free_lock);
>>>  		goto retry_meta;
>>> diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
>>> index 4e63f9b..4fbb9b2 100644
>>> --- a/drivers/lightnvm/pblk-write.c
>>> +++ b/drivers/lightnvm/pblk-write.c
>>> @@ -505,6 +505,30 @@ static struct pblk_line *pblk_should_submit_meta_io(struct pblk *pblk,
>>>  	return meta_line;
>>>  }
>>>
>>> +void pblk_write_emeta_force(struct pblk *pblk)
>>> +{
>>> +	struct pblk_line_meta *lm = &pblk->lm;
>>> +	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
>>> +	struct pblk_line *meta_line;
>>> +
>>> +	while (true) {
>>> +		spin_lock(&l_mg->close_lock);
>>> +		if (list_empty(&l_mg->emeta_list)) {
>>> +			spin_unlock(&l_mg->close_lock);
>>> +			break;
>>> +		}
>>> +		meta_line = list_first_entry(&l_mg->emeta_list,
>>> +					     struct pblk_line, list);
>>> +		if (meta_line->emeta->mem >= lm->emeta_len[0]) {
>>> +			spin_unlock(&l_mg->close_lock);
>>> +			io_schedule();
>>> +			continue;
>>> +		}
>>> +		spin_unlock(&l_mg->close_lock);
>>> +		pblk_submit_meta_io(pblk, meta_line);
>>> +	}
>>> +}
>>> +
>>>  static int pblk_submit_io_set(struct pblk *pblk, struct nvm_rq *rqd)
>>>  {
>>>  	struct ppa_addr erase_ppa;
>>> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
>>> index 0a85990..a42bbfb 100644
>>> --- a/drivers/lightnvm/pblk.h
>>> +++ b/drivers/lightnvm/pblk.h
>>> @@ -877,6 +877,7 @@ int pblk_write_ts(void *data);
>>>  void pblk_write_timer_fn(struct timer_list *t);
>>>  void pblk_write_should_kick(struct pblk *pblk);
>>>  void pblk_write_kick(struct pblk *pblk);
>>> +void pblk_write_emeta_force(struct pblk *pblk);
>>>
>>>  /*
>>>   * pblk read path
>>
>> Hi Igor,
>>
>> Is this an error that qemu can force pblk to expose? Can you provide a
>> specific example of what is needed to force the error?
>
> So I hit this error on pblk instances with a low number of LUNs and
> multiple write IO errors (should be reproducible with error injection).
> pblk_map_remaining() then quickly mapped all the sectors in the line,
> so the writer thread was not able to issue all the necessary emeta IO
> writes, and it got stuck when trying to replace the line with a new
> one. So this is definitely an error/corner-case scenario.

If the cause is emeta writes, then there is a bug in pblk_line_close_meta(), as the logic to prevent this case is in place.