Hi Jon,

On Wed, Nov 6, 2019 at 7:40 PM Jon Derrick <jonathan.derrick@xxxxxxxxx> wrote:
>
> This patchset optimizes VMD performance through the storage stack by locating
> commonly-affined NVMe interrupts on the same VMD interrupt handler lists.
>
> The current strategy of round-robin assignment to VMD IRQ lists can be
> suboptimal when vectors with different affinities are assigned to the same
> VMD IRQ list. VMD is an NVMe storage domain, and this set aligns the vector
> allocation and affinity strategy with that of the NVMe driver. This lets the
> kernel do the right thing when affining NVMe submission CPUs to NVMe
> completion vectors as serviced through the VMD interrupt handler lists.
>
> This set greatly reduced tail latency when testing 8 threads of random 4k
> reads against two drives at queue depth=128. After pinning the tasks to
> reduce test variability, the tests still showed a moderate tail latency
> reduction. A one-drive configuration also shows improvements due to the
> alignment of VMD IRQ list affinities with NVMe affinities.

Is there any follow-up on this series? vmd_irq_set_affinity() always returns
-EINVAL, so the system can't perform S3 suspend or CPU hotplug.

Bug filed here:
https://bugzilla.kernel.org/show_bug.cgi?id=216835
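To show the shape of the failure (a rough sketch of how I read the driver,
not a verbatim copy of it): the VMD irq_chip rejects every affinity change,

#include <linux/irq.h>

static int vmd_irq_set_affinity(struct irq_data *data,
                                const struct cpumask *dest, bool force)
{
        /*
         * Child IRQs share VMD interrupt handler lists, so a single
         * child IRQ cannot be migrated independently of its siblings.
         */
        return -EINVAL;
}

static struct irq_chip vmd_msi_controller = {
        .name             = "VMD-MSI",
        .irq_set_affinity = vmd_irq_set_affinity,
};

and when a CPU is offlined, which S3 does for every non-boot CPU, the core
IRQ migration code calls .irq_set_affinity to move vectors away from the
dying CPU, so the unconditional -EINVAL makes the offline fail.

(My reading of the list-alignment idea itself is sketched at the bottom of
this mail.)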
Kai-Heng

>
> An example with two NVMe drives and a 33-vector VMD:
> VMD irq[42] Affinity[0-27,56-83] Effective[10]
> VMD irq[43] Affinity[28-29,84-85] Effective[85]
> VMD irq[44] Affinity[30-31,86-87] Effective[87]
> VMD irq[45] Affinity[32-33,88-89] Effective[89]
> VMD irq[46] Affinity[34-35,90-91] Effective[91]
> VMD irq[47] Affinity[36-37,92-93] Effective[93]
> VMD irq[48] Affinity[38-39,94-95] Effective[95]
> VMD irq[49] Affinity[40-41,96-97] Effective[97]
> VMD irq[50] Affinity[42-43,98-99] Effective[99]
> VMD irq[51] Affinity[44-45,100] Effective[100]
> VMD irq[52] Affinity[46-47,102] Effective[102]
> VMD irq[53] Affinity[48-49,104] Effective[104]
> VMD irq[54] Affinity[50-51,106] Effective[106]
> VMD irq[55] Affinity[52-53,108] Effective[108]
> VMD irq[56] Affinity[54-55,110] Effective[110]
> VMD irq[57] Affinity[101,103,105] Effective[105]
> VMD irq[58] Affinity[107,109,111] Effective[111]
> VMD irq[59] Affinity[0-1,56-57] Effective[57]
> VMD irq[60] Affinity[2-3,58-59] Effective[59]
> VMD irq[61] Affinity[4-5,60-61] Effective[61]
> VMD irq[62] Affinity[6-7,62-63] Effective[63]
> VMD irq[63] Affinity[8-9,64-65] Effective[65]
> VMD irq[64] Affinity[10-11,66-67] Effective[67]
> VMD irq[65] Affinity[12-13,68-69] Effective[69]
> VMD irq[66] Affinity[14-15,70-71] Effective[71]
> VMD irq[67] Affinity[16-17,72] Effective[72]
> VMD irq[68] Affinity[18-19,74] Effective[74]
> VMD irq[69] Affinity[20-21,76] Effective[76]
> VMD irq[70] Affinity[22-23,78] Effective[78]
> VMD irq[71] Affinity[24-25,80] Effective[80]
> VMD irq[72] Affinity[26-27,82] Effective[82]
> VMD irq[73] Affinity[73,75,77] Effective[77]
> VMD irq[74] Affinity[79,81,83] Effective[83]
>
> nvme0n1q1 MQ CPUs[28, 29, 84, 85]
> nvme0n1q2 MQ CPUs[30, 31, 86, 87]
> nvme0n1q3 MQ CPUs[32, 33, 88, 89]
> nvme0n1q4 MQ CPUs[34, 35, 90, 91]
> nvme0n1q5 MQ CPUs[36, 37, 92, 93]
> nvme0n1q6 MQ CPUs[38, 39, 94, 95]
> nvme0n1q7 MQ CPUs[40, 41, 96, 97]
> nvme0n1q8 MQ CPUs[42, 43, 98, 99]
> nvme0n1q9 MQ CPUs[44, 45, 100]
> nvme0n1q10 MQ CPUs[46, 47, 102]
> nvme0n1q11 MQ CPUs[48, 49, 104]
> nvme0n1q12 MQ CPUs[50, 51, 106]
> nvme0n1q13 MQ CPUs[52, 53, 108]
> nvme0n1q14 MQ CPUs[54, 55, 110]
> nvme0n1q15 MQ CPUs[101, 103, 105]
> nvme0n1q16 MQ CPUs[107, 109, 111]
> nvme0n1q17 MQ CPUs[0, 1, 56, 57]
> nvme0n1q18 MQ CPUs[2, 3, 58, 59]
> nvme0n1q19 MQ CPUs[4, 5, 60, 61]
> nvme0n1q20 MQ CPUs[6, 7, 62, 63]
> nvme0n1q21 MQ CPUs[8, 9, 64, 65]
> nvme0n1q22 MQ CPUs[10, 11, 66, 67]
> nvme0n1q23 MQ CPUs[12, 13, 68, 69]
> nvme0n1q24 MQ CPUs[14, 15, 70, 71]
> nvme0n1q25 MQ CPUs[16, 17, 72]
> nvme0n1q26 MQ CPUs[18, 19, 74]
> nvme0n1q27 MQ CPUs[20, 21, 76]
> nvme0n1q28 MQ CPUs[22, 23, 78]
> nvme0n1q29 MQ CPUs[24, 25, 80]
> nvme0n1q30 MQ CPUs[26, 27, 82]
> nvme0n1q31 MQ CPUs[73, 75, 77]
> nvme0n1q32 MQ CPUs[79, 81, 83]
>
> nvme1n1q1 MQ CPUs[28, 29, 84, 85]
> nvme1n1q2 MQ CPUs[30, 31, 86, 87]
> nvme1n1q3 MQ CPUs[32, 33, 88, 89]
> nvme1n1q4 MQ CPUs[34, 35, 90, 91]
> nvme1n1q5 MQ CPUs[36, 37, 92, 93]
> nvme1n1q6 MQ CPUs[38, 39, 94, 95]
> nvme1n1q7 MQ CPUs[40, 41, 96, 97]
> nvme1n1q8 MQ CPUs[42, 43, 98, 99]
> nvme1n1q9 MQ CPUs[44, 45, 100]
> nvme1n1q10 MQ CPUs[46, 47, 102]
> nvme1n1q11 MQ CPUs[48, 49, 104]
> nvme1n1q12 MQ CPUs[50, 51, 106]
> nvme1n1q13 MQ CPUs[52, 53, 108]
> nvme1n1q14 MQ CPUs[54, 55, 110]
> nvme1n1q15 MQ CPUs[101, 103, 105]
> nvme1n1q16 MQ CPUs[107, 109, 111]
> nvme1n1q17 MQ CPUs[0, 1, 56, 57]
> nvme1n1q18 MQ CPUs[2, 3, 58, 59]
> nvme1n1q19 MQ CPUs[4, 5, 60, 61]
> nvme1n1q20 MQ CPUs[6, 7, 62, 63]
> nvme1n1q21 MQ CPUs[8, 9, 64, 65]
> nvme1n1q22 MQ CPUs[10, 11, 66, 67]
> nvme1n1q23 MQ CPUs[12, 13, 68, 69]
> nvme1n1q24 MQ CPUs[14, 15, 70, 71]
> nvme1n1q25 MQ CPUs[16, 17, 72]
> nvme1n1q26 MQ CPUs[18, 19, 74]
> nvme1n1q27 MQ CPUs[20, 21, 76]
> nvme1n1q28 MQ CPUs[22, 23, 78]
> nvme1n1q29 MQ CPUs[24, 25, 80]
> nvme1n1q30 MQ CPUs[26, 27, 82]
> nvme1n1q31 MQ CPUs[73, 75, 77]
> nvme1n1q32 MQ CPUs[79, 81, 83]
>
> This patchset applies after the VMD IRQ List indirection patch:
> https://lore.kernel.org/linux-pci/1572527333-6212-1-git-send-email-jonathan.derrick@xxxxxxxxx/
>
> Jon Derrick (3):
>   PCI: vmd: Reduce VMD vectors using NVMe calculation
>   PCI: vmd: Align IRQ lists with child device vectors
>   PCI: vmd: Use managed irq affinities
>
>  drivers/pci/controller/vmd.c | 90 +++++++++++++++++++-------------------
>  1 file changed, 39 insertions(+), 51 deletions(-)
>
> --
> 1.8.3.1
>
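P.S. The sketch mentioned above: this is only my reading of "PCI: vmd: Align
IRQ lists with child device vectors", with hypothetical names, not code from
the series. Instead of handing out VMD IRQ lists round-robin, pick the list
whose affinity mask overlaps the child vector's, so commonly-affined NVMe
vectors land on the same handler list:

#include <linux/cpumask.h>

/*
 * Hypothetical helper: @list_masks holds one affinity mask per VMD
 * interrupt handler list, @child_mask is the affinity of the child
 * device's vector.
 */
static unsigned int vmd_pick_irq_list(const struct cpumask *list_masks,
                                      unsigned int nr_lists,
                                      const struct cpumask *child_mask)
{
        unsigned int i;

        /* Prefer a list already serving the same CPUs. */
        for (i = 0; i < nr_lists; i++)
                if (cpumask_intersects(&list_masks[i], child_mask))
                        return i;

        /* No overlap; fall back to the first list. */
        return 0;
}

With something like this, queue vectors of two drives that share an affinity
mask (as in the nvme0n1/nvme1n1 listing above) map to the same VMD IRQ list,
which is the alignment the cover letter describes.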