Adrian Bridgett <adrian@xxxxxxxxxx> writes: > On 13/03/15 15:25, Theodore Ts'o wrote: >> The obvious thing to do is to try using blktrace to see which I/O >> operations are triggering the problem. It may just be that the hard >> drive's internal FTL metadata is very badly fragmented. > Damn :-( Seems a bit odd as it's only a few months old (not used much) > and OSX _seems_ fine. > > Not quite sure what to look for in blktrace. I see the sync but the > timestamps in blktrace are near instantaneous vs 1.3, 2.2 seconds > reported by strace. Trying a "find" and watching it stall blktrace > shows this - which to my amateur eyes shows a 0.5s gap between the read > request at 1.815589 and being issued at 2.373. Two completions being > the only occurrence in the middle. Maybe queue depth was full - but we > have a two completions earlier. > > > 8,0 0 226 1.812363080 271 D WS 498955904 + 16 ( > 988) [jbd2 > /dm-0-8] > 8,0 0 227 1.812439052 0 C WS 498955904 + 16 ( > 75972) [0] > 8,4 0 228 1.812456484 271 A FWS 0 + 0 <- (252,0) 0 > 8,0 0 229 1.812456838 271 Q FWS [jbd2/dm-0-8] > 8,4 0 230 1.812457124 271 A FWS 0 + 0 <- (252,0) 0 > 8,0 0 231 1.812457237 271 Q FWS [jbd2/dm-0-8] > 8,0 0 232 1.812458007 271 G FWS [jbd2/dm-0-8] > 8,0 0 233 1.812458478 271 I FWS ( 471) [jbd2/dm-0-8] > 8,0 0 234 1.812465182 271 G FWS [jbd2/dm-0-8] > 8,0 0 235 1.812465316 271 I FWS ( 134) [jbd2/dm-0-8] > 8,0 0 236 1.815446245 0 C RM 507106648 + 8 ( > 3778637) [0] > 8,4 2 11 1.815585457 31072 A RM 16875872 + 8 <- > (252,0) 1687382 > 4 > 8,0 2 12 1.815585633 31072 A RM 507107680 + 8 <- (8,4) > 16875872 > 8,0 2 13 1.815586124 31072 Q RM 507107680 + 8 [find] > 8,0 2 14 1.815588326 31072 G RM 507107680 + 8 [find] > 8,0 2 15 1.815589269 31072 I RM 507107680 + 8 ( 943) > [find] > 8,0 0 237 1.830102994 0 C WS 0 (17739914) [0] > 8,0 0 238 2.373458367 0 C WS 0 [0] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ As far as I can see it stuck on superblock completion which was issued as 228'th request. It takes ~500ms to complete it. Tend to agree with Ted looks likely your SSD has problems probably with FTL layer. In order to check that filesystem is not related with the issye you may try RAW IO: ###umount your filesystem. *This is important, otherwise you may corrupt your FS* #umount /$YOUR_FS ### Read and write back your superblock. Second operation should give ### you same timings as superblock commit from JBD2 #dd if=/dev/$YOUR_DISK of=super_block bs=4k count=4 iflag=direct #dd if=super_block of=/dev/$YOUR_DISK bs=4k count=4 conv=fsync,notrunc oflag=direct Superblock is most frequently rewritten block in whole disk, so if previous filesystem do not issued TRIM requests back to SSD, so it's FTL may become very slow. You should try fstrim(8), probably it can helps > 8,0 0 239 2.373469557 0 D RM 507107680 + 8 > (557880288) [swap > per/0] > 8,4 0 240 2.373486949 15158 A WFS 8724112 + 8 <- (252,0) > 8722064 > 8,0 0 241 2.373487336 15158 A WFS 498955920 + 8 <- (8,4) > 8724112 > 8,0 0 242 2.373488480 15158 Q WFS 498955920 + 8 > [kworker/0:2] > 8,0 0 243 2.373490158 15158 G WFS 498955920 + 8 > [kworker/0:2] > 8,0 0 244 2.373491573 15158 I WFS 498955920 + 8 ( 1415) > [kwork > er/0:2] > 8,0 0 245 2.373491927 15158 D WS 498955920 + 8 ( 354) > [kworker/0:2] > 8,0 0 246 2.373596570 0 C WS 498955920 + 8 ( > 104643) [0] > 8,0 0 247 2.375251869 0 C RM 507107680 + 8 ( > 1782312) [0] > 8,4 2 16 2.375452116 31072 A RM 16875072 + 8 <- > (252,0) 16873024 > 8,0 2 17 2.375452283 31072 A RM 507106880 + 8 <- (8,4) > 16875072 > 8,0 2 18 2.375452776 31072 Q RM 507106880 + 8 [find] > > I'm looking at 'watch -n0 cat /proc/diskstats |egrep -v "ram|loop"' > and it's normally sat down at 0/1 for I/Os in progress (have seen hit > 100 _briefly_ and then drop just as quickly). > >> What model SSD is it, by the way? > It's the built-in Apple MBP 512GB drive - "APPLE SSD SM0512F" which > apparently means Samsung (I've been reading about issues with their > 840EVO drives... > > Thanks once again (and for all your immense work over the many, many years!) > > Adrian > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html
Attachment:
signature.asc
Description: PGP signature