
I'm trying to understand the Linux IO stack by analyzing a blktrace trace.

Below is the trace I captured while running a simple fio sequential write without Direct IO (5 requests of 8k each).

  8,16   1        1     0.000000000 10006  U   N [fio] 0
  8,16   1        2     0.000024099 10006  U   N [fio] 0
  8,16   1        3     0.000034879 10006  U   N [fio] 0
  8,16   1        4     0.000045234 10006  U   N [fio] 0
  8,16   1        5     0.000055410 10006  U   N [fio] 0
  8,16   1        6     5.110792867 22364  A  WS 903168 + 8 <- (8,17) 901120
  8,16   1        7     5.110794692 22364  Q  WS 903168 + 8 [kjournald]
  8,16   1        8     5.110798680 22364  G  WS 903168 + 8 [kjournald]
  8,16   1        9     5.110800405 22364  P   N [kjournald]
  8,16   1       10     5.110801662 22364  I  WS 903168 + 8 [kjournald]
  8,16   1       11     5.110804742 22364  A  WS 903176 + 8 <- (8,17) 901128
  8,16   1       12     5.110804951 22364  Q  WS 903176 + 8 [kjournald]
  8,16   1       13     5.110806188 22364  M  WS 903176 + 8 [kjournald]
  8,16   1       14     5.110807168 22364  A  WS 903184 + 8 <- (8,17) 901136
  8,16   1       15     5.110807321 22364  Q  WS 903184 + 8 [kjournald]
  8,16   1       16     5.110807592 22364  M  WS 903184 + 8 [kjournald]
  8,16   1       17     5.110808138 22364  A  WS 903192 + 8 <- (8,17) 901144
  8,16   1       18     5.110808292 22364  Q  WS 903192 + 8 [kjournald]
  8,16   1       19     5.110808498 22364  M  WS 903192 + 8 [kjournald]
  8,16   1       20     5.110808948 22364  A  WS 903200 + 8 <- (8,17) 901152
  8,16   1       21     5.110809092 22364  Q  WS 903200 + 8 [kjournald]
  8,16   1       22     5.110809284 22364  M  WS 903200 + 8 [kjournald]
  8,16   1       23     5.110809726 22364  A  WS 903208 + 8 <- (8,17) 901160
  8,16   1       24     5.110809872 22364  Q  WS 903208 + 8 [kjournald]
  8,16   1       25     5.110810062 22364  M  WS 903208 + 8 [kjournald]
  8,16   1       26     5.110814625 22364  A  WS 903216 + 8 <- (8,17) 901168
  8,16   1       27     5.110814796 22364  Q  WS 903216 + 8 [kjournald]
  8,16   1       28     5.110815070 22364  M  WS 903216 + 8 [kjournald]
  8,16   1       29     5.110815547 22364  A  WS 903224 + 8 <- (8,17) 901176
  8,16   1       30     5.110815692 22364  Q  WS 903224 + 8 [kjournald]
  8,16   1       31     5.110815881 22364  M  WS 903224 + 8 [kjournald]
  8,16   1       32     5.110816338 22364  A  WS 903232 + 8 <- (8,17) 901184
  8,16   1       33     5.110816483 22364  Q  WS 903232 + 8 [kjournald]
  8,16   1       34     5.110816675 22364  M  WS 903232 + 8 [kjournald]
  8,16   1       35     5.110817040 22364  A  WS 903240 + 8 <- (8,17) 901192
  8,16   1       36     5.110817185 22364  Q  WS 903240 + 8 [kjournald]
  8,16   1       37     5.110817375 22364  M  WS 903240 + 8 [kjournald]
  8,16   1       38     5.110819614 22364  U   N [kjournald] 1
  8,16   1       39     5.110821326 22364  D  WS 903168 + 80 [kjournald]
  8,16   1       40     5.111602870     0  C  WS 903168 + 80 [0]
  8,16   1       41     5.111724953 22364  A  WS 51910880 + 8 <- (8,17) 51908832
  8,16   1       42     5.111725778 22364  Q  WS 51910880 + 8 [kjournald]
  8,16   1       43     5.111727434 22364  G  WS 51910880 + 8 [kjournald]
  8,16   1       44     5.111728735 22364  P   N [kjournald]
  8,16   1       45     5.111729356 22364  I  WS 51910880 + 8 [kjournald]
  8,16   1       46     5.111730264 22364  A  WS 51910888 + 8 <- (8,17) 51908840
  8,16   1       47     5.111730424 22364  Q  WS 51910888 + 8 [kjournald]
  8,16   1       48     5.111731118 22364  M  WS 51910888 + 8 [kjournald]
  8,16   1       49     5.111732441 22364  U   N [kjournald] 1
  8,16   1       50     5.111733167 22364  D  WS 51910880 + 16 [kjournald]
  8,16   1       51     5.112248377     0  C  WS 51910880 + 16 [0]
  8,16   1       52     5.112288403 22364  A FWFS 51910896 + 8 <- (8,17) 51908848
  8,16   1       53     5.112289215 22364  Q  WS 51910896 + 8 [kjournald]
  8,16   1       54     5.112290348 22364  G  WS 51910896 + 8 [kjournald]
  8,16   1       55     5.112291577 22364  P   N [kjournald]
  8,16   1       56     5.112292055 22364  I  WS 51910896 + 8 [kjournald]
  8,16   1       57     5.112293139 22364  D  WS 51910896 + 8 [kjournald]
  8,16   1       58     5.112303160 22364  U   N [kjournald] 1
  8,16   1       59     5.112767922     0  C  WS 51910896 + 8 [0]
  8,16   1       60    39.159727927 10007  A   W 796696 + 8 <- (8,17) 794648
  8,16   1       61    39.159729606 10007  Q   W 796696 + 8 [flush-8:16]
  8,16   1       62    39.159733195 10007  G   W 796696 + 8 [flush-8:16]
  8,16   1       63    39.159735006 10007  P   N [flush-8:16]
  8,16   1       64    39.159736356 10007  I   W 796696 + 8 [flush-8:16]
  8,16   1       65    39.162686094     0 UT   N [swapper] 1
  8,16   1       66    39.162702603    23  U   N [kblockd/1] 1
  8,16   1       67    39.162704841    23  D   W 796696 + 8 [kblockd/1]
  8,16   1       68    39.163311755     0  C   W 796696 + 8 [0]
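
For anyone who wants to pull numbers out of a trace like this one, here is a minimal sketch that pairs each dispatch (D) event with its completion (C) event and reports the time in between. It assumes the default blkparse text layout shown above; `parse_line` and `dispatch_latencies` are hypothetical helper names, not part of blktrace.

```python
import re

# Assumed layout (default blkparse text output, as in the trace above):
#   dev  cpu  seq  timestamp  pid  action  rwbs  [sector + count] ...
LINE_RE = re.compile(
    r"^\s*(?P<dev>\d+,\d+)\s+(?P<cpu>\d+)\s+(?P<seq>\d+)\s+"
    r"(?P<ts>\d+\.\d+)\s+(?P<pid>\d+)\s+(?P<action>[A-Z]+)\s+"
    r"(?P<rwbs>[A-Z]+)(?:\s+(?P<sector>\d+) \+ (?P<count>\d+))?"
)


def parse_line(line):
    """Return a dict for one trace line, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ev = m.groupdict()
    ev["ts"] = float(ev["ts"])
    if ev["sector"] is not None:
        ev["sector"] = int(ev["sector"])
        ev["count"] = int(ev["count"])
    return ev


def dispatch_latencies(lines):
    """Pair D and C events on (sector, count); return (sector, count, seconds)."""
    pending = {}
    result = []
    for line in lines:
        ev = parse_line(line)
        if ev is None or ev["sector"] is None:
            continue  # plug/unplug events carry no sector information
        key = (ev["sector"], ev["count"])
        if ev["action"] == "D":
            pending[key] = ev["ts"]
        elif ev["action"] == "C" and key in pending:
            result.append((key[0], key[1], ev["ts"] - pending.pop(key)))
    return result
```

Feeding the lines of the trace above into `dispatch_latencies` yields one entry per dispatched request, so the data write, the journal writes, and the late write at ~39 seconds can be compared side by side.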

Notes:

  • The OS is CentOS with kernel version 2.6.32-358.18.1.el6.x86_64
  • The filesystem is ext3 in ordered journal mode with default options (commit = 5 seconds)
  • Disk offset 903168 is where the data file that fio writes to is located
  • Disk offset 51910880 should be where the journal is located. (I ran a test with an external journal and could not find this range in that trace.) Another hint is that the operation is FWFS. After some study, I can confirm that offset 51910880 is the journal descriptor block, offset 51910888 is a filesystem metadata block, and offset 51910896 is the journal commit block
  • I assume disk offset 796696 is where the metadata resides
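
As a sanity check on the offsets in these notes, the arithmetic can be sketched as below. It assumes the sector counts in the trace are in 512-byte units (`SECTOR_BYTES` is my assumption, not something the trace states), and the constant names are only illustrative.

```python
SECTOR_BYTES = 512  # assumption: trace sector counts are in 512-byte units

# Merged data write dispatched by kjournald: "D WS 903168 + 80"
merged_bytes = 80 * SECTOR_BYTES

# What fio wrote: 5 requests of 8k each
fio_bytes = 5 * 8 * 1024

# Remap (A) events: sector on the whole disk (8,16) minus sector on the
# partition (8,17) gives the partition's start offset in sectors.
data_offset = 903168 - 901120
journal_offset = 51910880 - 51908832
late_write_offset = 796696 - 794648

print(merged_bytes, fio_bytes)  # both 40960
print(data_offset, journal_offset, late_write_offset)  # all 2048
```

Under that assumption the 80-sector dispatch is the same size as the five 8k writes combined, and all three remapped regions agree on the partition starting at sector 2048 of the disk.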

So my questions are:

  • Is kjournald supposed to flush file data? I always thought kjournald only flushes dirty journal blocks. What about pdflush?
  • What is the difference between kjournald and jbd2/vdb*-*? Is one for ext3 and the other for ext4?
  • Why is the metadata flushed only after ~34 seconds? Is it always the case that, when a flush is initiated, kblockd is used to issue the write request to the device driver?
cheng wee