debian/badblockhowto.html

   1 <html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Bad block HOWTO for smartmontools</title><meta name="generator" content="DocBook XSL Stylesheets V1.75.2"><meta name="description" content="This article describes what actions might be taken when smartmontools detects a bad block on a disk. It demonstrates how to identify the file associated with an unreadable disk sector, and how to force that sector to reallocate."></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="article" title="Bad block HOWTO for smartmontools"><div class="titlepage"><div><div><h2 class="title"><a name="index"></a>Bad block HOWTO for smartmontools</h2></div><div><div class="author"><h3 class="author"><span class="firstname">Bruce</span> <span class="surname">Allen</span></h3><div class="affiliation"><div class="address"><p><br>
   2       <code class="email">&lt;<a class="email" href="mailto:smartmontools-support@lists.sourceforge.net">smartmontools-support@lists.sourceforge.net</a>&gt;</code><br>
   3      </p></div></div></div></div><div><div class="author"><h3 class="author"><span class="firstname">Douglas</span> <span class="surname">Gilbert</span></h3><div class="affiliation"><div class="address"><p><br>
   4       <code class="email">&lt;<a class="email" href="mailto:smartmontools-support@lists.sourceforge.net">smartmontools-support@lists.sourceforge.net</a>&gt;</code><br>
   5      </p></div></div></div></div><div><p class="copyright">Copyright © 2004, 2005, 2006, 2007 Bruce Allen</p></div><div><div class="legalnotice" title="Legal Notice"><a name="id2541562"></a><p>
   6       Permission is granted to copy, distribute and/or modify this document
   7       under the terms of the GNU Free Documentation License, Version 1.1
   8       or any later version published by the Free Software Foundation;
   9       with no Invariant Sections, with no Front-Cover Texts, and with
  10       no Back-Cover Texts.
  11    </p><p>
  12     For an online copy of the license see
  13     <a class="ulink" href="http://www.fsf.org/copyleft/fdl.html" target="_top">
  14     <code class="literal">www.fsf.org/copyleft/fdl.html</code></a>.
  15    </p></div></div><div><p class="pubdate">2007-01-23</p></div><div><div class="revhistory"><table border="1" width="100%" summary="Revision history"><tr><th align="left" valign="top" colspan="3"><b>Revision History</b></th></tr><tr><td align="left">Revision 1.1</td><td align="left">2007-01-23</td><td align="left">dpg</td></tr><tr><td align="left" colspan="3">
  16              add sections on ReiserFS and partition table damage
  17        </td></tr><tr><td align="left">Revision 1.0</td><td align="left">2006-11-14</td><td align="left">dpg</td></tr><tr><td align="left" colspan="3">
  18              merge BadBlockHowTo.txt and BadBlockSCSIHowTo.txt
  19        </td></tr></table></div></div><div><div class="abstract" title="Abstract"><p class="title"><b>Abstract</b></p><p>
  20     This article describes what actions might be taken when smartmontools
  21     detects a bad block on a disk. It demonstrates how to identify the file
  22     associated with an unreadable disk sector, and how to force that sector
  23     to reallocate.
  24   </p></div></div></div><hr></div><div class="toc"><p><b>Table of Contents</b></p><dl><dt><span class="sect1"><a href="#intro">Introduction</a></span></dt><dt><span class="sect1"><a href="#rfile">Repairs in a file system</a></span></dt><dd><dl><dt><span class="sect2"><a href="#e2_example1">ext2/ext3 first example</a></span></dt><dt><span class="sect2"><a href="#e2_example2">ext2/ext3 second example</a></span></dt><dt><span class="sect2"><a href="#unassigned">Unassigned sectors</a></span></dt><dt><span class="sect2"><a href="#reiserfs_ex">ReiserFS example</a></span></dt></dl></dd><dt><span class="sect1"><a href="#sdisk">Repairs at the disk level</a></span></dt><dd><dl><dt><span class="sect2"><a href="#partition">Partition table problems</a></span></dt><dt><span class="sect2"><a href="#lvm">LVM repairs</a></span></dt><dt><span class="sect2"><a href="#bb">Bad block reassignment</a></span></dt></dl></dd></dl></div><div class="sect1" title="Introduction"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="intro"></a>Introduction</h2></div></div></div><p>
  25 Handling bad blocks is a difficult problem as it often involves
  26 decisions about losing information. Modern storage devices tend
  27 to handle the simple cases automatically, for example by writing
  28 a disk sector that was read with difficulty to another area on
  29 the media. Even though such a remapping can be done by a disk
  30 drive transparently, there is still a lingering worry about media
  31 deterioration and the disk running out of spare sectors to remap.
  32 </p><p>
  33 Can smartmontools help? As the <acronym class="acronym">SMART</acronym> acronym
  34 <sup>[<a name="id2506421" href="#ftn.id2506421" class="footnote">1</a>]</sup>
  35 suggests, the <span class="command"><strong>smartctl</strong></span> command and the
  36 <span class="command"><strong>smartd</strong></span> daemon concentrate on monitoring and analysis.
  37 So apart from changing some reporting settings, smartmontools will not
  38 modify the raw data in a device. Also, smartmontools only works with
  39 physical devices; it does not know about partitions and file systems.
  40 So other tools are needed. The job of smartmontools is to alert the user
  41 that something is wrong and user intervention may be required.
  42 </p><p>
  43 When a bad block is reported one approach is to work out the mapping between
  44 the logical block address used by a storage device and a file or some other
  45 component of a file system using that device. Note that there may be no such
  46 mapping, which means the bad block was found at a location not
  47 currently used by the file system. A user may want to do this analysis to
  48 localize and minimize the number of replacement files that are retrieved from
  49 some backup store. This approach requires knowledge of the file system
  50 involved and this document uses the Linux ext2/ext3 and ReiserFS file systems
  51 for examples. The type of content may also come into play. For example, if
  52 an area storing video has a corrupted sector, it may be easiest to accept
  53 that a frame or two might be corrupted and instruct the disk not to retry,
  54 as retrying may turn what would be a momentary blank into a 1
  55 second pause (while the disk retries the faulty sector, often accompanied
  56 by a telltale clicking sound).
  57 </p><p>
  58 Another approach is to ignore the upper level consequences (e.g. corrupting
  59 a file or worse damage to a file system) and use the facilities offered by
  60 a storage device to repair the damage. The SCSI disk command set is used
  61 to elaborate on this low level approach.
  62 </p></div><div class="sect1" title="Repairs in a file system"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="rfile"></a>Repairs in a file system</h2></div></div></div><p>
  63 This section contains examples of what to do at the file system level
  64 when smartmontools reports a bad block. These examples assume the Linux
  65 operating system and either the ext2/ext3 or ReiserFS file system. The
  66 various Linux commands shown have man pages and the reader is encouraged
  67 to examine these. Of note is the <span class="command"><strong>dd</strong></span> command which is
  68 often used in repair work
  69 <sup>[<a name="id2506498" href="#ftn.id2506498" class="footnote">2</a>]</sup>
  70 and has a unique command line syntax.
  71 </p><p>
  72 The authors would like to thank Sergey Vlasov, Theodore Ts'o,
  73 Michael Bendzick, and others for explaining this approach. The authors would
  74 like to add text showing how to do this for other file systems, in
  75 particular XFS, and JFS: please email if you can provide this
  76 information.
  77 </p><div class="sect2" title="ext2/ext3 first example"><div class="titlepage"><div><div><h3 class="title"><a name="e2_example1"></a>ext2/ext3 first example</h3></div></div></div><p>
  78 In this example, the disk is failing self-tests at Logical Block
  79 Address LBA = 0x016561e9 = 23421417.  The LBA counts sectors in units
  80 of 512 bytes, and starts at zero.
  81 </p><p>
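The hexadecimal to decimal conversion can be checked with shell arithmetic. This is only a sketch; the variable names are made up for illustration.
</p><pre class="programlisting">
```shell
# Convert the hexadecimal LBA reported by the self-test log to decimal.
lba_hex=0x016561e9
lba_dec=$((lba_hex))
echo "$lba_hex = $lba_dec"
```
</pre><p>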
  82 </p><pre class="programlisting">
  83 root]# smartctl -l selftest /dev/hda
  84
  85 SMART Self-test log structure revision number 1
  86 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
  87 # 1  Extended offline    Completed: read failure       90%       217         0x016561e9
  88 </pre><p>
  89 Note that other signs that there is a bad sector on the disk can be
  90 found in the non-zero value of the Current Pending Sector count:
  91 </p><pre class="programlisting">
  92 root]# smartctl -A /dev/hda
  93 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  94   5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  95 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
  96 197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       1
  97 198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       1
  98 </pre><p>
  99 </p><p>
 100 First Step: We need to locate the partition on which this sector of
 101 the disk lives:
 102 </p><pre class="programlisting">
 103 root]# fdisk -lu /dev/hda
 104
 105 Disk /dev/hda: 123.5 GB, 123522416640 bytes
 106 255 heads, 63 sectors/track, 15017 cylinders, total 241254720 sectors
 107 Units = sectors of 1 * 512 = 512 bytes
 108
 109    Device Boot    Start       End    Blocks   Id  System
 110 /dev/hda1   *        63   4209029   2104483+  83  Linux
 111 /dev/hda2       4209030   5269319    530145   82  Linux swap
 112 /dev/hda3       5269320 238227884 116479282+  83  Linux
 113 /dev/hda4     238227885 241248104   1510110   83  Linux
 114 </pre><p>
 115
 116 The partition <code class="filename">/dev/hda3</code> starts at LBA 5269320 and
 117 extends past the 'problem' LBA.  The 'problem' LBA is offset
 118 23421417 - 5269320 = 18152097 sectors into the partition
 119 <code class="filename">/dev/hda3</code>.
 120 </p><p>
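The partition lookup and the subtraction can also be sketched in shell. This is an illustration only: the start and end sectors are copied by hand from the fdisk listing, and the variable names are not part of any tool.
</p><pre class="programlisting">
```shell
lba=23421417
# start:end:name for each partition, copied by hand from "fdisk -lu /dev/hda"
parts="63:4209029:hda1 4209030:5269319:hda2 5269320:238227884:hda3 238227885:241248104:hda4"
found=none
offset=
for p in $parts; do
  start=${p%%:*}
  rest=${p#*:}
  end=${rest%%:*}
  name=${rest#*:}
  # The LBA belongs to the partition whose start..end range contains it.
  if [ "$lba" -ge "$start" ]; then
    if [ "$lba" -le "$end" ]; then
      found=$name
      offset=$((lba - start))
    fi
  fi
done
echo "LBA $lba is in /dev/$found, $offset sectors into the partition"
```
</pre><p>
For the example disk this reports /dev/hda3 and an offset of 18152097 sectors.
</p><p>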
 121 To verify the type of the file system and the mount point, look in
 122 <code class="filename">/etc/fstab</code>:
 123 </p><pre class="programlisting">
 124 root]# grep hda3 /etc/fstab
 125 /dev/hda3 /data ext2 defaults 1 2
 126 </pre><p>
 127 You can see that this is an ext2 file system, mounted at
 128 <code class="filename">/data</code>.
 129 </p><p>
 130 Second Step: we need to find the block size of the file system
 131 (normally 4096 bytes for ext2):
 132 </p><pre class="programlisting">
 133 root]# tune2fs -l /dev/hda3 | grep Block
 134 Block count:              29119820
 135 Block size:               4096
 136 </pre><p>
 137 In this case the block size is 4096 bytes.
 138 </p><p>
 139 Third Step: we need to determine which File System Block contains this
 140 LBA.  The formula is:
 141 </p><pre class="programlisting">
 142   b = (int)((L-S)*512/B)
 143 where:
 144 b = File System block number
 145 B = File system block size in bytes
 146 L = LBA of bad sector
 147 S = Starting sector of partition as shown by fdisk -lu
 148 and (int) denotes the integer part.
 149 </pre><p>
 150
 151 In our example, L=23421417, S=5269320, and B=4096.  Hence the
 152 'problem' LBA is in block number
 153 </p><pre class="programlisting">
 154    b = (int)(18152097*512/4096) = (int)2269012.125
 155 so b=2269012.
 156 </pre><p>
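The same arithmetic can be done with shell integer arithmetic, which discards the fractional part just like the (int) in the formula. A minimal sketch, assuming the block size is a multiple of 512 bytes; the names spb and index are made up for illustration.
</p><pre class="programlisting">
```shell
L=23421417   # LBA of bad sector
S=5269320    # starting sector of the partition
B=4096       # file system block size in bytes
spb=$((B / 512))              # 512-byte sectors per file system block
b=$(( (L - S) / spb ))        # file system block number
index=$(( (L - S) % spb ))    # position of the sector inside that block
echo "block $b, sector index $index (counting from 0) of $spb"
```
</pre><p>
With the example values this prints block 2269012 and sector index 1.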
 157 </p><p>
 158 Note: the fractional part of 0.125 indicates that this problem LBA is
 159 actually the second of the eight sectors that make up this file system
 160 block.
 161 </p><p>
 162 Fourth Step: we use debugfs to find out whether this block is in use and,
 163 if so, which inode owns it and which file that inode belongs to:
 164 </p><pre class="programlisting">
 165 root]# debugfs
 166 debugfs 1.32 (09-Nov-2002)
 167 debugfs:  open /dev/hda3
 168 debugfs:  testb 2269012
 169 Block 2269012 not in use
 170 </pre><p>
 171
 172 If the block is not in use, as in the above example, then you can skip
 173 the rest of this step and go ahead to Step Five.
 174 </p><p>
 175 If, on the other hand, the block is in use, we want to identify
 176 the file that uses it:
 177 </p><pre class="programlisting">
 178 debugfs:  testb 2269012
 179 Block 2269012 marked in use
 180 debugfs:  icheck 2269012
 181 Block   Inode number
 182 2269012 41032
 183 debugfs:  ncheck 41032
 184 Inode   Pathname
 185 41032   /S1/R/H/714197568-714203359/H-R-714202192-16.gwf
 186 </pre><p>
 187 In this example, you can see that the problematic file (with the mount
 188 point included in the path) is:
 189 <code class="filename">/data/S1/R/H/714197568-714203359/H-R-714202192-16.gwf</code>
 190 </p><p>
 191 When we are working with an ext3 file system, it may happen that the
 192 affected file is the journal itself.  Generally, if this is the case,
 193 the inode number will be very small.  In any case, debugfs will not
 194 be able to get the file name:
 195 </p><pre class="programlisting">
 196 debugfs:  testb 2269012
 197 Block 2269012 marked in use
 198 debugfs:  icheck 2269012
 199 Block   Inode number
 200 2269012 8
 201 debugfs:  ncheck 8
 202 Inode   Pathname
 203 debugfs:
 204 </pre><p>
 205 </p><p>
 206 To get around this situation, we can remove the journal altogether:
 207 </p><pre class="programlisting">
 208 tune2fs -O ^has_journal /dev/hda3
 209 </pre><p>
 210
 211 and then start again with Step Four: we should see this time that the
 212 bad block is not in use any more.  If we removed the journal file, at
 213 the end of the whole procedure we should remember to rebuild it:
 214 </p><pre class="programlisting">
 215 tune2fs -j /dev/hda3
 216 </pre><p>
 217 </p><p>
 218 Fifth Step:
 219 <span class="emphasis"><em>NOTE:</em></span> This last step will <span class="emphasis"><em>permanently
 220
 221 </em></span> and irretrievably <span class="emphasis"><em>destroy</em></span> the contents
 222 of the file system block that is damaged: if the block was allocated to
 223 a file, some of the data that is in this file is going to be overwritten
 224 with zeros.  You will not be able to recover that data unless you can
 225 replace the file with a fresh or correct version.
 226 </p><p>
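Because the write is destructive, it may be worth checking first that the block number really maps back onto the failing LBA. The sketch below only does arithmetic and touches no device; the variable names are made up for illustration.
</p><pre class="programlisting">
```shell
L=23421417   # failing LBA from the self-test log
S=5269320    # starting sector of /dev/hda3
B=4096       # file system block size in bytes
b=2269012    # block number about to be overwritten
spb=$((B / 512))
first=$((S + b * spb))      # first LBA covered by block b
last=$((first + spb - 1))   # last LBA covered by block b
verdict=mismatch
if [ "$L" -ge "$first" ]; then
  if [ "$L" -le "$last" ]; then
    verdict=ok
  fi
fi
echo "block $b covers LBA $first to $last: $verdict"
```
</pre><p>
A verdict of mismatch would mean that one of the numbers was copied or computed incorrectly.
</p><p>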
 227 To force the disk to reallocate this bad block we'll write zeros to
 228 the bad block, and sync the disk:
 229 </p><pre class="programlisting">
 230 root]# dd if=/dev/zero of=/dev/hda3 bs=4096 count=1 seek=2269012
 231 root]# sync
 232 </pre><p>
 233 </p><p>
 234 Now everything is back to normal: the sector has been reallocated.
 235 Compare the output just below to similar output near the top of this
 236 article:
 237 </p><pre class="programlisting">
 238 root]# smartctl -A /dev/hda
 239 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 240   5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       1
 241 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       1
 242 197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
 243 198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       1
 244 </pre><p>
 245
 246 Note: for some disks it may be necessary to update the SMART Attribute values by using
 247 <span class="command"><strong>smartctl -t offline /dev/hda</strong></span>
 248 </p><p>
 249 We have corrected the first bad block.  If more than one block was
 250 bad, we should repeat all the steps for each of the others.
 251 After we do that, the disk will pass its self-tests again:
 252
 253 </p><pre class="programlisting">
 254 root]# smartctl -t long /dev/hda  [wait until test completes, then]
 255 root]# smartctl -l selftest /dev/hda
 256
 257 SMART Self-test log structure revision number 1
 258 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 259 # 1  Extended offline    Completed without error       00%       239         -
 260 # 2  Extended offline    Completed: read failure       90%       217         0x016561e9
 261 # 3  Extended offline    Completed: read failure       90%       212         0x016561e9
 262 # 4  Extended offline    Completed: read failure       90%       181         0x016561e9
 263 # 5  Extended offline    Completed without error       00%        14         -
 264 # 6  Extended offline    Completed without error       00%         4         -
 265 </pre><p>
 266 </p><p>
 267 and no longer shows any offline uncorrectable sectors:
 268
 269 </p><pre class="programlisting">
 270 root]# smartctl -A /dev/hda
 271 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 272   5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       1
 273 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       1
 274 197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
 275 198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
 276 </pre><p>
 277 </p></div><div class="sect2" title="ext2/ext3 second example"><div class="titlepage"><div><div><h3 class="title"><a name="e2_example2"></a>ext2/ext3 second example</h3></div></div></div><p>
 278 On this drive, the first sign of trouble was this email from smartd:
 279 </p><pre class="programlisting">
 280     To: ballen
 281     Subject: SMART error (selftest) detected on host: medusa-slave166.medusa.phys.uwm.edu
 282
 283     This email was generated by the smartd daemon running on host:
 284     medusa-slave166.medusa.phys.uwm.edu in the domain: master001-nis
 285
 286     The following warning/error was logged by the smartd daemon:
 287     Device: /dev/hda, Self-Test Log error count increased from 0 to 1
 288 </pre><p>
 289 </p><p>
 290 Running <span class="command"><strong>smartctl -a /dev/hda</strong></span> confirmed the problem:
 291
 292 </p><pre class="programlisting">
 293 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 294 # 1  Extended offline    Completed: read failure       80%       682         0x021d9f44
 295
 296 Note that the failing LBA reported is 0x021d9f44 (base 16) = 35495748 (base 10)
 297
 298 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 299   5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
 300 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
 301 197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       3
 302 198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       3
 303 </pre><p>
 304 </p><p>
 305 and one can see above that there are 3 sectors on the list of pending
 306 sectors that the disk can't read but would like to reallocate.
 307 </p><p>
 308 The device also shows errors in the SMART error log:
 309 </p><pre class="programlisting">
 310 Error 212 occurred at disk power-on lifetime: 690 hours
 311   After command completion occurred, registers were:
 312   ER ST SC SN CL CH DH
 313   -- -- -- -- -- -- --
 314   40 51 12 46 9f 1d e2  Error: UNC 18 sectors at LBA = 0x021d9f46 = 35495750
 315
 316   Commands leading to the command that caused the error were:
 317   CR FR SC SN CL CH DH DC   Timestamp  Command/Feature_Name
 318   -- -- -- -- -- -- -- --   ---------  --------------------
 319   25 00 12 46 9f 1d e0 00 2485545.000  READ DMA EXT
 320 </pre><p>
 321 </p><p>
 322 Signs of trouble at this LBA may also be found in SYSLOG:
 323 </p><pre class="programlisting">
 324 [root]# grep LBA /var/log/messages | awk '{print $12}' | sort | uniq
 325  LBAsect=35495748
 326  LBAsect=35495750
 327 </pre><p>
 328 </p><p>
 329 So I decide to do a quick check to see how many bad sectors there
 330 really are. Using the bash shell I check 70 sectors around the trouble
 331 area:
 332 </p><pre class="programlisting">
 333 [root]# export i=35495730
 334 [root]# while [ $i -lt 35495800 ]
 335         &gt; do echo $i
 336         &gt; dd if=/dev/hda of=/dev/null bs=512 count=1 skip=$i
 337         &gt; let i+=1
 338         &gt; done
 339
 340 &lt;SNIP&gt;
 341
 342 35495734
 343 1+0 records in
 344 1+0 records out
 345 35495735
 346 dd: reading `/dev/hda': Input/output error
 347 0+0 records in
 348 0+0 records out
 349
 350 &lt;SNIP&gt;
 351
 352 35495751
 353 dd: reading `/dev/hda': Input/output error
 354 0+0 records in
 355 0+0 records out
 356 35495752
 357 1+0 records in
 358 1+0 records out
 359
 360 &lt;SNIP&gt;
 361 </pre><p>
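The same check can be wrapped in a small function that lists only the failing sectors. This is a sketch, not part of smartmontools: the function name scan_sectors is made up, and it assumes that dd exits with a non-zero status when a sector cannot be read.
</p><pre class="programlisting">
```shell
# scan_sectors DEVICE FIRST LAST
# Reads each 512-byte sector in FIRST..LAST once and lists those dd cannot read.
scan_sectors() {
  dev=$1
  lba=$2
  last=$3
  bad=0
  while [ "$lba" -le "$last" ]; do
    if ! dd if="$dev" of=/dev/null bs=512 count=1 skip="$lba" 2>/dev/null; then
      echo "unreadable: $lba"
      bad=$((bad + 1))
    fi
    lba=$((lba + 1))
  done
  echo "$bad unreadable sector(s)"
}

# Example call for the range checked above (needs root, so left commented out):
# scan_sectors /dev/hda 35495730 35495799
```
</pre><p>
Reads may be served from the kernel's cache rather than from the media; with GNU dd, adding iflag=direct is one way to try to avoid that.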
 362 </p><p>
 363 which shows that the seventeen sectors 35495735-35495751 (inclusive)
 364 are not readable.
 365 </p><p>
 366 Next, we identify the files at those locations.  The partitioning
 367 information on this disk is identical to the first example above, and
 368 as in that case the problem sectors are on the third partition
 369 <code class="filename">/dev/hda3</code>.  So we have:
 370 </p><pre class="programlisting">
 371      L=35495735 to 35495751
 372      S=5269320
 373      B=4096
 374 </pre><p>
 375 so that b=3778301 to 3778303 are the three bad blocks in the file
 376 system.
 377
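</p><p>
The block range can be checked with the same integer arithmetic as in the first example, applied to the first and last unreadable sector. A minimal sketch with made-up variable names:
</p><pre class="programlisting">
```shell
S=5269320            # starting sector of /dev/hda3
B=4096               # file system block size in bytes
first_lba=35495735   # first unreadable sector
last_lba=35495751    # last unreadable sector
spb=$((B / 512))
first_block=$(( (first_lba - S) / spb ))
last_block=$(( (last_lba - S) / spb ))
echo "sectors: $((last_lba - first_lba + 1))"
echo "blocks:  $first_block to $last_block"
```
</pre><p>
This gives 17 sectors spread over blocks 3778301 to 3778303. Three whole blocks span 24 sectors, so seven still readable sectors share a block with the bad ones.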
 378 </p><pre class="programlisting">
 379 [root]# debugfs
 380 debugfs 1.32 (09-Nov-2002)
 381 debugfs:  open /dev/hda3
 382 debugfs:  icheck 3778301
 383 Block   Inode number
 384 3778301 45192
 385 debugfs:  icheck 3778302
 386 Block   Inode number
 387 3778302 45192
 388 debugfs:  icheck 3778303
 389 Block   Inode number
 390 3778303 45192
 391 debugfs:  ncheck 45192
 392 Inode   Pathname
 393 45192   /S1/R/H/714979488-714985279/H-R-714979984-16.gwf
 394 debugfs:  quit
 395 </pre><p>
 396 Note that the first few steps of this procedure could also be done
 397 with a single command, which is very helpful if there are many bad
 398 blocks (thanks to Danie Marais for pointing this out):
 399 </p><pre class="programlisting">
 400 debugfs: icheck 3778301 3778302 3778303
 401 </pre><p>
 402 </p><p>
 403 Next, to confirm that this is really the damaged file:
 404 </p><p>
 405 </p><pre class="programlisting">
 406 [root]# md5sum /data/S1/R/H/714979488-714985279/H-R-714979984-16.gwf
 407 md5sum: /data/S1/R/H/714979488-714985279/H-R-714979984-16.gwf: Input/output error
 408 </pre><p>
 409 </p><p>
 410 Finally we force the disk to reallocate the three bad blocks:
 411 </p><pre class="programlisting">
 412 [root]# dd if=/dev/zero of=/dev/hda3 bs=4096 count=3 seek=3778301
 413 [root]# sync
 414 </pre><p>
 415 </p><p>
 416 We could also probably use:
 417 </p><pre class="programlisting">
 418 [root]# dd if=/dev/zero of=/dev/hda bs=512 count=17 seek=35495735
 419 </pre><p>
 420 </p><p>
 421 At this point we now have:
 422 </p><pre class="programlisting">
 423 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 424   5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
 425 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
 426 197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
 427 198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
 428 </pre><p>
 429 </p><p>
 430 which is encouraging, since the pending sectors count is now zero.
 431 Note that the drive reallocation count has not yet increased: the
 432 drive may now have confidence in these sectors and have decided not to
 433 reallocate them.
 434 </p><p>
 435 A device self test:
 436 </p><pre class="programlisting">
 437   [root]# smartctl -t long /dev/hda
 438 (then wait about an hour) shows no unreadable sectors or errors:
 439
 440 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 441 # 1  Extended offline    Completed without error       00%       692         -
 442 # 2  Extended offline    Completed: read failure       80%       682         0x021d9f44
 443 </pre><p>
 444 </p></div><div class="sect2" title="Unassigned sectors"><div class="titlepage"><div><div><h3 class="title"><a name="unassigned"></a>Unassigned sectors</h3></div></div></div><p>
 445 This section was written by Kay Diederichs. Even though this section
 446 assumes Linux and the ext2/ext3 file system, the strategy should be
 447 more generally applicable.
 448 </p><p>
 449 I read your badblocks-howto and greatly
 450 benefited from it. One thing that's (maybe) missing is that often the
 451 <span class="command"><strong>smartctl -t long</strong></span> scan finds a bad sector which is
 452 <span class="emphasis"><em> not</em></span> assigned to
 453 any file. In that case it does not help to run debugfs, or rather
 454 debugfs reports the fact that no file owns that sector. Furthermore,
 455 it is somewhat laborious to come up with the correct numbers for
 456 debugfs, and debugfs is slow ...
 457 </p><p>
 458 So what I suggest when
 459 Current_Pending_Sector/Offline_Uncorrectable errors are present is to create a
 460 huge file on that file system.
 461 </p><pre class="programlisting">
 462   dd if=/dev/zero of=/some/mount/point/bigfile bs=4k
 463 </pre><p>
 464 creates the file (any unused file name below the mount point will do). Leave
 465 it running until the partition/file system is full. This will make the disk
 466 reallocate those sectors which do not belong to a file. Check the
 467 <span class="command"><strong>smartctl -a</strong></span> output after that and make
 468 sure that the sectors are reallocated, then delete the file. If any bad sectors remain, use the debugfs
 469 method.  Of course the usual caveats apply - back it up first, and so
 470 on.
 471 </p></div><div class="sect2" title="ReiserFS example"><div class="titlepage"><div><div><h3 class="title"><a name="reiserfs_ex"></a>ReiserFS example</h3></div></div></div><p>
 472 This section was written by Joachim Jautz with additions from Manfred
 473 Schwarb.
 474 </p><p>
 475 The following problems were reported during a scheduled test:
 476 </p><pre class="programlisting">
 477 smartd[575]: Device: /dev/hda, starting scheduled Offline Immediate Test.
 478 [... 1 hour later ...]
 479 smartd[575]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
 480 smartd[575]: Device: /dev/hda, 1 Offline uncorrectable sectors
 481 </pre><p>
 482 </p><p>
 483 [Step 0] The SMART selftest/error log
 484 (see <span class="command"><strong>smartctl -l selftest</strong></span>) indicated there was a problem
 485 with block address (i.e. the 512 byte sector at) 58656333. The partition
 486 table (e.g. see <span class="command"><strong>sfdisk -luS /dev/hda</strong></span> or
 487 <span class="command"><strong>fdisk -ul /dev/hda</strong></span>) indicated that this block was in the
 488 <code class="filename">/dev/hda3</code> partition which contained a ReiserFS file
 489 system. That partition started at block address 54781650.
 490 </p><p>
 491 While doing the initial analysis it may also be useful to take a copy
 492 of the disk attributes returned by <span class="command"><strong>smartctl -A /dev/hda</strong></span>.
 493 Of particular interest are the values of the "Reallocated_Sector_Ct" and
 494 "Reallocated_Event_Count" attributes for ATA disks, or the grown list (GLIST)
 495 length for SCSI disks. If these have increased by the end of the procedure,
 496 the disk has re-allocated one or more sectors.
 497 </p><p>
 498 [Step 1] Get the file system's block size:
 499 </p><pre class="programlisting">
 500 # debugreiserfs /dev/hda3 | grep '^Blocksize'
 501 Blocksize: 4096
 502 </pre><p>
 503 </p><p>
 504 [Step 2] Calculate the block number:
 505 </p><pre class="programlisting">
 506 # echo "(58656333-54781650)*512/4096" | bc -l
 507 484335.37500000000000000000
 508 </pre><p>
 509 It is reassuring that the calculated 4 KB damaged block address in
 510 <code class="filename">/dev/hda3</code> is less than the "Count of blocks on the
 511 device" reported by <span class="command"><strong>debugreiserfs</strong></span>.
 512 </p><p>
 513 [Step 3] Try to get more info about this block =&gt; reading the block
 514 fails as expected but at least we see now that it seems to be unused.
 515 If we do not get the `Cannot read the block' error we should
 516 check if our calculation in [Step 2] was correct ;)
 517 </p><pre class="programlisting">
 518 # debugreiserfs -1 484335 /dev/hda3
 519 debugreiserfs 3.6.19 (2003 http://www.namesys.com)
 520
 521 484335 is free in ondisk bitmap
 522 The problem has occurred looks like a hardware problem.
 523 </pre><p>
 524 </p><p>
 525 If you have bad blocks, we advise you to get a new hard drive, because
 526 once you get one bad block that the disk drive internals cannot hide from
 527 your sight, the chances of getting more are generally said to become
 528 much higher (precise statistics are unknown to us), and this disk
 529 drive is probably not expensive enough for you to risk your
 530 time and data on it. If you don't want to follow that
 531 advice then if you have just a few bad blocks, try writing to the
 532 bad blocks and see if the drive remaps the bad blocks (that means
 533 it takes a block it has in reserve and allocates it for use for
 534 of that block number). If it cannot remap the block, use
 535 <span class="command"><strong>badblock</strong></span> option (-B) with reiserfs utils to handle
 536 this block correctly.
 537 </p><pre class="programlisting">
 538 bread: Cannot read the block (484335): (Input/output error).
 539
 540 Aborted
 541 </pre><p>
 542 So it looks like we have the right (i.e. faulty) block address.
 543 </p><p>
 544 [Step 4] Try then to find the affected file
 545 <sup>[<a name="id2550815" href="#ftn.id2550815" class="footnote">3</a>]</sup>:
 546 </p><pre class="programlisting">
 547 tar -cO /mydir | cat &gt;/dev/null
 548 </pre><p>
 549 If you do not find any unreadable files, then the block may be free or
 550 located in some metadata of the file system.
 551 </p><p>
 552 [Step 5] Try your luck: bang the affected block with
 553 <span class="command"><strong>badblocks -n</strong></span> (non-destructive read-write mode; unmount the file system
 554 first). If you are very lucky the failure is transient and you can provoke
 555 reallocation
 556 <sup>[<a name="id2550862" href="#ftn.id2550862" class="footnote">4</a>]</sup>:
 557 </p><pre class="programlisting">
 558 # badblocks -b 4096 -p 3 -s -v -n /dev/hda3 `expr 484335 + 100` `expr 484335 - 100`
 559 </pre><p>
 560 <sup>[<a name="id2550876" href="#ftn.id2550876" class="footnote">5</a>]</sup>
 561 </p><p>
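Note the order of the two block arguments: badblocks takes the last block before the first block. The sketch below only builds and prints the command line rather than running it; the margin of 100 blocks follows the command above and the variable names are made up.
</p><pre class="programlisting">
```shell
block=484335   # suspect file system block from Step 2
margin=100     # blocks to test on either side of it
first=$((block - margin))
last=$((block + margin))
# badblocks expects: device, then last block, then first block
cmd="badblocks -b 4096 -p 3 -s -v -n /dev/hda3 $last $first"
echo "$cmd"
```
</pre><p>
For the example this prints a range ending at block 484435 and starting at block 484235.
</p><p>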
 562 check success with <span class="command"><strong>debugreiserfs -1 484335 /dev/hda3</strong></span>.
 563 Otherwise:
 564 </p><p>
 565 [Step 6] Perform this step <span class="emphasis"><em>only</em></span> if Step 5 has failed
 566 to fix the problem: overwrite that block to force reallocation:
 567 </p><pre class="programlisting">
 568 # dd if=/dev/zero of=/dev/hda3 count=1 bs=4096 seek=484335
 569 1+0 records in
 570 1+0 records out
 571 4096 bytes transferred in 0.007770 seconds (527153 bytes/sec)
 572 </pre><p>
 573 </p><p>
 574 [Step 7] If you can't rule out the bad block being in metadata, do
 575 a file system check:
 576 </p><pre class="programlisting">
 577 reiserfsck --check /dev/hda3
 578 </pre><p>
 579 This could take a long time so you had probably better go for lunch ...
 580 </p><p>
 581 [Step 8] Proceed as stated earlier. For example, sync disk and run a long
 582 selftest that should succeed now.
 583 </p></div></div><div class="sect1" title="Repairs at the disk level"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="sdisk"></a>Repairs at the disk level</h2></div></div></div><p>
 584 This section first looks at a damaged partition table. Then it ignores
 585 the upper level impact of a bad block and just repairs the underlying
sector so that the defective sector will not cause problems in the future.
 587 </p><div class="sect2" title="Partition table problems"><div class="titlepage"><div><div><h3 class="title"><a name="partition"></a>Partition table problems</h3></div></div></div><p>
 588 Some software failures can lead to zeroes or random data being written
 589 on the first block of a disk. For disks that use a DOS-based partitioning
 590 scheme this will overwrite the partition table which is found at the
end of the first block. This is a single point of failure: once it is
damaged, tools like <span class="command"><strong>fdisk</strong></span> have no alternate data to use,
so they report no partitions or a damaged partition table.
 594 </p><p>
 595 One utility that may help is
 596 <a class="ulink" href="http://www.cgsecurity.org/wiki/TestDisk" target="_top">
 597 <code class="literal">testdisk</code></a> which can scan a disk looking for
 598 partitions and recreate a partition table if requested.
 599 <sup>[<a name="id2550980" href="#ftn.id2550980" class="footnote">6</a>]</sup>
 600 </p><p>
 601 Programs that create DOS partitions
 602 often place the first partition at logical block address 63. In Linux
 603 a loop back mount can be attempted at the appropriate offset of a disk
 604 with a damaged partition table. This approach may involve placing the
 605 disk with the damaged partition table in a working computer or perhaps
an external USB enclosure. Assuming the disk with the damaged partition
table is <code class="filename">/dev/hdb</code>, the following read-only loop back
mount could be tried:
 609 </p><pre class="programlisting">
 610 # mount -r /dev/hdb -o loop,offset=32256 /mnt
 611 </pre><p>
 612 The offset is in bytes so the number given is (63 * 512). If the file
system cannot be identified then a '-t &lt;fs_type&gt;' option
may be needed (although this is not a good sign). If this mount is
 615 successful, a backup procedure is advised.
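</p><p>
The same arithmetic applies to any starting sector. A minimal sketch,
assuming 512-byte sectors; the function name
<code class="literal">lba_to_offset</code> is made up for this illustration
and is not part of any tool:
</p><pre class="programlisting">
#!/bin/sh
# Hypothetical helper: convert a starting sector (LBA) into the byte
# offset expected by "mount -o loop,offset=...".
# Assumption: 512-byte sectors.
lba_to_offset() {
    echo $(( $1 * 512 ))
}

lba_to_offset 63     # prints 32256, the offset used above
</pre><p>
The value 63 is only the traditional default. If the first partition was
created at a different start sector, that sector has to be used instead.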
 616 </p><p>
 617 Only the primary DOS partitions are recorded in the first block of
 618 a disk. The extended DOS partition table is placed elsewhere on
 619 a disk. Again there is only one copy of it so it represents another
 620 single point of failure. All DOS partition information can be
 621 read in a form that can be used to recreate the tables with the
 622 <span class="command"><strong>sfdisk</strong></span> command. Obviously this needs to be done
 623 beforehand and the file put on other media. Here is how to fetch the
 624 partition table information:
 625 </p><pre class="programlisting">
 626 # sfdisk -dx /dev/hda &gt; my_disk_partition_info.txt
 627 </pre><p>
 628 Then <code class="filename">my_disk_partition_info.txt</code> should be placed on
 629 other media. If disaster strikes, then the disk with the damaged partition
 630 table(s) can be placed in a working system, let us say the damaged disk is
 631 now at <code class="filename">/dev/hdc</code>, and the following command restores
 632 the partition table(s):
 633 </p><pre class="programlisting">
 634 # sfdisk -x -O part_block_prior.img /dev/hdc &lt; my_disk_partition_info.txt
 635 </pre><p>
 636 Since the above command is potentially destructive it takes a copy of the
 637 block(s) holding the partition table(s) and puts it in
 638 <code class="filename">part_block_prior.img</code> prior to any changes. Then it
 639 changes the partition tables as indicated by
 640 <code class="filename">my_disk_partition_info.txt</code>. For what it is worth the
 641 author did test this on his system!
 642 <sup>[<a name="id2551099" href="#ftn.id2551099" class="footnote">7</a>]</sup>
 643 </p><p>
 644 For creating, destroying, resizing, checking and copying partitions, and
 645 the file systems on them, GNU's
 646 <a class="ulink" href="http://www.gnu.org/software/parted" target="_top">
 647 <code class="literal">parted</code></a> is worth examining.
 648 The <a class="ulink" href="http://www.tldp.org/HOWTO/Large-Disk-HOWTO.html" target="_top">
 649 <code class="literal">Large Disk HOWTO</code></a> is also a useful resource.
 650 </p></div><div class="sect2" title="LVM repairs"><div class="titlepage"><div><div><h3 class="title"><a name="lvm"></a>LVM repairs</h3></div></div></div><p>
 651 This section was written by Frederic BOITEUX. It was titled: "HOW TO
 652 LOCATE AND REPAIR BAD BLOCKS ON AN LVM VOLUME".
 653 </p><p>
Smartd reports an error in a short test:
 655 </p><pre class="programlisting">
 656 # smartctl -a /dev/hdb
 657 ...
 658 SMART Self-test log structure revision number 1
 659 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 660 # 1  Short offline       Completed: read failure       90%        66         37383668
 661 </pre><p>
So the disk has a bad block at LBA 37383668.
</p><p>
In which physical partition is the bad block?
 665 </p><pre class="programlisting">
 666 # sfdisk -luS /dev/hdb  # or 'fdisk -ul /dev/hdb'
 667
 668 Disk /dev/hdb: 9729 cylinders, 255 heads, 63 sectors/track
 669 Units = sectors of 512 bytes, counting from 0
 670
 671    Device Boot    Start       End   #sectors  Id  System
 672 /dev/hdb1            63    996029     995967  82  Linux swap / Solaris
 673 /dev/hdb2   *    996030   1188809     192780  83  Linux
 674 /dev/hdb3       1188810 156296384  155107575  8e  Linux LVM
 675 /dev/hdb4             0         -          0   0  Empty
 676 </pre><p>
 677
It is in the <code class="filename">/dev/hdb3</code> partition, an LVM2 partition.
Measured from the beginning of that partition, the bad block has an offset
(in 512-byte blocks) of
 680 </p><pre class="programlisting">
 681 (37383668 - 1188810) = 36194858
 682 </pre><p>
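The partition lookup and the subtraction above can be sketched as a small
shell function. This is only an illustration: the helper name
<code class="literal">find_partition</code> is hypothetical, and the start and
end sectors are assumed to have been copied by hand from the
<span class="command"><strong>sfdisk</strong></span> listing.
</p><pre class="programlisting">
#!/bin/sh
# Hypothetical helper: read "device start end" triples on standard input
# and print the device whose sector range contains the given LBA,
# followed by the offset of that LBA inside the partition.
find_partition() {
    lba=$1
    while read -r dev start end; do
        if [ "$lba" -ge "$start" ]; then
            if [ "$lba" -le "$end" ]; then
                echo "$dev $(( lba - start ))"
            fi
        fi
    done
}

# Start and end sectors taken from the sfdisk listing above.
printf '%s\n' \
    '/dev/hdb1 63 996029' \
    '/dev/hdb2 996030 1188809' \
    '/dev/hdb3 1188810 156296384' |
    find_partition 37383668      # prints: /dev/hdb3 36194858
</pre><p>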
 683 </p><p>
We now have to find which LVM2 logical partition the block belongs to.
</p><p>
In which logical partition is the bad block?
</p><p>
<span class="emphasis"><em>IMPORTANT</em></span>: LVM2 can use different schemes to divide
its physical partitions into logical ones: linear, striped, contiguous or
not. The following example assumes that allocation is linear!
 691 </p><p>
The physical partition used by LVM2 is divided into PE (Physical Extent)
units of equal size, starting 'pe_start' 512-byte blocks from
the beginning of the physical partition.
</p><p>
The 'pvdisplay' command gives the size of the PE (in KB) of the
LVM partition:
 698 </p><pre class="programlisting">
 699 #  part=/dev/hdb3 ; pvdisplay -c $part | awk -F: '{print $8}'
 700 4096
 701 </pre><p>
 702 </p><p>
To express the PE size in LBA blocks (512 bytes, or 0.5 KB, each), multiply
this number by 2: 4096 * 2 = 8192 blocks for each PE.
</p><p>
Finding the offset of the first PE from the beginning of the physical
partition (pe_start) is a bit more difficult. If you have a recent LVM2
version, try:
 708 </p><pre class="programlisting">
 709 # pvs -o+pe_start $part
 710 </pre><p>
 711 </p><p>
Alternatively, you can look in /etc/lvm/backup:
 713 </p><pre class="programlisting">
 714 # grep pe_start $(grep -l $part /etc/lvm/backup/*)
 715                         pe_start = 384
 716 </pre><p>
 717 </p><p>
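Next, find which PE contains the bad block by dividing the bad block's
offset within the physical partition by the PE size. Strictly speaking,
pe_start should be subtracted from the offset first, but here it is small
enough (384) not to change the integer result:
physical partition's bad block number / sizeof(PE) =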
 721 </p><pre class="programlisting">
 722 36194858 / 8192 = 4418.3176
 723 </pre><p>
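The division can be sketched in shell as follows. This is a minimal
illustration assuming linear allocation on a single physical partition; the
helper name <code class="literal">pe_index</code> is hypothetical, and unlike
the hand calculation above it subtracts pe_start before dividing.
</p><pre class="programlisting">
#!/bin/sh
# Hypothetical helper: index of the physical extent (PE) holding a block.
#   $1 = block offset inside the physical partition (512-byte blocks)
#   $2 = pe_start (512-byte blocks)
#   $3 = PE size (512-byte blocks)
pe_index() {
    echo $(( ($1 - $2) / $3 ))     # integer division drops the fraction
}

pe_size_kb=4096                    # value reported by pvdisplay above
pe_blocks=$(( pe_size_kb * 2 ))    # 8192 blocks of 512 bytes per PE

pe_index 36194858 384 "$pe_blocks" # prints 4418
</pre><p>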
 724 </p><p>
So we have to find which LVM2 logical partition uses PE
number 4418 (counting starts from 0):
 727 </p><pre class="programlisting">
 728 # lvdisplay --maps |egrep 'Physical|LV Name|Type'
 729   LV Name                /dev/WDC80Go/racine
 730     Type                linear
 731     Physical volume     /dev/hdb3
 732     Physical extents    0 to 127
 733   LV Name                /dev/WDC80Go/usr
 734     Type                linear
 735     Physical volume     /dev/hdb3
 736     Physical extents    128 to 1407
 737   LV Name                /dev/WDC80Go/var
 738     Type                linear
 739     Physical volume     /dev/hdb3
 740     Physical extents    1408 to 1663
 741   LV Name                /dev/WDC80Go/tmp
 742     Type                linear
 743     Physical volume     /dev/hdb3
 744     Physical extents    1664 to 1791
 745   LV Name                /dev/WDC80Go/home
 746     Type                linear
 747     Physical volume     /dev/hdb3
 748     Physical extents    1792 to 3071
 749   LV Name                /dev/WDC80Go/ext1
 750     Type                linear
 751     Physical volume     /dev/hdb3
 752     Physical extents    3072 to 10751
 753   LV Name                /dev/WDC80Go/ext2
 754     Type                linear
 755     Physical volume     /dev/hdb3
 756     Physical extents    10752 to 18932
 757 </pre><p>
 758 </p><p>
So PE #4418 is in the <code class="filename">/dev/WDC80Go/ext1</code>
LVM logical partition.
</p><p>
Next, find the logical block size of the file system on
<code class="filename">/dev/WDC80Go/ext1</code>.
</p><p>
It is an ext3 file system, so the block size can be obtained like this:
 766 </p><pre class="programlisting">
 767 # dumpe2fs /dev/WDC80Go/ext1 | grep 'Block size'
 768 dumpe2fs 1.37 (21-Mar-2005)
 769 Block size:               4096
 770 </pre><p>
 771 </p><p>
Now compute the bad block number within the file system.
</p><p>
The logical partition begins on PE 3072, which corresponds to the following
512-byte block of the physical partition:
</p><pre class="programlisting">
 (# PE's start of partition * sizeof(PE)) + partition offset[pe_start] =
 (3072 * 8192) + 384 = 25166208
</pre><p>
So the bad block number for the file system is:
 781 </p><pre class="programlisting">
 782 (36194858 - 25166208) / (sizeof(fs block) / 512)
 783 = 11028650 / (4096 / 512)  = 1378581.25
 784 </pre><p>
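The two calculations above can be combined in one shell function. This is a
sketch under the same assumptions as the text (linear allocation, with the
logical partition occupying one contiguous run of PEs); the helper name
<code class="literal">fs_block</code> is hypothetical.
</p><pre class="programlisting">
#!/bin/sh
# Hypothetical helper: file system block that holds a given 512-byte block.
#   $1 = bad block offset inside the physical partition (512-byte blocks)
#   $2 = first PE of the logical partition
#   $3 = PE size (512-byte blocks)
#   $4 = pe_start (512-byte blocks)
#   $5 = file system block size in bytes
fs_block() {
    lv_start=$(( $2 * $3 + $4 ))     # first 512-byte block of the logical partition
    echo $(( ($1 - lv_start) / ($5 / 512) ))
}

fs_block 36194858 3072 8192 384 4096   # prints 1378581
</pre><p>
Integer division discards the fractional part (.25 here), which only says
where inside the file system block the bad 512-byte block lies.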
 785 </p><p>
Test the file system block (the integer part, 1378581) by reading it:
 787 </p><pre class="programlisting">
 788 dd if=/dev/WDC80Go/ext1 of=block1378581 bs=4096 count=1 skip=1378581
 789 </pre><p>
 790 </p><p>
If this dd command succeeds without any error message on the console or in
syslog, then the block number calculation is probably wrong! <span class="emphasis"><em>Do not</em></span>
go further: re-check it, and if you cannot find the error, please
give up!
 795 </p><p>
Search and correction follow the same scheme as for simple
partitions:
</p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p>
Find possibly affected files with debugfs (icheck &lt;fs block nb&gt;,
then ncheck &lt;inode nb reported by icheck&gt;).
</p></li><li class="listitem"><p>
Reallocate the bad block by writing zeros to it, <span class="emphasis"><em>using the fs block size</em></span>:
 803 </p></li></ul></div><p>
 804 </p><p>
 805 </p><pre class="programlisting">
 806 dd if=/dev/zero of=/dev/WDC80Go/ext1 count=1 bs=4096 seek=1378581
 807 </pre><p>
 808 </p><p>
Et voilà!
 810 </p></div><div class="sect2" title="Bad block reassignment"><div class="titlepage"><div><div><h3 class="title"><a name="bb"></a>Bad block reassignment</h3></div></div></div><p>
 811 The SCSI disk command set and associated disk architecture are assumed
 812 in this section. SCSI disks have their own logical to physical mapping
 813 allowing a damaged sector (usually carrying 512 bytes of data) to be
 814 remapped irrespective of the operating system, file system or software
 815 RAID being used.
 816 </p><p>
 817 The terms <span class="emphasis"><em>block</em></span> and <span class="emphasis"><em>sector</em></span> are
 818 used interchangeably, although block tends to get used in higher level or
 819 more abstract contexts such as a <span class="emphasis"><em>logical block</em></span>.
 820 </p><p>
 821 When a SCSI disk is formatted, defective sectors identified during
the manufacturing process (the so-called primary list: PLIST),
 823 those found during the format itself (the certification list: CLIST),
 824 those given explicitly to the format command (the DLIST) and optionally
 825 the previous grown list (GLIST) are not used in the logical block
 826 map. The number (and low level addresses) of the unmapped sectors can be
 827 found with the READ DEFECT DATA SCSI command.
 828 </p><p>
 829 SCSI disks tend to be divided into zones which have spare sectors and
 830 perhaps spare tracks, to support the logical block address mapping
 831 process. The idea is that if a logical block is remapped, the heads do not
 832 have to move a long way to access the replacement sector. Note that spare
 833 sectors are a scarce resource.
 834 </p><p>
 835 Once a SCSI disk format has completed successfully, other problems
 836 may appear over time. These fall into two categories:
 837 </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p>
 838 recoverable: the Error Correction Codes (ECC) detect a problem
 839 but it is small enough to be corrected. Optionally other strategies
 840 such as retrying the access may retrieve the data.
 841 </p></li><li class="listitem"><p>
 842 unrecoverable: try as it may, the disk logic and ECC algorithms
 843 cannot recover the data. This is often reported as a
 844 <span class="emphasis"><em>medium error</em></span>.
 845 </p></li></ul></div><p>
 846 </p><p>
Other things can go wrong, typically in the transport, and
they will be reported using a term other than
 849 <span class="emphasis"><em>medium error</em></span>. For example a disk may decide a read
 850 operation was successful but a computer's host bus adapter (HBA) checking
 851 the incoming data detects a CRC error due to a bad cable or termination.
 852 </p><p>
Depending on the disk vendor, recoverable errors can be ignored. After all,
some disks have up to 68 bytes of ECC above the payload size of 512 bytes,
so why use up spare sectors, which are limited in number?
<sup>[<a name="id2551516" href="#ftn.id2551516" class="footnote">8</a>]</sup>
 858 If the disk can recover the data and does decide to re-allocate (reassign)
 859 a sector, then first it checks the settings of the ARRE and AWRE bits in the
 860 read-write error recovery mode page. Usually these bits are set
 861 <sup>[<a name="id2551535" href="#ftn.id2551535" class="footnote">9</a>]</sup>
 862 enabling automatic (read or write) re-allocation. The automatic
 863 re-allocation may also fail if the zone (or disk) has run out of spare
 864 sectors.
 865 </p><p>
 866 Another consideration with RAIDs, and applications that require a high
 867 data rate without pauses, is that the controller logic may not want a
 868 disk to spend too long trying to recover an error.
 869 </p><p>
 870 Unrecoverable errors will cause a <span class="emphasis"><em>medium error</em></span> sense
 871 key, perhaps with some useful additional sense information. If the extended
 872 background self test includes a full disk read scan, one would expect the
self test log to list the bad block, as shown in <a class="xref" href="#rfile" title="Repairs in a file system">the section called &#8220;Repairs in a file system&#8221;</a>.
 874 Recent SCSI disks with a periodic background scan should also list
 875 unrecoverable read errors (and some recoverable errors as well). The
 876 advantage of the background scan is that it runs to completion while self
 877 tests will often terminate at the first serious error.
 878 </p><p>
 879 SCSI disks expect unrecoverable errors to be fixed manually using the
 880 REASSIGN BLOCKS SCSI command since loss of data is involved. It is possible
 881 that an operating system or a file system could issue the REASSIGN BLOCKS
 882 command itself but the authors are unaware of any examples. The REASSIGN BLOCKS
command will reassign one or more blocks: it attempts to recover the data
(perhaps partially; a forlorn hope at this stage), fetches an unused spare
sector from the current zone and adds the damaged old sector to the GLIST (hence the
 886 name "grown" list). The contents of the GLIST may not be that interesting
 887 but <span class="command"><strong>smartctl</strong></span> prints out the number of entries in the grown
 888 list and if that number grows quickly, the disk may be approaching the end
 889 of its useful life.
 890 </p><p>
 891 Here is an alternate brute force technique to consider: if the data on the
 892 SCSI or ATA disk has all been backed up (e.g. is held on the other disks in
 893 a RAID 5 enclosure), then simply reformatting the disk may be the least
 894 cumbersome approach.
 895 </p><div class="sect3" title="Example"><div class="titlepage"><div><div><h4 class="title"><a name="sexample"></a>Example</h4></div></div></div><p>
 896 Given a "bad block", it still may be useful to look at the
 897 <span class="command"><strong>fdisk</strong></span> command (if the disk has multiple partitions)
 898 to find out which partition is involved, then use
 899 <span class="command"><strong>debugfs</strong></span> (or a similar tool for the file system in
 900 question) to find out which, if any, file or other part of the file system
may have been damaged. This is discussed in <a class="xref" href="#rfile" title="Repairs in a file system">the section called &#8220;Repairs in a file system&#8221;</a>.
 902 </p><p>
 903 Then a program that can execute the REASSIGN BLOCKS SCSI command is
 904 required. In Linux (2.4 and 2.6 series), FreeBSD, Tru64(OSF) and Windows
 905 the author's <span class="command"><strong>sg_reassign</strong></span> utility in the sg3_utils
 906 package can be used. Also found in that package is
 907 <span class="command"><strong>sg_verify</strong></span> which can be used to check that a block is
 908 readable.
 909 </p><p>
 910 Assume that logical block address 1193046 (which is 123456 in hex) is
 911 corrupt
 912 <sup>[<a name="id2551756" href="#ftn.id2551756" class="footnote">10</a>]</sup>
 913 on the disk at <code class="filename">/dev/sdb</code>. A long selftest command like
 914 <span class="command"><strong>smartctl -t long /dev/sdb</strong></span> may result in log results
 915 like this:
 916 </p><pre class="programlisting">
 917 # smartctl -l selftest /dev/sdb
 918 smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
 919 Home page is http://smartmontools.sourceforge.net/
 920
 921
 922 SMART Self-test log
 923 Num  Test              Status            segment  LifeTime  LBA_first_err [SK ASC ASQ]
 924      Description                         number   (hours)
 925 # 1  Background long   Failed in segment      -     354           1193046 [0x3 0x11 0x0]
 926 # 2  Background short  Completed              -     323                 - [-   -    -]
 927 # 3  Background short  Completed              -     194                 - [-   -    -]
 928 </pre><p>
 929 </p><p>
 930 The <span class="command"><strong>sg_verify</strong></span> utility can be used to confirm that there
 931 is a problem at that address:
 932 </p><pre class="programlisting">
 933 # sg_verify --lba=1193046 /dev/sdb
 934 verify (10):  Fixed format, current;  Sense key: Medium Error
 935  Additional sense: Unrecovered read error
 936   Info fld=0x123456 [1193046]
 937   Field replaceable unit code: 228
 938   Actual retry count: 0x008b
 939 medium or hardware error, reported lba=0x123456
 940 </pre><p>
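The self-test log gives the address in decimal, while
<span class="command"><strong>sg_verify</strong></span> reports it in hex as well. As a small
sketch, the shell's own <span class="command"><strong>printf</strong></span> can convert
between the two forms:
</p><pre class="programlisting">
#!/bin/sh
# Convert between the decimal LBA shown in the self-test log and the
# hexadecimal form printed by sg_verify.
printf '0x%x\n' 1193046      # prints 0x123456
printf '%d\n' 0x123456       # prints 1193046
</pre><p>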
 941 </p><p>
 942 Now the GLIST length is checked before the block reassignment:
 943 </p><pre class="programlisting">
 944 # sg_reassign --grown /dev/sdb
 945 &gt;&gt; Elements in grown defect list: 0
 946 </pre><p>
 947 </p><p>
 948 And now for the actual reassignment followed by another check of the GLIST
 949 length:
 950 </p><pre class="programlisting">
 951 # sg_reassign --address=1193046 /dev/sdb
 952
 953 # sg_reassign --grown /dev/sdb
 954 &gt;&gt; Elements in grown defect list: 1
 955 </pre><p>
 956 </p><p>
 957 The GLIST length has grown by one as expected. If the disk was unable to
 958 recover any data, then the "new" block at lba 0x123456 has vendor specific
 959 data in it. The <span class="command"><strong>sg_reassign</strong></span> utility can also do bulk
reassigns; see <span class="command"><strong>man sg_reassign</strong></span> for more information.
 961 </p><p>
 962 The <span class="command"><strong>dd</strong></span> command could be used to read the contents of
 963 the "new" block:
 964 </p><pre class="programlisting">
 965 # dd if=/dev/sdb iflag=direct skip=1193046 of=blk.img bs=512 count=1
 966 </pre><p>
 967 </p><p>
 968 and a hex editor
 969 <sup>[<a name="id2551874" href="#ftn.id2551874" class="footnote">11</a>]</sup>
 970 used to view and potentially change the
 971 <code class="filename">blk.img</code> file. An altered <code class="filename">blk.img</code>
 972 file (or <code class="filename">/dev/zero</code>) could be written back with:
 973 </p><pre class="programlisting">
 974 # dd if=blk.img of=/dev/sdb seek=1193046 oflag=direct bs=512 count=1
 975 </pre><p>
 976 </p><p>
 977 More work may be needed at the file system level, especially if the
 978 reassigned block held critical file system information such as
 979 a superblock or a directory.
 980 </p><p>
 981 Even if a full backup of the disk is available, or the disk has been
 982 "ejected" from a RAID, it may still be worthwhile to reassign the bad
 983 block(s) that caused the problem (or simply format the disk (see
 984 <span class="command"><strong>sg_format</strong></span> in the sg3_utils package)) and re-use the
 985 disk later (not unlike the way a replacement disk from a manufacturer
 986 might be used).
 987 </p><p>
 988 $Id: badblockhowto.xml 2873 2009-08-11 21:46:20Z dipohl $
 989 </p></div></div></div><div class="footnotes"><br><hr width="100" align="left"><div class="footnote"><p><sup>[<a name="ftn.id2506421" href="#id2506421" class="para">1</a>] </sup>
 990 Self-Monitoring, Analysis and Reporting Technology -&gt; SMART
 991 </p></div><div class="footnote"><p><sup>[<a name="ftn.id2506498" href="#id2506498" class="para">2</a>] </sup>
 992 Starting with GNU coreutils release 5.3.0, the <span class="command"><strong>dd</strong></span>
 993 command in Linux includes the options 'iflag=direct' and 'oflag=direct'.
 994 Using these with the <span class="command"><strong>dd</strong></span> commands should be helpful,
 995 because adding these flags should avoid any interaction
 996 with the block buffering IO layer in Linux and permit direct reads/writes
 997 from the raw device.  Use <span class="command"><strong>dd --help</strong></span> to see if your
 998 version of dd supports these options. If not, the latest code for dd
 999 can be found at <a class="ulink" href="http://alpha.gnu.org/gnu/coreutils" target="_top">
1000 <code class="literal">alpha.gnu.org/gnu/coreutils</code></a>.
1001 </p></div><div class="footnote"><p><sup>[<a name="ftn.id2550815" href="#id2550815" class="para">3</a>] </sup>
1002 Do not use <span class="command"><strong>tar -c -f /dev/null</strong></span> or
1003 <span class="command"><strong>tar -cO /mydir &gt;/dev/null</strong></span>. GNU tar does not
1004 actually read the files if <code class="filename">/dev/null</code> is used as
1005 archive path or as standard output, see <span class="command"><strong>info tar</strong></span>.
1006 </p></div><div class="footnote"><p><sup>[<a name="ftn.id2550862" href="#id2550862" class="para">4</a>] </sup>
Important: the block range chosen here is arbitrary, but do not test only a
single block, as bad blocks are often social. Do not make the range too large
either, as this test is probably not entirely risk-free.
1010 </p></div><div class="footnote"><p><sup>[<a name="ftn.id2550876" href="#id2550876" class="para">5</a>] </sup>
1011 The rather awkward `expr 484335 + 100` (note the back quotes) can be replaced
1012 with $((484335+100)) if the bash shell is being used. Similarly the last
1013 argument can become $((484335-100)) .
1014 </p></div><div class="footnote"><p><sup>[<a name="ftn.id2550980" href="#id2550980" class="para">6</a>] </sup>
1015 <span class="command"><strong>testdisk</strong></span> scans the media for the beginning of file
1016 systems that it recognizes. It can be tricked by data that looks
1017 like the beginning of a file system or an old file system from a
1018 previous partitioning of the media (disk). So care should be taken.
1019 Note that file systems should not overlap apart from the fact that
extended partitions lie wholly within an extended partition table
1021 allocation. Also if the root partition of a Linux/Unix installation
1022 can be found then the <code class="filename">/etc/fstab</code> file is a useful
1023 resource for finding the partition numbers of other partitions.
1024 </p></div><div class="footnote"><p><sup>[<a name="ftn.id2551099" href="#id2551099" class="para">7</a>] </sup>
1025 Thanks to Manfred Schwarb for the information about storing partition
1026 table(s) beforehand.
1027 </p></div><div class="footnote"><p><sup>[<a name="ftn.id2551516" href="#id2551516" class="para">8</a>] </sup>
1028 Detecting and fixing an error with ECC "on the fly" and not going the further
1029 step and reassigning the block in question may explain why some disks have
1030 large numbers in their read error counter log. Various worried users have
1031 reported large numbers in the "errors corrected without substantial delay"
1032 counter field which is in the "Errors corrected by ECC fast" column in
1033 the <span class="command"><strong>smartctl -l error</strong></span> output.
1034 </p></div><div class="footnote"><p><sup>[<a name="ftn.id2551535" href="#id2551535" class="para">9</a>] </sup>
1035 Often disks inside a hardware RAID have the ARRE and AWRE bits
1036 cleared (disabled) so the RAID controller can do things manually or flag
1037 the disk for replacement.
1038 </p></div><div class="footnote"><p><sup>[<a name="ftn.id2551756" href="#id2551756" class="para">10</a>] </sup>
1039 In this case the corruption was manufactured by using the WRITE LONG
1040 SCSI command. See <span class="command"><strong>sg_write_long</strong></span> in sg3_utils.
1041 </p></div><div class="footnote"><p><sup>[<a name="ftn.id2551874" href="#id2551874" class="para">11</a>] </sup>
Most window managers have a handy calculator that will do hex to
decimal conversions.
1044 </p></div></div></div></body></html>