Dobrica Pavlinušić's random unstructured stuff
Bad block HOWTO for smartmontools: Revision 4
My transcript of fixing a bad block, based on the original documentation:

    root@t42:~# smartctl -l selftest /dev/hda
    smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/

    === START OF READ SMART DATA SECTION ===
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Extended offline    Completed: read failure        50%      1863         48784734
    # 2  Extended offline    Completed: read failure        50%      1719         48784734

So we have an error at LBA 48784734.

    root@t42:~# smartctl -A /dev/hda
    smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/

    === START OF READ SMART DATA SECTION ===
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       65536
      2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       3662
      3 Spin_Up_Time            0x0007   250   250   033    Pre-fail  Always       -       1
      4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       1520
      5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
      8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
      9 Power_On_Hours          0x0012   096   096   000    Old_age   Always       -       1866
     10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1319
    191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       1
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1703983
    193 Load_Cycle_Count        0x0012   095   095   000    Old_age   Always       -       56800
    194 Temperature_Celsius     0x0002   171   171   000    Old_age   Always       -       32 (Lifetime Min/Max 14/42)
    196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       5
    197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       1
    198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

And we do have a Current_Pending_Sector (raw value 1).

    root@t42:~# fdisk -lu /dev/hda

    Disk /dev/hda: 40.0 GB, 40007761920 bytes
    255 heads, 63 sectors/track, 4864 cylinders, total 78140160 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Disk identifier: 0xcccdcccd

       Device Boot      Start         End      Blocks   Id  System
    /dev/hda1   *          63    75119939    37559938+  83  Linux
    /dev/hda2        75119940    78140159     1510110    5  Extended
    /dev/hda5        75120003    78140159     1510078+  82  Linux swap / Solaris

The sector is part of /dev/hda1, since 48784734 lies between 63 and 75119939. Let's find it. First, what is the filesystem block size?

    root@t42:~# tune2fs -l /dev/hda1 | grep Block
    Block count:              9389984
    Block size:               4096
    Blocks per group:         32768

Then let's calculate the block offset within the /dev/hda1 partition:

    root@t42:~# bc
    ( ( 48784734 - 63 ) * 512 ) / 4096
    6098083

Let's see whether any file uses that block:

    root@t42:~# debugfs /dev/hda1
    debugfs 1.41.3 (12-Oct-2008)
    debugfs:  icheck 6098083
    Block   Inode number
    6098083 <block not found>

No file has landed on it yet.
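The bc arithmetic above can also be sketched as a small shell function. This is only an illustration: the name lba_to_fs_block and its argument order are my own invention, not part of smartmontools or e2fsprogs, and it assumes a shell with 64-bit integer arithmetic.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): convert an absolute disk LBA
# into a filesystem block number relative to the start of a partition.
# Arguments: LBA, partition start sector, sector size (bytes), fs block size (bytes)
lba_to_fs_block() {
    lba=$1
    part_start=$2
    sector_size=$3
    block_size=$4
    # Integer division rounds down, which selects the filesystem block
    # that contains the failing sector.
    echo $(( (lba - part_start) * sector_size / block_size ))
}

# Values taken from the smartctl, fdisk and tune2fs output in this transcript.
lba_to_fs_block 48784734 63 512 4096
```

With the numbers from this disk it prints 6098083, the same block number bc returned.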
So let's just overwrite it with zeros, which forces the drive to reallocate the sector:

    root@t42:~# dd if=/dev/zero of=/dev/hda1 bs=4096 count=1 seek=6098083
    1+0 records in
    1+0 records out
    4096 bytes (4.1 kB) copied, 6.0343e-05 s, 67.9 MB/s
    root@t42:~# sync

And check the SMART status again:

    root@t42:~# smartctl -A /dev/hda
    smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/

    === START OF READ SMART DATA SECTION ===
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       1
      2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       3662
      3 Spin_Up_Time            0x0007   250   250   033    Pre-fail  Always       -       1
      4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       1520
      5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
      8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
      9 Power_On_Hours          0x0012   096   096   000    Old_age   Always       -       1866
     10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1319
    191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       1
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1703983
    193 Load_Cycle_Count        0x0012   095   095   000    Old_age   Always       -       56800
    194 Temperature_Celsius     0x0002   157   157   000    Old_age   Always       -       35 (Lifetime Min/Max 14/42)
    196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       6
    197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

You can see that Current_Pending_Sector dropped to 0 and Reallocated_Event_Count increased to 6. It's probably time to throw away this disk...
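To compare attributes before and after the write without reading the whole table, something like the following could pull out a single raw value. It is a rough sketch: smart_raw_value is a made-up helper name, and it assumes the ten-column attribute layout that smartctl 5.38 prints above, where RAW_VALUE starts at the 10th whitespace-separated field. The sample rows are copied from the second smartctl -A run.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -A` output on stdin and print the
# first word of the RAW_VALUE column for the attribute named in $1.
smart_raw_value() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Two rows copied from the second `smartctl -A /dev/hda` run in this transcript.
sample='196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       6
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0'

printf '%s\n' "$sample" | smart_raw_value Current_Pending_Sector
printf '%s\n' "$sample" | smart_raw_value Reallocated_Event_Count
```

On a live system the same function could be fed directly, as in smartctl -A /dev/hda | smart_raw_value Current_Pending_Sector.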
Let's run the test again:

    root@t42:~# smartctl -t long /dev/hda
    smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/

    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
    Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 29 minutes for test to complete.
    Test will complete after Tue Jan 27 14:07:03 2009

    Use smartctl -X to abort test.

And wait for the test to finish to get:

    root@t42:~# smartctl -l selftest /dev/hda
    smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/

    === START OF READ SMART DATA SECTION ===
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Extended offline    Completed without error        00%      1866         -
    # 2  Extended offline    Completed: read failure        50%      1863         48784734
    # 3  Extended offline    Completed: read failure        50%      1719         48784734
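The LBA of the most recent self-test can be picked out of that log with a short awk filter. Again this is just a sketch: latest_selftest_lba is a hypothetical name, and it assumes the log layout shown above, where the newest entry starts with "# 1" and the last column is LBA_of_first_error.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -l selftest` output on stdin and print
# the LBA_of_first_error column of the newest entry ("# 1").
# A "-" means the newest test did not report a failing LBA.
latest_selftest_lba() {
    awk '$1 == "#" && $2 == "1" { print $NF }'
}

# Log rows copied from the final self-test log in this transcript.
after='# 1  Extended offline    Completed without error        00%      1866         -
# 2  Extended offline    Completed: read failure        50%      1863         48784734'

printf '%s\n' "$after" | latest_selftest_lba
```

For the log above it prints "-", while the first log in this transcript would give 48784734.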