My question is quite simple: I want to run smartctl -i -A on every disk in a server.
I have many servers with different numbers of disks and different RAID controllers, so I need to scan all the drives on each one for diagnosis.
I'm thinking of running smartctl --scan | awk '{print $1}' >> test.log, so that test.log ends up holding the device name of every drive.
After that I'd need some if or loop constructs to run smartctl against each of those drives.
I don't know if this is the best way to do it, since I need to identify the RAID controller too.
Am I heading in the right direction?
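One way to avoid hard-coding device letters or slot ranges is to let smartctl --scan supply both the device path and the -d type, since it prints lines such as "/dev/sda -d scsi # /dev/sda, SCSI device". Here is a minimal sketch, assuming that output format. The function name scan_to_args is made up for illustration, and whether disks behind a RAID controller show up in the scan depends on the smartmontools version and the controller.

```shell
#!/bin/bash
# scan_to_args (hypothetical helper): read `smartctl --scan` output on stdin
# and print "<device> <type>" per line, dropping the trailing "# ..." comment.
scan_to_args() {
    local line dev opt type _
    while IFS= read -r line; do
        line=${line%%#*}                     # strip the comment part
        read -r dev opt type _ <<< "$line"   # split the remaining words
        # Keep only lines shaped like "<device> -d <type>".
        if [[ -n $dev && $opt == "-d" && -n $type ]]; then
            printf '%s %s\n' "$dev" "$type"
        fi
    done
}
```

It could then be used as: smartctl --scan | scan_to_args | while read -r dev type; do smartctl -i -A -d "$type" "$dev"; done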
Edit:
These are the commands I usually use to troubleshoot:
Without RAID Controller
for i in {c..d}; do
echo "Disk sd$i" $SN $MD
smartctl -i -A /dev/sd$i | grep -iE '^ *5 |^197 |^198 |FAILING_NOW|SERIAL'
done
PERC Controller
for i in {0..12}; do
echo "$i" $SN $MD
smartctl -i -A -T permissive /dev/sda -d megaraid,$i | grep -iE '^ *5 |^197 |^198 |FAILING_NOW|SERIAL'
done
/usr/sbin/megastatus --physical
/usr/sbin/megastatus --logical
3ware Controller
for i in {0..10}; do
echo "Disk $i" $SN $MD
smartctl -i -A /dev/twa0 -d 3ware,$i | grep -iE '^ *5 |^197 |^198 |FAILING_NOW|SERIAL'
done
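The three loops above share the same filter, so it could live in one function that each loop pipes into. The sketch below uses awk instead of grep so it can pick out the raw value and the WHEN_FAILED column. The name smart_summary is made up, and the column positions assume the usual ATA attribute table layout printed by smartctl -A.

```shell
#!/bin/bash
# smart_summary (hypothetical helper): read `smartctl -i -A` output on stdin and
# print the serial number line plus ID, name, raw value and WHEN_FAILED for
# attributes 5, 197 and 198.
smart_summary() {
    awk '
        tolower($0) ~ /^serial number:/ { print; next }
        # Attribute rows start with the numeric ID; column 2 is the name,
        # column 9 is WHEN_FAILED and column 10 is the first word of RAW_VALUE.
        $1 == "5" || $1 == "197" || $1 == "198" {
            printf "%s %s raw=%s when_failed=%s\n", $1, $2, $10, $9
        }
    '
}
```

For example: smartctl -i -A /dev/sda | smart_summary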
SmartArray & MegaRAID Controller:
smartctl -a -d cciss,0 /dev/cciss/c0d0
/opt/3ware/9500/tw_cli show
cd /tmp
DD (Rewrite disk block (DESTROY DATA)):
dd if=/dev/zero of=/dev/HD* bs=4M
HD*: sda, sdb…
Burning (Stress test (DESTROY DATA)):
/opt/systems/bin/vs-burnin --destructive --time=<hours> /tmp/burninlog.txt
dmesg & kernerrors:
tail /var/log/kernerrors
dmesg | grep -i -E 'ata|fault|error'
So what I'm trying to do is automate these commands.
I want the script to detect every disk the host has and run the appropriate smartctl command for each case.
I'd like something like a menu that lets me choose between running smartctl and running a destructive command. If I choose smartctl,
the script should scan all disks and run the command that matches the host configuration (with or without a RAID controller),
and if I choose a destructive command, the script should ask me which disk number to run it on.
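A menu like the one described could be sketched as follows. Everything here is an assumption made for illustration: the function names are made up, the lspci match strings are guesses at how MegaRAID/PERC, 3ware and Smart Array controllers usually identify themselves, and the destructive branch only prints the dd command instead of running it.

```shell
#!/bin/bash
# All names below are hypothetical; the lspci match strings are assumptions.

# Read `lspci` output on stdin and print a guess at the smartctl -d family.
detect_controller() {
    local pci
    pci=$(cat)
    if   grep -qi 'megaraid'    <<< "$pci"; then echo megaraid
    elif grep -qi '3ware'       <<< "$pci"; then echo 3ware
    elif grep -qi 'smart array' <<< "$pci"; then echo cciss
    else echo none
    fi
}

# Ask for a device twice. Print it on success; return 1 on a mismatch
# or on a name that does not look like a /dev path.
confirm_device() {
    local first second re='^/dev/[a-z0-9/]+$'
    read -r -p 'Disk to overwrite (e.g. /dev/sdb): ' first
    read -r -p "Type $first again to confirm: " second
    if [[ $first =~ $re && $first == "$second" ]]; then
        printf '%s\n' "$first"
    else
        echo 'Aborted: device names did not match or were invalid.' >&2
        return 1
    fi
}

# Interactive menu; call main_menu at the end of the script to start it.
main_menu() {
    local choice ctrl dev
    select choice in 'SMART report' 'Overwrite a disk (DESTROYS DATA)' 'Quit'; do
        case $REPLY in
            1) ctrl=$(lspci | detect_controller)
               echo "Controller type: $ctrl" ;;  # run the matching smartctl loop here
            2) if dev=$(confirm_device); then
                   # Printed only, not executed, in this sketch.
                   echo "Would run: dd if=/dev/zero of=$dev bs=4M"
               fi ;;
            3) break ;;
            *) echo 'Unknown option' ;;
        esac
    done
}
```

Asking for the device name twice is one simple guard against wiping the wrong disk; the real dd call would replace the echo only once the rest has been tested.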
Edit 2:
I resolved my problem with the following script:
#!/bin/bash
# Troubleshoot.sh
# Runs smartctl -i -A on every /dev/sd* and /dev/hd* device that smartctl --scan reports.
# Start each run with fresh log files.
rm -f Troubleshoot.log HDs.log
smartctl --scan | awk '{print $1}' >> HDs.log
lspci | grep -i raid >> HDs.log
getArray ()
{
array=() # Start from an empty array
while IFS= read -r line # Read a line verbatim
do
array+=("$line") # Append it to the array
done < "$1"
}
getArray "HDs.log"
for e in "${array[@]}"
do
if [[ $e =~ ^/dev/(sd|hd) ]]
then
echo "smartctl -i -A $e" >> Troubleshoot.log
smartctl -i -A "$e" >> Troubleshoot.log # Run smartctl on every disk the host has
fi
done
exit $? # Exit with the status of the last command run in the loop above.
I don't know if this is the right or the best solution, but it works for me!
I appreciate all the help!