
My question is quite simple: I want to run smartctl -i -A on every disk in a server. I have many servers with different numbers of disks and different RAID controllers, so I need to scan all drives for a diagnosis. I'm thinking of running smartctl --scan | awk '{print $1}' >> test.log, so that test.log lists all the drive paths.
After that I would need some if or loop constructs to run smartctl against every drive. I don't know if this is the best way to do it, since I need to identify the RAID controller too. Am I heading in the right direction?
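
One way to keep the controller information is to use the whole scan line rather than only its first column. The sketch below assumes that smartctl --scan prints lines of the form "DEVICE -d TYPE # comment" and that the installed smartmontools build also lists disks behind the RAID controller (newer builds usually do for MegaRAID; this is not verified for every controller). The function names scan_targets and report_all are made up for illustration.

```shell
#!/bin/sh
# Sketch: turn `smartctl --scan` output into one smartctl call per device.
# Assumed scan line format: "/dev/sda -d scsi # /dev/sda, SCSI device".

# scan_targets: read scan output on stdin, print "DEVICE -d TYPE" per line
# by dropping the trailing "# comment" and any empty lines.
scan_targets() {
    sed -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d'
}

# report_all: run smartctl -i -A on every target reported by the scan.
report_all() {
    smartctl --scan | scan_targets | while IFS= read -r target; do
        echo "== smartctl -i -A $target"
        # $target is left unquoted on purpose so "-d TYPE" splits into words.
        smartctl -i -A $target
    done
}
```

Running report_all > Troubleshoot.log as root would then collect the reports in one file. Because the "-d TYPE" part comes from the scan itself, the script does not need a separate branch per controller, as long as the scan actually reports the member disks.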

Edit:

I usually use these commands to troubleshoot:

Without RAID Controller

for i in {c..d}; do
    echo "Disk sd$i" $SN $MD
    smartctl -i -A /dev/sd$i | grep -E '^  5|^197|^198|FAILING_NOW|SERIAL'
done

PERC Controller

for i in {0..12}; do
    echo "$i" $SN $MD
    smartctl -i -A -T permissive /dev/sda -d megaraid,$i | grep -E '^  5|^197|^198|FAILING_NOW|SERIAL'
done
/usr/sbin/megastatus --physical
/usr/sbin/megastatus --logical

3ware Controller

for i in {0..10}; do
    echo "Disk $i" $SN $MD
    smartctl -i -A /dev/twa0 -d 3ware,$i | grep -E '^  5|^197|^198|FAILING_NOW|SERIAL'
done
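
The three loops above share the same filter, so it could live in one helper. This is only a sketch: it assumes the usual ten-column attribute table printed by smartctl -A (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw value) and a "Serial Number:" or "Serial number:" line in the -i output. The name smart_summary is hypothetical.

```shell
#!/bin/sh
# smart_summary: read `smartctl -i -A` output on stdin and print the serial
# number plus the raw values of attributes 5, 197 and 198.
smart_summary() {
    awk '
        /^Serial [Nn]umber:/ { print "serial=" $3 }
        # Attribute rows start with the numeric ID; the raw value is field 10.
        # Field 9 is WHEN_FAILED and is "-" while the attribute is healthy.
        $1 == 5 || $1 == 197 || $1 == 198 {
            printf "%s=%s%s\n", $2, $10, ($9 == "-" ? "" : (" " $9))
        }
    '
}
```

It would be used as smartctl -i -A /dev/sda | smart_summary, or with the -d megaraid,N and -d 3ware,N variants shown above. Matching on the ID field avoids depending on the exact column padding that the grep pattern relies on.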

SmartArray & Megaraid Controller:

smartctl -a -d cciss,0 /dev/cciss/c0d0
/opt/3ware/9500/tw_cli show
cd /tmp

DD (Rewrite disk block (DESTROY DATA)):

dd if=/dev/zero of=/dev/HD* bs=4M
HD*: sda, sdb…

Burning (Stress test (DESTROY DATA)):

/opt/systems/bin/vs-burnin --destructive --time=<hours> /tmp/burninlog.txt

Dmesg&kernerrors:

tail /var/log/kernerrors
dmesg | grep -i -E 'ata|fault|error'

So what I'm trying to do is automate these commands.
I want the script to detect all disks in the host and run the appropriate smartctl command for each case.
Something like a menu that lets me choose between running smartctl and running a destructive command. If I choose smartctl,
the script scans all disks and runs the command that matches the host configuration (with or without a RAID controller);
if I choose a destructive command, the script asks me for the number of the disk to run it on.
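
For the destructive branch of such a menu, a guard that makes the operator retype the target is a common precaution. The sketch below is an assumption about how that could look, not part of any existing tool; confirm_target and wipe_disk are hypothetical names, and the dd line is the one from the list above.

```shell
#!/bin/sh
# confirm_target DEVICE: ask the operator to retype the device path.
# Returns 0 only on an exact match, so a stray Enter or "y" never wipes a disk.
confirm_target() {
    printf 'Type %s again to confirm DESTROYING its data: ' "$1" >&2
    IFS= read -r answer || return 1
    [ "$answer" = "$1" ]
}

# wipe_disk DEVICE: hypothetical wrapper around the dd command shown earlier.
wipe_disk() {
    # Refuse anything that is not an existing block device.
    [ -b "$1" ] || { echo "$1 is not a block device" >&2; return 1; }
    confirm_target "$1" || { echo "Aborted." >&2; return 1; }
    dd if=/dev/zero of="$1" bs=4M
}
```

A menu entry would then call wipe_disk /dev/sdb, and the disk is only overwritten when the typed path matches exactly.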


Edit 2:

I resolved my problem with the following script:

#!/bin/bash
# Troubleshoot.sh
# A more elaborate version of Troubleshoot.sh.

SUCCESS=0
E_DB=99    # Error code for missing entry.

declare -A address
#       -A option declares associative array.



# Start from empty log files on every run.
rm -f Troubleshoot.log HDs.log

smartctl --scan | awk '{print $1}' >> HDs.log
lspci | grep -i raid >> HDs.log

getArray ()
{
    i=0
    while IFS= read -r line # Read a line verbatim (keep backslashes and spaces)
    do
        array[i]=$line # Put it into the array
        i=$((i + 1))
    done < "$1"
}

getArray "HDs.log"


for e in "${array[@]}"
do
    if [[ $e =~ ^/dev/[sh]d ]] # Only device paths; the lspci lines are skipped
        then
            echo "smartctl -i -A $e" >> Troubleshoot.log
            smartctl -i -A "$e" >> Troubleshoot.log # Run smartctl on every disk the host has
    fi
done
exit $?   # Exit with the status of the last command that ran.

I don't know if this solution is the right one or the best one, but it works for me!
I appreciate all the help!

jasonwryan
ZeroNegative
  • Welcome to this site. Your question is good, but to maximize your chances of success you should describe your goal precisely, say what you've tried so far, describe your problem if there is one, and then ask your question. By following this simple advice you could get a much better answer. – Kiwy Mar 27 '14 at 12:30
  • And please give info on the system you are using as well. Otherwise **[serverfault](http://serverfault.com/questions)** might also be a place to ask this kind of question (**but do not ask the same question in both places!**) – Ouki Mar 27 '14 at 12:35
  • @Kiwy: I added some more information, see if it's good now :) – ZeroNegative Mar 27 '14 at 12:58
  • @Ouki: I can't give too many details because of company policies; all I can do is share the information I wrote in the Edit. – ZeroNegative Mar 27 '14 at 13:01
  • @ZeroNegative What you basically want is for someone to develop a script for you, and you're very unlikely to find someone who will do that on U&L or on any other Stack Exchange site. But yes, your question is much more complete now. Also note that I've edited your question twice; please try to analyse what I changed so you can format your questions and answers correctly next time ;-) – Kiwy Mar 27 '14 at 13:13
  • @Kiwy Sure, I will! I just need someone to point me to a way to start. I had thought of running smartctl --scan | awk '{print $1}' >> test.log to write a log with the disk paths, and then reading those paths back into the command. For example, if test.log has two lines, /dev/sda and /dev/sdb, the script would pick up these two paths and run smartctl -i -A /dev/sda && smartctl -i -A /dev/sdb. Am I looking for help in the wrong place? I appreciate the help! – ZeroNegative Mar 27 '14 at 13:30

2 Answers

1

"So what I'm trying to do is automate these commands."

This already exists: it is what smartd, the monitoring daemon from smartmontools, does.

You normally configure the behaviour you want in /etc/smartd.conf.

Example:

# DEVICESCAN: tells smartd to scan for all ATA and SCSI devices
# Alternative setting to ignore temperature and power-on hours reports in syslog.
DEVICESCAN -I 194 -I 231 -I 9

Alternatively, you can list your disks explicitly, for example:

/dev/sdc -d 3ware,0 -a -s L/../../7/01

If smartd discovers an error, you'll get an email, provided an address is set with -m:

/dev/hdc -a -I 194 -W 4,45,55 -R 5 -m [email protected]

There are a number of other options and switches as well; for those you'll need to read the smartd.conf manpage.
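
If explicit entries are preferred over DEVICESCAN, they could be generated from the scan output instead of typed by hand. This is a rough sketch under the assumption that smartctl --scan prints "DEVICE -d TYPE # comment" lines; the function name smartd_entries and the choice of -a plus -m are illustrative, and the result should be reviewed before it goes into /etc/smartd.conf.

```shell
#!/bin/sh
# smartd_entries MAILTO: read `smartctl --scan` output on stdin and print one
# smartd.conf line per device: monitor everything (-a), mail MAILTO (-m).
smartd_entries() {
    mailto=$1
    # Drop the trailing "# comment" and empty lines, keep "DEVICE -d TYPE".
    sed -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d' |
    while IFS= read -r target; do
        printf '%s -a -m %s\n' "$target" "$mailto"
    done
}
```

For example, smartctl --scan | smartd_entries root prints candidate lines that can be compared with the hand-written examples above.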

0

@xeruf. I run into this question in my own work from time to time and keep having to redo the work, lol. So that I and other people can find a basic SMART output command, here is what I run:

for i in /dev/sd?; do
    echo "$i"
    sudo smartctl -a "$i"
done
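
When a loop like this runs unattended, the exit status of smartctl is worth checking too, since the smartctl manpage documents it as a bit mask. The helper below is a sketch that decodes it; the name explain_status and the shortened wording of each message are my own, only the bit positions follow the manpage.

```shell
#!/bin/sh
# explain_status STATUS: print one line per bit set in smartctl's exit status.
explain_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "ok"
        return 0
    fi
    bit=0
    # Messages are listed in bit order, bit 0 first.
    for msg in \
        "command line did not parse" \
        "device open failed" \
        "a SMART or ATA command failed" \
        "SMART status: DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes were at or below threshold in the past" \
        "device error log contains errors" \
        "self-test log contains errors"
    do
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "$msg"
        fi
        bit=$((bit + 1))
    done
}
```

Inside the loop it could be called as sudo smartctl -H "$i" >/dev/null; explain_status $? to get a short verdict per disk.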
roaima