Disk failure and suprises

Once in a while – and especially if you have a System with an uptime > 300d – HW tends to fail.

Good thing, if you have a Cluster, where you can do the maintance on one Node, while the import stuff is still running on the other ones. Also good to always have a Backup of the whole content, if a disk fails.

One word before I continue: Regarding Software-RAIDs: I had a big problem once with a HW RAID Controller going bonkers and spent a week to find another matching controller to get the data back. At least for redundant Servers it is okay for me to go with SW RAID (e.g. mdraid). And if you can, you should go with ZFS in any case :-).

Anyhow, if you see graphs like this one:

sda-week

You know that something goes terribly wrong.

Doing a quick check, states the obvious:

# cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sda4[0](F) sdb4[1]
      1822442815 blocks super 1.2 [2/1] [_U]

md2 : active raid1 sda3[0](F) sdb3[1]
      1073740664 blocks super 1.2 [2/1] [_U]

md1 : active raid1 sda2[0](F) sdb2[1]
      524276 blocks super 1.2 [2/1] [_U]

md0 : active raid1 sda1[0](F) sdb1[1]
      33553336 blocks super 1.2 [2/1] [_U]

unused devices: <none>

So /dev/sda seems to be gone from the RAID. Let’s do the checks.

Hardware check

hdparm:

# hdparm -I /dev/sda

/dev/sda:
HDIO_DRIVE_CMD(identify) failed: Input/output error 

smartctl:

# smartctl -a /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-2.6.32-34-pve] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

Vendor:               /0:0:0:0
Product:
User Capacity:        600,332,565,813,390,450 bytes [600 PB]
Logical block size:   774843950 bytes
scsiModePageOffset: response length too short, resp_len=47 offset=50 bd_len=46
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

/dev/sda is dead, Jim

Next thing is to schedule a Disk Replacement and moving all services to another host/prepare the host to shutdown for maintenance.

Preparation for disk replacement

Stop all running Containers:

# for VE in $(vzlist -Ha -o veid); do vzctl stop $VE; done

I also disabled the “start at boot” option to have a quick startup of the Proxmox Node.

Next: Remove the faulty disk from the md-RAID:

# mdadm /dev/md0 -r /dev/sda1
# mdadm /dev/md1 -r /dev/sda2
# mdadm /dev/md2 -r /dev/sda3
# mdadm /dev/md3 -r /dev/sda4

Shutting down the System.

… some guy in the DC moved to the server at the expected time and replaces the faulty disk …

After that, the system is online again.

  1. copy partition table from /dev/sdb to /dev/sda
    # sgdisk -R /dev/sda /dev/sdb
    
  2. recreatea another GUID for /dev/sda
    # sgdisk -G /dev/sda
    

Then add /dev/sda to the RAID again.

# mdadm /dev/md0 -a /dev/sda1
# mdadm /dev/md1 -a /dev/sda2
# mdadm /dev/md2 -a /dev/sda3
# mdadm /dev/md3 -a /dev/sda4



# cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sda4[2] sdb4[1]
      1822442815 blocks super 1.2 [2/1] [_U]
      [===>.................]  recovery = 16.5% (301676352/1822442815) finish=329.3min speed=76955K/sec

md2 : active raid1 sda3[2] sdb3[1]
      1073740664 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[2] sdb2[1]
      524276 blocks super 1.2 [2/1] [_U]
        resync=DELAYED

md0 : active raid1 sda1[2] sdb1[1]
      33553336 blocks super 1.2 [2/2] [UU]

unused devices: <none>

After nearly 12h, the resync was completed:

diskstats_iops-day

and then this happened:

# vzctl start 300
# vzctl enter 300
enter into CT 300 failed
Unable to open pty: No such file or directory

There are plenty of comments if you search for Unable to open pty: No such file or directory

But

# svzctl exec 300 /sbin/MAKEDEV tty
# vzctl exec 300 /sbin/MAKEDEV pty
# vzctl exec 300 mknod --mode=666 /dev/ptmx c 5 2

did not help:

# vzctl enter 300
enter into CT 300 failed
Unable to open pty: No such file or directory    

And

# strace -ff vzctl enter 300

produces a lot of garbage – meaning stacktraces that did not help to solve the problem.
Then we were finally able to enter the container:

# vzctl exec 300 mount -t devpts none /dev/pts

But having a look into the process list was quite devastating:

# vzctl exec 300 ps -A
  PID TTY          TIME CMD
    1 ?        00:00:00 init
    2 ?        00:00:00 kthreadd/300
    3 ?        00:00:00 khelper/300
  632 ?        00:00:00 ps    

That is not really what you expect when you have a look into the process list of a Mail-/Web-Server, isn’t it?
After looking araound into the system and searching through some configuration files, it became obvious, that there was a system update in the past, but someone forgot to install upstart. So that was easy, right?

# vzctl exec 300 apt-get install upstart
Reading package lists...
Building dependency tree...
Reading state information...
The following packages will be REMOVED:
  sysvinit
The following NEW packages will be installed:
  upstart
WARNING: The following essential packages will be removed.
This should NOT be done unless you know exactly what you are doing!
  sysvinit
0 upgraded, 1 newly installed, 1 to remove and 0 not upgraded.
Need to get 486 kB of archives.
After this operation, 851 kB of additional disk space will be used.
You are about to do something potentially harmful.
To continue type in the phrase 'Yes, do as I say!'
 ?] Yes, do as I say!

BUT:

Err http://ftp.debian.org/debian/ wheezy/main upstart amd64 1.6.1-1
  Could not resolve 'ftp.debian.org'
Failed to fetch http://ftp.debian.org/debian/pool/main/u/upstart/upstart_1.6.1-1_amd64.deb  Could not resolve     'ftp.debian.org'
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

No Network – doooh.

So… that was that. Another plan to: chroot. Let’s start:
First we need to shutdown the container – or at least what is running of it:

# vzctl stop 300
Stopping container ...
Container is unmounted    

Second, we have to mount a bunch of devices to the FS:

# mount -o bind /dev /var/lib/vz/private/300/dev
# mount -o bind /dev/shm /var/lib/vz/private/300/dev/shm
# mount -o bind /proc /var/lib/vz/private/300/proc
# mount -o bind /sys /var/lib/vz/private/300/sys

Then perform the chroot and the installtion itself:

# chroot /var/lib/vz/private/300 /bin/bash -i

# apt-get install upstart
# exit      

At last, umount all the things:

# umount -l /var/lib/vz/private/300/sys
# umount -l /var/lib/vz/private/300/proc
# umount -l /var/lib/vz/private/300/dev/shm
# umount -l /var/lib/vz/private/300/dev

If you have trouble, because some of the devices are busy, kill the processes you find out with:

# lsof /var/lib/vz/private/300/dev

Or just clean the whole thing

# lsof 2> /dev/null | egrep '/var/lib/vz/private/300'    

Try to umount again :-).

Now restarting the container again.

# Starting container ...
# Container is mounted
# Adding IP address(es): 10.10.10.130
# Setting CPU units: 1000
# Setting CPUs: 2
# Container start in progress...  

And finally:

 # vzctl enter 300
 root@300 #  
 root@300 # ps -A | wc -l
 142

And this looks a lot better \o/.

One Reply to “Disk failure and suprises”

Leave a Reply