
Archive for October, 2006

Disk performance, Opterons and Smartraid 5.

I wrote to the Linux Kernel Mailing List earlier this year complaining that I was getting shocking performance from a SmartRAID V card in an Opteron server running Linux 2.6. It’s worth noting that performance seems much better with a 32-bit distribution and kernel installed.

I had a similar experience yesterday trying to recover data from an identical box, this time using the i2o_block driver rather than dpt_i2o.

At seemingly random times, the CPU would jump to 100% utilisation in io_wait, and the machine would seem to lock up as before. This happened after installing 2.6.18.1 too. After 10 minutes, the machine would spring back to life, and the CPU would be mainly idle again.

Eventually I spotted that some files were ‘always’ hard to rsync off this computer, and some were ‘always’ easy.

Then I noticed that the device was 52% fragmented! It looks like this was the reason behind the bad performance. Where the slightest bit of fragmentation was present, IO performance went out of the window.

With cheap cards like the (from lspci) ‘0 RAID bus controller: Adaptec (formerly DPT) SmartRAID V Controller (rev 01)’, and the AMD Opteron Processor 246, ext3, and Linux i2o_block or dpt_i2o, the only way to avoid depressing performance is to hugely overspec the space requirements on your drive, and keep an eye on how fragmented the partitions are.

Additionally, we have no problem with Adaptec AAC-RAID cards running 64-bit Linux, and strongly recommend these over the cards supported by i2o_block.

Are security risks over-hyped?

According to this week’s Computer Weekly, 79% of ‘top IT professionals’ surveyed by recruitment consultant PSD think that IT security risks are over-hyped.

All IT support desks suffer a similar problem – very few people notice when everything is going well, but everyone notices when something is going wrong. Security support desks will suffer a similar fate.

If IT security risks received enough exposure, then corporate and desktop computers would all be patched up to date in order to prevent identity theft and the seizure of trade secrets, applications would be designed to prevent customer data loss, spam botnets would not exist, and corporate defacement attacks would not happen.

The article offers some kind of explanation: “IT recruiter Mark Sullivan said, ‘The trouble with security threats is that there has not been a massive attack on the internet recently. IT security prevents large losses from happening and maybe that is not put across strongly enough.’”

If this is true, then this approach will mean we suffer another large, headline-grabbing security breach at a major firm. If this happens in the world of e-commerce, then it will continue to frighten people from shopping on-line, and this is bad news for everyone in the industry.

Handling timeout properly

I love Sysadmin magazine; it has been my first exposure to lots of really good technologies. Which is why, when I come across an oversight, I think it’s a terrible shame. I think this month’s Shell techniques article on handling timeout is missing the most useful technique: the Unix alarm signal.

An alarm signal allows you to send a special signal to a running process after a given length of time, in order to handle the unintended or negative effects of time passing, such as users not interacting with your application within the intended time, or a network or inter-process communication failing to receive a response.

Handling an alarm signal in bash is the same as handling other signals, such as safely handling the interrupt signal (signal 2), which interrupts a process when you hit control and c.

You ‘catch’ an alarm signal in a bash script using ‘trap’ :

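#!/bin/bash
# Background timer: send signal 14 (alarm) to this script ($$) after 5 seconds.
sleep 5 && kill -s 14 $$ &
# Run timeup when signal 14 arrives.
trap timeup 14

timeup()
{
    echo "Time up .. abort!"
    exit 1
}

# The work being timed: about 7 seconds in total.
for i in 1 2 3 4 5 6 7
do
    echo $i
    sleep 1
done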

Here the ‘sleep’ command at the top, which runs in the background, forces the script to complete within 5 seconds; otherwise an alarm signal is sent to the script. This is useful because it allows you as a sysadmin to write a script which handles a timeout differently from a deliberate interruption. With a block added to catch signal 2 (control and c), the two cases look like this :

factory:~ andy$ sh dingdong.sh
1
2
^Cinterrupted !
factory:~ andy$ sh dingdong.sh
1
2
3
4
5
Time up .. abort!
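
The listing above does not show the block that catches signal 2 and prints ‘interrupted !’. A minimal sketch of what such a block might look like, to sit alongside the alarm handler, is below. The handler name ‘interrupted’ and the exit status are assumptions, not taken from the original script.

```shell
#!/bin/bash
# Sketch only: the name 'interrupted' and the exit status are assumptions.

# Runs when signal 2 (INT) arrives, i.e. when the user hits control and c.
interrupted()
{
    echo "interrupted !"
    exit 130    # 128 + 2, the conventional status for death by signal 2
}

trap interrupted 2
```

Because each signal has its own handler, the tidy-up for a user giving up can differ from the tidy-up for a timeout.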

Changing the first ‘sleep’ line to a period greater than the time the script currently takes to run (e.g. 10 seconds) allows it to finish without one of these signals being generated.

factory:~ andy$ sh dingdong.sh
1
2
3
4
5
6
7

This technique is much more flexible than using read -t to handle timeout, as it allows you to invoke a more complex tidy-up procedure when a timeout is encountered. You can also handle the timeout of, for example, a DNS lookup or wget, as well as user input.
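
One catch with a slow command such as wget: bash does not run a trap until the current foreground command has finished, so the slow command has to be started in the background, with the script sitting in ‘wait’, which a trapped signal does interrupt. What follows is a rough sketch under that assumption, not a definitive implementation. The names run_with_alarm and on_alarm are made up for illustration, and ‘sleep 5’ stands in for the slow lookup or download.

```shell
#!/bin/bash
# Sketch only: run_with_alarm and on_alarm are hypothetical names.

# Alarm handler: note the timeout and tidy up by stopping the slow command.
on_alarm()
{
    timed_out=1
    echo "Time up .. abort!"
    kill "$cmd_pid" 2>/dev/null
}

# Usage: run_with_alarm SECONDS COMMAND [ARGS...]
run_with_alarm()
{
    limit=$1
    shift
    timed_out=0
    self=${BASHPID:-$$}     # PID of the shell that traps the alarm
    trap on_alarm ALRM      # ALRM is signal 14

    "$@" &                  # background, so that 'wait' can be interrupted
    cmd_pid=$!

    # Background timer: send the alarm to this shell after $limit seconds.
    ( sleep "$limit"; kill -s ALRM "$self" ) >/dev/null 2>&1 &
    timer_pid=$!

    wait "$cmd_pid"         # returns early if the alarm arrives
    status=$?

    # Cancel the timer if it has not fired yet.
    kill "$timer_pid" 2>/dev/null
    wait "$timer_pid" 2>/dev/null

    if [ "$timed_out" -eq 1 ]; then
        wait "$cmd_pid" 2>/dev/null   # reap the command we stopped
        return 1
    fi
    return "$status"
}

# 'sleep 5' stands in for a slow wget or DNS lookup.
run_with_alarm 2 sleep 5 || echo "gave up waiting"
```

The handler here sets a flag instead of exiting, so the calling script can decide what to do next, for example retrying against a different server.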