The TS7300 in my test rig, showing its own power consumption. Ignore the
temperature data; I disabled those sensors.
High Availability and the TS7300 Single-Board Computer
Rec 19-jan-2008 09:42
A classic challenge in network monitoring is this: when you design a system to
monitor another system's availability, the former must be more reliable/available
than the latter. Say, if you are monitoring a bunch of redundant servers that
yield an overall >99.99% availability, the monitoring system itself must be even
more available/stable/reliable than that, say, >99.999%.
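To put those percentages in perspective, here is a small sketch that converts an availability figure into allowed downtime per year. The function name is made up for illustration, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Downtime budget per year implied by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.99, 99.999):
    print(f"{pct}% -> {downtime_minutes_per_year(pct):.1f} min/year")
```

So "four nines" leaves a budget of roughly 53 minutes a year, while "five nines" leaves only about 5, which is the bar the monitoring box itself has to clear.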
Because of that, when I was choosing the hardware for my monitoring system, I
wanted a physically small and rugged computer that had good internal redundancy
and outstanding reliability. Essentially, a computer that "doesn't break".
The most common causes of failure in PCs are coolers, power supplies and hard disks.
Let's start by replacing the hard disks with flash memory, which, with no moving parts, is
far more reliable.
I also want it to stay alive even in the face of a 48-hour power loss. Not many
UPS systems can support ordinary PCs for that long, and the ones that do cost a
small fortune. If we can't scale up the UPS, let's scale down the power
requirements: let's try a computer that draws less than 3W of power. (Ordinary
PCs draw about 100W -- see this page to get a rough estimate for your case).
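A quick back-of-the-envelope sketch shows why scaling down the load matters so much. It uses only the 100W, 3W and 48-hour figures from the text, ignores UPS conversion losses, and the function name is just for illustration.

```python
def energy_needed_wh(load_watts: float, hours: float) -> float:
    """Energy in watt-hours needed to run a constant load for a given time."""
    return load_watts * hours


OUTAGE_HOURS = 48

for label, watts in (("ordinary PC", 100), ("low-power board", 3)):
    print(f"{label}: {energy_needed_wh(watts, OUTAGE_HOURS):.0f} Wh")
```

That is 4800 Wh against 144 Wh: the first needs a serious battery bank, the second fits in a handful of cells.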
Yet, I want a computer powerful enough to run a full Debian Linux
setup -- the monitoring software, like most of our production software, runs on this platform.
The performance requirements are quite modest, though. We will be running lots
of Perl scripts to poll services and count how many minutes they have been operating
in fully redundant mode, how many under the various degraded modes,
and how many minutes they have been offline. There will be about 30 or 40 such monitors.
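The minute-counting idea can be sketched as follows. This is not the actual monitoring code (which is written in Perl); it is a minimal Python illustration. It assumes each monitor is polled once a minute and that a service's state can be derived from how many of its replicas answered. The names `classify` and `MinuteMonitor` are made up for this example.

```python
from collections import Counter


def classify(replicas_up: int, replicas_total: int) -> str:
    """Map a poll result to a state name (hypothetical naming scheme)."""
    if replicas_up == 0:
        return "offline"
    if replicas_up == replicas_total:
        return "full"
    return f"degraded-{replicas_up}-of-{replicas_total}"


class MinuteMonitor:
    """Counts how many polled minutes a service spent in each state."""

    def __init__(self, replicas_total: int):
        self.replicas_total = replicas_total
        self.minutes = Counter()

    def record_poll(self, replicas_up: int) -> None:
        # Assumed to be called once per minute with the number of replicas up.
        self.minutes[classify(replicas_up, self.replicas_total)] += 1

    def availability(self) -> float:
        """Fraction of polled minutes in which the service was not offline."""
        total = sum(self.minutes.values())
        if total == 0:
            raise ValueError("no polls recorded yet")
        return 1 - self.minutes["offline"] / total


monitor = MinuteMonitor(replicas_total=3)
for up in (3, 3, 2, 0, 1, 3):  # six simulated one-minute polls
    monitor.record_poll(up)
print(dict(monitor.minutes), monitor.availability())
```

Keeping one counter per state means each degraded mode gets its own tally, and the availability figure falls out of the same data.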
This computer will double as an emergency access station so we can access the
servers' consoles over their serial ports. This means we need at least 8
serial ports.
I also want USB ports so we can connect a GSM modem for emergency access in case
of Internet link failure and to send SMS.
We also need 2 Ethernet ports to connect to two switches in a
redundant configuration. 100BaseT is more than enough; no need for Gigabit speeds.
There is at least one computer that does fit this bill perfectly: the
TS-7300 from Technologic Systems. It features a 200MHz ARM CPU with 128 MiB
of RAM and 2 SD card slots, which act like the hard disks
in a conventional computer. The performance is modest: it is about as fast
as a 166MHz Pentium classic.
However, for this application, it is more than enough. This also means
the CPU doesn't even get appreciably hot, dispensing with coolers altogether.
This ARM CPU is pretty much the same one you have in many mobile phones these
days. In other words: if mobile phones had two Ethernet ports, I could just as
well be using a mobile phone instead.
The included SD card comes with a full Debian-ARM distro, so there are almost no
additional complexities in the software part.
One oddity with the included software stack is that the root filesystem is ext2.
That the bootloader requires this is understandable, for simplicity reasons. But
using ext2 for the full Debian distro seems to me a bad idea.
First, I've always heard that for flash-based storage we should use flash-optimized
filesystems like jffs2 or yaffs -- perhaps the old limitation
of ~100,000 writes to the same page that most flash devices used to have has been
fixed by now and I haven't heard of it? I did read somewhere that SD card firmware has spare
sectors and the ability to remap them, just like ordinary modern hard disks do,
but I have no idea how effective that actually is in real-life workloads.
Second, after any crash you'll have to go through the lengthy
fsck process at startup (this is the reason why I've been a happy
reiserfs user for many, many years).
I'll try to figure out a way to create new partitions with better filesystems
and see how all that interacts with the boot process and assess the reliability
results.
The TS7300 boot process is kinda cool. The ROM code can checksum
part of the SD card contents (including its serial number) and refuse to boot
if it doesn't check out.
This makes it a bit harder to hack your TS7300 simply by inserting another SD card.
If the checksum checks out, it loads the kernel image and initrd,
potentially dropping you to a shell a mere 1.7 seconds after startup. The default
initrd has busybox and other goodies, and the linuxrc script has an option to
pivot to another root filesystem with the full Debian system -- which takes a lot longer
to boot because of the initscripts. I wonder how fast we could get a full
system boot using init replacements such as runit along with daemontools
and minimalistic DJB-style services.
An intriguing feature of the TS7300 is that many peripherals -- such as the second
Ethernet, the video card, 8 (of 10) serial ports and two GPIO
blocks -- are implemented in an FPGA (now that is something your average mobile
phone doesn't have... yet). This means we can change
them -- I'm planning to delete the video card, replace it with 8 more UARTs and
add a daughterboard with a few MAX232 level converters and D-Sub9 connectors.
In the end, this will get me 18 serial ports, enough for a full rack of servers.
I look forward to the not so distant day when ordinary PCs will have not one, but
possibly several, large FPGAs inside -- this will open up fantastic possibilities
in terms of high performance computing.
In the picture you see the TS7300 in my test rig with a power supply I designed that
measures how much power the system is using. 9V is the input power from a
wall-wart; the supply converts this to the regulated 5V the TS7300 requires. We see its
current draw is about 480mA in idle mode at 200MHz. I measured that when the CPU goes
to 100%, it draws as much as 600mA. Scaling the clock down to 42MHz brings consumption
down to 380mA in idle mode. When I set the CPU clock to 14MHz it consumed even less,
but the Ethernet port stopped working.
I'm using this data to design the power supply + multiserial daughterboard.
Technologic Systems does offer a battery backup system called
TS-BAT3, but at 1000mAh
capacity, it would keep the system alive for only 2 hours. I plan to use
two banks of 6 ordinary
Duracell MN1300 non-rechargeable 15000mAh alkaline D-cells
to exceed the desired 48-hour mark.
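Here is a rough sanity check of that 48-hour target. It rests on several assumptions of mine: the currents were measured on the 5V rail, each alkaline cell delivers its full rated 15Ah at a nominal 1.5V (optimistic, since real capacity drops with load and temperature), the two 6-cell banks are used in parallel, and the 9V-to-5V converter is 80% efficient. The efficiency figure is a placeholder, not a measured value.

```python
RAIL_VOLTS = 5.0              # regulated supply the board runs from
CELL_VOLTS = 1.5              # nominal alkaline cell voltage (assumption)
CELL_AH = 15.0                # rated D-cell capacity quoted above
CELLS_PER_BANK = 6            # 6 x 1.5V = 9V per bank
BANKS = 2
CONVERTER_EFFICIENCY = 0.80   # hypothetical placeholder, not measured


def runtime_hours(load_amps: float, efficiency: float = CONVERTER_EFFICIENCY) -> float:
    """Estimated runtime for a constant current drawn on the 5V rail."""
    battery_wh = BANKS * CELLS_PER_BANK * CELL_VOLTS * CELL_AH
    load_watts = RAIL_VOLTS * load_amps / efficiency  # power taken from the cells
    return battery_wh / load_watts


for label, amps in (("idle @ 200MHz", 0.48), ("100% CPU", 0.60), ("idle @ 42MHz", 0.38)):
    print(f"{label}: {runtime_hours(amps):.0f} h")
```

Under these assumptions the idle case comes out at about 90 hours and the full-load case at about 72 hours, so the 48-hour mark looks reachable with some margin for real-world battery derating.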
I tried to plug a TS9989i GSM minimodem into the USB port but it did not work.
I guess the minimodem takes more current than the USB port can provide. I will
make a powered USB cable to test that. Oh, boy, it already seems I'll be needing
even more batteries.
And the good thing is that the TS7300 is quite affordable -- the test one I bought
went for about USD$420, including the SD card with the software, a few extras
and the outrageously expensive $100 shipping
UPS charged me.
There are cell phones that cost more than that.