Parallel Applications Technologies Group



Raid Again Storage using Commodity Hardware And Linux

RASCHAL


Using Linux and inexpensive IDE drives to build large storage systems is becoming increasingly common. RASCHAL is such a storage system, with 40TB of storage, built at JPL for the PAT group's data-intensive science investigations and as a testbed for cluster storage systems. The whole system was designed and built in about six weeks, and it became operational in April 2003.
The system was assembled by Jimi Patel, who also handled the final installation details. This storage cluster has been under heavy use ever since it became operational. It is used by the OnEarth website, for both the public access data and for work on the future world landmass image.

Please scroll to the bottom of this page for updates on this system.

Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government, Jet Propulsion Laboratory, or California Institute of Technology.

RASCHAL has 160 IDE drives, organized in ten separate rack-mountable PC cases, with each case containing a Linux system and sixteen 250GB IDE drives.
Each Linux system has 1GB of RAM, a 2.4GHz Xeon CPU, an ATX motherboard with integrated dual copper Gigabit Ethernet and video, a CD-ROM drive and two 8-port IDE RAID cards. Each set of eight drives controlled by an IDE RAID card is configured as RAID5, with one drive's worth of capacity in each set used for redundancy. The IDE drives are mounted in externally accessible removable trays. Since both the trays and the RAID cards support hot-swap, a single disk failure within each set of eight drives can be detected, and a replacement can be installed without affecting the operational state of the system.
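The capacity figures above can be checked with a little arithmetic. The sketch below is only an illustration: the function and constant names are made up for this example, sizes are decimal gigabytes as quoted for the drives, and filesystem overhead is ignored.

```python
# Capacity arithmetic using the figures quoted in the text.
# All names here are illustrative, not part of any RASCHAL software.
CASES = 10
RAID_SETS_PER_CASE = 2   # two 8-port IDE RAID cards per case
DRIVES_PER_SET = 8
DRIVE_GB = 250           # decimal gigabytes


def raw_gb(cases=CASES):
    """Total drive capacity before RAID redundancy."""
    return cases * RAID_SETS_PER_CASE * DRIVES_PER_SET * DRIVE_GB


def usable_gb(cases=CASES):
    """Capacity left after RAID5 reserves one drive's worth per set."""
    return cases * RAID_SETS_PER_CASE * (DRIVES_PER_SET - 1) * DRIVE_GB


if __name__ == "__main__":
    print(raw_gb(1), usable_gb(1))  # one case
    print(raw_gb(), usable_gb())    # all ten cases
```

With these numbers, one case holds 4TB of raw disk, of which 3.5TB remains after RAID5 redundancy; the 40TB quoted for the whole system is the raw figure.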
The PC cases are 4U tall but, due to mechanical considerations, they are mounted in the rack with 1U of spacing between them. Each case is powered by a 450W power supply. The operational power requirement seems to be closer to 250W per case, with no overheating observed. Each case contains a separator wall with five fans, producing a forced air flow from the front of the case to the back.
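As a rough steady-state power budget, the two figures above can be combined as in the sketch below. It is an estimate only: the names are hypothetical, the 250W value is the approximate observation quoted above, and the host system and network switch are not included. It also says nothing about the power-up surge, which is not quantified here.

```python
# Steady-state power estimate from the figures quoted in the text.
PSU_RATED_W = 450   # rated supply per case
OBSERVED_W = 250    # approximate operating draw per case
CASES = 10


def headroom_fraction(rated_w=PSU_RATED_W, load_w=OBSERVED_W):
    """Fraction of the rated supply left unused at steady state."""
    return (rated_w - load_w) / rated_w


def total_draw_w(cases=CASES, load_w=OBSERVED_W):
    """Approximate draw of all storage cases together, in watts."""
    return cases * load_w


if __name__ == "__main__":
    print(f"{headroom_fraction():.0%} headroom per case")
    print(f"{total_draw_w()} W for {CASES} cases")
```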
The system at the top left is the host system, an 8 CPU SGI O300 server with an external 1TB Ciprico Fibre Channel disk system. A 24-port copper Gigabit Ethernet switch is also mounted in the back of the left rack. It provides two connections to each storage unit and four additional ports for client systems.
No configuration data or operating system is stored on any of the storage units. Upon power-up, each unit loads a Linux kernel from its CD-ROM and then loads a complete Linux OS from the host system via NFS. Since all units share the same identical copy of the software, maintenance of the whole system is simplified. Each set of 16 drives is configured as one logical drive and is accessed by the host systems via NFS. Point-to-point sustained data rates in excess of 20MB per second have been observed.
Picture: a 250GB disk mounted in a tray, ready to be used. The hot-plug connector in the back provides both the parallel IDE connection and the power required by the disk. The tray has a very simple mechanical lock to prevent accidental removal of the disk, and also provides a power LED and an activity LED on the front panel.
These drives were the largest available at the time the RASCHAL components were chosen, and they provided the best ratio of disk price to overhead price. Since they are low-power, 5200 RPM drives, they are also a very good match for this type of application. If the Maxtor 320GB drives have the same profile, using them would increase the capacity of each unit to 5.1TB. They might also require a higher rated power supply.
Picture: the fans that provide forced air cooling. From this angle, the five fans providing forced air flow are visible. The front of the case is the far end in this picture. The metal brace in the foreground provides extra stiffness to the case, and also provides mechanical support to the PCI cards.
The fans hosted in the middle wall can be individually removed, as in this picture. All the power and IDE cables have to be routed under this separating wall, and removing the fans provides a bit more space. The drives are arranged in three stacks of five, with the last one mounted vertically, to the left side of the others. Each case holds 16 IDE drives, for a total capacity of 4TB.
Picture: IDE RAID controller and connecting cables. In this view, the stiffener brace is removed, providing a better view of the two IDE RAID cards. The power supply, the CPU fan and the memory banks are also visible. A mix of flat and round IDE cables was used. The middle connector of each IDE cable is not used, since each drive needs its own connection. Installing these cables and the power cables is not easy. Space under the airflow dam is limited and, due to the location of the PCI slots, long cables are required. The new Serial ATA interface will greatly improve the cabling situation, but will not improve performance.
While the motherboard is dual CPU capable, only one Xeon CPU is used. This reduces the power consumption by a considerable amount, and should not affect performance much.
Picture: the whole box. This is a view of the whole PC, without the top cover. The front of the case is to the right. The CD-ROM is mounted on top of the IDE drive cages. The mounting bracket on the top right of the drive cage can host a slim floppy drive, which is not used in RASCHAL. Above the CD-ROM, there is a flat speaker (blue label) and the wires to the power and reset buttons.
The fan separator wall with the five fans is clearly visible in the middle of the case. The system does not seem to overheat, and five fans of 5W each seem to be overkill. A few of these could be removed and replaced with baffles, to preserve the integrity of the air dam. Most of the motherboard is covered by the IDE cables. The CPU fan is visible; the second CPU socket, not in use, is right below the first one. The only PCI cards are the two IDE RAID cards. Video and dual Gigabit Ethernet are integrated on the motherboard.
The power supply is in the top left of the picture. When some of the power supplies had to be replaced after a brownout, we discovered that this configuration is very demanding. The original 460W power supply works well, but 500W and 600W versions fail to power up the system because they do not deliver enough power on the 12V and 5V lines, the only ones used by the disks and fans. The original supply seems to be more tolerant of the initial power surge caused by disk spin-up.
The back panel of the case is a standard ATX breakout panel plus the two GigE connectors.

News:

After a short and unfortunate incident caused by a brownout, RASCHAL is fully functional again. Moving the drives in use to a spare unit and rebooting was all that was needed to recover function, in about 4-5 hours. Three failed power supplies were identified and replaced within a week, and all ten units are again available.

Storage performance proved to be reasonable, with sustained data rates in excess of 20MB/sec observed for a single host. The Gigabit Ethernet switch acts as a matrix switch with connections for multiple hosts, so multiple disk operations can be active at the same time, providing much higher system cross-section bandwidth.
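To put the 20MB/sec figure in perspective, the sketch below estimates how long it takes to stream one full 4TB unit at that rate, and the aggregate rate if several units are active at once. This is a back-of-the-envelope estimate, not a measurement: it assumes that every unit sustains the same rate independently and that neither the switch nor the hosts become the bottleneck. The names are illustrative only.

```python
# Rough throughput estimates from the figures quoted in the text.
UNIT_CAPACITY_GB = 4000  # 16 x 250GB drives per case (raw)
RATE_MB_S = 20           # observed single-host sustained rate (at least)
UNITS = 10


def hours_to_stream(capacity_gb=UNIT_CAPACITY_GB, rate_mb_s=RATE_MB_S):
    """Hours needed to read or write the given capacity at a fixed rate."""
    seconds = capacity_gb * 1000 / rate_mb_s
    return seconds / 3600


def aggregate_mb_s(active_units, rate_mb_s=RATE_MB_S):
    """Optimistic aggregate rate, assuming the units do not interfere."""
    return active_units * rate_mb_s


if __name__ == "__main__":
    print(f"{hours_to_stream():.1f} hours to stream one unit")
    print(f"{aggregate_mb_s(UNITS)} MB/s with all units active")
```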

RASCHAL seems to have a hard time dealing with heavy IO loads, especially when they are generated by multi-megabyte read and write requests. This seems to trigger a serious failure in the IRIX NFS v3 client.

July 18th 2003: A few RASCHAL units have been updated to Linux kernel version 2.6.0-test1.

Dec 28th 2005: RASCHAL is now running Linux kernel 2.6.3 from an additional system disk installed in each box. These changes have improved the NFS performance to 40MB per second and have much improved system stability. The 3ware RAID card BIOS has also been updated, which makes it possible to use the command line interface to manage the RAID units without having to cycle the power, even though the software is not supposed to work under Linux 2.6.
A very large number of the drives, close to 50%, have failed in the last two years and have been replaced. The exact cause of the very high failure rate is not known, but a number of factors might contribute. Construction work in and around the server room has taken place almost constantly, with numerous power failures, brownouts and power glitches. The drives themselves were from the first batch of 250GB drives available on the market. The original power supplies for the RASCHAL computers are marginal for this configuration, especially at power-up. Larger, newer or faster disks cannot be used in this configuration due to higher power requirements and software limitations.
This page, http://pat.jpl.nasa.gov/public/lucian/RASCHAL.html, is maintained by Lucian Plesea and was last modified Friday, 06-Jan-2006 10:17:43 PST
Last update: Tue Jun 10 10:41:03 PDT 2003

Lucian Plesea