  1. #1
    DF VIP Member
    cronus71's Avatar
    Join Date
    May 2001
    Location
    Indonesia
    Posts
    7,081
    Thanks
    603
Thanked: 746
    Karma Level
    1104

IBM Builds Biggest Data Drive Ever

    The system could enable detailed simulations of real-world phenomena—or store 24 billion MP3s.

    • Thursday, August 25, 2011
    • By Tom Simonite



    A data repository almost 10 times bigger than any made before is being built by researchers at IBM's Almaden, California, research lab. The 120 petabyte "drive"—that's 120 million gigabytes—is made up of 200,000 conventional hard disk drives working together. The giant data container is expected to store around one trillion files and should provide the space needed to allow more powerful simulations of complex systems, like those used to model weather and climate.
    A 120 petabyte drive could hold 24 billion typical five-megabyte MP3 files or comfortably swallow 60 copies of the biggest backup of the Web, the 150 billion pages that make up the Internet Archive's WayBack Machine.
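A quick sanity check on those figures (plain arithmetic using decimal petabytes and the article's five-megabyte MP3, not anything IBM published):

```python
# Back-of-the-envelope check of the capacity claims above.
PB = 10 ** 15   # decimal petabyte
MB = 10 ** 6    # decimal megabyte

capacity = 120 * PB             # the 120 PB system
mp3s = capacity // (5 * MB)     # typical five-megabyte MP3s
print(mp3s)                     # 24000000000 -> 24 billion files

# 60 copies of the Web's biggest backup implies roughly 2 PB per copy
print(capacity // 60 // PB)     # 2
```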
    The data storage group at IBM Almaden is developing the record-breaking storage system for an unnamed client that needs a new supercomputer for detailed simulations of real-world phenomena. However, the new technologies developed to build such a large repository could enable similar systems for more conventional commercial computing, says Bruce Hillsberg, director of storage research at IBM and leader of the project.
    "This 120 petabyte system is on the lunatic fringe now, but in a few years it may be that all cloud computing systems are like it," Hillsberg says. Just keeping track of the names, types, and other attributes of the files stored in the system will consume around two petabytes of its capacity.
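Dividing that two-petabyte metadata budget by the trillion files gives a rough per-file overhead (again simple arithmetic implied by the article's numbers, not a figure from IBM):

```python
# Rough per-file metadata overhead implied by the numbers above.
PB = 10 ** 15
metadata = 2 * PB          # capacity consumed by names, types, attributes
files = 10 ** 12           # around one trillion files
print(metadata // files)   # 2000 bytes, i.e. about 2 KB per file
```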




Steve Conway, a vice president of research with the analyst firm IDC who specializes in high-performance computing (HPC), says IBM's repository is significantly bigger than previous storage systems. "A 120-petabyte storage array would easily be the largest I've encountered," he says. The largest arrays available today are about 15 petabytes in size. Supercomputing problems that could benefit from more data storage include weather forecasts, seismic processing in the petroleum industry, and molecular studies of genomes or proteins, says Conway.
IBM's engineers developed a series of new hardware and software techniques to enable such a large hike in data-storage capacity. Finding a way to efficiently combine the thousands of hard drives that the system is built from was one challenge. As in most data centers, the drives sit in horizontal drawers stacked inside tall racks, but IBM's researchers had to make those drawers significantly wider than usual to fit more disks into a smaller area. The disks are cooled with circulating water rather than standard fans.
    The inevitable failures that occur regularly in such a large collection of disks present another major challenge, says Hillsberg. IBM uses the standard tactic of storing multiple copies of data on different disks, but it employs new refinements that allow a supercomputer to keep working at almost full speed even when a drive breaks down.


When a lone disk dies, the system pulls data from other drives and writes it to the disk's replacement slowly, so the supercomputer can continue working. If more failures occur among nearby drives, the rebuilding process speeds up to avoid the possibility that yet another failure occurs and wipes out some data permanently. Hillsberg says the result is a system that should not lose any data for a million years, without compromising performance.
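The article doesn't describe IBM's actual rebuild algorithm, but the adaptive idea can be sketched in a few lines. Everything here (function name, the rate values, the replica target) is hypothetical, chosen only to illustrate the throttle-then-ramp behavior:

```python
# Minimal sketch of adaptive rebuild throttling (illustrative, not IBM's code).
# Each data chunk aims for `target` live copies; rebuild effort rises as
# copies are lost, so a single failure barely dents supercomputer throughput.

def rebuild_rate(replicas_remaining: int, target: int = 3,
                 slow: float = 0.05, fast: float = 1.0) -> float:
    """Return the fraction of disk bandwidth to spend on rebuilding."""
    lost = target - replicas_remaining
    if lost <= 0:
        return 0.0            # fully replicated: nothing to rebuild
    if replicas_remaining <= 1:
        return fast           # down to the last copy: rebuild flat out
    return min(fast, slow * 2 ** (lost - 1))   # ramp up with each loss

print(rebuild_rate(3))  # 0.0  -- healthy
print(rebuild_rate(2))  # 0.05 -- one failure: gentle background rebuild
print(rebuild_rate(1))  # 1.0  -- urgent: all bandwidth to the rebuild
```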
The new system also benefits from a file system known as GPFS, developed at IBM Almaden to give supercomputers faster access to data. It spreads individual files across multiple disks so that many parts of a file can be read or written at the same time. GPFS also enables a large system to keep track of its many files without laboriously scanning through every one. Last month a team from IBM used GPFS to index 10 billion files in 43 minutes, effortlessly breaking the previous record of one billion files scanned in three hours.
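The striping idea GPFS relies on can be shown with a toy model (this is an illustration of round-robin striping in general, not GPFS code): a file is cut into fixed-size blocks dealt in turn across N disks, so N blocks can be transferred concurrently.

```python
# Toy round-robin striping: deal fixed-size blocks of a file across n disks.

def stripe(data: bytes, n_disks: int, block: int = 4):
    """Split data into blocks and deal them round-robin across disks."""
    disks = [bytearray() for _ in range(n_disks)]
    for i in range(0, len(data), block):
        disks[(i // block) % n_disks] += data[i:i + block]
    return disks

def unstripe(disks, block: int = 4) -> bytes:
    """Reassemble the file by reading one block from each disk in turn."""
    out, offsets, d = bytearray(), [0] * len(disks), 0
    while offsets[d] < len(disks[d]):
        out += disks[d][offsets[d]:offsets[d] + block]
        offsets[d] += block
        d = (d + 1) % len(disks)      # next block lives on the next disk
    return bytes(out)

data = b"abcdefghijklmnop"
assert unstripe(stripe(data, 4)) == data   # round-trips cleanly
```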
    Software improvements like those being developed for GPFS and disk recovery are crucial to enabling such giant data drives, says Hillsberg, because in order to be practical, they must become not only bigger, but also faster. Hard disks are not becoming faster or more reliable in proportion to the demands for more storage, so software must make up the difference.
    IDC's Conway agrees that faster access to larger data storage systems is becoming crucial to supercomputing—even though supercomputers are most often publicly compared on their processor speeds, as is the case with the global TOP500 list used to determine international bragging rights. Big drives are becoming important because simulations are getting larger and many problems are tackled using so-called iterative methods, where a simulation is run thousands of times and the results compared, says Conway. "Checkpointing," a technique in which a supercomputer saves snapshots of its work in case the job doesn't complete successfully, is also common. "These trends have produced a data explosion in the HPC community," says Conway.
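Checkpointing as Conway describes it can be sketched in miniature (a toy loop with made-up names, not an HPC checkpoint library): the job periodically snapshots its state to storage, so a crash costs only the work since the last snapshot.

```python
# Toy checkpointing loop: snapshot simulation state every `every` steps,
# and resume from the last snapshot if one exists.
import os
import pickle
import tempfile

CKPT = os.path.join(tempfile.mkdtemp(), "sim.ckpt")

def run(steps: int, state: int = 0, every: int = 100) -> int:
    start = 0
    if os.path.exists(CKPT):                  # resume after a failure
        with open(CKPT, "rb") as f:
            start, state = pickle.load(f)
    for step in range(start, steps):
        state += 1                            # stand-in for real compute
        if (step + 1) % every == 0:
            with open(CKPT, "wb") as f:       # snapshot survives a crash
                pickle.dump((step + 1, state), f)
    return state

print(run(250))   # 250; if the job dies mid-run, rerunning resumes at step 200
```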

    http://www.technologyreview.com/computing/38440/?p1=A3


  2. #2
    DF VIP Member Zippeyrude's Avatar
    Join Date
    Dec 2002
    Location
    UK
    Posts
    4,317
    Thanks
    238
Thanked: 792
    Karma Level
    535

Re: IBM Builds Biggest Data Drive Ever

    yawn, it will be normal and then small in years to come.




    Thanks to Zippeyrude

    raelmadrid (27th August 2011)  


