In our highly digitized world, the average person generates enough information to fill several CDs a year. Now consider how much data a large organization, such as a government agency, produces on a daily basis. The volume of emails, presentations, spreadsheets and other products multiplies at an exponential rate.
All of that information is stored, but if you ever had to go find a particular bit of data, how would you begin to sift through the meaningless zeroes and ones to get to the proverbial needle in a haystack?
The problem is not a new one, but it is becoming more critical as the amount of information being produced, collected and stored far exceeds the capacity to process and analyze it. Developments in data mining software to help analysts sort through the avalanche of information cannot keep pace with innovations in data storage devices that can accommodate thousands of gigabytes. In the intelligence sector and the defense weapons testing community in particular, the lack of analytical tools to search through and understand the sea of information is being felt acutely. Both communities sop up voluminous quantities of data daily and face similar challenges in searching for the diamonds in the rough.
Military testers, analysts and engineers encounter haystacks full of needles, but with current data mining methods, which require a lot of human intervention, correlations can be difficult to ascertain.
"The human is definitely the choke point," says Dr. James A. Wall, director of the computing and information technology division in the Texas Center for Applied Technology--the research arm of Texas A&M University's Texas Engineering Experiment Station.
"We can collect and store information at rates we never have. But if we can't take advantage of it, it's a limiting factor," he says.
Before building new weapons technologies, the Defense Department virtually constructs and tests concepts in simulations. In a recent test event for the Future Combat Systems, the Army's Operational Test Command at Fort Hood collected more than 23 terabytes of network data.
A terabyte is 1,000 gigabytes. Or put another way, a terabyte of the letter "A" typed consecutively in 12-point Courier font would form a chain long enough to circumnavigate the Earth's equator 63 times, says Wall.
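Wall's comparison can be roughly checked with a few lines of arithmetic. The sketch below rests on assumptions that are not stated in the article: that a terabyte means 10^12 one-byte characters, that 12-point Courier prints at the conventional 10 characters per inch, and that the equator measures about 40,075 kilometers. The function name and constants are illustrative only.

```python
# Rough check of the "63 times around the equator" comparison.
# Assumptions (not from the article): 1 TB = 10**12 one-byte characters,
# 12-point Courier at 10 characters per inch, equator of about 40,075 km.

TERABYTE_CHARS = 10**12      # decimal terabyte, one character per byte
METERS_PER_INCH = 0.0254     # exact by definition of the inch


def equator_laps(num_chars, chars_per_inch=10.0, equator_km=40075.0):
    """Return how many times a single line of typed characters wraps the equator."""
    inches = num_chars / chars_per_inch
    kilometers = inches * METERS_PER_INCH / 1000.0
    return kilometers / equator_km


if __name__ == "__main__":
    laps = equator_laps(TERABYTE_CHARS)
    print(f"A terabyte of typed characters wraps the equator about {laps:.1f} times")
```

Under those assumptions the line of text stretches about 2.5 million kilometers, which falls between 63 and 64 trips around the equator and is consistent with the figure Wall cites.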
When FCS--the Army's digitally connected fleet of combat systems--goes into full testing, it could generate up to 100 terabytes of data each month.
"That's a lot of...