Too much data, not enough information.
It turns out that much of the available data on consumers sits idle in electronic storage closets, never to be used again.
Now promoters of an emerging software field called “data mining” say they’ve found a solution: a way to turn otherwise idle information into useful nuggets of knowledge.
It is a solution a growing number of Los Angeles-area companies are incorporating in their daily operations.
First developed to help scientists get a handle on reams of experiment results, the programs have been adapted by third-party software firms for business applications.
Usama Fayyad helped create one of the earliest data mining applications while working as a researcher at Pasadena’s Jet Propulsion Laboratory. Astronomers used his program to identify common characteristics in thousands of pictures of star fields to help them differentiate stars from galaxies and other heavenly bodies.
“But we quickly realized that we scientists were not unique in having too much data,” Fayyad said. “With the proliferation of databases everywhere, now anyone can generate a huge database.”
The software works by making repeated passes through a collection of data, analyzing relationships between various categories or “fields” of data.
For example, the program may look to see how often shoe polish appears in the same purchase record as socks, but only when the shoe polish is black and the socks are not white athletic tube socks.
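The shoe polish example can be sketched in a few lines of Python. This is only an illustration of a conditional co-occurrence count, not the method any vendor in this article uses; the records, the field names ("product", "color", "style") and the helper functions are all hypothetical.

```python
# Hypothetical purchase records: each record lists the items on one receipt.
# The field names ("product", "color", "style") are assumptions for illustration.
purchases = [
    [{"product": "shoe polish", "color": "black"},
     {"product": "socks", "color": "navy", "style": "dress"}],
    [{"product": "shoe polish", "color": "black"},
     {"product": "socks", "color": "white", "style": "athletic tube"}],
    [{"product": "shoe polish", "color": "brown"},
     {"product": "socks", "color": "black", "style": "dress"}],
    [{"product": "shoe polish", "color": "black"},
     {"product": "bread"}],
]

def is_black_polish(item):
    """True for black shoe polish only."""
    return item.get("product") == "shoe polish" and item.get("color") == "black"

def is_other_socks(item):
    """True for socks of any kind except white athletic tube socks."""
    if item.get("product") != "socks":
        return False
    return not (item.get("color") == "white"
                and item.get("style") == "athletic tube")

def cooccurrence_rate(records, first, second):
    """Share of records with a `first` item that also contain a `second` item."""
    with_first = [r for r in records if any(first(i) for i in r)]
    if not with_first:
        return 0.0
    with_both = [r for r in with_first if any(second(i) for i in r)]
    return len(with_both) / len(with_first)

rate = cooccurrence_rate(purchases, is_black_polish, is_other_socks)
print(f"{rate:.2f}")
```

In this toy data, three receipts include black shoe polish and one of those also includes qualifying socks. A real program would repeat this kind of count across many combinations of fields.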
While humans are fairly good at making such correlations on their own with up to seven or eight variables, databases can contain records with thousands of fields. And that, Fayyad says, is far beyond the human brain’s capacity to correlate.
The Measurement Group Inc. in Culver City analyzes questionnaire results for large non-profit organizations, including health care provider groups and government agencies.
“Based on things like age, gender, geographic location and income level, we can cross-link (questionnaire respondents) and find out relations between certain things,” said Tay Kuo, the company’s information systems coordinator.
A typical use might be to see if the number and types of people using a government program justify its funding, Kuo said. If not, the Measurement Group will search the data for ways to better tailor the services.
So far, data mining has been a real boon in helping companies better understand customers and their buying and payment habits, said Paul Watkins, a professor of accounting at USC, where data mining techniques are taught in both the computer science department and the Marshall School of Business.
One frequently repeated anecdote concerns a grocery chain that used data mining on several months’ worth of customer receipts. What emerged was that new fathers who go shopping for diapers often end up buying beer during the same visit. That prompted the chain to move the two products to aisles closer to one another.
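A rough sketch of how such a pairing might surface from receipts: count how often each pair of products shares a basket, then compare that with what chance alone would predict (a ratio often called "lift"). The baskets below are invented for illustration and do not come from any grocery chain; the `lift` helper is likewise hypothetical.

```python
from collections import Counter
from itertools import combinations

# Invented receipts; each set holds the products bought in one visit.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"milk", "bread"},
    {"diapers", "bread"},
    {"beer", "chips"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))

def lift(a, b):
    """Observed rate of a and b together, divided by the rate expected
    if the two products were bought independently of each other."""
    n = len(baskets)
    p_a = item_counts[a] / n
    p_b = item_counts[b] / n
    if p_a == 0 or p_b == 0:
        return 0.0
    p_both = pair_counts[tuple(sorted((a, b)))] / n
    return p_both / (p_a * p_b)

diaper_beer_lift = lift("diapers", "beer")
print(f"{diaper_beer_lift:.2f}")
```

A lift above 1 suggests the two products turn up together more often than chance would explain. With only five made-up baskets the figure means little, but the same arithmetic applies to months of receipts.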
Compton-based Ralphs Grocery Co. uses data mining to discover which products customers buy at the same time, information that can be used to create more-effective sales promotions.
“We do an awful lot with scanner data and customer transaction data,” said Mark Orr, Ralphs vice president for marketing analysis. “We want to know what makes an effective promotion.” If a sales promotion isn’t properly targeted, “you’re shooting yourself in the foot,” Orr said, because you will either attract too few buyers or the wrong kinds of buyers, such as those who only buy the loss-leader products.
Another company using data mining and data warehousing (the storage of massive amounts of information in a usable format) is Ventura-based Patagonia, a mail-order, retail and wholesale clothing company.
“One of the things we’re having big problems with is having multiple business divisions, all operating on different computer systems,” said Shaun Mueller, a business analyst at Patagonia. Mueller is overseeing the company’s $300,000 effort to incorporate data mining into operations.
Each of Patagonia’s three divisions (wholesale, retail and mail-order) keeps records slightly differently, Mueller said, “so we have trouble pulling all this information together and finding out how we’re doing.”
In addition to its three U.S. systems, the company has three systems in its European operations and another three in its Asian operations, Mueller said.
Patagonia contracted with L.A.-based Vertex Systems Inc. to adapt the programs specifically to fit Patagonia’s needs.
Now Patagonia can look at which customers are most likely to respond to an ad in a given publication, Mueller said, and match the right kind of catalog to a specific type of buyer.
The basic mining software is produced by computer luminaries Oracle Corp., Silicon Graphics Inc., Sybase Inc. and others. Then, the data mining programs are customized to fit the needs of particular clients, which so far tend to be retail chains, utilities, financial institutions and others that handle millions or billions of transactions annually.
“We saw clients were accumulating a lot of information, but access to it by their management was restricted, or just not very effective,” said Vertex founder and CEO Ivan Nikkhoo. “They were asking us for ways to better manage their information, and that’s what this allows.”
The price for such projects, which range from a few hundred thousand dollars to more than $1 million, is beyond the reach of most small and medium-sized businesses, Nikkhoo said.
But Fayyad, who now works for a division of Microsoft Corp., says less-expensive products are coming.
Rather than creating a single data mining package that addresses the needs of any business, Fayyad says developers will focus on designing packages that address the needs of specific business segments.
Fayyad predicts that within a year or two, small businesses like restaurants and accounting firms will begin to benefit from affordable data mining packages designed specifically for their industries.