Knowledge@Wharton Marketing Research Article


Time for a Data Diet? Deciding What Customer Information to Keep -- and What to Toss

Published: March 18, 2009 in Knowledge@Wharton

Heartland Payment Systems, a credit card processor, may have had up to 100 million records exposed to malicious hackers. Payment processors CheckFree and RBS Worldpay, and employment site Monster.com have all reported data breaches in recent months, as have universities and government agencies. Experts at Wharton say that personal data is increasingly a liability for companies, and suggest that part of the solution may be minimizing the customer information these companies keep.

Indeed, according to Wharton marketing professors Eric Bradlow and Peter Fader, companies should deploy a technique called "data minimization." The concept: Keep the customer data a company needs for competitive advantage and purge the rest. "I think there is a fear and paranoia among companies that ... if they don't keep every little piece of information on a customer, they [can't function]," says Bradlow. "Companies continue to squirrel away data for a rainy day. We're not saying throw data away meaninglessly, but use what you need for forecasting and get rid of the rest."

Wharton Professors Eric Bradlow and Peter Fader on "The Data Dilemma"

The problem with the data hoarding approach is that companies can't use most of the information they keep, adds Fader. Meanwhile, they become data pack rats, chasing an illusory dream of one-to-one marketing, which he says "is a myth. The best thing to do is aggregate information so companies can predict something like, 'Among all people who bought five times or more, how many times are they likely to buy in the next year?'"

Fader and Bradlow discussed data minimization concepts when they presented papers at the recent Wharton Information Security Best Practices Conference. Their papers illustrate how companies can still predict customer behavior even if they minimize the customer data they keep.

However, data minimization isn't a panacea, argues Wharton operations and information management professor Eric Clemons. Companies in some industries -- such as insurance or credit cards -- may need to collect detailed customer data for competitive advantage. Meanwhile, companies that serve as pack rats for customer data are focusing on installing better defenses and procedures to protect information.

"The dominant argument of the day is that more data improves the accuracy of targeting," says Andrea Matwyshyn, a legal studies and business ethics professor at Wharton. "But there are additional risks associated with storing that information. More may not always be better."

Indeed, the cost of a data breach in 2008 was $202 per compromised record, up 2.5% from $197 per record in 2007, according to the Ponemon Institute, a Michigan group that researches and consults on privacy and information security issues. Ponemon's estimates are based on interviews with companies that have suffered breaches to customer records that include credit card numbers and, in some cases, personally identifiable information. Following a data breach, companies often must hire security consultants, engage legal counsel and offer credit monitoring services to affected customers. The Institute also found that companies will lose customers in the year following a data breach. For example, health care and financial firms lost 6.5% and 5.5% of customers, respectively, after such incidents.

Fader and Bradlow argue that companies are taking on undue risk to their reputations by hoarding data with little business benefit. While companies generally disclose what data they keep in little-read privacy statements, consumers can still be surprised when breaches occur. "Companies are actively collecting data without realizing the work involved," says Fader. "And given how companies are stretched thin, they can't manage the data correctly. Keeping detailed data is a blessing and a curse."

What to Keep?

The real challenge for companies is assessing what customer information they need to retain, says Fader, who adds that firms may be keeping an excessive amount of data because they can't pinpoint what they actually want. "Data minimization involves more than just the data. You can't minimize data until you know what to do with it. What data elements do you need to predict customer behavior?"

The inability to answer those tough questions, says Bradlow, could be one reason why companies default to storing as much data as possible -- not the best strategy when it's clear that many companies don't know what to do with this data even when they have it.

Fader and Bradlow recommend a simple approach to data minimization. First, companies should figure out what information they need to track consumer behavior. Then, they should aggregate that information -- including, for example, grocery bills, shopping frequencies and e-commerce sales for a retailer -- over a defined period such as two to four months. With that aggregated information, a company can create histograms -- graphical representations of aggregated data -- and throw away the original data.
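The aggregate-then-purge idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from Fader and Bradlow's papers: the transaction records, the customer IDs and the function name `purchase_frequency_histogram` are all hypothetical, and the "histogram" here is simply a count of how many customers made one, two, three or more purchases during a single aggregation window.

```python
from collections import Counter

# Hypothetical raw log for one aggregation window: (customer_id, purchase_amount).
transactions = [
    ("c1", 20.0), ("c1", 35.5),
    ("c2", 12.0),
    ("c3", 8.0), ("c3", 15.0), ("c3", 22.0),
    ("c4", 40.0),
]

def purchase_frequency_histogram(rows):
    """Return {number of purchases in the window: number of customers}."""
    # Step 1: count purchases per customer (still customer-level data).
    purchases_per_customer = Counter(customer_id for customer_id, _ in rows)
    # Step 2: count how many customers share each purchase count.
    # The result no longer contains any customer identifiers.
    return dict(sorted(Counter(purchases_per_customer.values()).items()))

histogram = purchase_frequency_histogram(transactions)

# Step 3: discard the customer-level rows and keep only the aggregate.
del transactions

print(histogram)  # {1: 2, 2: 1, 3: 1}
```

In this toy data, two customers bought once, one bought twice and one bought three times. Only those counts survive, so a breach of the retained data would expose no individual purchase history.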

Fader suggests that histograms offer accuracy rates close to individual targeting -- without the risk. Purging individual information lowers costs because companies don't have to secure information in transit, store and analyze data, and navigate a bevy of regulations across the globe. "Maintaining data warehousing is costly because the minute you keep data, you have to protect it," says Bradlow. "Most firms realize they can't do one-to-one targeting so why not only keep data that's relevant?"

According to Matwyshyn, the discussion by Fader and Bradlow was an eye opener for privacy and legal experts at the Wharton security conference. What remains to be seen is whether privacy experts, marketers and security professionals can agree that data minimization is an important step. "The key is that there is discussion on the issue," says Matwyshyn. "Marketers and privacy experts may not be as far apart as people presume."

Fader and Bradlow acknowledge that the argument for data minimization is only just beginning. For data minimization to become the norm, a company's management, privacy officers, legal counsel and marketing team will have to reach consensus on customer data collection. Legal and privacy experts are likely to support data minimization, while marketers will argue for keeping all the data they can collect.

Poaching Profitable Customers

In addition, data minimization practices will vary by industry. Clemons says that data can be a competitive advantage for many companies. For instance, Capital One used customer data to better segment its most profitable customers and poach similar ones from larger rivals. In this example, customer information led to varied pricing models -- such as interest rates that varied by customer credit ratings -- that maximized the profit from the top decile, or 10%, of customers. "Under the uniform pricing models of the mid 1990s, the top decile of customers produces 150 times more profit than average," says Clemons. "Capital One found a way to attract the best customers away from other issuers."

In a co-authored study, Clemons found that Capital One used what it calls an information-based strategy that allows the company to try varying approaches based on differences between itself and rivals. This strategy allowed Capital One to deploy a mass customization model. That model also generated returns, says Clemons. Capital One sustained double-digit returns on equity and double-digit increases in sales and profit growth due to its approach.

Clemons argues that storing customer data in bulk could lead to new pricing strategies. He agrees that one-to-one marketing is illusory at best, but a move to precision pricing -- or figuring out exactly what an individual customer will pay -- may warrant being a data pack rat. "I am not talking about pursuing some sort of illusory one-to-one marketing relationship with customers," says Clemons. "I'm talking about making the transition to precision pricing, which does indeed require understanding your customer."

Meanwhile, there's another conundrum companies face: Data purged today could be valuable tomorrow. "Ten years ago, one of my clients wanted to purge his database. It was an insurance company, but once you purge your database, you know no more about your customers than a new entrant," says Clemons. "That was okay under existing pricing models, but after any form of insurance deregulation, the information they were purging would have been enormously valuable."

Ultimately, the choice to follow data minimization practices boils down to one question: What will a company do with the data?

"If you are collecting data just for its own purposes, follow a minimization approach," suggests Matwyshyn. "If a company is doing something else with data, like selling it, then there's no incentive to minimize the risk."

Bradlow says data minimization has the potential to be one of the key security tools used by companies, even if it remains largely an academic concept today. "Security professionals will buy [data minimization]. Next, you have to convince the marketing world and begin giving talks outside the ivory tower. I think firms will start buying in."


Comments (1)

I definitely agree with you. You don't run analyses on personally identifiable information such as credit card or Social Security numbers, so why keep it? I can think of some cases where data is augmented over time, or where keeping a full address allows very precise geo-targeting, but most of the time your behavioral analytics tools don't need any PII.

I do take exception to your point about aggregating data as this will depend on the application. At Quantivo, we help retailers and online sites find value in their purchase and web analytics data - understanding market baskets, loyalty, click patterns, etc. The value is in the details of the individual transactions and would be lost with any aggregation. But your first point about minimizing the data is still correct: throw away any attributes you don't need or cannot act on.

Warm regards,

Albert Gouyet
Quantivo Corporation
www.quantivo.com