Thursday November 27, 2014

Mining Big Data

Boards need to adopt a predictive and prescriptive approach to data analysis.

Plus: How to use data to better understand shareholder behavior and open new markets, and six rules to help boards embrace big data.

Illustration by Gregory Copeland 


Conversations about “big data” usually engender some type of emotional reaction. IT departments, typically chief information officers (CIOs) and their leadership teams, often see big data as yet another problem to solve, another request to document and answer. They have visions of “out-of-control” business peers asking for more and more reports and queries. They also see demands for exponentially more storage at a time when IT budgets are essentially flat. The business community, of which boards are an integral part, is no better. These folks are drooling at all the analyses that can be done, hoping to find some magic insight displayed on an iPad that will allow them to easily meet their goals. “I will know it when I see it” emerges as a daily business requirement. Boards appear to have split personalities, worrying about missed competitive opportunities, but also about the business and privacy risks of all this data.

All of these perceptions are valid to a degree. They get in the way, however, of truly understanding the company-specific business risks and opportunities associated with big data—which will greatly hinder the development of a winning business strategy. To overcome these barriers, I have developed a road map for boards with four major routes:

  • Provide a common definition and language.
  • Provide a model to guide the pursuit of business value.
  • Share case studies to use as examples.
  • Discuss the critical success factors to monitor and oversee.

Defining Big Data
If asked to identify “their data,” most organizations and boards would point to the numerous databases secured in their data center. While a large amount of data is stored there, that alone is not big data. Big data includes those databases plus:

  • “Dark” data—internal data collected in the course of doing business, often in archives or generally not accessible.
  • Internal unstructured data (memos, reports, client notes, call center recordings, meeting videos, etc.).
  • “Machine” data internal or external to the company, usually generated by sensors (security cameras, machine maintenance monitoring, operational metrics, traffic monitoring, etc.).
  • Databases external to the company (suppliers, universities, government, industry, etc.).
  • Unstructured data external to the company (the Internet, customer tweets, social network postings, YouTube, various photo sites, etc.).

Gartner, an IT research and advisory company, suggests three other concepts to fully describe big data:

  • High volume: Due to the usability provided by mobile devices (smartphones, iPads, etc.) and sensors, the ability to create new data is convenient and simple. The connectivity of these devices with other people, machines, and networks allows this data to be easily shared and replicated—further increasing the volume.
  • High velocity: The same usability and connectivity turn weekly or daily analyses, assessments, and feedback into real-time activities. Think of how quickly we can forward e-mails, pictures, tweets, and more.
  • High variety: As can be seen by the definition above, big data comes in many shapes and sizes, especially external to the company.

Given this definition, what is the potential business value associated with big data? Where should boards look to make sure their organizations are using big data to their strategic advantage? According to work by Gartner, business value generally falls into three buckets:

  • Making better informed decisions.
  • Discovering hidden insights.
  • Digitizing business processes.

For example, according to Gartner, Agco completed a pattern analysis of product configuration options for farm machinery and real-time customer demand to determine the optimal base configurations for its machines. As a result, it reduced product variety by 61 percent and slashed days of inventory by 81 percent while still maintaining service levels.

Infinity Insurance discovered new insights that greatly improved its fraud operation, Gartner found. Infinity text-mined years of adjuster reports to look for key drivers of fraudulent claims. As a result, the company reduced fraud by 75 percent, and eliminated marketing to customers with a high likelihood of fraudulent claims.
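To make the text-mining idea concrete, here is a minimal sketch of how one might surface terms that appear disproportionately in fraudulent claims. It is not Infinity's or Gartner's method. The sample notes, the `REPORTS` list, and the `fraud_rate_by_term` function are all hypothetical, and a real project would use far more data and a proper statistical model.

```python
import re
from collections import Counter

# Hypothetical adjuster notes, each paired with whether the claim
# was later confirmed as fraudulent. Invented for illustration only.
REPORTS = [
    ("claimant reported theft two days after policy start", True),
    ("minor collision in parking lot, photos provided", False),
    ("theft of vehicle, no police report, policy start last week", True),
    ("hail damage to roof, photos provided by contractor", False),
    ("collision with deer, police report attached", False),
]


def tokenize(text):
    """Return the set of distinct lowercase words in a note."""
    return set(re.findall(r"[a-z]+", text.lower()))


def fraud_rate_by_term(reports, min_count=2):
    """For each term seen in at least min_count notes, return the
    share of those notes that belong to fraudulent claims."""
    seen = Counter()
    fraud = Counter()
    for text, is_fraud in reports:
        for term in tokenize(text):
            seen[term] += 1
            if is_fraud:
                fraud[term] += 1
    return {t: fraud[t] / seen[t] for t in seen if seen[t] >= min_count}


rates = fraud_rate_by_term(REPORTS)
# Terms most associated with fraud first, ties broken alphabetically.
top_terms = sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))[:3]
print(top_terms)
```

In this toy data, words such as "theft" and "policy" show up only in the fraudulent notes, which is the kind of signal an analyst would then investigate rather than accept at face value.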

Another example cited by Gartner is McDonald’s. Its bakery operation used photo-analyses of 1,000 buns per minute to monitor and adjust color, shape, and seed distribution, eliminating thousands of pounds of waste while raising quality and lowering cost. This essentially eliminated an existing business process.

To achieve this business value, each organization needs to make a “mind/culture shift” regarding analyses. Most organizations are mired in a “what happened” approach to analytics. With this mind-set, big data can create an exponential increase in reports, costs, and spreadsheets, but little improvement in business value. To truly make better informed decisions, discover hidden insights, and automate business processes, organizations need to adopt and operationalize a predictive and prescriptive approach to analysis focusing on “what will happen” and “how can we make it happen.”
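The difference between the three mind-sets can be illustrated with a small sketch. The quarterly figures, the straight-line trend, and the function names below are assumptions chosen for simplicity. Real predictive work would draw on many more variables and a more careful model.

```python
# Hypothetical quarterly sales figures, invented for illustration.
QUARTERLY_SALES = [100.0, 110.0, 120.0, 130.0]


def describe(sales):
    """'What happened': growth from the first to the last period."""
    return (sales[-1] - sales[0]) / sales[0]


def fit_trend(sales):
    """Ordinary least-squares line through (0, y0), (1, y1), ..."""
    n = len(sales)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(sales) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, sales)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next(sales):
    """'What will happen': extend the fitted trend by one period."""
    slope, intercept = fit_trend(sales)
    return intercept + slope * len(sales)


def required_lift(sales, target):
    """'How can we make it happen': the gap between a target and
    the forecast, which the business must close through action."""
    return target - predict_next(sales)


print(describe(QUARTERLY_SALES))
print(predict_next(QUARTERLY_SALES))
print(required_lift(QUARTERLY_SALES, target=150.0))
```

The first function only reports on the past. The second and third shift the conversation to a forecast and to the size of the gap that a specific initiative would need to close.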

Possibilities Revealed
Given this common definition of big data and how to achieve business value, what have others done so far? What are some additional case studies of business success using big data? What other possibilities exist? Here are other examples cited by Gartner:

  • The University at Buffalo, part of the State University of New York, is analyzing more than 2,000 genetic/environmental factors, and combining these with data that includes medical records, lab results, patient surveys, and U.S. National Institutes of Health genomic data to help find a drug to cure multiple sclerosis.
  • Companies are combining and analyzing customer transaction history, call center records, web logs, and social media to get a 360-degree view of customer interests and activity.
  • Fraud involving credit cards opened with stolen or fictitious identities is mitigated by analyzing patterns of behavior (payment activity, raising credit limits, common addresses, etc.) across thousands of credit cards.
  • New York City uses big data to correlate unsafe building calls with socioeconomic data and arson statistics to improve housing inspections.
  • A government agency monitors hospital room visits for food poisoning to detect tainted food injected into the national food supply chain.

Other possibilities include:

  • In the financial services industry, using big data to assess the risks associated with “unbanked” consumers, and designing products that meet the needs of this segment.
  • In the retail and consumer products industry, using social networks and Twitter plus text and image analytics to better predict or detect fashion trends.
  • In the oil and gas industry, using image, pattern-finding, and real-time analytics to more efficiently leverage seismic and exploratory machinery data.
  • In healthcare, combining patient data with clinical trial data, social network discussions, doctors’ notes, and Internet-connected monitoring equipment to improve the speed and accuracy of diagnoses.
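One of the behavior patterns mentioned in the credit card example, many cards sharing a common address, lends itself to a short sketch. The card records, the `normalize` rule, and the threshold of three are hypothetical. An actual fraud system would combine many such signals and use much more robust address matching.

```python
from collections import defaultdict

# Hypothetical (card id, mailing address) pairs, invented for illustration.
APPLICATIONS = [
    ("card-1", "12 Elm St"),
    ("card-2", "12 elm st."),
    ("card-3", "400 Oak Ave"),
    ("card-4", "12  Elm St"),
    ("card-5", "9 Pine Rd"),
]


def normalize(address):
    """Crude normalization: lowercase, drop periods, collapse spaces."""
    return " ".join(address.lower().replace(".", "").split())


def shared_addresses(applications, threshold=3):
    """Return addresses used by at least `threshold` distinct cards."""
    cards_by_address = defaultdict(set)
    for card_id, address in applications:
        cards_by_address[normalize(address)].add(card_id)
    return {
        address: sorted(cards)
        for address, cards in cards_by_address.items()
        if len(cards) >= threshold
    }


flagged = shared_addresses(APPLICATIONS)
print(flagged)
```

A flagged address is only a lead for review. Apartment buildings and family households produce the same pattern for legitimate reasons, which is why analysts look at several behaviors together.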

The possibilities and opportunities for market-changing business value are endless. But what do boards need to monitor to ensure their organizations will be successful pursuing predictive and prescriptive analytics?

Better Questions
As with nearly every technology innovation or capability, the keys to success in the use of big data have little to do with the type of technology used. First, there needs to be a change in the analytic process. Rather than asking questions such as “How much did our business grow in the past year?” a better, big data question would be “How can we increase customer shopping cart value by 20 percent and loyalty by 30 percent by better understanding customer interests and behavior, and considering a range of economic forecasts and competitor moves?”

The new types of questions generally have these characteristics:

  • Specific but open-ended.
  • Relates to a business process and the achievement of a strategic goal.
  • Focuses on optimizing or innovating, not informing.
  • Considers change relative to other indicators or processes.
  • Leverages and integrates internal and external data.
  • Forward-looking.
  • More about differentiation than just comparison.
  • Considers various scenarios.
  • Actionable: More about “do it” than “prove it.”

Second, there needs to be a major culture change in IT, and often in the business divisions as well. As with any major transformation, the board should ask for and oversee a formal change management plan and process with clear deliverables and outcomes.

Two key characteristics of such a plan include an increase in innovation and trust. Innovation is often developed through a new set of behavioral norms, rewards-based collaboration, and expected business improvements. Trust is often built through a series of smaller, successful big data initiatives based on internal data only. These are gradually expanded as the desired insight is achieved. Even if innovation and trust are improved, one of the most common corporate culture barriers is the movement to more data-based decision making—a discipline practiced inconsistently in many companies.

Next, bad or conflicting data need to be eliminated. Most enterprises still lack a comprehensive master data management plan and process. Thus, there is limited confidence in the data’s origin, meaning, or accuracy. Gartner research shows that poor data quality is a primary reason 40 percent of all business initiatives fail to achieve their targeted business benefits.

In addition, data quality affects overall labor productivity by as much as 20 percent. Mergers and reorganizations only exacerbate this problem. Consequently, more time is spent in data reconciliation and validation than in the mining of customer and market insight.

No one is blameless in addressing this critical success factor. The business community needs to take its role as data steward very seriously—in fact, to reward it as one of its core competencies. This means making sure data is accurate, well defined, secure, and single sourced. It does not mean “owned” or “hoarded” so that no one else in the organization can access or analyze it.
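A data steward's first practical task is often simply finding where two systems disagree. The sketch below assumes two hypothetical sources, `CRM` and `BILLING`, keyed by a shared customer ID. The records and field names are invented for illustration.

```python
# Two hypothetical systems holding overlapping customer records.
CRM = {
    "C001": {"email": "ann@example.com", "zip": "14201"},
    "C002": {"email": "bob@example.com", "zip": "10001"},
    "C003": {"email": "cy@example.com", "zip": "60601"},
}
BILLING = {
    "C001": {"email": "ann@example.com", "zip": "14202"},
    "C002": {"email": "bob@example.com", "zip": "10001"},
    "C004": {"email": "di@example.com", "zip": "94105"},
}


def find_conflicts(source_a, source_b):
    """List (customer, field, value_a, value_b) wherever the two
    sources hold the same customer and field but different values."""
    conflicts = []
    for customer in sorted(source_a.keys() & source_b.keys()):
        record_a, record_b = source_a[customer], source_b[customer]
        for field in sorted(record_a.keys() & record_b.keys()):
            if record_a[field] != record_b[field]:
                conflicts.append(
                    (customer, field, record_a[field], record_b[field])
                )
    return conflicts


def unmatched(source_a, source_b):
    """Customers that appear in only one of the two sources."""
    return sorted(source_a.keys() ^ source_b.keys())


print(find_conflicts(CRM, BILLING))
print(unmatched(CRM, BILLING))
```

Each conflict or unmatched record is a question about which system should be the single source, and that decision belongs to the business as much as to IT.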

Similarly, the IT folks need to view their role as more than just “storage people that do what they are told.” They know firsthand what and where the data quality problems are, and should be expected to propose solutions to them. They should also be as concerned about “single sourcing” as anyone else in the organization. Finally, they should initiate data modeling and management efforts that are completed in weeks and months, not years.

Finally, and perhaps most importantly, there needs to be a major upgrade in skills. Rather than the traditional reporting and spreadsheet skills, big data requires people with advanced analytical skills who understand data and know how to mine it, what to look for and pursue, what models need to be developed, and what information sources are required to answer the new set of questions. To be truly effective, these skills need to be applied in the context of business acumen. In other words, big data analysts will not rest until they have found a specific piece of insight that will improve some business result.

Ideal Data Cruncher
To respond to this, many organizations are hiring data scientists. In addition to the core vocational skills highlighted (data management, analytics modeling, business analysis), the best data scientists have a number of soft skills:

  • Communication, both up and down the data supply chain. Those who excel here are great at persuasion, expectation management, and tailoring findings to their audience.
  • Collaboration. To do their job well, data scientists require the time and attention of both the technical and business community. Thus, they need to be perceived as partners and valued advisors who are sensitive to time pressures and competing priorities.
  • Leadership. Due to the complex and somewhat open-ended nature of their work, data scientists are often called upon to lead teams of skilled professionals toward some common purpose or goal.
  • Creativity. By its very nature, a data scientist’s work involves innovation-oriented analyses with little or no clear path. In addition, data scientists must be creative in sourcing data, testing various models, and employing a range of analytic techniques.
  • Discipline. While creativity is critical, so are a scientific approach and method. This will ensure the validity of the insight and conclusions. But while a proven methodology is necessary, results perfection is not—especially in today’s fast-paced business world. A great data scientist understands and actively manages the difference.
  • Passion. This may seem like a strange skill for a typically introverted statistician. When hiring for a data scientist position, however, organizations should look for someone with an insatiable analytic curiosity, who loves to solve seemingly insurmountable problems, and who is almost obsessed with finding unique ways to accelerate business results.

Big data can have a dramatic impact on the success of any enterprise, or it can become an under-contributing major expense. But none of the keys to success have anything to do with technology. Boards should actively engage with and oversee senior management to make sure that proven practices are followed and followed well, and that measurable business outcomes are achieved.

Steve Weber is a retired executive partner of Gartner Inc., and a former CIO of two financial services organizations.
