INTRODUCTION
Games are typically played by two or more participants. Every game has defined
rules and boundaries. For every game there is a method of keeping score.
Games may be short-term, one-time
processes. Sports contests are typical examples of this type of game. In
this type of contest the participants focus on the particular outcome of a
single contest.
This article focuses on a longer-term game…
one that is repeated indefinitely. The goal of the long-term game is to
continue to play and continue to score with increasing frequency and value.
Most businesses engage in this type of strategy.
Business can be viewed as a contest where we
keep score with money. We develop rules for making decisions that are used
to attract new customers, retain existing customers, reduce expenses, and
minimize risks. The more effective we are with making these decisions, the
more we score.
In the context of customer relationship
management, small businesses engage in a relatively simple contest. Many
times, they have the luxury of knowing their customers personally. They can
adapt their strategies and decision making based on personal feedback with
a high degree of accuracy. These types of businesses adapt to the variety
of individual games played between the business and its customer base. Each
experience is played with the intention of winning. Most times, these games
are played to create a mutually beneficial scenario.
Alternatively, large businesses must shift
from this mindset. They do not have the luxury of knowing each of their
customers' needs on an individual basis. They must make decisions based on
group behavior. The outcome of any one interaction is relatively
unimportant. The focus shifts to group behavior… strategies that generate
long term success based on group behavior. We shift from a position of
playing the game to the position of sponsoring the game.
Most of us have gambled at one point or
another. Whether at a charity fundraiser, purchasing a lottery ticket, or a
trip to a casino, the games have many consistent characteristics. We pay a
price for the opportunity to receive something more than we risk. As
individuals, we generally don’t have accurate perceptions of the risk/reward
ratios involved. We are willing to gamble a relatively small amount in
hopes of becoming a big winner.
The sponsor of the game, however, fully
understands the risks and rewards. The sponsor of the game generally
attempts to build an interaction with a particular public appeal. They are
fully aware that some individuals will walk away winners. In fact, they
need those winners to maintain the game. But they also know that by playing
the game consistently, over a large number of occurrences, the
probabilities guarantee that they are the true winners.
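The sponsor's advantage can be illustrated with a short simulation. The
sketch below assumes a hypothetical even-money game, not any real casino
game: the player's win probability of 0.48, the stake, the seed, and the
function names are all illustrative assumptions. Individual players do win,
but the sponsor's average take per play tends toward its expected value as
the number of plays grows.

```python
import random


def sponsor_edge(p_player_wins: float, stake: float = 1.0) -> float:
    """Expected sponsor profit per play in an even-money game."""
    # The sponsor collects the stake when the player loses and pays it otherwise.
    return stake * (1.0 - p_player_wins) - stake * p_player_wins


def simulate_sponsor(plays: int, p_player_wins: float, stake: float = 1.0,
                     seed: int = 0) -> float:
    """Average sponsor profit per play over a number of simulated plays."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(plays):
        player_wins = rng.random() < p_player_wins
        total += -stake if player_wins else stake
    return total / plays


p = 0.48  # assumed player win probability (hypothetical)
expected = sponsor_edge(p)
# Average take per play for a short, medium, and long run of the game.
averages = {n: simulate_sponsor(n, p) for n in (10, 1_000, 100_000)}
```

With only a handful of plays the average can land well away from the
expected edge, in either direction. Over many plays it settles close to it,
which is the sense in which volume favors the sponsor.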
For large businesses, the game is only slightly
modified. The difference is that we are attempting to model human behavior
that is highly inconsistent, so the game is not played with fixed
probabilities. Because these businesses cannot accurately analyze the
risk/reward structure of each decision and business relationship, they must
develop a strategy where the organization makes decisions that guarantee
long-term success.
Some customers receive higher than
anticipated value. Others may not receive the full value expected. The
customer gambles that they can negotiate an arrangement that will provide
them with a product or service at a fair value. But the probability of
long-term success is still with the sponsor of the game -- the business.
GAME CREATION
Imagine sitting in a high-stakes poker game. In this game you are allowed to
see your cards, and the cards of every other player before you decide
whether to bet. For a relatively minimal cost, you are allowed to sit in
and simply observe. When a situation that is beneficial to you develops,
you then exercise the privilege of participating.
This is the environment most large businesses
enjoy. If they accurately evaluate their environment and have the
discipline to only participate in probabilistically correct decision making,
they are virtually guaranteed to be a winner.
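The discipline of sitting out unless the odds are favorable can be written
as a simple decision rule. This is a minimal sketch under invented numbers:
`expected_value` and `should_participate` are hypothetical helper names, and
the probabilities, gains, and losses are assumptions chosen for illustration.

```python
def expected_value(p_win: float, gain: float, loss: float) -> float:
    """Expected payoff of one opportunity: win `gain` or lose `loss`."""
    return p_win * gain - (1.0 - p_win) * loss


def should_participate(p_win: float, gain: float, loss: float,
                       min_edge: float = 0.0) -> bool:
    """Participate only when the expected payoff exceeds the required edge."""
    return expected_value(p_win, gain, loss) > min_edge


# Hypothetical opportunities: (probability of winning, gain, loss)
opportunities = [(0.30, 100.0, 50.0), (0.60, 100.0, 50.0), (0.45, 80.0, 80.0)]
# Keep only the opportunities whose expected payoff is positive.
chosen = [o for o in opportunities if should_participate(*o)]
```

Under these assumed numbers only the second opportunity is played. The other
two have a negative expected payoff, so the rule says to keep watching.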
The experts at playing these games have an
established set of decision rules for when to sit and watch, when to
participate, and how to play when they do participate.
The successful large business executive
creates environments that guarantee success for their organization. These
contests range from sales, to customer retention, to loss prevention and
fraud detection. All decision making is geared to selection of
opportunities to increase the score or reduce the risk of loss.
Just as casinos have developed sophisticated
games of chance to entertain their customers while guaranteeing their
success, Data Mining and CRM have developed sophisticated techniques of data
analysis in the business environment. And as with their gaming
counterparts, the business implementation of advanced gaming technology
requires an understanding of the characteristics of the tools being employed
and advanced skills in decision making.
Data Mining and CRM are the advanced
technologies of the skilled business decision maker. It is no longer
sufficient to simply review a report. In the development of advanced
technology solutions in the business environment, it is necessary to
increase the precision of the tactics we employ.
We can use a variety of tools to enhance our
skills. Realistically, we use these tools to improve our decision making
while playing the game. Our intent is to improve our position in order to
achieve a higher score.
KEEPING SCORE
In the world of advanced technology,
performance is a subjective matter. That means we must take the time to
define it on our terms.
At the inception of the project, it is
important to fully and completely define the metrics by which the success of
a project will be evaluated. The evaluation criteria should include the
realistic constraints to be expected in the delivery environment, as well as
the operational metrics of performance.
The key is to develop a mathematical formula
that will make a significant contribution in live decision making. By being
highly precise in defining our decision rules, and by applying them
consistently, we are able to adapt the rules effectively as we gain
additional experience.
Failure to define our performance criteria completely and accurately
often leads to the development of
good solutions to the wrong contest.
It is important to keep in mind that many of
the advanced technologies employed in the data mining and CRM fields attempt
to optimize performance based on the criteria utilized.
THE TOOLS
There are many alternatives available, from traditional statistics, to various
qualitative tools, to the advanced technologies employed in the data mining
and CRM arenas. One of the keys to success is selecting the correct
technology for the situation at hand.
When we strip away all of the hype and all of the
mystique surrounding the advanced technologies, we are left with an array of
tools. These tools perform a very simple function. They use a database of
historical experience to build mathematical models intended to assist in
future decision making.
Traditional statistical analysis is often of
limited value. It is not that these tools are somehow flawed. Rather, it
is that they are overly simplistic and, in many cases, inappropriate for the
task of modeling human behavior.
Traditional statistical techniques are overly
simplistic as they are suitable for only the most basic support of our
decision making. They typically assume that the interactions in our
decision variables are independent of each other, when in fact, we are
bombarded with multiple inputs that are highly interrelated.
Additionally, these simple modeling
techniques generally attempt to build linear relationships between the
inputs and the desired output. It is often the case that the basic
recognition of the non-linear aspects of a solution space will generate
improved decision making.
Traditional statistical analysis is often an
inappropriate choice because we are attempting to model human behavior.
Human behavior is typically not normally distributed; rarely has a stable
mean and standard deviation; and never has inputs into a model that cause a
particular type of behavior -- conditions that are necessary for the correct
application of traditional statistical tools.
The advanced modeling tools used in data
mining are not ‘better’ tools. They are simply better suited to modeling
the realities of human behavior.
The techniques employed in data mining are
often criticized as less rigorous and more complex than traditional
statistical analysis tools. Both of these criticisms should be viewed
realistically.
These techniques are less rigorous from the
perspective of not offering ‘right’ answers. However, this is not a
deficiency of the tools. Rather, it is a reality of modeling human behavior
that is inconsistent and constantly changing.
The tools associated with data mining are
generally more complex than traditional statistics. The mathematical
formulas derived are generally not ‘simple’. Again, this should not be
viewed as undesirable. Our goal is to achieve more sophisticated, and more
accurate decision making in a highly complex environment.
THE ENVIRONMENT
One of the pitfalls often encountered in
data mining and CRM project management lies in a failure to recognize the
limitations of the technologies employed due to the type of environment in
which these technologies are expected to perform.
The environment that most business decision
makers function in is not precise. In most cases where data mining and CRM
are employed, we are modeling human behavior, not physical systems.
Human decision making is subject to inconsistencies, both between and within
observations. This means that, given the same set of factors
influencing our decisions, the answers may be very different. And these
differences in responses can be expected, not only from different
individuals, but from the same individual at different points in time.
The implication of the inconsistency of human
behavior is that, at best, we hope to identify a set of characteristics that
allow us to expect a particular type of response at a probabilistically
reliable rate. Further, a particular behavior pattern
can only be expected in a group of individuals displaying a common set of
characteristics. We cannot expect to predict the performance of any one
individual in other than a probabilistic fashion.
We are often tempted to look for highly
precise answers… to expect a solution to be right or wrong. We cannot win
on every play. Human decision making is not precise. Our training in math
and the physical sciences does not apply to anticipating human behavior.
Our environments cannot be expected to meet our objectives in black and
white terms.
Recognizing the limitations we are faced with
in human behavior modeling is a first step in developing enhanced decision
support mechanisms.
THE CONTINUUM – WHERE DO WE WIN AND WHERE DO WE LOSE?
Traditional techniques often set out to
come up with most likely outcomes: ways of most accurately describing group
behavior in aggregate. In doing so, annoying discrepancies in the group
behavior are often assumed away, or discarded completely. Observations
referred to as ‘outliers’, those more than three standard deviations
from the mean, are typical examples.
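The three-standard-deviation convention mentioned above can be sketched in a
few lines. The function name `split_outliers` and the purchase amounts are
hypothetical, and the sample standard deviation is an assumed choice. Rather
than discarding the extreme observations, this sketch returns them as their
own group so they can be examined.

```python
from statistics import mean, stdev


def split_outliers(values, threshold=3.0):
    """Separate observations more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation; needs at least two values
    typical, extremes = [], []
    for v in values:
        # Treat every point as typical when there is no spread at all.
        z = (v - mu) / sigma if sigma > 0 else 0.0
        (extremes if abs(z) > threshold else typical).append(v)
    return typical, extremes


# Hypothetical purchase amounts: twenty ordinary customers and one very large one.
amounts = [10.0] * 20 + [100.0]
typical, extremes = split_outliers(amounts)
```

Here the single large purchase is the only observation placed in `extremes`.
Note that the rule needs a reasonably large sample: in a very small data set
no single point can sit three standard deviations from the mean.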
The astute business analyst, as with the
astute gambler, will recognize that while many situations may appear to be
similar in value, those at the extremes tend to have the most impact, whether
positive or negative. It is in accurately and reliably identifying the
extremes in the continuum that we can make the most impact.
In most cases, we are focused on one tail or
the other, such as fraud detection or credit screening. In some problems,
we may benefit by identifying occurrences from both tails, such as in
response modeling where it is possible to increase sales and reduce expenses
simultaneously.
BUSINESS OPPORTUNITIES
What types of considerations should we
use in setting performance criteria? Typically, we use three levels of
criteria.
Ultimately, we want to know the impact of our
decision models on our business performance. This is the ultimate set of
metrics. Our goal may be to increase sales revenue, reduce expenses,
increase net profit or reduce bad loans. Whatever we decide our priorities
are, these metrics become our touchstone for all future decisions. It
doesn’t matter how well our prospective models perform on a lift chart, or
how much we reduce our error metric. If we don’t meet the business criteria,
complete with the realities of the constraint system in which we will
operate, we will not have a winning model.
ANALYTIC OPPORTUNITIES
Our business goals are often implemented
by using analytic surrogates. Can I increase my response rate? Can I
reduce my false positives in my credit scoring model?
These analytic targets can be misleading. It
must be remembered that they are, at best, surrogates for our true business
metrics.
TECHNICAL OPPORTUNITIES
Technical opportunities are what we see
sitting at the table in front of the software during the model development
process. The lift chart looks good. The r-squared continues to improve.
This is a good point to emphasize one of the
key questions any model architect should be asking on a regular basis. So
what!?! Remember, this is not an academic problem. This is real life --
which generally involves real dollars.
Our technical enhancements may, or may not,
directly translate into additional business benefits. The only way to know
for sure is to periodically test our developing models using our true
business metrics.
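The gap between a technical metric and a business metric can be shown with a
small calculation. The sketch below uses invented campaign figures: the
response rates, margin, contact cost, and the helper names `lift` and
`campaign_profit` are all assumptions for illustration, not figures from any
real campaign.

```python
def lift(response_rate_targeted: float, response_rate_overall: float) -> float:
    """Lift: how much better the targeted group responds than the population."""
    return response_rate_targeted / response_rate_overall


def campaign_profit(contacted: int, response_rate: float,
                    margin_per_response: float, cost_per_contact: float) -> float:
    """Net profit of contacting `contacted` people at the given response rate."""
    revenue = contacted * response_rate * margin_per_response
    return revenue - contacted * cost_per_contact


# Hypothetical campaign: the model's top segment responds at 3% versus 1% overall.
model_lift = lift(0.03, 0.01)
profit = campaign_profit(contacted=10_000, response_rate=0.03,
                         margin_per_response=20.0, cost_per_contact=1.0)
```

Under these assumed numbers the model triples the response rate, yet the
campaign loses money because each contact costs more than the margin it is
expected to return. A good lift chart answers the technical question, not
the business one.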
BASELINE PERFORMANCE
It should be apparent to any analyst
developing data mining and CRM models that, in dealing with human behavior
modeling, there are no right answers -- there is no final solution upon
which improvements cannot be made.
Instead, we hope to improve on what has been
done in the past. To that end, we have to have measured our existing
efforts using the same quantitative performance metrics we plan to use in
the future.
INCREMENTAL PERFORMANCE ENHANCEMENT
Our baseline gives us a point of
reference. How is this model performing? Is it significantly better/worse
than the techniques we have been employing? What level of improvement is
significant? At what point am I willing to terminate this development
effort and field a new model?
The frustrating reality of modeling human
behavior is that we are always putting ourselves at risk. We must determine
how much, or how little to devote to our development efforts.
There is no way to determine, in advance,
what the payoff is going to be. If we knew that answer, we’d have enough
information not to even need the development effort.
We do know that human behaviors change over
time. Sometimes gradually -- and sometimes in very dynamic shifts. And
with those changes, our existing decision models must evolve to meet new
challenges as well.
What is often overlooked in data mining and
CRM is that these are not projects. They are not something to be initiated
and concluded. They are a dynamic game that continues over time. There is no
one ‘right’ answer. Our decision making and our model development are
constantly evolving. It is a goal-directed process, and must shift with
changes in corporate priority, and with experience. It must evolve to
maintain value in helping us evaluate the complex realities in which we
operate.
ABOUT THE AUTHOR
Thomas A. "Tony" Rathburn is a senior-level consultant with The Modeling
Agency (TMA). Tony is a
data mining expert with industry strengths in financial forecasting,
time-series modeling, stock selection, insurance and banking applications,
customer behavior modeling and market segmentation. Prior to working with
TMA, Tony was responsible for development, marketing and delivery of a
seven-course curriculum in the business utilization of neural network
technology for NeuralWare, Incorporated, a neural network development tools
company based in Pittsburgh. He was responsible for all aspects of contract
data analysis projects for clients. Tony teaches The Modeling Agency’s
popular “Data Mining: Level I” course and is a modeling consultant for
Unica’s Affinium Model Predict Express program.
All Rights Reserved by
Thomas A. "Tony" Rathburn and The Modeling Agency.
Copyright © 2007
During the dot-com craze, it seemed like Web analytics vendors were
popping up like weeds. Many of the leading BI vendors at the time
quickly moved to jump on the Web analytics bandwagon as well by
introducing products designed to support everything from
click-stream analysis and online campaign optimization to Web site
visitor segmentation and conversion analysis.
During this time, we also saw some of the first analytic application
service providers (ASPs) come into existence, offering hosted tools
and services for analyzing and optimizing customer Web site
interactions. As a result, the term personalization became a hot
buzzword among BI, Web analytics, and other e-commerce vendors.
But the sudden crash of the dot-com business model and subsequent
downturn in the economy resulted in a shakeout of Web analytics
vendors, with many disappearing altogether and others getting
acquired.
That was then, and times have changed. Today, online shopping is firmly
established in the minds of consumers. In fact, with more people
making online purchases than ever before, digital marketing
reportedly grew 24% in 2005. In addition, the number of consumers
researching products online and then making their purchases through
other channels is also growing. The result: companies now seek not
only to be able to measure and optimize (i.e., "personalize") online
customer interactions, but to be able to do so across multiple
channels.
Increasingly, Web analytics are being seen as the platform for
enabling companies to effect multichannel personalization.
In an effort to capitalize on the resurgence of interest in Web
analytics, a group of marketing firms, analytics vendors, hardware
companies, and others recently joined together to form the Web
Analytics Association (WAA). Current WAA members include ClickTracks,
Coremetrics, Harvest Solutions, HP, IBM, Nedstat, Omniture, Site
Intelligence, Visual Sciences, WebSideStory, WebTrends, and ZAAZ.
The WAA is a nonprofit organization that seeks to promote the use of
Web analytics by tackling a number of issues currently standing in
the way of greater use of the technology. The market for Web
analytics products and services is definitely growing, but a lack of
standard measurement methodologies, metrics, and terminology is
confusing potential users. The establishment of best practices is
spotty, too. The WAA seeks to lend its technical expertise and
advice to the industry in these matters, as well as to help educate
and refine the market by offering education and certification
programs. The WAA also plans to promote a better understanding of
regional issues facing the Web analytics industry.
Two major issues the WAA is going to have to address are consumer
privacy and data security, which go hand in hand. And both have
received a lot of negative attention lately. For the past few weeks,
consumers have been treated to one headline after another revealing
just how shoddy some organizations' data security practices for
safeguarding consumer information have become. First, ChoicePoint
revealed that it had been suckered by fictitious "companies" into
allowing access to its massive consumer database -- potentially
resulting in the exposure of hundreds of thousands of consumers to
fraud, identity theft, and other illegal activities (and for how
long is anyone's guess). This incident was quickly followed by a
security gaffe at Bank of America whereby it lost digital tapes
containing the credit card account records of 1.2 million federal
employees -- including 60 US senators! Next, LexisNexis revealed
that a number of incidents had taken place involving potentially
fraudulent access to information about US individuals at its
recently acquired Seisint unit. And as I write this article,
the Department of Motor Vehicles in Las Vegas, Nevada, USA,
announced that burglars had stolen a computer containing the
personal information of nearly 9,000 people.
The result of these mess-ups is that consumer data collection
practices of all kinds are facing increasing calls for legislative
action -- not only by the usual consumer privacy and advocacy
groups, but by influential members of the US Congress. Moreover, in
recent testimony to the US Congress -- a direct result of the
aforementioned gaffes -- Federal Trade Commission (FTC) officials
estimated there were 10 million US victims of identity theft between
early 2002 and early 2003, at a total estimated cost of US $53
billion to US businesses and individuals! It appears that the FTC
hasn't even tried to put a number on international losses from
identity theft and fraud. Thus, if for no other reason than
addressing the legislative issues facing the Web analytics industry,
the WAA certainly has its work cut out for it.
The WAA is an organization whose time has come. Web analytics -- and
consumer data collection -- can be conducted in a manner that is
beneficial to both companies and consumers, and in a way that
safeguards personal identifying information. I hope the WAA can lend
its expertise in helping to establish some standard (and common
sense) data collection and security practices.