direct: 281.667.4200
training: 888.742.2454
fax: 281.652.5721

Predictive Analytics & Data Mining I
Model Development

A Tactical Drill-Down of Process, Methods, Tools and Techniques

Two Days: $1,295
Levels I and II Package: $2,995

  

 Dates                  Site             Instructor
 August 23 & 24, 2010   Denver, CO       Tim Graettinger, PhD
 October 4 & 5, 2010    Washington, DC   Tim Graettinger, PhD
 December 6 & 7, 2010   Las Vegas, NV    Tim Graettinger, PhD

 

Seating is limited to 18 participants.  Register early!  Proceed to the On-Line Registration Form to reserve your space today.

 

 
ABOUT THIS COURSE
The Modeling Agency's "Model Development" course presents a deep dive into the data mining process at a tactical level.  Attendees will observe demonstrations of machine learning methods and computer-guided analytical techniques for extracting and interpreting complex patterns and relationships from large volumes of data.  If you desire an intensive tactical orientation to data mining concepts, tools, techniques and supporting methods, then this event is designed for you.
 
This vendor-neutral course broadly covers data-driven information discovery techniques and model-building tactics without restriction to any particular modeling tool. Popular open-source and commercial packages are leveraged to illustrate methods, but not to showcase the tools.
 
There are no prerequisites for this course.  However, participants will benefit from reviewing the CRISP-DM guide ahead of the training.

Each course in the series is designed to be taken independently or as a natural progression from tactics to strategy and practice.  View the course series overview page to compare the two primary orientations and target the most fitting agenda for your experience, situation and objectives.
 

   
WHO SHOULD ATTEND

IT PROFESSIONALS who wish to expand their skills in this increasingly visible area within the corporate IT agenda

PROJECT LEADERS who must report on developmental progress, resource requirements and system performance

DECISION SUPPORT SYSTEM ARCHITECTS who require an understanding of the infrastructures required for supporting a data mining solution

BUSINESS ANALYSTS who must develop and interpret the models, communicate the results and make actionable recommendations

FUNCTIONAL ANALYSTS: Customer Relationship Managers, Risk Analysts, Business Forecasters, Statistical Analysts, Inventory Flow Analysts, Direct Marketing Analysts, Medical Diagnostic Analysts, Market Timers, e-commerce System Architects and Web Data Analysts
 

 
BENEFITS OF ATTENDING
  • Gain vendor-neutral exposure to tools and techniques that will place you months ahead in method planning and product surveying

  • Examine which methods and tools are most effective for your needs

  • Avoid pitfalls in data preparation, modeling, and results interpretation

  • Leave with resources, contacts and actionable plans to substantially increase your analysis capabilities while minimizing dead ends


THE BUSINESS CHALLENGE

The rapid emergence of electronic data processing and collection methods has led some to call recent times the "Information Age."  However, it may be more accurately termed "The Age of the Data Glut."  Most businesses either possess a large database or have access to one.  These databases contain so much data that it becomes very difficult to understand just what that data is telling us.

There is hardly a transaction that does not generate a computer record somewhere.  All this data has meaning with respect to making better business decisions or understanding customer needs and preferences.  But how do you discover those needs and preferences in a database that contains gigabytes of seemingly incomprehensible numbers and facts?  Data mining and predictive analytics do just that.

The intent of this course is to offer attendees a stronger grasp of data mining techniques, a solid understanding of how various methods and tools apply to different kinds of data-intensive problems, and practical ways to overcome the limitations that cause predictive models to underperform.

 


WHAT YOU WILL LEARN

  • The data mining process and general implementation

  • How to prepare raw data and benefit from visualization

  • Various data mining methods and how they compare

  • Advanced model building techniques

  • Results analysis and validation

  • Technology and product selection

  • Solution integration, ongoing performance and maintenance

  • Where to begin and how to obtain resources and support


WHAT MAKES THIS COURSE UNIQUE

This course does not restrict or skew the presentation of data mining methods through a single product.  Rather, the course gives consideration to all resources from a vendor-neutral position. The instructor possesses a wealth of pragmatic experience in applying data mining technology across industries in real-world applications.  This course insists upon making predictive analytics constructive and interpretable in a business or organizational setting.

In addition, live modeling demonstrations projected from the presenter's machine will support the instructional sessions. The demonstrations will highlight superior performance as well as pitfalls. The instructor will show how to evaluate various packages based on strengths, limitations, value and general performance.

 


COURSE OUTLINE

INTRODUCTION

  • What you will get in this course

  • What is PA/DM?

    • Definition

    • Related terms and fields

      • Machine learning

      • Computer-aided pattern discovery

      • Business analytics and statistics

      • Others you have heard?

    • Examples

    • Differences

  • How can you develop PA/DM opportunities?

    • Generative questions

    • Examples

  • Nuts and bolts of a project

    • Big Picture: Introduction to CRISP-DM

      • What is it?  What is it not?

      • Why do we care? Why use  it? What is it good for?

    • Example: Tour of CRISP-DM in real-world context

    • Team Exercise

  • One Practitioner's View

    • Regarding PA/DM: What's hype and what isn't?

    • How to be successful with PA/DM

    • Tools and products

    • People matter

 
CRISP-DM METHODOLOGY: Parts 3, 4, 5

  • Highlight CRISP-DM 1, 2, 6
    CRISP 1, 2, 6 are detailed in "Level II: Strategic Implementation"

    • Business understanding

    • Data understanding

    • Deployment
       

  • Data Preparation (CRISP 3)

    • Rows: Select data

      • How much data?

      • Rows: Selecting the "unit  of analysis"

      • Determine what the record will  look like

      • Determine how many records we have to work with

      • Site  selection example

    • Rows: Defining the population / outcome of interest

    • Rows: Sampling methods / oversampling

    • Rows: Exclusions / rules of thumb

    • Columns: Identifying types

      • Need definitions (from clients or internal) so that we
        understand what the data represents.  Don't assume
        that an element isn't important

      • Categorical / Nominal (what does null mean?)

      • Ordinal

      • Interval / Ratio

      • Date / Time

      • Sub-Types (money, count, geo, id, etc., and why care?)

    • Columns: Appropriate statistics and visualizations

      • Univariate

      • Multivariate

    • Columns: Selection for modeling

      • See "Clean Data" for pre-modeling elimination of
        redundant, constant, etc. columns

      • Final selection is done during the Modeling phase

    • Document the above in a "Scorecard"
       

  • Modeling  (CRISP 4)

    • Select modeling technique

      • Taxonomies: An overview

        • Supervised vs. Unsupervised

        • Descriptive vs. Predictive

        • Classification vs. Estimation

      • Supervised -- Constellation of methods with pros and cons

        • Classification

          • Decision Trees

          • Logistic Regression

          • Neural Networks

          • K-Nearest Neighbor

        • Prediction

          • Linear Regression

          • Neural Networks

          • MARS

      • Exercise: Scenario revisited -- What method(s) do we choose?

      • Unsupervised -- More methods with pros and cons

        • Segmentation / Clustering

          • Hierarchical clustering

          • K-Means

          • Decision trees

        • Association Rules

    • Team Exercise: Come up with an expert-derived decision tree to
      make a selection for supervised problems

    • Advanced Topics

      • Ensembles

      • Bagging

      • Boosting

    • Parting remarks

      • Models should be as simple as possible, but no simpler

      • Why not both?  (a low-res descriptive model and
        a high-res opaque accuracy model)

    • Generate test design

      • Data segregation

      • Performance metrics: Whenever possible, go for the
        custom metric -- "If you build it, they will come."

    • Build Model

      • Use a tool, select a method, set parameters (if any),
        select candidate columns, select outcome (if supervised)

      • Variable selection techniques for supervised methods

      • Variable selection techniques for unsupervised methods

    • Assess Model (Tweaking)

      • Predictors

      • Manually removing or limiting

      • Forcing predictors

    • Structure

    • Profiles

    • Compared to What?

      • Baseline model comparison

      • Train/Test/Validation comparison

    • Scoring the model

      • What does scoring mean?

      • How is it different from building the model?

      • What are we looking for when scoring?

    • Final Product

      • Model(s)

      • Description(s)
         

  • Evaluation  (CRISP 5)

    • Evaluate results (from business perspective)

    • Prelude to business use presentation

      • Informal, low-risk setting

      • Poke holes early, before business presentation

    • Does the model or segmentation make sense?

    • Does it contradict or reinforce the standard "lore"?

    • SWOT analysis: What are the strengths, weaknesses,
      opportunities and threats?

    • Get support and buy-in from potential champions

    • Candidate names for  segments

    • Present results to business users or clients

      • Business users need to be convinced: Models, segments and analysis
        need to be marketed!

      • Deployment will require change

        • To processes

        • To systems

        • To ingrained mindsets

      • Deployment costs (to each change area above)

      • Results must have business value, not technical representations

      • Performance results -- in business terms

      • Descriptions

        • No equations

        • Tell the story, paint the vision, what will life be like with
          or without the model in place?

    • Review Process

      • Follow-ups to the presentation

        • Anticipate follow-up issues in planning and estimates

        • Revisions to the model(s) or segments based upon feedback

        • Final quality assurance

    • Determine next steps

      • Are you done?

      • Will the model(s) be deployed?  Why or why not?

      • Document!

      • Lessons learned meeting

    • Final Product

    • Consulting Exercise

 
WRAP-UP AND PARTING THOUGHTS

  • Final Q&A

    • Springboard exercise: "The boss in the elevator"

  • PA/DM Philosophy

    • Understand the problem

    • Understand the data

    • Then, think about how to solve it (Einstein quote)

    • Work on problems with specific business goals,
      specific hypotheses to be tested.  Do NOT go
      prospecting for "data mining nuggets."

  • Next Steps

    • PA/DM Level II Course: "Strategic Implementation"

    • Certification Exam (for those who complete the series)

    • Product training courses

    • Keep learning!

    • Supplementary materials and resources

    • Conferences and communities

    • Get started on a project!
       


 
 

Courses May Be Delivered At Your Site

Call (888) 742-2454 or send an email inquiry to receive a
value-based spreadsheet quotation for training at your site.

ATTENDEES' COMMENTS

"When the only complaint is that the course could be longer, I think you've got an excellent class!  I very much enjoyed the instructor's use of a real data set to demonstrate principles taught throughout the entire class. The instructor went out of his way both before and during the class to help me to translate the class material to my own work."

Susan Glass
Senior Engineer, Biological Technologies Analysis Solutions
Wyeth

"The instructor is knowledgeable, well organized, and interacts extremely well with participants. If you have only two days to learn about data mining, "Model Development" is the class you should attend."

Yiguang Qiu, PhD
Marketing Department
Amica Insurance

 
"The wealth of information covered in these courses, as well as the in-depth demonstrations of multiple software packages, made the sessions valuable from a wide range of perspectives.  I will certainly recommend that others attend."

Brent King
AVP, Managed Care Analytics / Business Development
HealthSmart Preferred Care

 
"The instructor and course material are first rate. Any organization that believes data mining should be a part of their business operations portfolio would be making a wise investment by attending this course."

Eric Rickard
Information Computing Sciences
SRI International
 

"'PA & DM: Model Development' gave me a new perspective on techniques and applications software that our federal agency had not previously seen. The course content was great, and the very knowledgeable instructor kept the students' attention by using real-life examples and discussion of additional resources. I highly recommend this course!"

Larry P. Taylor
Auditor
US Department of Education

 
"The class was great! I was really impressed with the instructor's knowledge, experience, and ability. He was able to answer everyone's questions thoroughly and tailor the class to individual needs. I learned so much about the data mining process, the different methods, and available tools. I highly recommend this course to both technical and non-technical people interested in leading-edge data mining methodologies and the application of current data mining software to marketing, business, and research endeavors."

Stephen Pearce
Preventive Medicine
Kaiser Permanente

 
"I was a bit apprehensive about attending and how I could apply data mining concepts to my particular industry, but Dean put those fears to rest. Highly recommended!  Thanks TMA!"

Lewis Kohnle
Planner / Analyst
The Mitchell Gold Company
 

"This class gives the Statistician a bunch of new tools to use in solving business problems. Once the limitations of statistics are reached, grab this data mining tool belt.  You will be surprised how much further you can get."

Raymond D. Mooring, PhD
Wage and Investment Research
Internal Revenue Service


    address
One Oxford Centre
301 Grant St, Ste 4300
Pittsburgh, PA 15219 USA
 

 
Copyright © 2000 - 2010 The Modeling Agency. All rights reserved