direct
281.667.4200

training
888.742.2454

fax
281.652.5721

The Modeling Agency Quarterly Newsletter
2008-Q1 Release
 

[ March 1, 2008  |  This Edition: ]

1.  Training Schedule Update:  Learn how experts mine data, and why building an internal predictive modeling practice is within your grasp.  Next up: Las Vegas April 7 - 11, 2008 and Washington, DC June 2 - 6, 2008

2.  Feature Article:  "Data Mining in the Monetary and Financial Regulatory Space" by Steven W. Oxman, President, OXKO Corporation

3.  Announcement:  Second Annual Data Miner Survey: Your participation is requested, by Rexer Analytics

4.  Announcement:  Predictive Modeling News: Activate your subscription to the Monthly Newsletter for Healthcare Professionals Involved with Predictive Analytics

5.  Newsletter Summary

 
 

1.  TRAINING SCHEDULE UPDATE 

 

  
LEARN HOW EXPERTS MINE DATA IN LAS VEGAS OR WASHINGTON, DC

The next offering of The Modeling Agency's vendor-neutral, application-oriented data mining courses is scheduled for April 7 to 11 in Las Vegas and June 2 to 6 in Washington, DC.  Participants will enjoy a balanced, broad and non-promotional presentation of predictive modeling without restriction to a particular tool, method or product.

 

Attendees will learn about data mining capabilities, limitations, best practices, strategies, methods, tools, techniques and applications while enjoying all the entertainment and seasonal weather that Las Vegas has to offer.  Those in attendance will leave with a comprehensive binder of notes, illustrations and references to valuable resources.  Don't leave a powerful competitive advantage untapped: harness the valuable information and profits hidden in your data. 

Last year's offerings of both the April Las Vegas and June DC courses sold out months in advance.  Las Vegas is already near capacity, so be sure to reserve your space early.  The current status of remaining space in the April course may be viewed at TMA's training schedule page.  If you're not yet ready to formalize your registration, you may submit an unofficial registration without obligation or penalty and reserve your space today while your training request is processed.
 

CHOOSE THE TRAINING THAT'S RIGHT FOR YOU
The Modeling Agency offers three data mining courses with distinct objectives.  The courses are designed to be attended independently, or as a progressive series.  While the three levels are staged as a progression, they should not be viewed simply as "introductory, intermediate and advanced."  Refer to the table below to ensure that your experience, situation and objectives align properly with the intent, scope and depth of each offering:

Course | Focus | Scope | Geared To
Data Mining: Level I | Strategy | An intensive overview of strategy, best practices and case studies | Project Leaders, Stakeholders, Functional Managers
Data Mining: Level II | Methods | A tactical drill-down of the data mining process, methods, techniques and resources | Business Analysts, Functional Analysts, IT Professionals
Data Mining: Level III | Application | A hands-on application workshop as an extension to Data Mining: Level II | Practitioners, Model-builders, Decision Support Developers

 
FULL COURSE DETAILS

Detailed course outlines, instructor background, site information, a secure registration form and other course descriptions offered by TMA may be obtained through the links that follow.
 

Since The Modeling Agency is not a tools vendor, participants enjoy a balanced, broad and
non-promotional perspective of predictive analytics at desirable venues throughout the USA.
 
 

DATA MINING: LEVEL I
An Intensive Overview of Strategy, Best Practices
and Case Studies for Predictive Analytics

Course: Detailed Outline
Instructor: Tony Rathburn
Registration: On-Line Form
 
SCHEDULE AND SITE DETAILS 
April 7 & 8, 2008: Las Vegas, NV 
June 2 & 3, 2008: Washington, DC
September 29 & 30, 2008: San Diego, CA
Duration and Fee:
2 Days, 1.2 CEUs
$1295 USD
 
Package Price:
$1995 Levels I & II
 
 
 

DATA MINING: LEVEL II
A Tactical Drill-Down of the Data Mining
Process, Methods, Tools and Techniques

Course: Detailed Outline 
Instructor: Dean Abbott
Registration: On-Line Form
 
SCHEDULE AND SITE DETAILS
April 9 & 10, 2008: Las Vegas, NV 
June 4 & 5, 2008: Washington, DC
October 1 & 2, 2008: San Diego, CA
Duration and Fee:
2 Days, 1.2 CEUs
$1295 USD
 
Package Price:
$1995 Levels I & II
 
 
 

DATA MINING: LEVEL III
A Hands-On Application Workshop
for Data Mining Practitioners

Course: Detailed Outline 
Instructor: Dean Abbott
Registration: On-Line Form
 
SCHEDULE AND SITE DETAILS
April 11, 2008: Las Vegas, NV   
June 6, 2008: Washington, DC
October 3, 2008: San Diego, CA
Duration and Fee:
1 Day, 0.6 CEUs
$695 USD
 
Package Price:
$595 With Level II
 
 
 

 
Courses May Be Delivered At Your Site

Call (888) 742-2454 or send an email inquiry to receive a value-based
spreadsheet quotation for training at your site.


Government Buyers
TMA is a CCR Registered Veteran-Owned Small Business and accepts EFT.
 

 
 
 

 

2.  FEATURE ARTICLE
 

Data Mining in the Monetary
and Financial Regulatory Space

by  Steven W. Oxman
President, OXKO Corporation

 
INTRODUCTION
 
In the United States, tens of millions of monetary and financial transactions take place every day.  Many of these transactions fall under government regulations for one or more reasons.  One federal agency is required by law to monitor certain financial and monetary transactions.  In carrying out this monitoring, the agency considered multiple computer-based solutions and tried many of them.  The monitoring seeks to identify many different illegal activities, including fraudulent financial activities and money laundering.

Imagine looking for one transaction, or a set of transactions, that is not proper within the entire universe of transactions.  Looking at one day’s worth of transactions is a lot of work.  Looking at one month’s worth of transactions is a large undertaking.  Looking at one year’s worth of transactions most likely is too large a data set for many of the analysis computers that the government has at its disposal for this work.

And what are you looking for?  People who execute improper financial and monetary transactions work hard to make these transactions look normal and legal.  The law says that you must notify the authorities when you move $10,000.00 or more via a bank, so the people not wishing to notify the authorities try to move money in smaller amounts and try to use means other than banks for many of these transactions.  How do we ferret out the illegal transaction(s) from the legal ones when they “look” so much the same?  Will simple data analysis and data mining suffice for this task?  If yes, which algorithms work?  If no, then what must an analyst do to find the transactions of interest?

What are the analysts allowed to do (by law)?  Can the FBI easily obtain financial data from other governmental organizations and agencies like the IRS or FDIC, for example?  If yes, then what are their boundaries for doing so?  If no, then what can they do? 
 

WITHIN AND BETWEEN GOVERNMENT AGENCIES
Although this subject actually goes beyond the boundaries of this article, and I am not an expert in this area, I can tell you what I have observed. 

Even with the Patriot Act, it is still not easy for Government Agencies to share certain data about businesses and individuals.  The US Government takes privacy issues very seriously.  Agencies like the FBI cannot request certain information directly from other governmental agencies.  Instead, the FBI or similar agencies must go to a Grand Jury and ask the Judge for permission for certain information transference.  They must provide good reason along with very specific focus and purpose.  If the Grand Jury and the Judge agree, then specific permission can be given for the good reason for specific information pertinent to a specific case.  For the most part “John Doe requests” (those that ask for information on a larger, non-specific group of people) are not approved. 

So agents and analysts have to focus on their cases, must focus on information they require, must make a good case for the information needed, and must be able to present their case to the Grand Jury effectively, before they can obtain certain data and information from another Government Agency.  It is not easy, and it is usually not fast.  If given the “go ahead” from the Judge and Jury, then the analysts and agents can obtain that specific data and information and use it within the boundaries of the Judge’s and Jury’s guidelines.
 

ANALYSTS' WORK
Usually, the work of a financial or monetary analyst is being done for one or more agents working on specific cases or specific financial or monetary schemes.  The agents usually work with agency lawyers.  The analyst often will have to sift through a lot of raw data looking for patterns. 

Anyone who might be called as a witness in a case cannot be “tainted” with data and information that is beyond the boundaries of the case.  Therefore, in a large case, there could be an analyst that does the preliminary work that might get inadvertently tainted.  Then there is another analyst or analysts that only look at the initially sifted-through data, so they will not be tainted and can be available as witnesses as needed by the government lawyer on the case.   

The analyst’s work is to find pertinent evidence for the case, and specifically for the lawyer on the case.  This analyst must work within the boundaries set by any Judge or Grand Jury, if applicable.

An analyst’s workstation needs to be connected to the data.  This data can be very large and the need to access this very large data base is important.  Therefore, the workstation needs a large communications pipe to the data base server or servers.  The analyst’s workstation (some like to call it a work bench) needs to be loaded with a rich set of data analysis and data mining tools.  The analyst will have basic tools like Microsoft Excel and all its data analysis tools.  SQL Server is a good data base management system for the data servers, for it offers important services like Analysis Services for SQL Server 2005 and OLAP (on line analytical processing) support.  Even before SQL Server 2005, Microsoft was providing a lot of these services to SQL Server.  With SQL Server 2005 and Office 2007, Microsoft provides free downloads of new data mining tools for Excel and Visio. 

The rest of the data analysis tools needed for the analyst might include ETL (extraction, transformation, and loading – with a de-emphasis on the transformation step) tools, data mining tools, link analysis tools, data induction tools, web analytics tools, data visualization tools, text handling tools, text analysis tools, text mining tools, time series analysis tools, powerful search tools (for web and for private data collections), data merge tools, data field duplication detection tools, statistical tools, etc.  Effective tools are available from vendors like SPSS, SAS, IBM, Microsoft, KnowledgeStudio, Isoft, i2, Visual Analytics, Tildenwoods, LPA, NetMap Analytics, Megaputer, Eastport Analytics, Business Objects, MicroStrategy, Paris Technologies, Thomas Behrends, Synaptris, and others.
 

CHAIN OF CUSTODY AND ORIGINAL DATA FORM RETENTION
This is another area that is beyond the scope of this paper, but allow me to provide a few observations.  When working on legal projects, the court system wants to ensure that there is a clear and documented chain of custody of the data and its analysis.  This is to ensure that the data used for the case has not been modified.  Also, the courts want documented evidence that the data used for the analysis was in its original form, and that it has not been transformed, modified, or otherwise altered so that it no longer represents the original evidence and facts.  It is up to the agents to bring the analysts “clean” data, and it is up to the agents and the analysts not to modify or corrupt the data, and to be able to provide full documentation of all parties that have had any contact with the data.
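
The article does not say how this documentation is produced.  As an illustration only, one widely used way to show that a data set has not changed between hand-offs is to record a cryptographic digest of it each time it changes hands.  The sketch below assumes a simple in-memory custody log; the function names, the log fields and the handler names are hypothetical and are not taken from any agency's procedure.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of_bytes(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def custody_entry(handler: str, action: str, data: bytes) -> dict:
    """Build one (hypothetical) custody-log record for a hand-off."""
    return {
        "handler": handler,
        "action": action,
        "sha256": sha256_of_bytes(data),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def unchanged(log: list) -> bool:
    """True when every hand-off recorded the same digest."""
    return len({entry["sha256"] for entry in log}) <= 1
```

If every entry in the log carries the same digest, the bytes that reached the analyst are the bytes the agent collected; a single differing digest shows where in the chain the data changed.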
 

DATA ANALYSIS MEETS SOCIAL ANALYSIS
Data mining and analysis could be defined as the process of pattern and/or knowledge discovery from large collections of data using data analytic methods.   Social analysis might be defined as the process of pattern and/or knowledge discovery from collections of social artifacts using social analytic methods.

Data mining and analysis amasses large collections of data, and looks for patterns of interest in the collected data base to realize knowledge discovery from the data patterns found.  Social analysis collects social facts and looks for patterns of interest in the collected social fact base to realize knowledge discovery from the social patterns found.

We might find through a data mining and analysis process that many monetary transactions of $5,000.00 each by individuals were attempts to transfer large amounts of money without Government Agency notice.  We might find, through social analysis, that people predominantly spend larger amounts of money on groceries when they are near their primary residence.  

When looking to use data mining and analysis techniques to find illegal monetary and financial transactions and identify the parties involved, sometimes social analysis is needed to assist with the analysis work.  For example, say that an individual was transferring money from one country into and out of the US.  And say that the monetary actions and the monetary transfer technique were identified.  The analysis left us with the monetary vehicle and an identification number, but not the name and address of the individual involved.  So how do we locate and identify the individual of interest?  Can data mining find this person for us?

We have not been able to identify a method of locating and identifying the person of interest from a pure data mining play.  However, by utilizing both data mining and analysis with social analysis, we were able to develop a method to locate and identify people of interest.  Our social analysis really gets us into “social mining.”

Through the data mining and analysis work, we can identify monetary and/or financial transactions that appear to be illegal.  We can verify if the transactions are illegal or not through classic investigatory methods.  When we have a transaction or set of transactions that are verified as illegal, then we proceed to locate and identify the person or persons of interest. 

We start with any identifying data from the transaction, for example, an account number.  Sometimes we are lucky and the person’s location and identification data are also available, which makes it simple to provide that information to the agent or agents.  But frequently life is not so simple: we have an account number, but little to no other data for location and identification purposes.  Even then, the rest of the information will help us.  It is data on where the transactions occurred and what they were for, and whether or not the stated reason for a transaction is accurate, this information will assist us.

So let’s take a small imaginary example to see how social mining and analysis assists us.  For one account number, we have the following transactions:

Date | Place | City, State | Item(s) | Amount
01/01/01 | Grocery Store | New York, NY | Food items | $50.00
01/01/01 | Liquor Store | New York, NY | Wine | $100.00
01/02/01 | Gas Station | Branford, CT | Gas | $50.00
01/02/01 | Grocery Store | Natick, MA | Food items | $250.00
01/03/01 | Vet Office | Natick, MA | Cat Exam | $80.00
01/04/01 | Tire Store | Wellesley, MA | Tires | $300.00
01/04/01 | Jiffy Lube Store | Natick, MA | Lube and Oil | $40.00
01/05/01 | A-Plus Plumbing | Framingham, MA | Plumbing Repair | $150.00

If an analyst were to see this data alone, I believe that a good analyst would be able to infer that the person of interest lives in the Natick/Wellesley/Framingham area of Massachusetts.  This might be a primary residence or a vacation home.  The Natick area in January does not seem a likely place for a vacation home or secondary residence.  Natick is not near warm climates for January, it is not near any significant tourist attraction, and it is not near a winter sport area like a ski resort.  Therefore, most likely, this is a primary residence.  So there are at least two important questions to answer here:

1.  When working with very large data bases, how do we get to “look” at a small data set like this one?

2.  How do we identify the actual location and identification of the person of interest from this data?

Actually it is easy to answer both of these questions.  For question one, we would query our very large data base for one week or one month of account data for an account that we have already identified as having illegal monetary or financial transactions (e.g., twenty movements of approximately $5,000.00 from one account to one other account without any notification, in a very short period of time).  We would then be looking at a small, very workable set of data like that shown above.  For question two, we would go to the Grand Jury and get the proper authorization to go to some of the businesses (say the Vet and Plumber) and ask for the identification and location of the person who used that account number.  For smaller businesses near the person of interest (as in the case of the Vet), it is likely that the business would know its client and have the account number on file or in its accounting records.  Where personal recognition is later needed, again with the correct court permission and procedures, employees of these businesses can be used for personal recognition of the person or persons.
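
The answer to question one can be sketched as two queries.  The example below is a minimal illustration, not the agency's actual system: it assumes a single hypothetical "transfers" table (source account, destination account, day, amount) held in an in-memory SQLite data base, and the account names and demo rows are invented.  The first query flags account pairs with many similar-sized transfers in a period; the second pulls the small, workable slice of activity for one flagged account.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory data base with invented transfer rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transfers (src TEXT, dst TEXT, day TEXT, amount REAL)"
    )
    # Twenty transfers of $5,000 between the same two accounts.
    rows = [("ACCT-1", "ACCT-2", f"2001-01-{d:02d}", 5000.0) for d in range(1, 21)]
    # Unrelated activity that should not be flagged.
    rows += [
        ("ACCT-3", "ACCT-4", "2001-01-05", 5000.0),
        ("ACCT-5", "ACCT-6", "2001-01-07", 120.0),
    ]
    conn.executemany("INSERT INTO transfers VALUES (?, ?, ?, ?)", rows)
    return conn


def repeated_transfers(conn, low, high, start, end, min_count):
    """Account pairs with at least min_count transfers in [low, high] dollars
    between the start and end days (ISO dates, inclusive)."""
    sql = """
        SELECT src, dst, COUNT(*) AS n, SUM(amount) AS total
        FROM transfers
        WHERE amount BETWEEN ? AND ?
          AND day BETWEEN ? AND ?
        GROUP BY src, dst
        HAVING COUNT(*) >= ?
        ORDER BY n DESC
    """
    return conn.execute(sql, (low, high, start, end, min_count)).fetchall()


def account_activity(conn, account, start, end):
    """All transfers sent by one account in a date window, oldest first."""
    sql = """
        SELECT src, dst, day, amount
        FROM transfers
        WHERE src = ? AND day BETWEEN ? AND ?
        ORDER BY day
    """
    return conn.execute(sql, (account, start, end)).fetchall()
```

Calling repeated_transfers(conn, 4500, 5500, "2001-01-01", "2001-01-31", 20) on the demo data returns only the ACCT-1 to ACCT-2 pair, and account_activity then narrows the view to one week or one month for that account.  The dollar band, the window and the count threshold are all parameters an analyst would tune.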

Also notice that through the use of data mining, a good rule-based engine, and some social mining, an analyst could also automate the function of going from improper monetary or financial transactions directly to account identification and location data assistance.  For example, one rule might be:  If business used is a VET, then primary residence is within 10 miles of the business, Confidence 80.  Another example might be: If business used is a GROCERY STORE, and amount spent is OVER $100, then primary residence is within 5 miles of the business, Confidence 90.
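
The two example rules can be expressed as a very small rule engine.  The sketch below is an illustration under stated assumptions: it encodes only the two rules quoted above, applies them to the sample transactions listed earlier in this article, and recognizes the business type by a simple keyword match on the place name, which is a stand-in for whatever business classification a real system would use.  The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Transaction:
    date: str
    place: str
    city: str
    amount: float


@dataclass
class Rule:
    name: str
    matches: Callable[[Transaction], bool]
    radius_miles: int   # residence assumed within this distance of the business
    confidence: int     # confidence value as quoted in the article


# The two rules quoted in the article; keyword matching is an assumption.
RULES = [
    Rule("vet", lambda t: "vet" in t.place.lower(), 10, 80),
    Rule(
        "grocery over $100",
        lambda t: "grocery" in t.place.lower() and t.amount > 100,
        5,
        90,
    ),
]

# The sample transactions from the article's example.
TRANSACTIONS = [
    Transaction("01/01/01", "Grocery Store", "New York, NY", 50.00),
    Transaction("01/01/01", "Liquor Store", "New York, NY", 100.00),
    Transaction("01/02/01", "Gas Station", "Branford, CT", 50.00),
    Transaction("01/02/01", "Grocery Store", "Natick, MA", 250.00),
    Transaction("01/03/01", "Vet Office", "Natick, MA", 80.00),
    Transaction("01/04/01", "Tire Store", "Wellesley, MA", 300.00),
    Transaction("01/04/01", "Jiffy Lube Store", "Natick, MA", 40.00),
    Transaction("01/05/01", "A-Plus Plumbing", "Framingham, MA", 150.00),
]


def residence_hints(
    transactions: List[Transaction], rules: List[Rule]
) -> List[Tuple[str, int, int, str]]:
    """Return (city, radius in miles, confidence, rule name) for every rule
    that fires, highest confidence first."""
    hints = []
    for t in transactions:
        for r in rules:
            if r.matches(t):
                hints.append((t.city, r.radius_miles, r.confidence, r.name))
    return sorted(hints, key=lambda h: -h[2])
```

On the sample data, residence_hints(TRANSACTIONS, RULES) fires twice, both times for Natick, MA: the $250.00 grocery purchase (within 5 miles, Confidence 90) and the vet visit (within 10 miles, Confidence 80).  The $50.00 grocery purchase in New York does not fire because it is under the $100 threshold.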
 

CONCLUDING REMARKS
Data analysis and mining is a valuable tool for regulatory agencies.  However, data analysis and mining, in some cases, is not enough.  Social analysis and mining can be used to augment data analysis and mining in those cases where data analysis and mining is not enough.  Through the use of data mining and social mining together, many illegal activities could be found including, but not limited to, illegal monetary transactions, illegal financial transactions, money laundering, drug related transactions, and terrorism related transactions. 

    
ABOUT THE AUTHOR
Steven Oxman, President of the OXKO Corporation, has been in the Information Technology industry since 1967, spanning a number of significant technologies including very large data bases, data warehouses, data base security, software engineering environments, expert systems, knowledge based systems, data mining, and web based application development. Presently, Mr. Oxman functions as a part-time/surrogate CIO for a number of organizations that need a CIO-level executive for a specific amount of time or for a specific need.

Clients of Mr. Oxman have included Charlotte Radiology, Du Pont, Elkem Metals, Executive Residence, the IRS, US Navy, NASA, CSC, Ingersoll-Rand, American Airlines, GE, Union-Carbide, and the State of Illinois. Mr. Oxman has also been a Professor for George Washington University, and an Instructor for American University and City Colleges of Chicago. Mr. Oxman is an avid pilot, often using his personal aircraft to commute to his clients.

Published with permission by OXKO Corporation Copyright © 2008


 

 

3.  ANNOUNCEMENT
 

 Second Annual Data Miner Survey

Your Participation is Requested
by Rexer Analytics

 

REXER ANALYTICS ANNOUNCES LAUNCH OF SECOND ANNUAL DATA MINER SURVEY
Rexer Analytics has launched their second annual survey of the data mining community.  In the spring of 2007, Rexer Analytics assessed the experiences, priorities, and challenges of the data mining community through an online survey.  They are now repeating that survey, and have added new questions in order to dig deeper into the opinions and needs of the data mining community.

Last year’s Data Miner Survey received responses from 314 individuals in over a dozen fields and from 35 different countries.  The results were carried in both online and traditional publications in the US, France, Germany, Poland, and elsewhere.

Rexer Analytics would welcome your participation in this year’s survey.  Your responses will be held completely confidential: no information you provide on the survey will be shared with anyone outside of Rexer Analytics.  All reporting of the survey findings will be done in the aggregate, and no findings will be written in such a way as to identify any of the participants.  This research is not being conducted for any third party, but is solely for the purpose of Rexer Analytics to disseminate the findings throughout the data mining community via publication, conference presentations, and personal contact.

If you would like a summary of last year’s or this year’s findings emailed to you, there will be a place at the end of the survey to leave your email address.  You can also email Karl Rexer directly if you have any questions about this research or to request research summaries.


PLEASE CONTRIBUTE YOUR PERSPECTIVES AND EXPERIENCE TO THE SURVEY
To participate, please proceed to the survey and enter the access code T9MAE in the space provided.  The survey should take approximately 15 minutes to complete.

Thank you for your time.  We hope the results from this survey provide useful information to the data mining community.

 


4.  ANNOUNCEMENT
 

Predictive Modeling News

The Monthly Newsletter for Health Care Professionals
Involved with Predictive Modeling

 

  • Featuring articles, news, key data, technology developments, results from recent studies, interviews, case studies, monthly survey results, thought leaders corner and more

  • Content covering issues of care management, actuarial and profiling interest

  • Twelve pages per month with a mixture of detailed information, charts, graphs and tables, summary tidbits and more

  • Available in print and electronic format

  • Subscriber web site with archives, supplemental content and account management

  • Subscribers can also participate and interact in a predictive modeling list-serv

  • Subscribe now for $39 per month, or $468 annually 

 

Predictive Modeling News

Request a free sample by sending an e-mail to info@predictivemodelingnews.com. You must include full contact information for mailing, and be sure to mention you are requesting a sample copy in your message.  Activate your subscription now.  Or for additional information you may also call (209) 577-4888.

   


 
5.  NEWSLETTER SUMMARY
 

The Modeling Agency newsletter is a quarterly publication which provides course announcements, training schedule updates and informative articles.  This newsletter may be shared in its entirety and subscriptions are free. For additional information on TMA's training, consulting services and solutions, follow corresponding links at the top of this page.

This newsletter is shared with those who have activated a subscription, or have supplied their email address to The Modeling Agency when requesting product information. If you wish not to receive future releases, simply send an empty email with "cancel" as the subject from the account to which you were subscribed.

    address
One Oxford Centre
301 Grant St, Ste 4300
Pittsburgh, PA 15219 USA
 
phone: 281.667.4200
fax: 281.652.5721
training: 888.742.2454
Copyright © 2000 - 2008 The Modeling Agency. All rights reserved.