Data defines the model by dint of genetic programming, producing the best decile table.
Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data (4th printing) - Bruce Ratner, Ph.D.
Table of Contents
Chapter 1 - Introduction 1.1 The Personal Computer and Statistics 1.2 Statistics and Data Analysis 1.3 EDA 1.4 The EDA Paradigm 1.5 EDA Weaknesses 1.6 Small and Big Data 1.6.1 Data Size Characteristics 1.6.2 Data Size: Personal Observation of One 1.7 Data Mining Paradigm 1.8 Statistics and Machine Learning 1.9 Statistical Learning 1.10 References
Chapter 2 - Two Simple Data Mining Methods for Variable Assessment 2.1 Correlation Coefficient 2.2 Scatterplots 2.3 Data Mining 2.3.1 Example #1 2.3.2 Example #2 2.4 Smoothed Scatterplot 2.5 General Association Test 2.6 Summary 2.7 References
Chapter 3 - Logistic Regression: The Workhorse of Database Response Modeling 3.1 Logistic Regression Model 3.1.1 Illustration 3.1.2 Scoring a LRM 3.2 Case Study 3.2.1 Candidate Predictor and Dependent Variables 3.3 Logits and Logit Plots 3.3.1 Logits for Case Study 3.4 The Importance of Straight Data 3.5 Re-expressing for Straight Data 3.5.1 Ladder of Powers 3.5.2 Bulging Rule 3.5.3 Measuring Straight Data 3.6 Straight Data for Case Study 3.6.1 Re-expressing FD2_OPEN 3.6.2 Re-expressing INVESTMENT 3.7 Techniques When Bulging Rule Does Not Apply 3.7.1 Fitted Logit Plot 3.7.2 Smooth Predicted vs. Actual Plot 3.8 Re-expressing MOS_OPEN 3.8.1 Smooth Predicted vs. Actual Plot for MOS_OPEN 3.9 Assessing the Importance of Variables 3.9.1 Computing the G statistic 3.9.2 Importance of a Single Variable 3.9.3 Importance of a Subset of Variables 3.9.4 Comparing the Importance of Different Subsets of Variables 3.10 Important Variables for Case Study 3.10.1 Importance of the Predictor Variables 3.11 Relative Importance of the Variables 3.11.1 Selecting the Best Subset 3.12 Best Subset of Variables for Case Study 3.13 Visual Indicators of Goodness of Model Predictions 3.13.1 Smooth Residual by Score Groups Plot 3.13.1.1 Smooth Residual by Score Groups Plot for Case Study 3.13.2 Smooth Actual vs. Predicted by Decile Groups Plot 3.13.2.1 Smooth Actual vs. Predicted by Decile Groups Plot for Case Study 3.13.3 Smooth Actual vs. Predicted by Score Groups Plot 3.13.3.1 Smooth Actual vs. Predicted by Score Groups Plot for Case Study 3.14 Evaluating the Data Mining Work 3.14.1 Comparison of Smooth Residual by Score Groups Plots: EDA vs. NonEDA Models 3.14.2 Comparison of Smooth Actual vs. Predicted by Decile Groups Plots: EDA vs. NonEDA Models 3.14.3 Comparison of Smooth Actual vs. Predicted by Score Groups Plots: EDA vs. NonEDA Models 3.14.4 Summary of the Data Mining Work 3.15 Smoothing A Categorical Variable 3.15.1 Smoothing FD_TYPE with CHAID 3.15.2 Importance of CH_FTY_1 and CH_FTY_2 3.16 Additional Data Mining Work For Case Study 3.16.1 Comparison of Smooth Residual by Score Group Plots: 4var- vs. 3var-EDA Models 3.16.2 Comparison of Smooth Actual vs. Predicted by Decile Groups Plots: 4var- vs. 3var-EDA Models 3.16.3 Comparison of Smooth Actual vs. Predicted by Score Groups Plots: 4var- vs. 3var-EDA Models 3.16.4 Final Summary of the Additional Data Mining Work 3.17 Summary
Chapter 4 - Ordinary Regression: The Workhorse of Database Profit Modeling 4.1 Ordinary Regression Model 4.1.1 Illustration 4.1.2 Scoring an OLS Profit Model 4.2 Mini Case Study 4.2.1 Straight Data for Mini Case Study 4.2.1.1 Re-expressing INCOME 4.2.1.2 Re-expressing AGE 4.2.2 Smooth Predicted vs. Actual Plot 4.2.3 Assessing the Importance of Variables 4.2.3.1 Defining the F Statistic and R-squared 4.2.3.2 Importance of a Single Variable 4.2.3.3 Importance of a Subset of Variables 4.2.3.4 Comparing the Importance of Different Subsets of Variables 4.3 Important Variables for Mini Case Study 4.3.1 Relative Importance of the Variables 4.3.2 Selecting the Best Subset 4.4 Best Subset of Variables for Case Study 4.4.1 PROFIT Model with gINCOME and AGE 4.4.2 Best PROFIT Model 4.5 Suppressor Variable AGE 4.6 Summary
Chapter 5 - CHAID for Interpreting a Logistic Regression Model 5.1 Logistic Regression Model 5.2 Database Marketing Response Model Case Study 5.2.1 Odds Ratio 5.3 CHAID 5.3.1 Proposed CHAID-based Method 5.4 Multivariable CHAID Trees 5.5 CHAID Market Segmentation 5.6 CHAID Tree Graphs 5.7 Summary
Chapter 6 - The Importance of the Regression Coefficient 6.1 The Ordinary Regression Model 6.2 Four Questions 6.3 Important Predictor Variables 6.4 P-values and BIG Data 6.5 Returning to Question #1 6.6 Predictor Variable's Effect On Prediction 6.7 The Caveat 6.8 Returning to Question #2 6.9 Ranking Predictor Variables By Effect On Prediction 6.10 Returning to Question #3 6.11 Returning to Question #4 6.12 Summary 6.13 Reference
Chapter 7 - The Predictive Contribution Coefficient: A Measure of Predictive Importance 7.1 Background 7.2 Illustration of Decision Rule 7.3 Predictive Contribution Coefficient 7.4 Calculation of Predictive Contribution Coefficient 7.5 Extra-illustration of Predictive Contribution Coefficient 7.6 Summary 7.7 Reference
Chapter 8 - CHAID For Specifying A Model With Interaction Variables 8.1 Interaction Variables 8.2 Strategy for Modeling with Interaction Variables 8.3 Strategy Based on the Notion of a Special Point 8.4 Example of a Response Model with an Interaction Variable 8.5 CHAID for Uncovering Relationships 8.6 Illustration of CHAID for Specifying a Model 8.7 An Exploratory Look 8.8 Database Implication 8.9 Summary 8.10 Reference
Chapter 9 - Market Segment Classification Modeling With Logistic Regression 9.1 Binary Logistic Regression 9.1.1 Necessary Notation 9.2 Polychotomous Logistic Regression Model 9.3 Model Building With PLR 9.4 Market Segmentation Classification Model 9.4.1 Survey of Cellular Phone Users 9.4.2 CHAID Analysis 9.4.3 CHAID-tree Graphs 9.4.4 Market Segment Classification Model 9.5 Summary
Chapter 10 - CHAID As A Method For Filling In Missing Values 10.1 Introduction to the Problem of Missing Data 10.2 Missing-data Assumption 10.3 CHAID Imputation 10.4 Illustration 10.4.1 CHAID Mean-value Imputation for a Continuous Variable 10.4.2 Many Mean-value CHAID Imputations for a Continuous Variable 10.4.3 Regression-tree Imputation for LIF_DOL 10.5 CHAID Most-likely Category Imputation for a Categorical Variable 10.5.1 CHAID Most-likely Category Imputation for GENDER 10.5.2 Classification-tree Imputation for GENDER 10.6 Summary 10.7 Reference
Chapter 11 - Identifying Your Best Customers: Descriptive, Predictive and Look-Alike Profiling 11.1 Some Definitions 11.2 Illustration of a Flawed Targeting Effort 11.3 Well-Defined Targeting Effort 11.4 Predictive Profiles 11.5 Continuous Trees 11.6 Look-Alike Profiling 11.7 Look-Alike Tree Characteristics 11.8 Summary
Chapter 12 - Assessment of Database Marketing Models 12.1 Accuracy for Response Model 12.2 Accuracy for Profit Model 12.3 Decile Analysis and Cum Lift for Response Model 12.4 Decile Analysis and Cum Lift for Profit Model 12.5 Precision for Response Model 12.6 Construction of SWMAD 12.7 Separability for Response and Profit Models 12.8 Guidelines for Using Cum Lift, HL/SWMAD and CV 12.9 Summary
Chapter 13 - Bootstrapping in Database Marketing: A New Approach For Validating Models 13.1 Traditional Model Validation 13.2 Illustration 13.3 Three Questions 13.4 The Bootstrap 13.4.1 Traditional Construction of Confidence Intervals 13.5 How To Bootstrap 13.5.1 Simple Illustration 13.6 Bootstrap Decile Analysis Validation 13.7 Another Question 13.8 Bootstrap Assessment of Model Implementation Performance 13.8.1 Illustration 13.9 Bootstrap Assessment of Model Efficiency 13.10 Summary 13.11 Reference
Chapter 14 - Visualization of Database Models 14.1 Brief History of the Graph 14.2 Star Graph Basics 14.2.1 Illustration 14.3 Star Graphs for Single Variables 14.4 Star Graphs for Many Variables Considered Jointly 14.5 Profile Curves Method 14.5.1 Profile Curves Basics 14.5.2 Profile Analysis 14.6 Illustration 14.6.1 Profile Curves for RESPONSE Model 14.6.2 Decile-Group Profile Curves 14.7 Summary 14.8 SAS Code for Star Graphs for Each Demographic Variable about the Deciles 14.9 SAS Code for Star Graphs for Each Decile About the Demographic Variables 14.10 SAS Code for Profile Curves: All Deciles 14.11 Reference
Chapter 15 - Genetic Modeling in Database Marketing: The GenIQ Model 15.1 What Is Optimization? 15.2 What Is Genetic Modeling? 15.3 Genetic Modeling: An Illustration 15.3.1 Reproduction 15.3.2 Crossover 15.3.3 Mutation 15.4 Parameters for Controlling A Genetic Model Run 15.5 Genetic Modeling: Strengths and Limitations 15.6 Goals of Modeling in Database Marketing 15.7 The GenIQ Response Model 15.8 The GenIQ Profit Model 15.9 Case Study - Response Model 15.10 Case Study - Profit Model 15.11 Summary 15.12 Reference
Chapter 16 - Finding The Best Variables For Database Marketing Models 16.1 Background 16.2 Weakness in the Variable Selection Methods 16.3 Goals of Modeling In Database Marketing 16.4 Variable Selection With GenIQ 16.4.1 GenIQ Modeling 16.4.2 GenIQ-Structure Identification 16.4.3 GenIQ Variable Selection 16.5 Nonlinear Alternative To Logistic Regression Model 16.6 Summary 16.7 Reference
Chapter 17 - Interpretation of Coefficient-free Models 17.1 The Linear Regression Coefficient 17.1.2 Illustration for the Simple Ordinary Regression Model 17.2 The Quasi-Regression Coefficient for Simple Regression Models 17.2.1 Illustration of Quasi-RC for the Simple Ordinary Regression Model 17.2.2 Illustration of Quasi-RC for the Simple Logistic Regression Model 17.2.3 Illustration of Quasi-RC for Nonlinear Predictions 17.3 Partial Quasi-RC for The Everymodel 17.3.1 Calculating the Partial Quasi-RC for The Everymodel 17.3.2 Illustration for the Multiple Logistic Regression Model 17.4 Quasi-RC for A Coefficient-free Model 17.4.1 Illustration of Quasi-RC for a Coefficient-free Model 17.5 Summary
For more information about the book, call Bruce Ratner at 516.791.3544 or 1 800 DM STAT-1, or e-mail br@dmstat1.com.