2013 DataScienceforBusinessWhatYouNe

From GM-RKB
Subject Headings: Business Data Mining Task, Business Task, Data Mining Task.

Notes

Cited By

Quotes

Book Overview

Data Science for Business is a new book by Foster Provost and Tom Fawcett intended for those who need to understand data science/data mining, and those who want to develop their skill at data-analytic thinking. Data Science for Business is not a book about algorithms. Instead it presents a set of fundamental principles for extracting useful knowledge from data. These fundamental principles are the foundation for many algorithms and techniques for data mining, but also underlie the processes and methods for approaching business problems data-analytically, evaluating particular data science solutions, and evaluating general data science plans.

Design

The book builds up the reader's understanding of data science by discussing the fundamental principles in the context of business examples, and then shows specifically how the principles can provide understanding of many of the most common methods and techniques used in data science. After reading the book, the reader should be able to (i) discuss data science intelligently with data scientists and with other stakeholders, (ii) better understand proposals for data science projects and data science investments, and (iii) participate integrally in data science projects.

As one example, a fundamental principle of data science is that solutions for extracting useful knowledge from data must carefully consider the problem from the business perspective. This may sound obvious at first, but the notion underlies many choices that must be made in the process of data analytics, including problem formulation, method choice, solution evaluation, and general strategy formulation. Another fundamental principle is that some data items can give us information about other data items. This principle manifests itself throughout data science: in the basic notion of finding “correlations” among variables, in the specific design of many particular data mining procedures, and more generally as the basis for all predictive analytics.
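The second principle, that some data items carry information about others, can be illustrated with a small sketch. The example below is not taken from the book: the function name `pearson_correlation` and the two lists are hypothetical, made-up values chosen only to show how a correlation between two variables might be measured.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical, made-up data: support calls per customer and months retained.
support_calls = [0, 1, 2, 3, 4, 5]
months_retained = [24, 20, 18, 12, 9, 6]

r = pearson_correlation(support_calls, months_retained)
print(f"correlation between support calls and retention: {r:.3f}")
```

A coefficient close to -1 or +1 suggests that knowing one variable tells us a good deal about the other, while a value near 0 suggests little linear relationship. Correlation is only one simple way to quantify this idea.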

Audience

Data Science for Business is intended for business people who will be managing or working with data scientists, for developers who will be implementing data science solutions, and for aspiring data scientists. By its very nature the material is somewhat technical: the goal is to really understand data science, not to give a high-level overview. However, the book does not presume a sophisticated mathematical background, relegating the few technical details to optional "starred" sections.

Table of Contents

   Chapter 1 Introduction: Data-Analytic Thinking
       The Ubiquity of Data Opportunities
       Example: Hurricane Frances
       Example: Predicting Customer Churn
       Data Science, Engineering, and Data-Driven Decision Making
       Data Processing and “Big Data”
       From Big Data 1.0 to Big Data 2.0
       Data and Data Science Capability as a Strategic Asset
       Data-Analytic Thinking
       This Book
       Data Mining and Data Science, Revisited
       Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist
       Summary
   Chapter 2 Business Problems and Data Science Solutions
       From Business Problems to Data Mining Tasks
       Supervised Versus Unsupervised Methods
       Data Mining and Its Results
       The Data Mining Process
       Implications for Managing the Data Science Team
       Other Analytics Techniques and Technologies
       Summary
   Chapter 3 Introduction to Predictive Modeling: From Correlation to Supervised Segmentation
       Models, Induction, and Prediction
       Supervised Segmentation
       Visualizing Segmentations
       Trees as Sets of Rules
       Probability Estimation
       Example: Addressing the Churn Problem with Tree Induction
       Summary
   Chapter 4 Fitting a Model to Data
       Classification via Mathematical Functions
       Regression via Mathematical Functions
       Class Probability Estimation and Logistic “Regression”
       Example: Logistic Regression versus Tree Induction
       Nonlinear Functions, Support Vector Machines, and Neural Networks
       Summary
   Chapter 5 Overfitting and Its Avoidance
       Generalization
       Overfitting
       Overfitting Examined
       Example: Overfitting Linear Functions
       * Example: Why Is Overfitting Bad?
       From Holdout Evaluation to Cross-Validation
       The Churn Dataset Revisited
       Learning Curves
       Overfitting Avoidance and Complexity Control
       Summary
   Chapter 6 Similarity, Neighbors, and Clusters
       Similarity and Distance
       Nearest-Neighbor Reasoning
       Some Important Technical Details Relating to Similarities and Neighbors
       Clustering
       Stepping Back: Solving a Business Problem Versus Data Exploration
       Summary
   Chapter 7 Decision Analytic Thinking I: What Is a Good Model?
       Evaluating Classifiers
       Generalizing Beyond Classification
       A Key Analytical Framework: Expected Value
       Evaluation, Baseline Performance, and Implications for Investments in Data
       Summary
   Chapter 8 Visualizing Model Performance
       Ranking Instead of Classifying
       Profit Curves
       ROC Graphs and Curves
       The Area Under the ROC Curve (AUC)
       Cumulative Response and Lift Curves
       Example: Performance Analytics for Churn Modeling
       Summary
   Chapter 9 Evidence and Probabilities
       Example: Targeting Online Consumers With Advertisements
       Combining Evidence Probabilistically
       Applying Bayes’ Rule to Data Science
       A Model of Evidence “Lift”
       Example: Evidence Lifts from Facebook "Likes"
       Summary
   Chapter 10 Representing and Mining Text
       Why Text Is Important
       Why Text Is Difficult
       Representation
       Example: Jazz Musicians
       * The Relationship of IDF to Entropy
       Beyond Bag of Words
       Example: Mining News Stories to Predict Stock Price Movement
       Summary
   Chapter 11 Decision Analytic Thinking II: Toward Analytical Engineering
       Targeting the Best Prospects for a Charity Mailing
       Our Churn Example Revisited with Even More Sophistication
   Chapter 12 Other Data Science Tasks and Techniques
       Co-occurrences and Associations: Finding Items That Go Together
       Profiling: Finding Typical Behavior
       Link Prediction and Social Recommendation
       Data Reduction, Latent Information, and Movie Recommendation
       Bias, Variance, and Ensemble Methods
       Data-Driven Causal Explanation and a Viral Marketing Example
       Summary
   Chapter 13 Data Science and Business Strategy
       Thinking Data-Analytically, Redux
       Achieving Competitive Advantage with Data Science
       Sustaining Competitive Advantage with Data Science
       Attracting and Nurturing Data Scientists and Their Teams
       Examine Data Science Case Studies
       Be Ready to Accept Creative Ideas from Any Source
       Be Ready to Evaluate Proposals for Data Science Projects
       A Firm’s Data Science Maturity
   Chapter 14 Conclusion
       The Fundamental Concepts of Data Science
       What Data Can’t Do: Humans in the Loop, Revisited
       Privacy, Ethics, and Mining Data About Individuals
       Is There More to Data Science?
       Final Example: From Crowd-Sourcing to Cloud-Sourcing
       Final Words

References

Key: 2013 DataScienceforBusinessWhatYouNe
Author: Foster Provost, Tom Fawcett
Title: Data Science for Business: What You Need to Know About Data Mining and Data-analytic Thinking