1998 AppliedMultivariateStatisticalA

From GM-RKB

Subject Headings:

Notes

Cited By

Quotes

1 ASPECTS OF MULTIVARIATE ANALYSIS 1

1.1 Introduction 1

1.2 Applications of Multivariate Techniques 3

1.3 The Organization of Data 5

Arrays, 5
Descriptive Statistics, 6
Graphical Techniques, 11

1.4 Data Displays and Pictorial Representations 19

Linking Multiple Two-Dimensional Scatter Plots, 20
Graphs of Growth Curves, 24
Stars, 25
Chernoff Faces, 28

1.5 Distance 30

1.6 Final Comments 38

2 MATRIX ALGEBRA AND RANDOM VECTORS 50

  • 2.1 Introduction 50
  • 2.2 Some Basics of Matrix and Vector Algebra 50

Vectors, 50
Matrices, 55

  • 2.3 Positive Definite Matrices 61
  • 2.4 A Square-Root Matrix 66
  • 2.5 Random Vectors and Matrices 67
  • 2.6 Mean Vectors and Covariance Matrices 68

Partitioning the Covariance Matrix, 74
The Mean Vector and Covariance Matrix for Linear Combinations of Random Variables, 76
Partitioning the Sample Mean Vector and Covariance Matrix, 78

  • 2.7 Matrix Inequalities and Maximization 79

Supplement 2A: Vectors and Matrices: Basic Concepts 84

Vectors, 84
Matrices, 89

3 SAMPLE GEOMETRY AND RANDOM SAMPLING 112

  • 3.1 Introduction 112
  • 3.2 The Geometry of the Sample 112
  • 3.3 Random Samples and the Expected Values of the Sample Mean and Covariance Matrix 120

  • 3.4 Generalized Variance 124

Situations in which the Generalized Sample Variance Is Zero, 130
Generalized Variance Determined by |R| and Its Geometrical Interpretation, 136
Another Generalization of Variance, 138

  • 3.5 Sample Mean, Covariance, and Correlation As Matrix Operations 139

  • 3.6 Sample Values of Linear Combinations of Variables 141

4 THE MULTIVARIATE NORMAL DISTRIBUTION 149

  • 4.1 Introduction 149
  • 4.2 The Multivariate Normal Density and Its Properties 149

Additional Properties of the Multivariate Normal Distribution, 156

  • 4.3 Sampling from a Multivariate Normal Distribution and Maximum Likelihood Estimation 168

The Multivariate Normal Likelihood, 168
Maximum Likelihood Estimation of μ and Σ, 170
Sufficient Statistics, 173

  • 4.4 The Sampling Distribution of X̄ and S 173

Properties of the Wishart Distribution, 174

  • 4.5 Large-Sample Behavior of X̄ and S 175
  • 4.6 Assessing the Assumption of Normality 177

Evaluating the Normality of the Univariate Marginal Distributions, 178
Evaluating Bivariate Normality, 183

  • 4.7 Detecting Outliers and Cleaning Data 189

Steps for Detecting Outliers, 190

  • 4.8 Transformations To Near Normality 194

Transforming Multivariate Observations, 198


5 INFERENCES ABOUT A MEAN VECTOR 210

  • 5.1 Introduction 210
  • 5.2 The Plausibility of μ₀ as a Value for a Normal Population Mean 210

  • 5.3 Hotelling's T² and Likelihood Ratio Tests 216

General Likelihood Ratio Method, 219

  • 5.4 Confidence Regions and Simultaneous Comparisons of Component Means 220

Simultaneous Confidence Statements, 223
A Comparison of Simultaneous Confidence Intervals with One-at-a-Time Intervals, 229
The Bonferroni Method of Multiple Comparisons, 232

  • 5.5 Large Sample Inferences about a Population Mean Vector 234
  • 5.6 Multivariate Quality Control Charts 239

Charts for Monitoring a Sample of Individual Multivariate Observations for Stability, 241
Control Regions for Future Individual Observations, 247
Control Ellipse for Future Observations, 248
T²-Chart for Future Observations, 248
Control Charts Based on Subsample Means, 249
Control Regions for Future Subsample Observations, 251

  • 5.7 Inferences about Mean Vectors when Some Observations Are Missing 252

  • 5.8 Difficulties Due to Time Dependence in Multivariate Observations 256

Supplement 5A: Simultaneous Confidence Intervals and Ellipses as Shadows of the p-Dimensional Ellipsoids 258

6 COMPARISONS OF SEVERAL MULTIVARIATE MEANS 272

  • 6.1 Introduction 272
  • 6.2 Paired Comparisons and a Repeated Measures Design 272

Paired Comparisons, 272
A Repeated Measures Design for Comparing Treatments, 278

  • 6.3 Comparing Mean Vectors from Two Populations 283

Assumptions Concerning the Structure of the Data, 283
Further Assumptions when n₁ and n₂ Are Small, 284
Simultaneous Confidence Intervals, 287
The Two-Sample Situation when Σ₁ ≠ Σ₂, 290

  • 6.4 Comparing Several Multivariate Population Means (One-Way MANOVA) 293

Assumptions about the Structure of the Data for One-Way MANOVA, 293
A Summary of Univariate ANOVA, 293
Multivariate Analysis of Variance (MANOVA), 298

  • 6.5 Simultaneous Confidence Intervals for Treatment Effects 305
  • 6.6 Two-Way Multivariate Analysis of Variance 307

Univariate Two-Way Fixed-Effects Model with Interaction, 307
Multivariate Two-Way Fixed-Effects Model with Interaction, 309

  • 6.7 Profile Analysis 318
  • 6.8 Repeated Measures Designs and Growth Curves 323
  • 6.9 Perspectives and a Strategy for Analyzing Multivariate Models 327

7 MULTIVARIATE LINEAR REGRESSION MODELS 354

7.1 Introduction 354

Regression analysis is the statistical methodology for predicting values of one or more response (dependent) variables from a collection of [[predictor (independent) variable value]]s. It can also be used for assessing the effects of the predictor variables on the responses. Unfortunately, the name regression, culled from the title of the first paper on the subject by F. Galton [13], in no way reflects either the importance or breadth of application of this methodology. …

7.2 The Classical Linear Regression Model 354

Let [math]\displaystyle{ z_1, z_2, ..., z_r }[/math] be [math]\displaystyle{ r }[/math] predictor variables thought to be related to a response variable [math]\displaystyle{ Y }[/math].
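
As a rough illustration of the setup quoted above, the sketch below assumes the usual linear form Y = β0 + β1 z1 + ... + βr zr + ε, which the truncated quote does not spell out, and estimates the coefficients by least squares (the topic of section 7.3 in the outline). The function names, the toy data, and the choice of solving the normal equations by Gaussian elimination are all assumptions made for this example, not the book's own presentation.

```python
# Illustrative sketch only: fit Y = b0 + b1*z1 + ... + br*zr by least squares.
# All names (solve, fit_least_squares, the toy data) are hypothetical.

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations act on both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(z_rows, y):
    """Return [b0, b1, ..., br] minimizing the sum of squared residuals."""
    # Design matrix: a leading 1 for the intercept, then the r predictor values.
    design = [[1.0] + list(row) for row in z_rows]
    k = len(design[0])
    # Normal equations (Z'Z) b = Z'y.
    ztz = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    zty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(ztz, zty)


# Toy data generated exactly from Y = 1 + 2*z1 - 3*z2 (no error term).
z_rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1 + 2 * z1 - 3 * z2 for z1, z2 in z_rows]
coefficients = fit_least_squares(z_rows, y)
```

Because the toy responses contain no error term, the fitted coefficients should reproduce the generating values 1, 2, and -3 up to floating-point rounding. With real data the residuals would be nonzero, and a numerically safer routine than the normal equations would usually be preferred.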

7.3 Least Squares Estimation 358

Sum-of-Squares Decomposition, 360
Geometry of Least Squares, 361
Sampling Properties of Classical Least Squares Estimators, 363

7.4 Inferences About the Regression Model 365

Inferences Concerning the Regression Parameters, 365
Likelihood Ratio Tests for the Regression Parameters, 370

7.5 Inferences from the Estimated Regression Function 374

Estimating the Regression Function at z0, 374
Forecasting a New Observation at z0, 375

7.6 Model Checking and Other Aspects of Regression 377

Does the Model Fit?, 377
Leverage and Influence, 380
Additional Problems in Linear Regression, 380

7.7 Multivariate Multiple Regression 383

Likelihood Ratio Tests for Regression Parameters, 392
Other Multivariate Test Statistics, 395
Predictions from Multivariate Multiple Regressions, 395

7.8 The Concept of Linear Regression 398

Prediction of Several Variables, 403
Partial Correlation Coefficient, 406

7.9 Comparing the Two Formulations of the Regression Model 407

Mean Corrected Form of the Regression Model, 407
Relating the Formulations, 409

7.10 Multiple Regression Models with Time Dependent Errors 410

Supplement 7A
The Distribution of the Likelihood Ratio
for the Multivariate Multiple Regression Model 415

8 PRINCIPAL COMPONENTS 426

  • 8.1 Introduction 426
  • 8.2 Population Principal Components 426

Principal Components Obtained from Standardized Variables, 432
Principal Components for Covariance Matrices with Special Structures, 435

  • 8.3 Summarizing Sample Variation by Principal Components 437

The Number of Principal Components, 440
Interpretation of the Sample Principal Components, 444
Standardizing the Sample Principal Components, 445

  • 8.4 Graphing the Principal Components 450
  • 8.5 Large Sample Inferences 452

Large Sample Properties of λ̂_i and ê_i, 452
Testing for the Equal Correlation Structure, 453

  • 8.6 Monitoring Quality with Principal Components 455

Checking a Given Set of Measurements for Stability, 455
Controlling Future Values, 459

Supplement 8A: The Geometry of the Sample Principal Component Approximation 462

The p-Dimensional Geometrical Interpretation, 464
The n-Dimensional Geometrical Interpretation, 465

9 FACTOR ANALYSIS AND INFERENCE FOR STRUCTURED COVARIANCE MATRICES 477

9.1 Introduction 477

9.2 The Orthogonal Factor Model 478

9.3 Methods of Estimation 484

The Principal Component (and Principal Factor) Method, 484
A Modified Approach — the Principal Factor Solution, 490
The Maximum Likelihood Method, 492
A Large Sample Test for the Number of Common Factors, 498

9.4 Factor Rotation 501

Oblique Rotations, 509

9.5 Factor Scores 510

The Weighted Least Squares Method, 511
The Regression Method, 513

9.6 Perspectives and a Strategy for Factor Analysis 517

9.7 Structural Equation Models 524

The LISREL Model, 525
Construction of a Path Diagram, 525
Covariance Structure, 526
Estimation, 527
Model-Fitting Strategy, 529
Recommended Computational Scheme, 531
Maximum Likelihood Estimators of ρ = L_z L_z′ + Ψ_z, 532

10 CANONICAL CORRELATION ANALYSIS 543

10.1 Introduction 543

10.2 Canonical Variates and Canonical Correlations 543

10.3 Interpreting the Population Canonical Variables 551

Identifying the Canonical Variables, 551
Canonical Correlations as Generalizations of Other Correlation Coefficients, 553
The First r Canonical Variables as a Summary of Variability, 554
A Geometrical Interpretation of the Population Canonical Correlation Analysis, 555

10.4 The Sample Canonical Variates and Sample Canonical Correlations 556

10.5 Additional Sample Descriptive Measures 564

Matrices of Errors of Approximations, 564
Proportions of Explained Sample Variance, 567

10.6 Large Sample Inferences 569

11 DISCRIMINATION AND CLASSIFICATION 581

11.1 Introduction 581

11.2 Separation and Classification for Two Populations 582

11.3 Classification with Two Multivariate Normal Populations 590

Classification of Normal Populations When Σ₁ = Σ₂ = Σ, 590
Scaling, 595
Classification of Normal Populations When Σ₁ ≠ Σ₂, 596

11.4 Evaluating Classification Functions 598

11.5 Fisher's Discriminant Function — Separation of Populations 609

11.6 Classification with Several Populations 612

The Minimum Expected Cost of Misclassification Method, 613
Classification with Normal Populations, 616

11.7 Fisher's Method for Discriminating among Several Populations 628

Using Fisher's Discriminants to Classify Objects, 635

11.8 Final Comments 641

Including Qualitative Variables, 641
Classification Trees, 641
Neural Networks, 644
Selection of Variables, 645
Testing for Group Differences, 645
Graphics, 646
Practical Considerations Regarding Multivariate Normality, 646

12 CLUSTERING, DISTANCE METHODS, AND ORDINATION 668

12.1 Introduction 668

12.2 Similarity Measures 670

Distances and Similarity Coefficients for Pairs of Items, 670
Similarities and Association Measures for Pairs of Variables, 676
Concluding Comments on Similarity, 677

12.3 Hierarchical Clustering Methods 679

Single Linkage, 681
Complete Linkage, 685
Average Linkage, 689
Ward's Hierarchical Clustering Method, 690
Final Comments — Hierarchical Procedures, 693

12.4 Nonhierarchical Clustering Methods 694

K-means Method, 694
Final Comments — Nonhierarchical Procedures, 698

12.5 Multidimensional Scaling 700

The Basic Algorithm, 700

12.6 Correspondence Analysis 709

Algebraic Development of Correspondence Analysis, 711
Inertia, 718
Interpretation in Two Dimensions, 719
Final Comments, 719

12.7 Biplots for Viewing Sampling Units and Variables 719

Constructing Biplots, 720

12.8 Procrustes Analysis: A Method for Comparing Configurations 723

Constructing the Procrustes Measure of Agreement, 724

References

Author: Richard Arnold Johnson, Dean W. Wichern
Title: Applied Multivariate Statistical Analysis
Year: 1998