Description
The explosion in the development of methods for analyzing categorical data that began in the 1960s has continued apace in recent years. This book provides an overview of these methods, as well as older, now standard, methods. It gives special emphasis to generalized linear modeling techniques, which extend linear model methods for continuous variables, and their extensions for multivariate responses.
Today, because of this development and the ubiquity of categorical data in applications, most statistics and biostatistics departments offer courses on categorical data analysis. This book can be used as a text for such courses.
The material in Chapters 1–7 forms the heart of most courses. Chapters 1–3 cover distributions for categorical responses and traditional methods for two-way contingency tables. Chapters 4–7 introduce logistic regression and related logit models for binary and multicategory response variables. Chapters 8 and 9 cover loglinear models for contingency tables. Over time, this model class seems to have lost importance, and this edition somewhat reduces its discussion of it and expands its focus on logistic regression.
In the past decade, the major area of new research has been the development of methods for repeated measurement and other forms of clustered categorical data. Chapters 10–13 present these methods, including marginal models and generalized linear mixed models with random effects. Chapters 14 and 15 present theoretical foundations as well as alternatives to the maximum likelihood paradigm that this text adopts. Chapter 16 is devoted to a historical overview of the development of the methods. It examines contributions of noted statisticians, such as Pearson and Fisher, whose pioneering efforts, and sometimes vocal debates, broke the ground for this evolution.
Every chapter of the first edition has been extensively rewritten, and some substantial additions and changes have occurred. The major differences are:
– A new Chapter 1 that introduces distributions and methods of inference for categorical data.
– A unified presentation of models as special cases of generalized linear models, starting in Chapter 4 and then throughout the text.
– Greater emphasis on logistic regression for binary response variables and extensions for multicategory responses, with Chapters 4–7 introducing models and Chapters 10–13 extending them for clustered data.
– Three new chapters on methods for clustered, correlated categorical data, increasingly important in applications.
– A new chapter on the historical development of the methods.
– More discussion of “exact” small-sample procedures and of conditional logistic regression.
In this text, I interpret categorical data analysis to refer to methods for categorical response variables. For most methods, explanatory variables can be qualitative or quantitative, as in ordinary regression. Thus, the focus is intended to be more general than contingency table analysis, although for simplicity of data presentation, most examples use contingency tables. These examples are often simplistic, but should help readers focus on understanding the methods themselves and make it easier for them to replicate results with their favorite software.
Special features of the text include:
– More than 100 analyses of “real” data sets.
– More than 600 exercises at the end of the chapters, some directed towards theory and methods and some towards applications and data analysis.
– An appendix that shows, by chapter, the use of SAS for performing analyses presented in this book.
– Notes at the end of each chapter that provide references for recent research and many topics not covered in the text.