Description
The world has suddenly become awash in data! A great many popular books have been written recently that extol “big data” and the information derived for decision makers. These data are considered “big” because a certain “catalog” of data may be so large that traditional ways of managing and analyzing such information cannot easily accommodate it. The data originate from you and me whenever we use certain social media, or make purchases online, or have information derived from us through radio frequency identification (RFID) readers attached to clothing and cars, even implanted in animals, and so on. The result is a massive avalanche of information that exists for business leaders, decision makers, and researchers to use for predicting related behaviors and attitudes.
BIG DATA ANALYSIS
Decision makers are trying to figure out how to manage and use the information available. Typical computer software used for statistical decision making is currently limited to a number of cases far below what big data make available. A traditional approach to this issue is known as “data mining,” in which a number of techniques, including statistics, are used to discover patterns in a large set of data.
Researchers may be overjoyed with the availability of such rich data, but it provides both opportunities and challenges. On the opportunity side, never before have such large amounts of information been available to help researchers and policy makers understand widespread public thinking and behavior. On the challenge side, however, are several difficult questions:
• How are such data to be examined?
• Do current social science methods and processes provide guidance to examining data sets that surpass historical data-gathering capacity?
• Are big data representative?
• Do data sets so large obviate the need for probability-based research analyses?
• Do decision makers understand how to use social science methodology to assist in their analyses of emerging data?
• Will the decisions emerging from big data be used ethically, within the context of social science research guidelines?
• Will effect size considerations overshadow questions of significance testing?
Social scientists can rely on existing statistical methods to manage and analyze big data, but the way in which the analyses are used for decision making will change. One trend is that prediction may be hailed as a more prominent method for understanding the data than traditional hypothesis testing. We will have more to say about this distinction later in the book, but it is important at this point to see that researchers will need to adapt statistical approaches for analyzing big data.
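One of the questions listed above, whether effect size considerations will overshadow significance testing, can be illustrated with a small calculation. The following is a minimal sketch, not a method taken from this book: it assumes a two-sample z-test with equal group sizes, and the function name `two_sample_z_p_value` and the standardized difference of 0.02 are hypothetical choices made only for illustration. It holds the effect size fixed and lets the number of cases grow.

```python
import math


def two_sample_z_p_value(effect_size_d, n_per_group):
    """Two-sided p-value for a two-sample z-test with equal group sizes.

    effect_size_d is the standardized mean difference (Cohen's d),
    so the test statistic is z = d * sqrt(n / 2).
    """
    z = effect_size_d * math.sqrt(n_per_group / 2)
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical, very small standardized difference between two groups.
D = 0.02

# Same effect size, increasingly "big" samples.
results = {n: two_sample_z_p_value(D, n) for n in (100, 10_000, 1_000_000)}

for n, p in results.items():
    print(f"n per group = {n:>9,}  d = {D}  p = {p:.4g}")
```

Under these assumptions, the same tiny difference is not statistically significant with a hundred or even ten thousand cases per group, yet it becomes overwhelmingly significant with a million. The effect itself never changes, which is why the size of an effect, and not only its significance, matters when data sets are very large.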
VISUAL DATA ANALYSIS
Another emerging trend for understanding and managing the swell of data is the use of visuals. Of course, visual descriptions of data have been used for centuries. It is commonly acknowledged that the first “pie chart” was published by Playfair (1801). Playfair’s example in Figure 1.1 compares the dynamics of nations over time.
Figure 1.1 compares nations using size, color, and orientation over time. This method of comparing information has been useful for revealing patterns in data that are not readily observable from numerical analysis.
As with numerical methods, however, there are opportunities and challenges in the use of visual analyses:
• Can visual means be used to convey complex meaning?
• Are there “rules” that will help to ensure a standard way of creating, analyzing, and interpreting such visual information?
• Will visual analyses become divorced from numerical analysis so that observers have no way of objectively confirming the meaning of the images?