Data Analysis Techniques

Describing how to do data analysis is not easy. There is really no recipe to follow. There are some basic techniques and principles to learn. There are some pitfalls to avoid. But beyond that there is just the task of careful and meticulous thought and imagination. There is no formula for that.

It is important, at this stage, to recruit the help and opinions of insiders in the population under study. Their observations will come from the world-view of the respondents. This may yield explanations quite different from those of an outsider.

One purpose of data analysis is to discern the factors and causes which underlie the phenomena we observe. Why are these churches growing while the others are not? Why are some conference participants motivated to change their behavior and others not? Why is this people group comparatively receptive to the Gospel while the other is not? Why do some people drop out of church after a certain period while others do not? Why is this denomination able to plant churches in a given area while another is not?

Measurement of Social Phenomena

One of the more interesting problems of social research is the measurement of social realities. It is one thing to measure how many calories of heat are required to bring a beaker of water to a boil; it is quite another to measure the impact of economic class on religious commitment. How can religious commitment be measured? Another problem arises when definitions are not clear. How can we estimate the percentage of evangelicals when the term evangelical means different things to different people?

Researchers are forced to devise indicators which approximate the measures they are really interested in. An indicator for religious commitment might use church membership, church attendance, tithing or some combination of these as a proxy for "religious commitment." The problem of unclear definitions can be addressed by the researcher clearly specifying the meaning of the terms being used. In the case of defining evangelical, the researcher may avoid a theological definition by listing the denominations that are included within the evangelical population. There will always be disagreement about who is and who isn't an evangelical, but the researcher cannot let that stand in the way of getting the information needed.

It is helpful to keep several basic concepts of social research in mind while doing data analysis.

1. Functions and Randomness

The purpose of a research project is often to discover the causes of a given phenomenon. Some phenomena are so closely related that one variable can be predicted exactly from knowledge of another. If I know that a jet is traveling at a steady 500 miles an hour, then I also know that in three hours that jet will cover 1500 miles. This is called a functional relationship; the distance traveled is a function of the speed.

Most phenomena we observe, however, have an element of randomness which obscures any functional relationship which may exist. For example, there is some relationship between a person's height and their weight. But it is not a functional relationship. Sampling and statistical techniques have been developed to help the researcher see through the randomness and to discern if there is some underlying relationship between variables.
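The contrast can be sketched in a few lines of Python. The distance function reproduces the jet example exactly. The height and weight figures are invented purely for illustration, and the Pearson correlation coefficient is one common summary of how strong a non-functional relationship is; it is an assumption of this sketch, not a technique prescribed by the manual.

```python
import math

def distance_miles(speed_mph, hours):
    """Functional relationship: distance is fully determined by speed and time."""
    return speed_mph * hours

def pearson_r(xs, ys):
    """Pearson correlation: +1 or -1 would mean a perfect straight-line relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# The jet example: speed and time give the distance exactly.
print(distance_miles(500, 3))  # 1500

# Hypothetical heights (cm) and weights (kg) for eight people (made-up numbers).
heights = [150, 155, 160, 165, 170, 175, 180, 185]
weights = [52, 50, 61, 58, 70, 66, 80, 77]

# Taller people tend to be heavier, but height does not fix weight exactly,
# so the coefficient comes out positive yet below 1.
r = pearson_r(heights, weights)
print(round(r, 2))
```

A coefficient close to 1 in a small made-up sample like this says only that the two measurements move together; it does not say why.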

2. Dependent and Independent Variables

The variable which a researcher has set out to understand, say church growth rate, is called the dependent variable. Other variables which are thought to influence it, such as receptivity of the people, the various methods of evangelism used, the availability of trained leadership, and the underlying spiritual realities, are called independent variables. The researcher's hypothesis is that church growth "depends" on these other variables. The intention is to find out how strong the influence of each independent variable on church growth is, and in which direction it works.

3. Explanatory and Extraneous Variables

It is important for the researcher to keep in mind that his survey or experiment will not be performed in a laboratory where conditions can be perfectly controlled. There will be many factors which may influence what happens and may even thwart the purpose of the research. The researcher first hypothesizes which variables have influence on which other variables. These independent and dependent variables are called explanatory variables. They are the object of the research project. In addition, he must anticipate what extraneous factors may be at work. Some of these extraneous factors can be accounted for, some cannot. For example, in a study to test the hypothesis "women have a different degree of religious commitment than men" the explanatory variables obviously include sex and an indicator of religious commitment. But are there other factors which need to be accounted for? How about age, education level, social class or birth order? These things can be accounted for simply by obtaining measurements about these factors for each respondent. But what if there is some significant influence stemming from nearly forgotten childhood experiences? How could those be accounted for? They could not. In the case of such uncontrollable extraneous factors the researcher may treat them as randomized errors, part of the sampling error.

4. Categorical and Continuous Variables

Variables can be of two types. A categorical variable can take on a finite number of discrete values. A continuous variable can take on an infinite number of values. Sometimes continuous variables are transformed into categorical variables. Examples of categorical variables are gender, religion, occupation, and membership. Examples of continuous variables are age, temperature, growth rate, and salary. Age could be transformed into a categorical variable with discrete values indicating 0 to 4 years, 5 to 9 years, etc. Independent and dependent variables can be either categorical or continuous. Care should be taken to use the right statistical technique depending on what kind of variables are being used to predict what. (See Alreck and Settle for more on this.)
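Turning age into a categorical variable can be sketched as follows. The five-year width and the bracket labels are assumptions of this sketch; the brackets are deliberately non-overlapping so that every age falls into exactly one category.

```python
def age_bracket(age, width=5):
    """Map an age in years to a non-overlapping bracket label such as '5-9'."""
    if age < 0:
        raise ValueError("age cannot be negative")
    lower = (int(age) // width) * width
    upper = lower + width - 1
    return f"{lower}-{upper}"

ages = [3, 5, 9, 10, 27]  # hypothetical respondents
print([age_bracket(a) for a in ages])
# ['0-4', '5-9', '5-9', '10-14', '25-29']
```

Wider brackets lose detail but give more respondents per category, which matters when the sample is small.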

Qualitative and Quantitative Analysis

It is possible to distinguish qualitative and quantitative analysis. Quantitative analysis extracts information from numerical measurements of the sample. Counting, averaging, comparing, tabulating and the use of any statistical technique are examples of quantitative analysis. Quantitative analysis requires interpretation. Interpretation of quantitative analysis gets into qualitative analysis. Qualitative analysis has to do with the discovery of patterns, paradigms, categories and relationships. Qualitative analysis is built on quantitative analysis. Some data gathered using observation or interview will probably not be amenable to quantitative analysis. Instead, the researcher will need to call forth objectivity and clear thinking and the guidance of the Holy Spirit as he or she attempts to see through the evidence to the categories and relationships underlying the phenomena in question.

For example, in a study of 10 growing churches in Manila, a researcher may have notes from ten interviews with the respective pastors of the churches. There may be some tabulation possible but since the sample is so small it is unlikely that any statistical inferences could be made. There probably won’t be much quantitative work possible here unless members of the church were surveyed. Still, it will be possible for the researcher to notice commonalities between the ten churches: their programs, their pastors, the demographic profiles of the congregations, other features. The existence of patterns will suggest the presence of causal relationships. These causal relationships are what we are looking for! (These churches grew BECAUSE the pastor was a charismatic, powerful preacher). Often these patterns suggest further questions and hypotheses which can be followed up on.

A note of caution: It is very hard to prove the existence of a causal relationship. Often all we can prove is a correlation between the supposed cause and its supposed effect. In the above example we could not really say that THE reason that the churches grew was the preaching of the pastor. Perhaps it was some other factor which was present but which we missed in our analysis. What about the growing churches with comparatively weak preaching? To prove that it was the preaching one would probably have to design an experiment to test this hypothesis. Very tough to do! However, even if we can't prove a causal link between two phenomena, the mere fact of correlation can be an important finding. Just be careful not to say, "A caused B." Instead, say, "A seems to be associated with B."

The Quantitative Data Analysis Process

Data gathered using surveys or experimentation requires the use of quantitative analysis. Quantitative analysis is the domain of statistics. A statistic is a single number which captures important information contained in a whole bunch of other numbers. For example, the average is a single number which captures the central tendency of a large group of numbers.

Steps in quantitative data analysis:

  1. Cleaning/Verifying the raw data
  2. Categorizing and coding
  3. Data entry
  4. Tabulation and application of statistical methods

1. Cleaning/Verifying the Raw Data

Before any analysis begins, the raw data has to be verified or cleaned. This entails checking to see if all the questions have been answered. Are the answers clearly indicated? Are there any duplicates? Are there missing values which need to be filled in? Are there any answers so far out of line that they should be thrown out? If it is pretty clear that the respondent misunderstood the question, the answer is invalid and should be left out.
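These checks can be partly automated once the answers are in a computer file. The sketch below is a minimal illustration: the record layout, the allowed answers and the plausible age limits are all hypothetical, and the function only flags problems for a person to review rather than deleting anything.

```python
# Hypothetical raw records: respondent id, age, and the "In school?" answer.
records = [
    {"id": 1, "age": 12, "in_school": "Yes"},
    {"id": 2, "age": None, "in_school": "No"},        # missing age
    {"id": 3, "age": 140, "in_school": "Part-time"},  # implausible age
    {"id": 3, "age": 140, "in_school": "Part-time"},  # duplicate of id 3
    {"id": 4, "age": 15, "in_school": "Maybe"},       # not an allowed answer
]

ALLOWED_ANSWERS = {"Yes", "No", "Part-time"}
AGE_RANGE = (0, 110)  # assumed plausible limits; adjust to the population

def find_problems(rows):
    """Return (respondent id, description) pairs for a person to review."""
    problems = []
    seen_ids = set()
    for row in rows:
        rid = row["id"]
        if rid in seen_ids:
            problems.append((rid, "duplicate record"))
            continue
        seen_ids.add(rid)
        if row["age"] is None:
            problems.append((rid, "missing age"))
        elif not AGE_RANGE[0] <= row["age"] <= AGE_RANGE[1]:
            problems.append((rid, "age out of range"))
        if row["in_school"] not in ALLOWED_ANSWERS:
            problems.append((rid, "unclear answer"))
    return problems

for rid, issue in find_problems(records):
    print(rid, issue)
```

Whether a flagged answer is corrected, left blank or thrown out remains a judgment for the researcher.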

2. Categorizing and Coding

This involves identifying the classes into which each answer for each categorical question will fall. The set of categories which make up possible answers should be exhaustive. They should be mutually exclusive and should represent only one dimension or characteristic. Be careful about allowing bias to enter as responses are assigned to categories.

    Examples of Data Categories
    Gender: Male, Female
    Religion: Buddhist, Muslim, Christian, etc.
    In School: Yes, No, Part-time

If computers will be used to tabulate data, it may be necessary to assign a code for each category for each question. The document which shows the meaning of the codes is called the code book. The code book can be prepared after the pretest of the questionnaire and updated if necessary after the questionnaires have been filled out.
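A code book can be as simple as a lookup table. The sketch below assumes numeric codes and a separate code for blank or unusable answers; the specific numbers are hypothetical, and any consistent scheme would serve.

```python
# Hypothetical code book: question name -> {category label: numeric code}.
CODE_BOOK = {
    "gender": {"Male": 1, "Female": 2},
    "in_school": {"Yes": 1, "No": 2, "Part-time": 3},
}
MISSING = 9  # assumed code for a blank or unusable answer

def encode(question, answer):
    """Translate a written answer into its code, or MISSING if it is not listed."""
    return CODE_BOOK[question].get(answer, MISSING)

def decode(question, code):
    """Translate a code back into its label; returns None for MISSING or unknown codes."""
    for label, value in CODE_BOOK[question].items():
        if value == code:
            return label
    return None

print(encode("in_school", "Part-time"))  # 3
print(decode("gender", 2))               # Female
```

Keeping the code book in one place means the same table drives both data entry and the labels printed in the final tabulations.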

3. Data Entry

In large projects a computer will be essential. Data entry is the process of encoding answers from the questionnaires into a computer file to facilitate analysis. This needs to be done carefully since errors can easily creep in at this stage. The data should be printed out and verified to ensure that the actual answers from the questionnaires are encoded correctly. It is very helpful to decide on the method of data entry (for example, which software you are going to use) before completing the final form of the questionnaire. Test the form for ease of data entry in addition to testing it with the target audience.

4. Tabulation and Application of Statistical Methods

Tabulation is the process of finding out how many of the observations fall into the various categories. For example:

    Example of tabulation: Question #5: In school?
    Yes 25
    No 15
    Part time 10
    Total 50

Cross tabulation is often very helpful in discovering patterns in the data. For example:

    Example of cross tabulation: Question #5: In school?
    Category    Boys  Girls  Total
    Yes            8     17     25
    No             5     10     15
    Part time      9      1     10
    Total         22     28     50
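Both kinds of table can be produced by counting. The sketch below rebuilds one record per respondent from the cross-tab counts above and then tallies them with Python's standard `collections.Counter`; the variable names are illustrative only.

```python
from collections import Counter

# Rebuild one (answer, sex) record per respondent from the cross-tab counts.
cell_counts = {
    ("Yes", "Boys"): 8, ("Yes", "Girls"): 17,
    ("No", "Boys"): 5, ("No", "Girls"): 10,
    ("Part time", "Boys"): 9, ("Part time", "Girls"): 1,
}
respondents = [cell for cell, n in cell_counts.items() for _ in range(n)]

# Simple tabulation: count answers regardless of sex.
tabulation = Counter(answer for answer, _sex in respondents)

# Cross tabulation: count each (answer, sex) combination.
crosstab = Counter(respondents)

print(f"{'Category':<10}{'Boys':>6}{'Girls':>7}{'Total':>7}")
for answer in ("Yes", "No", "Part time"):
    print(f"{answer:<10}{crosstab[(answer, 'Boys')]:>6}"
          f"{crosstab[(answer, 'Girls')]:>7}{tabulation[answer]:>7}")
```

The row totals of the cross tabulation are the simple tabulation, which gives a quick consistency check between the two tables.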

From this cross-tab it would appear that girls tend to be in school more frequently than boys. This hypothesis could be tested by using the chi-squared statistic, which leads us to...
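For the cross-tab above, the chi-squared statistic can be worked out by hand or with a short script. The sketch below computes the usual Pearson statistic from the observed counts; the function name is illustrative, and the closed-form p-value is valid only because this particular table has exactly 2 degrees of freedom.

```python
import math

# Observed counts: rows are Yes / No / Part time, columns are Boys / Girls.
observed = [
    [8, 17],
    [5, 10],
    [9, 1],
]

def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Count expected in this cell if answer and sex were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

statistic, dof = chi_squared(observed)

# With exactly 2 degrees of freedom the p-value has the closed form exp(-x / 2).
# For other table sizes, look the statistic up in a chi-squared table instead.
p_value = math.exp(-statistic / 2) if dof == 2 else None

print(round(statistic, 2), dof)  # 10.74 2
```

The statistic exceeds 5.99, the 5% critical value for 2 degrees of freedom, so the answers of boys and girls differ by more than chance alone would readily explain. Two cautions apply. Most of the statistic comes from the part-time row, so the test supports "the pattern of schooling differs between boys and girls" rather than any one reading of that difference. Also, one expected count (4.4 part-time boys) falls below 5, a common rule-of-thumb minimum for this test, so the result should be read with some reserve.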

Basic Statistics

Since mission-oriented research seldom has extensive numerical data to work with, exotic statistical methods are seldom needed. A good understanding of elementary statistics is usually sufficient. A solid understanding of means, medians, modes, minimum, maximum, range, standard deviations, the normal distribution, skewness and chi-squares is a good beginning. There are many good books on statistics which can provide an understanding of these. Any good bookstore will have a section of books on statistics. A basic introductory book on statistics will cover all the topics mentioned here.
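Most of these descriptive statistics take one line each with Python's standard `statistics` module. The attendance figures below are made up for illustration.

```python
import statistics

# Hypothetical Sunday attendance figures for nine churches (made-up numbers).
attendance = [40, 55, 55, 60, 75, 80, 90, 120, 325]

summary = {
    "mean": statistics.mean(attendance),
    "median": statistics.median(attendance),
    "mode": statistics.mode(attendance),
    "minimum": min(attendance),
    "maximum": max(attendance),
    "range": max(attendance) - min(attendance),
    "std_dev": statistics.stdev(attendance),  # sample standard deviation
}
for name, value in summary.items():
    print(f"{name:>8}: {value:.1f}")
```

In this made-up sample the mean (100) sits well above the median (75) because one very large church pulls the average up. A gap like that is a simple sign of a skewed distribution, and in such cases the median often describes the "typical" church better than the mean.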

Significant Figures

It is important not to imply a high degree of precision and accuracy by the use of many figures after a decimal point when it is not justified. One may be able to calculate a people group's level on the Engel scale to four significant figures, but the instrument and the scale itself surely do not warrant such precision!
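Rounding a calculated result to a sensible number of significant figures before reporting it can be done with a small helper. The function name and the sample values are hypothetical; how many figures are justified depends on the instrument, not on the arithmetic.

```python
def round_sig(value, figures):
    """Round a number to the given count of significant figures."""
    return float(f"{value:.{figures}g}")

calculated_level = -4.3172  # hypothetical average computed from survey answers
print(round_sig(calculated_level, 2))  # -4.3
print(round_sig(1234.567, 2))          # 1200.0
```

Reporting -4.3 rather than -4.3172 tells the reader how much precision the measurement can actually bear.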

Source: OC Research Manual