# What is Statistics

When I first began to read original research papers, I would skim over the statistical part to get to the conclusions. I understood statistics enough to tell if group A and B were different, and since the rest didn’t make sense to me I skipped it. I’m a little wiser now, and have a strong base of statistical knowledge to inform my reading. Honestly, it completely changes how I read research.

**Statistics** is simply the study of numerical data – how to collect it, analyze it and interpret it. In research, we use statistical testing to determine if the results of a study show a true phenomenon or were the result of chance.

Statistics can be broken down into two broad categories – descriptive and inferential. **Descriptive** statistics let us organize and summarize the information in our data. **Inferential** statistics let us use a sample to draw conclusions about a population.

The **population** is the group of every individual you are interested in. For example, you may want to know about all the women of childbearing age in the United States. Or, you may want to know about all the women of childbearing age in the United States who were born in Mexico and primarily speak Spanish in their home. Both are populations and would be legitimate for a study. As the researcher, you define what population the study will examine.

The **sample** is the group of individuals you are able to collect information about. In inferential statistics, this group of individuals allows you to make estimates about the population.
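To make the sample and population idea concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the population of 10,000 ratings, the sample size of 30, and the variable names are all assumptions, not data from any real study.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 ratings on a 1-5 scale (made-up data).
population = [random.randint(1, 5) for _ in range(10_000)]

# The sample is the part of the population we actually manage to measure.
SAMPLE_SIZE = 30  # illustrative choice
sample = random.sample(population, k=SAMPLE_SIZE)

# In real research the population mean is usually unknown;
# the sample mean is our estimate of it.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.3f}")
print(f"sample mean:     {sample_mean:.3f}")
```

The two printed values will usually be close but not identical. That gap is the uncertainty inferential statistics tries to measure.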

**Answering a Question**

We like statistics because it helps us to answer a question. But for statistics to be useful we need to create a very specific question. The question is about the relationship between two variables. The **independent** variable is the characteristic of interest, often thought of as the **exposure**. The **dependent** variable is the outcome that depends on the independent variable.

**Birth Worker Survey**

The Birth Worker Survey can give us an example of independent and dependent variables.

Starting with the basics, we had 31 completed surveys. This is helpful statistically because 30 is a target number for being able to make assumptions about the normality of the mean of a variable, but that gets very technical and beyond what you need to know. Just be aware that all my hounding you to respond gave us a usable sample for analyzing.

Our **n (or size of the sample)** is 31. Of those, 24 women (incidentally, the respondents were all female) reported that they work now, or worked in the past, as a doula.

If you remember, the Beliefs about birth questions were rated on a scale of 1 to 5, with 1 being strongly agree and 5 being strongly disagree. Here the independent variable is whether the respondent has worked as a doula, and the dependent variable is her score on the statement that women should have a doula in labor. The simplest method to compare the two groups is to take the **mean** of the scores (the average of the scored values). When we do this, we find the doula group has a mean score of 1.912, indicating the doula group agrees that women should have a doula. The non-doula group has a mean score of 2.286, closer to an indifferent score.
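If you want to see how a group mean is calculated, here is a small Python sketch. The ratings below are hypothetical and are not the actual survey responses. They only show the arithmetic of adding up the scores and dividing by the number of respondents.

```python
import statistics

# Hypothetical ratings (1 = strongly agree, 5 = strongly disagree).
# These are NOT the real survey responses.
doula_scores = [1, 2, 2, 1, 3, 2]
non_doula_scores = [2, 3, 2, 2]

# The mean is the sum of the scores divided by how many scores there are.
doula_mean = statistics.mean(doula_scores)
non_doula_mean = statistics.mean(non_doula_scores)

print(f"doula group mean:     {doula_mean:.3f}")
print(f"non-doula group mean: {non_doula_mean:.3f}")
```

On this scale a lower mean means stronger agreement, so the group with the smaller number agrees more strongly with the statement.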

So, is the dependent variable related to the independent variable? The means were different, but we have such a small sample that using this test we don’t get a statistically significant difference. The p-value is .365, well above the conventional .05 cutoff, and the confidence interval for the difference between the means goes from -1.189 to 0.4509, a range that includes zero. These data and this test have not given evidence of a difference in belief about women having a doula in labor between doulas and other birth workers.
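Here is a short Python sketch of how to read those numbers. It does not rerun the test. It only takes the values reported above and checks them against two common rules of thumb. The .05 cutoff stored in `ALPHA` is an assumption, a widely used convention rather than something this survey specified.

```python
# Numbers reported in the post.
doula_mean = 1.912
non_doula_mean = 2.286
p_value = 0.365
ci_low, ci_high = -1.189, 0.4509

ALPHA = 0.05  # assumed cutoff; the post does not state one

# Difference between the two group means (doula minus non-doula).
difference = round(doula_mean - non_doula_mean, 3)

# If the interval for the difference includes zero, "no difference"
# remains a plausible value.
ci_includes_zero = ci_low <= 0 <= ci_high

# A p-value below the cutoff is what we call statistically significant.
is_significant = p_value < ALPHA

print(f"difference in means: {difference}")
print(f"interval includes zero: {ci_includes_zero}")
print(f"significant at {ALPHA}: {is_significant}")
```

Both checks point the same way: the interval includes zero and the p-value is above the cutoff, so this sample does not show a difference between the groups.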

If you got a little lost, don’t worry. This is just day one, and we will take time to discuss all of this later.

## Point to Remember

While statistics is helpful for identifying the difference between a true phenomenon and a random result, it is important to remember that statistics is only one of the pieces of a study’s design that help you determine if findings are valid. The math can be good, and the result can be poor simply because the wrong sample was used or the wrong data were collected. We won’t go into all the aspects of a good study this summer, but perhaps we can plan for a series on research in the future. Later this week, we will start exploring data.

#### Coming Up

In the next post we will begin to explore the variable.