Why do you need statistical calculations?

We then perform a study and find the average height of Californians to be higher by 1.

The lower the P-value, the more meaningful the result, because it is less likely to be caused by noise or random chance. The significance level, the threshold a P-value must fall below to count as significant, varies by situation and field of study, but the most commonly used value is 0.

A Z-score can be converted to a P-value and vice versa using a programming language like R, or by simpler methods like an Excel formula, an online tool, a graphing calculator, or even a simple number table called the Z-score table.
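As an illustration of that conversion, here is a minimal sketch that uses Python's standard library instead of R or Excel. The function names `z_to_p` and `p_to_z` are made up for this example, and the sketch assumes a one-tailed test that looks only at the upper tail of the standard normal curve.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_to_p(z: float) -> float:
    """Upper-tail P-value: the chance of a Z-score at least this large."""
    return 1.0 - STANDARD_NORMAL.cdf(z)


def p_to_z(p: float) -> float:
    """Z-score whose upper-tail area equals p (requires 0 < p < 1)."""
    return STANDARD_NORMAL.inv_cdf(1.0 - p)


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from the study above.
    print(f"z = 1.645 -> p = {z_to_p(1.645):.4f}")
    print(f"p = 0.05  -> z = {p_to_z(0.05):.3f}")
```

A Z-score table is essentially a precomputed version of `z_to_p` for a grid of Z values.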

For a Z-test, the normal distribution curve is used as an approximation for the distribution of the test statistic.

To carry out a Z-test, find a Z-score for your test or study and convert it to a P-value. If your P-value is lower than the significance level, you can conclude that your observation is statistically significant.
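The decision rule above can be sketched in a few lines of Python. This is only an illustration: `z_test_decision` is a hypothetical helper, the default `alpha=0.05` is an assumption rather than a value taken from this article, and the Z-score in the usage lines is invented.

```python
from statistics import NormalDist


def z_test_decision(z: float, alpha: float = 0.05, two_tailed: bool = True):
    """Return (p_value, is_significant) for a given Z-score.

    alpha=0.05 is an assumed default; pick the level your field uses.
    """
    normal = NormalDist()  # standard normal distribution
    if two_tailed:
        # Area in both tails beyond |z|.
        p_value = 2.0 * (1.0 - normal.cdf(abs(z)))
    else:
        # Area in the upper tail only.
        p_value = 1.0 - normal.cdf(z)
    return p_value, p_value < alpha


if __name__ == "__main__":
    p_value, significant = z_test_decision(2.5)  # hypothetical Z-score
    print(f"p = {p_value:.4f}, significant: {significant}")
```

Note that the same Z-score can be significant in a one-tailed test and not significant in a two-tailed one, because the two-tailed P-value is twice as large.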

Imagine we work in the admissions department at University A, located in City X. To see if our students actually perform better, we poll students to share their test scores and find out that the average is 78 points with a standard deviation of 2.

Since we are trying to prove that our students perform better on the test, our null hypothesis is that the average score of students at University A is not above the city average. We begin by calculating the Z-score for this test by subtracting the population mean (the City X average of 75) from our measured value (78) and dividing by the standard deviation (2).

This gives us a Z-score of [...]. That means that we can reject the null hypothesis.

Now that you know how to calculate statistical significance, you can apply significance testing in other places as well.

Keep in mind that you don't need to believe the null hypothesis.

Next, create an alternative hypothesis. Typically, your alternative hypothesis is the opposite of your null hypothesis since it'll state that there is, in fact, a statistically significant relationship between your data sets.

Your next step involves determining the significance level or rather, the alpha. This refers to the likelihood of rejecting the null hypothesis even when it's true. A common alpha is 0.

Next, you'll need to determine if you'll use a one-tailed test or a two-tailed test. Whereas the critical area of distribution is one-sided in a one-tailed test, it's two-sided in a two-tailed test.

In other words, one-tailed tests analyze the relationship between two variables in one direction and two-tailed tests analyze the relationship between two variables in two directions. If the sample you're using lands within the one-sided critical area, the alternative hypothesis is considered true.

You'll then need to do a power analysis to determine your sample size. A power analysis involves the effect size, sample size, significance level and statistical power.
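To show how those four quantities relate, here is a rough sketch of one common power calculation: the sample size per group for a two-tailed comparison of two group means, using the normal approximation. The function name and the defaults (`alpha=0.05`, `power=0.80`) are assumptions for illustration; dedicated calculators typically use the t distribution and may return slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size: float, alpha: float = 0.05,
                          power: float = 0.80) -> int:
    """Approximate n per group for a two-tailed, two-group comparison of means.

    Normal approximation: n = 2 * ((z_alpha + z_power) / d) ** 2, where d is
    the standardized effect size (difference in means divided by the SD).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1.0 - alpha / 2.0)  # critical value for alpha
    z_power = normal.inv_cdf(power)              # quantile for desired power
    return ceil(2.0 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):  # small, medium, large effects (illustrative)
        print(f"effect size {d}: about {sample_size_per_group(d)} per group")
```

Smaller effects need much larger samples, which is why the effect size has to be settled before data collection.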

For this step, consider using a calculator. This type of analysis shows the sample size you'll need to detect the effect of a given test within a degree of confidence. In other words, it'll let you know what sample size is suitable to determine statistical significance. For example, if your sample size ends up being too small, it won't give you an accurate result.

Next, you'll need to calculate the standard deviation using the standard deviation formula. Performing this calculation will let you know how spread out your measurements are about the mean or expected value.
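The formula itself did not survive in this text, so the sketch below assumes the usual sample standard deviation: the square root of the summed squared deviations from the mean, divided by n - 1. The function name and the data are made up; the result is compared against `statistics.stdev` from Python's standard library.

```python
from math import sqrt
from statistics import stdev


def sample_standard_deviation(values):
    """Square root of the summed squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return sqrt(squared_deviations / (n - 1))


measurements = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data
manual = sample_standard_deviation(measurements)
library = stdev(measurements)  # standard library sample SD, same definition

if __name__ == "__main__":
    print(f"manual: {manual:.4f}, statistics.stdev: {library:.4f}")
```

If your data cover the whole population rather than a sample, divide by n instead (`statistics.pstdev`).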

If you have more than one sample group, you'll also need to determine the variance between the sample groups. Next, you'll need to use the standard error formula. For our purposes, let's say you have two standard deviations for your two groups.

In contrast to a symmetric distribution, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions. Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.
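Returning to the standard error step for two groups mentioned above: the exact formula is not shown in this text, so the sketch below assumes the common unpooled form, the square root of s1²/n1 + s2²/n2. The function name and the numbers are hypothetical.

```python
from math import sqrt


def standard_error_of_difference(sd1: float, n1: int,
                                 sd2: float, n2: int) -> float:
    """Unpooled standard error of the difference between two group means."""
    if n1 <= 0 or n2 <= 0:
        raise ValueError("group sizes must be positive")
    return sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


if __name__ == "__main__":
    # Hypothetical groups: SD 3.0 with 9 people, SD 4.0 with 16 people.
    se = standard_error_of_difference(3.0, 9, 4.0, 16)
    print(f"standard error of the difference: {se:.4f}")
```

Dividing the difference between the two group means by this standard error gives the test statistic for a two-group comparison.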

Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported: the mean, the median, and the mode. However, depending on the shape of the distribution and level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.

Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported: the range, the interquartile range, the standard deviation, and the variance. Once again, the shape of the distribution and level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the best information for normal distributions.
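The measures of central tendency and variability discussed above can all be computed with Python's standard `statistics` module. The sketch below is illustrative: `describe` is a made-up helper, and the data are invented to include one extreme value so the effect of skew is visible.

```python
import statistics


def describe(values):
    """Return common measures of central tendency and variability."""
    data = sorted(values)
    q1, _q2, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "modes": statistics.multimode(data),  # may contain several values
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
        "std_dev": statistics.stdev(data),    # sample standard deviation
        "variance": statistics.variance(data),
    }


if __name__ == "__main__":
    scores = [1, 2, 2, 3, 3, 3, 4, 4, 20]  # made-up, right-skewed data
    for name, value in describe(scores).items():
        print(f"{name}: {value}")
```

With the outlier of 20 in the data, the mean is pulled above the median, which is one reason the median and interquartile range are preferred for skewed distributions.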

Using your table, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.

From this table, we can see that the mean score increased after the meditation exercise, and the variances of the two scores are comparable.

Next, we can perform a statistical test to find out if this improvement in test scores is statistically significant in the population.

Example: Descriptive statistics (correlational study)

After collecting data from students, you tabulate descriptive statistics for annual parental income and GPA. Next, we can compute a correlation coefficient and perform a statistical test to understand the significance of the relationship between the variables in the population.

Step 4: Test hypotheses or make estimates with inferential statistics

A number that describes a sample is called a statistic, while a number describing a population is called a parameter. Using inferential statistics, you can make conclusions about population parameters based on sample statistics. You can consider a sample statistic a point estimate for the population parameter when you have a representative sample.

Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not. Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true.

These tests give two main outputs: a test statistic, which tells you how much your data differ from the null hypothesis of the test, and a p value, which tells you how likely your results would be if the null hypothesis were true in the population.

Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics.

Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.

A regression models the extent to which changes in a predictor variable result in changes in outcome variable(s).

Comparison tests usually compare the means of groups. These may be the means of different groups within a sample. The z and t tests have subtypes based on the number and types of samples and the hypotheses.

The correlation coefficient r tells you the strength of a linear relationship between two quantitative variables. However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to calculate how much the correlation coefficient differs from zero in the population.
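As a sketch of that significance test, the code below computes Pearson's r by hand and then the usual t statistic for a correlation, t = r * sqrt((n - 2) / (1 - r²)), with n - 2 degrees of freedom. The function names and data are invented. Python's standard library has no t distribution, so the sketch stops at the t statistic; obtaining the p value would require a t table or a statistics package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_t_statistic(r: float, n: int) -> float:
    """t statistic for testing whether r differs from zero (df = n - 2)."""
    return r * sqrt((n - 2) / (1.0 - r ** 2))


if __name__ == "__main__":
    income = [1, 2, 3, 4, 5]  # made-up predictor values
    gpa = [2, 4, 5, 4, 5]     # made-up outcome values
    r = pearson_r(income, gpa)
    t = correlation_t_statistic(r, len(income))
    print(f"r = {r:.3f}, t = {t:.3f}, df = {len(income) - 2}")
```

Because n appears in the formula, the same r produces a larger t statistic in a larger sample, which is how the test takes sample size into account.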

You use a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you a t value (the test statistic) and a p value.

This assumption is easily met in the examples below.

The point of this example is that one or both variables may have more than two levels, and that the variables do not have to have the same number of levels.

In this example, female has two levels (male and female) and ses has three levels (low, medium and high). Please see the results from the chi squared example above.

A one-way analysis of variance (ANOVA) is used when you have a categorical independent variable with two or more categories and a normally distributed interval dependent variable and you wish to test for differences in the means of the dependent variable broken down by the levels of the independent variable.

For example, using the hsb2 data file, say we wish to test whether the mean of write differs between the three program types (prog). The results of this test show that the mean of the dependent variable differs significantly among the levels of program type. However, we do not know if the difference is between only two of the levels or all three of the levels.

The F test for the Model is the same as the F test for prog because prog was the only variable entered into the model. If other variables had also been entered, the F test for the Model would have been different from prog.
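To make the F test concrete, here is a small sketch of how the one-way ANOVA F statistic is built from between-group and within-group variation. It is not the command used on the hsb2 data; the function name and the three groups of scores are invented, and the sketch stops at the F statistic and its degrees of freedom.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


if __name__ == "__main__":
    # Made-up writing scores for three hypothetical program types.
    scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    f_value, df1, df2 = one_way_anova_f(scores)
    print(f"F({df1}, {df2}) = {f_value:.2f}")
```

A large F means the group means differ by more than the within-group scatter would suggest, but, as noted above, it does not say which groups differ.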

Looking at the mean of write for each level of program type, we can see that the students in the academic program have the highest mean writing score, while students in the vocational program have the lowest.

The Kruskal Wallis test is used when you have one independent variable with two or more levels and an ordinal dependent variable. In other words, it is the non-parametric version of ANOVA and a generalized form of the Mann-Whitney test method since it permits two or more groups. We will use the same data file as the one way ANOVA example above (the hsb2 data file) and the same variables as in the example above, but we will not assume that write is a normally distributed interval variable.
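As a sketch of what the Kruskal Wallis test computes, the code below ranks all observations together and forms the H statistic, H = 12 / (N(N + 1)) * Σ(R_i² / n_i) - 3(N + 1), where R_i is the rank sum of group i. The function names and data are invented, tied values receive average ranks, and the sketch deliberately omits the tie correction factor. H is compared against a chi-squared distribution with (number of groups - 1) degrees of freedom.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal Wallis H statistic (no tie correction applied)."""
    pooled = [value for group in groups for value in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for group in groups:
        size = len(group)
        rank_sum = sum(ranks[start:start + size])
        total += rank_sum ** 2 / size
        start += size
    return 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)


if __name__ == "__main__":
    groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # made-up scores
    print(f"H = {kruskal_wallis_h(groups):.2f}, df = {len(groups) - 1}")
```

Because only the ranks enter the calculation, the statistic does not depend on the dependent variable being normally distributed.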

If some of the scores receive tied ranks, then a correction factor is used, yielding a slightly different value of chi-squared. With or without ties, the results indicate that there is a statistically significant difference among the three types of programs.

A paired samples t-test is used when you have two related observations. For example, using the hsb2 data file we will test whether the mean of read is equal to the mean of write.

The Wilcoxon signed rank sum test is the non-parametric version of a paired samples t-test. You use the Wilcoxon signed rank sum test when you do not wish to assume that the difference between the two variables is interval and normally distributed but you do assume the difference is ordinal. We will use the same example as above, but we will not assume that the difference between read and write is interval and normally distributed. The results suggest that there is not a statistically significant difference between read and write.

If you believe the differences between read and write were not ordinal but could merely be classified as positive and negative, then you may want to consider a sign test in lieu of the signed rank test.
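A sign test only needs the direction of each paired difference, which makes it easy to sketch. The version below is an exact two-sided test: under the null hypothesis, positive and negative differences are equally likely, so the count of the rarer sign follows a binomial distribution with probability 0.5. The function name and the differences are made up, and zero differences are dropped, which is a common convention rather than something stated in this text.

```python
from math import comb


def sign_test_p_value(differences):
    """Exact two-sided sign test P-value for paired differences."""
    positives = sum(1 for d in differences if d > 0)
    negatives = sum(1 for d in differences if d < 0)
    n = positives + negatives  # zero differences carry no sign
    if n == 0:
        return 1.0
    k = min(positives, negatives)
    # Probability of k or fewer of the rarer sign when each sign has p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Made-up differences (read minus write) for ten hypothetical students.
    diffs = [3, 5, 1, 4, 2, 6, 2, 1, -2, -1]
    print(f"p = {sign_test_p_value(diffs):.4f}")
```

Because it throws away the size of each difference, the sign test is usually less powerful than the signed rank test when the ordinal assumption does hold.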

Again, we will use the same variables in this example and assume that this difference is not ordinal.

These binary outcomes may be the same outcome variable on matched pairs (like a case-control study) or two outcome variables from a single group. Continuing with the hsb2 dataset used in several above examples, let us create two binary outcomes in our dataset: himath and hiread. These outcomes can be considered in a two-way contingency table. The null hypothesis is that the proportion of students in the himath group is the same as the proportion of students in the hiread group.

You would perform a one-way repeated measures analysis of variance if you had one categorical independent variable and a normally distributed interval dependent variable that was repeated at least twice for each subject. This is the equivalent of the paired samples t-test, but allows for two or more levels of the categorical variable. This tests whether the mean of the dependent variable differs by the categorical variable. In this data set, y is the dependent variable, a is the repeated measure and s is the variable that indicates the subject number. You will notice that this output gives four different p-values. No matter which p-value you use, our results indicate that we have a statistically significant effect of a.

If you have a binary outcome measured repeatedly for each subject and you wish to run a logistic regression that accounts for the effect of multiple measures from single subjects, you can perform a repeated measures logistic regression. The exercise data file contains 3 pulse measurements from each of 30 people assigned to 2 different diet regimens and 3 different exercise regimens.

A factorial ANOVA has two or more categorical independent variables (either with or without the interactions) and a single normally distributed interval dependent variable. For example, using the hsb2 data file we will look at writing scores (write) as the dependent variable and gender (female) and socio-economic status (ses) as independent variables, and we will include an interaction of female by ses. Note that in SPSS, you do not need to have the interaction term(s) in your data set.

You perform a Friedman test when you have one within-subjects independent variable with two or more levels and a dependent variable that is not interval and normally distributed but at least ordinal. We will use this test to determine if there is a difference in the reading, writing and math scores. The null hypothesis in this test is that the distribution of the ranks of each type of score is the same. To conduct a Friedman test, the data need to be in a long format. SPSS handles this for you, but in other statistical packages you will have to reshape the data before you can conduct this test. Hence, there is no evidence that the distributions of the three types of scores are different.
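The reshaping step and the Friedman statistic can be sketched in plain Python. Everything here is illustrative: the helper names, the record layout (`subject`, `measure`, `score`) and the scores are invented and are not the hsb2 data. The statistic is Q = 12 / (n k (k + 1)) * Σ(R_j²) - 3 n (k + 1), where n is the number of subjects, k the number of measures and R_j the rank sum of measure j. Ties within a subject get average ranks, and no tie correction is applied.

```python
from collections import defaultdict


def wide_to_long(rows, measures):
    """Turn one row per subject into one record per (subject, measure)."""
    return [
        {"subject": subject, "measure": m, "score": row[m]}
        for subject, row in enumerate(rows, start=1)
        for m in measures
    ]


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def friedman_statistic(long_records):
    """Friedman chi-squared statistic from long-format records."""
    by_subject = defaultdict(dict)
    for rec in long_records:
        by_subject[rec["subject"]][rec["measure"]] = rec["score"]
    measures = sorted(next(iter(by_subject.values())))
    n, k = len(by_subject), len(measures)
    rank_sums = dict.fromkeys(measures, 0.0)
    for scores in by_subject.values():
        # Rank the measures within each subject, then add up per measure.
        ranks = average_ranks([scores[m] for m in measures])
        for m, r in zip(measures, ranks):
            rank_sums[m] += r
    total = sum(r * r for r in rank_sums.values())
    return 12.0 / (n * k * (k + 1)) * total - 3.0 * n * (k + 1)


if __name__ == "__main__":
    wide = [  # made-up scores, one row per student
        {"read": 57, "write": 52, "math": 41},
        {"read": 68, "write": 59, "math": 53},
        {"read": 44, "write": 33, "math": 54},
        {"read": 63, "write": 44, "math": 47},
    ]
    long_records = wide_to_long(wide, ["read", "write", "math"])
    print(f"Friedman statistic: {friedman_statistic(long_records):.2f}")
```

The statistic is compared against a chi-squared distribution with k - 1 degrees of freedom; a small value, as in the article's result, gives no evidence that the score distributions differ.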

Ordered logistic regression is used when the dependent variable is ordered, but not continuous. For example, using the hsb2 data file we will create an ordered variable called write3.


