
I am working on a two-sample independent t-test. I have run the analysis on a test group vs. a control group and have to write a report, but I have a few questions.

  1. Do we have to remove the outliers before performing the t-test?

  2. Once I perform the t-test, can anybody explain the output? The explanation should avoid statistical terms, so that a non-technical person can also understand it. I need a simple explanation of the confidence interval and the difference in means of the two samples.

  3. What kinds of charts can we draw to present our results?

Ethan
pinky
  • If your data is not normally distributed, then use a non-parametric test like the Mann-Whitney U test: https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test – dmb Mar 24 '16 at 21:36
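
For illustration, here is a minimal sketch of the Mann-Whitney U test mentioned in the comment above. It is written in Python with only the standard library (the thread itself works in R), it uses the large-sample normal approximation for the p-value, and it applies no tie or continuity correction, so treat the p-value as rough. The function names and the sample numbers are made up for this example.

```python
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, approximate two-sided p-value).

    Assumption: normal approximation without tie or continuity correction.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u1 - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u1, p


# Made-up numbers, only to show the call.
test_group = [12, 15, 11, 40, 13]
control_group = [9, 10, 8, 11, 10, 9, 12]
print(mann_whitney_u(test_group, control_group))
```

Because the test works on ranks rather than raw values, a single extreme observation (like the 40 above) counts only as "the largest value", which is why it is often suggested for outlier-heavy data.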

2 Answers


It's fine to do a t-test on unequal sample sizes; however, the power won't be as good as with equal sample sizes.

1:) Yes or no; it's impossible to say without plotting the outliers. More importantly, can you assume your data are normally distributed? Have you checked the QQ-plot? Have you checked the histogram? Do they look close to a normal distribution? While the t-test is robust against non-normal data as long as the sample size is sufficiently large, your data shouldn't stray too far from a normal distribution.

When you think about outliers, ask yourself the following questions:

  • How many outliers? If you have many, a t-test is probably not appropriate.
  • Why the outliers? If it's a random error (you're just unlucky), you could include them in the t-test. If it's a systematic error, stop the test, go back and check your data.
  • How do you define the outliers?
  • Do those outliers look symmetric? If so, you might assume your sample comes from a normal population. You can check the skewness of your data.

You have to try to understand those outliers to come up with a decision.
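
The "how many outliers" and "check the skewness" questions above can be answered with a few lines of code. This is only a sketch in Python's standard library (the thread itself works in R): it assumes Tukey's 1.5 × IQR rule as the outlier definition, which is one common convention and not the only one, and the function names and sample values are invented for the example.

```python
from statistics import mean, pstdev, quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    Assumption: Tukey's rule with k = 1.5 is the chosen outlier definition.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def skewness(values):
    """Population skewness: near 0 for symmetric data, positive for a long right tail."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Made-up sample with one large value on the right.
sample = [12, 14, 15, 15, 16, 17, 18, 19, 21, 58]
print("outliers:", iqr_outliers(sample))
print("skewness:", round(skewness(sample), 2))
```

A skewness far from zero, or outliers sitting on only one side, suggests the sample is not close to normal; the numbers do not replace looking at the QQ-plot and histogram.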

2:) You can just say something like "the difference in means is (or isn't) statistically significant".

3:) You should draw a box-plot for each group.

ABCD
  • Thank you so much. I ran non-parametric tests because the data was not very normal and had many outliers. I tried removing outliers to get a normal distribution, but I was losing data, so I carried out non-parametric tests instead. Regarding the boxplot: what can I explain about each boxplot? What are the important points to list when explaining boxplots? – pinky Mar 26 '16 at 18:04

1) Maybe. Remember that you are assuming a normal distribution; if your data don't satisfy that assumption, you are not running a valid test.

2) You are testing whether or not the difference is zero, i.e. if zero lies inside the confidence interval for the difference, you cannot claim there is a difference.
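
To make the "is zero inside the confidence interval?" check concrete, here is a rough sketch in Python's standard library (the thread itself works in R). It computes the difference in means with Welch's unequal-variance standard error, which suits unequal group sizes. Two loud assumptions: the interval uses a normal quantile in place of the t quantile, so it is slightly too narrow for small samples, and the function names, wording of the messages, and sample numbers are all invented for this example.

```python
from statistics import NormalDist, mean, variance


def welch_summary(test, control, level=0.95):
    """Difference in means, Welch t statistic, Welch df, and an approximate CI."""
    n1, n2 = len(test), len(control)
    diff = mean(test) - mean(control)
    v1, v2 = variance(test) / n1, variance(control) / n2
    se = (v1 + v2) ** 0.5
    # Welch-Satterthwaite degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Assumption: normal quantile used instead of the t quantile.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return {"diff": diff, "t": diff / se, "df": df,
            "ci": (diff - z * se, diff + z * se)}


def plain_english(summary, level=0.95):
    """Turn the interval into a sentence for a non-technical reader."""
    low, high = summary["ci"]
    if low <= 0 <= high:
        return (f"The plausible range for the difference ({low:.1f} to {high:.1f}) "
                "includes zero, so the data do not show a clear difference.")
    return (f"We are about {round(level * 100)}% confident that the test group's "
            f"average differs from control by between {low:.1f} and {high:.1f}.")


# Made-up numbers, only to show the call.
test_group = [23, 27, 31, 25, 29, 30]
control_group = [20, 22, 19, 24, 21, 23, 20, 22]
print(plain_english(welch_summary(test_group, control_group)))
```

For a report, the sentence returned by `plain_english` is usually enough: the interval is the range of differences that are consistent with the data, and an interval that excludes zero is what "significant" means in practice.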

3) Bar charts are the easiest to understand because you can see the difference. Box-plots provide more info but are for technical people only.

Ryan
  • I have just 40 observations in one sample and 400 in the other. There are many outliers in the first sample. If I remove them, I would have very little data, so I ran the t-test on the original dataset. 2) If boxplots have to be explained to non-technical people, what can we tell them? 3) Can you suggest a better way to represent confidence intervals? – pinky Mar 24 '16 at 21:21
  • For this particular study, I have 6 samples of data: 3 test and 3 control. The test groups have very little data (40 observations each), but the control groups have around 400 observations each. I have run a t-test on each test group against its respective control group. Now I need to show the confidence intervals of these t-tests in a chart in R. How can I do that? – pinky Mar 24 '16 at 22:47
  • What is the difference between the 3 test (/treatment) sets? – TBSRounder Mar 25 '16 at 14:02
  • You can add error bars to your bar chart to stand in for your confidence intervals. – Ryan Mar 25 '16 at 14:42
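
A possible sketch of the bar chart with error bars suggested above, one bar per test/control pair. The thread asks about R; this example uses Python with matplotlib purely for illustration, and it assumes matplotlib is installed. The half-width of each error bar uses a normal quantile rather than the t quantile, which is an approximation, and the function names, labels, and output path are hypothetical.

```python
from statistics import NormalDist, mean, variance


def diff_with_ci(test, control, level=0.95):
    """Return (difference in means, half-width of an approximate CI)."""
    se = (variance(test) / len(test) + variance(control) / len(control)) ** 0.5
    diff = mean(test) - mean(control)
    # Assumption: normal quantile used instead of the t quantile.
    half = NormalDist().inv_cdf(0.5 + level / 2) * se
    return diff, half


def plot_ci_chart(pairs, path="ci_chart.png"):
    """Draw one bar per pair; pairs maps a label to (test_values, control_values)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    labels = list(pairs)
    stats = [diff_with_ci(t, c) for t, c in pairs.values()]
    xs = list(range(len(labels)))
    fig, ax = plt.subplots()
    ax.bar(xs, [d for d, _ in stats], yerr=[h for _, h in stats], capsize=6)
    ax.axhline(0, color="black", linewidth=1)  # bars whose error bar crosses 0 are inconclusive
    ax.set_xticks(xs)
    ax.set_xticklabels(labels)
    ax.set_ylabel("Test mean minus control mean")
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `plot_ci_chart({"Pair 1": (test1, control1), "Pair 2": (test2, control2)})` with your own lists writes the image to disk. Plotting the difference rather than the two raw means lets a reader see at a glance whether each error bar crosses the zero line.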