# T-test

## t-test in R

The t-test is probably the best-known statistical test.

Basically, the **t-test** can be used to check a) whether the average of a given sample differs from a reference value (0 by default in R), or b) whether the averages of two (independent) samples differ.
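
Case a) can be sketched as follows. The values in `x` are made up purely for illustration and are not part of the original example; `mu` is the reference value the sample mean is compared against.

```r
# Hypothetical measurements, invented for this illustration
x <- c(0.8, 1.4, -0.3, 2.1, 1.0, 0.6, 1.7, 0.2)

# One-sample t-test; H0: the true mean of x equals mu (mu = 0 is the default)
res <- t.test(x, mu = 0)

res$estimate  # the sample mean
res$p.value   # p-value of the two-sided test
```

Setting `mu` to another number tests against that value instead of 0.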

The individual values in each sample should follow the **normal distribution**, and the samples should be **independent**.
To test for normality in R you may use the Shapiro-Wilk test (`shapiro.test()`).
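
A minimal sketch of such a normality check, using the same two samples as the t-test example further down. Note that with samples this small the test has little power, so a large p-value does not prove normality; it only means no clear deviation was detected.

```r
samp1 <- c(2:10, 4:6)
samp2 <- c(6:11, 9, 10, 14)

# Shapiro-Wilk test; H0: the values come from a normal distribution
sw1 <- shapiro.test(samp1)
sw2 <- shapiro.test(samp2)

# A small p-value (e.g. below 0.05) would suggest a deviation from normality
c(samp1 = sw1$p.value, samp2 = sw2$p.value)
```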

Before launching the test it is essential to define the **hypothesis to be tested** and the null hypothesis H0 (its counterpart, typically that the averages are equal). Averages may be tested "two-sided" for (in)equality (the hypothesis doesn't specify whether average_1 is larger or smaller than average_2), or one-sided (where larger or smaller has to be chosen).
The original Student's t-test assumes equal variance in both samples. If you think this is not the case, the Welch correction allows using a separate estimate of the standard deviation for each sample. In fact, the default implementation in R already applies the Welch correction.
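
These choices map onto the `alternative` and `var.equal` arguments of `t.test()`. The sketch below uses the same two samples as the example further down; the choice of `"less"` as the one-sided direction is only an illustration.

```r
samp1 <- c(2:10, 4:6)
samp2 <- c(6:11, 9, 10, 14)

# Two-sided test with Welch correction (both settings are the defaults)
welch <- t.test(samp1, samp2, alternative = "two.sided", var.equal = FALSE)

# One-sided test; alternative hypothesis: mean(samp1) is less than mean(samp2)
less <- t.test(samp1, samp2, alternative = "less")

# Classical Student t-test, assuming equal variances in both samples
student <- t.test(samp1, samp2, var.equal = TRUE)

c(welch = welch$p.value, less = less$p.value, student = student$p.value)
```

The one-sided direction should be fixed before looking at the data, not picked afterwards to obtain a smaller p-value.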

Run the test in R as:

    samp1 <- c(2:10, 4:6)
    samp2 <- c(6:11, 9, 10, 14)
    # test the hypothesis that the averages of samp1 and samp2 are equal
    # (i.e. H0: mean(samp1) equals mean(samp2))
    t.test(samp1, samp2)

This will return the t-value, the degrees of freedom, the p-value, the 95% confidence interval and the sample (estimated) means. If you simply want the p-value, type:

    t.test(samp1, samp2)$p.value

In this particular example the p-value (the probability of observing a difference at least this large if both averages were in fact equal) is quite small. One may therefore consider the averages of both samples significantly different (i.e. the p-value is below the classical α = 5% threshold), since:

    t.test(samp1, samp2)$p.value < 0.05

**Special cases and assumptions:**

As mentioned before, the t-test assumes INDEPENDENCE of the variables to be tested! Note that in many settings in bioinformatics such independence is not entirely guaranteed (e.g. genes may be co-regulated).

When running many t-tests, a correction for multiple testing should be applied. This is the case, for example, when testing the many genes present on a single microarray.
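
One possible way to apply such a correction is `p.adjust()` from base R. The p-values below are made up for illustration, and the choice of the Benjamini-Hochberg and Bonferroni methods is an assumption; the text above does not prescribe a particular method.

```r
# Hypothetical raw p-values from five separate t-tests (invented for illustration)
pvals <- c(0.001, 0.012, 0.03, 0.04, 0.2)

# Benjamini-Hochberg correction, which controls the false discovery rate
p_bh <- p.adjust(pvals, method = "BH")

# Bonferroni correction, which is more conservative
p_bonf <- p.adjust(pvals, method = "bonferroni")

# Which tests remain significant at the 5% level after correction?
p_bh < 0.05
p_bonf < 0.05
```

In a real microarray analysis `pvals` would hold one p-value per gene, for example collected from repeated calls to `t.test()`.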