# Tests of difference

Before starting this set of exercises, let's select the requested subsets of data. There are different ways to do this:

```
> spearheads[ Loo == 1 & Mat == 1, ] # spearheads is a data frame, so it can be indexed like a matrix (note the trailing comma)
> Socle[ Loo == 1 & Mat == 1, ] # WRONG: Socle is a vector, so the trailing comma causes an error
> Socle[ Loo == 1 & Mat == 1 ] # correct, because Socle is a vector
```
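The calls above refer to the columns by bare name, which presumably relies on something like `attach(spearheads)` having been run earlier. As a minimal self-contained sketch of the same indexing rules, here is a small made-up data frame (the object `toy` and its values are hypothetical, not the real data set):

```r
# Hypothetical toy data, only to illustrate indexing; not the real spearheads data.
toy <- data.frame(
  Mat   = c(1, 1, 2, 1),          # material code (1 = bronze, as in the calls above)
  Loo   = c(1, 2, 1, 1),          # loop code (1 = has loops)
  Socle = c(8.1, 7.2, 3.4, 6.0)   # socket length
)

keep <- toy$Mat == 1 & toy$Loo == 1   # logical vector, one value per row

rowsSubset   <- toy[keep, ]       # data frame: rows selected, all columns kept
vectorSubset <- toy$Socle[keep]   # vector: a single index, no comma

print(rowsSubset)
print(vectorSubset)
```

Using `toy$Socle` instead of a bare `Socle` avoids depending on `attach()`, at the cost of slightly longer expressions.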

It is better to store the data to be analysed in intermediate objects, so that they can easily be reused later. Choose object names that you can easily recall, and be much more verbose than Fletcher and Lock are in their book: extra characters are free! ;-)

```
> SocketLengthBronzeLoop <- Socle[ Loo == 1 & Mat == 1 ]
> SocketLengthBronzeLoop
[1] 8.1 7.2 3.4 6.0 5.9 3.5 3.5 4.3 4.5 5.4
> MaximumWidthBronzeLoop <- Maxwi[ Loo == 1 & Mat == 1 ]
> MaximumWidthBronzeLoop
[1] 2.7 2.8 3.9 4.8 5.7 2.8 3.6 2.8 5.3 2.4
```

Now our objects are defined. We can select subsets in this way whenever we need to.

## Paired t-test

Run a paired t-test between socket length (`Socle`) and maximum width (`Maxwi`) for Bronze spearheads that have loops. The subset selection is shown in the section above.

```
> t.test(SocketLengthBronzeLoop,MaximumWidthBronzeLoop, paired=TRUE)
Paired t-test
data: SocketLengthBronzeLoop and MaximumWidthBronzeLoop
t = 2.2461, df = 9, p-value = 0.05133
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.01074165 3.01074165
sample estimates:
mean of the differences
1.5
```

The two samples are not independent of each other: both measurements are taken on the same spearheads, and we expect a longer spearhead to be wider as well, and vice versa. This is why a paired test is required.
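To make the pairing concrete, the statistic can be reproduced by hand: a paired t-test is a one-sample t-test on the per-spearhead differences. The sketch below retypes the values from the printed subsets; the helper names (`differences`, `standardError`, and so on) are only illustrative.

```r
# Values copied from the printed subsets above.
SocketLengthBronzeLoop <- c(8.1, 7.2, 3.4, 6.0, 5.9, 3.5, 3.5, 4.3, 4.5, 5.4)
MaximumWidthBronzeLoop <- c(2.7, 2.8, 3.9, 4.8, 5.7, 2.8, 3.6, 2.8, 5.3, 2.4)

# One difference per spearhead: this is what "paired" means.
differences   <- SocketLengthBronzeLoop - MaximumWidthBronzeLoop
n             <- length(differences)
standardError <- sd(differences) / sqrt(n)
tStatistic    <- mean(differences) / standardError
pValue        <- 2 * pt(-abs(tStatistic), df = n - 1)   # two-sided p-value

print(c(t = tStatistic, df = n - 1, p = pValue))
```

The result should agree with the `t`, `df` and `p-value` lines of the `t.test` output.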

## Wilcoxon non-parametric test

Now let’s see a Wilcoxon non-parametric test on the same subsets.

```
> wilcox.test(SocketLengthBronzeLoop,MaximumWidthBronzeLoop)
Wilcoxon rank sum test with continuity correction
data: SocketLengthBronzeLoop and MaximumWidthBronzeLoop
W = 246, p-value = 0.008209
alternative hypothesis: true location shift is not equal to 0
Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(soclebrloo, maxwibrloo)
```

The warning tells us that the samples contain ties (repeated values), so an exact p-value cannot be computed. Ties in measurements like these strongly suggest that the data come from a discrete distribution, probably because of rounding.
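Because `paired` is not set in the call above, R reports a rank sum test, which treats the two samples as independent. If the paired (signed rank) version is wanted for these same spearheads, a sketch could look like this. The values are retyped from the printed subsets, and no output is shown because it was not part of the original session.

```r
# Values copied from the printed subsets above.
SocketLengthBronzeLoop <- c(8.1, 7.2, 3.4, 6.0, 5.9, 3.5, 3.5, 4.3, 4.5, 5.4)
MaximumWidthBronzeLoop <- c(2.7, 2.8, 3.9, 4.8, 5.7, 2.8, 3.6, 2.8, 5.3, 2.4)

# paired = TRUE switches wilcox.test to the Wilcoxon signed rank test,
# which ranks the absolute per-spearhead differences.
signedRank <- wilcox.test(SocketLengthBronzeLoop, MaximumWidthBronzeLoop,
                          paired = TRUE)
print(signedRank)
```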

## Two sample t-test

Now we select another subset and run a two-sample t-test for a difference in lower socket length (`Losoc`) between Bronze spearheads with pegs and those without. Here is the subset selection:

```
> LowerSocketBronzePeg <- Losoc[Mat == 1 & Peg == 1]
> LowerSocketBronzeNoPeg <- Losoc[Mat == 1 & Peg == 2]
```

And this is the test itself:

```
> t.test(LowerSocketBronzePeg,LowerSocketBronzeNoPeg, var.equal=T)
Two Sample t-test
data: LowerSocketBronzePeg and LowerSocketBronzeNoPeg
t = -2.6227, df = 19, p-value = 0.01675
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.66200752 -0.07435612
sample estimates:
mean of x mean of y
1.981818 2.350000
```
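As a reading aid, the p-value and the confidence interval can be recovered from the other figures in this output. The sketch below uses only the numbers printed above; the variable names are illustrative.

```r
# Figures copied from the t.test output above.
tStatistic       <- -2.6227
degreesOfFreedom <- 19          # n1 + n2 - 2 for the pooled-variance test
meanPeg          <- 1.981818
meanNoPeg        <- 2.350000

# Two-sided p-value from the t distribution.
pValue <- 2 * pt(-abs(tStatistic), df = degreesOfFreedom)

# The standard error can be recovered as (difference in means) / t.
meanDifference <- meanPeg - meanNoPeg
standardError  <- meanDifference / tStatistic
margin         <- qt(0.975, df = degreesOfFreedom) * standardError
confidenceInterval <- c(meanDifference - margin, meanDifference + margin)

print(pValue)
print(confidenceInterval)
```

Both results should match the printed `p-value` and confidence interval up to rounding. Since the interval does not contain 0, the difference is significant at the 5% level.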

## Mann-Whitney test

General syntax:

```
> wilcox.test(A,B, paired=F)
```

In action:

```
> wilcox.test(LowerSocketBronzePeg, LowerSocketBronzeNoPeg, paired=F)
Wilcoxon rank sum test with continuity correction
data: LowerSocketBronzePeg and LowerSocketBronzeNoPeg
W = 23.5, p-value = 0.02805
alternative hypothesis: true location shift is not equal to 0
Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(LowerSocketBronzePeg, LowerSocketBronzeNoPeg, paired = F)
```
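To see what `W` measures, it can be computed by hand: it counts the pairs in which a value from the first sample exceeds a value from the second, with ties counting one half. The raw `Losoc` values are not listed here, so the sketch below uses made-up numbers (the vectors `withPeg` and `withoutPeg` are hypothetical).

```r
# Hypothetical values, chosen only to illustrate the computation (one tie at 2.0).
withPeg    <- c(1.8, 2.0, 2.1, 1.9, 2.2)
withoutPeg <- c(2.3, 2.0, 2.5, 2.4)

# Compare every value of the first sample with every value of the second.
comparisons <- outer(withPeg, withoutPeg, "-")
wManual     <- sum(comparisons > 0) + 0.5 * sum(comparisons == 0)

# suppressWarnings hides the "cannot compute exact p-value with ties" warning.
result <- suppressWarnings(wilcox.test(withPeg, withoutPeg, paired = FALSE))

print(wManual)
print(unname(result$statistic))
```

The two printed values should be identical, which shows that a small `W` means the first sample tends to have the lower values.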

## F-test

We run an F-test for equality of variances of lower socket length (`Losoc`) between Bronze spearheads with and without pegs, reusing the subsets defined above. Provided that the two samples come from normal populations, the general syntax is:

```
> var.test(A,B)
```

and the code for our case:

```
> var.test(LowerSocketBronzePeg,LowerSocketBronzeNoPeg)
F test to compare two variances
data: LowerSocketBronzePeg and LowerSocketBronzeNoPeg
F = 3.685, num df = 10, denom df = 9, p-value = 0.0625
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.9296497 13.9254996
sample estimates:
ratio of variances
3.685006
```
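The same figures can be rebuilt from the F distribution, which may help when reading this output. The sketch below takes the variance ratio and the degrees of freedom from the printed result; the variable names are illustrative.

```r
# Figures copied from the var.test output above.
varianceRatio <- 3.685006
numeratorDf   <- 10
denominatorDf <- 9

# Two-sided p-value: twice the smaller tail probability.
upperTail <- pf(varianceRatio, numeratorDf, denominatorDf, lower.tail = FALSE)
pValue    <- 2 * min(upperTail, 1 - upperTail)

# 95% confidence interval for the ratio of variances.
confidenceInterval <- c(
  varianceRatio / qf(0.975, numeratorDf, denominatorDf),
  varianceRatio / qf(0.025, numeratorDf, denominatorDf)
)

print(pValue)
print(confidenceInterval)
```

The interval contains 1 and the p-value is above 0.05, so equality of variances is not rejected at the 5% level.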