Before going into this set of exercises, let's start by selecting the requested subsets of data. Here are different ways of doing this:
> spearheads[ Loo == 1 & Mat == 1, ]   # it is a data frame, so we can treat it like a matrix (notice the final ‘, ’)
> Socle[ Loo == 1 & Mat == 1, ]        # WRONG! DOESN'T WORK!
> Socle[ Loo == 1 & Mat == 1 ]         # good, because Socle is a vector
It's better to store the data to be analysed in intermediate objects, so that they can easily be reused later. Remember to choose object names that you will easily recognise. It's better to be much more verbose than Fletcher and Lock are in their book. Extra characters are free! ;-)
> SocketLengthBronzeLoop <- Socle[ Loo == 1 & Mat == 1 ]
> SocketLengthBronzeLoop
 [1] 8.1 7.2 3.4 6.0 5.9 3.5 3.5 4.3 4.5 5.4
> MaximumWidthBronzeLoop <- Maxwi[ Loo == 1 & Mat == 1 ]
> MaximumWidthBronzeLoop
 [1] 2.7 2.8 3.9 4.8 5.7 2.8 3.6 2.8 5.3 2.4
Now our objects are defined. We can select subsets in this way whenever we need.
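As a minimal sketch of why the final comma matters, the following uses a small made-up data frame. The name `toy` and its values are invented for illustration and are not the spearheads data; columns are reached with `toy$` so that the example does not depend on the data frame being attached.

```r
# Hypothetical toy data, NOT the real spearheads data set
toy <- data.frame(Mat   = c(1, 1, 2, 1),
                  Loo   = c(1, 2, 1, 1),
                  Socle = c(8.1, 7.2, 3.4, 6.0))

# Logical vector with one TRUE/FALSE per row
keep <- toy$Mat == 1 & toy$Loo == 1

# Data frame: rows are selected, the empty slot after the comma keeps all columns
rows_df <- toy[keep, ]

# Vector: there is only one dimension, so no comma
socle_vec <- toy$Socle[keep]

# A vector has no second dimension, so the comma form raises an error
comma_on_vector <- tryCatch(toy$Socle[keep, ],
                            error = function(e) "error")
```

Here `rows_df` holds the two matching rows, `socle_vec` holds the two matching socket lengths, and `comma_on_vector` records that the comma form failed.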
Run a t-test between socket length (Socle) and maximum width (Maxwi) for Bronze spearheads that have loops. The subset selection is shown in the section above.
> t.test(SocketLengthBronzeLoop, MaximumWidthBronzeLoop, paired=TRUE)

        Paired t-test

data:  SocketLengthBronzeLoop and MaximumWidthBronzeLoop
t = 2.2461, df = 9, p-value = 0.05133
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.01074165  3.01074165
sample estimates:
mean of the differences
                    1.5
The two samples are not independent of each other: both measurements are taken on the same spearheads, and we expect that the longer a spearhead is, the wider it is, and vice versa. This is why a paired test is required.
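To make the pairing concrete, here is a small sketch that recomputes the statistic by hand: a paired t-test is a one-sample t-test on the per-spearhead differences. The two vectors are typed in from the values printed earlier; the names `d`, `t_by_hand` and `t_builtin` are introduced here only for illustration.

```r
# Values copied from the two vectors printed in the subset-selection section
SocketLengthBronzeLoop <- c(8.1, 7.2, 3.4, 6.0, 5.9, 3.5, 3.5, 4.3, 4.5, 5.4)
MaximumWidthBronzeLoop <- c(2.7, 2.8, 3.9, 4.8, 5.7, 2.8, 3.6, 2.8, 5.3, 2.4)

# One difference per spearhead
d <- SocketLengthBronzeLoop - MaximumWidthBronzeLoop

# t statistic of a one-sample test on the differences:
# mean difference divided by its standard error
t_by_hand <- mean(d) / (sd(d) / sqrt(length(d)))

# The same statistic as returned by the built-in paired test
t_builtin <- unname(t.test(SocketLengthBronzeLoop, MaximumWidthBronzeLoop,
                           paired = TRUE)$statistic)
```

Both values should agree with the `t = 2.2461` reported in the output, and `mean(d)` with the mean of the differences, 1.5.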
Wilcoxon non-parametric test
Now let’s see a Wilcoxon non-parametric test on the same subsets.
> wilcox.test(SocketLengthBronzeLoop, MaximumWidthBronzeLoop)

        Wilcoxon rank sum test with continuity correction

data:  SocketLengthBronzeLoop and MaximumWidthBronzeLoop
W = 246, p-value = 0.008209
alternative hypothesis: true location shift is not equal to 0

Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(soclebrloo, maxwibrloo)
A warning tells us there are several ties in each sample, which suggests strongly that these data are from a discrete distribution (probably due to rounding).
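The ties can be inspected directly. The sketch below counts repeated values in the two vectors printed earlier, pooled together because the rank sum test ranks all observations jointly. The names `pooled`, `tie_counts` and `tied_values` are introduced here only for illustration.

```r
# Values copied from the two vectors printed in the subset-selection section
SocketLengthBronzeLoop <- c(8.1, 7.2, 3.4, 6.0, 5.9, 3.5, 3.5, 4.3, 4.5, 5.4)
MaximumWidthBronzeLoop <- c(2.7, 2.8, 3.9, 4.8, 5.7, 2.8, 3.6, 2.8, 5.3, 2.4)

# The rank sum test ranks both samples together
pooled <- c(SocketLengthBronzeLoop, MaximumWidthBronzeLoop)

# How many times each distinct value occurs
tie_counts <- table(pooled)

# Keep only the values that occur more than once
tied_values <- tie_counts[tie_counts > 1]
n_tied_values <- length(tied_values)
```

If the warning is unwanted, `wilcox.test()` also accepts `exact = FALSE`, which asks for the normal approximation from the start instead of attempting an exact p-value.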
Two sample t-test
Now we want to select another subset and run a two-sample t-test for a difference in lower socket length (Losoc) between those with pegs and those without for Bronze spearheads. Here is the subset selection:
> LowerSocketBronzePeg <- Losoc[Mat == 1 & Peg == 1]
> LowerSocketBronzeNoPeg <- Losoc[Mat == 1 & Peg == 2]
And this is the test itself:
> t.test(LowerSocketBronzePeg, LowerSocketBronzeNoPeg, var.equal=TRUE)

        Two Sample t-test

data:  LowerSocketBronzePeg and LowerSocketBronzeNoPeg
t = -2.6227, df = 19, p-value = 0.01675
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.66200752 -0.07435612
sample estimates:
mean of x mean of y
 1.981818  2.350000
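The `var.equal` argument decides which version of the two-sample test is run. The sketch below contrasts the two on made-up numbers: `with_peg` and `without_peg` are hypothetical values invented for illustration, not the Losoc measurements.

```r
# Hypothetical measurements, invented for illustration only
with_peg    <- c(1.8, 2.0, 2.1, 1.9, 2.2, 1.7)
without_peg <- c(2.3, 2.5, 2.2, 2.6, 2.4)

# Classical two-sample test: assumes equal variances and pools them
pooled <- t.test(with_peg, without_peg, var.equal = TRUE)

# Default behaviour: Welch test, variances are not assumed equal
welch <- t.test(with_peg, without_peg)

# Degrees of freedom: n1 + n2 - 2 for the pooled test,
# a (usually fractional) approximation for the Welch test
df_pooled <- unname(pooled$parameter)
df_welch  <- unname(welch$parameter)
```

With the pooled test the degrees of freedom are simply n1 + n2 - 2, which is consistent with `df = 19` in the output for the real data.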
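The non-parametric counterpart is the Wilcoxon rank sum test. For two samples A and B the general syntax is:
> wilcox.test(A, B, paired=FALSE)
and the code for our case: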
> wilcox.test(LowerSocketBronzePeg, LowerSocketBronzeNoPeg, paired=FALSE)

        Wilcoxon rank sum test with continuity correction

data:  LowerSocketBronzePeg and LowerSocketBronzeNoPeg
W = 23.5, p-value = 0.02805
alternative hypothesis: true location shift is not equal to 0

Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(LowerSocketBronzePeg, LowerSocketBronzeNoPeg, paired = F)
We do an F-test for equality of variances on the same two samples (lower socket length for Bronze spearheads with and without pegs). Provided that the two samples A and B are from normal populations, the general syntax is:
> var.test(A, B)
and the code for our case:
> var.test(LowerSocketBronzePeg, LowerSocketBronzeNoPeg)

        F test to compare two variances

data:  LowerSocketBronzePeg and LowerSocketBronzeNoPeg
F = 3.685, num df = 10, denom df = 9, p-value = 0.0625
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
  0.9296497 13.9254996
sample estimates:
ratio of variances
          3.685006
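The F statistic is simply the ratio of the two sample variances, with n - 1 degrees of freedom for each sample. A minimal sketch on made-up numbers follows; `with_peg` and `without_peg` are hypothetical values invented for illustration, not the Losoc measurements.

```r
# Hypothetical measurements, invented for illustration only
with_peg    <- c(1.8, 2.0, 2.1, 1.9, 2.2, 1.7)
without_peg <- c(2.3, 2.5, 2.2, 2.6, 2.4)

# Ratio of the sample variances (first sample over second)
f_by_hand <- var(with_peg) / var(without_peg)

# The same statistic as returned by the built-in test
f_test    <- var.test(with_peg, without_peg)
f_builtin <- unname(f_test$statistic)

# Numerator and denominator degrees of freedom: n - 1 for each sample
df_by_hand <- c(length(with_peg) - 1, length(without_peg) - 1)
df_builtin <- unname(f_test$parameter)
```

In the output for the real data, `num df = 10` and `denom df = 9` therefore correspond to samples of 11 and 10 spearheads.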