Computerintensive Methoden - Coalescent Theory - Project C

From StatWiki

The following data come from the segregating sites in a nucleotide sequence from the Y chromosome of 133 Asians. Fourteen segregating sites and 9 different alleles were found. At each site, 0 denotes the ancestral variant (the one observed in the majority of a reasonably large sample of chimpanzees). The observed alleles and their frequencies are given below.



Alleles
    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
C   0  1  0  0  0  1  0  1  0  1  0  0  0  1  0
F   0  1  0  0  0  1  1  0  0  0  0  0  0  0  0
G   0  1  0  0  0  1  0  1  0  1  0  0  0  1  1
H   0  1  0  0  0  1  0  1  1  0  0  0  0  0  0
J   0  1  0  0  0  1  0  1  0  1  1  1  1  0  0
N   1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
P   0  1  1  1  1  0  0  0  0  0  0  0  0  0  0
Q   0  1  0  0  0  1  0  0  0  0  0  0  0  0  0
R   0  1  0  0  0  1  0  1  0  0  0  0  0  0  0
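
The R code in the tasks refers to an object called `alleles`, but the page does not show how it is created. A minimal sketch of one way to enter the table as a matrix follows; the layout (one row per allele, one column per table column) is an assumption. The allele frequencies (`freq` in the later code) are not reproduced on this page, so they are not defined here.

```r
# Assumption: `alleles` is a 0/1 matrix with one row per allele and
# one column per column of the table above. The original page does not
# show its construction, so this layout is a guess.
allele_names <- c("C", "F", "G", "H", "J", "N", "P", "Q", "R")

alleles <- matrix(
  c(0,1,0,0,0,1,0,1,0,1,0,0,0,1,0,   # C
    0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,   # F
    0,1,0,0,0,1,0,1,0,1,0,0,0,1,1,   # G
    0,1,0,0,0,1,0,1,1,0,0,0,0,0,0,   # H
    0,1,0,0,0,1,0,1,0,1,1,1,1,0,0,   # J
    1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,   # N
    0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,   # P
    0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,   # Q
    0,1,0,0,0,1,0,1,0,0,0,0,0,0,0),  # R
  nrow = 9, byrow = TRUE,
  dimnames = list(allele_names, as.character(1:15))
)

alleles
```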

Task 1

Calculate the matrix of Hamming distances between each pair of alleles.


d = dist(alleles, method="manhattan")
d
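
The code above asks `dist` for the Manhattan distance rather than a Hamming distance. For 0/1 data the two coincide, because each site contributes |x - y|, which is 1 exactly when the two alleles differ. A small check on alleles C and G from the table, given as an illustration only:

```r
# Alleles C and G, copied from the table above
C <- c(0,1,0,0,0,1,0,1,0,1,0,0,0,1,0)
G <- c(0,1,0,0,0,1,0,1,0,1,0,0,0,1,1)

# Hamming distance: number of positions at which the two alleles differ
hamming <- sum(C != G)

# Manhattan distance as computed by dist()
manhattan <- as.numeric(dist(rbind(C, G), method = "manhattan"))

# C and G differ only in column 15, so both distances agree
stopifnot(hamming == manhattan)
c(hamming = hamming, manhattan = manhattan)
```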

Task 2

Calculate the nucleotide diversity.


theta.pi = sum(as.dist(as.matrix(d) * (freq %o% freq))) * (2/(sum(freq)*(sum(freq)-1)))
theta.pi
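
The expression for `theta.pi` weights each pairwise distance by the product of the two allele frequencies and divides by the number of pairs of sampled individuals, n(n-1)/2. The sketch below checks this against a brute-force average over all pairs. The data `toy` and `toy_freq` are hypothetical and are not the Y-chromosome sample, whose frequencies are not shown on this page.

```r
# Hypothetical toy data (NOT the Y-chromosome sample): 3 alleles, 4 sites
toy <- rbind(a = c(0, 0, 0, 0),
             b = c(1, 0, 0, 0),
             c = c(1, 1, 1, 0))
toy_freq <- c(2, 1, 1)          # hypothetical allele counts, n = 4
n_toy <- sum(toy_freq)

d_toy <- dist(toy, method = "manhattan")

# Frequency-weighted formula, same form as the expression on this page
pi_weighted <- sum(as.dist(as.matrix(d_toy) * (toy_freq %o% toy_freq))) *
  (2 / (n_toy * (n_toy - 1)))

# Brute force: one row per sampled individual, then average over all pairs
expanded <- toy[rep(seq_len(nrow(toy)), toy_freq), , drop = FALSE]
pi_brute <- mean(as.numeric(dist(expanded, method = "manhattan")))

stopifnot(isTRUE(all.equal(pi_weighted, pi_brute)))
pi_weighted
```

Pairs of individuals carrying the same allele contribute a distance of 0, which is why only the off-diagonal products of frequencies appear in the weighted form.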

Task 3

Carry out Tajima's test to check whether the data are consistent with the Wright-Fisher model.
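
For reference, the quantities computed in the R code for this task can be written as below. The symbols follow the variable names in the code (a_n for `an`, b_n for `bn`, e_1 and e_2 for `e1` and `e2`); S is the number of segregating sites, n the sample size, and \hat\theta_\pi the nucleotide diversity from Task 2. This is a transcription of the code into the usual notation for Tajima's D, not an independent derivation.

```latex
\begin{align*}
a_n &= \sum_{i=1}^{n-1} \frac{1}{i}, \qquad
b_n = \sum_{i=1}^{n-1} \frac{1}{i^2}, \qquad
\hat\theta_L = \frac{S}{a_n}, \\[4pt]
e_1 &= \frac{n+1}{3\,a_n\,(n-1)} - \frac{1}{a_n^2}, \\[4pt]
e_2 &= \frac{1}{a_n^2 + b_n}
       \left( \frac{2\,(n^2+n+3)}{9\,n\,(n-1)} - \frac{n+2}{n\,a_n} + \frac{b_n}{a_n^2} \right), \\[4pt]
D &= \frac{\hat\theta_\pi - \hat\theta_L}{\sqrt{e_1 S + e_2 S (S-1)}} .
\end{align*}
```

The code on this page then compares D with a standard normal distribution to obtain a two-sided p-value.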


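# Number of segregating sites (columns of alleles) and sample size
S = ncol(alleles)
n = sum(freq)

# Estimate of theta from the number of segregating sites
an = sum(1/(1:(n-1)))
bn = sum(1/((1:(n-1))^2))
theta.l = S/an

# Constants for the variance of theta.pi - theta.l
e1 = (n+1)/(3*an*(n-1)) - 1/an^2
e2 = 1/(an^2+bn) * ( (2*(n^2+n+3))/(9*n*(n-1)) - (n+2)/(n*an) + bn/an^2 )
var.theta = e1*S + e2*S*(S-1)

# Tajima's D
D = (theta.pi - theta.l)/sqrt(var.theta)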

names(D) = "D"

est = c(theta.l, theta.pi)
names(est) = c("Theta L", "Theta Pi")

ret = list(statistic=D, method="Tajima Test", estimate=est, p.value=2*(1-pnorm(abs(D))))
class(ret) = "htest"
ret