Completely Randomized Experimental Design

• Shortened to Completely Randomized Design (CRD): a design in
which replicated treatments are assigned randomly to experimental
units (EUs: pots, plots, individuals, containers, chambers,
refrigerators, etc.)
• It is the simplest, valid design that results in data that can be
analyzed, in order of feasibility, as:
- two-sample t-test, when having
1. one continuous response variable
2. one categorical explanatory variable with only 2 levels
- one-way ANOVA, when having
1. one continuous response variable
2. one categorical explanatory variable having
≥ 2 levels
- simple linear regression, when having
1. one continuous response variable
2. one continuous explanatory variable
- multi-way ANOVA, when having
1. one continuous response variable
2. two or more categorical explanatory variables, each having
≥ 2 levels
- multiple linear regression, when having
1. one continuous response variable
2. Two or more continuous explanatory variables
• CRD is used when EUs are assumed to be homogeneous
with respect to all factors that can affect the response
but are not the focus of the study
- most practical for lab; small plots, chambers, etc. But
even for such cases the assumption of homogeneity of
EUs may not always hold
-- e.g., studying the effect of light on bacterial growth
using refrigerators as EUs
• CRD is simple in design and data analysis
- simple design does not mean a bad design
• In CRD, loss of information due to missing values (missing
replicates) is smaller than in other designs (why?)
- appropriate for long-term studies, human studies, and
studies with severe treatments
• CRD provides maximum df for the experimental error,
improving the detection of a treatment effect
- particularly useful when replication is low
• CRD is inefficient if the variability among experimental
units is known to be large and can be blocked in certain
ways
Procedure
•Treatments are assigned randomly to EUs (pots, plots,
individuals, containers, chambers, refrigerators, etc.)
•Throughout the study, the EUs should be processed
randomly at any stage, when the order of processing may
affect the response (e.g.?)
•Both equal and unequal replications can be employed
- equal replication (a balanced design) is often
preferred
A simple example
•Objective: to test the effect of 3 fertilization levels on
height (Ht.) growth of houseplants grown in 9 independent
pots
•Research Hypothesis: the 3 different levels affect
plant height differently (can causality be inferred?)
•Treatment Design, fertilization at:
1. 0 level (control)
2. 5 units
3. 10 units
Experimental Unit: EU =?
Total # of EUs:
n =?
Treatment levels:
t =?
Replicates:
r =?
Randomization:
• If pots are fixed (i.e., immobile; e.g., plots of land,
chambers, etc):
1. Number them arbitrarily (e.g., 1 to 9: having 3
replicates)
2. Order treatment levels and their replicates
3. Generate 9 random numbers (stat. book, calculator, etc.)
and assign them, one after another, to the t×r
combinations
4. Order the 9 (t×r) combinations based on their random
numbers and assign them to the 9 EUs in order
• If the EUs are not fixed (i.e., EUs are mobile):
- place them randomly, and then do as above
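The four randomization steps above can be sketched in a few lines of Python. This is only an illustration: the function name `randomize_crd` and the use of a fixed seed are assumptions made here, not part of the original procedure.

```python
import random

def randomize_crd(levels, reps, seed=None):
    """Return {EU number: (treatment level, replicate)} for a CRD."""
    rng = random.Random(seed)
    # Step 2: list the t x r combinations in a fixed order.
    combos = [(lvl, rep) for lvl in levels for rep in range(1, reps + 1)]
    # Step 3: attach one random number to each combination.
    keyed = [(rng.random(), combo) for combo in combos]
    # Step 4: order combinations by random number and hand them
    # to EUs 1..n in that order.
    keyed.sort(key=lambda pair: pair[0])
    return {eu: combo for eu, (_, combo) in enumerate(keyed, start=1)}

# Houseplant example: 3 fertilization levels x 3 replicates = 9 pots.
layout = randomize_crd(levels=[0, 5, 10], reps=3, seed=1)
for eu, (level, rep) in layout.items():
    print(f"pot {eu}: fertilizer level {level}, replicate {rep}")
```

Fixing the seed only makes the layout reproducible for record keeping; any seed gives a valid random assignment.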
Do EUs seem homogeneous?
T. level   Rep   Random #   Rank
0          1
0          2
0          3
5          1
5          2
5          3
10         1
10         2
10         3
6. After a number of days, measure plant heights in
random order
7. The experimental design part is over, but before analyzing
the data make sure to identify or set the following:
- the response variable?
- the explanatory variable?
- statistical hypothesis:
H0:
HA:
- α level?
- statistical test?
• May we use a series of t-tests on pairwise treatment
means?
If A, B, C represent the 3 treatments (0, 5, 10 levels), there are
a total of 3 possible pairs of means to compare
H0: A vs B, $\mu_A = \mu_B = \mu$
    A vs C, $\mu_A = \mu_C = \mu$
    B vs C, $\mu_B = \mu_C = \mu$
• There are 3 separate tests
• The overall (experimentwise) $\alpha$ will exceed the preset
value
• For $\alpha = 0.05$, it will increase to a maximum of:
$1 - (1 - \alpha)^{k} \approx 0.14$ (when all comparisons are independent),
where k = 3 is the number of pairs being compared
•As the number of treatments increases, the overall $\alpha$
(probability of incorrectly rejecting the H0 for at least one
of the pairs) increases drastically:

                             experimentwise error
                             at pairwise α level
# of treat.   # of pairs     0.10     0.05     0.01
2             1              0.10     0.05     0.01
3             3              0.27     0.14     0.03
4             6              0.47     0.26     0.06
5             10             0.65     0.40     0.10
6             15             0.79     0.54     0.14
10            45             0.99     0.90     0.36
•Thus, it is improper to use a series of t-tests for unplanned
comparison of more than 2 means, because the overall
(experimentwise) Type-I error would be underestimated
•There is a need for another kind of test--an F-test
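The experimentwise error values in the table follow from 1 - (1 - α)^k, with k the number of pairs. A minimal sketch that recomputes them, assuming (as the slide does) that the comparisons are independent; the function name `experimentwise_alpha` is made up for this illustration:

```python
from math import comb

def experimentwise_alpha(n_treatments, alpha):
    """Return (number of pairs, experimentwise Type I error).

    Assumes all pairwise comparisons are independent.
    """
    n_pairs = comb(n_treatments, 2)
    return n_pairs, 1 - (1 - alpha) ** n_pairs

for t in (2, 3, 4, 5, 6, 10):
    pairs, _ = experimentwise_alpha(t, 0.05)
    row = [round(experimentwise_alpha(t, a)[1], 2) for a in (0.10, 0.05, 0.01)]
    print(t, pairs, row)
```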
1. Assume that the plant heights at the end of the study
are: 5.9, 5.9, 5.9, 5.9, 5.9, 5.9, 5.9, 5.9, 5.9 (cm)
- any variation in the data?
- reason for such a result?
2. Now, consider the actual data as below:
6.8, 7.1, 6.6, 7.2, 7.2, 7.1, 7.1, 6.9, 7.0
- any variation in the data?
- how much is it?
- possible cause(s) for such variation?
in other words, what could be the source(s) of such
variation?
- average magnitude of such variations?
- relative importance of average variations?
3. Now, consider the actual data arranged based on the
treatment design

treatment   Ht. (cm), by replicate    treatment   treatment
level       1       2       3         total       mean
0           6.8     6.7     6.6       20.1        6.70
5           7.1     6.9     7.0       21.0        7.00
10          7.2     7.2     7.1       21.5        7.17

treatment levels: t = 3
replicates: r = 3
total pots: n = tr = 9
grand mean: $\bar{y}_{..}$ = 6.96
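Summation Notation For a One-Way Classification

Treat.   Response, by replicate
level    1          2          3          .   r          Total      Mean
1        $y_{11}$   $y_{12}$   $y_{13}$   .   $y_{1r}$   $y_{1.}$   $\bar{y}_{1.}$
2        $y_{21}$   $y_{22}$   $y_{23}$   .   $y_{2r}$   $y_{2.}$   $\bar{y}_{2.}$
3        $y_{31}$   $y_{32}$   $y_{33}$   .   $y_{3r}$   $y_{3.}$   $\bar{y}_{3.}$
.        .          .          .          .   .          .          .
t        $y_{t1}$   $y_{t2}$   $y_{t3}$   .   $y_{tr}$   $y_{t.}$   $\bar{y}_{t.}$

treatment levels: t
replications: r
total # of EUs: n = tr
grand mean: $\bar{y}_{..}$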
4. Actual data arranged based on treatment design and
classification notation

         r
t        1       2       3       $y_{t.}$   $\bar{y}_{t.}$
0        6.8     6.7     6.6     20.1       6.70
5        7.1     6.9     7.0     21.0       7.00
10       7.2     7.2     7.1     21.5       7.17

treat. levels: t = 3
replicates: r = 3
total pots: n = tr = 9
grand mean: $\bar{y}_{..}$ = 6.96
- quantification of the magnitude and relative
importance of sources of variations constitutes
ANOVA
- in most cases (fixed models), ANOVA is used to test
hypotheses regarding means
- in some cases (random models), ANOVA is used to test
hypotheses regarding variances
Hypothesis
H0: $\mu_1 = \mu_2 = \mu_3 = \mu$
- stat. model: $y_{ij} = \mu + \varepsilon_{ij}$ (reduced model)
- graphical presentation [figure not reproduced; labels: $y_1$, $y_2$, $y_3$]
HA: not all µs are equal
- stat. model: $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$ (full model)
- graphical presentation [figure not reproduced; labels: $\mu_1$, $\mu_2$, $\mu_3$]
$y_{ij}$: jth observation (rep.) from the ith treatment level
(j = 1, 2, …, r; i = 1, 2, …, t)
$\mu$: estimated by the grand mean
$\tau_i$: effect of the ith treat. level (deviation of the mean of the ith
level from µ)
$\varepsilon_{ij}$: random deviation of $y_{ij}$ from its expected value
(called: error or residual)
•Under the H0 model ($y_{ij} = \mu + \varepsilon_{ij}$):
- assuming no systematic error at any stage, any value (cell)
deviates from $\bar{y}_{..}$ due to a random error: $y_{ij} - \bar{y}_{..} = e_{ij}$
-- source of variation is unknown (chance?)
-- the sum of all individual squared deviations
$\sum (y_{ij} - \bar{y}_{..})^2 = \sum e_{ij}^2$ constitutes the total sum of squares
(SSTotal), which can be averaged [SSTotal / (n-1)] as:
Mean Squared deviations (MSerror), also known as?
•Under the HA model ($y_{ij} = \mu + \tau_i + \varepsilon_{ij}$):
- assuming no systematic error, any cell deviates from $\bar{y}_{..}$
due to a combination of a treatment effect and a random
error:
$y_{ij} - \bar{y}_{..} = e_{ij} + (\bar{y}_{i.} - \bar{y}_{..})$, where $e_{ij} = y_{ij} - \bar{y}_{i.}$
-- random variation is not the only source of variation
-- the sum of all individual squared deviations $\sum (y_{ij} - \bar{y}_{..})^2$
(SSTotal) $\neq \sum e_{ij}^2$, and averaging it does not provide
MSerror
So, the magnitude of variation due to treatment & random
noise should be quantified and compared (how?) to test
the significance of treatment effect
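One way to see how the two sources of variation can be separated is the standard partition of the total sum of squares for a balanced one-way layout. The derivation below is a sketch in the notation of these slides, with $\bar{y}_{i.}$ the mean of treatment level i, $\bar{y}_{..}$ the grand mean, and r replicates per level:

```latex
\begin{align*}
\sum_{i=1}^{t}\sum_{j=1}^{r} (y_{ij} - \bar{y}_{..})^2
  &= \sum_{i=1}^{t}\sum_{j=1}^{r}
     \bigl[(y_{ij} - \bar{y}_{i.}) + (\bar{y}_{i.} - \bar{y}_{..})\bigr]^2 \\
  &= \sum_{i=1}^{t}\sum_{j=1}^{r} (y_{ij} - \bar{y}_{i.})^2
   + r \sum_{i=1}^{t} (\bar{y}_{i.} - \bar{y}_{..})^2 \\
  &\quad + 2 \sum_{i=1}^{t} (\bar{y}_{i.} - \bar{y}_{..})
     \sum_{j=1}^{r} (y_{ij} - \bar{y}_{i.})
\end{align*}
% The cross term is zero because \sum_j (y_{ij} - \bar{y}_{i.}) = 0
% within every treatment level, so
% SS_Total = SS_error + SS_treatment.
```

The two remaining terms are the within-treatment (error) and between-treatment sums of squares, which is why their mean squares can be compared in a ratio.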
SSTotal:
- the Total Sum of Squares of the data
-- definition formula: $\sum_i \sum_j (y_{ij} - \bar{y}_{..})^2$
-- computation formula: $\sum_i \sum_j y_{ij}^2 - [(\sum_i \sum_j y_{ij})^2 / n]$
SStreatment:
- the SS due to treatment differences and random error
also called SSbetween, SSamong …
-- definition formula: $\sum_i [r(\bar{y}_{i.} - \bar{y}_{..})^2]$
-- computation formula: $\sum_i (y_{i.}^2 / r) - [(\sum_i \sum_j y_{ij})^2 / n]$
where: r = # of reps
$\bar{y}_{i.}$ = treatment-level mean
$y_{i.}$ = treatment-level total
- SStreatment accounts for a portion of SSTotal
- SStreatment /(t-1): variance due to treatment and
random error, denoted by MStreatment
SSerror:
- SS due to random variation, also called SSwithin or
SSresidual
-- definition formula: $\sum_i \sum_j (y_{ij} - \bar{y}_{i.})^2$
-- computation formulas:
1. $\sum_i \sum_j y_{ij}^2 - \sum_i (y_{i.}^2 / r)$
2. $(s_1^2 + s_2^2 + \dots + s_t^2)(r - 1)$
3. SSTotal - SStreatment
The ANOVA table can be completed in several ways:
A. calculate the SSTotal and SSerror (from pooled
variance), and get SStreatment by subtraction
B. calculate SSTotal and SStreatment, and get SSerror
by subtraction
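As a check that the definition and computation formulas agree, the sketch below evaluates them on the plant-height data from the example. Variable names such as `heights` and `ss_error_def` are chosen here for illustration only.

```python
from statistics import mean, variance

# Plant heights (cm) from the fertilization example, keyed by treatment level.
heights = {0: [6.8, 6.7, 6.6], 5: [7.1, 6.9, 7.0], 10: [7.2, 7.2, 7.1]}

all_y = [y for ys in heights.values() for y in ys]
r = 3  # replicates per treatment level (balanced design)
grand_mean = mean(all_y)

# Definition formulas.
ss_total = sum((y - grand_mean) ** 2 for y in all_y)
ss_treatment = sum(r * (mean(ys) - grand_mean) ** 2 for ys in heights.values())
ss_error_def = sum((y - mean(ys)) ** 2 for ys in heights.values() for y in ys)

# The three computation formulas for SSerror.
ss_error_1 = sum(y ** 2 for y in all_y) - sum(sum(ys) ** 2 / r for ys in heights.values())
ss_error_2 = sum(variance(ys) for ys in heights.values()) * (r - 1)
ss_error_3 = ss_total - ss_treatment

print(round(ss_total, 4), round(ss_treatment, 4), round(ss_error_def, 4))
```

Formula 2 relies on equal replication; with unequal group sizes each sample variance would need its own (r_i - 1) weight.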
• Ideal Conditions (Assumptions):
1. Normality
ANOVA is robust with regard to departures from
normality, especially when replication is large
2. Equality (homogeneity of variances)
ANOVA is robust with regard to departures from
equality of variance, especially when replication is
large
3. Independence
A. Should be assured by the experimenter
considering the scope of inference
B. Can be improved by
- random sampling in observational studies
- random assignment of treatments to EUs
in experimental studies
- setting up EUs as far apart as possible to decrease:
a. pre-existing dependence
b. treatment spill-over
For the plant example, calculate the following:
1. $(\sum y_{ij})^2 / n$ (called the correction factor, C)
$(6.8+6.7+6.6+7.1+6.9+7+7.2+7.2+7.1)^2 / 9 = 435.4178$
2. SSTotal $= \sum y_{ij}^2 - C$
$= (6.8^2+6.7^2+6.6^2+7.1^2+6.9^2+7^2+7.2^2+7.2^2+7.1^2) - C$
= 435.8 - 435.4178 = 0.3822
3. SStreatment $= \sum (y_{i.}^2 / r) - C$
$= [(20.1^2/3)+(21^2/3)+(21.5^2/3)] - C$
= 435.7533 - 435.4178 = 0.3356
4. SSerror = SSTotal - SStreatment
= 0.3822 - 0.3356 = 0.0466
5. Obtain the mean squares for treatment and error by
dividing SStreatment and SSerror by their respective dfs
- complete the ANOVA table
- reject the H0 if calculated F > tabular F
ANOVA table

Source      df     SS            MS                                  F           P
treatment   t-1    SStreatment   MStreatment = SStreatment / (t-1)   MSt / MSe
error       n-t    SSerror       MSerror = SSerror / (n-t)
Total       n-1    SSTotal
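Filling in the table for the plant example takes only a few divisions. The sketch below starts from the rounded sums of squares of the worked example. For the P value it uses a closed form that holds only when the numerator df is 2, which happens to be the case here; for other designs an F table or statistical software is needed.

```python
# Rounded sums of squares from the worked plant example.
ss_treatment, ss_error = 0.3356, 0.0466
t, n = 3, 9  # treatment levels, total pots

df_treatment, df_error = t - 1, n - t
ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error
f_value = ms_treatment / ms_error

# Upper-tail probability of an F(2, df_error) variable.
# NOTE: this closed form is valid ONLY for numerator df = 2.
p_value = (1 + 2 * f_value / df_error) ** (-df_error / 2)

print(f"MSt = {ms_treatment:.4f}, MSe = {ms_error:.4f}")
print(f"F = {f_value:.1f}, P = {p_value:.4f}")
```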
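Plant fertilization analysis using the GLM and MIXED
Procedures of the SAS System
options nocenter nodate ps=74 ls=74;
/* t = fertilizer level, r = replicate, ht = plant height (cm); @@ reads several observations per line */
data a; input t r ht @@; lines;
0 1 6.8  0 2 6.7  0 3 6.6
5 1 7.1  5 2 6.9  5 3 7.0
10 1 7.2  10 2 7.2  10 3 7.1
;
/* one-way ANOVA of height on fertilizer level, with Type III SS */
proc glm data=a; class t r; model ht = t / ss3; run;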
The GLM Procedure

Class   Levels   Values
t       3        0 5 10
r       3        1 2 3

Dependent Variable: ht

                         Sum of     Mean
Source            DF     Squares    Square    F       Pr > F
Model             2      .3356      .1678     21.6    .0018
Error             6      .0467      .0078
Corrected Total   8      .3822

R-Square   Coeff Var   Root MSE   ht Mean
0.878      1.268       0.088      6.956

Source   DF   Type III SS   MS      F       Pr > F
t        2    .336          .168    21.6    .0018
Proc Mixed data = a;
class t r ;
model ht = t;
run;
The Mixed Procedure
……………………..
……………………..
Covariance Parameter Estimates
Cov Parm   Estimate
Residual   .00778

Type 3 Tests of Fixed Effects
Effect   Num DF   Den DF   F       Pr > F
t        2        6        21.6    .0018
Notes:
1. The F value is a ratio of variances, and thus it:
• is never negative (the test is one-sided)
• can also be used to test equality of two variances
H0: $\sigma_1^2 = \sigma_2^2$
2. A two-sided, two-sample t-test and a one-way ANOVA
performed on the same data (when doable)
a) yield the same P
b) MSe from ANOVA is the same as $s_p^2$ from the t-test
c) $F = t^2$
3. A significant F from a one-way ANOVA
does not indicate which mean differs from which;
it only indicates that the means are not all
equal
- to find out which mean differs from which we must
perform a multiple mean comparison test (MMCT)
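Note 2 can be verified numerically. The sketch below uses the 0 and 5 fertilization levels of the plant example as the two groups (choosing these two levels is an arbitrary illustration) and computes both the pooled-variance t statistic and the one-way ANOVA F by hand.

```python
from math import sqrt
from statistics import mean, variance

# Two treatment levels from the plant example (levels 0 and 5).
a = [6.8, 6.7, 6.6]
b = [7.1, 6.9, 7.0]
na, nb = len(a), len(b)

# Pooled-variance two-sample t statistic.
sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
t_stat = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))

# One-way ANOVA on the same two groups.
grand = mean(a + b)
ss_treat = na * (mean(a) - grand) ** 2 + nb * (mean(b) - grand) ** 2
ss_error = sum((y - mean(a)) ** 2 for y in a) + sum((y - mean(b)) ** 2 for y in b)
ms_error = ss_error / (na + nb - 2)
f_stat = (ss_treat / 1) / ms_error  # treatment df = 2 - 1 = 1

print(f"t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}, sp2 = {sp2:.4f}, MSe = {ms_error:.4f}")
```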
Types of One-way ANOVA:
• Fixed-effect Model (Model I): Applicable when the
levels of a factor are
- specifically selected
- focus of interest
- repeatable
e.g., testing specific levels of a drug, fertilization,
diet, temperature, or light factor
$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$,
where:
$y_{ij}$ is the jth observation from the ith treatment level
$\mu$ is the grand mean
$\tau_i$ is the effect of the ith treatment level (a fixed
deviation of the mean of the ith level from µ)
$\varepsilon_{ij}$ is the random deviation of $y_{ij}$ from its
expected value (called residual)
$\varepsilon_{ij} = y_{ij} - \mu$ (if no treatment effect)
$\varepsilon_{ij} = y_{ij} - (\mu + \tau_i)$ (if treatment effect exists)
• Random-effect Model (Model II): Applicable when
performing a test to
- make a general statement regarding the effect of a
factor, and chose the levels of that factor randomly
among many possible levels
e.g.,testing different watershed types, species,
locations, etc., in general, and not interested
in the differences among any specific levels
$y_{ij} = \mu + A_i + \varepsilon_{ij}$
where: $A_i$ includes both a treatment effect and a
random effect (due to randomness of the
treatment levels)
- in one-way ANOVA, calculations are the same for
both Models (I & II) but the H0 is stated differently
- when two or more factors are involved, calculations
are not the same for the two Models
• Mixed-effect models (Model III): a mixture of both
fixed and random effect factors
ANOVA table

                                              Expected MS
Source   df     SS         MS                 Fixed                                  Random
Treat.   t-1    SStreat.   SStreat./(t-1)     $\sigma_e^2 + r \sum \tau_i^2 / (t-1)$   $\sigma_e^2 + r \sigma_t^2$
Error    n-t    SSerror    SSerror/(n-t)      $\sigma_e^2$                           $\sigma_e^2$
Total    n-1    SSTotal
Test of Assumptions
• Usually performed on residuals (why?)
• In an ANOVA setting, residuals are original data
transformed to have zero means per treatment level,
as well as across all treatment levels
• In a Regression setting, residuals are original data
transformed to have a zero mean across all
treatment levels (and not per treatment)
• If assumptions are met for residuals, they also are
met for the collected data
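The statement that ANOVA residuals have zero mean within each treatment level, and therefore overall, can be illustrated with the plant data. This is a minimal sketch; the dictionary layout is chosen here for convenience.

```python
from statistics import mean

heights = {0: [6.8, 6.7, 6.6], 5: [7.1, 6.9, 7.0], 10: [7.2, 7.2, 7.1]}

# ANOVA residual: each observation minus its own treatment-level mean.
residuals = {lvl: [y - mean(ys) for y in ys] for lvl, ys in heights.items()}

for lvl, res in residuals.items():
    print(lvl, [round(e, 3) for e in res], "mean:", round(mean(res), 10))

# Mean of the residuals pooled across all treatment levels.
overall_mean = mean(e for res in residuals.values() for e in res)
```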
I. Independence: within and among treatments
1. Its violation affects both the scope of inference and
the significance of the test
2. Can be improved greatly through randomization
- spending more resources (thought, effort, money)
3. Should be dealt with during experimental design (if
violated, no remedy during data analysis)
- is the responsibility of the researcher, not the
statistician
4. Is required for both parametric and non-parametric
procedures
5. Tests to check it are not generally available,
particularly when replication is low
- plotting the observations against a sequence, if such
a sequence can be identified, may be helpful to
detect serial correlation
A. Dependence within treatments (among replicates),
which may be positive or negative
e.g., testing the effects of several diets on weight
gain of a certain animal
1. Positive correlation: eating in the presence of others
increases feeding time (a social species?)
a. weight gain increases by a factor other than diet
b. within-treatment variance is underestimated
c. the F value is overestimated
d. spurious differences are detected
e. Type I error is increased
2. Negative correlation: eating in the presence of
others decreases feeding time (an aggressive
species?)
a. within-treatment variance is overestimated
b. the F value is underestimated
c. real differences are not detected
d. Type II error is increased
B. Dependence among treatments, which may be
positive or negative
e.g., temporal or spatial dependency, repeated measures, etc.
1. Compound symmetry vs circularity, will be
discussed later
C. To ensure independence:
1. Understand the system (biology, ecology, etc.)
being investigated
2. Prevent dependence
a. always assume dependence can occur and find
ways to avoid it
- keep EU / OU separate
- do not use the same EU / OU for more than
one treatment
- if not avoidable in some cases,
-- ensure to use a correct model for data
analysis
-- define the population being investigated and
do not extrapolate the results beyond that
population (the scope of inference is
decreased)
II. Normality
1. ANOVA is robust with regard to departures from normality, due to
the central limit theorem (CLT)
2. Is usually tested on residuals across treatment levels
when variances are equal
- graphically
-- stem-and-leaf diagram, Box plots, frequency
distribution, normal probability plot, residual plot
- statistically
-- D statistic (Kolmogorov-Smirnov), W statistic (Shapiro-Wilk),
but these are not always reliable; note that
a. for following normally distributed values
1.8048,-0.0799,0.3966,-1.0833,2.2383,-0.6242,0.51376,-0.0866,
-.5942,0.0319,-0.7378,-0.2501,0.6850,-0.8042,-0.74428
the SAS Proc Univariate test of normality
returns a P < 0.05, rejecting normality
b. for numbers from 1 to 8, from a uniform
distribution, the same routine returns a
P > 0.9, not rejecting normality
c. sensitivity (power to reject) of such tests
increases with increasing sample size, but
increased sample size improves normality of the
means (CLT)
3. The assumption is, in fact, for the sampling distribution
of the mean within treatment levels, and this can
never be tested with only one sample at each
treatment level; this is where the CLT helps!
III. Equality (homogeneity, homoscedasticity) of
error variances calculated within treatment levels
(i.e., $\sigma^2_{1,error} = \sigma^2_{2,error} = \dots = \sigma^2_{t,error} = \sigma^2_{error}$)
1. ANOVA is robust with regard to departure from
this assumption, particularly when the study is
large (more than 5 treatments and 6 replicates) and
balanced (equal sample size)
2. When study is small and unbalanced
A. if the larger variance is associated with the
smaller sample
1. the F value is overestimated
2. Type I error is underestimated
B. if the larger variance is associated with the
larger sample:
1. the F value is underestimated
2. Type I error is overestimated
3. Type II error is increased (decreased power)
3. Statistical test of equality of variances using SAS
A. PROC TTEST (for one explanatory variable with
two levels)
1. Folded-F method (performed by default)
B. PROC MIXED: Null Model Likelihood Ratio Test,
with the use of the 'REPEATED' statement, and
for one-way ANOVA only
3.1. Pattern in the variances detected from:
plots of residuals versus treatment means
('Predicted' in SAS jargon)
Note: Variances might increase or decrease
with treatment means without
differing significantly.
Such a pattern is still detrimental to ANOVA
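The folded-F statistic mentioned under PROC TTEST is the larger sample variance divided by the smaller one. A minimal sketch of the statistic itself, computed outside SAS; the function name `folded_f` and the choice of the 0 and 10 fertilization levels as the two groups are assumptions for illustration, and no P value is computed here.

```python
from statistics import variance

def folded_f(sample_a, sample_b):
    """Return (F', numerator df, denominator df) for the folded-F test.

    F' is the larger sample variance divided by the smaller one.
    """
    va, vb = variance(sample_a), variance(sample_b)
    if va >= vb:
        return va / vb, len(sample_a) - 1, len(sample_b) - 1
    return vb / va, len(sample_b) - 1, len(sample_a) - 1

# Levels 0 and 10 of the plant example.
f_prime, df_num, df_den = folded_f([6.8, 6.7, 6.6], [7.2, 7.2, 7.1])
print(f"F' = {f_prime:.2f} with ({df_num}, {df_den}) df")
```

With only three replicates per level, such a test has very little power, which is consistent with the cautions on small studies above.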
[Figure: three residual plots]
- Residuals vs. Predicted: satisfactory
- Residuals vs. Predicted: needs log, weighted least squares, or
square-root transformation
- Residuals vs. Time order: time should be included in the model
Data Transformation
• May solve lack of normality and/or heterogeneity of
variances
• Should be monotonic (retain the order of the means)
• Should be done only to solve a problem (routine
transformation not ok, unless suggested by the nature
of the system being investigated: distance moved,
growth)
1. Test the original data, transform if needed, and test
the transformed data; if the problem:
A. is solved, proceed with data analysis
B. persists, stop and think
2. Most common transformations:
A. square-root transformation (counts, Poisson)
B. log-transformation
C. arcsine transformation of percentages and
proportions (binomial data)
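The three common transformations can be sketched as below. All data values are hypothetical, invented only to show the calls; the arcsine transformation is taken here in its usual form, the arcsine of the square root of the proportion.

```python
from math import asin, log, sqrt

counts = [0, 3, 8, 15]            # hypothetical count data
sizes = [1.2, 3.5, 10.0, 42.0]    # hypothetical positive, right-skewed data
proportions = [0.05, 0.30, 0.80]  # hypothetical proportions in [0, 1]

sqrt_counts = [sqrt(c) for c in counts]                # square-root transformation
log_sizes = [log(s) for s in sizes]                    # natural-log transformation
arcsine_props = [asin(sqrt(p)) for p in proportions]   # arcsine (angular) transformation

print([round(v, 3) for v in sqrt_counts])
print([round(v, 3) for v in log_sizes])
print([round(v, 3) for v in arcsine_props])
```

Each transformation is monotonic on its valid range, so the ordering of the values is preserved, as required above. The log transformation needs strictly positive data.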
3. The kind of transformation should be reported, so others
can compare your results with theirs
4. Significance levels based on the transformed data
do not hold for the original data
5. Means (but not SD or $s^2$) should be transformed back
to the original scale for presentation
6. If transformation does not help, note that in large (> 5
treatments, > 6 replicates) balanced studies, ANOVA
is robust with regard to departures from these ideal
conditions
- perform the analysis; the conclusion is valid,
particularly if no treatment effect is detected (why?)
7. If the sample size is small, variances are unequal,
transformation does not help, the analysis is performed,
and a significant treatment effect is detected
- use the study as a pilot study to plan a larger
experiment
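Regarding point 5, a back-transformed mean is generally not the arithmetic mean of the original data. The sketch below uses hypothetical values and a log transformation; in that case the back-transformed mean is the geometric mean.

```python
from math import exp, log
from statistics import mean

sizes = [1.2, 3.5, 10.0, 42.0]  # hypothetical positive data

mean_of_logs = mean([log(s) for s in sizes])
back_transformed = exp(mean_of_logs)  # geometric mean of the original data
arithmetic_mean = mean(sizes)

print(f"back-transformed mean = {back_transformed:.3f}")
print(f"arithmetic mean       = {arithmetic_mean:.3f}")
```

The two values differ noticeably for skewed data, which is one reason the kind of transformation should always be reported alongside the results.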
