RNA was extracted using the RNeasy kit (Qiagen) according to the manufacturer's instructions.
Label
biotin
Label protocol
Biotinylated cRNA was prepared from 1 ug of total RNA according to the standard Affymetrix protocol (Expression Analysis Technical Manual, 2001, Affymetrix). For samples yielding less than 1 ug of total RNA, cRNA was amplified from 10 ng of total RNA.
Hybridization protocol
Following fragmentation, 10 ug of cRNA was hybridized for 16 hr at 45 °C to the GeneChip Drosophila Genome Array. GeneChips were washed and stained in the Affymetrix Fluidics Station 400.
Scan protocol
GeneChips were scanned using the Hewlett-Packard GeneArray Scanner G2500A.
Description
Responder
Data processing
All statistical analysis was conducted using the R environment and the R packages affy and affyPLM. All the raw data tables are included in the accompanying CD in the folder (OV01). We first read the data files into R.

> library(affy)
> library(affyPLM)
> Data <- ReadAffy()
> sampleNames(Data) <- c('12a', '14a', '15a', '16a', '19a', '24a2', '27a2',
+ '28a2', '2a', '33a2', '34a', '38a2', '39a',
+ '41a', '43a', '44a', '5a2', '7a', '8a', '9a')

Normalization

Background subtraction and normalization were performed using the method of Li and Wong. This method finds a rank-invariant set of probes across the arrays and uses it for between-array normalization.

> Data.norm <- expresso(Data, normalize.method='invariantset',
+ bg.correct=FALSE, pmcorrect.method='pmonly', summary.method='liwong')
> Data.norm.values <- exprs(Data.norm)

Present and absent calls

Affy probes are arranged as 11 pairs of perfect matches and mismatches for each Affymetrix ID present on the array. Ideally, a mismatch should yield minimum intensity values. Thus, for a specific hybridization to have occurred, the intensity from the perfect match probes should be “significantly” higher than that from the mismatched probes. Probe-level data were used to categorize genes as present (p < 0.04), absent (p > 0.06) or marginal (0.04 ≤ p ≤ 0.06). The mean percentage of present calls across all arrays was 51.1% (sd = 8.9%). On every array, more than 33% of the probe pair sets were called present.

> Calls <- mas5calls(Data)
> Calls.data <- exprs(Calls)

Data filtering

Data filtering was performed in two stages. First, genes that were either absent or marginal in more than 6 arrays (30% of the arrays) were filtered out. After applying this filter, only 10,161 genes (out of 22,227) remained for downstream analysis.
> Calls.data.AP <- Calls.data=='A' | Calls.data=='M'
> Calls.data.AP.sum <- apply(Calls.data.AP, 1, sum)
> Calls.data.AP.filter <- which(Calls.data.AP.sum > 6)
> Calls.data.AP.retain <- which(Calls.data.AP.sum <= 6)
> eset.AP.retained <- Data.norm.values[Calls.data.AP.retain,]
> length(Calls.data.AP.retain)

Second, genes that showed non-significant variation in expression across arrays were removed according to the method adopted by Simon et al., in which only genes that were significantly more variable across arrays than the median variance of all genes were included. This was done by computing the quantity (n − 1) ∗ (Var.i/Var.med), where Var.i is the variance of the log2 expression of gene i across the n arrays and Var.med is the median of these variances over all genes, and comparing it against the chi-square distribution with n − 1 degrees of freedom. After this step, only 3426 genes were retained for further downstream analysis.

> eset.AP.retained.var <- apply(log2(eset.AP.retained), 1, var)
> eset.AP.retained.var.quant <-
+ (ncol(eset.AP.retained)-1)*(eset.AP.retained.var/median(eset.AP.retained.var))
> eset.AP.retained.var.quant.centile <- pchisq(eset.AP.retained.var.quant,
+ df=ncol(eset.AP.retained)-1)
# Genes above the 95th centile (p<0.05) were retained
> Var.retain <- which(eset.AP.retained.var.quant.centile > 0.95)
> eset.AP.var.retained <- eset.AP.retained[Var.retain,]
> dim(eset.AP.var.retained)
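
The two filtering stages can be tried without the raw CEL files. The following is a minimal sketch on simulated data, using base R only. The objects calls, expr, keep1 and keep2 are hypothetical names introduced for illustration, and the simulated values do not reproduce the gene counts reported above.

```r
# Toy illustration of the two-stage filter; all data here are simulated.
set.seed(1)
n.genes  <- 200
n.arrays <- 20

# Simulated detection calls: "P" (present), "M" (marginal), "A" (absent)
calls <- matrix(sample(c("P", "M", "A"), n.genes * n.arrays,
                       replace = TRUE, prob = c(0.70, 0.05, 0.25)),
                nrow = n.genes, ncol = n.arrays)

# Simulated positive expression values; the first 20 genes are given a
# larger spread on the log2 scale so that some genes pass the variance filter
sds  <- c(rep(1.5, 20), rep(0.3, n.genes - 20))
expr <- matrix(2^rnorm(n.genes * n.arrays, mean = 8, sd = sds),
               nrow = n.genes, ncol = n.arrays)

# Stage 1: drop genes called absent or marginal in more than 6 arrays
not.present   <- calls == "A" | calls == "M"
n.not.present <- rowSums(not.present)
keep1 <- which(n.not.present <= 6)
expr1 <- expr[keep1, , drop = FALSE]

# Stage 2: keep genes whose log2 variance is significantly larger than the
# median variance, using (n - 1) * Var.i / Var.med against chi-square(n - 1)
v       <- apply(log2(expr1), 1, var)
stat    <- (ncol(expr1) - 1) * v / median(v)
centile <- pchisq(stat, df = ncol(expr1) - 1)
keep2   <- which(centile > 0.95)
expr2   <- expr1[keep2, , drop = FALSE]

# Number of genes remaining after each stage
c(start = n.genes, after.calls = nrow(expr1), after.variance = nrow(expr2))
```

Using rowSums on the logical matrix gives the same counts as apply(..., 1, sum) in the session above, and drop = FALSE keeps the result a matrix even if only one gene survives a stage.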