dyebias.apply.correction {dyebias} | R Documentation |
Corrects the gene- and slide-specific dye bias in a data set, using the GASSCO method by Margaritis et al.
data.norm |
A marrayNorm object containing the data whose dye bias should
be corrected. This object must be a complete marrayNorm object. In particular,
maLabels(maGnames(data.norm)) should be set and indicate the
identities of the spots. Spots with the same ID should contain the same
oligo or cDNA sequence, and will receive the same dye bias correction.
|
iGSDBs |
A data frame with the intrinsic gene specific dye bias per reporter
(i.e., oligo or cDNA). The data frame would typically have come
from a call to dyebias.estimate.iGSDBs , but this is
not necessary; other estimates can also be used.
The data frame must have (at least) the following columns:
The order of the rows in this data frame is irrelevant. There must be no rows with duplicate reporterId in this frame.
For any reporter in data.norm that is not in the
iGSDBs data frame, an iGSDB of 0.00 is used, i.e. data from
such reporters are not dye bias-corrected.
|
estimator.subset |
An index indicating which reporters are fit to be used as estimators of the slide bias. This set of reporters is used throughout the whole data set. Reporters that are typically excluded are those corresponding to parasitic DNA elements or mitochondrial genes. |
application.subset |
An index indicating which values must be dye
bias-corrected. It should be either a vector with as many values as
spots, or a matrix of the same dimensions as
maM(data.norm). In the former case, the selected spots on all
slides will be dye bias-corrected; in the latter, selected spots on
selected slides will be corrected.
Often it is prudent not to dye bias-correct measurements that are close to the detection limit or close to signal saturation. A convenience function for this is provided; see dyebias.application.subset .
|
dyebias.percentile |
The slide bias estimation uses a small subset of reporters having the strongest green or red iGSDB, as specified by this percentile. The default should suffice in practically all cases. |
minmaxA.perc |
To obtain a robust estimate of the slide bias, the range of the
average expression A is trimmed by minmaxA.perc percent
on both sides; only reporters lying inside this trimmed range are
considered as estimators of the slide bias. The default value is 25,
meaning that the top dyebias.percentile red- and green-biased
spots within the middle two average expression quartiles are
used. This should suffice in practically all cases.
|
minA.abs |
If specified, reporters with an average expression
(A) lower than this value are never considered as estimators
of the slide bias. If not specified, reporters with an
A-percentile < minmaxA.perc are not considered.
|
maxA.abs |
If specified, reporters with an average expression
(A) greater than this are never considered as estimators of the
slide bias. If not specified, reporters with an A-percentile >
100-minmaxA.perc are not considered. |
verbose |
Logical specifying whether to be verbose or not. |
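The requirements on the iGSDBs argument described above (unique reporter IDs, and an iGSDB of 0 for reporters that are absent from the data frame) can be illustrated with a minimal base-R sketch. The column name dyebias, the toy values and the plain character vector standing in for maLabels(maGnames(data.norm)) are assumptions made for this illustration only; this is not the package's internal code.

```r
# Hypothetical iGSDB table. 'reporterId' and 'A' are mentioned in this help
# page; the column name 'dyebias' is an assumption for this sketch.
iGSDBs <- data.frame(
  reporterId = c("r1", "r2", "r3"),
  dyebias    = c(0.8, -0.5, 0.1),
  A          = c(9.2, 10.1, 8.7),
  stringsAsFactors = FALSE
)

# The data frame must not contain rows with duplicate reporter IDs.
stopifnot(!anyDuplicated(iGSDBs$reporterId))

# Stand-in for maLabels(maGnames(data.norm)): one ID per spot.
# 'r2' is spotted twice; 'r4' has no iGSDB estimate.
spot.ids <- c("r1", "r2", "r2", "r4", "r3")

# Look up the iGSDB per spot. Spots sharing an ID get the same value;
# reporters missing from the table get 0, i.e. they are left uncorrected.
spot.iGSDB <- iGSDBs$dyebias[match(spot.ids, iGSDBs$reporterId)]
spot.iGSDB[is.na(spot.iGSDB)] <- 0
print(spot.iGSDB)
```

Because the lookup goes through match(), the order of the rows in the iGSDBs data frame does not matter.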
This function corrects the gene-specific dye bias of two-colour microarrays using the GASSCO method. This method is general, robust and fast, and is based on the observation that the total bias per gene is the product of a slide-specific factor (strongly related to the labeling percentage) and an intrinsic gene-specific factor (iGSDB), which is strongly related to the probe sequence.
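The product model described above can be sketched in a few lines of base R. All values are invented toy numbers, and the assumption that the correction amounts to subtracting the modelled bias from M is made for this illustration; see the vignette and the publication for the actual procedure.

```r
# Toy values (assumptions, not taken from the package or the paper).
iGSDB        <- c(g1 = 0.8, g2 = -0.5, g3 = 0.0)  # intrinsic gene-specific dye bias
slide.factor <- c(s1 = 1.2, s2 = 0.4)             # slide-specific factor

# Total bias per gene and slide: product of the two factors.
total.bias <- outer(iGSDB, slide.factor)

# 'True' log ratios, identical on both slides in this toy example.
M.true <- matrix(c(1.0, 0.0, -0.3,
                   1.0, 0.0, -0.3),
                 nrow = 3,
                 dimnames = list(names(iGSDB), names(slide.factor)))

# Observed M values carry the bias; subtracting the modelled bias
# (an assumption of this sketch) recovers the true values.
M.observed  <- M.true + total.bias
M.corrected <- M.observed - total.bias
print(total.bias)
```

A gene with an iGSDB of 0 (g3 here) is unaffected on every slide, and a slide with a small slide factor (s2) shows little bias for every gene.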
The slide bias is estimated from the total bias of the
dyebias.percentile percentage of reporters having the strongest
iGSDB. The iGSDBs can be estimated with dyebias.estimate.iGSDBs.
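The selection of slide bias estimators by percentile can be sketched as follows. The data are simulated, the value used for dyebias.percentile is chosen only for illustration, positive values are assumed to denote red bias, and the ratio-of-medians estimate of the slide factor is a simplified stand-in that may differ from what the package computes.

```r
set.seed(1)
n     <- 1000
iGSDB <- rnorm(n, sd = 0.3)            # simulated intrinsic dye biases

# Simulate one slide: total bias = slide factor * iGSDB, plus some noise.
true.slide.factor <- 1.5
M <- true.slide.factor * iGSDB + rnorm(n, sd = 0.05)

dyebias.percentile <- 5                # illustrative value, in percent

# Reporters with the strongest red (largest) and green (smallest) iGSDB.
red   <- iGSDB >= quantile(iGSDB, 1 - dyebias.percentile / 100)
green <- iGSDB <= quantile(iGSDB, dyebias.percentile / 100)

# Simplified slide factor: observed bias of the estimators relative to
# their intrinsic bias (an assumption of this sketch).
slide.factor <- (median(M[red]) - median(M[green])) /
                (median(iGSDB[red]) - median(iGSDB[green]))
print(slide.factor)
```

Using only the most strongly biased reporters keeps the estimate away from the many reporters whose iGSDB is close to zero and therefore carries little information about the slide.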
If the signal of certain oligos is too weak or, in contrast, tends to
be saturated, they are not good estimators of the slide bias.
Therefore, only reporters with an average expression level A
that is not too extreme are allowed to be slide bias estimators. (This
is the reason for the A-column in the iGSDBs data frame.)
Full control over which reporters to allow as slide bias estimators is
given by the arguments minmaxA.perc, minA.abs, and maxA.abs; see there
for details. To not exclude any reporter (e.g., when A is not
available and therefore artificially set), you can use
minA.abs = -Inf and maxA.abs = Inf.
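The interplay of minmaxA.perc, minA.abs and maxA.abs can be mimicked with a small base-R helper. The function allowed.estimators is hypothetical and written only for this illustration (including its use of NULL for "not specified"); it follows the rules as described in the argument list and is not the package's implementation.

```r
set.seed(2)
A <- runif(200, min = 6, max = 16)     # toy average expression values

# Hypothetical helper: TRUE for reporters whose A lies inside the allowed
# range. Absolute limits, when given, take the place of the percentile cut.
allowed.estimators <- function(A, minmaxA.perc = 25,
                               minA.abs = NULL, maxA.abs = NULL) {
  lo <- if (is.null(minA.abs)) quantile(A, minmaxA.perc / 100) else minA.abs
  hi <- if (is.null(maxA.abs)) quantile(A, 1 - minmaxA.perc / 100) else maxA.abs
  A >= lo & A <= hi
}

default.ok <- allowed.estimators(A)    # middle two quartiles of A
all.ok     <- allowed.estimators(A, minA.abs = -Inf, maxA.abs = Inf)

print(c(default = sum(default.ok), unrestricted = sum(all.ok)))
```

With the default trimming only the middle half of the A range remains; with infinite absolute limits no reporter is excluded on the basis of A.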
For further details concerning the method, see the dyebias
vignette and the publication. If your research benefits from using this
package, we kindly request that you cite this work.
The data returned is a list with the following elements:
data.corrected |
A marrayNorm object of the same 'shape' as
the input data.norm , but with corrected M values.
|
estimators |
Another list, containing the details of the
reporters that were used to obtain an estimate of the slide bias.
The contents of the estimators list are:
|
summary |
A data frame summarizing the correction process per slide. It
consists of the following columns:
|
Note that the input data should be normalized, and that the dye swaps should not have been swapped back (if needed, this can of course be done afterwards).
Philip Lijnzaad p.lijnzaad@umcutrecht.nl
Margaritis, T., Lijnzaad, P., van Leenen, D., Bouwmeester, D., Kemmeren, P., van Hooff, S.R. and Holstege, F.C.P. (2009). Adaptable gene-specific dye bias correction for two-channel DNA microarrays. Molecular Systems Biology, submitted.
dyebias.estimate.iGSDBs, dyebias.application.subset, dyebias.rgplot, dyebias.maplot, dyebias.boxplot, dyebias.trendplot
## First load data and estimate the iGSDBs
## (see dyebias.estimate.iGSDBs)

### choose the estimators and which spots to correct:
estimator.subset <- dyebias.umcu.proper.estimators(maInfo(maGnames(data.norm)))

### choose which genes to dye bias correct:
application.subset <- (maW(data.norm) == 1 &
                       dyebias.application.subset(data.raw=data.raw, use.background=TRUE))

### do the correction:
correction <- dyebias.apply.correction(data.norm=data.norm,
                                       iGSDBs = iGSDBs.estimated,
                                       estimator.subset=estimator.subset,
                                       application.subset = application.subset,
                                       verbose=FALSE)

## Not run: 
edit(correction$summary)
## End(Not run)

## give overview:
correction$summary[,c("slide", "file", "reduction.perc", "p.value")]

## and summary:
summary(as.numeric(correction$summary[, "reduction.perc"]))
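The two accepted shapes of application.subset (a vector with one value per spot, or a matrix with the dimensions of maM(data.norm)) can be illustrated without the marray classes. The helper apply.subset, the toy matrices and the constant bias are hypothetical and serve only to show how the two index shapes select measurements; the subtraction is likewise an assumption of this sketch.

```r
# Toy M matrix: 4 spots (rows) on 3 slides (columns), and a pretend bias.
M    <- matrix(1,    nrow = 4, ncol = 3)
bias <- matrix(0.25, nrow = 4, ncol = 3)

# Vector form: one logical per spot, applied to all slides.
subset.vec <- c(TRUE, FALSE, TRUE, TRUE)

# Matrix form: same dimensions as M, so single measurements can be excluded,
# e.g. a spot that is saturated on slide 2 only.
subset.mat <- matrix(subset.vec, nrow = 4, ncol = 3)
subset.mat[3, 2] <- FALSE

# Hypothetical helper: correct only the selected measurements.
apply.subset <- function(M, bias, subset) {
  if (is.matrix(subset)) {
    stopifnot(identical(dim(subset), dim(M)))
    idx <- subset
  } else {
    stopifnot(length(subset) == nrow(M))
    idx <- matrix(subset, nrow = nrow(M), ncol = ncol(M))  # repeat per slide
  }
  M[idx] <- M[idx] - bias[idx]
  M
}

M.vec <- apply.subset(M, bias, subset.vec)
M.mat <- apply.subset(M, bias, subset.mat)
print(M.mat)
```

In the example above, the expression combining maW(data.norm) == 1 with dyebias.application.subset() corresponds to the matrix form.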