An in silico expression analysis can be performed in order to validate short DNA sequences
as cis-elements responsive to different biotic and abiotic stimuli. The potential
cis-sequence is used to perform a genome-wide Arabidopsis thaliana promoter screening.
A promoter length of 250 nt, 500 nt or 1000 nt can be selected for analysis. Genes potentially
regulated by small RNAs can be excluded from the analysis.
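
The screening step can be illustrated with a minimal sketch. It is not PathoPlant's implementation: the function names, the input dictionary of promoter sequences, the support for IUPAC ambiguity codes and the search on both strands are all assumptions made for illustration. Each promoter is assumed to be given 5' to 3' with its last base directly upstream of the reference point.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
# (Supporting ambiguity codes is an assumption, not stated in the text.)
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly ambiguous) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def motif_regex(motif):
    """Compile a motif into a regex; the lookahead also finds overlapping matches."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + pattern + "))")


def screen_promoters(promoters, motif, length=500):
    """Hypothetical screening: map gene id -> list of (relative position, strand).

    promoters: dict of gene id -> upstream sequence (assumed input format).
    length:    number of nucleotides upstream of the reference point to search.
    Positions are negative offsets of the match start from the reference point.
    """
    forward = motif_regex(motif)
    reverse = motif_regex(reverse_complement(motif))
    hits = {}
    for gene, seq in promoters.items():
        region = seq.upper()[-length:]
        matches = [(m.start() - len(region), "+") for m in forward.finditer(region)]
        matches += [(m.start() - len(region), "-") for m in reverse.finditer(region)]
        if matches:
            hits[gene] = sorted(set(matches))
    return hits
```

Calling `screen_promoters(promoters, "TTGAC", length=250)` on such a dictionary would return only the genes whose selected promoter region contains the motif on either strand.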
The search result displays the total number of genes found (3000 max.) as well as
the number of genes potentially regulated by small RNAs. Selecting the respective
links lists the genes in tables. Gene descriptions are displayed when the arrow is selected.
The gene sets are used to calculate mean induction factors for every Arabidopsis thaliana microarray experiment stored
within the PathoPlant database. These mean values are normalized according to overall expression
values of each stimulus. This results in a ranked list of microarray experiments according
to their mean induction factors. The stimuli to which the potential cis-element most probably responds
can be identified by looking at the highest-ranked stimuli. For each stimulus, the number of expression values
used for mean induction factor calculation is given. The corresponding raw p-value as well as the BH (FDR) adjusted p-value are calculated
for each mean factor to assess its significance. By default, the table is ranked by mean induction
factors and can be resorted in descending or ascending order by selecting the headers Stimulus,
Mean induction factor, raw p-value or BH (FDR) adjusted p-value. The list of genes obtained by the in silico expression
analysis can be submitted directly to the Microarray expression function of PathoPlant to obtain
expression data of these genes for all stimuli. This list can also be transferred
to AthaMap's Gene Analysis function for a transcription factor binding site analysis.
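
The ranking described above can be sketched as follows. This is an illustration under stated assumptions, not the method used by PathoPlant: the text does not specify how the normalization or the raw p-values are computed, so the sketch assumes an arithmetic mean divided by a per-stimulus overall mean and takes the raw p-values as given. Only the Benjamini-Hochberg (BH) adjustment follows its standard definition. All function and parameter names are hypothetical.

```python
from statistics import mean


def mean_induction_factors(factors_by_stimulus, gene_set, overall_mean):
    """Rank stimuli by the normalized mean induction factor of a gene set.

    factors_by_stimulus: dict stimulus -> dict gene id -> induction factor.
    overall_mean:        dict stimulus -> overall mean used for normalization
                         (assumed normalization: simple division).
    """
    rows = []
    for stimulus, factors in factors_by_stimulus.items():
        # Only genes present in the experiment contribute expression values.
        values = [factors[gene] for gene in gene_set if gene in factors]
        if not values:
            continue
        rows.append({
            "stimulus": stimulus,
            "mean_factor": mean(values) / overall_mean[stimulus],
            "n_values": len(values),
        })
    # Default ordering: highest mean induction factor first.
    rows.sort(key=lambda row: row["mean_factor"], reverse=True)
    return rows


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

Re-sorting the table by another column corresponds to sorting `rows` with a different key, for example `key=lambda row: row["stimulus"]`.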
The number of expression values can be selected to show detailed information about the genes and
their individual expression values for a given stimulus.
By doing so, a new window or tab will open showing the number of genes present on the microarray
chip and a table with the genes, positional information from the promoter screening and
gene expression details. The orientation and relative distance describe the first
match position relative to the point of reference, which is the transcription start site (TSS)
if known, and otherwise the translation start site (ATG). The individual and mean induction factors
of each gene are given as well as the number of replicates (n) and the base-10 logarithm of the
standard deviation for mean induction factor calculation of each gene.
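
The choice of the reference point can be sketched as below. The function name, the use of genomic coordinates, the strand handling and the sign convention (negative values lie upstream) are assumptions for illustration and are not taken from PathoPlant.

```python
def relative_position(match_start, atg, tss=None, strand="+"):
    """Return (reference label, distance) for a match position.

    match_start: genomic coordinate of the first match position (assumed input).
    atg:         genomic coordinate of the translation start site.
    tss:         genomic coordinate of the transcription start site, or None
                 if it is not known.
    strand:      strand of the gene; distances are mirrored for "-" genes.
    """
    # Prefer the TSS when it is known, otherwise fall back to the ATG.
    if tss is not None:
        label, reference = "TSS", tss
    else:
        label, reference = "ATG", atg
    # Negative distances mean the match lies upstream of the reference point.
    if strand == "+":
        distance = match_start - reference
    else:
        distance = reference - match_start
    return label, distance
```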