Gene Expression analysis

 

Additional files for: "Systematic Review of Genome-wide Gene Expression Studies of Bipolar Disorder", Seufiddin et al. (manuscript submitted)

The two main scripts are:

 

1) Individual Study Analysis:

 

2) Mega-Analysis:

  • Download: Mega-Analysis.zip (zip file containing the required scripts and sample input files)
  • Requirements:
    • R packages: lme4, languageR, arm
    • Sample input file: a single gene expression data matrix created by combining all individually processed studies into one matrix
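
The combined input matrix described above can be pictured with a minimal sketch. It is written in Python purely for illustration (the actual scripts are R), and it is not the authors' procedure. It assumes that each processed study is a matrix keyed by a feature ID, that only features present in every study are kept, and that sample names are prefixed with the study name. The study names, feature IDs, and values are hypothetical.

```python
# Illustrative sketch only: study names, feature IDs, and values are made up,
# and keeping only shared features is an assumption, not the authors' method.

def combine_studies(studies):
    """Merge per-study matrices (feature -> {sample: value}) into one matrix.

    Only features present in every study are kept, and each sample column
    is prefixed with its study name so that column names stay unique.
    """
    # Features shared by all studies, sorted for a reproducible row order.
    shared = sorted(set.intersection(*(set(m) for m in studies.values())))
    combined = {}
    for feature in shared:
        row = {}
        for study, matrix in studies.items():
            for sample, value in matrix[feature].items():
                row[f"{study}.{sample}"] = value
        combined[feature] = row
    return combined


studies = {
    "studyA": {
        "GENE1": {"s1": 8.1, "s2": 7.9},
        "GENE2": {"s1": 5.2, "s2": 5.4},
    },
    "studyB": {
        "GENE1": {"s1": 8.3},
        "GENE3": {"s1": 6.0},
    },
}

combined = combine_studies(studies)
print(combined)
```

Here only GENE1 survives, because it is the only feature measured in both hypothetical studies.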

 

Sample output: output.zip

| Output (file name) | Description |
|---|---|
| study.name_affy.RObject.R | Additional output when the input is CEL files: an Affy R object (for later use) |
| study.name_fRMA.normalized.txt | Additional output when the input is CEL files: a normalized gene expression data matrix (gedm) (rows = features/probes/probesets, columns = samples) |
| study.name_boxplots.pre.and.post.normalization.png | Additional output when the input is CEL files: box plots showing the distribution of intensities for each microarray/sample pre- and post-normalization |
| study.name_boxplots.post.normalization.png | Box plots showing the distribution of intensities for each microarray/sample post-normalization |
| study.name_NegativeStrandMatchingProbesets.intensity.distributions.png | Density plot showing the distribution of intensities for different categories of Negative Strand Matching Probesets (PANP-provided NSMPs, all probes, no.transcript, sense.transcript, antisense.transcript, overlap.transcript) |
| study.name_myPcalls.data.gedm.panp.txt | Presence (P)/Absent (A)/Marginal (M) calls for all probesets in each individual sample from the Presence/Absence Calling algorithm (PANP) |
| study.name_myPcalls.data.gedm.panp.converted.to.binary.txt | P/A/M calls for all probesets in each individual sample from the Presence/Absence Calling algorithm (PANP), converted to numeric calls, i.e. P = 1, A = 0, M = 0.5 |
| study.name_myPcalls.data.gedm.barcode.GPL96.txt | P/A calls for all probesets in each individual sample from the Presence/Absence Calling algorithm (barcode) for the hgu133a microarray platform |
| study.name_myPcalls.data.gedm.barcode.GPL570.txt | P/A calls for all probesets in each individual sample from the Presence/Absence Calling algorithm (barcode) for the hgu133plus2 microarray platform |
| study.name_PAcalls.distribution.PANP.vs.Barcode.png | Density plot showing the distribution of P/A/M calls, comparing the PANP algorithm with the barcode algorithm; produced only for the hgu133a and hgu133plus2 microarray platforms |
| study.name_data.gedm.cases.controls.txt | Normalized gene expression data matrix (gedm) (rows = features/probes/probesets, columns = samples) for samples included in the demographic file, after removing outliers (if any) |
| study.name_myPcalls.data.gedm.matrix.cases.controls.txt | P/A/M calls for all probesets in each individual sample from the Presence/Absence Calling algorithm (PANP), converted to numeric calls (P = 1, A = 0, M = 0.5), for samples included in the demographic file, after removing outliers (if any) |
| study.name_data.gedm.filtered.by.subjects.and.by.probesets.txt | Normalized gene expression data matrix (gedm) (rows = features/probes/probesets, columns = samples) filtered by subjects (outliers, if any) and by probesets |
| study.name_surrogate.variables.txt | Surrogate variables matrix for each individual |
| study.name_residuals.data.gedm.post.sva.txt | Residual intensities matrix after removing the effects of surrogate variables from the actual intensity, i.e. (observed intensity - expected intensity) (rows = samples, columns = features/probes/probesets) |
| study.name_residuals.data.gedm.transposed.post.sva.txt | The same residual intensities matrix, transposed (rows = features/probes/probesets, columns = samples) |
| study.name_probe.statistics.regression.residuals.post.sva.txt | Linear regression of residual intensities on group, coded e.g. "1" (Group 1 microarrays, e.g. cases) and "0" (Group 2 microarrays, e.g. controls) |
| study.name_probe.statistics.regression.pre.and.post.sva.txt | Linear regression of actual intensities on group, coded e.g. "1" (Group 1 microarrays, e.g. cases) and "0" (Group 2 microarrays, e.g. controls), with surrogate variables included as covariates in the model |
| study.name_probe.statistics.regression.significants.pre.and.post.sva.p.less.than.0.05.txt | The output file above, subset to probesets with an adjusted association p-value < 0.05 |
| study.name_unadjusted.png | 1) Volcano plot, 2) MA plot, and 3) histogram of association results pre- and post-SVA |
| study.name_adjusted.png | 1) Volcano plot, 2) MA plot, and 3) histogram of association results pre- and post-SVA |
| study.name_residuals.data.gedm.transposed.post.sva.probesets.mapped.to.gene.txt | Residual intensities matrix after removing the effects of surrogate variables from the actual intensity, i.e. (observed intensity - expected intensity), transposed and mapped to gene (rows = genes, columns = samples) |
| metaname_mega_statistics.txt | Results of the mega-analysis |
| metaname_meta_statistics.txt | Results of the meta-analysis |
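
As a small illustration of the call conversion mentioned in the output listing above (P = 1, A = 0, M = 0.5), the following Python sketch maps PANP-style P/A/M calls to numeric values. It is not the code from the distributed scripts, which are written in R. The function name, probeset IDs, and calls are hypothetical; only the mapping itself comes from the listing.

```python
# Mapping stated in the output listing: P = 1, A = 0, M = 0.5.
CALL_TO_NUMERIC = {"P": 1.0, "A": 0.0, "M": 0.5}


def convert_calls(calls):
    """Convert a probeset -> list-of-calls mapping to numeric values.

    Raises KeyError if a call other than P, A, or M is encountered.
    """
    return {
        probeset: [CALL_TO_NUMERIC[call] for call in row]
        for probeset, row in calls.items()
    }


# Hypothetical calls for two probesets across three samples.
calls = {
    "probeset_1": ["P", "A", "M"],
    "probeset_2": ["P", "P", "A"],
}

numeric = convert_calls(calls)
print(numeric)
```

Failing loudly on an unexpected call is a design choice of this sketch, so that a malformed input file is not silently converted.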
