refine.bio
499 results

Accession: GSE53355
Preserving biological heterogeneity with personalized genomics batch correction
  • Organism: Homo sapiens
  • 39 Downloadable Samples
  • Technology: Affymetrix Human Genome U133 Plus 2.0 Array (hgu133plus2)

Description

Motivation: Sample source, procurement process, and other technical variations introduce batch effects into genomics data. Algorithms to remove these artifacts enhance differences between known biological covariates, but also carry potential concern of removing intra-group biological heterogeneity and thus any personalized genomic signatures. As a result, accurate identification of novel subtypes from batch corrected genomics data is challenging using standard algorithms designed to remove batch effects for class comparison analyses. Nor can batch effects be corrected reliably in future applications of genomics-based clinical tests, in which the biological groups are by definition unknown a priori.

Publication Title

Preserving biological heterogeneity with a permuted surrogate variable analysis for genomics batch correction.

Sample Metadata Fields

Sex, Specimen part, Disease, Disease stage, Race

Accession: GSE47018
Gene Expression Profiling in Polycythemia Vera (PV)
  • Organism: Homo sapiens
  • 27 Downloadable Samples
  • Technology: Affymetrix Human Genome U133A Array (hgu133a)

Description

To define the molecular abnormalities at the stem cell level in polycythemia vera (PV), we examined global gene expression in circulating CD34+ cells from 19 JAK2 V617F-positive PV patients and 6 normal individuals using Affymetrix oligonucleotide microarray technology. We observed that CD34+ cell gene expression not only differed between the PV patients and the normal controls but also between men and women PV patients. Based on these gender-specific differences in gene expression, we were able to identify 102 genes differentially regulated concordantly by both men and women, which likely represent a core set of genes whose dysregulation is involved in the pathogenesis of PV. Gene expression was verified by Q-PCR of patient CD34+ cell RNA. Using the 102 gene set and unsupervised hierarchical clustering, the 19 PV patients could be separated in two groups that differed significantly with respect to hemoglobin level, thrombosis frequency, splenomegaly, splenectomy or chemotherapy exposure, leukemic transformation and overall survival. These results were confirmed using top scoring pairs, which identified a different set of 29 genes that independently segregated the 19 patients into the same two clinical groups: those with an aggressive form of the disease (7 patients), and those with an indolent form (12 patients).

Publication Title

Two clinical phenotypes in polycythemia vera.

Sample Metadata Fields

Sex, Disease

Accession: GSE32975
Gene expression signatures modulated by epidermal growth factor receptor activation and their relationship to cetuximab resistance in head and neck squamous cell carcinoma
  • Organism: Homo sapiens
  • 58 Downloadable Samples
  • Technology: Affymetrix Human Genome U133 Plus 2.0 Array (hgu133plus2)

Description

Aberrant activation of signaling pathways controlled in normal epithelial cells by the epidermal growth factor receptor (EGFR) has been linked to cetuximab (a monoclonal antibody against EGFR) resistance in head and neck squamous cell carcinoma (HNSCC). To infer relevant and specific pathway activation downstream of EGFR from gene expression in HNSCC, we generated gene expression signatures using immortalized keratinocytes (HaCaT) subjected to either ligand stimulation or pharmacological inhibition of the signaling intermediaries PI-3-Kinase and MEK or transfected with EGFR, RELA/p65, or HRASVal12. The gene expression patterns that distinguished the various HaCaT variants and conditions were inferred using the Markov chain Monte Carlo (MCMC) matrix factorization algorithm Coordinated Gene Activity in Pattern Sets (CoGAPS). This approach inferred gene expression signatures with greater relevance to cell signaling pathway activation than the expression signatures inferred with standard linear models. Furthermore, the pathway signature generated using HaCaT-HRASVal12 was associated with the cetuximab treatment response in isogenic cetuximab-sensitive (UMSCC1) and -resistant (1CC8) cell lines. Our data suggest that the CoGAPS algorithm can generate gene expression signatures that are pertinent to downstream effects of receptor signaling pathway activation and may be useful in modeling resistance mechanisms to targeted therapies.

Publication Title

Gene expression signatures modulated by epidermal growth factor receptor activation and their relationship to cetuximab resistance in head and neck squamous cell carcinoma.

Sample Metadata Fields

Cell line, Treatment

Accession: GSE36110
A 3'-UTR KRAS-variant is associated with cisplatin resistance in patients with recurrent and/or metastatic head and neck squamous cell carcinoma.
  • Organism: Homo sapiens
  • 18 Downloadable Samples
  • Technology: Affymetrix Human Genome U133 Plus 2.0 Array (hgu133plus2)

Description

To determine the differential expression of KRAS-variant HNSCC (head and neck squamous cell carcinoma) cell lines.

Publication Title

A 3'-UTR KRAS-variant is associated with cisplatin resistance in patients with recurrent and/or metastatic head and neck squamous cell carcinoma.

Sample Metadata Fields

Specimen part, Cell line

Accession: GSE87650
Integrative Epigenome-Wide Analysis Shows That DNA Methylation May Mediate Genetic Risk In Inflammatory Bowel Disease
  • Organism: Homo sapiens
  • 251 Downloadable Samples
  • Technology: Illumina HumanHT-12 V4.0 expression beadchip

Description

This SuperSeries is composed of the SubSeries listed below.

Publication Title

Integrative epigenome-wide analysis demonstrates that DNA methylation may mediate genetic risk in inflammatory bowel disease.

Sample Metadata Fields

Sex, Age, Specimen part, Subject

Accession: GSE86434
Integrative Epigenome-Wide Analysis Shows That DNA Methylation May Mediate Genetic Risk In Inflammatory Bowel Disease [Expression profiling]
  • Organism: Homo sapiens
  • 251 Downloadable Samples
  • Technology: Illumina HumanHT-12 V4.0 expression beadchip

Description

Epigenetic alterations may provide important insights into gene-environment interaction in inflammatory bowel disease (IBD). Here we observe epigenome-wide DNA methylation differences in 240 newly-diagnosed IBD cases and 190 controls. These include 439 differentially methylated positions (DMPs) and 5 differentially methylated regions (DMRs), which we study in detail using whole genome bisulphite sequencing. We replicate the top DMP (RPS6KA2) and DMRs (VMP1, ITGB2, TXK) in an independent cohort.

Publication Title

Integrative epigenome-wide analysis demonstrates that DNA methylation may mediate genetic risk in inflammatory bowel disease.

Sample Metadata Fields

Sex, Age, Specimen part

Accession: GSE56457
A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control consortium
  • Organism: Homo sapiens
  • 16 Downloadable Samples
  • Technology: Affymetrix Human Gene Expression Array (primeview), Illumina HumanHT-12 V4.0 expression beadchip, Affymetrix Human Gene 2.0 ST Array (hugene20st)

Description

We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the United States Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for sequence discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for these and for qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.

Publication Title

A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium.

Sample Metadata Fields

No sample metadata fields

Accession: GSE37517
Expression data from human induced pluripotent stem cell derived NSCs and striatal-like cells
  • Organism: Homo sapiens
  • 13 Downloadable Samples
  • Technology: Affymetrix Human Gene 1.0 ST Array (hugene10st)

Description

Huntington's disease (HD) is an inherited neurodegenerative disorder caused by an expanded stretch of CAG trinucleotide repeats that results in neuronal dysfunction and death. We made induced pluripotent stem cell (iPSC) lines from HD patients and controls. Though no obvious effects of the CAG expansion on reprogramming or subsequent neural stem cell (NSC) production were seen, HD-NSCs showed CAG expansion-associated gene expression patterns and, upon differentiation, changes in electrophysiology, metabolism, cell adhesion, and ultimately an increased risk of cell death for both medium and longer CAG repeat expansions, with some deficits greater in cells from longer repeat HD NSCs. The HD180 lines were more vulnerable than control lines to cellular stressors and BDNF withdrawal using a range of assays across consortium laboratories. This HD iPSC collection represents a unique and well-characterized resource to elucidate disease mechanisms in HD and provides a novel human stem cell platform for screening new candidate therapeutics.

Publication Title

Induced pluripotent stem cells from patients with Huntington's disease show CAG-repeat-expansion-associated phenotypes.

Sample Metadata Fields

Specimen part, Disease, Disease stage

Accession: SRP095272
Analysis of parent-of-origin bias in gene expression levels
  • Organism: Homo sapiens
  • 325 Downloadable Samples
  • Technology: Illumina HiSeq 2000

Description

In order to study parent-of-origin effects on gene expression, we performed RNAseq analysis (100bp single end reads) of 165 children who formed part of mother/father/child trios where genotype data was available from the HapMap and/or 1000 Genomes Projects. Based on phased genotypes at heterozygous SNP positions, we generated allelic counts for expression of the maternal and paternal alleles in each individual. This analysis reveals significant bias in the expression of the parental alleles for dozens of genes, including both previously known and novel imprinted transcripts. Overall design: This submission contains RNAseq data from 165 children from mother/father/child trios studied as part of the 1000 genomes and/or HapMap projects. We provide raw fastq format reads, and processed read counts per gene. Allelic count information can be provided by directly contacting the authors.

Publication Title

RNA-Seq in 296 phased trios provides a high-resolution map of genomic imprinting.

Sample Metadata Fields

Specimen part, Cell line, Subject

Accession: GSE26111
Whole-genome gene expression profiling of Pik3cg-depleted mice
  • Organism: Mus musculus
  • 12 Downloadable Samples
  • Technology: Illumina MouseWG-6 v2.0 expression beadchip

Description

We performed whole-genome gene expression profiling in Pik3cg-/- mice and subsequent gene ontology clustering of differentially expressed genes compared to wild type mice, in order to investigate the role of Pik3cg in platelet membrane biogenesis and blood coagulation.

Publication Title

Maps of open chromatin guide the functional follow-up of genome-wide association signals: application to hematological traits.

Sample Metadata Fields

Sex, Specimen part

...

refine.bio is a repository of uniformly processed and normalized, ready-to-use transcriptome data from publicly available sources. refine.bio is a project of the Childhood Cancer Data Lab (CCDL).


Developed by the Childhood Cancer Data Lab

Powered by Alex's Lemonade Stand Foundation

Cite refine.bio

Casey S. Greene, Dongbo Hu, Richard W. W. Jones, Stephanie Liu, David S. Mejia, Rob Patro, Stephen R. Piccolo, Ariel Rodriguez Romero, Hirak Sarkar, Candace L. Savonen, Jaclyn N. Taroni, William E. Vauclain, Deepashree Venkatesh Prasad, Kurt G. Wheeler. refine.bio: a resource of uniformly processed publicly available gene expression datasets.
URL: https://www.refine.bio

Note that the contributor list is in alphabetical order as we prepare a manuscript for submission.

BSD 3-Clause License | Privacy | Terms of Use | Contact