refine.bio
97 results

GSE20613
The Sp100 component of ND10/PML bodies is a potent tumor suppressor
  • Homo sapiens
  • 12 Downloadable Samples
  • Illumina human-6 v2.0 expression beadchip

Description

Identifying the functions of proteins that define specific subnuclear structures and territories is important for understanding eukaryotic nuclear dynamics. Sp100 is a prototypical protein of ND10/PML bodies and co-localizes with the proto-oncogenic protein PML and Daxx, proteins with critical roles in oncogenic transformation, interferon-mediated viral resistance and response to PML-directed cancer therapeutics. Sp100 isoforms contain PHD, Bromo and HMG domains and are highly sumoylated at ND10/PML bodies, all characteristics suggestive of a role in chromatin-mediated gene regulation. However, no clear role for the Sp100 component of PML bodies in oncogenesis has been defined. Using isoform-specific knockdown techniques, we show that most human diploid fibroblasts that lack Sp100 rapidly senesce, and we discuss the gene expression changes associated with this rapid senescence.

Publication Title

Sp100 as a potent tumor suppressor: accelerated senescence and rapid malignant transformation of human fibroblasts through modulation of an embryonic stem cell program.

Sample Metadata Fields

Cell line, Treatment

GSE13255
Gene Expression Profiles in Peripheral Blood Mononuclear Cells Can Distinguish Patients with Non-Small Cell Lung Cancer.
  • Homo sapiens
  • 291 Downloadable Samples
  • Illumina human-6 v2.0 expression beadchip

Description

We report a 29-gene diagnostic signature that distinguishes individuals with NSCLC from controls with non-malignant lung disease with 91% sensitivity, 79% specificity and a ROC AUC of 92%. Accuracy on an independent set of 18 NSCLC samples from the same location was 79%. Samples from an independent location, including 12 stage 1 NSCLC and 15 controls, achieved an accuracy of 74%. A study of 18 paired samples taken pre- and post-surgery shows that the PBMC-associated cancer signature is significantly reduced after tumor removal, supporting the hypothesis that the signature detected in pre-surgery samples is a response to the presence of the tumor.

Publication Title

Gene expression profiles in peripheral blood mononuclear cells can distinguish patients with non-small cell lung cancer from patients with nonmalignant lung disease.

Sample Metadata Fields

Sex, Age, Race

GSE6653
Gene expression analysis of IOSE cells treated with TGFb1, a time course study
  • Homo sapiens
  • 8 Downloadable Samples
  • Affymetrix Human Genome U133 Plus 2.0 Array (hgu133plus2)

Description

Unlike ovarian cancer, normal ovarian epithelium responds to TGFb1-induced growth inhibition. This time course study tried to identify genes that showed changes after addition of TGFb1 in immortalized ovarian surface epithelial (IOSE) cells, which are derived from normal ovarian epithelial cells.

Publication Title

An integrative ChIP-chip and gene expression profiling to model SMAD regulatory modules.

Sample Metadata Fields

No sample metadata fields

GSE17933
Transcriptional Biomarkers to Predict Female Mouse Lung Tumors in Rodent Cancer Bioassays - A 26 Chemical Set
  • Mus musculus
  • 191 Downloadable Samples
  • Affymetrix Mouse Genome 430 2.0 Array (mouse4302)

Description

The process for evaluating chemical safety is inefficient, costly, and animal intensive. There is growing consensus that the current process of safety testing needs to be significantly altered to improve efficiency and reduce the number of untested chemicals. In this study, the use of short-term gene expression profiles was evaluated for predicting the increased incidence of mouse lung tumors. Animals were exposed to a total of 26 diverse chemicals with matched vehicle controls over a period of three years. Upon completion, significant batch-related effects were observed. Adjustment for batch effects significantly improved the ability to predict increased lung tumor incidence. For the best statistical model, the estimated predictive accuracy under honest five-fold cross-validation was 79.3% with a sensitivity and specificity of 71.4 and 86.3%, respectively. A learning curve analysis demonstrated that gains in model performance reached a plateau at 25 chemicals, indicating that the size of the current data set was sufficient to provide a robust classifier. The classification results showed a small subset of chemicals contributed disproportionately to the misclassification rate. For these chemicals, the misclassification was more closely associated with genotoxicity status than efficacy in the original bioassay. Statistical models were also used to predict dose-response increases in tumor incidence for methylene chloride and naphthalene. The average posterior probabilities for the top models matched the results from the bioassay for methylene chloride. For naphthalene, the average posterior probabilities for the top models over-predicted the tumor response, but the variability in predictions was significantly higher.
The study provides both a set of gene expression biomarkers for predicting chemically-induced mouse lung tumors as well as a broad assessment of important experimental and analysis criteria for developing microarray-based predictors of safety-related endpoints.

Publication Title

Use of short-term transcriptional profiles to assess the long-term cancer-related safety of environmental and industrial chemicals.

Sample Metadata Fields

Sex, Age, Specimen part, Disease, Subject

GSE16716
MicroArray Quality Control Phase II (MAQC-II) Project
  • Mus musculus, Homo sapiens, Rattus norvegicus
  • 1314 Downloadable Samples
  • Affymetrix Human Genome U133 Plus 2.0 Array (hgu133plus2), Affymetrix Rat Genome 230 2.0 Array (rat2302), Affymetrix Human Genome U133A Array (hgu133a), Affymetrix Mouse Genome 430 2.0 Array (mouse4302)

Description

The MAQC-II Project: A comprehensive study of common practices for the development and validation of microarray-based predictive models

Publication Title

Effect of training-sample size and classification difficulty on the accuracy of genomic predictors.

Sample Metadata Fields

Sex, Age, Specimen part, Race, Compound

GSE24080
MAQC-II Project: Multiple myeloma (MM) data set
  • Homo sapiens
  • 549 Downloadable Samples
  • Affymetrix Human Genome U133A Array (hgu133a), Affymetrix Human Genome U133 Plus 2.0 Array (hgu133plus2)

Description

The multiple myeloma (MM) data set (endpoints F, G, H, and I) was contributed by the Myeloma Institute for Research and Therapy at the University of Arkansas for Medical Sciences (UAMS, Little Rock, AR, USA). Gene expression profiling of highly purified bone marrow plasma cells was performed in newly diagnosed patients with MM. The training set consisted of 340 cases enrolled on total therapy 2 (TT2) and the validation set comprised 214 patients enrolled in total therapy 3 (TT3). Plasma cells were enriched by anti-CD138 immunomagnetic bead selection of mononuclear cell fractions of bone marrow aspirates in a central laboratory. All samples applied to the microarray contained more than 85% plasma cells as determined by 2-color flow cytometry (CD38+ and CD45-/dim) performed after selection. Dichotomized overall survival (OS) and event-free survival (EFS) were determined based on a two-year milestone cutoff. A gene expression model of high-risk multiple myeloma was developed and validated by the data provider and later validated in three additional independent data sets.

Publication Title

Effect of training-sample size and classification difficulty on the accuracy of genomic predictors.

Sample Metadata Fields

Sex, Age

GSE24363
MAQC-II Project: NIEHS data set
  • Rattus norvegicus
  • 410 Downloadable Samples
  • Affymetrix Rat Genome 230 2.0 Array (rat2302), Affymetrix Human Genome U133A Array (hgu133a)

Description

The NIEHS data set (endpoint C) was provided by the National Institute of Environmental Health Sciences (NIEHS) of the National Institutes of Health (Research Triangle Park, NC, USA). The study objective was to use microarray gene expression data acquired from the liver of rats exposed to hepatotoxicants to build classifiers for prediction of liver necrosis. The gene expression compendium data set was collected from 418 rats exposed to one of eight compounds (1,2-dichlorobenzene, 1,4-dichlorobenzene, bromobenzene, monocrotaline, N-nitrosomorpholine, thioacetamide, galactosamine, and diquat dibromide). All eight compounds were studied using standardized procedures, i.e. a common array platform (Affymetrix Rat 230 2.0 microarray), experimental procedures and data retrieving and analysis processes.

Publication Title

Effect of training-sample size and classification difficulty on the accuracy of genomic predictors.

Sample Metadata Fields

Sex, Specimen part, Compound

GSE20194
MAQC-II Project: human breast cancer (BR) data set
  • Homo sapiens
  • 267 Downloadable Samples
  • Affymetrix Human Genome U133A Array (hgu133a)

Description

The human breast cancer (BR) data set (endpoints D and E) was contributed by the University of Texas M. D. Anderson Cancer Center (MDACC, Houston, TX, USA). Gene expression data from 230 stage I-III breast cancers were generated from fine needle aspiration specimens of newly diagnosed breast cancers before any therapy. The biopsy specimens were collected sequentially during a prospective pharmacogenomic marker discovery study between 2000 and 2008. These specimens represent 70-90% pure neoplastic cells with minimal stromal contamination. Patients received 6 months of preoperative (neoadjuvant) chemotherapy including paclitaxel, 5-fluorouracil, cyclophosphamide and doxorubicin followed by surgical resection of the cancer. Response to preoperative chemotherapy was categorized as a pathological complete response (pCR = no residual invasive cancer in the breast or lymph nodes) or residual invasive cancer (RD), and used as endpoint D for prediction. Endpoint E is the clinical estrogen-receptor status as established by immunohistochemistry. RNA extraction and gene expression profiling were performed in multiple batches over time using Affymetrix U133A microarrays. Genomic analysis of a subset of this sequentially accrued patient population was reported previously. For each endpoint, the first 130 cases were used as a training set and the next 100 cases were used as an independent validation set.

Publication Title

Effect of training-sample size and classification difficulty on the accuracy of genomic predictors.

Sample Metadata Fields

Age, Specimen part, Race

GSE24061
MAQC-II Project: Hamner data set
  • Mus musculus
  • 88 Downloadable Samples
  • Affymetrix Mouse Genome 430 2.0 Array (mouse4302)

Description

The Hamner data set (endpoint A) was provided by The Hamner Institutes for Health Sciences (Research Triangle Park, NC, USA). The study objective was to apply microarray gene expression data from the lung of female B6C3F1 mice exposed to a 13-week treatment of chemicals to predict increased lung tumor incidence in the 2-year rodent cancer bioassays of the National Toxicology Program. If successful, the results may form the basis of a more efficient and economical approach for evaluating the carcinogenic activity of chemicals. Microarray analysis was performed using Affymetrix Mouse Genome 430 2.0 arrays on three to four mice per treatment group, and a total of 70 mice were analyzed and used as the MAQC-II's training set (GEO Series GSE6116). Additional data from another set of 88 mice were collected later and provided as the MAQC-II's external validation set (this Series). The training dataset had already been deposited in GEO by its provider and its accession number is GSE6116.

Publication Title

Effect of training-sample size and classification difficulty on the accuracy of genomic predictors.

Sample Metadata Fields

Specimen part, Compound

SRP060372
Foxd3 promotes the exit from naïve pluripotency and prevents germline specification through enhancer decommissioning [RNA-Seq]
  • Mus musculus
  • 12 Downloadable Samples
  • Illumina HiSeq 2500

Description

Following implantation, mouse epiblast cells transit from a naïve to a primed state in which they are competent for both somatic and primordial germ cell (PGC) specification. Using mouse embryonic stem cells (mESC) as an in vitro model to study the transcriptional regulatory principles orchestrating peri-implantation development, here we show that the transcription factor Foxd3 is necessary for the exit from naïve pluripotency and the progression to a primed pluripotent state. During this transition, Foxd3 acts as a repressor that dismantles a significant fraction of the naïve pluripotency expression program through the decommissioning of active enhancers associated with key naïve pluripotency and early germline genes. Subsequently, Foxd3 needs to be silenced in primed pluripotent cells to allow the reactivation of relevant genes required for proper PGC specification. Our findings uncover a wave of activation-deactivation of Foxd3 as a crucial step for the exit from naïve pluripotency and subsequent PGC specification. Overall design: mRNA profiles were generated by RNA-seq in duplicates for each of the following mESC lines: Foxd3fl/fl;Cre-ER mESC maintained in "Serum+LIF" (SL) treated with TM for three days (SL Foxd3-/-); untreated Foxd3fl/fl;Cre-ER SL mESC (SL Foxd3fl/fl); tetON Foxd3 SL mESC treated with Dox for three days; WT SL mESC treated with Dox for three days; Foxd3fl/fl;Cre-ER mESC maintained in "2i+LIF" (2i) treated with TM for three days (2i Foxd3-/-); untreated Foxd3fl/fl;Cre-ER 2i mESC (2i Foxd3fl/fl).

Publication Title

Foxd3 Promotes Exit from Naive Pluripotency through Enhancer Decommissioning and Inhibits Germline Specification.

Sample Metadata Fields

No sample metadata fields


refine.bio is a repository of uniformly processed and normalized, ready-to-use transcriptome data from publicly available sources. refine.bio is a project of the Childhood Cancer Data Lab (CCDL).

Developed by the Childhood Cancer Data Lab

Powered by Alex's Lemonade Stand Foundation

Cite refine.bio

Casey S. Greene, Dongbo Hu, Richard W. W. Jones, Stephanie Liu, David S. Mejia, Rob Patro, Stephen R. Piccolo, Ariel Rodriguez Romero, Hirak Sarkar, Candace L. Savonen, Jaclyn N. Taroni, William E. Vauclain, Deepashree Venkatesh Prasad, Kurt G. Wheeler. refine.bio: a resource of uniformly processed publicly available gene expression datasets.
URL: https://www.refine.bio

Note that the contributor list is in alphabetical order as we prepare a manuscript for submission.