Sage Bionetworks Repository

Data from the studies below are available for download. We plan to provide a data package for each dataset that contains (A) a curated dataset of raw, validated data; (B) a quality-controlled (QC’d) dataset normalized to Sage Bionetworks standards; (C) network models built from the QC’d data; (D) documentation of data processing sufficient to replicate the analyses (usually in the form of code); and (E) documentation of the data source, citations, and any data use restrictions. Not all data packages currently contain all elements. Data icons indicate which types of data are available for download within each dataset: genotypes, phenotypes, intermediate traits (typically expression or proteomic profiles), and networks. Note that, in some cases, individual elements of a GCD may have to be obtained from a different repository (e.g., dbGaP) in order to comply with investigator restrictions or other applicable regulations. Check the “description” file of each dataset for a catalogue of the package contents prior to download.

B cell interactome

Species: Human
Tissue: Cell Line
Disease: Healthy
Investigator: Andrea Califano
Institution: Columbia University
Approximate Number Subjects: 254

Abstract:

The human B cell interactome (HBCI) was assembled with a Bayesian evidence
integration framework that combines a variety of generic and context-specific
experimental clues about protein-protein and protein-DNA interactions, such as
a large collection of B cell expression profiles, with inferences from
different reverse engineering algorithms, such as GeneWays and ARACNE.
Modulatory interactions are predicted by MINDY, an algorithm for the
prediction of modulators of transcriptional interactions (please refer to the
citations below for more information). The HBCI contains 21,156
protein-protein interactions, 41,568 protein-DNA interactions, and 1,925
modulatory interactions.



Citation

A context specific network of protein-DNA and protein-protein interactions reveals new regulatory motifs in human B cells. Lefebvre C, Lim WK, Basso K, Dalla Favera R, and Califano A. Lecture Notes in Bioinformatics (LNCS) 2007, 4532:42-56.

ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A. BMC Bioinformatics. 2006 Mar 20;7 Suppl 1:S7.

Reverse engineering of regulatory networks in human B cells. Basso K, Margolin AA, Stolovitzky G, Klein U, Dalla-Favera R, Califano A. Nat Genet. 2005 Apr;37(4):382-90.

Genome-wide identification of post-translational modulators of transcription factor activity in human B cells. Wang K, Saito M, Bisikirska BC, Alvarez MJ, Lim WK, Rajbhandari P, Shen Q, Nemenman I, Basso K, Margolin AA, Klein U, Dalla-Favera R, Califano A. Nat Biotechnol. 2009 Sep;27(9):829-39.

Dissecting the interface between signaling and transcriptional regulation in human B cells. Wang K, Alvarez MJ, Bisikirska BC, Linding R, Basso K, Dalla Favera R, Califano A. Pac Symp Biocomput. 2009:264-75.

A systems biology approach to prediction of oncogenes and molecular perturbation targets in B-cell lymphomas. Mani KM, Lefebvre C, Wang K, Lim WK, Basso K, Dalla-Favera R, Califano A. Mol Syst Biol. 2008;4:169.

A human B-cell interactome identifies MYB and FOXM1 as master regulators of proliferation in germinal centers. Lefebvre C, Rajbhandari P, Alvarez MJ, Bandaru P, Lim WK, Sato M, Wang K, Sumazin P, Kustagi M, Bisikirska BC, Basso K, Beltrao P, Krogan N, Gautier J, Dalla-Favera R, Califano A. Mol Syst Biol. 2010 Jun 8;6:377.






Bladder Cancer Cohort

Species: Human
Tissue: Bladder
Disease: Cancer
Investigator: Francois Radvanyi
Institution: Institut Curie
Approximate Number Subjects: 62

Abstract:

Genetic and epigenetic alterations have been identified that lead to transcriptional deregulation in cancers. Genetic mechanisms may affect single genes or regions containing several neighboring genes, as has been shown for DNA copy number changes. It was recently reported that epigenetic suppression of gene expression can also extend to a whole region; this is known as long-range epigenetic silencing. Various techniques are available for identifying regional genetic alterations, but no large-scale analysis has yet been carried out to obtain an overview of regional epigenetic alterations. This dataset was used to carry out an exhaustive search for regions susceptible to such mechanisms using a combination of transcriptome correlation map analysis and array CGH data for a series of bladder carcinomas. 
Western-IRB has confirmed that this dataset residing in the Sage Bionetworks Repository is 'exempt' under federal regulation 45 CFR 46.101(b)(4) and does not involve human subject research as defined by OHRP guidelines.

Citation

Regional copy number-independent deregulation of transcription in cancer.
Stransky N, Vallot C, Reyal F, Bernard-Pierrot I, de Medina SG, Segraves R, de Rycke Y, Elvin P, Cassidy A, Spraggon C, Graham A, Southgate J, Asselain B, Allory Y, Abbou CC, Albertson DG, Thiery JP, Chopin DK, Pinkel D, Radvanyi F. Nat Genet. 2006 Dec;38(12):1386-96.

Breast Cancer NKI

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: Stephen H Friend
Institution: Merck Research Laboratories
Approximate Number Subjects: 295

Abstract:

This dataset contains expression profiles and clinical traits derived from 295 breast cancer tumors collected from the tissue bank of the Netherlands Cancer Institute.  This dataset has been used to develop a gene expression signature that is strongly predictive of good vs. poor prognosis in patients with stage I or stage II breast cancer.

Citation

A cell proliferation signature is a marker of extremely poor outcome in a subpopulation of breast cancer patients.
Dai H, van't Veer L, Lamb J, He YD, Mao M, Fine BM, Bernards R, van de Vijver M, Deutsch P, Sachs A, Stoughton R, Friend S. Cancer Res. 2005 May 15;65(10):4059-66.

A gene-expression signature as a predictor of survival in breast cancer.
van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AA, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, Parrish M, Atsma D, Witteveen A, Glas A, Delahaye L, van der Velde T, Bartelink H, Rodenhuis S, Rutgers ET, Friend SH, Bernards R. N Engl J Med. 2002 Dec 19;347(25):1999-2009.

Gene expression profiling predicts clinical outcome of breast cancer.
van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH. Nature. 2002 Jan 31;415(6871):530-6.

Breast Cancer Stanford Norway

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: Pollack
Institution: Stanford University
Approximate Number Subjects: 127

Abstract:

Breast cancer is a leading cause of cancer-death among women, where the clinicopathological features of tumors are used to prognosticate and guide therapy. DNA copy number alterations (CNAs), which occur frequently in breast cancer and define key pathogenetic events, are also potentially useful prognostic or predictive factors. Here, we report a genome-wide array-based comparative genomic hybridization (array CGH) survey of CNAs in 89 breast tumors from a patient cohort with locally advanced disease. Statistical analysis links distinct cytoband loci harboring CNAs to specific clinicopathological parameters, including tumor grade, estrogen receptor status, presence of TP53 mutation, and overall survival. Notably, distinct spectra of CNAs also underlie the different subtypes of breast cancer recently defined by expression-profiling, implying these subtypes develop along distinct genetic pathways. In addition, higher numbers of gains/losses are associated with the "basal-like" tumor subtype, while high-level DNA amplification is more frequent in "luminal-B" subtype tumors, suggesting also that distinct mechanisms of genomic instability might underlie their pathogenesis. The identified CNAs may provide a basis for improved patient prognostication, as well as a starting point to define important genes to further our understanding of the pathobiology of breast cancer.

This dataset contains expression profiles, copy number variation measured by arrayCGH, and tumor traits.

Citation

Distinct patterns of DNA copy number alteration are associated with different clinicopathological features and gene-expression subtypes of breast cancer.
Bergamaschi A, Kim YH, Wang P, Sorlie T, Hernandez-Boussard T, Lonning PE, Tibshirani R, Borresen-Dale AL, Pollack JR. Genes Chromosomes Cancer 2006 45(11):1033-40.

Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors.  
Pollack JR, Sorlie T, Perou CM, Rees CA, Jeffrey SS, Lonning PE, Tibshirani R, Botstein D, Borresen-Dale AL, Brown PO.  PNAS. 2002 99(20): 12963-8.

Repeated observation of breast tumor subtypes in independent gene expression data sets.  Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, Deng S, Johnsen H, Pesich R, Geisler S, Demeter J, Perou CM, Lonning PE, Brown PO, Borresen-Dale AL, Botstein D.  PNAS. 2003 100:8418-23.

Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications.  Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, Thorsen T, Quist H, Matese JC, Brown PO, Botstein D, Lonning PE, Borresen-Dale AL.  PNAS (2001) 98:10869-74.

Geisler S, Borresen-Dale AL, Johnsen H, Aas T, Geisler J, Akslen LA, Anker G, Lonning PE. Clin Cancer Res. (2003) 9:5582-8.

Cancer Cell Line Panel

Species: Human
Tissue: Cell Line
Disease: Cancer
Investigator: Richard Wooster
Institution: GlaxoSmithKline (GSK)
Approximate Number Subjects: 417

Abstract:

GlaxoSmithKline (GSK) has released the genomic profiling data for over 300 cancer cell lines via the National Cancer Institute's cancer Bioinformatics Grid(TM) (caBIG(R)). Cancer cell lines can be manipulated in the laboratory and have been used extensively by GSK in the discovery and development of novel cancer therapeutics. These data are available through caArray.

Citation

This data set was generated and is provided for public use by GlaxoSmithKline.

https://cabig.nci.nih.gov/caArray_GSKdata/

Colorectal Cancer VUmc Amsterdam

Species: Human
Tissue: Colon
Disease: Cancer
Investigator: Beatriz Carvalho and Gerrit Meijer
Institution: VU medisch centrum
Approximate Number Subjects: 73

Abstract:

This study contains arrayCGH traits, expression traits, and tumor traits for 36 colorectal adenocarcinoma tumors and 33 adenoma tumors (malignant polyps) collected prospectively at the VU University Medical Center in Amsterdam, The Netherlands.  This dataset was used to assess oncogenes within the chromosome 20q copy number gain and revealed seven genes to be important in chromosomal instability related to the progression of adenoma to carcinoma.  ArrayCGH was performed using BAC arrays against a reference DNA pool derived from 10 normal individuals.  Expression traits were measured against an RNA pool derived from several cancer cell lines, using a custom spotted oligonucleotide array developed by the VUMC microarray core facility (Human Release 2.0 containing 60-mer oligonucleotide probes to 28,830 unique genes).

Citation

Multiple putative oncogenes at the chromosome 20q amplicon contribute to colorectal adenoma to carcinoma progression.  Carvalho B, Postma C, Mongera S, Hopmans E, Diskin S, van de Wiel MA, van Criekinge W, Thas O, Matthai A, Cuesta MA, Terhaar Sive Droste JS, Craanen M, Schrock E, Ylstra B, Meijer GA. Gut. 2009 Jan;58(1):79-89. 

Embryonic Stem Cells

Species: Human
Tissue: Embryonic Stem Cell
Disease: Healthy
Investigator: Sheng Zhong
Institution: University of Illinois
Approximate Number Subjects: 24

Abstract:

The differentiation of embryonic stem cells is initiated by a gradual loss of
pluripotency-associated transcripts and induction of differentiation genes.
Accordingly, the detection of differentially expressed genes at the early stages of differentiation could assist the identification of the causal genes that either promote or inhibit differentiation. The previous methods of identifying differentially expressed genes by comparing different cell types would inevitably include a large portion of genes that respond to, rather than
regulate, the differentiation process. We demonstrate through the use of
biological replicates and a novel statistical approach that the gene expression data obtained without prior separation of cell types are informative for detecting differentially expressed genes at the early stages of differentiation. Applying the proposed method to analyze the differentiation of murine embryonic stem cells, we identified and then experimentally verified Smarcad1 as a novel regulator of pluripotency and self-renewal. We formalized this statistical approach as a statistical test that is generally applicable to analyze other differentiation processes.


Citation

Dissecting Early Differentially Expressed Genes in a Mixture of Differentiating Embryonic Stem Cells. Feng Hong, Fang Fang, Xuming He, Xiaoyi Cao, Hiram Chipperfield, Dan Xie, Wing H. Wong, Huck H. Ng, Sheng Zhong. PLoS Comput Biol 5(12): e1000607. doi:10.1371/journal.pcbi.1000607

Glioblastoma TCGA

Species: Human
Tissue: Brain
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 413

Abstract:

Gene co-expression network analysis (GCENA) has been widely used to identify
pathways and gene targets for a variety of common human diseases such as cancer,
atherosclerosis, obesity and diabetes. In this project, we constructed a
genome-wide gene co-expression network based on the TCGA glioblastoma multiforme
(GBM) gene expression data. The gene expression data includes 339 glioblastoma
samples and 22,215 genes. For a gene with multiple probes present on the
microarray platform, only the probe with the largest sample variation was
selected, leading to 14,492 unique genes for network construction. The network
analysis uncovered 28 gene modules which are comprised of highly co-regulated
genes. Most of these modules are significantly enriched for genes with
certain ontological categories such as cell cycle, extracellular matrix,
and immune response.
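
The probe-selection step described above (one probe per gene, chosen by largest variation across samples) can be illustrated with a minimal sketch. This is not the code used to build the network; it assumes that "sample variation" means the sample variance of a probe's expression values, and the function name, probe identifiers, and toy values are hypothetical.

```python
from statistics import variance


def select_most_variable_probe(expression, probe_to_gene):
    """Return {gene: probe_id}, keeping for each gene the probe whose
    expression values have the largest sample variance.

    expression    -- dict mapping probe_id to a list of values, one per sample
    probe_to_gene -- dict mapping probe_id to a gene symbol
    Both structures are assumptions made for this illustration.
    """
    best = {}  # gene -> (variance, probe_id) of the best probe seen so far
    for probe_id, values in expression.items():
        gene = probe_to_gene.get(probe_id)
        if gene is None:
            continue  # skip probes that have no gene annotation
        var = variance(values)  # sample variance; needs at least 2 samples
        # Strict ">" means the first probe encountered wins a tie.
        if gene not in best or var > best[gene][0]:
            best[gene] = (var, probe_id)
    return {gene: probe_id for gene, (_, probe_id) in best.items()}


# Toy data: two probes for GENE_A, one probe for GENE_B (hypothetical values).
toy_expression = {
    "probe_1": [1.0, 1.1, 0.9],
    "probe_2": [0.5, 2.0, 3.5],
    "probe_3": [2.0, 2.0, 2.1],
}
toy_annotation = {"probe_1": "GENE_A", "probe_2": "GENE_A", "probe_3": "GENE_B"}

selected = select_most_variable_probe(toy_expression, toy_annotation)
```

In the toy data, probe_2 varies more across samples than probe_1, so it is the one kept for GENE_A.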

Citation

Cancer Genome Atlas Research Network.
Comprehensive genomic characterization defines human glioblastoma genes and core pathways.
Nature. 2008 Oct 23;455(7216):1061-8. 


p53 and Pten control neural and glioma stem/progenitor cell renewal and differentiation. Zheng, H., Ying, H., Yan, H., Kimmelman, A.C., Hiller, D.J., Chen, A.J., Perry, S.R., Tonon, G., Chu, G.C., Ding, Z., et al. (2008) Nature. 455(7216):1129-1133

http://tcga.cancer.gov/

Hepatocellular Carcinoma HongKong

Species: Human
Tissue: Liver
Disease: Cancer
Investigator: John Luk
Institution: Hong Kong University
Approximate Number Subjects: 100

Abstract:

The HKU Hepatocarcinoma study (HKU-HCC) aimed to characterize the process of
tumorigenesis in hepatocellular carcinoma (HCC) using genotyping, gene
expression profiling and clinical endpoints in adjacent normal (AN) and
tumor (TU) samples representing, respectively, the pre-cancer state and the
results of tumor evolution. The HKU-HCC-100 is a subset of 100 matched paired
TU and AN liver tissue samples collected from Asian subjects undergoing surgical
resection for treatment of HCC. These 100 pairs of samples represent a subset of the 250
matched pairs of TU and AN samples that were screened. DNA was isolated from all AN and
TU tissues and genotyped on the Illumina 650Y SNP genotyping array
representing 655,352 tag SNP markers. Copy number aberration markers
(sCNV markers) were then imputed for 32,711 locations in the genome from this
high-density SNP panel. RNA samples were profiled on a custom Affymetrix
microarray comprised of oligonucleotide probes targeting transcripts
representing 37,585 known and predicted genes, including high-confidence
non-coding RNA sequences.

Citation

A general framework for weighted gene co-expression network analysis. Zhang, B. & Horvath, S. Stat Appl Genet Mol Biol 4, Article17 (2005).
	
Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut library for R. Langfelder P, Zhang B, Horvath S (2007) 

Predictive genes in adjacent normal tissue represent rare limiting steps in the process of tumorigenesis in liver cancer. J. Lamb, C. Zhang, T. Xie, K. Wang, B. Zhang, K. Hao, E. Chudin, H. Fraser, J. Millstein, M. Ferguson, C. Suver, I. Ivanovska, M. Scott, U. Philippar, D. Bansal, Z. Zhang, J. Burchard, R. Smith, D. Greenawalt, J. Derry, I. Wang, A. Loboda, J. Watters, R. Poon, C. Yeung, N. Lee, C. Molony, V. Emilsson, C. Buser-Doepner, J. Zhu, S. Friend, M. Mao, P. Shaw, H. Dai, J. Luk, E. Schadt. PLoS One (In Press, 2011)

Heterogeneous Stock Mice

Species: Mouse
Tissue: Hippocampus, Lung, Liver
Disease: Multiple
Investigator: Flint/Mott
Institution: Oxford
Approximate Number Subjects: 2519

Abstract:

This dataset contains data on several hundred heterogeneous stock mice derived from 8 parental lines (A/J, AKR/J, BALB/cJ, CBA/J, C3H/HeJ, C57BL/6J, DBA/2J, and LP/J).  Data include genome-wide DNA variation as well as genome-wide expression profiling of lung (285 mice), liver (273 mice), and hippocampus (468 mice). Expression profiling for this project was one of the first to use the Illumina mouse expression beadchips. Extensive phenotyping for traits related to asthma, type 2 diabetes mellitus, obesity and anxiety was also performed.

Citation

Genome-wide genetic association of complex traits in heterogeneous stock mice. Valdar W, Solberg LC, Gauguier D, Burnett S, Klenerman P, Cookson WO, Taylor MS, Rawlins JN, Mott R, Flint J. Nat Genet. 2006 Aug;38(8):879-87.
 
A resource for the simultaneous high-resolution mapping of multiple quantitative trait loci in rats: the NIH heterogeneous stock. Johannesson M, Lopez-Aumatell R, Stridh P, Diez M, Tuncel J, Blazquez G, Martinez-Membrives E, Canete T, Vicens-Costa E, Graham D, Copley RR, Hernandez-Pliego P, Beyeen AD, Ockinger J, Fernandez-Santamaria C, Gulko PS, Brenner M, Tobena A, Guitart-Masip M, Gimenez-Llort L, Dominiczak A, Holmdahl R, Gauguier D, Olsson T, Mott R, Valdar W, Redei EE, Fernandez-Teruel A, Flint J. Genome Res. 2009 Jan;19(1):150-8.


High resolution mapping of expression QTLs in heterogeneous stock mice in multiple tissues.
Huang GJ, Shifman S, Valdar W, Johannesson M, Yalcin B, Taylor MS, Taylor JM, Mott R, Flint J.
Genome Res. 2009 Jun;19(6):1133-40. 

http://mus.well.ox.ac.uk/mouse/HS/

Human Liver Cohort

Species: Human
Tissue: Liver
Disease: CVD
Investigator: Fred Guengerich/ Steve Strom/ Erin Schuetz/ Merck & Co.
Institution: Vanderbilt University/ University of Pittsburgh/ St. Jude Children's Research Hospital/ Merck & Co.
Approximate Number Subjects: 517

Abstract:

The Human Liver Cohort (HLC) study aimed to characterize the genetic
architecture of gene expression in human liver using genotyping, gene expression
profiling, and enzyme activity measurements of cytochrome P450. The HLC was
assembled from a total of 780 liver samples screened.  These liver samples
were acquired from Caucasian individuals from three independent tissue
collection centers.   DNA samples were genotyped on the Affymetrix 500K SNP
and Illumina 650Y SNP genotyping arrays representing a total of 782,476 unique
single nucleotide polymorphisms (SNPs). Only the genotype data from those
samples which were collected postmortem are accessible in dbGaP.  These 228
samples represent a subset of the 427 samples included in the Human Liver
Cohort Publication (Schadt, Molony et al. 2008).  RNA samples were profiled on
a custom Agilent 44,000 feature microarray composed of 39,280 oligonucleotide
probes targeting transcripts representing 34,266 known and predicted genes,
including high-confidence, noncoding RNA sequences. Each of the liver samples
was processed into cytosol and microsomes using a standard differential
centrifugation method. The activities of nine P450 enzymes (CYP1A2, 2A6, 2B6,
2C8, 2C9, 2C19, 2D6, 2E1, and 3A4) were measured in microsomes isolated from
398 HLC liver samples using probe substrate metabolism assays, with activities
expressed as nmol/min/mg protein.  Each was measured with a
single substrate except for the CYP3A4 activity that was measured using two
substrates, midazolam and testosterone.

Citation

Mapping the genetic architecture of gene expression in human liver. Eric E. Schadt, Cliona Molony, Eugene Chudin, Ke Hao, Xia Yang, Pek Y. Lum, Andrew Kasarskis, Bin Zhang, Susanna Wang, Christine Suver, Jun Zhu, Joshua Millstein, Solveig Sieberts, John Lamb, Debraj GuhaThakurta, Jonathan Derry, John D. Storey, Iliana Avila-Campillo, Mark J. Kruger, Jason M. Johnson, Carol A. Rohl, Atila van Nas, Margarete Mehrabian, Thomas A. Drake, Aldons J. Lusis, Ryan C. Smith, F. Peter Guengerich, Stephen C. Strom, Erin Schuetz, Thomas H. Rushmore, Roger Ulrich. PLoS Biol, 2008. 6(5): p. e107. PMID: 18462017
	
	
Systematic Genetic and Genomic Analysis of Cytochrome P450 Enzyme Activities in Human Liver. Xia Yang, Bin Zhang, Cliona Molony, Eugene Chudin, Ke Hao, Jun Zhu, Christine Suver, Hua Zhong, F. Peter Guengerich, Stephen C. Strom, Erin Schuetz, Thomas H. Rushmore, Roger G. Ulrich, J. Greg Slatter, Eric E. Schadt, Andrew Kasarskis, Pek Yee Lum. Genome Res. 2010 Aug;20(8):1020-36.


LFN-Kronos-PHASE I

Species: Human
Tissue: Brain Parietal Cortex, Brain Temporal Cortex, Brain Cerebellum, Brain Frontal Cortex
Disease: Alzheimer's
Investigator: Amanda Myers
Institution: University of Miami
Approximate Number Subjects: 364

Abstract:

This dataset surveys the relationship between the human brain transcriptome and genome within a series of postmortem samples comprising 176 cases with a confirmed pathologic diagnosis of late-onset Alzheimer disease and 188 neuropathologically normal controls. Samples were collected from a series of brain banks. This dataset contains whole-genome DNA variation (SNP) traits, whole-genome expression profiling, and several clinical phenotypes.

Citation

Genetic control of human brain transcript expression in Alzheimer disease.  Webster JA, Gibbs JR, Clarke J, Ray M, Zhang W, Holmans P, Rohrer K, Zhao A, Marlowe L, Kaleem M, McCorquodale DS 3rd, Cuello C, Leung D, Bryden L, Nath P, Zismann VL, Joshipura K, Huentelman MJ, Hu-Lince D, Coon KD, Craig DW, Pearson JV; NACC-Neuropathology Group, Heward CB, Reiman EM, Stephan D, Hardy J, Myers AJ. Am J Hum Genet. 2009 Apr;84(4):445-58. PMID: 19361613

METABRIC Breast Cancer

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: Sam Aparicio/ Carlos Caldas
Institution: University of British Columbia/ Cambridge Research Institute
Approximate Number Subjects: 1500

Abstract:

Data for 1600 breast cancers (all subtypes) generated jointly by
Drs. Sam Aparicio at BCCA and Carlos Caldas at CRC Cambridge UK as part of the
METABRIC project. Coexpression and Bayesian networks are available that were derived from DNA variation (Affymetrix SNP6.0),
gene expression (Illumina Infinium II Bead arrays) and clinical outcome traits (5 year minimum outcomes information). The data will become available in the future.

Citation

A metadata approach for clinical data management in translational genomics studies in breast cancer. Papatheodorou I, Crichton C, Morris L, Maccallum P; METABRIC Group, Davies J, Brenton JD, Caldas C.
BMC Med Genomics. 2009;2:66

Mouse Model of Blood Pressure

Species: Mouse
Tissue: Kidney medulla, Kidney cortex, Liver, Adipose
Disease: CVD
Investigator: Merck & Co.
Institution: Merck & Co.
Approximate Number Subjects: 412

Abstract:

An F2 population was derived from a C57BL/6J x A/J cross (B6AF2) and traits were measured in 340 male and female progeny. Mice were placed on a high-fat high-salt balanced diet at week 7 and maintained on this chow until termination at week 16. Five principal phenotyping components were used: blood pressure and heart rate by tail cuff at week 10; echocardiography at week 10; energy utilization by Oxymax at week 12; oral glucose tolerance test (OGTT) at week 13; intra-peritoneal insulin sensitivity test (IPIST) at week 14; and body composition by Dexascan at week 15. In addition, a number of endpoints relevant to size and adiposity, and serum for blood analytes including lipids, were collected at final necropsy. Whole genome expression data were generated for all F2 progeny for four different tissues (gonadal adipose, liver, kidney cortex, and kidney medulla) using a custom Agilent mouse array containing ~40,000 unique reporter sequences. Genotypes were collected using the Affymetrix Mouse Mapping 5K panel on F2 individuals. The final dataset consists of gene expression (4 tissues), phenotype data (106 traits), and genotype data (4643 markers) for 340 F2 progeny as well as for 24 each of the parentals (C57BL/6J and A/J) and F1 progeny.

Citation

Identification of genes and networks driving cardiovascular and metabolic phenotypes in a mouse F2 intercross.  Derry JM, Zhong H, Molony C, MacNeil D, Guhathakurta D, Zhang B, Mudgett J, Small K, El Fertak L, Guimond A, Selloum M, Zhao W, Champy MF, Monassier L, Vogt T, Cully D, Kasarskis A, Schadt EE.  PLoS One. 2010 Dec 14;5(12):e14319. PubMed PMID: 21179467; PubMed Central PMCID: PMC3001864.


Mouse Model of Diet-Induced Atherosclerosis

Species: Mouse
Tissue: Liver
Disease: Metabolic Disease
Investigator: Jake Lusis/ Merck & Co.
Institution: UCLA/ Merck & Co.
Approximate Number Subjects: 111

Abstract:

111 female F2 progeny of a C57BL/6J and DBA/2J intercross were examined for
multiple measures of femoral bone mass, density, and biomechanical properties
using both computerized tomographic and radiographic methods. In addition, body
weight and length, adipose tissue mass, plasma lipids and insulin, and aortic
fatty lesions were assessed. Mice were on a rodent chow diet up to 12 months
of age, and then switched to an atherogenic high-fat, high-cholesterol diet for
another 4 months. Mice were killed at 16 months of age.  Liver tissue was
profiled for expression traits using a custom mouse gene oligonucleotide
microarray (Rosetta Inpharmatics) that contained 23,574 non-control
oligonucleotide probes for mouse genes and 2,186 control probes.
All microarrays were custom ink-jet microarrays fabricated by Agilent
Technologies. A complete linkage map for all chromosomes except Y in mouse
was constructed at an average density of 13 cM using microsatellite markers.

Citation

Genetic loci determining bone density in mice with diet-induced atherosclerosis.  Drake TA, Schadt E, Hannani K, Kabo JM, Krass K, Colinayo V, Greaser LE 3rd, Goldin J, Lusis AJ. Physiol Genomics. 2001 Apr 27;5(4):205-15. PMID: 11328966


Genetics of gene expression surveyed in maize, mouse and man. Schadt EE, Monks SA, Drake TA, Lusis AJ, Che N, Colinayo V, Ruff TG, Milligan SB, Lamb JR, Cavet G, Linsley PS, Mao M, Stoughton RB, Friend SH.
Nature. 2003 Mar 20;422(6929):297-302. PMID: 12646919

Mouse Model of Diet-Induced Breast Cancer

Species: Mouse
Tissue: Breast
Disease: Cancer
Investigator: Gordon/Pomp
Institution: UNC
Approximate Number Subjects: 134

Abstract:

Breast cancer is a complex disease resulting from a combination of genetic and environmental factors. Among environmental factors, body composition and intake of specific dietary components like total fat are associated with increased incidence of breast cancer and metastasis. This dataset was used to demonstrate that mice fed a high-fat diet have shorter mammary cancer latency, increased tumor growth and more pulmonary metastases than mice fed a standard diet. Subsequent genetic analysis identified several modifiers of metastatic mammary cancer along with widespread interactions between cancer modifiers and dietary fat. To elucidate diet-dependent genetic modifiers of mammary cancer and metastasis risk, global gene expression profiles and copy number alterations from mammary cancers were measured and expression quantitative trait loci (eQTL) identified. Functional candidate genes that colocalized with previously detected metastasis modifiers were identified. Additional analyses, such as eQTL by dietary fat interaction analysis, causality and database evaluations, helped to further refine the candidate loci to produce an enriched list of genes potentially involved in the pathogenesis of metastatic mammary cancer. 

Citation

Dietary fat-dependent transcriptional architecture and copy number alterations associated with modifiers of mammary cancer metastasis. Ryan R. Gordon, Michele La Merrill, Kent W. Hunter, Peter Sørensen, David W. Threadgill and Daniel Pomp. Clinical and Experimental Metastasis; 27 (5), 279-293

Genotype X diet interactions in mice predisposed to mammary cancer: II. Tumors and metastasis.
Gordon RR, Hunter KW, La Merrill M, Sørensen P, Threadgill DW, Pomp D. Mamm Genome. 2008 Mar;19(3):179-89. Epub 2008 Feb 21.

Genotype X diet interactions in mice predisposed to mammary cancer. I. Body weight and fat.
Gordon RR, Hunter KW, Sørensen P, Pomp D. Mamm Genome. 2008 Mar;19(3):163-78. Epub 2008 Feb 20.

Mouse Model of Sexually Dimorphic Atherosclerotic Traits

Species: Mouse
Tissue: Adipose,Liver,Brain,Muscle
Disease: CVD
Investigator: Jake Lusis/ Merck & Co.
Institution: UCLA/ Merck & Co.
Approximate Number Subjects: 334

Abstract:

C57BL/6J and C3H/HeJ inbred mouse strains exhibit dramatically different
cardiovascular and metabolic phenotypes on the hyperlipidemic apolipoprotein
E (Apoe) null background.  In order to identify the genes that contribute to
these differences, we constructed an F2 intercross between the B6.Apoe-/- and
C3H.Apoe-/- strains consisting of 334 animals.  The mice were fed on a chow
diet until 8 weeks of age, then fed a high fat (42% fat) "western" diet for 16
weeks to exacerbate the phenotypes and euthanized at 24 weeks of age via
cervical dislocation.  Prior to death, mice were fasted for 4 hours in the
morning, anesthetized using Isoflurane, and weighed.   Blood was collected by
retro-orbital bleed; plasma was frozen at -80°C.  We measured plasma
cholesterol, HDL, LDL, triglycerides, free fatty acids, glucose, insulin,
leptin, adiponectin and PON1 activity levels.  Liver, brain, skeletal muscle
(hamstring) and adipose (gonadal fat pad) were flash frozen in liquid nitrogen.
RNA was isolated from the tissues using the Trizol method and utilized in
microarray analysis on a custom 60mer Agilent chip (reference for chip would
be useful).  Hepatic cholesterol, triglyceride and free fatty acid levels were
also measured.  Hearts and aortae were extracted, perfused and fixed for
atherosclerotic lesion analysis.  The aortic arch was serially sectioned
through to the aortic sinus with every fifth 10 µm section stained with
hematoxylin and Oil Red O, which specifically stains lipids.  Slides were
examined by light microscopy.  The fatty streak lesion area was quantified
using an ocular with a grid; forty sections per mouse were quantified and
averaged.  Vascular calcification and aneurysm formation were also measured in
a semi-quantitative manner based on presence or absence and size or severity.
DNA was isolated from kidney using a phenol chloroform extraction method.  The
mice were genotyped at 1500 SNPs using the ParAllele molecular inversion probe
technology; 1353 SNPs passed quality control for a final marker density of 1.5 cM.

Citation

Integrating genetic and network analysis to characterize genes related to mouse weight. Ghazalpour, A., et al., PLoS Genet, 2006. 2(8): p. e130.

Dosage compensation is less effective in birds than in mammals. Itoh, Y., et al., J Biol, 2007. 6(1): p. 2.

Elucidating the murine brain transcriptional network in a segregating mouse population to identify core functional modules for obesity and diabetes.  Lum, P.Y., et al., J Neurochem, 2006. 97 Suppl 1: p. 50-62.

Identification of Abcc6 as the major causal gene for dystrophic cardiac calcification in mice through integrative genomics. Meng, H., et al. Proc Natl Acad Sci U S A, 2007. 104(11): p. 4530-5.

Mapping the genetic architecture of gene expression in human liver. Schadt, E.E., et al.,  PLoS Biol, 2008. 6(5): p. e107.

Elucidating the role of gonadal hormones in sexually dimorphic gene coexpression networks.  van Nas, A., et al., Endocrinology, 2009. 150(3): p. 1235-49.

Identification of pathways for atherosclerosis in mice: integration of quantitative trait locus analysis and global gene expression data. Wang, S.S., et al., Circ Res, 2007. 101(3): p. e11-30.

Tissue-specific expression and regulation of sexually dimorphic genes in mice. Yang, X., et al., Genome Res, 2006. 16(8): p. 995-1004.

MSKCC Prostate Cancer

Species: Human
Tissue: Prostate
Disease: Cancer
Investigator: Charles Sawyers
Institution: Memorial Sloan Kettering Cancer Center
Approximate Number Subjects: 261

Abstract:

Annotation of prostate cancer genomes provides a foundation for discoveries that can impact disease understanding and treatment. Concordant assessment of DNA copy number, mRNA expression, and focused exon resequencing in the 218 prostate cancer tumors represented in this dataset has identified the nuclear receptor coactivator NCOA2 as an oncogene in approximately 11% of tumors. Additionally, the androgen-driven TMPRSS2-ERG fusion was associated with a previously unrecognized, prostate-specific deletion at chromosome 3p14 that implicates FOXP1, RYBP, and SHQ1 as potential cooperative tumor suppressors. DNA copy-number data from primary tumors revealed that copy-number alterations robustly define clusters of low- and high-risk disease beyond that achieved by Gleason score.

Citation

Integrative genomic profiling of human prostate cancer.
Taylor BS, Schultz N, Hieronymus H, Gopalan A, Xiao Y, Carver BS, Arora VK, Kaushik P, Cerami E, Reva B, Antipin Y, Mitsiades N, Landers T, Dolgalev I, Major JE, Wilson M, Socci ND, Lash AE, Heguy A, Eastham JA, Scher HI, Reuter VE, Scardino PT, Sander C, Sawyers CL, Gerald WL.
Cancer Cell. 2010 Jul 13;18(1):11-22. 

Sanger Cell Line Project

Species: Human
Tissue: Cell Line
Disease: Cancer
Investigator: Stratton/Futreal/Todd Golub
Institution: Sanger/BROAD
Approximate Number Subjects: 748

Abstract:

The cancer genome is moulded by the dual processes of somatic mutation and selection. Homozygous deletions in cancer genomes occur over recessive cancer genes, where they can confer selective growth advantage, and over fragile sites, where they are thought to reflect an increased local rate of DNA breakage. However, most homozygous deletions in cancer genomes are unexplained. Here we identified 2,428 somatic homozygous deletions in 746 cancer cell lines. These overlie 11% of protein-coding genes that, therefore, are not mandatory for survival of human cells. We derived structural signatures that distinguish between homozygous deletions over recessive cancer genes and fragile sites. Application to clusters of unexplained homozygous deletions suggests that many are in regions of inherent fragility, whereas a small subset overlies recessive cancer genes. The results illustrate how structural signatures can be used to distinguish between the influences of mutation and selection in cancer genomes. The extensive copy number, genotyping, sequence and expression data available for this large series of publicly available cancer cell lines renders them informative reagents for future studies of cancer biology and drug discovery.

Citation

Signatures of mutation and selection in the cancer genome. Bignell GR, Greenman CD, Davies H, Butler AP, Edkins S, Andrews JM, Buck G, Chen L, Beare D, Latimer C, Widaa S, Hinton J, Fahey C, Fu B, Swamy S, Dalgliesh GL, Teh BT, Deloukas P, Yang F, Campbell PJ, Futreal PA, Stratton MR. Nature. 2010 Feb 18;463(7283):893-8.

TCGA curation package

Species: Human
Tissue: Multiple
Disease: Cancer
Investigator: Guinney/Henderson
Institution: Sage Bionetworks
Approximate Number Subjects: 500

Abstract:

This package contains code that takes data from the TCGA project (Level 2 CNV data and Levels 2 and 3 expression data) and reformats it into a series of tab-delimited text files suitable for analysis in many statistical software packages.
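
As a rough illustration of how such tab-delimited output might be consumed, the sketch below parses a small table with Python's standard `csv` module. The column names (`sample_id`, `gene`, `expression`) and the sample values are hypothetical and are not taken from the package; check the package's description file for the actual layout.

```python
import csv
import io


def read_tab_delimited(handle):
    """Parse a tab-delimited table with a header row into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [dict(row) for row in reader]


# Hypothetical sample content; real file and column names in the package may differ.
sample = "sample_id\tgene\texpression\nS1\tTP53\t7.25\nS2\tTP53\t6.80\n"

rows = read_tab_delimited(io.StringIO(sample))

# Values are read as strings, so convert before doing arithmetic.
values = [float(r["expression"]) for r in rows]
mean_expression = sum(values) / len(values)
print(rows[0]["sample_id"], mean_expression)
```

To read a file on disk instead, pass an open file handle (for example `open(path, newline="")`) in place of the `io.StringIO` object.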

Citation

This script was developed by Justin Guinney and David Henderson at Sage Bionetworks.

Yeast Genetic Interactions

Species: Yeast
Tissue: Yeast
Disease: Healthy
Investigator: Jonathan Weissman/Jack Greenblatt
Institution: UCSF/University of Toronto
Approximate Number Subjects: 743

Abstract:

Here we present an epistatic miniarray profile (E-MAP) consisting of
quantitative pairwise measurements of the genetic interactions between 743
Saccharomyces cerevisiae genes involved in various aspects of chromosome
biology (including DNA replication/repair, chromatid segregation and
transcriptional regulation).

Citation

Functional dissection of protein complexes involved in yeast chromosome biology using a genetic interaction map. Collins SR, Miller KM, Maas NL, Roguev A, Fillingham J, Chu CS, Schuldiner M, Gebbia M, Recht J, Shales M, Ding H, Xu H, Han J, Ingvarsdottir K, Cheng B, Andrews B, Boone C, Berger SL, Hieter P, Zhang Z, Brown GW, Ingles CJ, Emili A, Allis CD, Toczyski DP, Weissman JS, Greenblatt JF, Krogan NJ. Nature. 2007 Apr 12;446(7137):806-10.

Pending datasets have been authorized for inclusion in the Sage Bionetworks Repository and are currently being processed for release. In some cases, these datasets will be made available following an embargo period imposed by the data contributor. Interested researchers should check back periodically for availability of these datasets. Let us know what dataset you are most interested in by emailing repdata@sagebase.org.

Breast Cancer MD Anderson

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: Fabrice Andre and Vladimir Lazar
Institution: Institut Gustave Roussy
Approximate Number Subjects: 103

Harvard Brain Tissue Resource Center

Species: Human
Tissue: Brain prefrontal cortex,Brain visual cortex,Brain cerebellum
Disease: Neurological Disease
Investigator: Francine Benes/ Eric Schadt
Institution: Harvard Brain Tissue Resource Center/ Merck Research Laboratories
Approximate Number Subjects: 803

Abstract:

This study aims to identify functional variation in the human genome (especially as it pertains to brain-expressed RNAs) and to elucidate its relationship to disease and drug response. The ~800 individuals in this dataset comprise approximately 400 Alzheimer's disease (AD) cases, 230 Huntington's disease (HD) cases, and 170 controls matched for age, gender, and post mortem interval (PMI). The tissue specimens for this study were provided by the Harvard Brain Tissue Resource Center (HBTRC).
Three brain regions (cerebellum, visual cortex, and dorsolateral prefrontal cortex) from the same individuals were profiled on a custom-made Agilent 44K microarray of 39,280 DNA probes uniquely targeting 37,585 known and predicted genes, including splice variants, miRNAs and high-confidence non-coding RNA sequences. The individuals were genotyped on two different platforms, the Illumina HumanHap650Y array and a custom Perlegen 300K array (a focused panel for detection of singleton SNPs). Clinical outcomes available include age at onset, age at death, Braak scores (AD), Vonsattel scores (HD), and regional brain enlargement/atrophy.

Data from this study were generated by Merck on samples collected by the Harvard Brain Tissue Resource Center under an agreement (enclosed) that supports public release of scientific results emanating from these tissues.
Western-IRB has confirmed that this dataset residing in the Sage Bionetworks Repository is 'exempt' under federal regulation 45 CFR 46.101(b)(4) and does not involve human subject research as defined by OHRP guidelines.

Citation

Not yet published.

Mouse Model of Dietary Regulation of Metabolic Disease

Species: Mouse
Tissue: Liver,Hypothalamus,Muscle,Adipose
Disease: Metabolic Disease
Investigator: Merck & Co.
Institution: Merck & Co.
Approximate Number Subjects: 1650

Abstract:

This backcross cohort consists of 4 separate populations of C57BL/6JxDBA/2 (n=2), C57BL/6Jx129SvImJ F2 and C57BL/6JxA/J F2s numbering 300-500 animals each. Three of the populations, comprising the 3 different cross backgrounds, underwent a 20-week protocol prior to sacrifice including 12 weeks on a high-fat diet (HFD), while the second C57BL/6JxDBA/2 cohort underwent the identical protocol for the first 20 weeks of life but was then aged out to week 62 before sacrifice, with repetition of the 12 weeks of HFD for the last period of life. Phenotypes collected in these mice include weight, body composition and measures of glucose metabolism. Gene expression profiles have been generated from all animals for adipose, liver, hypothalamus and muscle.

Citation

This study has not yet been published.

Mouse Model of Hypertension

Species: Mouse
Tissue: Aorta,Kidney,Adrenal,Hypothalamus
Disease: CVD
Investigator: Merck & Co.
Institution: Merck & Co.
Approximate Number Subjects: 574

Abstract:

This dataset contains SNP, expression (aorta, kidney, adrenal, hypothalamus tissues), and phenotypic data on a population of 574 B6xBPH "hypertension" F2 cross mice. Phenotypes include those related to hypertension, including blood pressure.

Citation

This study has not yet been published.

Mouse Model of Metabolic Disease

Species: Mouse
Tissue: Kidney,Muscle,Islet,Liver,Adipose,Hypothalamus
Disease: Metabolic Disease
Investigator: Alan Attie/ Merck & Co.
Institution: University of Wisconsin/ Merck & Co.
Approximate Number Subjects: 500

Abstract:

The C57BL/6J (B6) mice, when made genetically obese (B6-ob/ob), develop only transient mild hyperglycemia. In striking contrast, when BTBR T+ tf/J (BTBR) mice are made genetically obese (BTBR-ob/ob), the mice develop severe diabetes. At 6 weeks of age, B6-ob/ob mice have an average fasting plasma insulin level of ~10 ng/ml vs. 17 ng/ml in BTBR-ob/ob mice (normal insulin in a lean mouse: <2.0 ng/ml). B6-ob/ob mice showed slightly increased fasting glucose at 8 weeks of age, but returned to a plateau by 10 weeks. The B6-ob/ob mice maintain the normal glucose level by progressively increasing insulin to 30 ng/ml, from 6 to 14 weeks of age. On the other hand, the BTBR-ob/ob mice fail to secrete sufficient insulin. Consequently, their fasting glucose rises to >400 mg/dl by 10 weeks of age. Preliminary data show insufficient β-cell mass in BTBR-ob/ob mice, which leads to the hypothesis that the BTBR-ob/ob mice are diabetic because they fail to maintain sufficient β-cell mass to compensate for insulin resistance. The β-cells in the islets of Langerhans make up the most important tissue in the development of type-2 diabetes and therefore make a logical choice for new diabetes targets. This BTBR X C57BL6 ob/ob F2 cross study was designed to simultaneously study tissues involved in insulin resistance (liver, muscle, adipose) and insulin secretion (islets) and derive the gene networks that underlie these phenotypes.

Citation

Not yet published.

Mouse Model of Obesity

Species: Mouse
Tissue: Liver,Adipose,Hypothalamus
Disease: Metabolic Disease
Investigator: Daniel Pomp
Institution: University of Nebraska-Lincoln
Approximate Number Subjects: 308

Abstract:

This dataset was designed to provide an inter-tissue view of obesity. An F2 population was established by intercrossing the M16 and ICR mouse lines. Mice were typed for expression traits in hypothalamus (n=308), liver (n=302), and adipose (n=308) using the Rosetta/Merck Mouse TOE 75k Array 1. Phenotypes are related to obesity and include body weight, food intake, glucose and other standard blood markers, cytokines, DEXA scans of total subcranial tissue and fat mass, and tissue weights. Genotypes are also available.

Citation

Multi-tissue coexpression networks reveal unexpected subnetworks associated with disease. Dobrin R, Zhu J, Molony C, Argman C, Parrish ML, Carlson S, Allan MF, Pomp D, Schadt EE. Genome Biol. 2009;10(5):R55.

Mouse Model of Sarcopenia

Species: Mouse
Tissue: Muscle
Disease: Sarcopenia
Investigator: Arimantas Lionikas
Institution: Pennsylvania State University
Approximate Number Subjects: 811

Abstract:

This dataset was developed by the Center for Developmental and Health Genetics at Penn State University (PSU) and contains SNP traits, muscle expression traits, and phenotypes from 811 B6DF2 intercrossed mice.

Citation

This study has not yet been published.

Mouse Model of Sleep Traits

Species: Mouse
Tissue: Hypothalamus,Thalamus,Brain cortex,Hippocampus
Disease: Healthy
Investigator: Fred Turek/ Merck & Co.
Institution: Northwestern University/ Merck & Co.
Approximate Number Subjects: 220

Abstract:

An F2 cross between C57BL/6J and 129SvJ mice. EEG- and EMG-characterized sleep states were assigned for a total of 50 hrs that included 24 baseline, 6 sleep deprivation, and 20 recovery. The objective of the study was to relate genes that control sleep patterns to psychiatric and sleep deprivation traits.

Citation

Not yet published.

Ovarian Adenocarcinoma TCGA

Species: Human
Tissue: Ovary
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 387

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 ovarian serous cystadenocarcinoma tumor samples. For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

Common variants at 19p13 are associated with susceptibility to ovarian cancer. Bolton, K.L., Tyrer, J., Song, H., Ramus, S.J., Notaridou, M., Jones, C., Sher, T., Gentry-Maharaj, A., Wozniak, E., Tsai, Y.Y., et al. (2010) Nat Genet. 42(10):880-884. 

http://tcga.cancer.gov/index.asp

Pancreatic Cancer

Species: Human
Tissue: Pancreas
Disease: Cancer
Investigator: Bert Vogelstein
Institution: Johns Hopkins Medicine
Approximate Number Subjects: 24

Abstract:

Comprehensive data on DNA and mRNA (copy number, expression, sequence) from 24 pancreatic cancer samples including sequence on >20,000 genes. Clinical outcomes are available.

Citation

Not yet published.

Pediatric AML TARGET

Species: Human
Tissue: Blood
Disease: Cancer
Investigator: Soheil Meshinchi/ Bob Arceci
Institution: Fred Hutchinson Cancer Research Center /SKCCC at Johns Hopkins
Approximate Number Subjects: 200

Abstract:

http://target.cancer.gov/dataportal/about/

Citation

http://target.cancer.gov/

This list describes datasets that are known to exist outside of the Sage Bionetworks Repository including some datasets that will be generated from samples currently being collected through ongoing clinical trials. It is Sage Bionetworks policy to periodically seek authorization to incorporate these datasets into the Sage Bionetworks Repository. Let us know if you would like to provide access to a dataset listed in this section or to make us aware of additional dataset(s): repdata@sagebase.org.

AML Nonresponders Cohort

Species: Human
Tissue: Blood
Disease: Cancer
Investigator: Stephen Friend
Institution: Sage Bionetworks
Approximate Number Subjects: 300

Abstract:

Clinical outcomes with responder/non-responder status for standard of care therapies. Partial retrospective, partial prospective. Patient-driven. Minimum full exome sequencing.

Citation

This study is not yet published.

AML TCGA

Species: Human
Tissue: Blood
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 500

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 AML tumor samples.  For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

This study is not yet published. For more information, go to: http://tcga.cancer.gov/

Asthma Human Lymphoblastoid Cell Lines

Species: Human
Tissue: Lymphoblastoid Cell Line
Disease: Asthma
Investigator: Cookson
Institution: Imperial College London
Approximate Number Subjects: 404

Abstract:

We have created a global map of the effects of polymorphism on gene expression in 400 children from families recruited through a proband with asthma. We genotyped 408,273 SNPs and identified expression quantitative trait loci from measurements of 54,675 transcripts representing 20,599 genes in Epstein-Barr virus-transformed lymphoblastoid cell lines. We found that 15,084 transcripts (28%) representing 6,660 genes had narrow-sense heritabilities (H²) > 0.3. We executed genome-wide association scans for these traits and found peak lod scores between 3.68 and 59.1. The most highly heritable traits were markedly enriched in Gene Ontology descriptors for response to unfolded protein (chaperonins and heat shock proteins), regulation of progression through the cell cycle, RNA processing, DNA repair, immune responses and apoptosis. SNPs that regulate expression of these genes are candidates in the study of degenerative diseases, malignancy, infection and inflammation. We have created a downloadable database to facilitate use of our findings in the mapping of complex disease loci.

Citation

A genome-wide association study of global gene expression. Dixon AL, Liang L, Moffatt MF, Chen W, Heath S, Wong KC, Taylor J, Burnett E, Gut I, Farrall M, Lathrop GM, Abecasis GR, Cookson WO. Nat Genet. 2007 Oct;39(10):1202-7.

Brain Regions NIA

Species: Human
Tissue: Brain
Disease: Healthy
Investigator: Cookson/Singleton
Institution: NIA
Approximate Number Subjects: 150

Abstract:

A fundamental challenge in the post-genome era is to understand and annotate the consequences of genetic variation, particularly within the context of human tissues. We present a set of integrated experiments that investigate the effects of common genetic variability on DNA methylation and mRNA expression in four human brain regions each from 150 individuals (600 samples total). We find an abundance of genetic cis regulation of mRNA expression and show for the first time abundant quantitative trait loci for DNA CpG methylation across the genome. We show peak enrichment for cis expression QTLs to be approximately 68,000 bp away from individual transcription start sites; however, the peak enrichment for cis CpG methylation QTLs is located much closer, only 45 bp from the CpG site in question. We observe that the largest magnitude quantitative trait loci occur across distinct brain tissues. Our analyses reveal that CpG methylation quantitative trait loci are more likely to occur for CpG sites outside of islands. Lastly, we show that while we can observe individual QTLs that appear to affect both the level of a transcript and a physically close CpG methylation site, these are quite rare. We believe these data, which we have made publicly available, will provide a critical step toward understanding the biological effects of genetic variation.

Citation

Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. Gibbs JR, van der Brug MP, Hernandez DG, Traynor BJ, Nalls MA, Lai SL, Arepalli S, Dillman A, Rafferty IP, Troncoso J, Johnson R, Zielke HR, Ferrucci L, Longo DL, Cookson MR, Singleton AB. PLoS Genet. 2010 May 13;6(5):e1000952.

Breast Cancer CellLines LBL

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: Joe Gray
Institution: Berkeley Lab
Approximate Number Subjects: 56

Abstract:

Recent studies suggest that thousands of genes may contribute to breast cancer pathophysiologies when deregulated by genomic or epigenomic events. Here, we describe a model "system" to appraise the functional contributions of these genes to breast cancer subsets. In general, the recurrent genomic and transcriptional characteristics of 51 breast cancer cell lines mirror those of 145 primary breast tumors, although some significant differences are documented. The cell lines that comprise the system also exhibit the substantial genomic, transcriptional, and biological heterogeneity found in primary tumors. We show, using Trastuzumab (Herceptin) monotherapy as an example, that the system can be used to identify molecular features that predict or indicate response to targeted therapies or other physiological perturbations.

Citation

A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. Neve RM, Chin K, Fridlyand J, Yeh J, Baehner FL, Fevr T, Clark L, Bayani N, Coppe JP, Tong F, Speed T, Spellman PT, DeVries S, Lapuk A, Wang NJ, Kuo WL, Stilwell JL, Pinkel D, Albertson DG, Waldman FM, McCormick F, Dickson RB, Johnson MD, Lippman M, Ethier S, Gazdar A, Gray JW. Cancer Cell. 2006 Dec;10(6):515-27.

Breast Cancer HER2+ ICGC

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. www.icgc.org

Citation

This study is not yet published. 

Breast Cancer HER2+ER+ ICGC

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. www.icgc.org

Citation

This study is not yet published. 

Breast Cancer Karolinska

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: Bergh
Institution: Karolinska Institute
Approximate Number Subjects: 650

Abstract:

During the period of 1997 to 2005, fresh breast neoplasms were collected from women treated for primary breast cancer by surgery at the Karolinska Institute Center, Sweden. Fresh tissue samples were immediately frozen in liquid nitrogen and stored at -80°C. Clinical outcomes in the cohort include response to therapy (CT scans, MRI, clinical criteria), metastatic vs. non-metastatic status, and time to progression.

Citation

This study is not yet published. 

Breast Cancer Nonresponders Cohort

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: Stephen Friend
Institution: Sage Bionetworks
Approximate Number Subjects: 300

Abstract:

Clinical outcomes with responder/non-responder status for standard of care therapies. Partial retrospective, partial prospective. Patient-driven. Minimum full exome sequencing.

Citation

This study is not yet published. 

Breast Cancer Triple Negative ICGC

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. www.icgc.org

Citation

This study is not yet published. 

Breast Cancer Tumors LBL

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: Joe Gray
Institution: Berkeley Lab
Approximate Number Subjects: 101

Abstract:

This study explores the roles of genome copy number abnormalities (CNAs) in breast cancer pathophysiology by identifying associations between recurrent CNAs, gene expression, and clinical outcome in a set of aggressively treated early-stage breast tumors. It shows that the recurrent CNAs differ between tumor subtypes defined by expression pattern and that stratification of patients according to outcome can be improved by measuring both expression and copy number, especially high-level amplification. Sixty-six genes deregulated by the high-level amplifications are potential therapeutic targets. Nine of these (FGFR1, IKBKB, ERBB2, PROCC, ADAM9, FNTA, ACACA, PNMT, and NR1D1) are considered druggable. Low-level CNAs appear to contribute to cancer progression by altering RNA and cellular metabolism.

Citation

Genomic and transcriptional aberrations linked to breast cancer pathophysiologies. Chin K, DeVries S, Fridlyand J, Spellman PT, Roydasgupta R, Kuo WL, Lapuk A, Neve RM, Qian Z, Ryder T, Chen F, Feiler H, Tokuyasu T, Kingsley C, Dairkee S, Meng Z, Chew K, Pinkel D, Jain A, Ljung BM, Esserman L, Albertson DG, Waldman FM, Gray JW. Cancer Cell. 2006 Dec;10(6):529-41.

CAMP Continuation Study Childhood Asthma

Species: Human
Tissue: Blood
Disease: Asthma
Investigator: Benjamin Raby
Institution: Brigham and Women's Hospital
Approximate Number Subjects: 200

Abstract:

Genome-wide association studies of human gene expression promise to identify functional regulatory genetic variation that contributes to phenotypic diversity. However, it is unclear how useful this approach will be for the identification of disease-susceptibility variants. We generated gene expression profiles for 22 184 mRNA transcripts using RNA derived from peripheral blood CD4+ lymphocytes, and genome-wide genotype data for 516 512 autosomal markers in 200 subjects. We screened for cis-acting variants by testing variants mapping within 50 kb of expressed transcripts for association with transcript abundance using generalized linear models. Significant associations were identified for 1585 genes at a false discovery rate of 0.05 (corresponding to P-values ranging from 1 × 10^-91 to 7 × 10^-4). Importantly, we identified evidence of regulatory variation for 119 previously mapped disease genes, including 24 examples where the variant with the strongest evidence of disease-association demonstrates strong association with specific transcript abundance. The prevalence of cis-acting variants among disease-associated genes was 63% higher than the genome-wide rate in our data set (P = 6.41 × 10^-6), and although many of the implicated loci were associated with immune-related diseases (including asthma, connective tissue disorders and inflammatory bowel disease), associations with genes implicated in non-immune-related diseases including lipid profiles, anthropomorphic measurements, cancer and neurologic disease were also observed. Genetic variants that confer inter-individual differences in gene expression represent an important subset of variants that contribute to disease susceptibility. Population-based integrative genetic approaches can help identify such variation and enhance our understanding of the genetic basis of complex traits.

Citation

Mapping of numerous disease-associated expression polymorphisms in primary peripheral blood CD4+ lymphocytes. Murphy A, Chu JH, Xu M, Carey VJ, Lazarus R, Liu A, Szefler SJ, Strunk R, Demuth K, Castro M, Hansel NN, Diette GB, Vonakis BM, Adkinson NF Jr, Klanderman BJ, Senter-Sylvia J, Ziniti J, Lange C, Pastinen T, Raby BA. Hum Mol Genet. 2010 Dec 1;19(23):4745-57.

Cardiogenics Transcriptomic Study

Species: Human
Tissue: Blood
Disease: CVD
Investigator: Stuart Cook
Institution: Imperial College London
Approximate Number Subjects: 849

Abstract:

Combined analyses of gene networks and DNA sequence variation can provide new insights into the aetiology of common diseases that may not be apparent from genome-wide association studies alone. Recent advances in rat genomics are facilitating systems-genetics approaches. Here we report the use of integrated genome-wide approaches across seven rat tissues to identify gene networks and the loci underlying their regulation. We defined an interferon regulatory factor 7 (IRF7)-driven inflammatory network (IDIN) enriched for viral response genes, which represents a molecular biomarker for macrophages and which was regulated in multiple tissues by a locus on rat chromosome 15q25. We show that Epstein-Barr virus induced gene 2 (Ebi2, also known as Gpr183), which lies at this locus and controls B lymphocyte migration, is expressed in macrophages and regulates the IDIN. The human orthologous locus on chromosome 13q32 controlled the human equivalent of the IDIN, which was conserved in monocytes. IDIN genes were more likely to associate with susceptibility to type 1 diabetes (T1D), a macrophage-associated autoimmune disease, than randomly selected immune response genes (P = 8.85 × 10^-6). The human locus controlling the IDIN was associated with the risk of T1D at single nucleotide polymorphism rs9585056 (P = 7.0 × 10^-10; odds ratio, 1.15), which was one of five single nucleotide polymorphisms in this region associated with EBI2 (GPR183) expression. These data implicate IRF7 network genes and their regulatory locus in the pathogenesis of T1D.

Citation

A trans-acting locus regulates an anti-viral expression network and type 1 diabetes risk. Heinig M, Petretto E, Wallace C, Bottolo L, Rotival M, Lu H, Li Y, Sarwar R, Langley SR, Bauerfeind A, Hummel O, Lee YA, Paskas S, Rintisch C, Saar K, Cooper J, Buchan R, Gray EE, Cyster JG; Cardiogenics Consortium, Erdmann J, Hengstenberg C, Maouche S, Ouwehand WH, Rice CM, Samani NJ, Schunkert H, Goodall AH, Schulz H, Roider HG, Vingron M, Blankenberg S, Münzel T, Zeller T, Szymczak S, Ziegler A, Tiret L, Smyth DJ, Pravenec M, Aitman TJ, Cambien F, Clayton D, Todd JA, Hubner N, Cook SA. Nature. 2010 Sep 23;467(7314):460-4.

Chronic Lymphocytic Leukemia ICGC

Species: Human
Tissue: Blood
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. www.icgc.org

Citation

This study is not yet published.

Colon Adenocarcinoma TCGA

Species: Human
Tissue: Intestine
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 500

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 colon adenocarcinoma tumor samples. For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

This study is not yet published. For more information, go to: http://tcga.cancer.gov/

Colon Cancer HKU

Species: Human
Tissue: Intestine
Disease: Cancer
Investigator: Suet Yi Leung
Institution: Hong Kong University
Approximate Number Subjects: 400

Abstract:

A cohort of 400 individuals with colon carcinoma with RNA profiling (already generated), DNA genotyping (planned) and miRNA (planned). Clinical outcomes available.

Citation

This study is not yet published. For more information, go to: http://tcga.cancer.gov/

Colon Cancer ICGC

Species: Human
Tissue: Intestine
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. www.icgc.org

Citation

This study is not yet published. For more information, go to: http://tcga.cancer.gov/

deCODE Adipose Cohort

Species: Human
Tissue: Adipose
Disease: Metabolic Disease
Investigator: Kari Stefansson
Institution: deCODE
Approximate Number Subjects: 1002

Abstract:

Common human diseases result from the interplay of many genes and environmental factors. Therefore, a more integrative biology approach is needed to unravel the complexity and causes of such diseases. To elucidate the complexity of common human diseases such as obesity, we have analysed the expression of 23,720 transcripts in large population-based blood and adipose tissue cohorts comprehensively assessed for various phenotypes, including traits related to clinical obesity. In contrast to the blood expression profiles, we observed a marked correlation between gene expression in adipose tissue and obesity-related traits. Genome-wide linkage and association mapping revealed a highly significant genetic component to gene expression traits, including a strong genetic effect of proximal (cis) signals, with 50% of the cis signals overlapping between the two tissues profiled. Here we demonstrate an extensive transcriptional network constructed from the human adipose data that exhibits significant overlap with similar network modules constructed from mouse adipose data. A core network module in humans and mice was identified that is enriched for genes involved in the inflammatory and immune response and has been found to be causally associated to obesity-related traits.

Citation

Genetics of gene expression and its effect on disease. Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, Mouy M, Steinthorsdottir V, Eiriksdottir GH, Bjornsdottir G, Reynisdottir I, Gudbjartsson D, Helgadottir A, Jonasdottir A, Jonasdottir A, Styrkarsdottir U, Gretarsdottir S, Magnusson KP, Stefansson H, Fossdal R, Kristjansson K, Gislason HG, Stefansson T, Leifsson BG, Thorsteinsdottir U, Lamb JR, Gulcher JR, Reitman ML, Kong A, Schadt EE, Stefansson K. Nature. 2008 Mar 27;452(7186):423-8. 

deCODE Blood Cohort

Species: Human
Tissue: Blood
Disease: Metabolic Disease
Investigator: Kari Stefansson
Institution: deCODE
Approximate Number Subjects: 1002

Abstract:

Common human diseases result from the interplay of many genes and environmental factors. Therefore, a more integrative biology approach is needed to unravel the complexity and causes of such diseases. To elucidate the complexity of common human diseases such as obesity, we have analysed the expression of 23,720 transcripts in large population-based blood and adipose tissue cohorts comprehensively assessed for various phenotypes, including traits related to clinical obesity. In contrast to the blood expression profiles, we observed a marked correlation between gene expression in adipose tissue and obesity-related traits. Genome-wide linkage and association mapping revealed a highly significant genetic component to gene expression traits, including a strong genetic effect of proximal (cis) signals, with 50% of the cis signals overlapping between the two tissues profiled. Here we demonstrate an extensive transcriptional network constructed from the human adipose data that exhibits significant overlap with similar network modules constructed from mouse adipose data. A core network module in humans and mice was identified that is enriched for genes involved in the inflammatory and immune response and has been found to be causally associated to obesity-related traits.

Citation

Genetics of gene expression and its effect on disease. Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, Mouy M, Steinthorsdottir V, Eiriksdottir GH, Bjornsdottir G, Reynisdottir I, Gudbjartsson D, Helgadottir A, Jonasdottir A, Jonasdottir A, Styrkarsdottir U, Gretarsdottir S, Magnusson KP, Stefansson H, Fossdal R, Kristjansson K, Gislason HG, Stefansson T, Leifsson BG, Thorsteinsdottir U, Lamb JR, Gulcher JR, Reitman ML, Kong A, Schadt EE, Stefansson K. Nature. 2008 Mar 27;452(7186):423-8. 

Ependymoma Cohort StJude

Species: Human
Tissue: Brain
Disease: Cancer
Investigator: Gilbertson
Institution: St Jude Hospital
Approximate Number Subjects: 83

Abstract:

Understanding the biology that underlies histologically similar
but molecularly distinct subgroups of cancer has proven difficult
because their defining genetic alterations are often numerous, and
the cellular origins of most cancers remain unknown. We sought
to decipher this heterogeneity by integrating matched genetic
alterations and candidate cells of origin to generate accurate disease
models. First, we identified subgroups of human ependymoma,
a form of neural tumour that arises throughout the
central nervous system (CNS). Subgroup-specific alterations
included amplifications and homozygous deletions of genes not
yet implicated in ependymoma. To select cellular compartments
most likely to give rise to subgroups of ependymoma, we matched
the transcriptomes of human tumours to those of mouse neural
stem cells (NSCs), isolated from different regions of the CNS at
different developmental stages, with an intact or deleted Ink4a/Arf
locus (that encodes Cdkn2a and b). The transcriptome of human
supratentorial ependymomas with amplified EPHB2 and deleted
INK4A/ARF matched only that of embryonic cerebral Ink4a/Arf-/-
NSCs. Notably, activation of Ephb2 signalling in these, but not
other, NSCs generated the first mouse model of ependymoma,
which is highly penetrant and accurately models the histology
and transcriptome of one subgroup of human supratentorial
tumour. Further, comparative analysis of matched mouse and
human tumours revealed selective deregulation in the expression
and copy number of genes that control synaptogenesis, pinpointing
disruption of this pathway as a critical event in the production of
this ependymoma subgroup. Our data demonstrate the power of
cross-species genomics to meticulously match subgroup-specific
driver mutations with cellular compartments to model and interrogate
cancer subgroups.

Citation

Cross-species genomics matches driver mutations and cell compartments to model ependymoma.
Johnson RA, Wright KD, Poppleton H, Mohankumar KM, Finkelstein D, Pounds SB, Rand V, Leary SE, White E, Eden C, Hogg T, Northcott P, Mack S, Neale G, Wang YD, Coyle B, Atkinson J, DeWire M, Kranenburg TA, Gillespie Y, Allen JC, Merchant T, Boop FA, Sanford RA, Gajjar A, Ellison DW, Taylor MD, Grundy RG, Gilbertson RJ. Nature. 2010 Jul 29;466(7306):632-6.

Framingham Heart Study

Species: Human
Tissue: Blood
Disease: Multiple
Investigator: Dan Levy
Institution: NHLBI
Approximate Number Subjects: 5000

Abstract:

The Framingham Heart Study comprises a longitudinal three-generation population study (n~15,000). Genotyping at 500K SNPs is available for ~9,500 individuals and blood gene expression for 5,000-7,000. Phenotypes in the cohort include serum lipid/cholesterol, CRP, glucose levels, adiposity measures, hypertension, T2D, chronic kidney disease, imaging measures and heart disease outcomes (MI, heart failure).

Citation

Genetics of the Framingham Heart Study population. Govindaraju DR, Cupples LA, Kannel WB, O'Donnell CJ, Atwood LD, D'Agostino RB Sr, Fox CS, Larson M, Levy D, Murabito J, Vasan RS, Splansky GL, Wolf PA, Benjamin EJ. Adv Genet. 2008;62:33-65.

http://www.framinghamheartstudy.org/

Gastric Bypass Cohort

Species: Human
Tissue: Adipose omental,Adipose subcutaneous,Liver,Stomach
Disease: Metabolic Disease
Investigator: Lee Kaplan
Institution: Harvard Medical School, Massachusetts General Hospital
Approximate Number Subjects: 975

Abstract:

High-throughput genotyping and expression profiling in three relevant tissues (liver, omental adipose, subcutaneous adipose) from ~1000 patients undergoing Roux-en-Y gastric bypass surgery to identify expression SNPs (eSNPs) and the genetics of weight loss and diabetes correction due to bypass surgery; microRNA profiling of liver samples and adipose tissues to understand the role of microRNA in metabolic disorders.

Citation

Liver and adipose expression associated SNPs are enriched for association to type 2 diabetes. Zhong H, Beaulaurier J, Lum PY, Molony C, Yang X, Macneil DJ, Weingarth DT, Zhang B, Greenawalt D, Dobrin R, Hao K, Woo S, Fabre-Suver C, Qian S, Tota MR, Keller MP, Kendziorski CM, Yandell BS, Castro V, Attie AD, Kaplan LM, Schadt EE. PLoS Genet. 2010 May 6;6:e1000932.

Gastric Cancer ACRG

Species: Human
Tissue: Stomach
Disease: Cancer
Investigator: Asian Cancer Research Group, Inc., (ACRG)
Institution: Asian Cancer Research Group, Inc., (ACRG)
Approximate Number Subjects: 2000

Abstract:

Eli Lilly and Company, Merck, and Pfizer Inc. have formed the Asian Cancer Research Group, Inc., (ACRG), an independent, not-for-profit company established to accelerate research and ultimately improve treatment for patients affected with the most commonly diagnosed cancers in Asia. Over the next two years ACRG have committed to create one of the most extensive pharmacogenomic cancer databases known to date. This database will be composed of data from approximately 2,000 tissue samples from patients with lung and gastric cancer that will be made publicly available to researchers and, over time, further populated with clinical data from a longitudinal analysis of patients. Comparison of the contrasting genomic signatures of these cancers could inform new approaches to treatment.

Citation

This study is not yet published.

Gastric Cancers ICGC

Species: Human
Tissue: Stomach
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. www.icgc.org

Citation

This study is not yet published.

Gastroesophageal Cancer Cohort

Species: Human
Tissue: Stomach
Disease: Cancer
Investigator: Isinger-Ekstrand
Institution: Lund
Approximate Number Subjects: 27

Abstract:

We aimed to characterize the genomic profiles of adenocarcinomas in the gastroesophageal junction in relation to cancers in the esophagus and the stomach. Profiles of gains/losses as well as gene expression profiles were obtained from 27 gastroesophageal adenocarcinomas by means of 32k high-resolution array-based comparative genomic hybridization and 27k oligo gene expression arrays, and putative target genes were validated in an extended series. Adenocarcinomas in the distal esophagus and the gastroesophageal junction showed strong similarities with the most common gains at 20q13, 8q24, 1q21-23, 5p15, 13q34, and 12q13, whereas different profiles with gains at 5p15, 7p22, 2q35, and 13q34 characterized gastric cancers. CDK6 and EGFR were identified as putative target genes in cancers of the esophagus and the gastroesophageal junction, with upregulation in one quarter of the tumors. Gains/losses and gene expression profiles show strong similarity between cancers in the distal esophagus and the gastroesophageal junction with frequent upregulation of CDK6 and EGFR, whereas gastric cancer displays distinct genetic changes. These data suggest that molecular diagnostics and targeted therapies can be applied to adenocarcinomas of the distal esophagus and gastroesophageal junction alike.

Citation

Genetic profiles of gastroesophageal cancer: combined analysis using expression array and tiling array--comparative genomic hybridization. Isinger-Ekstrand A, Johansson J, Ohlsson M, Francis P, Staaf J, Jönsson M, Borg A, Nilbert M. Cancer Genet Cytogenet. 2010 Jul 15;200(2):120-6.

GenCord

Species: Human
Tissue: Blood,Lymphoblastoid Cell Line,Cell Line
Disease: Healthy
Investigator: Stylianos Antonarakis/Manolis Dermitzakis
Institution: University of Geneva/Sanger Institute
Approximate Number Subjects: 85

Abstract:

Studies correlating genetic variation to gene expression facilitate the interpretation of common human phenotypes and disease. As functional variants may be operating in a tissue-dependent manner, we performed gene expression profiling and association with genetic variants (single-nucleotide polymorphisms) on three cell types of 75 individuals. We detected cell type-specific genetic effects, with 69 to 80% of regulatory variants operating in a cell type-specific manner, and identified multiple expressive quantitative trait loci (eQTLs) per gene, unique or shared among cell types and positively correlated with the number of transcripts per gene. Cell type-specific eQTLs were found at larger distances from genes and at lower effect size, similar to known enhancers. These data suggest that the complete regulatory variant repertoire can only be uncovered in the context of cell-type specificity.

Citation

Common regulatory variation impacts gene expression in a cell type-dependent manner.
Dimas AS, Deutsch S, Stranger BE, Montgomery SB, Borel C, Attar-Cohen H, Ingle C, Beazley C, Gutierrez Arcelus M, Sekowska M, Gagnebin M, Nisbett J, Deloukas P, Dermitzakis ET, Antonarakis SE. Science. 2009 Sep 4;325(5945):1246-50.

GeneNetwork.org

Species: Mouse
Tissue: Multiple
Disease: Multiple
Investigator: Robert Williams
Institution: University of Tennessee
Approximate Number Subjects: 0

Abstract:

GeneNetwork consists of linked resources and analysis tools for systems genetics. It has been designed for the analysis of networks of genes, transcripts, and classic phenotype data sets. GeneNetwork combines more than 25 years of legacy data generated from mouse samples by hundreds of scientists with genome sequence data and massive transcriptome data sets (expression genetic or eQTL data sets). 

http://www.genenetwork.org/home.html

Citation

This resource can be found at:  www.genenetwork.org

Groningen Blood Cohort

Species: Human
Tissue: Blood
Disease: Inflammatory Disease
Investigator: Franke/van Heel
Institution: Groningen/Barts
Approximate Number Subjects: 1469

Abstract:

We performed a second-generation genome-wide association study of 4,533 individuals with celiac disease (cases) and 10,750 control subjects. We genotyped 113 selected SNPs with P(GWAS) < 10(-4) and 18 SNPs from 14 known loci in a further 4,918 cases and 5,684 controls. Variants from 13 new regions reached genome-wide significance (P(combined) < 5 x 10(-8)); most contain genes with immune functions (BACH2, CCR4, CD80, CIITA-SOCS1-CLEC16A, ICOSLG and ZMIZ1), with ETS1, RUNX3, THEMIS and TNFRSF14 having key roles in thymic T-cell selection. There was evidence to suggest associations for a further 13 regions. In an expression quantitative trait meta-analysis of 1,469 whole blood samples, 20 of 38 (52.6%) tested loci had celiac risk variants correlated (P < 0.0028, FDR 5%) with cis gene expression.

Citation

Multiple common variants for celiac disease influencing immune gene expression. Dubois PC, Trynka G, Franke L, Hunt KA, Romanos J, Curtotti A, Zhernakova A, Heap GA, Ádány R, Aromaa A, Bardella MT, van den Berg LH, Bockett NA, de la Concha EG, Dema B, Fehrmann RS, Fernández-Arquero M, Fiatal S, Grandone E, Green PM, Groen HJ, Gwilliam R, Houwen RH, Hunt SE, Kaukinen K, Kelleher D, Korponay-Szabo I, Kurppa K, MacMathuna P, Mäki M, Mazzilli MC, McCann OT, Mearin ML, Mein CA, Mirza MM, Mistry V, Mora B, Morley KI, Mulder CJ, Murray JA, Núñez C, Oosterom E, Ophoff RA, Polanco I, Peltonen L, Platteel M, Rybak A, Salomaa V, Schweizer JJ, Sperandeo MP, Tack GJ, Turner G, Veldink JH, Verbeek WH, Weersma RK, Wolters VM, Urcelay E, Cukrowska B, Greco L, Neuhausen SL, McManus R, Barisani D, Deloukas P, Barrett JC, Saavalainen P, Wijmenga C, van Heel DA. Nat Genet. 2010 Apr;42(4):295-302.

Gutenberg Heart Study Monocytes

Species: Human
Tissue: Blood
Disease: CVD
Investigator: Blankenberg/Cambien
Institution: JGUM/INSERM
Approximate Number Subjects: 1500

Abstract:

Variability of gene expression in human may link gene sequence variability and phenotypes; however, non-genetic variations, alone or in combination with genetics, may also influence expression traits and have a critical role in physiological and disease processes. To get better insight into the overall variability of gene expression, we assessed the transcriptome of circulating monocytes, a key cell involved in immunity-related diseases and atherosclerosis, in 1,490 unrelated individuals and investigated its association with >675,000 SNPs and 10 common cardiovascular risk factors. Out of 12,808 expressed genes, 2,745 expression quantitative trait loci were detected (P<5.78x10(-12)), most of them (90%) being cis-modulated. Extensive analyses showed that associations identified by genome-wide association studies of lipids, body mass index or blood pressure were rarely compatible with a mediation by monocyte expression level at the locus. At a study-wide level (P<3.9x10(-7)), 1,662 expression traits (13.0%) were significantly associated with at least one risk factor. Genome-wide interaction analyses suggested that genetic variability and risk factors mostly acted additively on gene expression. Because of the structure of correlation among expression traits, the variability of risk factors could be characterized by a limited set of independent gene expressions which may have biological and clinical relevance. For example expression traits associated with cigarette smoking were more strongly associated with carotid atherosclerosis than smoking itself. This study demonstrates that the monocyte transcriptome is a potent integrator of genetic and non-genetic influences of relevance for disease pathophysiology and risk assessment.

Citation

Genetics and beyond--the transcriptome of human monocytes and disease susceptibility. Zeller T, Wild P, Szymczak S, Rotival M, Schillert A, Castagne R, Maouche S, Germain M, Lackner K, Rossmann H, Eleftheriadis M, Sinning CR, Schnabel RB, Lubos E, Mennerich D, Rust W, Perret C, Proust C, Nicaud V, Loscalzo J, Hübner N, Tregouet D, Münzel T, Ziegler A, Tiret L, Blankenberg S, Cambien F. PLoS One. 2010 May 18;5(5):e10693.

HapMap LCL Drug Response BROAD

Species: Human
Tissue: Lymphoblastoid Cell Line
Disease: Healthy
Investigator: David Altshuler
Institution: BROAD Institute
Approximate Number Subjects: 269

Abstract:

Lymphoblastoid cell lines (LCLs), originally collected as renewable sources of DNA, are now being used as a model system to study genotype-phenotype relationships in human cells, including searches for QTLs influencing levels of individual mRNAs and responses to drugs and radiation. In the course of attempting to map genes for drug response using 269 LCLs from the International HapMap Project, we evaluated the extent to which biological noise and non-genetic confounders contribute to trait variability in LCLs. While drug responses could be technically well measured on a given day, we observed significant day-to-day variability and substantial correlation to non-genetic confounders, such as baseline growth rates and metabolic state in culture. After correcting for these confounders, we were unable to detect any QTLs with genome-wide significance for drug response. A much higher proportion of variance in mRNA levels may be attributed to non-genetic factors (intra-individual variance--i.e., biological noise, levels of the EBV virus used to transform the cells, ATP levels) than to detectable eQTLs. Finally, in an attempt to improve power, we focused analysis on those genes that had both detectable eQTLs and correlation to drug response; we were unable to detect evidence that eQTL SNPs are convincingly associated with drug response in the model. While LCLs are a promising model for pharmacogenetic experiments, biological noise and in vitro artifacts may reduce power and have the potential to create spurious association due to confounding.

Citation

Genetic analysis of human traits in vitro: drug response and gene expression in lymphoblastoid cell lines. Choy E, Yelensky R, Bonakdar S, Plenge RM, Saxena R, De Jager PL, Shaw SY, Wolfish CS, Slavik JM, Cotsapas C, Rivas M, Dermitzakis ET, Cahir-McFarland E, Kieff E, Hafler D, Daly MJ, Altshuler D. PLoS Genet. 2008 Nov;4(11):e1000287. 

HapMap LCL Drug Response Chicago

Species: Human
Tissue: Lymphoblastoid Cell Line
Disease: Cancer
Investigator: Dolan
Institution: Chicago
Approximate Number Subjects: 176

Abstract:

Cisplatin, a platinating agent commonly used to treat several cancers, is associated with nephrotoxicity, neurotoxicity, and ototoxicity, which has hindered its utility. To gain a better understanding of the genetic variants associated with cisplatin-induced toxicity, we present a stepwise approach integrating genotypes, gene expression, and sensitivity of HapMap cell lines to cisplatin. Cell lines derived from 30 trios of European descent (CEU) and 30 trios of African descent (YRI) were used to develop a preclinical model to identify genetic variants and gene expression that contribute to cisplatin-induced cytotoxicity in two different populations. Cytotoxicity was determined as cell-growth inhibition at increasing concentrations of cisplatin for 48 h. Gene expression in 176 HapMap cell lines (87 CEU and 89 YRI) was determined using the Affymetrix GeneChip Human Exon 1.0 ST Array. We identified six, two, and nine representative SNPs that contribute to cisplatin-induced cytotoxicity through their effects on 8, 2, and 16 gene expressions in the combined, Centre d'Etude du Polymorphisme Humain (CEPH), and Yoruban populations, respectively. These genetic variants contribute to 27%, 29%, and 45% of the overall variation in cell sensitivity to cisplatin in the combined, CEPH, and Yoruban populations, respectively. Our whole-genome approach can be used to elucidate the expression of quantitative trait loci contributing to a wide range of cellular phenotypes.

Citation

Identification of genetic variants contributing to cisplatin-induced cytotoxicity by use of a genomewide approach. Huang RS, Duan S, Shukla SJ, Kistner EO, Clark TA, Chen TX, Schweitzer AC, Blume JE, Dolan ME. Am J Hum Genet. 2007 Sep;81(3):427-37.

HapMap LCL McGill

Species: Human
Tissue: Lymphoblastoid Cell Line
Disease: Healthy
Investigator: Pastinen/Grundberg
Institution: McGill
Approximate Number Subjects: 53

Abstract:

Cis-acting variants altering gene expression are a source of phenotypic differences. The cis-acting components of expression variation can be identified through the mapping of differences in allelic expression (AE), which is the measure of relative expression between two allelic transcripts. We generated a map of AE associated SNPs using quantitative measurements of AE on Illumina Human1M BeadChips. In 53 lymphoblastoid cell lines derived from donors of European descent, we identified common cis variants affecting 30% (2935/9751) of the measured RefSeq transcripts at 0.001 permutation significance. The pervasive influence of cis-regulatory variants, which explain 50% of population variation in AE, extend to full-length transcripts and their isoforms as well as to unannotated transcripts. These strong effects facilitate fine mapping of cis-regulatory SNPs, as demonstrated by dissection of heritable control of transcripts in the systemic lupus erythematosus-associated C8orf13-BLK region in chromosome 8. The dense collection of associations will facilitate large-scale isolation of cis-regulatory SNPs.

Citation

Global patterns of cis variation in human cells revealed by high-density allelic expression analysis. Ge B, Pokholok DK, Kwan T, Grundberg E, Morcos L, Verlaan DJ, Le J, Koka V, Lam KC, Gagné V, Dias J, Hoberman R, Montpetit A, Joly MM, Harvey EJ, Sinnett D, Beaulieu P, Hamon R, Graziani A, Dewar K, Harmsen E, Majewski J, Göring HH, Naumova AK, Blanchette M, Gunderson KL, Pastinen T. Nat Genet. 2009 Nov;41(11):1216-22.

HapMap LCL Merck

Species: Human
Tissue: Lymphoblastoid Cell Line
Disease: Healthy
Investigator: Eric Schadt
Institution: Merck Research Laboratories
Approximate Number Subjects: 210

Abstract:

Combining genetic inheritance information, for both molecular profiles and complex traits, is a promising strategy not only for detecting quantitative trait loci (QTLs) for complex traits but for understanding which genes, pathways, and biological processes are also under the influence of a given QTL. As a primary step in determining the feasibility of such an approach in humans, we present the largest survey to date, to our knowledge, of the heritability of gene-expression traits in segregating human populations. In particular, we measured expression for 23,499 genes in lymphoblastoid cell lines for members of 15 Centre d'Etude du Polymorphisme Humain (CEPH) families. Of the total set of genes, 2,340 were found to be expressed, of which 31% had significant heritability when a false-discovery rate of 0.05 was used. QTLs were detected for 33 genes on the basis of at least one P value <.000005. Of these, 13 genes possessed a QTL within 5 Mb of their physical location. Hierarchical clustering was performed on the basis of both Pearson correlation of gene expression and genetic correlation. Both reflected biologically relevant activity taking place in the lymphoblastoid cell lines, with greater coherency represented in Kyoto Encyclopedia of Genes and Genomes database (KEGG) pathways than in Gene Ontology database pathways. However, more pathway coherence was observed in KEGG pathways when clustering was based on genetic correlation than when clustering was based on Pearson correlation. As more expression data in segregating populations are generated, viewing clusters or networks based on genetic correlation measures and shared QTLs will offer potentially novel insights into the relationship among genes that may underlie complex traits.

Citation

Genetic inheritance of gene expression in human cell lines. Monks SA, Leonardson A, Zhu H, Cundiff P, Pietrusiak P, Edwards S, Phillips JW, Sachs A, Schadt EE. Am J Hum Genet. 2004 Dec;75(6):1094-105.

HapMap LCL Penn

Species: Human
Tissue: Lymphoblastoid Cell Line
Disease: Healthy
Investigator: Vivian Cheung
Institution: University of Pennsylvania
Approximate Number Subjects: 140

Abstract:

Natural variation in gene expression is extensive in humans and other organisms, and variation in the baseline expression level of many genes has a heritable component. To localize the genetic determinants of these quantitative traits (expression phenotypes) in humans, we used microarrays to measure gene expression levels and performed genome-wide linkage analysis for expression levels of 3,554 genes in 14 large families. For approximately 1,000 expression phenotypes, there was significant evidence of linkage to specific chromosomal regions. Both cis- and trans-acting loci regulate variation in the expression levels of genes, although most act in trans. Many gene expression phenotypes are influenced by several genetic determinants. Furthermore, we found hotspots of transcriptional regulation where significant evidence of linkage for several expression phenotypes (up to 31) coincides, and expression levels of many genes that share the same regulatory region are significantly correlated. The combination of microarray techniques for phenotyping and linkage analysis for quantitative traits allows the genetic mapping of determinants that contribute to variation in human gene expression.

Citation

Genetic analysis of genome-wide variation in human gene expression. Morley M, Molony CM, Weber TM, Devlin JL, Ewens KG, Spielman RS, Cheung VG. Nature. 2004 Aug 12;430(7001):743-7.

HapMap LCL RNAseq Chicago

Species: Human
Tissue: Lymphoblastoid Cell Line
Disease: Healthy
Investigator: Jonathan Pritchard
Institution: University of Chicago
Approximate Number Subjects: 69

Abstract:

Understanding the genetic mechanisms underlying natural variation in gene expression is a central goal of both medical and evolutionary genetics, and studies of expression quantitative trait loci (eQTLs) have become an important tool for achieving this goal. Although all eQTL studies so far have assayed messenger RNA levels using expression microarrays, recent advances in RNA sequencing enable the analysis of transcript variation at unprecedented resolution. We sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals that have been extensively genotyped by the International HapMap Project. By pooling data from all individuals, we generated a map of the transcriptional landscape of these cells, identifying extensive use of unannotated untranslated regions and more than 100 new putative protein-coding exons. Using the genotypes from the HapMap project, we identified more than a thousand genes at which genetic variation influences overall expression levels or splicing. We demonstrate that eQTLs near genes generally act by a mechanism involving allele-specific expression, and that variation that influences the inclusion of an exon is enriched within and near the consensus splice sites. Our results illustrate the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing and allele-specific expression across individuals.

Citation

Understanding mechanisms underlying human gene expression variation with RNA sequencing. Pickrell JK, Marioni JC, Pai AA, Degner JF, Engelhardt BE, Nkadori E, Veyrieras JB, Stephens M, Gilad Y, Pritchard JK. Nature. 2010 Apr 1;464(7289):768-72.

HapMap LCL RNAseq Geneva

Species: Human
Tissue: Lymphoblastoid Cell Line
Disease: Healthy
Investigator: Stylianos Antonarakis/Manolis Dermitzakis
Institution: University of Geneva/Sanger Institute
Approximate Number Subjects: 60

Abstract:

Gene expression is an important phenotype that informs about genetic and environmental effects on cellular state. Many studies have previously identified genetic variants for gene expression phenotypes using custom and commercially available microarrays. Second generation sequencing technologies are now providing unprecedented access to the fine structure of the transcriptome. We have sequenced the mRNA fraction of the transcriptome in 60 extended HapMap individuals of European descent and have combined these data with genetic variants from the HapMap3 project. We have quantified exon abundance based on read depth and have also developed methods to quantify whole transcript abundance. We have found that approximately 10 million reads of sequencing can provide access to the same dynamic range as arrays with better quantification of alternative and highly abundant transcripts. Correlation with SNPs (single nucleotide polymorphisms) leads to a larger discovery of eQTLs (expression quantitative trait loci) than with arrays. We also detect a substantial number of variants that influence the structure of mature transcripts indicating variants responsible for alternative splicing. Finally, measures of allele-specific expression allowed the identification of rare eQTLs and allelic differences in transcript structure. This analysis shows that high throughput sequencing technologies reveal new properties of genetic effects on the transcriptome and allow the exploration of genetic effects in cellular processes.

Citation

Transcriptome genetics using second generation sequencing in a Caucasian population. Montgomery SB, Sammeth M, Gutierrez-Arcelus M, Lach RP, Ingle C, Nisbett J, Guigo R, Dermitzakis ET. Nature. 2010 Apr 1;464(7289):773-7.

HapMap LCL Sanger CNV

Species: Human
Tissue: Lymphoblastoid Cell Line
Disease: Healthy
Investigator: Manolis Dermitzakis
Institution: Sanger Institute
Approximate Number Subjects: 210

Abstract:

Extensive studies are currently being performed to associate disease susceptibility with one form of genetic variation, namely, single-nucleotide polymorphisms (SNPs). In recent years, another type of common genetic variation has been characterized, namely, structural variation, including copy number variants (CNVs). To determine the overall contribution of CNVs to complex phenotypes, we have performed association analyses of expression levels of 14,925 transcripts with SNPs and CNVs in individuals who are part of the International HapMap project. SNPs and CNVs captured 83.6% and 17.7% of the total detected genetic variation in gene expression, respectively, but the signals from the two types of variation had little overlap. Interrogation of the genome for both types of variants may be an effective way to elucidate the causes of complex phenotypes and disease in humans.

Citation

Relative impact of nucleotide and copy number variation on gene expression phenotypes. Stranger BE, Forrest MS, Dunning M, Ingle CE, Beazley C, Thorne N, Redon R, Bird CP, de Grassi A, Lee C, Tyler-Smith C, Carter N, Scherer SW, Tavaré S, Deloukas P, Hurles ME, Dermitzakis ET. Science. 2007 Feb 9;315(5813):848-53.

Head And Neck Squamous Cell Carcinoma TCGA

Species: Human
Tissue: Skin
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 500

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 head and neck squamous cell carcinoma tumor samples. For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

This study is not yet published. For more information, go to: http://tcga.cancer.gov/

Hepatocellular Carcinoma Metabolic ICGC

Species: Human
Tissue: Liver
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. For more information, go to: www.icgc.org

Citation

This study is not yet published.  

Hepatocellular Carcinoma Viral ICGC

Species: Human
Tissue: Liver
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. For more information, go to: www.icgc.org

Citation

This study is not yet published.  

Invasive Breast Cancer TCGA

Species: Human
Tissue: Breast
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 500

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 invasive breast cancer tumor samples. For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

Unable to load the dataset citation.

Lung Adenocarcinoma TCGA

Species: Human
Tissue: Lung
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 500

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 lung adenocarcinoma tumor samples. For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

This study is not yet published.  For more information, go to: http://tcga.cancer.gov/

Lung Cancer ACRG

Species: Human
Tissue: Lung
Disease: Cancer
Investigator: Asian Cancer Research Group, Inc., (ACRG)
Institution: Asian Cancer Research Group, Inc., (ACRG)
Approximate Number Subjects: 2000

Abstract:

Eli Lilly and Company, Merck, and Pfizer Inc. have formed the Asian Cancer Research Group, Inc. (ACRG), an independent, not-for-profit company established to accelerate research and ultimately improve treatment for patients affected by the most commonly diagnosed cancers in Asia. Over the next two years, ACRG has committed to creating one of the most extensive pharmacogenomic cancer databases known to date. This database will be composed of data from approximately 2,000 tissue samples from patients with lung and gastric cancer; it will be made publicly available to researchers and, over time, further populated with clinical data from a longitudinal analysis of patients. Comparison of the contrasting genomic signatures of these cancers could inform new approaches to treatment.

Citation

This study is not yet published.  

Lung Cancer Nonresponders Cohort

Species: Human
Tissue: Lung
Disease: Cancer
Investigator: Stephen Friend
Institution: Sage Bionetworks
Approximate Number Subjects: 300

Abstract:

Clinical outcomes with responder/non-responder status for standard of care therapies. Partial retrospective, partial prospective. Patient-driven. Minimum full exome sequencing.

Citation

This study is not yet published.  

Lung Cancer UBC

Species: Human
Tissue: Lung
Disease: Cancer
Investigator: Will Lockwood
Institution: University of British Columbia
Approximate Number Subjects: 114

Abstract:

Traditionally, non-small cell lung cancer is treated as a single disease entity in terms of systemic therapy. Emerging evidence suggests the major subtypes, adenocarcinoma (AC) and squamous cell carcinoma (SqCC), respond differently to therapy. Identification of the molecular differences between these tumor types will have a significant impact in designing novel therapies that can improve the treatment outcome. We used an integrative genomics approach, combining high-resolution comparative genomic hybridization and gene expression microarray profiles, to compare AC and SqCC tumors in order to uncover alterations at the DNA level, with corresponding gene transcription changes, which are selected for during development of lung cancer subtypes. Through the analysis of multiple independent cohorts of clinical tumor samples (>330), normal lung tissues and bronchial epithelial cells obtained by bronchial brushing in smokers without lung cancer, we identified the overexpression of BRF2, a gene on Chromosome 8p12, which is specific for development of SqCC of lung. Genetic activation of BRF2, which encodes a RNA polymerase III (Pol III) transcription initiation factor, was found to be associated with increased expression of small nuclear RNAs (snRNAs) that are involved in processes essential for cell growth, such as RNA splicing. Ectopic expression of BRF2 in human bronchial epithelial cells induced a transformed phenotype and demonstrates downstream oncogenic effects, whereas RNA interference (RNAi)-mediated knockdown suppressed growth and colony formation of SqCC cells overexpressing BRF2, but not AC cells. Frequent activation of BRF2 in >35% preinvasive bronchial carcinoma in situ, as well as in dysplastic lesions, provides evidence that BRF2 expression is an early event in cancer development of this cell lineage.
This is the first study, to our knowledge, to show that the focal amplification of a gene in Chromosome 8p12, plays a key role in squamous cell lineage specificity of the disease. Our data suggest that genetic activation of BRF2 represents a unique mechanism of SqCC lung tumorigenesis through the increase of Pol III-mediated transcription. It can serve as a marker for lung SqCC and may provide a novel target for therapy.

Citation

Integrative genomic analyses identify BRF2 as a novel lineage-specific oncogene in lung squamous cell carcinoma. Lockwood WW, Chari R, Coe BP, Thu KL, Garnis C, Malloff CA, Campbell J, Williams AC, Hwang D, Zhu CQ, Buys TP, Yee J, English JC, Macaulay C, Tsao MS, Gazdar AF, Minna JD, Lam S, Lam WL. PLoS Med. 2010 Jul 27;7(7):e1000315.

Lung Cohort

Species: Human
Tissue: Lung
Disease: Respiratory Disease
Investigator: Paré/ Bossé/ Timens
Institution: UBC/ Laval/ Groningen
Approximate Number Subjects: 1180

Abstract:

A cohort assembled from 3 centres to define the genetics of gene expression in the lung. Samples are collected from a range of pathological conditions, with the majority being either COPD or adjacent normal tissue from lung cancer patients. Gene expression was profiled on a custom Affymetrix chip, and genotyping was performed on the Illumina 1M panel. Clinical outcomes available include measures of lung function.

Citation

This study is not yet published.  

Lung Squamous Cell Carcinoma TCGA

Species: Human
Tissue: Lung
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 500

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 lung squamous cell carcinoma tumor samples. For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

This study is not yet published.  For more information, go to: http://tcga.cancer.gov/

Malignant Lymphoma ICGC

Species: Human
Tissue: Blood
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. For more information, go to: www.icgc.org

Citation

This study is not yet published.  

Medulloblastoma

Species: Human
Tissue: Brain
Disease: Cancer
Investigator: Bert Vogelstein
Institution: Johns Hopkins Medicine
Approximate Number Subjects: 47

Abstract:

The medulloblastoma dataset is comprised of 24 pancreatic cancer samples including 14 cell lines and 10 xenografts derived from individual patients. The dataset includes expression traits as measured using the SAGE gene expression method (not affiliated with Sage Bionetworks), and clinical phenotypes including patient demographics and tumor traits. SNP traits have been measured on these samples (Illumina 1M beadchip) but are not yet available.

Citation

Core signaling pathways in human pancreatic cancers revealed by global genomic analyses. Jones S, Zhang X, Parsons DW, Lin JC, Leary RJ, Angenendt P, Mankoo P, Carter H, Kamiyama H, Jimeno A, Hong SM, Fu B, Lin MT, Calhoun ES, Kamiyama M, Walter K, Nikolskaya T, Nikolsky Y, Hartigan J, Smith DR, Hidalgo M, Leach SD, Klein AP, Jaffee EM, Goggins M, Maitra A, Iacobuzio-Donahue C, Eshleman JR, Kern SE, Hruban RH, Karchin R, Papadopoulos N, Parmigiani G, Vogelstein B, Velculescu VE, Kinzler KW. Science. 2008 Sep 26;321(5897):1801-6.

Mouse Cultured Bone marrow derived Macrophage

Species: Mouse
Tissue: Macrophage
Disease: Healthy
Investigator: Smith
Institution: Cleveland Clinic
Approximate Number Subjects: 207

Abstract:

A powerful way to identify genes for complex traits is to combine genetic and genomic methods. Many quantitative trait loci (QTLs) for complex traits are sex specific, but the reason for this is not well understood. RNA was prepared from bone marrow derived macrophages of 93 female and 114 male F(2) mice derived from a strain intercross between apoE-deficient mice on the AKR and DBA/2 genetic backgrounds, and was subjected to transcriptome profiling using microarrays. A high density genome scan was performed using a mouse SNP chip, and expression QTLs (eQTLs) were located for expressed transcripts. Using suggestive and significant LOD score cutoffs of 3.0 and 4.3, respectively, thousands of eQTLs in the female and male cohorts were identified. At the suggestive LOD threshold the majority of the eQTLs were trans eQTLs, mapping unlinked to the position of the gene. Cis eQTLs, which mapped to the location of the gene, had much higher LOD scores than trans eQTLs, indicating their more direct effect on gene expression. The majority of cis eQTLs were common to both males and females, but only approximately 1% of the trans eQTLs were shared by both sexes. At the significant LOD threshold, the majority of eQTLs were cis eQTLs, which were mostly sex-shared, while the trans eQTLs were overwhelmingly sex-specific. Pooling the male and female data, 31% of expressed transcripts were expressed at different levels in males vs. females after correction for multiple testing. These studies demonstrate a large sex effect on gene expression and trans regulation, under conditions where male and female derived cells were cultured ex vivo and thus without the influence of endogenous sex steroids. These data suggest that eQTL data from male and female mice should be analyzed separately, as many effects, such as trans regulation, are sex specific.

Citation

Sex specific gene regulation and expression QTLs in mouse macrophages from a strain intercross. Bhasin JM, Chakrabarti E, Peng DQ, Kulkarni A, Chen X, Smith JD. PLoS One. 2008 Jan 16;3(1):e1435.

Mouse liver expression

Species: Mouse
Tissue: Liver
Disease: Healthy
Investigator: Jake Lusis/Ghazalpour
Institution: UCLA
Approximate Number Subjects: 110

Abstract:

Quantitative trait locus (QTL) analysis is a powerful tool for mapping genes for complex traits in mice, but its utility is limited by poor resolution. A promising mapping approach is association analysis in outbred stocks or different inbred strains. As a proof of concept for the association approach, we applied whole-genome association analysis to hepatic gene expression traits in an outbred mouse population, the MF1 stock, and replicated expression QTL (eQTL) identified in previous studies of F2 intercross mice. We found that the mapping resolution of these eQTL was significantly greater in the outbred population. Through an example, we also showed how this precise mapping can be used to resolve previously identified loci (in intercross studies), which affect many different transcript levels (known as eQTL "hotspots"), into distinct regions. Our results also highlight the importance of correcting for population structure in whole-genome association studies in the outbred stock.

Citation

High-resolution mapping of gene expression using association in an outbred mouse stock. Ghazalpour A, Doss S, Kang H, Farber C, Wen PZ, Brozell A, Castellanos R, Eskin E, Smith DJ, Drake TA, Lusis AJ. PLoS Genet. 2008 Aug 8;4(8):e1000149.

Mouse Model of Diabetes (BxH)

Species: Mouse
Tissue: Adipose,Liver,Brain,Muscle
Disease: Diabetes
Investigator: Jake Lusis/ Merck & Co.
Institution: UCLA/ Merck & Co.
Approximate Number Subjects: 309

Abstract:

Genetic variants that are associated with common human diseases do not lead directly to disease, but instead act on intermediate, molecular phenotypes that in turn induce changes in higher-order disease traits. Therefore, identifying the molecular phenotypes that vary in response to changes in DNA and that also associate with changes in disease traits has the potential to provide the functional information required to not only identify and validate the susceptibility genes that are directly affected by changes in DNA, but also to understand the molecular networks in which such genes operate and how changes in these networks lead to changes in disease traits. Toward that end, we profiled more than 39,000 transcripts and we genotyped 782,476 unique single nucleotide polymorphisms (SNPs) in more than 400 human liver samples to characterize the genetic architecture of gene expression in the human liver, a metabolically active tissue that is important in a number of common human diseases, including obesity, diabetes, and atherosclerosis. This genome-wide association study of gene expression resulted in the detection of more than 6,000 associations between SNP genotypes and liver gene expression traits, where many of the corresponding genes identified have already been implicated in a number of human diseases. The utility of these data for elucidating the causes of common human diseases is demonstrated by integrating them with genotypic and expression data from other human and mouse populations. This provides much-needed functional support for the candidate susceptibility genes being identified at a growing number of genetic loci that have been identified as key drivers of disease from genome-wide association studies of disease. 
By using an integrative genomics approach, we highlight how the gene RPS26 and not ERBB3 is supported by our data as the most likely susceptibility gene for a novel type 1 diabetes locus recently identified in a large-scale, genome-wide association study. We also identify SORT1 and CELSR2 as candidate susceptibility genes for a locus recently associated with coronary artery disease and plasma low-density lipoprotein cholesterol levels in the process.

Citation

Mapping the genetic architecture of gene expression in human liver. Schadt EE, Molony C, Chudin E, Hao K, Yang X, Lum PY, Kasarskis A, Zhang B, Wang S, Suver C, Zhu J, Millstein J, Sieberts S, Lamb J, GuhaThakurta D, Derry J, Storey JD, Avila-Campillo I, Kruger MJ, Johnson JM, Rohl CA, van Nas A, Mehrabian M, Drake TA, Lusis AJ, Smith RC, Guengerich FP, Strom SC, Schuetz E, Rushmore TH, Ulrich R. PLoS Biol. 2008 May 6;6(5):e107.

Mouse Model of Diabetes (db-db)

Species: Mouse
Tissue: Islet,Adipose,Liver,Intestine
Disease: Diabetes
Investigator: Jake Lusis/ Merck & Co.
Institution: UCLA/ Merck & Co.
Approximate Number Subjects: 500

Abstract:

The classic model of diabetes in mouse is the C57BLKS db/db strain. While C57BL/6 db/db animals show hyperinsulinemia with age, they do not progress to frank diabetes, although they are somewhat hyperglycemic. By contrast, C57BLKS db/db animals are insulin resistant at a very young age and progress to beta cell failure and hyperglycemia over a few months. At 12 weeks, these mice show the beginnings of diabetic complications including nephropathy, neuropathy and cardiomyopathy. The C57BLKS mouse is genetically identical to C57BL/6 over 70% of its genome. Most of the remainder comes from DBA (with some small residual from an unknown strain). Thus, the diabetic susceptibility of C57BLKS most likely originates from the DBA regions of its genome. Further, there are several indications that DBA is more susceptible to diabetes than C57BLKS. For instance compared with C57BLKS and DBA, C57BL/6 islets show strongest glucose-stimulated insulin release and beta-cell proliferation. Similarly, C57BL/6 has the least glucose induced beta-cell apoptosis. Between C57BLKS and DBA, the DBA strain is consistently the most diabetes susceptible by these measures. It is surmised that the DBA strain carries additional diabetes susceptibility loci beyond those carried in the 20% of DBA present in C57BLKS. It is anticipated that the BXD db/db cross will give mapping information for the susceptibility loci in C57BLKS plus the additional loci in DBA.

Citation

This study has not yet been published.

Mouse Model of Diet Induced CVD

Species: Mouse
Tissue: Liver,Brain,Muscle,Adipose
Disease: Metabolic Disease
Investigator: Jake Lusis
Institution: UCLA
Approximate Number Subjects: 442

Abstract:

An F2 cross between C57BL/6J (B6) and Castaneus (CAST) mice. All mice were maintained on a 12 h light/12 h dark cycle and fed ad libitum. Mice were fed Purina Chow containing 4% fat until 10 wk of age, and then fed western diet (Teklad 88137, Harlan Teklad) containing 42% fat and 0.15% cholesterol for the subsequent 8 wk. The cross was designed to identify the genes and networks that regulate metabolic traits including body composition, lipids, glucose, and bone traits.

Citation

This study has not yet been published.

Mouse Model of Naive Airway Hyperresponsiveness

Species: Mouse
Tissue: Lung
Disease: Respiratory Disease
Investigator: David Beier
Institution: Harvard Medical School Brigham and Women's Hospital
Approximate Number Subjects: 200

Abstract:

This backcross is designed to uncover the genetic drivers of naive airway hyperresponsiveness (AHR) in the mouse. The cohort comprises a set of ~200 backcross mice from C57BL/6 and A/J inbred lines that have been phenotyped for methacholine-induced AHR. Genotypes and gene expression for lung are available.

Citation

This study is not yet published.

Mouse Model of Skin Cancer

Species: Mouse
Tissue: Skin
Disease: Cancer
Investigator: Allan Balmain
Institution: UCSF
Approximate Number Subjects: 71

Abstract:

Germline polymorphisms in model organisms and humans influence susceptibility to complex trait diseases such as inflammation and cancer. Mice of the Mus spretus species are resistant to tumour development, and crosses between M. spretus and susceptible Mus musculus strains have been used to map locations of genetic variants that contribute to skin cancer susceptibility. We have integrated germline polymorphisms with gene expression in normal skin from a M. musculus x M. spretus backcross to generate a network view of the gene expression architecture of mouse skin. Here we demonstrate how this approach identifies expression motifs that contribute to tissue organization and biological functions related to inflammation, haematopoiesis, cell cycle control and tumour susceptibility. Motifs associated with inflammation, epidermal barrier function and proliferation are differentially regulated in backcross mice susceptible or resistant to tumour development. The intestinal stem cell marker Lgr5 is identified as a candidate master regulator of the hair follicle, and the vitamin D receptor (Vdr) is linked to coordinated control of epidermal barrier function, inflammation and tumour susceptibility.

Citation

Genetic architecture of mouse skin inflammation and tumour susceptibility. Quigley DA, To MD, Pérez-Losada J, Pelorosso FG, Mao JH, Nagase H, Ginzinger DG, Balmain A. Nature. 2009 Mar 26;458(7237):505-8.

Myeloma Nonresponders Cohort

Species: Human
Tissue: Blood
Disease: Cancer
Investigator: Stephen Friend
Institution: Sage Bionetworks
Approximate Number Subjects: 300

Abstract:

Clinical outcomes with responder/non-responder status for standard of care therapies. Partial retrospective, partial prospective. Patient-driven. Minimum full exome sequencing.

Citation

This study has not yet been published.

NCI60 Drug Response NCI

Species: Human
Tissue: Cell Line
Disease: Cancer
Investigator: Weinstein
Institution: NCI
Approximate Number Subjects: 60

Abstract:

Chromosome rearrangement, a hallmark of cancer, has profound effects on carcinogenesis and tumor phenotype. We used a panel of 60 human cancer cell lines (the NCI-60) as a model system to identify relationships among DNA copy number, mRNA expression level, and drug sensitivity. For each of 64 cancer-relevant genes, we calculated all 4,096 possible Pearson's correlation coefficients relating DNA copy number (assessed by comparative genomic hybridization using bacterial artificial chromosome microarrays) and mRNA expression level (determined using both cDNA and Affymetrix oligonucleotide microarrays). The analysis identified an association of ERBB2 overexpression with 3p copy number, a finding supported by data from human tumors and a mouse model of ERBB2-induced carcinogenesis. When we examined the correlation between DNA copy number for all 353 unique loci on the bacterial artificial chromosome microarray and drug sensitivity for 118 drugs with putatively known mechanisms of action, we found a striking negative correlation (-0.983; 95% bootstrap confidence interval, -0.999 to -0.899) between activity of the enzyme drug L-asparaginase and DNA copy number of genes near asparagine synthetase in the ovarian cancer cells. Previous analysis of drug sensitivity and mRNA expression had suggested an inverse relationship between mRNA levels of asparagine synthetase and L-asparaginase sensitivity in the NCI-60. The concordance of pharmacogenomic findings at the DNA and mRNA levels strongly suggests further study of L-asparaginase for possible treatment of a low-synthetase subset of clinical ovarian cancers. The DNA copy number database presented here will enable other investigators to explore DNA transcript-drug relationships in their own domains of research focus.
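
The figure of 4,096 coefficients in the abstract above follows from pairing each of the 64 genes' copy-number profiles with each of the 64 genes' expression profiles (64 x 64). The sketch below is only an illustration of that arithmetic, not the authors' pipeline: the data are synthetic placeholders, and the function names (pearson, cross_correlations) and the 60-cell-line dimension used for the random profiles are assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cross_correlations(copy_number, expression):
    """Correlate every copy-number profile with every expression profile."""
    return [[pearson(cn, ex) for ex in expression] for cn in copy_number]


# Synthetic placeholder data (NOT real NCI-60 values):
# one profile per gene, one value per cell line.
rng = random.Random(0)
n_genes, n_lines = 64, 60
copy_number = [[rng.gauss(0, 1) for _ in range(n_lines)] for _ in range(n_genes)]
expression = [[cn + rng.gauss(0, 1) for cn in row] for row in copy_number]

matrix = cross_correlations(copy_number, expression)
n_coefficients = sum(len(row) for row in matrix)
print(n_coefficients)  # 64 * 64 = 4096
```

The diagonal of the matrix holds the same-gene correlations (copy number versus expression of that gene); the off-diagonal entries are the cross-gene pairs.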

Citation

Integrating data on DNA copy number with gene expression levels and drug sensitivities in the NCI-60 cell line panel. Bussey KJ, Chin K, Lababidi S, Reimers M, Reinhold WC, Kuo WL, Gwadry F, Ajay, Kouros-Mehr H, Fridlyand J, Jain A, Collins C, Nishizuka S, Tonon G, Roschke A, Gehlhaus K, Kirsch I, Scudiero DA, Gray JW, Weinstein JN. Mol Cancer Ther. 2006 Apr;5(4):853-67.

NCI60 Harvard

Species: Human
Tissue: Cell Line
Disease: Cancer
Investigator: William Sellers
Institution: Harvard
Approximate Number Subjects: 58

Abstract:

Systematic analyses of cancer genomes promise to unveil patterns of genetic alterations linked to the genesis and spread of human cancers. High-density single-nucleotide polymorphism (SNP) arrays enable detailed and genome-wide identification of both loss-of-heterozygosity events and copy-number alterations in cancer. Here, by integrating SNP array-based genetic maps with gene expression signatures derived from NCI60 cell lines, we identified the melanocyte master regulator MITF (microphthalmia-associated transcription factor) as the target of a novel melanoma amplification. We found that MITF amplification was more prevalent in metastatic disease and correlated with decreased overall patient survival. BRAF mutation and p16 inactivation accompanied MITF amplification in melanoma cell lines. Ectopic MITF expression in conjunction with the BRAF(V600E) mutant transformed primary human melanocytes, and thus MITF can function as a melanoma oncogene. Reduction of MITF activity sensitizes melanoma cells to chemotherapeutic agents. Targeting MITF in combination with BRAF or cyclin-dependent kinase inhibitors may offer a rational therapeutic avenue into melanoma, a highly chemotherapy-resistant neoplasm. Together, these data suggest that MITF represents a distinct class of 'lineage survival' or 'lineage addiction' oncogenes required for both tissue-specific cancer development and tumour progression.

Citation

Integrative genomic analyses identify MITF as a lineage survival oncogene amplified in malignant melanoma. Garraway LA, Widlund HR, Rubin MA, Getz G, Berger AJ, Ramaswamy S, Beroukhim R, Milner DA, Granter SR, Du J, Lee C, Wagner SN, Li C, Golub TR, Rimm DL, Meyerson ML, Fisher DE, Sellers WR. Nature. 2005 Jul 7;436(7047):117-22.

Nonsmokers Lung Cancer Cohort

Species: Human
Tissue: Lung
Disease: Cancer
Investigator: Ping Yang
Institution: Mayo Clinic
Approximate Number Subjects: 70

Abstract:

Lung cancer in individuals who have never smoked tobacco products is an increasing medical and public-health issue. We aimed to unravel the genetic basis of lung cancer in never smokers. We did a four-stage investigation. First, a genome-wide association study of single nucleotide polymorphisms (SNPs) was done with 754 never smokers (377 matched case-control pairs at Mayo Clinic, Rochester, MN, USA). Second, the top candidate SNPs from the first study were validated in two independent studies among 735 (MD Anderson Cancer Center, Houston, TX, USA) and 253 (Harvard University, Boston, MA, USA) never smokers. Third, further replication of the top SNP was done in 530 never smokers (UCLA, Los Angeles, CA, USA). Fourth, expression quantitative trait loci (eQTL) and gene-expression differences were analysed to further elucidate the causal relation between the validated SNPs and the risk of lung cancer in never smokers. 44 top candidate SNPs were identified that might alter the risk of lung cancer in never smokers. rs2352028 at chromosome 13q31.3 was subsequently replicated with an additive genetic model in the four independent studies, with a combined odds ratio of 1.46 (95% CI 1.26-1.70, p=5.94x10(-6)). A cis eQTL analysis showed there was a strong correlation between genotypes of the replicated SNPs and the transcription level of the gene GPC5 in normal lung tissues (p=1.96x10(-4)), with the high-risk allele linked with lower expression. Additionally, the transcription level of GPC5 in normal lung tissue was twice that detected in matched lung adenocarcinoma tissue (p=6.75x10(-11)). Genetic variants at 13q31.3 alter the expression of GPC5, and are associated with susceptibility to lung cancer in never smokers. Downregulation of GPC5 might contribute to the development of lung cancer in never smokers.

Citation

Genetic variants and risk of lung cancer in never smokers: a genome-wide association study. Li Y, Sheu CC, Ye Y, de Andrade M, Wang L, Chang SC, Aubry MC, Aakre JA, Allen MS, Chen F, Cunningham JM, Deschamps C, Jiang R, Lin J, Marks RS, Pankratz VS, Su L, Li Y, Sun Z, Tang H, Vasmatzis G, Harris CC, Spitz MR, Jen J, Wang R, Zhang ZF, Christiani DC, Wu X, Yang P. Lancet Oncol. 2010 Apr;11(4):321-30.

Oral Cancer ICGC

Species: Human
Tissue: Skin
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. For more information, go to: www.icgc.org

Citation

This study has not yet been published.

Ovarian Cancer ICGC

Species: Human
Tissue: Ovary
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. For more information, go to: www.icgc.org

Citation

This study has not yet been published.

Ovarian Cancer Nonresponders Cohort

Species: Human
Tissue: Ovary
Disease: Cancer
Investigator: Stephen Friend
Institution: Sage Bionetworks
Approximate Number Subjects: 300

Abstract:

Clinical outcomes with responder/non-responder status for standard-of-care therapies. Partly retrospective, partly prospective. Patient-driven. Minimum of full exome sequencing.

Citation

This study has not yet been published.

Paired Breast Ovary Cancer

Species: Human
Tissue: Breast,Ovary
Disease: Cancer
Investigator: Jean-Philippe Meyniel
Institution: Institut Curie
Approximate Number Subjects: 16

Abstract:

The distinction between primary and secondary ovarian tumors may be challenging for pathologists. The purpose of the present work was to develop genomic and transcriptomic tools to further refine the pathological diagnosis of ovarian tumors after a previous history of breast cancer. Sixteen paired breast-ovary tumors from patients with a former diagnosis of breast cancer were collected. The genomic profiles of paired tumors were analyzed using the Affymetrix GeneChip Mapping 50 K Xba Array or Genome-Wide Human SNP Array 6.0 (for one pair), and the data were normalized with ITALICS (ITerative and Alternative normaLIzation and Copy number calling for affymetrix Snp arrays) algorithm or Partek Genomic Suite, respectively. The transcriptome of paired samples was analyzed using Affymetrix GeneChip Human Genome U133 Plus 2.0 Arrays, and the data were normalized with gc-Robust Multi-array Average (gcRMA) algorithm. A hierarchical clustering of these samples was performed, combined with a dataset of well-identified primary and secondary ovarian tumors. In 12 of the 16 paired tumors analyzed, the comparison of genomic profiles confirmed the pathological diagnosis of primary ovarian tumor (n = 5) or metastasis of breast cancer (n = 7). Among four cases with uncertain pathological diagnosis, genomic profiles were clearly distinct between the ovarian and breast tumors in two pairs, thus indicating primary ovarian carcinomas, and showed common patterns in the two others, indicating metastases from breast cancer. In all pairs, the result of the transcriptomic analysis was concordant with that of the genomic analysis. In patients with ovarian carcinoma and a previous history of breast cancer, SNP array analysis can be used to distinguish primary and secondary ovarian tumors. Transcriptomic analysis may be used when primary breast tissue specimen is not available.

Citation

A genomic and transcriptomic approach for a differential diagnosis between primary and secondary ovarian carcinomas in patients with a previous history of breast cancer. Meyniel JP, Cottu PH, Decraene C, Stern MH, Couturier J, Lebigot I, Nicolas A, Weber N, Fourchotte V, Alran S, Rapinat A, Gentien D, Roman-Roman S, Mignot L, Sastre-Garau X. BMC Cancer. 2010 May 21;10:222.

Pancreatic Ductal Adenocarcinoma ICGC

Species: Human
Tissue: Pancreas
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. www.icgc.org

Citation

This study has not yet been published.

Pediatric ALL TARGET

Species: Human
Tissue: Blood
Disease: Cancer
Investigator: Hunger/Willman/Downing/Relling/Mullighan
Institution: Colorado/UNM/St Jude
Approximate Number Subjects: 255

Abstract:

http://target.cancer.gov/dataportal/about/

Citation

http://target.cancer.gov/

Pediatric Brain Cancer ICGC

Species: Human
Tissue: Brain
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. www.icgc.org

Citation

This study has not yet been published.

Pediatric Neuroblastoma TARGET

Species: Human
Tissue: Brain
Disease: Cancer
Investigator: Maris/Seeger/Khan
Institution: CHOP/Childrens-LA/NCI
Approximate Number Subjects: 455

Abstract:

http://target.cancer.gov/dataportal/about/

Citation

http://target.cancer.gov/

Pediatric Osteosarcoma TARGET

Species: Human
Tissue: Bone
Disease: Cancer
Investigator: Boris Lau
Institution: Baylor University
Approximate Number Subjects: 100

Abstract:

http://target.cancer.gov/dataportal/about/

Citation

http://target.cancer.gov/

Pediatric Wilms Tumor TARGET

Species: Human
Tissue: Kidney
Disease: Cancer
Investigator: Perlman
Institution: Northwestern University
Approximate Number Subjects: 100

Abstract:

http://target.cancer.gov/dataportal/about/

Citation

http://target.cancer.gov/

Pima Indian Metabolic Disease

Species: Human
Tissue: Muscle,Adipose
Disease: Metabolic Disease
Investigator: Clifton Bogardus
Institution: NIDDK, National Institutes of Health
Approximate Number Subjects: 225

Abstract:

The Pima Indians have an extraordinarily high prevalence of type II diabetes and have been the subject of longitudinal genetic studies by NIDDK. This dataset comprises 225 skeletal muscle and adipose samples profiled on an exon array with accompanying genotypes (~1 million SNPs). Longitudinal clinical outcomes include a variety of tests to measure body composition, oral and intravenous glucose tolerance, measures of insulin sensitivity using the hyperinsulinemic, euglycemic clamp, measures of insulin secretory function, as well as measures of 24-hour energy expenditure and substrate oxidation rates in a human calorimeter.

Citation

This study has not yet been published.

Primary Endothelial Cell Response to Oxidized Phospholipids

Species: Human
Tissue: Cell Line
Disease: CVD
Investigator: Jake Lusis/Romanoski
Institution: UCLA
Approximate Number Subjects: 96

Abstract:

Gene by environment (GxE) interactions are clearly important in many human diseases, but they have proven to be difficult to study on a molecular level. We report genetic analysis of thousands of transcript abundance traits in human primary endothelial cell (EC) lines in response to proinflammatory oxidized phospholipids implicated in cardiovascular disease. Of the 59 most regulated transcripts, approximately one-third showed evidence of GxE interactions. The interactions resulted primarily from effects of distal-, trans-acting loci, but a striking example of a local-GxE interaction was also observed for FGD6. Some of the distal interactions were validated by siRNA knockdown experiments, including a locus involved in the regulation of multiple transcripts involved in the ER stress pathway. Our findings add to the understanding of the overall architecture of complex human traits and are consistent with the possibility that GxE interactions are responsible, in part, for the failure of association studies to more fully explain common disease variation.

Citation

Systems genetics analysis of gene-by-environment interactions in human cells. Romanoski CE, Lee S, Kim MJ, Ingram-Drake L, Plaisier CL, Yordanova R, Tilford C, Guan B, He A, Gargalovic PS, Kirchgessner TG, Berliner JA, Lusis AJ. Am J Hum Genet. 2010 Mar 12;86(3):399-410.

Primary Osteoblast Cohort

Species: Human
Tissue: Cell Line
Disease: Healthy
Investigator: Pastinen/Grundberg
Institution: McGill
Approximate Number Subjects: 95

Abstract:

The common genetic variants associated with complex traits typically lie in noncoding DNA and may alter gene regulation in a cell type-specific manner. Consequently, the choice of tissue or cell model in the dissection of disease associations is important. We carried out an expression quantitative trait loci (eQTL) study of primary human osteoblasts (HOb) derived from 95 unrelated donors of Swedish origin, each represented by two independently derived primary lines to provide biological replication. We combined our data with publicly available information from a genome-wide association study (GWAS) of bone mineral density (BMD). The top 2000 BMD-associated SNPs (P < approximately 10(-3)) were tested for cis-association of gene expression in HObs and in lymphoblastoid cell lines (LCLs) using publicly available data and showed that HObs have a significantly greater enrichment (threefold) of converging cis-eQTLs as compared to LCLs. The top 10 BMD loci with SNPs showing strong cis-effects on gene expression in HObs (P = 6 x 10(-10) - 7 x 10(-16)) were selected for further validation using a staged design in two cohorts of Caucasian male subjects. All 10 variants were tested in the Swedish MrOS Cohort (n = 3014), providing evidence for two novel BMD loci (SRR and MSH3). These variants were then tested in the Rotterdam Study (n = 2090), yielding converging evidence for BMD association at the 17p13.3 SRR locus (P(combined) = 5.6 x 10(-5)). The cis-regulatory effect was further fine-mapped to the proximal promoter of the SRR gene (rs3744270, r(2) = 0.5, P = 2.6 x 10(-15)). Our results suggest that primary cells relevant to disease phenotypes complement traditional approaches for prioritization and validation of GWAS hits for follow-up studies.

Citation

Population genomics in a disease targeted primary cell model. Grundberg E, Kwan T, Ge B, Lam KC, Koka V, Kindmark A, Mallmin H, Dias J, Verlaan DJ, Ouimet M, Sinnett D, Rivadeneira F, Estrada K, Hofman A, van Meurs JM, Uitterlinden A, Beaulieu P, Graziani A, Harmsen E, Ljunggren O, Ohlsson C, Mellström D, Karlsson MK, Nilsson O, Pastinen T. Genome Res. 2009 Nov;19(11):1942-52.

Prostate Cancer FHCRC

Species: Human
Tissue: Prostate
Disease: Cancer
Investigator: Barbara Trask
Institution: Fred Hutchinson Cancer Research Center
Approximate Number Subjects: 54

Abstract:

Androgen deprivation is the mainstay of therapy for progressive prostate cancer. Despite initial and dramatic tumor inhibition, most men eventually fail therapy and die of metastatic castration-resistant (CR) disease. Here, we characterize the profound degree of genomic alteration found in CR tumors using array comparative genomic hybridization (array CGH), gene expression arrays, and fluorescence in situ hybridization (FISH). By cluster analysis, we show that the similarity of the genomic profiles from primary and metastatic tumors is driven by the patient. Using data adjusted for this similarity, we identify numerous high-frequency alterations in the CR tumors, such as 8p loss and chromosome 7 and 8q gain. By integrating array CGH and expression array data, we reveal genes whose correlated values suggest they are relevant to prostate cancer biology. We find alterations that are significantly associated with the metastases of specific organ sites, and others with CR tumors versus the tumors of patients with localized prostate cancer not treated with androgen deprivation. Within the high-frequency sites of loss in CR metastases, we find an overrepresentation of genes involved in cellular lipid metabolism, including PTEN. Finally, using FISH, we verify the presence of a gene fusion between TMPRSS2 and ERG suggested by chromosome 21 deletions detected by array CGH. We find the fusion in 54% of our CR tumors, and 81% of the fusion-positive tumors contain cells with multiple copies of the fusion. Our investigation lays the foundation for a better understanding of and possible therapeutic targets for CR disease, the poorly responsive and final stage of prostate cancer.

Citation

Comparative analyses of chromosome alterations in soft-tissue metastases within and across patients with castration-resistant prostate cancer. Holcomb IN, Young JM, Coleman IM, Salari K, Grove DI, Hsu L, True LD, Roudier MP, Morrissey CM, Higano CS, Nelson PS, Vessella RL, Trask BJ. Cancer Res. 2009 Oct 1;69(19):7793-802.

Prostate Cancer ICGC

Species: Human
Tissue: Prostate
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. www.icgc.org

Citation

This study has not yet been published.

Rat Model of Diabetes

Species: Rat
Tissue: Kidney,Heart,Liver,Muscle,Adrenal,Aorta,Adipose
Disease: Diabetes
Investigator: Stuart Cook
Institution: Imperial College London
Approximate Number Subjects: 29

Abstract:

Combined analyses of gene networks and DNA sequence variation can provide new insights into the aetiology of common diseases that may not be apparent from genome-wide association studies alone. Recent advances in rat genomics are facilitating systems-genetics approaches. Here we report the use of integrated genome-wide approaches across seven rat tissues to identify gene networks and the loci underlying their regulation. We defined an interferon regulatory factor 7 (IRF7)-driven inflammatory network (IDIN) enriched for viral response genes, which represents a molecular biomarker for macrophages and which was regulated in multiple tissues by a locus on rat chromosome 15q25. We show that Epstein-Barr virus induced gene 2 (Ebi2, also known as Gpr183), which lies at this locus and controls B lymphocyte migration, is expressed in macrophages and regulates the IDIN. The human orthologous locus on chromosome 13q32 controlled the human equivalent of the IDIN, which was conserved in monocytes. IDIN genes were more likely to associate with susceptibility to type 1 diabetes (T1D), a macrophage-associated autoimmune disease, than randomly selected immune response genes (P = 8.85 x 10(-6)). The human locus controlling the IDIN was associated with the risk of T1D at single nucleotide polymorphism rs9585056 (P = 7.0 x 10(-10); odds ratio, 1.15), which was one of five single nucleotide polymorphisms in this region associated with EBI2 (GPR183) expression. These data implicate IRF7 network genes and their regulatory locus in the pathogenesis of T1D.

Citation

A trans-acting locus regulates an anti-viral expression network and type 1 diabetes risk. Heinig M, Petretto E, Wallace C, Bottolo L, Rotival M, Lu H, Li Y, Sarwar R, Langley SR, Bauerfeind A, Hummel O, Lee YA, Paskas S, Rintisch C, Saar K, Cooper J, Buchan R, Gray EE, Cyster JG; Cardiogenics Consortium, Erdmann J, Hengstenberg C, Maouche S, Ouwehand WH, Rice CM, Samani NJ, Schunkert H, Goodall AH, Schulz H, Roider HG, Vingron M, Blankenberg S, Münzel T, Zeller T, Szymczak S, Ziegler A, Tiret L, Smyth DJ, Pravenec M, Aitman TJ, Cambien F, Clayton D, Todd JA, Hubner N, Cook SA. Nature. 2010 Sep 23;467(7314):460-4.

Rectum Adenocarcinoma TCGA

Species: Human
Tissue: Skin
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 500

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 rectum adenocarcinoma tumor samples. For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

This study has not yet been published. For more information, go to: http://tcga.cancer.gov

Renal Cell Carcinoma ICGC

Species: Human
Tissue: Kidney
Disease: Cancer
Investigator: ICGC
Institution: ICGC
Approximate Number Subjects: 500

Abstract:

The ICGC cohorts consist of prospectively collected matched tumor and adjacent normal tissue from 500 patients with a particular cancer diagnosis. Data include full-genome DNA sequence, DNA methylation, mRNA expression profiling, and miRNA. Clinical outcomes are available. www.icgc.org

Citation

This study has not yet been published.

Renal Clear Cell Carcinoma TCGA

Species: Human
Tissue: Kidney
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 500

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 renal clear cell carcinoma tumor samples. For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

This study has not yet been published. For more information, go to: http://tcga.cancer.gov

Renal Papillary Cell Carcinoma TCGA

Species: Human
Tissue: Kidney
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 500

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 renal papillary cell carcinoma tumor samples. For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

This study has not yet been published. For more information, go to: http://tcga.cancer.gov

San Antonio Family Heart Study PBMC

Species: Human
Tissue: Blood
Disease: CVD
Investigator: Goring
Institution: University of Texas Southwestern
Approximate Number Subjects: 1240

Abstract:

Quantitative differences in gene expression are thought to contribute to phenotypic differences between individuals. We generated genome-wide transcriptional profiles of lymphocyte samples from 1,240 participants in the San Antonio Family Heart Study. The expression levels of 85% of the 19,648 detected autosomal transcripts were significantly heritable. Linkage analysis uncovered >1,000 cis-regulated transcripts at a false discovery rate of 5% and showed that the expression quantitative trait loci with the most significant linkage evidence are often located at the structural locus of a given transcript. To highlight the usefulness of this much-enlarged map of cis-regulated transcripts for the discovery of genes that influence complex traits in humans, as an example we selected high-density lipoprotein cholesterol concentration as a phenotype of clinical importance, and identified the cis-regulated vanin 1 (VNN1) gene as harboring sequence variants that influence high-density lipoprotein cholesterol concentrations.

Citation

Discovery of expression QTLs using large-scale transcriptional profiling in human lymphocytes. Göring HH, Curran JE, Johnson MP, Dyer TD, Charlesworth J, Cole SA, Jowett JB, Abraham LJ, Rainwater DL, Comuzzie AG, Mahaney MC, Almasy L, MacCluer JW, Kissebah AH, Collier GR, Moses EK, Blangero J. Nat Genet. 2007 Oct;39(10):1208-16.

Stockholm Atherosclerosis Gene Expression Study

Species: Human
Tissue: Blood,Macrophage,Liver,Carotid,Aorta,Adipose,Muscle,Plaque
Disease: CVD
Investigator: Johan Bjorkegren
Institution: Clinical Gene Network
Approximate Number Subjects: 100

Abstract:

Environmental exposures filtered through the genetic make-up of each individual alter the transcriptional repertoire in organs central to metabolic homeostasis, thereby affecting arterial lipid accumulation, inflammation, and the development of coronary artery disease (CAD). The primary aim of the Stockholm Atherosclerosis Gene Expression (STAGE) study was to determine whether there are functionally associated genes (rather than individual genes) important for CAD development. To this end, two-way clustering was used on 278 transcriptional profiles of liver, skeletal muscle, and visceral fat (n = 66/tissue) and atherosclerotic and unaffected arterial wall (n = 40/tissue) isolated from CAD patients during coronary artery bypass surgery. The first step, across all mRNA signals (n = 15,042/12,621 RefSeqs/genes) in each tissue, resulted in a total of 60 tissue clusters (n = 3958 genes). In the second step (performed within tissue clusters), one atherosclerotic lesion (n = 49/48) and one visceral fat (n = 59) cluster segregated the patients into two groups that differed in the extent of coronary stenosis (P = 0.008 and P = 0.00015). The associations of these clusters with coronary atherosclerosis were validated by analyzing carotid atherosclerosis expression profiles. Remarkably, in one cluster (n = 55/54) relating to carotid stenosis (P = 0.04), 27 genes in the two clusters relating to coronary stenosis were confirmed (n = 16/17, P<10(-27 and-30)). Genes in the transendothelial migration of leukocytes (TEML) pathway were overrepresented in all three clusters, referred to as the atherosclerosis module (A-module). In a second validation step, using three independent cohorts, the A-module was found to be genetically enriched with CAD risk by 1.8-fold (P<0.004). 
The transcription co-factor LIM domain binding 2 (LDB2) was identified as a potential high-hierarchy regulator of the A-module, a notion supported by subnetwork analysis, by cellular and lesion expression of LDB2, and by the expression of 13 TEML genes in Ldb2-deficient arterial wall. Thus, the A-module appears to be important for atherosclerosis development and, together with LDB2, merits further attention in CAD research.

Citation

Multi-organ expression profiling uncovers a gene module in coronary artery disease involving transendothelial migration of leukocytes and LIM domain binding 2: the Stockholm Atherosclerosis Gene Expression (STAGE) study. Hägg S, Skogsberg J, Lundström J, Noori P, Nilsson R, Zhong H, Maleki S, Shang MM, Brinne B, Bradshaw M, Bajic VB, Samnegård A, Silveira A, Kaplan LM, Gigante B, Leander K, de Faire U, Rosfors S, Lockowandt U, Liska J, Konrad P, Takolander R, Franco-Cereceda A, Schadt EE, Ivert T, Hamsten A, Tegnér J, Björkegren J. PLoS Genet. 2009 Dec;5(12):e1000754.

Stomach Adenocarcinoma TCGA

Species: Human
Tissue: Stomach
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 500

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 stomach adenocarcinoma tumor samples. For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

This study has not yet been published. For more information, go to: http://tcga.cancer.gov

The Connectivity Map

Species: Human
Tissue: Cell Line
Disease: Cancer
Investigator: Todd Golub
Institution: Broad Institute
Approximate Number Subjects: 5

Abstract:

To pursue a systematic approach to the discovery of functional connections among diseases, genetic perturbation, and drug action, we have created the first installment of a reference collection of gene-expression profiles from cultured human cells treated with bioactive small molecules, together with pattern-matching software to mine these data. We demonstrate that this "Connectivity Map" resource can be used to find connections among small molecules sharing a mechanism of action, chemicals and physiological processes, and diseases and drugs. These results indicate the feasibility of the approach and suggest the value of a large-scale community Connectivity Map project.

Citation

The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet JP, Subramanian A, Ross KN, Reich M, Hieronymus H, Wei G, Armstrong SA, Haggarty SJ, Clemons PA, Wei R, Carr SA, Lander ES, Golub TR. Science. 2006 Sep 29;313(5795):1929-35.

Uterine Corpus Endometrioid Carcinoma TCGA

Species: Human
Tissue: Uterus
Disease: Cancer
Investigator: TCGA
Institution: TCGA
Approximate Number Subjects: 500

Abstract:

The Cancer Genome Atlas is generating multiple levels of genomic information on a panel of 500 uterine corpus endometrioid carcinoma tumor samples. For more information, go to: http://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

Citation

This study has not yet been published. For more information, go to: http://tcga.cancer.gov
