GeWare: A data warehouse for gene expression analysis

Like dokumenter
Sascha Schubert Product Manager Data Mining SAS International Copyright 2006, SAS Institute Inc. All rights reserved.

Er du nysgjerrig på om det er mulig...

UNIVERSITETET I OSLO

Den europeiske byggenæringen blir digital. hva skjer i Europa? Steen Sunesen Oslo,

HARP-Hybrid Ad Hoc Routing Protocol

Grunnlag: 11 år med erfaring og tilbakemeldinger

Andrew Gendreau, Olga Rosenbaum, Anthony Taylor, Kenneth Wong, Karl Dusen

OPPA European Social Fund Prague & EU: We invest in your future.

OPPA European Social Fund Prague & EU: We invest in your future.

By Bioforsk RECOCA Team Per Stålnacke Csilla Farkas Johannes Deelstra

Multimedia in Teacher Training (and Education)

A Study of Industrial, Component-Based Development, Ericsson

OECD GUIDELINE FOR THE TESTING OF CHEMICALS

Presenting a short overview of research and teaching

Server-Side Eclipse. Bernd Kolb Martin Lippert it-agile GmbH

buildingsmart Norge seminar Gardermoen 2. september 2010 IFD sett i sammenheng med BIM og varedata

XCube XML for Data Warehouses

Invitation to Tender FSP FLO-IKT /2013/001 MILS OS

Server-Side Eclipse. Martin Lippert akquinet agile GmbH

Exploratory Analysis of a Large Collection of Time-Series Using Automatic Smoothing Techniques

Presenting a short overview of research and teaching

Public roadmap for information management, governance and exchange SINTEF

SRP s 4th Nordic Awards Methodology 2018

Digitization of archaeology is it worth while?

Gir vi IKT-kandidatene egnet kompetanse for fremtiden? Morten Dæhlen Dekan

Energihandel og miljø 2020 Resultater fra delprosjekt III. Strömstad november 2012 Stig Ødegaard Ottesen

MID-TERM EXAM TDT4258 MICROCONTROLLER SYSTEM DESIGN. Wednesday 3 th Mars Time:

FMEM: A Fine- grained Memory Estimator for MapReduce Jobs

Unit Relational Algebra 1 1. Relational Algebra 1. Unit 3.3

Citation and reference tools for your master thesis

Emergency Management Training

Understanding Social and Environmental Conflicts in Mining Exploration Simexmin May 18, 2016 Alan Dabbs Social Capital Group. Tia Maria Conflict

ISO standardisering for leveranser av informasjon. BIM => En måte å tenke på. TEMA - Informasjon

Verktøy for å håndtere siteringer og referanser i masteroppgaven. Citation and reference tools for your master thesis. Citations and references

Endringsdyktige og troverdige systemer

EFFEKTIVE KUNDEPROSESSER

RF Power Capacitors Class kV Discs with Moisture Protection

Software applications developed for the maritime service at the Danish Meteorological Institute

RF Power Capacitors Class1. 5kV Discs

CSR Harvesting Final Meeting September, 2015 Brest, France. Anne Che-Bohnenstengel & Matthias Pramme, BSH

// PRESENTASJONER FRA NJAVA

Social Media Insight

Citation and reference tools for your master thesis

SAS FANS NYTT & NYTTIG FRA VERKTØYKASSA TIL SAS 4. MARS 2014, MIKKEL SØRHEIM

Bruk av HP Quality Center med smidige utviklingsmetoder. HP Sofware Norge

TriCOM XL / L. Energy. Endurance. Performance.

Databases 1. Extended Relational Algebra

Citation and reference tools for your master thesis

Pålitelighet - Vi tar ansvar, holder det vi lover og opptrer troverdig i alle sammenhenger Engasjement - Vi er positive, løsningsorientert, ønsker å

Harmonisation of terminological resources future perspectives

SMART MANAGER I STATKRAFT ALL INFORMASJON FOR LINJELEDER I ETT DASHBOARD.

Utfordring til KIS-leverandører håndtering av

Introduction to DK- CERT Vulnerability Database

Trianguleringer og anvendelser

Øystein Haugen, Professor, Computer Science MASTER THESES Professor Øystein Haugen, room D

ESTABLISHING A EUROPEAN HIGH POWER CHARGING NETWORK JAN HAUGEN IHLE, REGIONSDIREKTØR NORTHERN EUROPE, IONITY. IONITY Präsentation October 2018

Satellite Stereo Imagery. Synthetic Aperture Radar. Johnson et al., Geosphere (2014)

L+T Data Processing and its Special Features

Workshop 2: Med fokus på forvaltning av grunndata i M3

Erfaringer fra en Prosjektleder som fikk «overflow»

Capturing the value of new technology How technology Qualification supports innovation

Trond Brande

Péter Bakonyi VITUKI

RF Power Capacitors Class , 20 & 30 mm Barrel Transmitting Types

Praktisk bevaringsmetodikk - prosesser, rutiner, metoder, verktøy. v/sigve Espeland

EG-leder konferanse 2017

Neuroscience. Kristiansand

Merak Un-glazed Porcelain Wall and Floor Tiles

The Future of Academic Libraries the Road Ahead. Roy Gundersen

Company Presentation. (sfi) 2 karrieredag. 31 October Knut T. Smerud Founder, Chairman and CEO SMERUD MEDICAL RESEARCH GROUP

Status for IMOs e-navigasjon prosess. John Erik Hagen, Regiondirektør Kystverket

Interaction between GPs and hospitals: The effect of cooperation initiatives on GPs satisfaction

Quantitative Spectroscopy Chapter 3 Software

En praktisk anvendelse av ITIL rammeverket

INF5120 Modellbasert systemutvikling

Integrating Evidence into Nursing Practice Using a Standard Nursing Terminology

Hvordan bedømmer Gartner de lange linjene?

Minimumskrav bør være å etablere at samtykke ikke bare må være gitt frivillig, men også informert.

Kontor i Stockholm, Oslo og København Lang erfaring fra produkt og sw. utvikling innenfor IT og Telecom segmentet

Hvordan komme i gang med ArchiMate? Det første modelleringsspråket som gjør TOGAF Praktisk

SERK1/2 Acts as a Partner of EMS1 to Control Anther Cell Fate Determination in Arabidopsis

Implementing Bayesian random-effects meta-analysis

Neural Network. Sensors Sorter

Europeiske standarder -- CIM og ENTSO-E CGMES. Svein Harald Olsen, Statnett Fornebu, 11. september 2014

Moving Objects. We need to move our objects in 3D space.

TIDE DISTRIBUTIVE EFFECTS OF INDIRECT TAXATION:

ISO 41001:2018 «Den nye læreboka for FM» Pro-FM. Norsk tittel: Fasilitetsstyring (FM) - Ledelsessystemer - Krav og brukerveiledning

PSi Apollo. Technical Presentation

Rollen til norsk vannkraft i 2050 scenarioer for Norge som leverandør av balansekraft

NDT og Skadeanalyser Viktige brikker i optimalisert tilstandskontroll. MainTech Konferansen 2015

HONSEL process monitoring

SOME EMPIRICAL EVIDENCE ON THE DECREASING SCALE ELASTICITY

Estimating Peer Similarity using. Yuval Shavitt, Ela Weinsberg, Udi Weinsberg Tel-Aviv University

BI ON THE CHEAP. Mark Griffiths Griff & Jones Ltd.

Erfaringer og eksempler fra Nord-Europas største implementering av SAP BPC for konsolidering. Stig Skoglund Leading Consultant Statoil

P(ersonal) C(omputer) Gunnar Misund. Høgskolen i Østfold. Avdeling for Informasjonsteknologi

Call function of two parameters

Accuracy of Alternative Baseline Methods

Generalization of age-structured models in theory and practice

Ruter dialogkonferanse

Transkript:

GeWare: A data warehouse for gene expression analysis T. Kirsten, H.-H. Do, E. Rahm WG 1, IZBI, University of Leipzig www.izbi.de, dbs.uni-leipzig.de

Outline Motivation GeWare Architecture Annotation Integration Expression Analysis Conclusion and Future Work 2

Motivation Microarrays to measure expression of thousands of genes at the same time Various kinds of data with different characteristics and requirements Central data management and analysis platform Data Warehouse approach Expression data import, e.g. from Affymetrix system Fact tables to store both raw and derived data Uniform specification of experiment annotations Integration of gene annotations from public sources Integration of analysis and data mining algorithms/tools 3

System Architecture Sources Data Warehouse Data Analysis Experiments Flat Files & MicroDB RDBMS Probe and gene Intensities intensities File-based exchange Data ex-/import to/from tools Uniform web-based GUI Descriptive statistics Canned / Ad-hoc queries (Data mining, OLAP) MIAME Submission Website Manual User Input Sample & Experiment annotations Multidimensional data model Transparent integration (Data access using database API) Public Data Sources LocusLink GO RDBMS UniGene GenMapper Gene annotations Data Integration Tight integration (Direct operation on database Tool Integration 4

System Workflows Import workflow Expression analysis workflow Experiment creation / selection Browse / search in annotations Import of raw data (CEL files) Import of preprocessed data (MicroDB or others) Experiment annotation Gene groups Management of analysis objects Experiment groups Gene expression matrices Preprocessing (Normalization/aggregation) Expression analysis Visualization Internal / integrated analysis Statistics / reporting Export OLAP External analysis (Functional profiling, clustering) 5

Experiment Annotation Uniform and consistent annotation Controlled annotation vocabularies Annotation templates as collections of annotation categories for which the annotation values has to be captured Hierarchical arrangement of categories Definition of MIAME compliant templates by biologists MAGE-ML export (data exchange) 6

Experiment and Gene Groups Collections of objects with common patterns Input for reporting and further analysis Definition by User selection Search in annotation Result of reporting and advanced analysis Example: p53 tumor group 7

Preprocessing Preprocessing Generation of expression values on gene level Several predefined algorithms (Mas5, Rma, Li/Wong) P. by user defined methods for normalization, etc. 8

Expression Analysis Several statistical reports used for analysis entry and outlier detection Descriptive statistics (mean, sd, se ) Gene/experiment correlation Concentration ratios Advanced analysis Multiple Resampling Procedures: Analysis and test of gene groups Multivariate tests on probe basis 9

Visualization Different kinds of visualization of gene expression values (line, bar charts, heat maps) 10

Conclusions / Future Work GeWare Management of a high volume of expression data Flexible and uniform experiment annotation Storing experiment and gene groups, expression matrices Different kinds of analysis, export Web-Interface: http://www.izbi.de/izbi/ag1/geware.html Future work Coupling with advanced analysis/ data mining routines Visualization extension 11