Course 02402/02323 Introduction to Statistics. Lecture 13: An overview of the course contents. Peder Bacher


Course 02402/02323 Introduction to Statistics
Lecture 13: An overview of the course contents
Peder Bacher
DTU Compute, Dynamical Systems
Building 303B, Room 017
Danish Technical University, 2800 Lyngby, Denmark
e-mail: pbac@dtu.dk
7 September 2015

Overview
1 enote 1: Simple plots and descriptive statistics
2 enote 2: Discrete distributions
3 enote 2: Continuous distributions
4 enote 3: Confidence intervals for one group/sample
5 enote 3: Hypothesis tests for one group/sample
6 enote 3: Statistics for two groups/samples
7 enote 4: Statistics by simulation
8 enote 5: Simple linear regression analysis
9 enote 6: Multiple linear regression analysis
10 enote 8: One-way analysis of variance (one-way ANOVA)
11 enote 8: Two-way analysis of variance (ANOVA)
12 enote 7: Inference for proportions


enote 1: Simple plots and descriptive statistics
Techniques for looking at data! (descriptive statistics)
Summary statistics for a sample
  Sample mean (x̄)
  Sample standard deviation (s)
  Sample variance (s²)
  Quantiles and percentiles (e.g. 15% of the data lie below the 0.15 quantile)
  Median, upper and lower quartiles
  Sample correlation (r) (between two samples)
Simple plots
  Scatter plot (xy plot)
  Histogram (empirical density)
  Cumulative distribution (empirical distribution)
  Box plots, bar charts, pie charts


enote 2: Discrete distributions
Basic concepts:
  Random variable (takes its value depending on the outcome of an experiment not yet carried out)
  Density function: f(x) = P(X = x) (pdf)
  Distribution function: F(x) = P(X ≤ x) (cdf)
  Mean: µ = E(X)
  Standard deviation: σ
  Variance: σ²
Specific distributions:
  Binomial (dice rolls)
  Hypergeometric (drawing without replacement)
  Poisson (number of events in an interval)


enote 2: Continuous distributions
Basic concepts:
  Density function: f(x) (pdf)
  Distribution function: F(x) = P(X ≤ x) (cdf)
  Mean (µ) and variance (σ²)
  Calculation rules for random variables
Specific distributions:
  Normal
  Log-normal
  Uniform
  Exponential
  t
  χ² (chi-square)
  F


enote 3: Confidence intervals for one group/sample
Basic concepts:
  Population and random sample
  Estimation (e.g. µ̂ is an estimate of µ)
  Significance level α
  Confidence intervals (capture the true parameter in a proportion 1 − α of cases)
  Sampling distributions (sample mean (t) and sample variance (χ²))
  Central limit theorem
Specific methods, one group/sample:
  Confidence intervals for the mean (t-distribution) and the variance (χ² distribution)
  Design of experiments: calculate the sample size n for the desired precision


enote 3: Hypothesis tests for one group/sample
Basic concepts:
  Hypotheses (H_0 vs. H_1)
  p-value (probability of the observed value of the test statistic, or a more extreme one, if H_0 is true, e.g. P(T > t_obs))
  Type I error (no effect in reality, but H_0 is rejected): P(Type I) = α
  Type II error (an effect in reality, but H_0 is not rejected): P(Type II) = β
  The power of the test is 1 − β
Specific methods, one group:
  t-test for the mean level
  Sample size for a desired power
  Model validation with a normal q-q plot


enote 3: Statistics for two groups/samples
Specific methods, two groups:
  Test and confidence intervals for the difference in means (t-test)
  Design of experiments: calculate the sample size for the desired power
Specific methods, two PAIRED groups:
  "Take the difference for each measurement", then "statistics for one group"


enote 4: Statistics by simulation
Introduction to simulation (calculate the statistic many times)
Error propagation rules (e.g. through a non-linear function)
Bootstrapping:
  Parametric (simulate many outcomes of the random variable)
  Non-parametric (draw directly from the data)
  Confidence intervals (and therefore also hypothesis tests)
Specific setups (4 versions of confidence intervals):
  One group/sample and two groups/samples
  Parametric vs. non-parametric


enote 5: Simple linear regression analysis
Two variables: x and y
Calculate the least squares estimate of the straight line
Inference with the simple linear regression model
  Statistical model: Y_i = β_0 + β_1 x_i + ε_i
  Estimation of confidence intervals and tests for β_0 and β_1
  Confidence intervals for the line (the line lies inside 95% of the time)
  Prediction intervals for points (95% of new points lie inside)
ρ, R and R²
  ρ is the correlation (= sign(β_1) · R) and describes the degree of linear relationship between x and y
  R² is the proportion of the total variation explained by the model
  If H_0: β_1 = 0 is rejected, then H_0: ρ = 0 is also rejected


enote 6: Multiple linear regression analysis
Several variables: y, x_1, x_2, ... (y is the dependent/response variable and the x's are explanatory/independent variables)
Least squares plane (a plane since there are more than 2 dimensions)
Inference for a multiple linear regression model
  Statistical model: Y_i = β_0 + β_1 x_{1,i} + β_2 x_{2,i} + ... + ε_i
  Estimation of confidence intervals and tests for the β's
  Confidence intervals for the model (for the expected plane)
  Prediction intervals for new points
R² is the proportion of the total variation explained by the model


enote 8: One-way analysis of variance (one-way ANOVA)
k INDEPENDENT groups
Specific methods, one-way analysis of variance:
  Test comparing the means of the groups
  ANOVA table: SST = SS(Tr) + SSE
  F-test
Post hoc test(s): pairwise t-tests with pooled variance estimate
  If planned in advance, then without Bonferroni correction
  If all comparisons are carried out, then with Bonferroni correction


enote 8: Two-way analysis of variance (two-way ANOVA)
Block design gives two factors
ANOVA table: SST = SS(Tr) + SS(Bl) + SSE
F-test
  SST, SS(Tr) and SS(Bl) are calculated as in one-way ANOVA
  SSE = SST − SS(Tr) − SS(Bl)
Post hoc test(s): pairwise t-tests with pooled variance estimate
  If planned in advance, then without Bonferroni correction
  If all comparisons are carried out, then with Bonferroni correction


enote 7: Inference for proportions
Proportion: p = x/n (x successes out of n observations)
Specific methods, one, two and k > 2 groups
  Binary/categorical response
  Estimation and confidence intervals for proportions
  Methods for large samples vs. small samples
  Hypotheses for one proportion (p)
  Hypotheses for two proportions
  Analysis of contingency tables (χ²-test) (all expected counts > 5)

Overview
13 enote 1: Simple Graphics and Summary Statistics
14 enote 2: Discrete Distributions
15 enote 2: Continuous Distributions
16 enote 3: One sample confidence intervals
17 enote 3: One sample hypothesis testing
18 enote 3: Two Sample statistics
19 enote 4: Statistics by simulation
20 enote 5: Simple linear Regression Analysis
21 enote 6: Multiple linear Regression Analysis
22 enote 8: One-way Analysis of Variance
23 enote 8: Two-way Analysis of Variance
24 enote 7: Inferences for Proportions

enote 1: Simple Graphics and Summary Statistics
Look at data as it is! (descriptive statistics)
Summary statistics
  Sample mean: x̄
  Sample standard deviation: s
  Sample variance: s²
  Quantiles and percentiles (e.g. 15% of the data is below the 0.15 quantile)
  Median, upper and lower quartiles
  Sample correlation (r) (between two samples)
Simple graphics
  Scatter plot (xy plot)
  Histogram (empirical density)
  Cumulative distribution (empirical distribution)
  Boxplots, bar charts, pie charts
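The summary statistics listed above can be sketched with Python's standard `statistics` module. The sample values are hypothetical, and the quantile definition used here (`method="inclusive"`) is only one of several common conventions, so other software may give slightly different quartiles.

```python
import statistics

# Hypothetical sample; the values are made up for illustration only.
x = [168, 161, 167, 179, 184, 166, 198, 187, 191, 179]

mean = statistics.mean(x)        # sample mean, x-bar
s2 = statistics.variance(x)      # sample variance s^2 (divides by n - 1)
s = statistics.stdev(x)          # sample standard deviation s
median = statistics.median(x)    # 0.5 quantile

# Quartiles: the three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(x, n=4, method="inclusive")

print(mean, s2, s, median, q1, q3)
```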


enote 2: Discrete Distributions
General concepts:
  Random variable (gets its value depending on the outcome of an experiment not yet carried out)
  Density function: f(x) = P(X = x) (pdf)
  Distribution function: F(x) = P(X ≤ x) (cdf)
  Mean: µ = E(X)
  Standard deviation: σ
  Variance: σ²
Specific distributions:
  The binomial distribution (dice rolls)
  The hypergeometric distribution (drawing without replacement)
  The Poisson distribution (number of events in an interval)
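As a small illustration of f(x) = P(X = x) and F(x) = P(X ≤ x), the sketch below evaluates the binomial distribution directly from its formula. The setting (counting sixes in 10 rolls of a fair die) and the helper names `binom_pmf` and `binom_cdf` are assumptions chosen for the example, not taken from the course material.

```python
from math import comb

def binom_pmf(x, n, p):
    """f(x) = P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """F(x) = P(X <= x): sum the density over 0, 1, ..., x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Assumed example: number of sixes in n = 10 rolls of a fair die.
n, p = 10, 1 / 6
mu = sum(k * binom_pmf(k, n, p) for k in range(n + 1))               # E(X)
sigma2 = sum((k - mu) ** 2 * binom_pmf(k, n, p) for k in range(n + 1))  # Var(X)

print(binom_pmf(2, n, p), binom_cdf(2, n, p), mu, sigma2)
```

For the binomial distribution the computed mean and variance should agree with the closed forms np and np(1 − p).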


enote 2: Continuous Distributions
General concepts:
  Density function: f(x) (pdf)
  Distribution function: F(x) = P(X ≤ x) (cdf)
  Mean (µ) and variance (σ²)
  Calculation rules for random variables
Specific distributions:
  Normal
  Log-normal
  Uniform
  Exponential
  t
  χ² (chi-square)
  F
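A brief sketch of how the distribution function and the calculation rules can be used for a normal random variable, based on `statistics.NormalDist` from the Python standard library. The parameters µ = 10 and σ = 2 are assumed for illustration.

```python
from statistics import NormalDist

# Assumed example: X ~ N(10, 2^2).
X = NormalDist(mu=10, sigma=2)

# Interval probability from the cdf: P(a < X <= b) = F(b) - F(a).
a, b = 10 - 1.96 * 2, 10 + 1.96 * 2
prob = X.cdf(b) - X.cdf(a)

# Calculation rules for a linear transformation Y = 3X + 1:
# E(Y) = 3*mu + 1 and Var(Y) = 3^2 * sigma^2.
Y = X * 3 + 1

print(prob, Y.mean, Y.variance)
```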


enote 3: One sample confidence intervals
General concepts:
  Population and a random sample
  Estimation (e.g. µ̂ is an estimate of µ)
  Significance level α
  Confidence intervals (catch the true value a proportion 1 − α of the time)
  Sampling distributions (sample mean (t) and sample variance (χ²))
  Central Limit Theorem
Specific methods, one sample:
  Confidence intervals for the mean (t-distribution) and variance (χ² distribution)
  Design of experiments: calculating the sample size n for the wanted precision
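The sample-size calculation for a wanted precision can be sketched as follows. This version assumes a known (or guessed) standard deviation σ and uses the normal quantile in n = (z_{1−α/2} · σ / ME)², where ME is the wanted margin of error; the function name and the numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_margin(sigma, margin, alpha=0.05):
    """Smallest n so that z_{1-alpha/2} * sigma / sqrt(n) <= margin.

    Assumes sigma is known or guessed in advance (normal approximation).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / margin) ** 2)

# Assumed values: sigma = 10, wanted margin of error 2 and then 1.
n_margin_2 = sample_size_for_margin(sigma=10, margin=2)
n_margin_1 = sample_size_for_margin(sigma=10, margin=1)

print(n_margin_2, n_margin_1)
```

Halving the margin of error roughly quadruples the required sample size, since n grows with 1/ME².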


enote 3: One sample hypothesis testing
General concepts:
  Hypotheses (H_0 vs. H_1)
  p-value (probability of observing the test value or a more extreme one, if H_0 is true, e.g. P(T > t_obs))
  Type I error (no effect in reality, but H_0 is rejected): P(Type I) = α
  Type II error (in reality an effect, but H_0 is not rejected): P(Type II) = β
  The power of a test is 1 − β
Specific methods, one sample:
  t-test for mean difference
  Sample size for wanted power
  Model validation with normal q-q plot
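A minimal sketch of two of the quantities above: the observed one-sample t statistic, and an approximate power 1 − β. The power function uses a normal approximation instead of the t-distribution (the Python standard library has no t-distribution), so it is only a rough guide; the data and the effect size are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def t_statistic(x, mu0):
    """Observed one-sample t statistic for H0: mu = mu0."""
    n = len(x)
    return (mean(x) - mu0) / (stdev(x) / sqrt(n))

def approx_power(delta, sigma, n, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided test of H0: mu = mu0
    when the true mean differs by delta (normal approximation)."""
    z = NormalDist()
    shift = abs(delta) * sqrt(n) / sigma
    crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(shift - crit) + z.cdf(-shift - crit)

# Hypothetical data and effect size.
x = [1.2, 2.4, 1.3, 1.3, 0.0, 1.0, 1.8, 0.8, 4.6, 1.4]
t_obs = t_statistic(x, mu0=0)
power_n10 = approx_power(delta=0.5, sigma=1.0, n=10)
power_n40 = approx_power(delta=0.5, sigma=1.0, n=40)

print(t_obs, power_n10, power_n40)
```

With no true effect (delta = 0) the function returns α, the Type I error probability, and the power increases with the sample size.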


enote 3: Two Sample statistics
Specific methods, two samples:
  Test and confidence interval for the mean difference (t-test)
  Planning: calculating the sample size for wanted power
Specific methods, two PAIRED samples:
  "Take difference", then "One sample"
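The two situations can be sketched as test statistics. For two independent samples, the version below uses separate variance estimates (the Welch form), which is an assumption about which two-sample t-test is meant. For paired samples it takes the differences and then applies the one-sample formula. Data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(x1, x2):
    """t statistic for H0: mu1 = mu2, independent samples (Welch form)."""
    se = sqrt(variance(x1) / len(x1) + variance(x2) / len(x2))
    return (mean(x1) - mean(x2)) / se

def paired_t(x1, x2):
    """Paired samples: take differences, then a one-sample t statistic."""
    d = [a - b for a, b in zip(x1, x2)]
    return mean(d) / sqrt(variance(d) / len(d))

# Hypothetical paired measurements (e.g. before and after).
after = [2, 4, 6]
before = [1, 2, 3]

t_paired = paired_t(after, before)
t_indep = two_sample_t(after, before)

print(t_paired, t_indep)
```

On the same numbers the paired statistic is larger here, because differencing removes the variation between subjects.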


enote 4: Statistics by simulation
Introduction to simulation (calculate the statistic many times)
Error propagation rules (e.g. through a non-linear function)
Bootstrapping:
  Parametric (simulate many outcomes of the random variable)
  Non-parametric (draw values directly from the data)
  Confidence intervals (and hence also hypothesis testing)
Specific situations (4 versions of confidence intervals):
  One-sample and two-sample data
  Parametric vs. non-parametric
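A minimal sketch of a one-sample non-parametric bootstrap confidence interval: resample the data with replacement, recompute the statistic each time, and read off the empirical quantiles. The function name, the number of resamples and the data are assumptions for illustration.

```python
import random
import statistics

def bootstrap_ci(x, stat=statistics.mean, reps=2000, alpha=0.05, seed=1):
    """Percentile bootstrap interval for stat, drawing directly from x."""
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    n = len(x)
    # Calculate the statistic many times on resampled data sets.
    sims = sorted(stat(rng.choices(x, k=n)) for _ in range(reps))
    lower = sims[int((alpha / 2) * reps)]
    upper = sims[int((1 - alpha / 2) * reps) - 1]
    return lower, upper

# Hypothetical sample.
x = [38.43, 38.43, 38.39, 38.83, 38.45, 38.35,
     38.43, 38.31, 38.32, 38.48, 38.50]

lo, hi = bootstrap_ci(x)
print(lo, hi)
```

The parametric version would differ only in the resampling step: instead of `rng.choices(x, k=n)`, each simulated sample is drawn from a distribution fitted to the data.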


enote 5: Simple linear Regression Analysis
Two quantitative variables: x and y
Calculating the least squares line
Inferences for a simple linear regression model
  Statistical model: Y_i = β_0 + β_1 x_i + ε_i
  Interval estimation and test for β_0 and β_1
  Confidence interval for the line (the line will be inside 95% of the time)
  Prediction interval for points (95% of new points will be inside)
ρ, R and R²
  ρ is the correlation (= sign(β_1) · R) and describes the strength of the linear relation between x and y
  R² is the fraction of the total variation explained by the model
  If H_0: β_1 = 0 is rejected, then H_0: ρ = 0 is also rejected
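The least squares line and the relation between the correlation, R and R² can be sketched directly from the sums of squares. The function name `fit_line` and the data are hypothetical; the sketch only computes point estimates, not the intervals or tests.

```python
from math import copysign, sqrt

def fit_line(x, y):
    """Least squares estimates for y_i = b0 + b1 * x_i + e_i.

    Returns (b0, b1, r2, r), where r = sign(b1) * sqrt(r2).
    """
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sst = sum((yi - ybar) ** 2 for yi in y)                          # total variation
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))    # residual variation
    r2 = 1 - sse / sst           # fraction of variation explained by the model
    r = copysign(sqrt(r2), b1)   # correlation carries the sign of the slope
    return b0, b1, r2, r

# Hypothetical data with a decreasing trend.
x = [1, 2, 3, 4, 5]
y = [9.8, 8.1, 6.2, 3.9, 2.1]
b0, b1, r2, r = fit_line(x, y)

print(b0, b1, r2, r)
```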


enote 6: Multiple linear Regression Analysis
Many quantitative variables: y, x_1, x_2, ... (y is the dependent/response variable and the x's are explanatory/independent variables)
Calculating the least squares plane (a plane since there are more than 2 dimensions)
Inferences for the multiple linear regression model
  Statistical model: Y_i = β_0 + β_1 x_{1,i} + β_2 x_{2,i} + ... + ε_i
  Confidence interval estimation and test for the β's
  Confidence interval for the expected fit (fitted line)
  Prediction interval for new points
R² expresses the proportion of the total variation explained by the linear fit


enote 8: One-way Analysis of Variance
Specific methods, k INDEPENDENT samples
One-way analysis of variance
  Test for comparing the means of the groups
  ANOVA table: SST = SS(Tr) + SSE
  F-test
Post hoc test(s): pairwise t-tests with pooled variance estimate
  If planned beforehand, then without Bonferroni correction
  If all samples are compared, then with Bonferroni correction
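The decomposition SST = SS(Tr) + SSE and the F statistic can be sketched as below. The group data and the function name are hypothetical, and the p-value (from the F-distribution with k − 1 and n − k degrees of freedom) is left out because it needs a statistics library.

```python
def one_way_anova(groups):
    """Return (SST, SS(Tr), SSE, F) for a list of independent groups."""
    all_obs = [v for g in groups for v in g]
    n, k = len(all_obs), len(groups)
    grand = sum(all_obs) / n
    means = [sum(g) / len(g) for g in groups]
    sst = sum((v - grand) ** 2 for v in all_obs)                        # total
    sstr = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))  # between groups
    sse = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)   # within groups
    f = (sstr / (k - 1)) / (sse / (n - k))
    return sst, sstr, sse, f

# Hypothetical data: k = 3 groups with 4 observations each.
groups = [
    [2.8, 3.6, 3.4, 2.3],
    [5.5, 6.3, 6.1, 5.7],
    [5.8, 8.3, 6.9, 6.1],
]
sst, sstr, sse, f = one_way_anova(groups)

print(sst, sstr, sse, f)
```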


enote 8: Two-way Analysis of Variance
Block design: two-way analysis of variance
ANOVA table: SST = SS(Tr) + SS(Bl) + SSE
F-test
  SST, SS(Tr) and SS(Bl) are calculated as in one-way ANOVA
  SSE = SST − SS(Tr) − SS(Bl)
Post hoc test(s): pairwise t-tests with pooled variance estimate
  If planned beforehand, then without Bonferroni correction
  If all samples are compared, then with Bonferroni correction
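A sketch of the two-way decomposition for a block design with one observation per treatment/block cell, where SSE is obtained as SST − SS(Tr) − SS(Bl). The table layout (rows are treatments, columns are blocks), the function name and the numbers are assumptions for illustration.

```python
def two_way_anova(table):
    """table[i][j] is the observation for treatment i in block j.

    Returns (SST, SS(Tr), SS(Bl), SSE, F_treatment, F_block).
    """
    k, l = len(table), len(table[0])      # k treatments, l blocks
    grand = sum(map(sum, table)) / (k * l)
    tr_means = [sum(row) / l for row in table]
    bl_means = [sum(col) / k for col in zip(*table)]
    sst = sum((v - grand) ** 2 for row in table for v in row)
    sstr = l * sum((m - grand) ** 2 for m in tr_means)
    ssbl = k * sum((m - grand) ** 2 for m in bl_means)
    sse = sst - sstr - ssbl
    df_error = (k - 1) * (l - 1)
    f_tr = (sstr / (k - 1)) / (sse / df_error)
    f_bl = (ssbl / (l - 1)) / (sse / df_error)
    return sst, sstr, ssbl, sse, f_tr, f_bl

# Hypothetical data: 3 treatments (rows) observed in 4 blocks (columns).
table = [
    [2.8, 3.6, 3.4, 2.3],
    [5.5, 6.3, 6.1, 5.7],
    [5.8, 8.3, 6.9, 6.1],
]
sst, sstr, ssbl, sse, f_tr, f_bl = two_way_anova(table)

print(sst, sstr, ssbl, sse, f_tr, f_bl)
```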


enote 7: Inferences for Proportions
Proportion: p = x/n (x successes out of n observations)
Specific methods, one, two and k > 2 samples
  Binary/categorical response
  Estimation and confidence interval of proportions
  Large sample vs. small sample methods
  Hypotheses for one proportion
  Hypotheses for two proportions
  Analysis of contingency tables (χ²-test) (all expected counts > 5)
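Two of the methods above can be sketched with the standard library: a large-sample confidence interval for one proportion, and the χ² statistic for a contingency table together with the expected counts needed for the "all expected > 5" check. The function names and the counts are hypothetical, and the p-value (from the χ² distribution) is not computed here.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, alpha=0.05):
    """Large-sample interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

def chi2_statistic(table):
    """Chi-square statistic and expected counts for an r x c table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    expected = [[r * c / n for c in col_tot] for r in row_tot]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    return stat, expected

# Hypothetical counts.
ci_low, ci_high = proportion_ci(x=50, n=100)
stat, expected = chi2_statistic([[23, 35], [17, 25]])
all_expected_ok = all(e > 5 for row in expected for e in row)

print(ci_low, ci_high, stat, all_expected_ok)
```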