Introduction to Data Science and Statistics

preliminary — subject to changes

Course Description

This course introduces the theory and practice of describing and analysing data and of testing hypotheses, and shows how to use the free statistical computing software R for your data analyses.

Topics discussed include, but are not limited to, combinatorics, probability, random variables, density and distribution functions, sampling from a population, sample properties and descriptive summary statistics, and inferring population characteristics from a random sample.

Course Objectives

The course will enable students to describe and analyse data using simple statistical methods, and to interpret and report the results of their analyses. Students will learn how to use the statistical computing software R.

Course Materials

Required:

The course will follow the textbook:
tba

The required readings from this textbook are listed below.

Further recommendations:

Additionally, the following text may be used as a reference:

A brief introduction to the software R and RStudio can be found here, and a slightly longer version is available online here.

Course Requirements:

Students must read the corresponding chapters of the textbook before each session. Students need access to a computer with the statistical computing software R. The software is available free of charge from www.r-project.org. RStudio is recommended as a graphical front end.

Instructor Information:

Prof. Dr. Dennis A. V. Dittrich
dennis.dittrich@touroberlin.de
http://economicscience.net

You can always reach me via email. Appointments for meetings in my office can be arranged through my webpage at: http://economicscience.net/content/book-appointment.

Updated information, links to the literature, additional materials, etc. can be found on my webpage as well.

Grading Guidelines:

Grading Component    Weight
Problem Sets         40%
Final Examination    60%

Workload

A typical 3-credit course requires 150 hours of your time. The table below shows how I expect those 150 hours to be allocated. While you do not receive direct marks for reading, reading will affect your class participation (your ability to participate in class discussions and activities) and your final exam mark.

Activity                                   Time
Class Time (3 hours / week)                45 hours
Reading (2 hours / week)                   30 hours
Problem Sets (2 hours / week)              30 hours
Preparation and Review (3 hours / week)    45 hours

Weekly Topics and Reading Assignments

Session 1: 13.02.

  • Introduction to course and to data science and statistics

Session 2: 20.02.

Session 3: 27.02.

Session 4: 06.03.

Session 5: 13.03.

Session 6: 20.03.

Session 7: 27.03.

Session 8: 03.04.

Session 9: 10.04.

Session 10: 17.04.

Session 11: 29.04. (Monday)

Session 12: 08.05.

Session 13: 15.05.

Session 14: 22.05.

Session 15: 29.05.

Final exam

Topics and reading assignments are subject to change.