Course catalogue doctoral education - VT24

  • Applications can be submitted between 2023-10-16 and 2023-11-15
Application closed
Title Mastering R - Advanced Data Science and Statistical Analysis
Course number 5686
Programme 0-Not part of a doctoral programme
Language English
Credits 4.5
Date 2023-11-20 -- 2023-12-10
Responsible KI department Institutionen för biovetenskaper och näringslära
Specific entry requirements Basic knowledge of R (3 credits or more), gained for example by attending one of the courses “Statistics with R - from Data to Publication Figure”, “Introduction to R - Data Management, Analysis and Graphical Presentation” or “Get started with R – Programming Basics, Data Analysis and Visualisation”.
Purpose of the course The course is practical. Its aim is to:
Teach students how to work with more advanced concepts of the R programming environment and RStudio, such as functional programming, simple algorithm development, advanced data manipulation and visual presentation, R Markdown and the Tidyverse.
Intended learning outcomes After attending the course, the student should be able to:
• Wrangle data, tidy up messy data and structure data for analysis (e.g., convert from wide to long format)
• Identify situations suitable to the use of functions, loops and conditionals
• Construct their own algorithms, incorporating self-created functions, loops and conditionals
• Query SQL databases
• Create authentication processes for secure access to, for example, databases
• Use R Markdown to simplify markup and navigation, and to create PDF files and websites
• Use version control for collaboration
• Create their own packages from scratch, as well as “source” other R files from another script
• Create interactive and advanced representations of scientific data
• Run simple parallel processing operations
• Evaluate the efficiency of code and select the best option (for example, among alternative functions) to solve a given problem
Contents of the course The advanced R course covers more complex topics and builds upon the foundation established in basic R courses. The course contains:
Functional programming, including advanced techniques for writing functions in R, such as closures, anonymous functions, and higher-order functions.
Object-Oriented Programming: An introduction to the basics of object-oriented programming in R, including classes, objects, and inheritance.
Advanced data manipulation, including topics such as regular expressions, string manipulation, and the use of the tidyverse packages for data cleaning and manipulation.
Advanced data visualization, including interactive visualizations using Shiny and visualizing complex data using the ggplot2 package.
Big data analysis techniques, including parallel processing, distributed computing, and the use of packages like SparkR and dplyr to scale up data analysis.
Database handling, including connecting to and querying SQL databases.
An introduction to machine learning using TensorFlow and Keras.

The course will also emphasize best practices for reproducibility and collaboration, introducing the concepts of writing modular and reusable code, using version control with Git, and using R Markdown for reproducible reporting.
Teaching and learning activities Distance learning with online Zoom lectures and labs. Videos, quizzes and tasks in Canvas. Group and individual exercises and assignments. Reviewing other students’ code and interacting with other students. Individual project work. Each session will be structured around a specific concept, for example writing simple algorithms (like binary search or greedy algorithms). Each session will span two days. On the first day the concept is introduced with a lecture, followed by quizzes and tasks that gradually increase the students' autonomy in using the concept. The second day will be a lab where students use the concept to perform a specific action, receiving formative feedback from one peer and one teacher. The last day of every second week will be a larger exercise, either individual or in a group, where the student is required to combine the introduced concepts into a whole. This exercise will be reviewed by a fellow student, who will have the opportunity to comment on ways to improve the work.
Compulsory elements Participation in lectures, quizzes, labs, individual and group projects, and reviews of other students’ projects. Absence from lectures can be compensated for by completing an additional task. Quizzes can be completed at any time during the course. Lab activities can also be completed at any time during the course, but teacher support will only be available during the scheduled lab occasion. The remaining elements cannot be compensated for during the course.
Examination One individual project and one group project
Literature and other teaching material Recommended e-book: "Advanced R" by Hadley Wickham (2nd online edition)
Number of students 12 - 18
Selection of students Selection will be based on 1) the relevance of the course syllabus for the applicant's doctoral project (according to written motivation), 2) start date of doctoral studies (priority given to earlier start date)
More information The course runs for three weeks of full-time (100%) study and is held online via Zoom. Teaching occasions consist of lectures, labs, and quizzes and tasks in Canvas. Lectures are supported by videos. Examination will be conducted through one group assignment and one individual assignment. Reviewing other students’ code is also obligatory. Each session will be structured around a specific concept, for example writing simple algorithms (like binary search or greedy algorithms). Most sessions will span two days. On the first day the concept is introduced with a lecture, followed by quizzes and tasks that gradually increase the students' autonomy in using the concept. The second day will be a lab where students use the concept to perform a specific action, receiving formative feedback from one peer and one teacher. The last day of every week will be a larger individual or group exercise, where the student is required to combine the introduced concepts into a whole. This exercise will be reviewed by a fellow student, who will have the opportunity to comment on ways to improve the work.
Additional course leader Alen Lovric
Latest course evaluation Not available
Course responsible Billy Langlet
Institutionen för neurobiologi, vårdvetenskap och samhälle
+46762033996
billy.langlet@ki.se
Contact person -