Syllabus database for doctoral courses

SYLLABI FOR DOCTORAL COURSES

Swedish title Bemästra R – Avancerad datavetenskap och statistisk analys
English title Mastering R - Advanced Data Science and Statistical Analysis
Course number 5686
Credits 4.5
Responsible KI department Institutionen för biovetenskaper och näringslära
Specific entry requirements Basic knowledge of R (3 credits (hp) or more), for example from having attended one of the courses “Statistics with R - from Data to Publication Figure”, “Introduction to R - Data Management, Analysis and Graphical Presentation” or “Get started with R – Programming Basics, Data Analysis and Visualisation”.
Grading Passed/Not passed
Established by The Committee for Doctoral Education
Established 2023-03-15
Purpose of the course The course is practical. Its aim is to teach students how to work with more advanced concepts of the programming language R and the RStudio environment, such as functional programming, simple algorithm development, advanced data manipulation and visual presentation, R Markdown and the tidyverse.
Intended learning outcomes After attending the course, the student should be able to:
• Wrangle data, tidy up messy data, and structure data for analysis (e.g., convert from wide to long format)
• Identify situations suited to the use of functions, loops and conditionals
• Construct their own algorithms, incorporating self-created functions, loops and conditionals
• Query SQL databases
• Create authentication processes for secure access to, for example, databases
• Use R Markdown to simplify markup and navigation, and to create PDF files and websites
• Use version control for collaboration
• Create their own packages from scratch, and “source” R files from another script
• Create interactive and advanced representations of scientific data
• Run simple parallel processing operations
• Evaluate the efficiency of code, selecting the best option among, for example, functions to solve a certain problem
Contents of the course The advanced R course covers more complex topics and builds upon the foundation established in basic R courses. The course contains:
• Functional programming, including advanced techniques for writing functions in R, such as closures, anonymous functions, and higher-order functions.
• Object-oriented programming, including an introduction to classes, objects, and inheritance in R.
• Advanced data manipulation, including regular expressions, string manipulation, and the use of the tidyverse packages for data cleaning and manipulation.
• Advanced data visualization, including interactive visualizations using shiny and visualization of complex data using the ggplot2 package.
• Big data analysis techniques, including parallel processing, distributed computing, and the use of packages like SparkR and dplyr to scale up data analysis.
• Database handling, including connecting to and querying SQL databases.
• An introduction to machine learning using tensorflow and keras.

The course will also emphasize best practices for reproducibility and collaboration, introducing the concepts of writing modular and reusable code, using version control with Git, and using R Markdown for reproducible reporting.
Teaching and learning activities Distance learning with online Zoom lectures and labs. Videos, quizzes and tasks in Canvas. Group and individual exercises and assignments. Review of other students’ code and interaction with other students. Individual project work. Each session will be structured around a specific concept, for example writing simple algorithms (such as binary search or greedy algorithms), and will span two days. On the first day, the concept is introduced with a lecture, followed by quizzes and tasks that gradually increase the students’ autonomy in using the concept. The second day is a lab in which students use the concept to perform a specific action and receive formative feedback from one peer and one teacher. The last day of every second week will be a larger exercise, either individual or in a group, in which students are required to combine the introduced concepts into a whole. This exercise will be reviewed by a fellow student, who will have the opportunity to comment on ways to improve the work.
Compulsory elements Participation in lectures, quizzes, labs, individual and group projects, and reviews of other students’ projects. Absence from lectures can be compensated for by completing an additional task. Quizzes can be completed at any time during the course. Lab activities can also be completed at any time during the course, but teacher support will only be available during the scheduled lab session. The remaining elements cannot be compensated for during the course.
Examination One individual project and one group project
Literature and other teaching material Recommended e-book: "Advanced R" by Hadley Wickham (2nd online edition)
Course responsible Billy Langlet
Institutionen för neurobiologi, vårdvetenskap och samhälle
+46762033996

billy.langlet@ki.se

Contact person