Algorithms for Big Data


  • Written exam

    The exam will be on Sept 10th at F102 and F303. Check here for exact details.
  • Oral exam

    Only for students who have visa restrictions! The oral exam will be on July 26th.

Aim of the Lecture

The aim of this lecture is to learn efficient algorithms for processing large datasets. We will study scalable approaches to several fundamental problems: finding similar items, clustering, recommender systems, and graph mining. The lecture has both theoretical and programming aspects. Students will be exposed to large-scale distributed data processing frameworks such as Hadoop and Spark.


Each lecture will be accompanied by a set of exercise questions that students should complete before the next lecture. To be evaluated, exercises must be handed in by the Tuesday before the next lecture. Exercises can be either scanned and emailed or delivered by hand. A smartphone photo of the assignment does not count as a scan.
After every lecture we will have an hour-long tutorial session in which students will be asked to present solutions to the previous lecture's exercise questions. All students are expected to present in the session and to complete the exercises on time.

  • Students can present solutions in the exercise session to improve their grade.
  • Only correct solutions (submitted on time) are eligible for presentation during the exercise session.
  • Every 3 solutions presented earn a 0.3 grade improvement on the final exam. The maximum improvement is 1.0 grade points.
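
As a rough illustration of the bonus rule above, the following Python sketch computes the improvement for a given number of presented solutions. It assumes that only complete groups of three count and that the 1.0 cap is applied after the per-group bonus is summed; neither detail is spelled out in the rules. The function name `grade_bonus` is hypothetical and not part of any course tooling.

```python
def grade_bonus(presented: int) -> float:
    """Return the grade improvement for `presented` solutions (assumed rule)."""
    if presented < 0:
        raise ValueError("presented must be non-negative")
    # Work in tenths of a grade point to avoid floating-point drift.
    tenths = 3 * (presented // 3)  # 0.3 per complete group of three (assumption)
    return min(tenths, 10) / 10    # capped at 1.0 grade points
```

Under these assumptions, 9 presented solutions give an improvement of 0.9, and 12 or more reach the cap of 1.0.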


#   Date        Lecture                Links
1   11.04.2018  Introduction           Lecture Notes
2   18.04.2018  Finding Similar Items  Lecture Notes, Assignment 1, Solution 1, Textbook reference
3   25.04.2018  Map Reduce             Lecture Notes and Code, Assignment 2, Solution 2
4   02.05.2018  Streaming              Lecture Notes, Assignment 3, Solution 3
5   09.05.2018  Streaming              Lecture Notes, Assignment 4, Solution 4
6   30.05.2018  Streaming              Lecture Notes, Assignment 5, Solution 5
7   06.06.2018  Graphs                 Lecture Notes, Assignment 6, Solution 6
8   13.06.2018  Graphs                 Lecture Notes, Assignment 7, Solution 7
9   20.06.2018  Graphs                 Lecture Notes, Assignment 8, Solution 8
10  27.06.2018  Graphs                 Assignment 9, Solution 9
11  04.07.2018  Clustering             Lecture Notes, Assignment 10, Solution 10
12  18.07.2018  Conclusion             Lecture Notes