This course discusses database systems that deal with large-scale semi-structured data. Typical examples of such data include large hypertext collections such as Wikipedia or the Web. Another example is the data generated by user activity in distributed systems, such as Web access logs.
In recent years the amount of semi-structured data that users generate has increased significantly. For example, in many Web applications users either generate content or leave a large number of traces, such as access logs, when interacting with the system. When dealing with such data, traditional relational database systems quickly reach their limits. Therefore, a new database technology has recently emerged to manage these huge collections of semi-structured and unstructured data. This new technology is typically called "NoSQL".
In this course we will investigate some of the properties of NoSQL systems and discuss in more detail technologies such as Map-Reduce and graph databases.
Course topics include:
In this course the students will:
At the end of this course the students will know how to:
The course includes a practical project, which must be carried out during the lecture period. The goal of the project is to experience first hand the differences between traditional databases and NoSQL databases. To this end, you will compare traditional database concepts with NoSQL database concepts.
There are two sets of requests: i) mandatory requests, and ii) hypothesis requests. All mandatory requests need to be implemented, while you are free to decide how many hypotheses you want to define (minimum of one). You may choose one or more of the suggested hypotheses, or define your own.
The practical project is worth a total of 40 points. At the end of the term there is also a final examination with two questions worth 20 points each, for a total of 40 points. To receive a positive grade you need at least 10 points on the project, at least 10 points on the examination, and at least 41 points combined.
Students who do not conduct/present the project will get zero points on the first part and will automatically fail the course!
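The pass conditions above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `passes_course` and the constant names are hypothetical, and the code covers only the pass/fail thresholds stated above, not the mapping from points to individual grades.

```python
# Thresholds taken from the rules stated above.
MIN_PROJECT_POINTS = 10   # minimum points required on the practical project
MIN_EXAM_POINTS = 10      # minimum points required on the final examination
MIN_TOTAL_POINTS = 41     # minimum combined points for a positive grade


def passes_course(project_points: int, exam_points: int) -> bool:
    """Return True if all three pass conditions are met (hypothetical helper)."""
    total = project_points + exam_points
    return (
        project_points >= MIN_PROJECT_POINTS
        and exam_points >= MIN_EXAM_POINTS
        and total >= MIN_TOTAL_POINTS
    )


# A student with 25 project points and 18 examination points passes.
print(passes_course(25, 18))
# Skipping the project gives 0 project points, so even a perfect exam fails.
print(passes_course(0, 40))
```

Note that 20 points on each part (40 combined) is not sufficient, because the combined minimum is 41.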
The grading scheme is as follows: