Past Topics – Big Data Analytics Group

Hyperfast Snapshotting
How can you manipulate virtual memory to speed up snapshotting for concurrency control and other applications? One effort in that space is RUMA. Building on this work, a small extension to the Linux kernel makes snapshotting even faster.
new system architectures for data management
The idea of a database system was developed more than 30 years ago. Since then, many new data management problems have been identified and several new data models have been invented. Moreover, hardware has changed dramatically. We doubt that the abstraction of a database management system is always the best abstraction. Therefore, we are interested in coming up with better system abstractions that cover a wider range of information management problems. One effort in that direction is our project OctopusDB.
main memory and flash databases
Many databases today already fit into main memory. Systems are moving away from disks toward flash and DRAM. In addition, new kinds of storage media are emerging, e.g., persistent RAM. This new hardware will affect how information systems are built.
data warehousing and OLAP
Online Analytical Processing (OLAP) is at the heart of any medium to large business. OLAP systems provide instantaneous answers to complex queries. Building these systems is far more challenging than building ‘standard’ OLTP systems.
MapReduce/Hadoop
MapReduce/Hadoop is a popular paradigm for large-scale data analysis. However, typical MapReduce tasks suffer from performance issues. We look at ways to improve runtimes. One effort in that direction is our project Hadoop++ and its successor HAIL (Hadoop Aggressive Indexing Library).
dataspaces
Dataspaces are a new abstraction for information management. A dataspace system is a new kind of information integration system that incorporates features of search engines as well as EII/OLAP/DWH. We have built one of the first dataspace management systems: iMeMex. It uses declarative trails to enrich a dataspace.
moving objects and vehicle tracking including cars and aircraft
Indexing moving objects refers to scenarios where the data is changing quickly, e.g., cars, trains, and aircraft change their positions continuously. How do you provide scalable tolling and tracking systems for these kinds of applications? The MOVIES project investigates these research questions. In that context we also developed moto, a trace generator that simulates large datasets of moving objects. This simulator overcomes scaling issues prevalent in existing generators.
personal information management
Personal information refers to all files on all your devices (including your laptops, phones, iPods, audio systems, etc.). Currently, people use a zoo of techniques to manage and query this information. We look at ways to build a unified system for personal information management that lets you handle all your data in one place.

If you are from industry and are interested in working with us in one of the above areas, please feel free to contact us.