In our Innovation Lab, student teams of 5-6 people spend a semester working on Data Science projects. Students are intensively supervised by a team of experienced coaches over the whole project period. We welcome projects from industry as well as research institutions and NGOs; please contact us if you have an interesting problem to solve. Our students will deliver a proof-of-concept using state-of-the-art machine learning techniques, with the option to purchase the usage rights.
In this collaboration with OroraTech, the students built CNN-based methods to predict the future spread of wildfires. Besides the previous fire masks, multiple additional satellite data sources were integrated, providing, e.g., weather data, landcover types, and elevation information.
Reducing the time an ambulance needs to arrive at an incident can save lives, and dynamically redeploying ambulances to different base stations is one way to achieve this. The students implemented a discrete-event simulation that replays real-world incidents, processed multiple incident datasets, and implemented existing baselines. Predicting future demand plays a vital role in redeploying ambulances. Therefore, the students implemented and evaluated various ambulance demand prediction models.
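To illustrate the idea of replaying incidents in a discrete-event simulation, here is a minimal sketch. It is not the students' simulator: the one-dimensional road, the `Incident` fields, the `replay` function, and the assumption that an ambulance is back at its base station the moment it finishes a job are all simplifications introduced for this example.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    time: float          # minutes since the start of the replay
    location: float      # position on a 1-D road (hypothetical simplification)
    service_time: float  # minutes the ambulance is busy after arriving


def replay(incidents, stations, speed=1.0):
    """Replay incidents in time order and return one response time per incident.

    Assumes one ambulance per base station. Each incident is served by the
    ambulance that can arrive earliest, taking into account when it is free.
    """
    free_at = [0.0] * len(stations)  # time at which each ambulance is free again
    # The event queue is a heap keyed by incident time; the index breaks ties.
    events = [(inc.time, i, inc) for i, inc in enumerate(incidents)]
    heapq.heapify(events)

    def arrival(ambulance, now, inc):
        travel = abs(stations[ambulance] - inc.location) / speed
        return max(now, free_at[ambulance]) + travel

    response_times = []
    while events:
        now, _, inc = heapq.heappop(events)
        best = min(range(len(stations)), key=lambda a: arrival(a, now, inc))
        arrives = arrival(best, now, inc)
        free_at[best] = arrives + inc.service_time
        response_times.append(arrives - now)
    return response_times


incidents = [Incident(0, 2, 10), Incident(1, 2, 10)]
print(replay(incidents, stations=[0, 10]))  # second base station far away
print(replay(incidents, stations=[0, 3]))   # second base station redeployed
```

Comparing the two calls shows how moving a base station closer to where incidents occur changes the response times in the replay, which is the quantity a redeployment strategy tries to reduce.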
Schafkopf is a traditional Bavarian card game that requires complex interaction with other team members (ad-hoc teamplay). The students implemented a high-performance C++ simulator and various agents, ranging from rule-based and Monte Carlo tree search agents to reinforcement-learning-based agents.
Together with the project partner Dr. med. Boris Rauchmann (LMU Klinikum), the Connectome team explored the prediction of Alzheimer’s disease diagnosis based on connectivity matrices utilizing the Brainnetome Atlas. The results of the project include a pipeline for processing connectivity matrices to predict and explain a patient’s Alzheimer’s status. The pipeline allows users to automate the training, evaluation, and interpretation of various models based on several dataset options, such as aggregated connectivity matrices or graph metrics applied to the human brain connectivity data.
In this project, the students were confronted with the problem of completing the road network of Antananarivo, Madagascar, to close sanitation gaps. The project was conducted in collaboration with Gather, a registered charity. First, the quality of existing digital maps, i.e. Sentinel-2, Google Maps and Mapbox, was assessed, and Mapbox was chosen as the most comprehensive available data source. Second, road segmentation models (U-Net architectures) were implemented and compared on aerial imagery to detect routes that have not been digitized yet.
In this project, students applied techniques from Emotion Mining to datasets from the Argument Mining area. Students implemented a pipeline for automatic identification of emotions in arguments and demonstrated empirically the role of pathos in the arguing process.
This team tried to extend the single-agent travelling officer problem to multiple agents and partial observability (settings where a full sensor network is not available).
This team tried to manage traffic lights on a large scale to optimize traffic in cities.
In this project, a cooperation with GFZ, the students work on creating a spatio-temporal model of space weather. In particular, we combine measurements of solar activity taken by satellites and on Earth with measurements of geomagnetic activity, and aim to predict several shape parameters of the ionosphere. The challenges are that the time series do not follow the same cadence, can contain missing values, and come from moving sensors (since they are satellites); thus we have measurements neither for the same location across time nor for the same time across many locations.
This team investigates different strategies (random, passive, active) to obtain labels from a large pool of unlabeled image data. Given labels after the acquisition, they evaluate the impact of this more or less intelligently selected sparse set on the performance of various models for image classification. More precisely, the students assess the effect of selectively choosing labeled data on a model trained from scratch, a transfer learning model, and a state-of-the-art semi-supervised model that additionally has access to all unlabeled data.
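The difference between a random and an active acquisition strategy can be sketched in a few lines. The example below is only an illustration: the function names are hypothetical, and least-confidence sampling is used as one common active strategy, which is an assumption and not necessarily the criterion the team evaluated.

```python
import random


def select_random(pool_size, budget, seed=0):
    """Random acquisition: pick `budget` pool indices uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(pool_size), budget))


def select_least_confident(probabilities, budget):
    """Active acquisition: pick the samples the current model is least sure about.

    `probabilities` holds one list of predicted class probabilities per
    unlabeled image; confidence is the probability of the top class.
    """
    confidence = [max(p) for p in probabilities]
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return sorted(ranked[:budget])


# Hypothetical model outputs for four unlabeled images and two classes.
pool_predictions = [[0.90, 0.10], [0.55, 0.45], [0.60, 0.40], [0.99, 0.01]]
print(select_least_confident(pool_predictions, budget=2))
print(select_random(len(pool_predictions), budget=2))
```

The indices returned by either function would then be sent for labeling, and the resulting sparse labeled set used to train the models under comparison.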
This team worked on one of the datasets from this year’s KDD Cup. The dataset is a full dump of Wikidata, a knowledge graph of 80M entities, 1.3k relation types and ~500M triples. The huge volume poses challenges for training models on GPU. This particularly holds for training graph-neural-network based models which require coherent subgraphs for batching, and efficiently obtaining representative subgraphs is a non-trivial task.
Environmental pollution is increasingly becoming a critical issue in the global community. Much of this pollution comes from human production and consumption of electricity. The main aim of this project, in cooperation with a large national energy provider, is therefore to raise public awareness of these matters. The focus is set purely on Germany. The result is a live map of Germany that shows the current amount of CO2 emissions derived from the consumption and the production of energy on a county level. Moreover, this sustainability mirror shows details about the different sources of electricity.
Clouds are a major and well-known issue when working with earth observation (EO) data. Dense clouds obstruct the earth surface and thus generate gaps in the data. This project revolved around the filling of those gaps. Specifically, the project consisted of two parts: First, machine learning (ML) possibilities for the purpose of gap filling were investigated and a recently proposed ML approach was applied to the problem and evaluated. The second part of the project was the development of a Python package to facilitate the validation of gap filling methods.
The weather in Europe is mainly driven by high and low pressure systems and their positions relative to each other. Typical constellations of high and low pressure systems are classified as atmospheric circulation patterns. The class “Tief Mitteleuropa” is known to occur rarely (~ 10 days/year), but when it does it often triggers extreme rainfall and floods in Central Europe. The more frequent class “Trog Mitteleuropa” with about 20 days/year is also related to heavy rainfall (Ustrnul & Danuta, 2001).
Together with a big media outlet, a team of students worked on using data obtainable from satellite imagery for a journalistic project. The goal of this project was to create a tool that allows journalists to investigate several climate phenomena in a given area along with an analysis of their development over time. In the future, you might hear about the findings in the news!
In this project, students applied techniques from Argument Mining on the new corpus of peer-reviews for scientific publications. Students implemented a pipeline for automatic identification of arguments in peer-reviews and demonstrated empirically the importance of arguments in the decision making process. The work was presented at AAAI-21.
The sun continually emits electrically charged particles. These particles get accelerated/decelerated by the earth’s magnetic field. High-energy particles can pose severe threats to satellite operations and affect electricity plants on the ground. In this project, which is a collaboration with LMU Geophysics, the students developed predictive models for proton intensities in space based on geomagnetic and solar activity indices. The models were applied to investigate the correlation between proton intensities and measurement corruptions of an existing spacecraft, and to forecast proton intensities to help satellite operators protect their instruments. The work appeared in the Astrophysical Journal.
Knowledge graphs (KGs) are a way to represent facts in a structured form that machines can efficiently process. There exist several large-scale common knowledge KGs, such as Wikidata or Google Knowledge Graph, but also more specialized ones, for instance, bio-medical ones, such as HetioNet. To combine information from different sources, entities from one graph have to be recognized in the other one, despite potentially differing labels, descriptions, or associated data. This task is commonly referred to as Entity Alignment (EA). While humans can easily collect and combine information about an entity from different sources, the task remains challenging for Machine Learning methods.
In this project, the students investigated several state-of-the-art entity alignment methods based on Graph Neural Networks (GNNs) and Generative Adversarial Networks (GANs). They re-implemented the techniques in a common framework, compared the code published by the authors to the method described in the papers, and tried to reproduce the reported results.
Presentation
In the project, we studied the performance of several state-of-the-art argument detection models regarding the generalization capability across multiple argumentation-schemes.
Presentation
In this project, we used data from the KDDCup 2020 to create a realistic taxi-dispatching simulation environment for Reinforcement Learning.
The data was analyzed, cleaned, and used to model the agent’s idle movement within the simulation and the taxi requests of passengers. Different kinds of policies were then implemented and evaluated, e.g., using Kuhn-Munkres and a value-based Reinforcement Learning algorithm.
Presentation
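As an illustration of the Kuhn-Munkres policy mentioned for the taxi-dispatching project above, the sketch below assigns idle taxis to open requests so that the total pickup cost is minimal. The cost matrix and the function name `kuhn_munkres` are made up for this example, and the code is a generic textbook implementation of the algorithm, not the project's dispatcher. It assumes there are at least as many requests (columns) as taxis (rows).

```python
def kuhn_munkres(cost):
    """Minimum-cost assignment of rows to columns (Hungarian algorithm, O(n^2 m)).

    `cost[i][j]` is the cost of giving request j to taxi i; requires
    len(cost) <= len(cost[0]). Returns (assignment, total_cost), where
    assignment[i] is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)    # row potentials
    v = [0] * (m + 1)    # column potentials
    p = [0] * (m + 1)    # p[j]: row currently matched to column j (1-based, 0 = none)
    way = [0] * (m + 1)  # predecessor columns on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so that at least one new edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return assignment, total


# Hypothetical pickup distances: rows are taxis, columns are passenger requests.
pickup_cost = [[4, 1, 3],
               [2, 0, 5],
               [3, 2, 2]]
print(kuhn_munkres(pickup_cost))
```

A greedy dispatcher would give the second taxi the request at distance 0 and end up with a larger total; solving the assignment jointly avoids this.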
In this project, we applied Multi-agent Reinforcement Learning techniques to teach agents to avoid contact with each other while at the same time trying to get to their target destination as quickly as possible. For that, a flexible grid environment with different agent observations and rewards was implemented. Then, we trained Deep Q-Learning agents to navigate the environments and avoid each other, comparing them to ignorant shortest-path agents as a baseline.
Presentation
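The "ignorant shortest-path" baseline from the multi-agent navigation project above can be pictured as an agent that plans a shortest route on the grid and ignores all other agents. The following sketch uses breadth-first search on a 4-connected grid; the grid encoding (0 = free, 1 = blocked) and the function name are assumptions made for this example.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected grid.

    `grid[r][c] == 1` marks a blocked cell. Returns the list of cells from
    `start` to `goal` (inclusive), or None if the goal cannot be reached.
    Other agents are deliberately not considered.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(shortest_path(grid, start=(0, 0), goal=(2, 0)))
```

Because such an agent never reacts to others, two of them can plan routes through the same cell at the same time, which is exactly the contact a learned policy is supposed to avoid.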
A team of 5 students, in cooperation with a large national energy provider, worked on a dashboard for various sustainability metrics for cities, making available data for air pollution, urbanization, renewable energy production and more. They combined multiple open data sources which are available for most major cities to provide metrics that are comparable across cities and can be updated automatically.
A team worked on the Energy Consumption Prediction Challenge and built a well-performing model along with an analytics dashboard that lets users predict energy consumption for custom buildings. GitHub page. In the second phase of the project the team built a webpage that allows users to predict the energy consumption of their buildings based on the model from the challenge.
Another team built an interactive dashboard for the investigation of results from the IASS Social Sustainability Barometer in collaboration with the Institute for Advanced Sustainability Studies in Potsdam, Germany. The dashboard, built with R and shiny, shows the results of the social sustainability survey geographically and over time.
Students developed mlr3forecasting, an R package for time series forecasting with machine learning. Its goal is to facilitate time series forecasting, e.g., predicting global temperatures. It extends the popular machine learning framework mlr3.
Students trained deep neural network models for the xView2 disaster prediction challenge. The challenge’s goal is to find and localize damage from natural disasters on satellite images. The students built a model based on a U-Net architecture with a ResNet backbone and adaptive loss functions that was able to classify and localize damage types.
In this project, the goal was to find anomalies in X-Ray images in a completely unsupervised fashion,
without using labeled data. This project was done in cooperation with Deepc, a start-up company founded
by LMU students.
Publication
In this project, the students dealt with spatial interpolation of weather information in mountain regions.
Our industry partner provided us with data from different regions. One of the main challenges in the project
is to learn a model that can be applied to new regions without the need for re-training.
Publication
The students participated in a real Data Science challenge, where they had to compete against 800 teams.
The goal of the challenge is to determine the best route for the user among the different variants proposed
by a transportation app.
Poster,
Presentation
Knowledge graphs are a versatile tool to represent structured data used, for example, in the Wikidata
project or for the Google Knowledge Graph. Link prediction aims at predicting missing links in order to
enrich the knowledge base. This project’s focus is on combining different models into an ensemble in order
to exploit the individual models’ strengths.
Poster,
Presentation
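One simple way to combine several link prediction models, as in the ensemble project above, is to bring their scores onto a common scale and average them. The sketch below does this for the candidate answers to a single query. The model scores, entity names, and the min-max averaging scheme are assumptions made for illustration; the project may have combined its models differently.

```python
def normalize(scores):
    """Min-max scale one model's candidate scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {entity: 0.0 for entity in scores}
    return {entity: (s - lo) / (hi - lo) for entity, s in scores.items()}


def ensemble_rank(model_scores):
    """Rank candidates by the mean of the normalized scores of all models.

    `model_scores` is a list with one {candidate: score} dict per model;
    all dicts are assumed to cover the same candidates.
    """
    normalized = [normalize(scores) for scores in model_scores]
    combined = {
        entity: sum(n[entity] for n in normalized) / len(normalized)
        for entity in normalized[0]
    }
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical scores of two models for the query (Eiffel Tower, located_in, ?).
model_a = {"Berlin": 2.0, "Paris": 1.0, "Rome": 0.0}
model_b = {"Berlin": 0.0, "Paris": 10.0, "Rome": 1.0}
print(ensemble_rank([model_a, model_b]))
```

Normalizing first matters because different models produce scores on very different scales, so a plain average would let one model dominate.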
In object detection challenges, neural architectures often fail to detect small objects in images.
Thanks to recent developments in the research area of super-resolution, it is now possible to improve
the quality of images. The students apply recently introduced super-resolution techniques to improve object
detection performance.
Publication
Argument mining is one of the hardest problems in Natural Language Processing. The main challenge is
that arguments are structurally similar to purely informative texts, and only differ semantically.
In this project, students utilized background information from knowledge graphs for better argument mining.
Poster,
Presentation
Aerial images yield a cost-efficient way to automatically generate census information about the
biodiversity in urban environments. In this project, the students developed a neural-network based object
detection and recognition method for registering and classifying trees in the city of Edmonton.
Poster,
Presentation
We are always looking for project partners from industry and academia. The Innovation Lab is an excellent opportunity for you to get smart and eager students to work on your data science and machine learning projects. 5-6 students work in teams for 3-4 months on your project, supervised by experienced researchers from LMU Munich’s Innovation Lab. In our experience, students are highly motivated to work on real projects, which has resulted in many projects completed to the satisfaction of our project partners.
Specifically, we are looking for projects that involve some application of Machine Learning or Data Science methods and that provide interesting challenges for our students to solve. We especially welcome projects with relatively concrete goals, which the students can then implement.
If you are interested, feel free to drop us a line. We will gladly help you to find out whether your project is suitable for the Innovation Lab and work out the right scope.
Promoted by the Bavarian State Ministry of Science and Art and coordinated by BiDT