Skip to content

Second Level KI in Weichen

The transportation transition in Germany is relying on a reliable and secure rail network to meet the growing demand for sustainable and efficient transportation solutions. A deliberate sabotage attack on communication lines in 2022 has highlighted the need for a robust and secure network.

In response to this incident, the project SLKI (Second Level KI in Weichen) aims to improve the security and reliability of Germany's rail network by harnessing the potential of fixed acceleration sensor data at train switches.

acceleration sensors

One of the main challenges is to clean and preprocess these noisy time-series data to obtain high-quality and reliable train signal data.

Based on these cleaned signal data it is then possible to provide AI-driven approaches to

  • classify train types,
  • track and predict train speeds and
  • uncover potential anomalies.

These capabilities will enable us to provide valuable insights into the operation of the rail network and explore potential applications in areas such as

  • real-time monitoring und control of rail traffic,
  • predictive maintenance scheduling as well as
  • enhanced security und surveillance.

We hope that our work contributes to the advancement of transportation systems und security, und provides a foundation for future research und development in this field.

Sensor implementation

Our project integrates data from sensor units (BREUER ARTEMIS) already installed at 650 turnouts of Deutsche Bahn InfraGO AG, as well as a test installation at Brunswick dock railway in Germany. This integrated dataset covers major lines in the InfraGO network, currently spanning across the middle and southwest regions. Additionally, BREUER is preparing to expand the coverage to include the "Riedbahn" region. The platform also supports data from all speed ranges (40-280 km/h) and various train categories, including Long-Distance, Regional, and Freight trains.

sensor incl. switch sensor focused view
Acceleration sensor at train switches

Data

Breuer has made the data available to us through a website, from which we were able to download it in HDF5 format. It concerns sensor data from turnouts on the DB network. Each turnout has two sensors, called "Frog" and "Points", which measure vibrations of the rails at specific points in front of, behind or above the turnout in x, y, z directions.

The HDF5 files all follow the same data structure, as illustrated by the diagram below.

Each file includes precisely two distinct datasets, distinguished by their unique sensor identifiers. One sensor contains the frog data and the other the points data. To distinction between these two sensors, it is necessary to check the value of the placement attribute within the sensors group. Under each sensor are the sensor data labelled with a timestamp in the format 2024-05-16T09:31:37Z.

The switch_list_info also includes a timestamp, which appears to be consistently fixed at 19700101. It's possible that this may be an error, which could be fix in future updates by Breuer.

This repository mainly uses the su_acceleration_data data.

HDF5 database structure overview
HDF5 database structure overview
HDF5 database structure including their attributes

Each HDF5 database path (groups as well as leaves) can contain attributes. The following diagram illustrates each path including their available attributes.

HDF5 database structure including their attributes
HDF5 database structure including their attributes
HDF5 database structure including attribute example data

Below is an excerpt of the HDF5 file structure, which includes example data for each attribute. The actual values are taken from a single example file and for demonstration purposes only.

HDF5 database structure including attribute example data
HDF5 database structure including attribute example data

Signal Processing

The SLKI pipeline provides a range of signal processing methods, collectively referred to as stages:

  • noise reduction
  • signal extraction
  • resampling
  • outlier reduction
  • signal smoothing
  • normalization
  • double integration
  • ...

Extract signal from data

A basic example of extracting a train signal from the data involves utilizing the stages noise reduction and signal extraction:

SLKI pipeline basic example
SLKI pipeline basic example

Signal peak detection

Using DBSCAN, it is possible to cluster the train signal and focus on the parts where the wheels actually roll over the rail. This method helps analyze the length and speed of the train itself.

SLKI peak detection example
SLKI peak detection example

Signal outlier detection

The idea behind detecting outliers involves:

  • identifying all peaks in the signal
  • calculating a boxplot of the whole signal
  • using the boxplot's "minimum" and "maximum" as boundaries
    \(\Rightarrow\) fliers are anomalies or outliers
  • using the fliers to boundary distance as importance factor
    Q1-1.5IQR   Q1   median  Q3   Q3+1.5IQR
                |-----:-----|
o      |--------|     :     |--------|    o  o
                |-----:-----|
flier  min      <----------->       max   fliers
                    IQR
Boxplot definition
SLKI outlier detection example
SLKI outlier detection example

Train signal classification

Following multiple approaches to classify train signals into three major classes Regionalverkehr, Fernverkehr and Güterverker, we examined various traditional machine learning algorithms, as well as time-series specialized frameworks like McFly [Paper] [GitHub]. In the end, we obtained the best results using a simple ResNet in combination with the time-series framework tsai [Doc] [GitHub].

The dataset consists of 3119 samples per category, which were split into balanced training (60%), validation (30%), and testing sets (10%).

The ResNet model achieved an accuracy of 89.2% after being trained for 20 epochs.

SLKI outlier detection example
SLKI signal classification: random samples
sensor incl. switch sensor focused view
SLKI signal classification: ResNet and confusion matrix