SCANIA Component X Dataset: A Real-World Multivariate Time Series Dataset for Predictive Maintenance

SND-ID: 2024-34. Version: 1. DOI: https://doi.org/10.58141/1w9m-yz81

Citation

Creator/Principal investigator(s)

Tony Lindgren - Stockholm University, Department of Computer and Systems Sciences, DSV

Olof Steinert - Scania CV AB, Strategic Product Planning and Advanced Analytics

Oskar Andersson Reyna - Scania CV AB, Connected Intelligence

Zahra Kharazian - Stockholm University, Department of Computer and Systems Sciences, DSV

Sindri Magnússon - Stockholm University, Department of Computer and Systems Sciences, DSV

Research principal

Scania CV AB

Description

This is a real-world, multivariate time series dataset collected from an anonymized engine component (called Component X) of a fleet of trucks from SCANIA, Sweden. It includes diverse variables capturing detailed operational data, repair records, and truck specifications, all anonymized to maintain confidentiality. The dataset is well suited to a range of machine learning applications, such as classification, regression, survival analysis, and anomaly detection, particularly in predictive maintenance scenarios. The large population size, the variety of features in the form of histograms and numerical counters, and the inclusion of temporal information make this real-world dataset unique in the field. The dataset is released to give a broad range of researchers the opportunity to work with real-world data from a well-known international company and to introduce a standard benchmark to the predictive maintenance field, fostering reproducible research.

Data contains personal data

No

Language

Method and outcome

Data format / data structure

Data collection
  • Mode of collection: Physical measurements and tests
  • Data collector: Scania CV AB
  • Source of the data: Processes

Geographic coverage

Administrative information

Other research principals

Topic and keywords

Research area

Computer and information science (Standard för svensk indelning av forskningsämnen 2011)

Computer science (Standard för svensk indelning av forskningsämnen 2011)

Other computer and information science (Standard för svensk indelning av forskningsämnen 2011)

Communication systems (Standard för svensk indelning av forskningsämnen 2011)

Signal processing (Standard för svensk indelning av forskningsämnen 2011)

Computer systems (Standard för svensk indelning av forskningsämnen 2011)

Vehicle engineering (Standard för svensk indelning av forskningsämnen 2011)

Publications

Zahra Kharazian, Tony Lindgren, Sindri Magnússon, Olof Steinert, Oskar Andersson Reyna. (2024). SCANIA Component X Dataset: A Real-World Multivariate Time Series Dataset for Predictive Maintenance. arXiv:2401.15199.
DOI: https://doi.org/10.48550/arXiv.2401.15199

License

CC BY 4.0

Versions

Version 1. 2024-02-20

DOI: https://doi.org/10.58141/1w9m-yz81

Contact for questions about the data

Tony Lindgren

tony@dsv.su.se

Published: 2024-02-20