Dataset with condition monitoring vibration data annotated with technical language, from paper machine industries in northern Sweden
SND-ID: 2023-246. Version: 1. DOI: https://doi.org/10.5878/z34p-qj52
Alternative title
Annotated condition monitoring data for technical language processing and supervision
Creator/Principal investigator(s)
Karl Löwenmark - Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering
Fredrik Sandin - Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering
Marcus Liwicki - Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering
Stephan Schnabel - SKF (Sweden)
Research principal
Luleå University of Technology - Department of Computer Science, Electrical and Space Engineering
Principal's reference number
2019-02533
Description
Labelled industry datasets are one of the most valuable assets in prognostics and health management (PHM) research. However, creating labelled industry datasets is both difficult and expensive, so publicly available industry datasets are rare, and labelled ones rarer still.
Recent studies have shown that industry annotations can be used to train artificial intelligence models directly on industry data (https://doi.org/10.36001/ijphm.2022.v13i2.3137, https://doi.org/10.36001/phmconf.2023.v15i1.3507). However, although many industry datasets also contain text descriptions or logbooks in the form of annotations and maintenance work orders, few, if any, are publicly available.
Therefore, we release a dataset consisting of annotated signal data from two large (80 m x 10 m x 10 m) paper machines at a Kraftliner production company in northern Sweden. The data consists of 21 090 pairs of signals and annotations from one year of production. The annotations are written in Swedish by on-site Swedish experts, and the signals consist primarily of accelerometer vibration measurements from the two machines.
The dataset is structured as a Pandas dataframe and serialized as a pickle (.pkl) file and a JSON (.json) file. The first column (‘id’) holds the sample IDs. The second column (‘Spectra’) holds the fast Fourier transformed and envelope-transformed vibration signals. The third column (‘Notes’) holds the associated annotations, mapped so that each annotation is associated with all signals from ten days before the annotation date up to the annotation date. The fourth column (‘Embeddings’) holds pre-computed embeddings using Swedish SentenceBERT. Each row corresponds to one vibration measurement sample, though the data does not indicate which sensor or machine part each measurement is from.
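As an illustration of the column layout described above, the following is a minimal Python sketch of loading the pickled dataframe and checking its columns. Only the four column names come from the description; the function name `load_annotated_signals`, the file name `toy_dataset.pkl`, and the toy values and array shapes are assumptions made so that the example runs without the real files.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Column names as given in the dataset description.
EXPECTED_COLUMNS = ["id", "Spectra", "Notes", "Embeddings"]


def load_annotated_signals(path):
    """Load a pickled dataframe and check it has the described columns.

    Hypothetical helper, not part of the dataset release.
    """
    # Only unpickle files obtained from a source you trust.
    frame = pd.read_pickle(path)
    missing = [c for c in EXPECTED_COLUMNS if c not in frame.columns]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return frame


# Toy stand-in for the real file: values and array shapes are invented.
toy = pd.DataFrame({
    "id": [0, 1, 2],
    "Spectra": [np.zeros(8), np.ones(8), np.full(8, 0.5)],
    "Notes": ["note A", "note A", "note B"],
    "Embeddings": [np.zeros(4), np.zeros(4), np.ones(4)],
})

with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy_dataset.pkl")  # hypothetical file name
    toy.to_pickle(toy_path)
    df = load_annotated_signals(toy_path)

# One annotation can be attached to several signals (the ten-day window),
# so counting rows per note shows how many signals share each annotation.
signals_per_note = df.groupby("Notes")["id"].count()
print(df.columns.tolist())
print(signals_per_note.to_dict())
```

With the real data, the same function would be pointed at the downloaded .pkl file. Because each annotation is mapped to every signal in its ten-day window, grouping by ‘Notes’ is one simple way to see how many measurements share an annotation.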
Data contains personal data
No
Geographic spread
Geographic location: Sweden
Responsible department/unit
Department of Computer Science, Electrical and Space Engineering
Other research principals
Contributor(s)
Pär-Erik Martinsson - Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering
Peter Wikström - SCA Munksund
Per-Erik Larson - SKF (Sweden)
Håkan Sirkka - Smurfit Kappa
Kjell Lundberg - Smurfit Kappa
RISE Research Institutes of Sweden
Smurfit Kappa
Research area
Probability theory and statistics (Standard för svensk indelning av forskningsämnen 2011)
Computer and information science (Standard för svensk indelning av forskningsämnen 2011)
Language technology (computational linguistics) (Standard för svensk indelning av forskningsämnen 2011)
Other computer and information science (Standard för svensk indelning av forskningsämnen 2011)
Signal processing (Standard för svensk indelning av forskningsämnen 2011)
Other mechanical engineering (Standard för svensk indelning av forskningsämnen 2011)
Paper, pulp and fiber technology (Standard för svensk indelning av forskningsämnen 2011)
Keywords
Paper industry, Condition monitoring, Language technology, Signal processing, Fault detection, Natural language processing, Technical language processing, Technical language supervision, Natural language supervision, Fault diagnosis, Intelligent fault diagnosis, Prognostics and health management
Publications
Löwenmark, K., Taal, C., Vurgaft, A., Liwicki, M., Nivre, J., & Sandin, F. (2023). Labelling of annotated condition monitoring data through technical language processing.
URN:
urn:nbn:se:ltu:diva-95406
SwePub:
oai:DiVA.org:ltu-95406
Löwenmark, K., Taal, C., Schnabel, S., Liwicki, M., & Sandin, F. (2022). Technical Language Supervision for Intelligent Fault Diagnosis in Process Industry. In International Journal of Prognostics and Health Management (Vol. 13, Issue 2).
DOI:
https://doi.org/10.36001/ijphm.2022.v13i2.3137
URN:
urn:nbn:se:ltu:diva-93815
SwePub:
oai:DiVA.org:ltu-93815
Löwenmark, K., Sandin, F., & Fink, O. (2023). Technical Language Supervision for Intelligent Fault Diagnosis.
ISBN:
9789180482547
URN:
urn:nbn:se:ltu:diva-95414
SwePub:
oai:DiVA.org:ltu-95414
Löwenmark, K., Taal, C., Nivre, J., Liwicki, M., & Sandin, F. (2022). Processing of Condition Monitoring Annotations with BERT and Technical Language Substitution: A Case Study. In Proceedings of the 7th European Conference of the Prognostics and Health Management Society 2022 (pp. 306–314).
DOI:
https://doi.org/10.36001/phme.2022.v7i1.3356
URN:
urn:nbn:se:ltu:diva-95407
SwePub:
oai:DiVA.org:ltu-95407