European Common Data Management Platform Definition for Railway AI Function Development

Authors

  • Mikel Labayen *

    Autonomous Vehicle Department, CAF Signalling, Donostia, 20018, Spain

    Computer Sciences and Artificial Intelligence Department, University of the Basque Country, Donostia, 20018, Spain

  • Daniel Ochoa de Eribe

    Autonomous Vehicle Department, CAF Signalling, Donostia, 20018, Spain

  • Ander Aramburu

    R&D Department, CAF, Beasain, 20200, Spain

  • Marcos Nieto

    Connected & Cooperative Automated Systems Department, Vicomtech Research Centre, Donostia, 20009, Spain

  • Naiara Aginako

    Computer Sciences and Artificial Intelligence Department, University of the Basque Country, Donostia, 20018, Spain

DOI:

https://doi.org/10.55121/tdr.v2i1.143

Keywords:

Common data management platform, Artificial intelligence, AI training and testing, Autonomous vehicle

Abstract

The digitalisation and automation of railway operations include the use of Automatic Train Operation systems, which provide automated functions at different levels known as Grades of Automation (GoA). These levels go up to GoA4, in which the train is controlled automatically with no staff on board. Artificial intelligence (AI) has emerged as a technology that can replace humans in certain driving tasks in the GoA3 (driverless) and GoA4 (unattended) modes. AI capabilities include perception, decision-making, precise positioning and the optimisation of communications. The success of AI models depends on the quality and diversity of the data used for training, and on the set-up of a data life-cycle framework that covers creation, training, testing, deployment and monitoring. Managing training datasets involves expensive and time-consuming data gathering, labelling, curation and formatting, which can hinder the development of reliable AI systems. This paper presents a Common Data Management Platform, developed by a consortium of European railway stakeholders to manage data for AI training efficiently, and demonstrates it in two Proofs of Concept.

Published

2024-08-09

Section

Articles