Publication detail

Automation-Driven Dataset Preparation for Continuous Czech Sign Language Recognition

ŠNAJDER, J. KREJSA, J.

Type

Article in conference proceedings not indexed in WoS or Scopus

Language

en

Original abstract

This paper presents an automation-driven solution for preparing a continuous Czech Sign Language dataset, addressing the lack of resources in this area. Manual processing of daily sign language news recordings would be extremely time-consuming, as the videos vary in quality, use different overlays, and have no captions. To streamline this process, we use the Structural Similarity Index Measure (SSIM) to compare key frames and extract relevant parts of the recording, such as weather forecast segments. Automatic speech recognition (ASR) then processes the accompanying audio and generates textual transcriptions of the spoken content. The outcome is a highly automated preparation pipeline and a dataset of 4699 annotated videos of weather forecast news in Czech Sign Language, providing a foundation for future research in sign language recognition.
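
The key-frame comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not state which SSIM variant, threshold, or reference frames the paper uses. The code below assumes a simplified single-window (global) SSIM over flat grayscale frames rather than the sliding-window mean that most libraries compute, and the names `global_ssim`, `first_match`, `locate_segment`, and the 0.9 threshold are hypothetical.

```python
def global_ssim(a, b, dynamic_range=255.0):
    """Single-window SSIM of two equally sized grayscale frames.

    Frames are flat sequences of pixel intensities. Assumption: the whole
    frame is treated as one window, which simplifies the usual local SSIM.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and the same size")
    n = len(a)
    c1 = (0.01 * dynamic_range) ** 2  # stabilising constants from the SSIM definition
    c2 = (0.03 * dynamic_range) ** 2
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


def first_match(frames, reference, threshold, start=0):
    """Index of the first frame at or after `start` that resembles `reference`."""
    for i in range(start, len(frames)):
        if global_ssim(frames[i], reference) >= threshold:
            return i
    return None


def locate_segment(frames, start_ref, end_ref, threshold=0.9):
    """Return (begin, end) frame indices bounded by two reference key frames.

    Hypothetical helper: assumes a segment opens and closes with
    recognisable frames, such as an intro and outro graphic.
    """
    begin = first_match(frames, start_ref, threshold)
    if begin is None:
        return None
    end = first_match(frames, end_ref, threshold, begin + 1)
    if end is None:
        return None
    return (begin, end)


# Toy 4-pixel "frames" standing in for decoded video frames.
intro = [0, 255, 0, 255]
outro = [255, 0, 255, 0]
studio = [100, 100, 100, 100]
noisy_intro = [5, 250, 3, 251]  # intro graphic with slight compression noise
recording = [studio, noisy_intro, studio, studio, outro, studio]
segment = locate_segment(recording, intro, outro)
```

In practice the frames would come from a video decoder and the threshold would need tuning against recordings of varying quality and overlays, which is the difficulty the abstract points to.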

Keywords in English

sign language, continuous, dataset, recognition, translation

Released

04.12.2024

Publisher

Institute of Electrical and Electronics Engineers Inc.

Location

Brno

ISBN

979-8-3503-9489-4

Book

2024 21st International Conference on Mechatronics - Mechatronika (ME)

Pages from–to

52–56

Pages count

5

BIBTEX


@inproceedings{BUT196505,
  author="Jan {Šnajder} and Jiří {Krejsa}",
  title="Automation-Driven Dataset Preparation for Continuous Czech Sign Language Recognition",
  booktitle="2024 21st International Conference on Mechatronics - Mechatronika (ME)",
  year="2024",
  month="December",
  pages="52--56",
  publisher="Institute of Electrical and Electronics Engineers Inc.",
  address="Brno",
  isbn="979-8-3503-9489-4"
}