Is there a labeling tool that lets me create audio-transcription pairs in both directions:
- record audio for given text
- create transcriptions for given audio?
What I have tried so far:
I went through the list "Tool for labeling audio" (amongst others) and found lots of tools that offer the second direction (transcribing given audio) but not the first (recording audio for given text). The labeling tool labelstud.io looks quite promising, but it seems you cannot record audio for a given text with it.
The only tool I found is https://github.com/common-voice, but that seems quite heavyweight and inflexible. I would like a tool that multiple users can use easily and that can easily be extended (ideally with Python, and without the need to install a database), e.g. by automatically feeding in the available audio or text and storing the corresponding user feedback (the transcription or the audio, respectively) in files.
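
To make the file-based requirement more concrete, here is a minimal sketch of the kind of bookkeeping I have in mind. The directory layout (one `.txt` and one `.wav` file per item, matched by file name) and the function names `pending_tasks` and `save_transcription` are assumptions made up for illustration; they do not come from any existing tool.

```python
from pathlib import Path


def pending_tasks(data_dir):
    """Pair files by stem and report which half of each pair is still missing."""
    data_dir = Path(data_dir)
    texts = {p.stem for p in data_dir.glob("*.txt")}
    audios = {p.stem for p in data_dir.glob("*.wav")}
    return {
        # Text exists, audio still has to be recorded by a user.
        "record": sorted(texts - audios),
        # Audio exists, transcription still has to be written by a user.
        "transcribe": sorted(audios - texts),
    }


def save_transcription(data_dir, stem, text):
    """Store a user's transcription next to the audio file it belongs to."""
    path = Path(data_dir) / f"{stem}.txt"
    path.write_text(text, encoding="utf-8")
    return path
```

A tool built along these lines would only need a shared folder: it would show users the items listed under `"record"` or `"transcribe"` and write their input back as a plain file, with no database involved.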