Idea 3: Use special tokens at the input to condition the model for all desired tasks in a single model (language id, speech detection, transcription, translation). Create a "meta-language" of special tokens of a fixed schema that orchestrates the tasks/stages.
显示更多