Conformer-2 is an upgraded speech recognition model that outperforms Conformer-1 in speed, recognition of alphanumerics and proper nouns, and noise robustness.
Conformer-2 is already the default model on AssemblyAI's API.
Conformer-2 has grown to 450 million parameters and was trained on 1.1 million hours of data, resulting in significant performance improvements across various domains and benchmarks.
Conformer-2 uses noisy student-teacher training to improve both the quality and the quantity of its training data.
System-side optimizations reduce latency and allow scaling beyond the 1 million hour data mark.
Conformer-2 uses an ensemble of teacher models and data filtering techniques to ensure high-quality pseudo-labels and avoid overfitting.
Conformer-2 is a state-of-the-art speech recognition model.
Conformer-2 focuses on recognizing alphanumerics and proper nouns.
The correct recognition of alphanumerics is crucial for the effectiveness of a speech recognition model.
Conformer-2 recognizes alphanumerics and proper nouns better than Conformer-1.
A new parameter, Speech Threshold, lets users control the cost of transcriptions on AssemblyAI.
With Speech Threshold, a file is transcribed only if it contains at least a specified minimum amount of speech, which reduces costs for sleep podcasts, music, and empty audio files.
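
The cost-control idea behind the speech threshold can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats the threshold as the fraction of a file that must be speech (a value between 0 and 1), and the request field names `audio_url` and `speech_threshold` are assumed here and should be checked against the AssemblyAI API reference. The helper functions are hypothetical and nothing is sent over the network.

```python
from typing import Any, Dict


def build_transcript_request(audio_url: str, speech_threshold: float) -> Dict[str, Any]:
    """Build a JSON-style body for a transcription request (nothing is sent).

    Assumption: the threshold is a fraction of speech in [0, 1], and the
    field names below match the API; verify both before relying on them.
    """
    if not 0.0 <= speech_threshold <= 1.0:
        raise ValueError("speech_threshold must be between 0 and 1")
    return {"audio_url": audio_url, "speech_threshold": speech_threshold}


def meets_speech_threshold(speech_seconds: float, total_seconds: float,
                           speech_threshold: float) -> bool:
    """Local illustration of the gating rule: is enough of the file speech?"""
    if total_seconds <= 0:
        return False  # empty audio never qualifies
    return speech_seconds / total_seconds >= speech_threshold


if __name__ == "__main__":
    body = build_transcript_request("https://example.com/episode.mp3", 0.5)
    print(body)
    # A 10-minute sleep podcast with 30 seconds of talking would be skipped.
    print(meets_speech_threshold(30, 600, 0.5))
    # A 10-minute interview with 7.5 minutes of talking would be processed.
    print(meets_speech_threshold(450, 600, 0.5))
```

In this sketch, files that are mostly music or silence fall below the threshold and are never transcribed, which is where the savings come from.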