Tongyi launches Fun-ASR1.5, focusing on dialect recognition

ME News report: On April 20 (UTC+8), according to Dongcha Beating monitoring, Tongyi Laboratory released its speech recognition model Fun-ASR1.5. The model is available for API access on Alibaba Cloud Bailian and can be tried online in the ModelScope community. According to Tongyi, this version uses a single model to cover 30 languages, seven major Chinese dialect groups, and more than 20 regional accents, so separate models no longer need to be built for each dialect. Tongyi's internal evaluation shows the character error rate in typical dialect scenarios falling by 56.2% compared with the previous version; 5 dialects reach accuracy above 90%, and 15 reach above 80%. Classical poetry recognition received dedicated optimization, with a reported internal character-level accuracy of 97%. All of these figures come from Tongyi's own testing, not from third-party benchmarks. Dialects, the long tail that is hardest to handle in Chinese speech recognition, are now part of a single capability set that can be used directly in commercial applications. For scenarios such as educational live streaming, local government hotlines, and interview transcription and editing, integrators no longer need to split recognition into multiple pipelines by regional accent, which simplifies deployment. (Source: BlockBeats)
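For readers unfamiliar with the metric, the sketch below shows how a character error rate and a relative reduction between two versions are commonly computed. It is only an illustration: the report does not say how Tongyi measures CER or whether the 56.2% figure is relative or absolute, so the standard edit-distance definition and a relative reading are assumed here. The function names and the sample transcripts are hypothetical and are not taken from Fun-ASR or its evaluation data.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(old: float, new: float) -> float:
    """Fraction by which the error rate dropped relative to the old value."""
    if old <= 0:
        raise ValueError("old error rate must be positive")
    return (old - new) / old


# Hypothetical transcripts, invented for illustration only.
reference = "床前明月光疑是地上霜"
old_output = "窗前明月光疑是地上双"   # two wrong characters
new_output = "床前明月光疑是地上双"   # one wrong character

old_cer = cer(reference, old_output)
new_cer = cer(reference, new_output)
print(f"old CER={old_cer:.2f}, new CER={new_cer:.2f}, "
      f"relative reduction={relative_reduction(old_cer, new_cer):.1%}")
```

Under the relative reading, a 56.2% reduction would mean the new error rate is a little under half of the old one; it would not mean that accuracy rose by 56.2 percentage points.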