Tongyi launches Fun-ASR1.5, focusing on dialect recognition

ME News: On April 20 (UTC+8), according to Dongcha Beating monitoring, Tongyi Laboratory released its speech recognition model Fun-ASR1.5. The model is available for API access on Alibaba Cloud Bailian and open for online testing in the ModelScope community.

According to the official announcement, this version uses a single model to cover 30 languages, seven major Chinese dialect groups, and more than 20 regional accents, rather than building a separate model for each dialect.

Tongyi’s internal evaluation shows that in typical dialect scenarios, the character error rate has decreased by 56.2% compared with the previous version. In addition, 5 dialects have reached accuracy above 90%, and 15 dialects have reached accuracy above 80%.
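For readers unfamiliar with the metric, the following is a minimal sketch of how character error rate (CER) and a relative reduction between two versions are commonly computed. It is not Tongyi's evaluation code. The function names are hypothetical, the numbers in the usage note are illustrative only, and it assumes the reported 56.2% is a relative reduction rather than a drop in percentage points, which the announcement does not spell out.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(old_cer: float, new_cer: float) -> float:
    """Fraction by which the error rate fell relative to the old value."""
    return (old_cer - new_cer) / old_cer
```

Under this reading, a hypothetical model whose CER fell from 0.20 to 0.10 would show `relative_reduction(0.20, 0.10)` of 0.5, that is, a 50% decrease, even though the absolute change is only 10 percentage points.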

Recognition of ancient poems and classical works has also been split out for dedicated optimization, with Tongyi reporting an internally measured character-level accuracy of 97%. All of these figures come from Tongyi’s own testing, not from third-party benchmarks.

The dialect “long tail,” historically the hardest part of Chinese speech recognition, is now being folded into a single set of capabilities that can be deployed directly for commercial use. For scenarios such as educational livestreams, local government hotlines, and interview transcription, integrators no longer need to maintain separate recognition pipelines for different regional accents, which simplifies deployment.

(Source: BlockBeats)
