AIAny
Category

Explore by categories

Copyright © 2025 All Rights Reserved.
  • All
  • AI Leaderboard
  • AI Agent Tutorials
  • AI Coding Tutorials
  • AI Agent Papers
  • Chatbot
  • Machine Learning Foundation Books
  • AI Train
  • AI Deploy
  • AI Client
  • Machine Learning Foundation Papers
  • Machine Learning Foundation Tutorials
  • AI Image Demos
  • AI Agent
  • Large Language Model Tutorials
  • Large Language Model Papers
  • Machine Learning Engineering Papers
  • Computer Vision Tutorials
  • Computer Vision Papers
  • Natural Language Processing Papers
  • Reinforcement Learning Papers
  • Speech Technology Papers
  • AI API
  • AI Coding
  • AI Image
  • AI Video
  • MLOps
  • MCP Client
  • MCP Server
  • AI Video Papers
  • AI Audio
  • AI Infra

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

2015
Dario Amodei, Rishita Anubhai, and 32 others

This paper presents Deep Speech 2, an end-to-end deep learning system for automatic speech recognition (ASR) that handles two very different languages, English and Mandarin. It replaces the traditional hand-engineered ASR pipeline with neural networks and reaches transcription accuracy competitive with human workers on several standard datasets. High-performance computing (HPC) techniques give a 7x speedup over the authors' previous system, which shortens experiment turnaround. Key techniques include Batch Normalization applied to RNNs, a curriculum learning strategy called SortaGrad, and Batch Dispatch, a scheme for batching requests on GPUs in deployment. The results show that end-to-end learning can cope with varied speech conditions, including noise, accents, and different languages, a significant step toward general-purpose speech recognition.

30u30, paper, audio, ASR
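
The SortaGrad curriculum named in the summary above can be illustrated with a short sketch. In the paper, the first training epoch visits minibatches in increasing order of utterance length, and later epochs use a random order over minibatches. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sortagrad_batches`, its parameters, and the choice to keep length-sorted minibatches fixed across epochs are all hypothetical.

```python
import random
from typing import List, Sequence


def sortagrad_batches(
    lengths: Sequence[int], batch_size: int, epoch: int, seed: int = 0
) -> List[List[int]]:
    """Return minibatches of utterance indices for one epoch (hypothetical helper)."""
    # Sort indices by utterance length so each minibatch holds utterances
    # of similar duration (assumption: batches are formed once, by length).
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    batches = [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
    if epoch > 0:
        # After the first epoch, visit the same minibatches in random order.
        random.Random(seed + epoch).shuffle(batches)
    return batches


# Example: six utterances with lengths in frames (made-up values).
lengths = [5, 1, 3, 2, 4, 6]
first_epoch = sortagrad_batches(lengths, batch_size=2, epoch=0)
later_epoch = sortagrad_batches(lengths, batch_size=2, epoch=1)
```

In epoch 0 the shortest utterances come first. From epoch 1 onward the same minibatches are visited in a shuffled order.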