Title: Deep neural networks for acoustic modeling in speech recognition
Author: Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, and Brian Kingsbury
Novelties:
This paper applies deep neural network architectures to acoustic modeling.
Contributions:
Most current speech recognition systems use hidden Markov models (HMMs) to model temporal information.
They also use Gaussian mixture models (GMMs) to measure how well each HMM state fits a frame of acoustic input; such a system is known as a GMM-HMM system.
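As a rough illustration of what "fitness between a state and a frame" means in a GMM-HMM system, the sketch below computes the log-likelihood of one frame under one state's mixture. It assumes diagonal covariances, and the function names, mixture parameters, and two-dimensional frames are all hypothetical values chosen for the example, not taken from the paper.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(x, weights, means, variances):
    """log p(x | state) for one HMM state's mixture, using log-sum-exp."""
    terms = [
        math.log(w) + diag_gaussian_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Hypothetical two-component mixture over 2-dimensional frames.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

near = gmm_loglik([0.1, -0.2], weights, means, variances)   # close to a component mean
far = gmm_loglik([10.0, 10.0], weights, means, variances)   # far from both components
```

A frame near one of the component means scores a higher log-likelihood than a distant one, and that score is what the HMM uses to decide how well the state explains the frame.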
This paper presents a method for using deep belief nets with HMMs, and shows that the resulting DNN-HMM system outperforms the GMM-HMM system in many respects.
Technical Summary:
The restricted Boltzmann machine (RBM) contains a visible layer and a stochastic binary hidden layer. The two layers are joined by undirected connections. Once the current RBM is trained, its hidden-layer activations serve as the input data for training the next RBM.
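To make the RBM training step more concrete, here is a minimal sketch of one common way to train an RBM: a single step of contrastive divergence (CD-1). It assumes binary visible units, whereas real-valued acoustic features would need a Gaussian visible layer. The function names, learning rate, layer sizes, and toy data are all illustrative assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_probs(v, W, b_hid):
    """p(h_j = 1 | v) per hidden unit; W[i][j] links visible i to hidden j."""
    return [
        sigmoid(b_hid[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        for j in range(len(b_hid))
    ]

def visible_probs(h, W, b_vis):
    """p(v_i = 1 | h) per visible unit."""
    return [
        sigmoid(b_vis[i] + sum(h[j] * W[i][j] for j in range(len(h))))
        for i in range(len(b_vis))
    ]

def sample(probs, rng):
    """Draw a binary state for each unit from its activation probability."""
    return [1.0 if rng.random() < p else 0.0 for p in probs]

def cd1_step(v0, W, b_vis, b_hid, rng, lr=0.1):
    """One in-place CD-1 update for a single binary training vector."""
    ph0 = hidden_probs(v0, W, b_hid)
    h0 = sample(ph0, rng)
    v1 = visible_probs(h0, W, b_vis)      # reconstruction (probabilities)
    ph1 = hidden_probs(v1, W, b_hid)
    for i in range(len(v0)):
        for j in range(len(ph0)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
    for i in range(len(v0)):
        b_vis[i] += lr * (v0[i] - v1[i])
    for j in range(len(ph0)):
        b_hid[j] += lr * (ph0[j] - ph1[j])

# Hypothetical toy setup: 4 visible units, 3 hidden units, two binary patterns.
rng = random.Random(0)
n_vis, n_hid = 4, 3
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hid)] for _ in range(n_vis)]
b_vis = [0.0] * n_vis
b_hid = [0.0] * n_hid
data = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
for _ in range(200):
    for v in data:
        cd1_step(v, W, b_vis, b_hid, rng)

# Hidden activations of the trained RBM become the next RBM's training data.
features = [hidden_probs(v, W, b_hid) for v in data]
```

The last line is the hand-off described above: `features` plays the role of the "visible" data when the next RBM in the stack is trained.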
After that, the stack of RBMs can be viewed as a deep belief net (DBN) by replacing the undirected connections of the lower RBMs with directed ones.
The final step is to add a softmax layer on top of it to predict HMM states.
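Once the stack has a softmax layer on top, it can be used as an ordinary feed-forward network. The sketch below shows only that forward pass, assuming sigmoid hidden units and weights that would in practice come from the pretrained RBMs. The function name `dnn_state_posteriors`, the layer sizes, and the weight values are hypothetical and chosen only to keep the example small.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(x, W, b):
    """Return b + x @ W for W stored as W[input][output]."""
    return [
        b[j] + sum(x[i] * W[i][j] for i in range(len(x)))
        for j in range(len(b))
    ]

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def dnn_state_posteriors(frame, hidden_layers, out_W, out_b):
    """Feed a frame through the sigmoid layers, then the softmax output layer."""
    act = frame
    for W, b in hidden_layers:
        act = [sigmoid(z) for z in affine(act, W, b)]
    return softmax(affine(act, out_W, out_b))

# Hypothetical toy sizes: 3 inputs, hidden layers of 4 and 2 units, 3 HMM states.
hidden_layers = [
    ([[0.1] * 4 for _ in range(3)], [0.0] * 4),
    ([[0.2] * 2 for _ in range(4)], [0.0] * 2),
]
out_W = [[0.5, -0.5, 0.0], [0.3, 0.1, -0.2]]
out_b = [0.0, 0.0, 0.0]

posteriors = dnn_state_posteriors([1.0, 0.0, 1.0], hidden_layers, out_W, out_b)
```

The output is a probability distribution over the HMM states for the given frame, so its entries are positive and sum to one.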

