Project Detail
SHISSM (Sparse and Hierarchical Structures for Speech Modeling) aims to investigate and integrate emerging research areas in the context of speech modeling, including: (1) Deep Neural Networks (DNNs); (2) posterior-based features and systems (as usually derived from DNN outputs); (3) sparse coding, seeking sparse representations of the processed signals; (4) compressive sensing and sparse recovery, aiming at modeling the speech signal in high-dimensional sparse spaces, which usually results in simpler processing (e.g., recognition) algorithms; and (5) full exploitation of modern compute resources (big data, large-scale GPU-based processing).

Keywords: Automatic Speech Recognition (ASR), hierarchical posterior-based ASR, deep architectures, deep neural networks (DNN), sparse recovery modeling, hierarchical sparse coding, human auditory modeling, speech intelligibility modeling.
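Two of the themes above, sparse coding and compressive sensing with sparse recovery, can be illustrated with a minimal sketch: recovering a high-dimensional sparse signal from far fewer linear measurements than unknowns. The example below applies ISTA (iterative soft-thresholding) to the lasso objective; all dimensions and parameter values are illustrative assumptions, not taken from the project.

```python
import numpy as np

# Hedged sketch (not from the project text): recover a sparse vector x
# from compressed measurements y = A @ x via ISTA, i.e. proximal
# gradient descent on the lasso objective
#   0.5 * ||A @ x - y||^2 + lam * ||x||_1
# All sizes and parameters below are illustrative choices.

rng = np.random.default_rng(0)
n_meas, n_dim, n_nonzero = 60, 200, 5

# Random measurement matrix with roughly unit-norm columns.
A = rng.standard_normal((n_meas, n_dim)) / np.sqrt(n_meas)

# Ground-truth sparse signal: 5 active coefficients with magnitude >= 1.
x_true = np.zeros(n_dim)
support = rng.choice(n_dim, size=n_nonzero, replace=False)
x_true[support] = 1.0 + rng.random(n_nonzero)

y = A @ x_true  # compressed (under-determined) measurements

# ISTA: gradient step on the quadratic term, then soft-thresholding.
lam = 0.05
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
x = np.zeros(n_dim)
for _ in range(1000):
    z = x - step * (A.T @ (A @ x - y))
    x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)

# x should now be close to x_true despite 200 unknowns vs 60 equations,
# which is the simplification sparse modeling promises for processing.
```

The key design point is that the l1 penalty drives most coefficients exactly to zero, so the under-determined system becomes well-posed once the solution is known to be sparse.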
Lay summary
SHISSM (Sparse and Hierarchical Structures for Speech Modeling) aims to investigate and integrate, in the context of speech modeling, the most recent and often complementary techniques, namely: (1) artificial neural networks, and more particularly deep neural networks (DNNs); (2) systems based entirely on hierarchical posterior distributions; (3) sparse coding and sparse representations in high-dimensional spaces; (4) compressive sampling and sparse recovery; (5) exploitation of computing resources (GPUs) and increasingly large data resources.
In this context, SHISSM will have to develop a better theoretical understanding within which deep hierarchical architectures, sparse (possibly binary) representations, advanced methods of statistical modeling, and the links recently established between sparse coding and models such as Hidden Markov Models (HMMs) can be formally integrated.
SHISSM is therefore a highly multidisciplinary project, integrating these different approaches in the context of speech modeling in general, and speech recognition in particular, even though the impact of this project should extend well beyond speech.