Exemplar Hidden Markov Models for Classification of Facial Expressions in Videos





Fig 1: Flow diagram of the proposed approach. Each video is first embedded into an exemplar Hidden Markov Model (HMM). The Probability Product Kernel (PPK) is then used to compute the similarity between each pair of models. These kernel values are then used in a kernel-based SVM for classification.



Project Synopsis: Facial expressions are dynamic events composed of meaningful temporal segments. A common approach to facial expression recognition in video is to first convert variable-length expression sequences into a vector representation by computing summary statistics (max/mean pooling) of image-level features or of spatio-temporal features. These representations are then passed to a discriminative classifier such as a support vector machine (SVM). However, these approaches don't fully exploit the temporal dynamics of facial expressions. Hidden Markov Models (HMMs) provide a method for modeling variable-length expression time series. Although HMMs have been explored in the past for expression classification, they are rarely used, since their classification performance is often lower than that of discriminative approaches, which may be attributed to the challenges of estimating generative models.
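The loss of temporal information under pooling can be illustrated with a small sketch. The two-dimensional per-frame features and the sequence names below are made up for illustration and are not taken from the publication; the point is only that a sequence and its time-reversed copy pool to the same vector.

```python
def mean_pool(frames):
    """Average each feature dimension over all frames."""
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]


def max_pool(frames):
    """Take the maximum of each feature dimension over all frames."""
    return [max(f[d] for f in frames) for d in range(len(frames[0]))]


# Hypothetical per-frame features: an expression building up (onset)
# and the same frames played backwards (offset).
onset = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
offset = list(reversed(onset))

# Pooling ignores frame order, so both sequences get identical descriptors.
same_mean = mean_pool(onset) == mean_pool(offset)
same_max = max_pool(onset) == max_pool(offset)
```

Both comparisons evaluate to True, so a classifier that sees only pooled vectors cannot tell these two sequences apart.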


Fig 2: A facial expression video modeled with an HMM. A hidden state (shown on top) is assigned to each observation, and the hidden states form a Markov chain. The HMM is able to represent the video as two distinct sub-events (neutral and apex).

Idea: This work explores an approach for combining the modeling strength of HMMs with the discriminative power of SVMs via a model-based similarity framework. Each example is first instantiated as an Exemplar-HMM model. A probabilistic kernel is then used to compute a kernel matrix, which is used with an SVM classifier. This paper proposes that dynamical models such as HMMs are advantageous for the facial expression problem space when employed in a discriminative, exemplar-based classification framework. The approach yields state-of-the-art results on both posed (CK+ and OULU-CASIA) and spontaneous (FEEDTUM and AM-FED) expression datasets, highlighting the performance advantages of the approach.
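The kernel step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses discrete-emission HMMs stored as plain (pi, A, B) tuples, a fixed sequence length T, and the PPK with exponent 1 (the expected likelihood kernel), for which the recursion below is exact. The publication's features, emission model, and kernel settings may differ, and all function and variable names here are hypothetical.

```python
from itertools import product
from math import sqrt

# An HMM is a hypothetical (pi, A, B) tuple of plain lists:
# pi[i] = start probability, A[i][j] = transition i -> j,
# B[i][x] = probability that state i emits symbol x.


def sequence_likelihood(hmm, obs):
    """Forward algorithm: P(obs) under a discrete-emission HMM."""
    pi, A, B = hmm
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for x in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][x]
                 for j in range(n)]
    return sum(alpha)


def expected_likelihood_kernel(hmm_p, hmm_q, T):
    """Sum over all length-T sequences x of p(x) * q(x), via a joint recursion."""
    pi_p, A_p, B_p = hmm_p
    pi_q, A_q, B_q = hmm_q
    n_p, n_q, n_sym = len(pi_p), len(pi_q), len(B_p[0])
    # psi[i][j]: overlap between the emission distributions of state i and state j.
    psi = [[sum(B_p[i][x] * B_q[j][x] for x in range(n_sym))
            for j in range(n_q)] for i in range(n_p)]
    # phi[i][j]: accumulated mass of ending in the state pair (i, j).
    phi = [[pi_p[i] * pi_q[j] * psi[i][j] for j in range(n_q)]
           for i in range(n_p)]
    for _ in range(T - 1):
        phi = [[psi[i][j] * sum(phi[k][l] * A_p[k][i] * A_q[l][j]
                                 for k in range(n_p) for l in range(n_q))
                for j in range(n_q)] for i in range(n_p)]
    return sum(sum(row) for row in phi)


def brute_force_kernel(hmm_p, hmm_q, T):
    """Same quantity by enumerating every sequence; only feasible for tiny T."""
    n_sym = len(hmm_p[2][0])
    return sum(sequence_likelihood(hmm_p, x) * sequence_likelihood(hmm_q, x)
               for x in product(range(n_sym), repeat=T))


def normalized_gram(hmms, T):
    """Kernel matrix scaled so that every model has unit self-similarity."""
    raw = [[expected_likelihood_kernel(p, q, T) for q in hmms] for p in hmms]
    m = len(hmms)
    return [[raw[i][j] / sqrt(raw[i][i] * raw[j][j]) for j in range(m)]
            for i in range(m)]


# Two toy exemplar models over symbols {0: neutral-like, 1: apex-like}.
onset_hmm = ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]])
flat_hmm = ([1.0, 0.0], [[0.95, 0.05], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]])

gram = normalized_gram([onset_hmm, flat_hmm], T=4)
```

A Gram matrix of this kind, computed over one exemplar model per training video, could then be handed to any SVM implementation that accepts a precomputed kernel (for example, scikit-learn's SVC with kernel="precomputed"). The brute-force function is included only as a sanity check on the recursion.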

Related publication: Sikka K., Dhall A., Bartlett M. (2015). Exemplar Hidden Markov Models for Classification of Facial Expressions in Videos. IEEE Conference on Computer Vision and Pattern Recognition, Workshop on Analysis and Modeling of Faces and Gestures. [PDF]

Results: Results are shown on two datasets. Please refer to the publication for more details.


Code: