When it comes to our state of mind and emotions, our faces can be quite telling. Facial expression is an essential aspect of nonverbal communication in humans. Even if we cannot explain how we do it, we can usually see in another person’s face how they are feeling. In many situations, reading facial expressions is particularly important. For example, a teacher might do it to check if their students are engaged or bored, and a nurse may do it to check if a patient’s condition has improved or worsened.
Thanks to advances in technology, computers can do a pretty good job when it comes to recognizing faces. Recognizing facial expressions, however, is a whole different story. Many researchers working in the field of artificial intelligence (AI) have tried to tackle this problem using various modeling and classification techniques, including the popular convolutional neural networks (CNNs). However, facial expression recognition is complex and calls for intricate neural networks, which require a lot of training and are computationally expensive.
In an effort to address these issues, a research team led by Dr. Jia Tian from Jilin Engineering Normal University in China have recently developed a new CNN model for facial expression recognition. As described in an article published in the Journal of Electronic Imaging, the team focused on striking a good balance between the training speed, memory usage, and recognition accuracy of the model.
One of the main differences between conventional CNN models and the one proposed by the team was the use of depth-wise separable convolutions. This type of convolution — the core operation performed at each layer of a CNN — differs from the standard one in that it processes different channels (such as RGB) of the input image independently and combines the results at the end.
By combining this type of convolution with a technique called “pre-activated residual blocks,” the proposed model was able to process input facial expressions in a coarse-to-fine manner. In this way, the team greatly reduced the computational cost and the necessary number of parameters to be learned by the system for accurate classification. “We managed to obtain a model with good generalization ability with as little as 58,000 parameters,” Tian said.
The researchers put their model to the test by comparing its facial expression recognition performance with that of other reported models in a classroom setting. They trained and tested all models using a popular dataset called the “Extended Cohn-Kanade dataset,” which contains over 35,000 labeled images of faces expressing common emotions. The results were encouraging, with the model developed by Tian’s team exhibiting the highest accuracy (72.4%) with the least number of parameters.
“The model we developed is particularly effective for facial expression recognition when using small sample datasets. The next step in our research is to further optimize the model’s architecture and achieve an even better classification performance,” Tian said.
Considering that facial expression recognition can be widely used in fields like human–computer interactions, safe driving, smart monitoring, surveillance, and medicine, let us hope that the team realizes their vision soon!