The accuracy of these emotion-related metrics suggests that EQ-Radio’s emotion recognition accuracy will be on par with contact-based techniques, as we indeed show in Section 7.2.

7.2. Evaluation of emotion recognition
We evaluate EQ-Radio’s ability to recognize emotions.

Experimental Setup. Participants: We recruited 12 participants (6 females). Among them, 6 participants (3 females) have 3 to 7 years of acting experience. People with acting experience are more skilled in emotion management, which helps in gathering high-quality emotion data and provides a reference group.36 All subjects were compensated for their participation, and all experiments were approved by our IRB.

Experiment design: Obtaining high-quality data for emotion analysis is difficult, especially in terms of identifying the ground-truth emotion.36 Thus, it is crucial to design experiments carefully. We designed our experiments in accordance with previous work on emotion recognition using physiological signals.26,36 Specifically, before the experiment, each subject individually prepares stimuli (e.g., personal memories, music, photos, and videos); during the experiment, the subject sits alone in one of the five conference rooms and elicits a certain emotional state using the prepared stimuli. Some of these emotions are associated with small movements such as laughing, crying, and smiling. After the experiment, the subject reports the period during which she/he felt that type of emotion. Data collected during the corresponding period are labeled with the subject’s reported emotion.

Throughout these experiments, each subject is monitored using three systems: (1) EQ-Radio, (2) an AD8232 ECG monitor, and (3) a video camera focused on the subject’s face.

Ground Truth: As described above, subjects are instructed to evoke a particular emotion and report the period during which they felt that emotion. The subject’s reported emotion is used to label the data from the corresponding period. These labels provide the ground truth for training and testing our classifiers.

Metrics & Visualization: When tested on a particular data point, the classifier outputs a score for each of the four emotional states. The data point is assigned the emotion that corresponds to the highest score. The classification accuracy is the percentage of test data points that are assigned the correct emotion.

We visualize the output of the classification as follows. Recall that the four emotions in our system can be represented in a 2D plane whose axes are valence and arousal. Each emotion occupies one of the four quadrants: Sadness (negative valence and negative arousal), Anger (negative valence and positive arousal), Pleasure (positive valence and negative arousal), and Joy (positive valence and positive arousal). Thus, we can visualize the classification result for a particular test data point by showing it in the 2D valence-arousal space. If the point is classified correctly, it would fall in the quadrant of its ground-truth emotion.
For any data point, we calculate the valence and arousal scores as

    S_valence = max(S_joy, S_pleasure) − max(S_sadness, S_anger)
    S_arousal = max(S_joy, S_anger) − max(S_pleasure, S_sadness)

where S_joy, S_pleasure, S_sadness, and S_anger are the classification scores output by the classifier for the four emotions. For example, consider a data point with the scores S_joy = 1, S_pleasure = 0, S_sadness = 0, and S_anger = 0; that is, one unit of pure joy. Such a data point falls on the diagonal in the upper-right (joy) quadrant. A data point with a high joy score but small scores for the other emotions would still fall in the joy quadrant, but not on the diagonal.
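To make this mapping concrete, the following minimal sketch computes the two scores and the resulting quadrant from a set of per-emotion classifier scores, along with the highest-score decision rule described under Metrics & Visualization above. The function names and score dictionary are ours for illustration, not EQ-Radio’s implementation.

```python
# Sketch of the score-to-plane mapping described above.
# Names and data layout are illustrative, not EQ-Radio's code.

def to_valence_arousal(s):
    """Map per-emotion classifier scores to (valence, arousal)."""
    valence = max(s["joy"], s["pleasure"]) - max(s["sadness"], s["anger"])
    arousal = max(s["joy"], s["anger"]) - max(s["pleasure"], s["sadness"])
    return valence, arousal

def quadrant(valence, arousal):
    """Name the emotion quadrant a point falls in."""
    if valence >= 0:
        return "joy" if arousal >= 0 else "pleasure"
    return "anger" if arousal >= 0 else "sadness"

# One unit of pure joy lands on the diagonal of the joy quadrant.
scores = {"joy": 1.0, "pleasure": 0.0, "sadness": 0.0, "anger": 0.0}
predicted = max(scores, key=scores.get)   # emotion with the highest score
v, a = to_valence_arousal(scores)
assert predicted == "joy" and (v, a) == (1.0, 1.0) and quadrant(v, a) == "joy"
```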
EQ-Radio’s emotion recognition accuracy. To evaluate EQ-Radio’s emotion classification accuracy, we collect 400 two-minute signal sequences from 12 subjects, 100 sequences for each emotion. We train two types of emotion classifiers: a person-dependent classifier and a person-independent classifier. Each person-dependent classifier is trained and tested on data from a particular subject; training and testing are done on mutually exclusive data points using leave-one-out cross-validation.14 The person-independent classifier is trained on 11 subjects and tested on the remaining subject, and the process is repeated for each test subject.
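As a rough illustration of these two evaluation protocols, the sketch below uses scikit-learn’s LeaveOneOut and LeaveOneGroupOut splitters on placeholder data. The features, labels, and classifier (an SVM here) are stand-ins of our choosing, not the paper’s actual feature pipeline.

```python
# Sketch of person-dependent vs. person-independent evaluation.
# X, y, and subjects are toy placeholders.
import numpy as np
from sklearn.model_selection import LeaveOneOut, LeaveOneGroupOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 120
X = rng.normal(size=(n, 8))                      # placeholder feature vectors
y = np.tile(["joy", "pleasure", "sadness", "anger"], n // 4)
subjects = np.repeat(np.arange(12), n // 12)     # subject id per sequence

clf = SVC(kernel="rbf")                          # stand-in classifier

# Person-dependent: train and test within one subject's data,
# holding out one data point at a time (leave-one-out).
mask = subjects == 0
pd_acc = cross_val_score(clf, X[mask], y[mask], cv=LeaveOneOut()).mean()

# Person-independent: train on 11 subjects, test on the held-out
# subject, repeated for every subject (leave-one-subject-out).
pi_acc = cross_val_score(clf, X, y, groups=subjects, cv=LeaveOneGroupOut()).mean()

print(f"person-dependent: {pd_acc:.2f}, person-independent: {pi_acc:.2f}")
```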
We first report the person-dependent classification results. Using the valence and arousal scores as coordinates, we visualize the person-dependent classification results in Figure 6, where different point types indicate different emotions. We observe that the emotions are well clustered and segregated, suggesting that they are distinctly encoded in valence and arousal and can be decoded from the features captured by EQ-Radio. We also observe that the points tend to cluster along the diagonal and anti-diagonal, showing that our classifiers have high confidence in their predictions. Finally, the figure also shows the classification accuracy for each subject; the average accuracy is 87.0%.
The results of person-independent emotion classification are shown in Figure 7. EQ-Radio can recognize a subject’s emotion with an average accuracy of 72.3% based purely on data from other subjects, meaning that EQ-Radio succeeds in learning person-independent features for emotion recognition. As expected, the accuracy of person-independent classification is lower than that of person-dependent classification. This is because person-independent emotion recognition is intrinsically more challenging: an emotional state is a rather subjective conscious experience that can differ significantly from one subject to another. We note, however, that our accuracy results are consistent with the literature for both person-dependent and person-independent emotion classification.21 Further, our results present the first demonstration of RF-based emotion classification.
To better understand the classification errors, we show the confusion matrices of both the person-dependent and person-independent classification results in Figure 8. We find that EQ-Radio achieves comparable accuracy in recognizing each of the four emotions. We also observe that EQ-Radio typically makes fewer errors between emotion pairs that differ in both valence and arousal (i.e., joy vs. sadness and pleasure vs. anger).
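A confusion matrix of this kind can be tabulated directly from ground-truth labels and classifier predictions. The sketch below uses scikit-learn with toy labels as placeholders; it is not the paper’s evaluation code.

```python
# Sketch: tabulating a four-emotion confusion matrix, as in Figure 8.
# y_true and y_pred are toy placeholders for per-sequence labels.
from sklearn.metrics import confusion_matrix

emotions = ["joy", "pleasure", "sadness", "anger"]
y_true = ["joy", "joy", "sadness", "anger", "pleasure", "anger"]
y_pred = ["joy", "pleasure", "sadness", "anger", "pleasure", "sadness"]

cm = confusion_matrix(y_true, y_pred, labels=emotions)
# Row i, column j: how often true emotion i was classified as emotion j.
for emo, row in zip(emotions, cm):
    print(f"{emo:>9}: {row}")
```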