OpenVokaturi 2.0

2 January 2017

We increased the number of cues from 5 to 9; this raised the quality by approximately 5 percent. We now train the network on two databases (EmoDB and Savee: 768 recordings annotated for the five main emotions) instead of only on EmoDB; this greatly improved generalization to new sounds. The classifier is no longer based on linear discriminant analysis; instead, we now use an artificial neural network that converts 9 cues into 5 emotion probabilities via two hidden layers of 100 and 20 nodes, respectively. The classification accuracy on 5 emotions is now 66.5 percent on the two databases, as measured by leave-one-out cross-validation repeated 3 times (i.e. training 3 times on each possible subset of 767 recordings and each time testing on the remaining single recording).
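
To make the shape of the network and of the validation scheme concrete, here is a minimal Python sketch. Only the layer sizes (9, 100, 20, 5) and the number of recordings (768) come from the text above. The tanh hidden units, the softmax output, the random untrained weights, and all function names are assumptions made for illustration; this is not the OpenVokaturi implementation.

```python
import math
import random

# Layer sizes from the text: 9 cues -> 100 nodes -> 20 nodes -> 5 emotions.
LAYER_SIZES = [9, 100, 20, 5]


def init_network(sizes, seed=0):
    """Create random (untrained) weights and zero biases for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def softmax(values):
    """Turn arbitrary scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, cues):
    """Map 9 cue values to 5 probabilities.

    Assumption: tanh in the hidden layers and softmax at the output;
    the text does not say which activation functions are used.
    """
    activations = list(cues)
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activations = softmax(sums) if is_output else [math.tanh(s) for s in sums]
    return activations


def leave_one_out(n_items):
    """Yield (training indices, held-out index) once for every item."""
    for held_out in range(n_items):
        train = [i for i in range(n_items) if i != held_out]
        yield train, held_out


network = init_network(LAYER_SIZES)
probabilities = forward(network, [0.0] * 9)  # placeholder cues, not real data
n_parameters = sum(len(w) * len(w[0]) + len(b) for w, b in network)
```

With these layer sizes the sketch holds 3125 weights and biases in total. Calling leave_one_out(768) yields 768 splits of 767 training recordings and one test recording each; repeating the whole procedure 3 times, as described above, would therefore mean training 3 × 768 = 2304 networks.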
