Making an Emotion-Aware Music Player


A sequel to a previous post about emotion recognition. Let's apply the emotion recognition model and build a music player that plays songs fitting your mood: that extra death metal when you're pissed, and some Grieg when you're happy.


Read this: The code in this tutorial is licensed under the GNU 3.0 open source license and you are free to modify and redistribute the code, given that you give others you share the code with the same right, and cite my name (use citation format below). You are not free to redistribute or modify the tutorial itself in any way. By reading on you agree to these terms. If you disagree, please navigate away from this page.
Citation format
van Gent, P. (2016). Emotion Recognition With Python, OpenCV and a Face Dataset. A tech blog about fun things with Python and embedded electronics. Retrieved from: http://www.paulvangent.com/2016/06/30/making-an-emotion-aware-music-player/


Getting Started
You get home from work, power up your laptop and ask it to play you some music. The computer knows which music you like because it houses your music library, so that goes well. But you just had a tough day at work and are still a bit pissed, so from your entire music library you really only want to hear a subset (that one death metal song). It could also be that you're over the moon and want to hear happy music. It would be cool to do this automatically, so let's give this a go!
We already briefly looked at how to recognize emotions in a previous post. There we found that it was possible to train a generalizable model on facial images from many different people and classify "anger", "happiness" and "sadness" from each other with about 77.2% accuracy. With a bigger dataset this accuracy is likely to increase, and when the training set is replaced or augmented with facial images of your own emotions it will increase quite a bit further. First let's think for a bit about what we really need. We need at least to be able to:

  • detect a face on the webcam and pre-process the image of the face;
  • grab some images of your face and dynamically update the model over time;
  • detect the emotion on your face;
  • pick a random song linked to that emotion and play it.

That seems quite manageable, so let's see if we can cook up something!


Detecting your face on the webcam
First let's write a script that can detect faces on your webcam stream. This has been done a million times already so we won't dwell on it too long. I've commented the code for those who may be less familiar with OpenCV or this boilerplate-type code.
We will use a pre-trained Haar cascade classifier supplied with OpenCV. Get it from your OpenCV folder in "opencv\sources\data\haarcascades\", it's called 'haarcascade_frontalface_default.xml'.
Do it like this:

import cv2
import numpy as np
video_capture = cv2.VideoCapture(0) #the default webcam (0, the first webcam device detected). Change if you have more than one webcam connected and want to use another one than the default one
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml") #load the trained classifier model
while True:
    ret, frame = video_capture.read() #Grab frame from webcam. Ret is 'true' if the frame was successfully grabbed.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) #Convert image to grayscale to improve detection speed and accuracy
    #Run classifier on frame
    face = facecascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=20, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)
    for (x, y, w, h) in face: #Draw rectangle around detected faces
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 0, 255), 2) #draw it on the colour image "frame", with arguments: (top-left corner), (bottom-right corner), (BGR colour, here red), line thickness 2
    cv2.imshow("webcam", frame) #Display frame
    if cv2.waitKey(1) & 0xFF == ord('q'): #imshow expects a termination definition in order to work correctly, here it is bound to key 'q'
        break

If all goes well this should result in a red rectangle around all faces visible on the webcam feed. However, in my case there was strong sunlight coming from the side from a window, which created high contrasts and strong shadows on my face that hampered detection. A possible solution is to equalize the histogram a bit using an adaptive equalization technique. Adding this to the code we get something like:

import cv2
import numpy as np
video_capture = cv2.VideoCapture(0)
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
while True:
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8)) #Create CLAHE object
    clahe_image = clahe.apply(gray) #Apply CLAHE to grayscale image from webcam
    face = facecascade.detectMultiScale(clahe_image, scaleFactor=1.1, minNeighbors=10, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)
    for (x, y, w, h) in face: #Draw rectangle around detected faces
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 0, 255), 2) #draw it on "frame", (top-left corner), (bottom-right corner), (BGR colour, here red), thickness 2
    cv2.imshow("webcam", frame) #Display frame
    if cv2.waitKey(1) & 0xFF == ord('q'): #imshow expects a termination definition to work correctly, here it is bound to key 'q'
        break
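
One stumbling block: if the classifier xml is not in your working directory, OpenCV only fails later with a cryptic assertion error inside detectMultiScale (typically complaining about "!empty()"). A small sanity check right after loading the cascade catches this early; this check is an optional addition, not part of the scripts above:

facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
if facecascade.empty(): #True if the xml could not be found or loaded
    raise IOError("haarcascade_frontalface_default.xml not found, check the path")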


Processing the face
Great, we can detect faces! Before asking the classifier what emotion the face is displaying we need to crop and standardize it. The code I use in this tutorial is an adaptation of the one from the previous emotion recognition post. This extracts the face from the webcam stream and crops it, like this:

import cv2
import numpy as np
video_capture = cv2.VideoCapture(0)
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
def crop_face(gray, face): #Crop the given face
    for (x, y, w, h) in face:
        faceslice = gray[y:y+h, x:x+w]
    return faceslice
while True:
    ret, frame = video_capture.read() #Grab frame from webcam. Ret is 'true' if the frame was successfully grabbed.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) #Convert image to grayscale to improve detection speed and accuracy
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    clahe_image = clahe.apply(gray)
    #Run classifier on frame
    face = facecascade.detectMultiScale(clahe_image, scaleFactor=1.1, minNeighbors=15, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)
    for (x, y, w, h) in face: #Draw rectangle around detected faces
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 0, 255), 2) #draw it on "frame", (top-left corner), (bottom-right corner), (BGR colour, here red), thickness 2
    if len(face) == 1: #Only use the frame if exactly one face is detected; zero or multiple faces counts as a measurement error (unless there really are multiple people in view)
        faceslice = crop_face(gray, face) #slice face from image
        cv2.imshow("detect", faceslice) #display sliced face
    else:
        print("no/multiple faces detected, passing over frame")
    cv2.imshow("webcam", frame) #Display frame
    if cv2.waitKey(1) & 0xFF == ord('q'): #imshow expects a termination definition to work correctly, here it is bound to key 'q'
        break

If all works well you should get two windows: one with the webcam stream, another with the cropped grayscale face. This cropped window updates only when a face is successfully detected in the webcam stream.
To predict the emotion accurately we might want to use more than one facial image. Blurred images are one error source (especially in low light conditions); incorrect classification of a good image is another. Assuming that at least some of these detection errors are randomly distributed through the results, averaging the classification over multiple images (let's say: 10) will improve results without much added hassle or extra code, and it takes care of the blurred image problem too (unless the majority of the images are blurred..). Just store the detected faces in a dict{} object and test against a length criterion. Also adapt the crop_face() function, like this:

facedict = {} #Create face dictionary
def crop_face(gray, face):
    for (x, y, w, h) in face:
        faceslice = gray[y:y+h, x:x+w]
    facedict["face%s" %(len(facedict)+1)] = faceslice #append sliced face as a numbered face to the dictionary
    return faceslice
#At the end of the file, in the 'while True' loop, add a simple stop criterion:
if len(facedict) == 10:
    break #This will stop the program once 10 faces are collected.

This should result in a dictionary object containing the pixel data of 10 detected faces. Great! The infrastructure of our music player is mostly done. Now let’s look at predicting emotions from the collected facial images.
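
For the curious: this averaging will later boil down to a simple majority vote over the per-frame predictions inside recognize_emotion(). A minimal sketch of that idea, with made-up prediction values:

predictions = [0, 1, 1, 1, 0, 1, 1, 2, 1, 1] #hypothetical class labels predicted for 10 frames
majority_vote = max(set(predictions), key=predictions.count) #the label that occurs most often, here 1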


Detecting the emotion in a face
To detect the actual emotion on your face we could use a generalized model, but a better solution might be an individual one. Models trained on a single individual work much better when used on that same individual, because there is less variance in the data (here: facial features). If we minimise the variance by keeping the face the same, most of the detected differences will be due to a different emotion being expressed. You need to collect varied images of yourself to make this robust. Let's expand the webcam script a bit to accommodate this process, add a flag ("--update") and use argparse to parse it. This makes it easy to change the mode our program runs in ('update mode' or 'normal mode').
Also let’s restructure the main code so that it’s all neatly in functions:

import cv2
import numpy as np
import argparse
import time
import glob
import os
import Update_Model
video_capture = cv2.VideoCapture(0)
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
fishface = cv2.createFisherFaceRecognizer() #OpenCV 2.4.x API; on OpenCV 3.x the recognizer lives in the cv2.face module and uses read()/write() instead of load()/save()
try:
    fishface.load("trained_emoclassifier.xml")
except:
    print("no xml found. Using --update will create one.")
parser = argparse.ArgumentParser(description="Options for the emotion-based music player") #Create parser object
parser.add_argument("--update", help="Call to grab new images and update the model accordingly", action="store_true") #Add --update argument
args = parser.parse_args() #Store any given arguments in an object
facedict = {}
emotions = ["angry", "happy", "sad", "neutral"]
def crop_face(clahe_image, face):
    for (x, y, w, h) in face:
        faceslice = clahe_image[y:y+h, x:x+w]
        faceslice = cv2.resize(faceslice, (350, 350))
    facedict["face%s" %(len(facedict)+1)] = faceslice
    return faceslice
def update_model(emotions):
    print("Model update mode active")
    check_folders(emotions)
    for i in range(0, len(emotions)):
        save_face(emotions[i])
    print("collected images, looking good! Now updating model...")
    Update_Model.update(emotions)
    print("Done!")
def check_folders(emotions): #check if folder infrastructure is there, create if absent
    for x in emotions:
        if os.path.exists("dataset\\%s" %x):
            pass
        else:
            os.makedirs("dataset\\%s" %x)
def save_face(emotion):
    print("\n\nplease look " + emotion + " when the timer expires and keep the expression stable until instructed otherwise.")
    for i in range(0,5):#Timer to give you time to read what emotion to express
        print(5-i)
        time.sleep(1)
    while len(facedict.keys()) < 16: #Collect faces until the dictionary holds 16 for this emotion
        detect_face()
    for x in facedict.keys(): #save contents of dictionary to files
        cv2.imwrite("dataset\\%s\\%s.jpg" %(emotion, len(glob.glob("dataset\\%s\\*" %emotion))), facedict[x])
    facedict.clear() #clear dictionary so that the next emotion can be stored
def recognize_emotion():
    predictions = []
    confidence = []
    for x in facedict.keys():
        pred, conf = fishface.predict(facedict[x])
        cv2.imwrite("images\\%s.jpg" %x, facedict[x])
        predictions.append(pred)
        confidence.append(conf)
    print("I think you're %s" %emotions[max(set(predictions), key=predictions.count)])
def grab_webcamframe():
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    clahe_image = clahe.apply(gray)
    return clahe_image
def detect_face():
    clahe_image = grab_webcamframe()
    face = facecascade.detectMultiScale(clahe_image, scaleFactor=1.1, minNeighbors=15, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)
    if len(face) == 1:
        faceslice = crop_face(clahe_image, face)
        return faceslice
    else:
        print("no/multiple faces detected, passing over frame")
while True:
    detect_face()
    if args.update: #If update flag is present, call update function
        update_model(emotions)
        break
    elif len(facedict) == 10: #otherwise it's a regular run; continue normally with emotion detection functionality
        recognize_emotion()
        break

I've streamlined the model training code from the previous tutorial a bit; download the Update_Model.py file and put it in the same folder as the main file you're working on. Follow the previous tutorial if you want more details on how it works.
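
In case you are curious what is inside (or the download ever fails), here is a minimal sketch of what Update_Model.update() boils down to, assuming the "dataset\<emotion>" folder layout created by check_folders() and the 350x350 grayscale faces written by save_face(); the actual file may differ in details:

#Update_Model.py (minimal sketch, not the full downloaded file)
import cv2
import glob
import numpy as np
fishface = cv2.createFisherFaceRecognizer()
def make_sets(emotions):
    training_data = []
    training_labels = []
    for emotion in emotions:
        for item in glob.glob("dataset\\%s\\*" %emotion): #every saved face for this emotion
            image = cv2.imread(item, 0) #load as grayscale
            training_data.append(image)
            training_labels.append(emotions.index(emotion)) #label = index in the emotions list
    return training_data, training_labels
def update(emotions):
    training_data, training_labels = make_sets(emotions)
    fishface.train(training_data, np.asarray(training_labels)) #(re)train the Fisherface model on all collected images
    fishface.save("trained_emoclassifier.xml") #write the weights so the main program can load them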
Once the file is downloaded, call the main program in “update mode” like this:

#To activate update mode, pass the --update flag
python <filename.py> --update
#To get help, use the -h or --help flag
python <filename.py> -h
python <filename.py> --help

This will present you with instructions on what emotion to express. The collected images are added to the dataset folders and the model automatically updates itself. Extra emotions are also picked up automatically if you add them to the 'emotions' list at the top of the file. Cool! Now you can try and see if you can detect "resting bitch face" as an emotion.
Note that if you train and test right away, you will likely get high accuracy. However, running it a day later might give less good results. Why? Mostly because a day later there is a big chance that there are more differences than just the emotions, which can throw off the classifier. Remember that the classifier looks for differences in patterns, not emotions. The fact that we can differentiate between emotions at all is because different emotions induce predictable differences in the patterns in the pixel data of the images. So, the next day your hair may be different, you may be in a different room if you use a laptop, the angle of your face to the webcam may be different, the lighting conditions may be different, a combination of these may play a part, or there may be other reasons such as unicorns. This all introduces extra variance in the pixel data that is not related to the actual emotions.
The best way to counter this and get robust performance is to add images for each emotion daily for about a week and re-train the model each time, so you end up with a well-rounded set of different images representing the same emotion. If you use a laptop, also train the model in the different rooms where you frequently work. After some time the variance induced by day-to-day variation will become less relevant to the model. In other words, it will learn which features are irrelevant and have nothing to do with emotion, and learn to ignore them.
So be sure to use the '--update' flag and follow instructions daily for about a week or until the performance is satisfactory.


Selecting and playing the right music for you
Now that we have successfully detected the emotion on your face and done some basic error reduction, playing the right music is actually not that hard. In this case the best way to open music files is to use the infrastructure already available on the computer the software runs on, rather than some Python module that can open mp3's. Using modules might require extra installation on the target computer, and generally offers the user less (friendly) control over what happens after the file has been opened. We could use os.startfile(), but that is Windows-only, and although that is the largest ecosystem by far, it's not very nice to Apple or Linux users. Using a suggestion from Stack Overflow (credited in the code below), a simple solution that opens files on Windows, Apple and Linux platforms is implemented in the open_stuff() function.


import sys, os, subprocess
def open_stuff(filename): #Open the file, credit to user4815162342, on the stackoverflow link in the text above
    if sys.platform == "win32":
        os.startfile(filename)
    else:
        opener ="open" if sys.platform == "darwin" else "xdg-open"
        subprocess.call([opener, filename])
 

To allow easy customization of the music attached to each emotion we will use the Excel file "EmotionLinks.xlsx" to list the files to open. In its columns you can put relative or absolute paths to the files or links you want opened when a certain emotion is detected. Use backslashes and not forward slashes for the folder structure.
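
If you would rather generate the spreadsheet than build it by hand, something like this works (a sketch with made-up example paths; pandas needs the openpyxl or xlsxwriter package to write .xlsx files):

import pandas
df = pandas.DataFrame({
    "angry": ["C:\\music\\calm_song.mp3"],
    "happy": ["C:\\music\\upbeat_song.mp3"],
    "sad": ["C:\\music\\comfort_song.mp3"],
    "neutral": ["C:\\music\\background_song.mp3"],
})
df.to_excel("EmotionLinks.xlsx", index=False) #one column per emotion, one file path per row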

import pandas, random
actions = {} #this dictionary will hold the list of files/actions per emotion
df = pandas.read_excel("EmotionLinks.xlsx") #open Excel file
actions["angry"] = [x for x in df.angry.dropna()] #We need the dropna() when columns are uneven in length, which creates NaN values at missing places. The OS won't know what to do with these if we try to open them.
actions["happy"] = [x for x in df.happy.dropna()]
actions["sad"] = [x for x in df.sad.dropna()]
actions["neutral"] = [x for x in df.neutral.dropna()]
#And we alter recognize_emotion() to retrieve the appropriate action list and pick a random item:
def recognize_emotion():
    predictions = []
    confidence = []
    for x in facedict.keys():
        pred, conf = fishface.predict(facedict[x])
        cv2.imwrite("images\\%s.jpg" %x, facedict[x])
        predictions.append(pred)
        confidence.append(conf)
    recognized_emotion = emotions[max(set(predictions), key=predictions.count)]
    print("I think you're %s" %recognized_emotion)
    actionlist = [x for x in actions[recognized_emotion]] #<----- get list of actions/files for detected emotion
    random.shuffle(actionlist) #<----- Randomly shuffle the list
    open_stuff(actionlist[0]) #<----- Open the first entry in the list

Some may have noticed that this code is actually not restricted to music files at all. Because we use a general "open file" method, we basically let the OS figure out how to open whatever we give it. This gives a very cool (at least I think it is) added benefit: we can open anything the OS knows how to handle, so not just music but also images, videos, documents, links to webpages (youtube!), scripts, video games, etc. Suddenly the emotion-based music player can also be configured to cheer you up by, for example, changing your desktop wallpaper to happy images if you're sad; if you're angry it may change it to calming images; if you're happy you can get a unicorn-overload if you so desire. It may even detect that you want to let off some steam and fire up Killing Floor 2 or a similar game.
The final code will look something like this:

import cv2, numpy as np, argparse, time, glob, os, sys, subprocess, pandas, random, Update_Model, math
#Define variables and load classifier
camnumber = 0
video_capture = cv2.VideoCapture()
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
fishface = cv2.createFisherFaceRecognizer()
try:
    fishface.load("trained_emoclassifier.xml")
except:
    print("no trained xml file found, please run program with --update flag first")
parser = argparse.ArgumentParser(description="Options for the emotion-based music player")
parser.add_argument("--update", help="Call to grab new images and update the model accordingly", action="store_true")
parser.add_argument("--retrain", help="Call to re-train the model based on all images in the training folders", action="store_true")
args = parser.parse_args()
facedict = {}
actions = {}
emotions = ["angry", "happy", "sad", "neutral"]
df = pandas.read_excel("EmotionLinks.xlsx") #open Excel file
actions["angry"] = [x for x in df.angry.dropna()] #We need de dropna() when columns are uneven in length, which creates NaN values at missing places. The OS won't know what to do with these if we try to open them.
actions["happy"] = [x for x in df.happy.dropna()]
actions["sad"] = [x for x in df.sad.dropna()]
actions["neutral"] = [x for x in df.neutral.dropna()]
def open_stuff(filename): #Open the file, credit to user4815162342, on the stackoverflow link in the text above
    if sys.platform == "win32":
        os.startfile(filename)
    else:
        opener ="open" if sys.platform == "darwin" else "xdg-open"
        subprocess.call([opener, filename])
def crop_face(clahe_image, face):
    for (x, y, w, h) in face:
        faceslice = clahe_image[y:y+h, x:x+w]
        faceslice = cv2.resize(faceslice, (350, 350))
    facedict["face%s" %(len(facedict)+1)] = faceslice
    return faceslice
def update_model(emotions):
    print("Model update mode active")
    check_folders(emotions)
    for i in range(0, len(emotions)):
        save_face(emotions[i])
    print("collected images, looking good! Now updating model...")
    Update_Model.update(emotions)
    print("Done!")
def check_folders(emotions):
    for x in emotions:
        if os.path.exists("dataset\\%s" %x):
            pass
        else:
            os.makedirs("dataset\\%s" %x)
def save_face(emotion):
    print("\n\nplease look " + emotion + ". Press enter when you're ready to have your pictures taken")
    raw_input() #Wait until enter is pressed with the raw_input() method (on Python 3, use input() instead)
    video_capture.open(camnumber)
    while len(facedict.keys()) < 16:
        detect_face()
    video_capture.release()
    for x in facedict.keys():
        cv2.imwrite("dataset\\%s\\%s.jpg" %(emotion, len(glob.glob("dataset\\%s\\*" %emotion))), facedict[x])
    facedict.clear()
def recognize_emotion():
    predictions = []
    confidence = []
    for x in facedict.keys():
        pred, conf = fishface.predict(facedict[x])
        cv2.imwrite("images\\%s.jpg" %x, facedict[x])
        predictions.append(pred)
        confidence.append(conf)
    recognized_emotion = emotions[max(set(predictions), key=predictions.count)]
    print("I think you're %s" %recognized_emotion)
    actionlist = [x for x in actions[recognized_emotion]] #get list of actions/files for detected emotion
    random.shuffle(actionlist) #Randomly shuffle the list
    open_stuff(actionlist[0]) #Open the first entry in the list
def grab_webcamframe():
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    clahe_image = clahe.apply(gray)
    return clahe_image
def detect_face():
    clahe_image = grab_webcamframe()
    face = facecascade.detectMultiScale(clahe_image, scaleFactor=1.1, minNeighbors=15, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)
    if len(face) == 1:
        faceslice = crop_face(clahe_image, face)
        return faceslice
    else:
        print("no/multiple faces detected, passing over frame")
def run_detection():
    while len(facedict) != 10:
        detect_face()
    recognize_emotion()
if args.update:
    update_model(emotions)
elif args.retrain:
    Update_Model.update(emotions) #re-train the model on everything currently in the dataset folders
else:
    video_capture.open(camnumber)
    run_detection()

Note that I have also added another flag: "--retrain". This is for when you manually add photos to the training directories: the flag re-trains the model based on everything it finds in those folders. A few other changes were made as well; for example, the timer in update mode has been replaced by a prompt asking you to press enter when you're ready to have your pictures taken, which gives the user a bit more control over when the images are captured.
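
Calling it works the same way as the other flags:

#To re-train the model on all images currently in the dataset folders
python <filename.py> --retrain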


Bonus section: changing wallpaper to balance mood
Before we start, note that this functionality is Windows-only (although making a Linux or OSX version should not be difficult). One use case could be that you want your desktop wallpaper to adapt to your mood, either to reinforce positive moods or to lessen negative ones. Using the above model (with a well-trained classifier) and a few small adaptations something can be cooked up. Here is an example that does a few things:

  • It adds a timer function that checks with settable intervals what your emotion is;
  • It checks the folder containing wallpapers fitting the detected mood, and randomly picks one;
  • It calls the Windows API (through ctypes) to do the actual changing of the wallpaper.

Store your wallpaper files in a folder structure "wallpapers\emotion". Make sure the emotions are the same as the ones in the code (angry, happy, sad, neutral), so that you end up with "wallpapers\angry", "wallpapers\happy", etc. Put the wallpapers that you want to see when you're in a particular mood in these folders. We add three functions to make it work (wallpaper_timer(), change_wallpaper(), and setWallpaperWithCtypes()), change a few others and add another flag for argparse to listen to: "--wallpaper". The full code will then look something like this:

import cv2, numpy as np, argparse, time, glob, os, sys, subprocess, pandas, random, Update_Model, math, ctypes, win32con
#Define variables and load classifier
camnumber = 0
video_capture = cv2.VideoCapture()
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
fishface = cv2.createFisherFaceRecognizer()
try:
    fishface.load("trained_emoclassifier.xml")
except:
    print("no trained xml file found, please run program with --update flag first")
parser = argparse.ArgumentParser(description="Options for the emotion-based music player")
parser.add_argument("--update", help="Call to grab new images and update the model accordingly", action="store_true")
parser.add_argument("--retrain", help="Call to re-train the the model based on all images in training folders", action="store_true") #Add --update argument
parser.add_argument("--wallpaper", help="Call to run the program in wallpaper change mode. Input should be followed by integer for how long each change cycle should last (in seconds)", type=int) #Add --update argument
args = parser.parse_args()
facedict = {}
actions = {}
emotions = ["angry", "happy", "sad", "neutral"]
df = pandas.read_excel("EmotionLinks.xlsx") #open Excel file
actions["angry"] = [x for x in df.angry.dropna()] #We need de dropna() when columns are uneven in length, which creates NaN values at missing places. The OS won't know what to do with these if we try to open them.
actions["happy"] = [x for x in df.happy.dropna()]
actions["sad"] = [x for x in df.sad.dropna()]
actions["neutral"] = [x for x in df.neutral.dropna()]
def open_stuff(filename): #Open the file, credit to user4815162342, on the stackoverflow link in the text above
    if sys.platform == "win32":
        os.startfile(filename)
    else:
        opener ="open" if sys.platform == "darwin" else "xdg-open"
        subprocess.call([opener, filename])
def crop_face(clahe_image, face):
    for (x, y, w, h) in face:
        faceslice = clahe_image[y:y+h, x:x+w]
        faceslice = cv2.resize(faceslice, (350, 350))
    facedict["face%s" %(len(facedict)+1)] = faceslice
    return faceslice
def update_model(emotions):
    print("Model update mode active")
    check_folders(emotions)
    for i in range(0, len(emotions)):
        save_face(emotions[i])
    print("collected images, looking good! Now updating model...")
    Update_Model.update(emotions)
    print("Done!")
def check_folders(emotions):
    for x in emotions:
        if os.path.exists("dataset\\%s" %x):
            pass
        else:
            os.makedirs("dataset\\%s" %x)
def save_face(emotion):
    print("\n\nplease look " + emotion + ". Press enter when you're ready to have your pictures taken")
    raw_input() #Wait until enter is pressed with the raw_input() method (on Python 3, use input() instead)
    video_capture.open(camnumber)
    while len(facedict.keys()) < 16:
        detect_face()
    video_capture.release()
    for x in facedict.keys():
        cv2.imwrite("dataset\\%s\\%s.jpg" %(emotion, len(glob.glob("dataset\\%s\\*" %emotion))), facedict[x])
    facedict.clear()
def recognize_emotion():
    predictions = []
    confidence = []
    for x in facedict.keys():
        pred, conf = fishface.predict(facedict[x])
        cv2.imwrite("images\\%s.jpg" %x, facedict[x])
        predictions.append(pred)
        confidence.append(conf)
    recognized_emotion = emotions[max(set(predictions), key=predictions.count)]
    print("I think you're %s" %recognized_emotion)
    return recognized_emotion
def grab_webcamframe():
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    clahe_image = clahe.apply(gray)
    return clahe_image
def detect_face():
    clahe_image = grab_webcamframe()
    face = facecascade.detectMultiScale(clahe_image, scaleFactor=1.1, minNeighbors=15, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)
    if len(face) == 1:
        faceslice = crop_face(clahe_image, face)
        return faceslice
    else:
        print("no/multiple faces detected, passing over frame")
def run_detection():
    while len(facedict) != 10:
        detect_face()
    recognized_emotion = recognize_emotion()
    return recognized_emotion
def wallpaper_timer(seconds):
    video_capture.release()
    time.sleep(int(seconds))
    video_capture.open(camnumber)
    facedict.clear()
def change_wallpaper(emotion):
    files = glob.glob("wallpapers\\%s\\*.bmp" %emotion)
    current_dir = os.getcwd()
    random.shuffle(files)
    file = "%s\%s" %(current_dir, files[0])
    setWallpaperWithCtypes(file)
def setWallpaperWithCtypes(path): #Taken from http://www.blog.pythonlibrary.org/2014/10/22/pywin32-how-to-set-desktop-background/
    cs = ctypes.c_buffer(path)
    ok = ctypes.windll.user32.SystemParametersInfoA(win32con.SPI_SETDESKWALLPAPER, 0, cs, 0)
if args.update:
    update_model(emotions)
elif args.retrain:
    Update_Model.update(emotions)
elif args.wallpaper:
    cycle_time = args.wallpaper
    while True:
        wallpaper_timer(cycle_time)
        recognized_emotion = run_detection()
        change_wallpaper(recognized_emotion)
else:
    video_capture.open(camnumber)
    recognized_emotion = run_detection()
    actionlist = [x for x in actions[recognized_emotion]] #get list of actions/files for detected emotion
    random.shuffle(actionlist) #Randomly shuffle the list
    open_stuff(actionlist[0]) #Open the first entry in the list

Call it with the '--wallpaper' flag. Note that you also need to give an integer with the wallpaper flag, like so:

python <filename.py> --wallpaper 300 #this will update the wallpaper every 5 minutes (300 sec)


Wrapping up
So far we have come quite a way and learned some pretty cool things. You can now detect faces on a webcam stream (or a video file, by changing a line or two), extract them, preprocess them, predict emotions on them, dynamically update the detection model and execute emotion-related programs or actions.
In my tests, after three days of training three times a day (once in the morning, afternoon and evening) the model made only a single mistake in 70 trials. I did make my expressions rather clear. If you want the model to recognize more subtle expressions this is certainly possible, but expect a few more mistakes at first. Keep in mind that in that case the training material you supply should also be more subtle, and it will likely take more training runs before the model gets most of your subtle expressions right.
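
If you want to put a number on your own model's performance, keep a separate set of labelled test images the model was never trained on, predict each one and compare against its label. A minimal sketch of that idea, reusing the emotions list and the loaded fishface model from the main script and assuming a hypothetical "testset\<emotion>" folder structure:

import cv2, glob
correct = 0
total = 0
for label, emotion in enumerate(emotions): #label = index in the emotions list, same as during training
    for item in glob.glob("testset\\%s\\*" %emotion):
        image = cv2.resize(cv2.imread(item, 0), (350, 350)) #load as grayscale, resize to match the training size
        pred, conf = fishface.predict(image)
        if pred == label:
            correct += 1
        total += 1
print("accuracy: %.1f%%" %(100.0 * correct / total))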
As always, you are free to use the code in this tutorial within the restrictions of the license mentioned at the top. Please also let me know what cool things you’ve made! I hope you enjoyed this tutorial.


66 Comments

  • Bas

    June 20, 2016

    You should add voice recognition in there somewhere. “Computer, what video game am I in the mood for?”

    • Paul van Gent

      June 20, 2016

      Actually I’m working on something similar. Keep an eye on the blog!

  • Junior

    August 23, 2016

    Hie Paul.
    Thank you for the tutorial. I do not have background in machine learning but i have some understanding of programming and i was able to understand the flow of your program. However, i encountered an error when i tried to resize the image captured from the video. It says src is not numpy array or a scalar.

    • Paul van Gent

      August 23, 2016

      Hi Artwel, you could try recasting the img as an array using numpy.asarray(). The documentation here explains how the function works. You could also mail me your code and I’ll have a look.

      • Junior

        August 23, 2016

        Ok. I will have a look.
        Thank you

  • Stranger

    September 19, 2016

    You need to update the names of emotions if you want to be consonant with previous article. There were anger not angry and sadness not sad

    • Paul van Gent

      September 19, 2016

      Hi Stranger,
      That is correct. However the tutorial is separate from the previous one in terms of code, apart from using the same technique. You could actually use any labels you desire!

  • bilal rafique

    April 4, 2017

    Paul,
    when i run the code of “Detecting the emotion in a face”, i get this error
    no xml found. Using –update will create one.
    Traceback (most recent call last):
    File “F:\In Shaa Allah\Ex3.py”, line 89, in
    recognize_emotion()
    File “F:\In Shaa Allah\Ex3.py”, line 61, in recognize_emotion
    pred, conf = fishface.predict(facedict[x])
    error: ..\..\..\..\opencv\modules\contrib\src\facerec.cpp:620: error: (-5) This Fisherfaces model is not computed yet. Did you call Fisherfaces::train? in function cv::Fisherfaces::predict

    • Paul van Gent

      April 8, 2017

      Please read the tutorial and code comments, explanation of the function is in there. Don’t just paste the code!

  • Silver

    May 2, 2017

    Hi Paul.
    Thank you for the tutorial. I’m sorry to bother you, as my experience in programming is actually poor, even i have read this tutorial several times, i still have some problems.
    1. could you please tell me what IDE you are using?
    2. where should i put these codes? in the “update_model” or the main program?
    #To activate update mode, pass the –update flag
    python –update
    #To get help, use the -h or –help flag
    python -h
    python –help
    Thank you!

    • Paul van Gent

      May 4, 2017

      Hi Silver. I use Visual Studio 2015, although you could use many different IDE’s, so try a few and see what you like. All the code goes in the same folder. However, if you’re still new to programming I would recommend you try a few more basic Python tutorials first, as this one assumes intermediate Python knowledge.

  • mostafa

    May 13, 2017

    what is “import Update_Model “

  • Xcross

    July 24, 2017

    Sir,please send me the source code.

    • Paul van Gent

      July 24, 2017

      All you need is on here. Just follow the tutorial and you should be fine :).

      • Xcross

        July 24, 2017

        i follow all the steps … but some how i lost my working file..
        sir , if you give me the source file then it will be very helpful for me.. please sir

  • wiem

    July 30, 2017

    Dear Paul van Gent,
    Hello ! thanks alot for sharing your code. I studied your work in emotion-analysis. I’m working too on facial expressions using SVM, as I see your code extract face landmarks using DLIB and train a multi-class SVM classifier to recognize facial expressions (emotions). After that you trained a SVM classifier to generate a “emo-analysis.pkl” file.
    Now I want to do training with my own dataset. However when I run the training it gives me this error:
    Enter 1 to train and 2 to predict
    1
    Making sets 0
    done
    training SVM linear 0
    /usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
    DeprecationWarning)
    Traceback (most recent call last):
    File “faceDetectionDlib.py”, line 168, in
    main()
    File “faceDetectionDlib.py”, line 135, in main
    clf.fit(npar_train, training_labels)
    File “/usr/local/lib/python3.5/dist-packages/sklearn/svm/base.py”, line 151, in fit
    X, y = check_X_y(X, y, dtype=np.float64, order=’C’, accept_sparse=’csr’)
    File “/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py”, line 521, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
    File “/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py”, line 424, in check_array
    context))
    ValueError: Found array with 0 feature(s) (shape=(1, 0)) while a minimum of 1 is required.
    Would you help me please and guide me how could I solve this error and train the this code using other datasets ????
    Thank you.

    • Paul van Gent

      July 30, 2017

      Hi Wiem. The last bit of the error tells what’s going wrong: “ValueError: Found array with 0 feature(s) (shape=(1, 0)) while a minimum of 1 is required.”. Apparently you’re feeding an empty object to the classifier. Try debugging why no images are loaded (is path correct? can it find files? are permissions set ok?)
      -Paul

  • d.Williams

    August 7, 2017

    hey its very useful post,i successfully run your code but i have one problem
    if i add more than one song in excel file,it just play one song i want to know how one by one song play ??
    Thank you

    • Paul van Gent

      August 13, 2017

      Simple solution would be to create a .m3u playlist with all songs, and put only that in the excel.
      Or change the code to just load that playlist whenever a specific emotion is detected. The excel is only there to make editing lists easier. Sounds like you don’t need that

  • pooneh

    August 25, 2017

    hi paul may i ask you whats the version of your python interpreter and your open_cv? mine is 2.7.5 and 2.4.9 ,respectively which Dont know “Update_Model” module!!!
    thanks alot 🙂

    • Paul van Gent

      August 25, 2017

      Take a look at the text as well, it explains how it all works. You need to download a little extra file for this.

  • pooneh

    August 25, 2017

    oh! yeah thanks:)

  • Thari

    September 26, 2017

    Could you please explain how this part of the code is works?
    face_slicedr = clahe_imager[y:y+h, x:x+w]
    face_slicedr = cv2.resize(face_slicedr, (350, 350))
    face_dictionary[“face%s” %(len(face_dictionary)+1)] = face_slicedr
    predictions = []
    confidence = []
    for x in face_dictionary.keys():
    pred, conf = fishface.predict(face_dictionary[x])
    predictions.append(pred)
    confidence.append(conf)
    recognised_emotionr = emotions[max(set(predictions), key=predictions.count)]
    print(” you’re %s” %recognised_emotionr)

    • Paul van Gent

      September 27, 2017

      Hi Thari. The first bit of the code you reference does not seem to occur on the page (did you maybe already adapt it?). Starting from “predictions = []” seems to be the “recognize_emotion()” function. What this does is get all the cropped face images from the created dictionary, predict the emotion on each face, and write down both the emotion prediction and the corresponding confidence.

      • Thari

        September 28, 2017

        Hi Paul,
        Yes, I adopted it.
        Thank you for your explanation.
        Could you please further explain how predict the emotion and get corresponding confidence values works or any other source to look for ?
        It takes 3 to 4 seconds to detect a emotion and sometimes it displays the same detected emotion over and over again. why is that ?

  • limit

    October 1, 2017

    i am getting this error
    OpenCV Error: Unspecified error (File can’t be opened for writing!) in cv::FaceRecognizer::load, file ..\..\..\..\opencv\modules\contrib\src\facerec.cpp, line 398
    no trained xml file found, please run program with –update flag first
    Traceback (most recent call last):
    File “player.py”, line 18, in
    df = pandas.read_excel(“EmotionLinks.xlsx”) #open Excel file
    File “C:\Python27\lib\site-packages\pandas\io\excel.py”, line 203, in read_excel
    io = ExcelFile(io, engine=engine)
    File “C:\Python27\lib\site-packages\pandas\io\excel.py”, line 232, in __init__
    import xlrd # throw an ImportError if we need to
    ImportError: No module named xlrd

    • Paul van Gent

      October 2, 2017

      “python -m pip install xlrd” in terminal or command prompt.
      Make sure you have elevated privileges (run as admin on windows, put “sudo” in front of the command on linux or osx).

      • limit

        October 11, 2017

        thanks..

  • Ritu

    February 21, 2018

    Helo sir actuly I av no idea about python so the code that you have provided can you send meh as a whole thing together because this I need as my final year project Pls help me

    • Paul van Gent

      February 21, 2018

      Hi Ritu. I cannot help you with that. All you need is on the blog. I suggest you follow some basic Python courses so you are able to implement it. A final year project is mainly about learning something new, not copy-pasting a few things together…
      – Paul

  • soni

    March 6, 2018

    hello paul
    i m getting error as assertion failed (!empty()) in cv ::cascadeclassifier detectmultiscale…..while
    executing the code to detect face on webcab,may i know the reason for this and how can i resolve this error.
    thank you.

  • vilas

    March 7, 2018

    What does that trained_emoclassifier.xml file consists of?
    I am working on the same project but couldnt run the code because of that file error. I dont know what we have to enter in that file.
    Thanks in Advance

    • Paul van Gent

      March 21, 2018

      That is a file with the model weights as trained earlier. As mentioned in the tutorial you can run the algorithm in update mode to create one.

  • greatultron

    March 9, 2018

    no xml found. Using –update will create one.
    [ INFO:0] Initialize OpenCL runtime…
    OpenCV Error: Bad argument (This Fisherfaces model is not computed yet. Did you call Fisherfaces::train?) in predict, file /io/opencv_contrib/modules/face/src/fisher_faces.cpp, line 137
    Traceback (most recent call last):
    File “main.py”, line 91, in
    recognize_emotion()
    this is what i got , i suspect that there’s is something wrong with the following code
    try:
    fishface.load(“trained_emoclassifier.xml”)
    except:
    print(“no trained xml file found, please run program with –update flag first”)

    • Paul van Gent

      March 21, 2018

      Hi Greatultron. You suspect correctly that this is where the program halts. However, as stated in the error message and the tutorial, you need to first run the program with the update flag to create a new dataset of your face.
      – Paul

      • greatultron

        April 24, 2018

        thank you so much for the reply, and i resolved my issue, it was due to different version of opencv, i think your’s is 2.x and mine is 3.4, and newer version uses read() instead of load() and write() instead of save() and it now works, and thank you for such a wonderful tutorial

        • Paul van Gent

          April 25, 2018

          Glad to have been of help. Happy coding!

  • Duc

    March 10, 2018

    do i need to create any folders before executing the code , the code runs but no camera stream , although the light indicate that the webcam is on. Any suggestions ?

    • Paul van Gent

      March 21, 2018

      Under “Processing the face” in the tutorial there’s a hint on how you can create this.
      – Paul

  • Duc

    March 19, 2018

    Is there any way to build an UI for this, and why we can’t show the webcam while the code is determining our emotions ?

    • Paul van Gent

      March 21, 2018

      Yes you could definitely build a U.I. for this. However first the detection accuracy needs to go up much further, especially with subtle real-life expressions (rather than the exaggerated expressions in the datasets used for training).
      You can show the webcam output for the detection using code from the snipper under “Processing the face”. Be aware, however, that a few frames are classified and averaged to come to an evaluation. A ‘live stream’ therefore is of little use, and actually the performance would be insufficient.
      – Paul

  • Chetan

    March 27, 2018

    Dear Paul,
    There is xml file in my directory, i have trained and created the xml file but , when i load the xml file using the fishface.load(“trained_emoclassifier.xml”), it goes to the except and prints no xml found and i have given the file location also, even though it prints same error, why the error is happening, i cannot predict my emotion….

    • Paul van Gent

      March 27, 2018

      Hi Chetan. Remove the try-except statement to see what is going wrong there. Alternatively catch the exception broadly:
      except Exception as e:
      print("error " + str(e))

      – Paul

  • Anon

    March 27, 2018

    [ INFO:0] Initialize OpenCL runtime…
    OpenCV Error: Bad argument (This Fisherfaces model is not computed yet. Did you
    call Fisherfaces::train?) in cv::face::Fisherfaces::predict, file C:\projects\op
    encv-python\opencv_contrib\modules\face\src\fisher_faces.cpp, line 137
    Traceback (most recent call last):
    File “em.py”, line 104, in
    run_detection()
    File “em.py”, line 98, in run_detection
    recognize_emotion()
    File “em.py”, line 69, in recognize_emotion
    pred, conf = fishface.predict(facedict[x])
    cv2.error: C:\projects\opencv-python\opencv_contrib\modules\face\src\fisher_face
    s.cpp:137: error: (-5) This Fisherfaces model is not computed yet. Did you call
    Fisherfaces::train? in function cv::face::Fisherfaces::predict
    I’ve been getting this error how do i resolve it

    • Paul van Gent

      March 28, 2018

      The error says it all, there is no training data available in the model. You need to train it first. Take a look at the section “Detecting the emotion in a face”
      – Paul

  • abhilash

    March 31, 2018

    Hey Paul,
    this code works great, my following your blog and doing some changes, i could successfully trained it, the dataset is created, but the thing is that after training i’m not getting how to test it, which code to execute?

    • Paul van Gent

      April 1, 2018

      Hi Abhilash. You can call the function “run_detection()“, which returns the detected emotion.
      – Paul

  • Abhilash Desai

    April 2, 2018

    Tip:
    fishface.load(“trained_emoclassifier.xml”) it should be changed to -> fishface.read(“trained_emoclassifier.xml”) This will solve the problem of loading XML file.
    Hello Paul,
    Training is done, xml file is also loading but the only problem i’m facing is,
    it’s throwing the below error,
    C:\Users\srujan\Desktop\EMP>python final_code.py
    [ INFO:0] Initialize OpenCL runtime…
    OpenCV Error: Bad argument (This Fisherfaces model is not computed yet. Did you
    call Fisherfaces::train?) in cv::face::Fisherfaces::predict, file C:\projects\op
    encv-python\opencv_contrib\modules\face\src\fisher_faces.cpp,

  • Abhilash Desai

    April 3, 2018

    hey,
    after changing to fishface.read(“trained_emoclassifier.xml”
    we have to retrain the module,
    then it works absolutely fine.
    Efforts will never go wasted.
    thank you so much for the code Paul.
    -Abhilash

    • Paul van Gent

      April 6, 2018

      Hi Abhilash. Good to hear you solved it! That seems to be an update in the OpenCV API. What version are you using? I wrote the tutorial based on 2.4.9., which has no ‘read()’ functionality.
      Good luck with the rest of your project and let me know if you need help 🙂
      – Paul

  • Abhilash Desai

    April 8, 2018

    hi Paul,
    i’m using OpenCV: 3.4.0
    -Abhilash

    • Paul van Gent

      April 11, 2018

      Ok. I guess they changed the API then.
      – Paul

  • Abhilash Desai

    April 11, 2018

    Hi Paul,
    My program is running great, it’s showing recognized emotion, but music player code isn’t running i guess, it doesn’t show any errors either.
    df = pandas.read_excel(“EmotionLinks.xlsx”) #open Excel file
    actions[“angry”] = [x for x in df.angry.dropna()] #We need de dropna() when columns are uneven in length, which creates NaN values at missing places. The OS won’t know what to do with these if we try to open them.
    actions[“happy”] = [x for x in df.happy.dropna()]
    actions[“sad”] = [x for x in df.sad.dropna()]
    actions[“neutral”] = [x for x in df.neutral.dropna()]
    i think it’s not accessing the excel sheet or i don’t know what’s happening.
    i’ve written a code for music player but not getting how to link it with the detected emotion,
    from win32com.client import Dispatch
    from time import sleep
    mp = Dispatch(“WMPlayer.OCX”)
    i=0
    if i==0:
    tune = mp.newMedia(“I:/songs/02_-_Main_Dhoondne_Ko_Zamaane_Mein(wapking.mp3”)
    elif i==1:
    tune = mp.newMedia(“I:/songs/01 – hale dil.mp3”)
    elif i==2:
    tune = mp.newMedia(“I:/songs/03 Vajra Ballalaraya.mp3”)
    mp.currentPlaylist.appendItem(tune)
    mp.controls.play()
    sleep(1)
    mp.controls.playItem(tune)
    # to stop playing use
    input(“Press Enter to stop playing”)
    mp.controls.stop()
    please help me how can i replace those i values to the detected emotions or any other ways to play music.
    thanls in advance.
    -Abhilash

    • Paul van Gent

      April 12, 2018

      Hi Abhilash. I recommend you go through the variables step by step. After reading the excel file, is the ‘actions’ dict populated? What happens after the emotion is detected, what variable is it stored and how is it passed? If the function to open the file is called, does something happen then or not?
      Just go over the structure step-by-step, print() the values of variables to see what’s happening. That should help you debug effectively.
      – Paul

  • Abhilash Desai

    April 12, 2018

    Hello Paul,
    i tried to figure out the problem, actually the detected emotion wasn’t redirecting to that code that which takes action in playing songs by accessing the excel sheet and i couldn’t resolve it,.:-(
    so i changed the code and added for accessing windows media player and i used prediction values and linked it using if loop,
    and it worked perfect… 🙂
    thanks a lot Paul, i feel like you’ve been a teammate of my project 😛
    i really appreciate your help..
    In Love with Python <3
    -Abhilash

  • Luis Ruano

    April 26, 2018

    Hi Paul, thanks for your code and tutorial.
    I have this error:
    fishface.train(training_data, np.asarray(training_labels))
    error: C:\projects\opencv-python\opencv\modules\core\src\matrix.cpp:436: error: (-215) u != 0 in function cv::Mat::create
    I already look for it and it has to been something because of the size. I am not quite sure, because the error doesn’t say it has to be with memory issue. This is when the Update_Model try to train the emotions.
    The training set is around 700 images.
    Thanks for the answer

    • Paul van Gent

      April 26, 2018

      Yes that tutorial was never meant to grow to a large dataset. The best thing to do is to re-write so that the dataset is no longer represented in the dict{} object. Just read each image as needed and update the model weights with it.
      – Paul

      • Luis Ruano

        May 5, 2018

        Thanks for you answer Paul.
        But updating to much the model would change a lot the weight of every clasification. Isn’t that going to make the algorith less precise ?.
        I will try that anyways. I will post my results later. Thanks Paul

        • Paul van Gent

          May 10, 2018

          Hi Luis,
          It depends on your use case. Yes, if the model needs to generalise between a lot of people it will be less precies than when you train it on just your own face. This is because in the latter case there is a lot less variance to account for.
          – Paul

  • QWERTY

    May 22, 2018

    HI Paul,
    could you please explain how to calculate the accuracy of the trained model

    • Paul van Gent

      May 23, 2018

      You could make a test set with various images of emotions, call fishface.predict() on each, note down the result, and compare to the actual labels of your images.
      -Paul

  • Niraj

    June 21, 2018

    Hey Paul i guess the link to Update_Model.py is broken can you fix it. Thanks for sharing an amazing Application.

    • Paul van Gent

      June 22, 2018

      I’ve re-uploaded the file. It should work again.

