Making an Emotion-Aware Music Player


A sequel to a previous post about emotion recognition. Let’s apply the emotion recognition model and build a music player that plays songs fitting your mood: that extra death metal when you’re pissed, and some Grieg when you’re happy.


Read this: The code in this tutorial is licensed under the GNU 3.0 open source license and you are free to modify and redistribute the code, given that you give others you share the code with the same right, and cite my name (use citation format below). You are not free to redistribute or modify the tutorial itself in any way. By reading on you agree to these terms. If you disagree, please navigate away from this page.

Citation format
van Gent, P. (2016). Emotion Recognition With Python, OpenCV and a Face Dataset. A tech blog about fun things with Python and embedded electronics. Retrieved from: http://www.paulvangent.com/2016/06/30/making-an-emotion-aware-music-player/

IE users: I’ve gotten several reports that sometimes the code blocks don’t display correctly or at all on Internet Explorer. Please refresh the page and they should display fine.


Getting Started
You get home from work, power up your laptop and ask it to play you some music. The computer knows which music you like because it houses your music library, so that goes well. But you just had a tough day at work and are still a bit pissed, so from your entire music library you really only want to hear a subset (that one death metal song). It could also be that you’re over the moon and want to hear happy music. It would be cool to do this automatically, so let’s give it a go!

We have already looked briefly at how to recognize emotions in a previous post. There we found that, from a dataset of facial images of many different people, it was possible to train a generalizable model that distinguishes “anger”, “happiness” and “sadness” with about 77.2% accuracy. With a bigger dataset this accuracy is likely to increase, and when the training images are replaced or augmented by images of your own face it will increase quite a bit further. First let’s think for a bit about what we really need. We need at least to be able to:

  • detect a face on the webcam and pre-process the image of the face;
  • grab some images of your face and dynamically update the model over time;
  • detect the emotion on your face;
  • pick a random song linked to that emotion and play it.

That seems quite manageable, so let’s see if we can cook up something!

 


Detecting your face on the webcam
First let’s write a script that can detect faces on your webcam stream. This has been done a million times already so we won’t dwell on it too long. I’ve commented the code for those who may be less familiar with OpenCV or this boilerplate-type code.

We will use a pre-trained Haar cascade classifier supplied with OpenCV. Get it from your OpenCV folder under “opencv\sources\data\haarcascades\”; it’s called ‘haarcascade_frontalface_default.xml’.

Do it like this:

~

import cv2
import numpy as np

video_capture = cv2.VideoCapture(0) #Open the default webcam (device 0). Change the index if you have more than one webcam connected and want to use a different one
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml") #load the trained classifier model

while True:
    ret, frame = video_capture.read() #Grab frame from webcam. Ret is 'true' if the frame was successfully grabbed.
    
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) #Convert image to grayscale to improve detection speed and accuracy

    #Run classifier on frame
    face = facecascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=20, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)

    for (x, y, w, h) in face: #Draw rectangle around detected faces
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 0, 255), 2) #draw it on the colour image "frame", with arguments: top-left corner, bottom-right corner, colour (BGR, so this is red), line thickness 2

    cv2.imshow("webcam", frame) #Display frame
    
    if cv2.waitKey(1) & 0xFF == ord('q'): #imshow expects a termination definition in order to work correctly, here it is bound to key 'q'
        break

 

If all goes well this should result in a red rectangle around all faces visible on the webcam feed. However, in my case there was strong sunlight coming in from a window to the side, which created high contrast and strong shadows on my face that hampered detection. A possible solution is to equalize the histogram a bit using an adaptive equalization technique. Adding this to the code we get something like:

~

import cv2
import numpy as np

video_capture = cv2.VideoCapture(0)
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

while True:
    ret, frame = video_capture.read() 
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8)) #Create CLAHE object
    clahe_image = clahe.apply(gray) #Apply CLAHE to grayscale image from webcam

    face = facecascade.detectMultiScale(clahe_image, scaleFactor=1.1, minNeighbors=10, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)
    for (x, y, w, h) in face: #Draw rectangle around detected faces
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 0, 255), 2) #draw it on "frame": top-left corner, bottom-right corner, colour (BGR), thickness 2
    cv2.imshow("webcam", frame) #Display frame
    if cv2.waitKey(1) & 0xFF == ord('q'): #imshow expects a termination definition to work correctly, here it is bound to key 'q'
        break

 


Processing the face
Great, we can detect faces! Before asking the classifier what emotion the face is displaying we need to crop and standardize it. The code I use in this tutorial is an adaptation of the one from the previous emotion recognition post. This extracts the face from the webcam stream and crops it, like this:

~

import cv2
import numpy as np

video_capture = cv2.VideoCapture(0)
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

def crop_face(gray, face): #Crop the given face
    for (x, y, w, h) in face:
        faceslice = gray[y:y+h, x:x+w]
    return faceslice

while True:
    ret, frame = video_capture.read() #Grab frame from webcam. Ret is 'true' if the frame was successfully grabbed.
    
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) #Convert image to grayscale to improve detection speed and accuracy
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    clahe_image = clahe.apply(gray)

    #Run classifier on frame
    face = facecascade.detectMultiScale(clahe_image, scaleFactor=1.1, minNeighbors=15, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)

    for (x, y, w, h) in face: #Draw rectangle around detected faces
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 0, 255), 2) #draw it on "frame": top-left corner, bottom-right corner, colour (BGR), thickness 2

    if len(face) == 1: #Only proceed if exactly one face is detected; zero or multiple detections are treated as an error (unless multiple persons really are in the image)
        faceslice = crop_face(gray, face) #slice face from image
        cv2.imshow("detect", faceslice) #display sliced face
    else:
        print("no/multiple faces detected, passing over frame")

    cv2.imshow("webcam", frame) #Display frame
    
    
    if cv2.waitKey(1) & 0xFF == ord('q'): #imshow expects a termination definition to work correctly, here it is bound to key 'q'
        break

 

If all works well you should get two windows: one with the webcam stream, another with the cropped grayscale face. The cropped window updates only when a face is successfully detected in the webcam stream.

To predict the emotion accurately we might want more than one facial image. Blurred images are one error source (especially in low-light conditions); incorrect classification of an otherwise good image is another. Assuming that at least some of these errors are randomly distributed through the detection results, averaging the classification over multiple images (say 10) will improve results without much added hassle or extra code, and it takes care of the blurred-image problem too (unless the majority of the images are blurred). Just store the detected faces in a dict object and test against a length criterion. Also adapt the crop_face() function, like this:

~

facedict = {} #Create face dictionary

def crop_face(gray, face):
    for (x, y, w, h) in face:
        faceslice = gray[y:y+h, x:x+w]
    facedict["face%s" %(len(facedict)+1)] = faceslice #append sliced face as a numbered face to the dictionary
    return faceslice
    
#At the end of the file, in the 'while True' loop, add a simple stop criterion:
if len(facedict) == 10:
    break #This will stop the program once 10 faces are collected.

 

This should result in a dictionary object containing the pixel data of 10 detected faces. Great! The infrastructure of our music player is mostly done. Now let’s look at predicting emotions from the collected facial images.

 


Detecting the emotion in a face
To detect the actual emotion on your face we could use a generalized model, but a better solution might be an individual one. A model trained on a single person works much better on that same person, because there is less variance in the data (here: facial features). If we minimise the variance by keeping the face the same, most of the detected differences will be due to a different emotion being expressed. You do need to collect varied images of yourself to make this robust. Let’s expand the webcam script a bit to accommodate this process: add a flag ("--update") and use argparse to parse it. This makes it easy to switch the mode our program runs in (‘update mode’ or ‘normal mode’).

Also let’s restructure the main code so that it’s all neatly in functions:

~

import cv2
import numpy as np
import argparse
import time
import glob
import os
import Update_Model

video_capture = cv2.VideoCapture(0)
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
fishface = cv2.createFisherFaceRecognizer()
try:
    fishface.load("trained_emoclassifier.xml")
except:
    print("no xml found. Using --update will create one.")
parser = argparse.ArgumentParser(description="Options for the emotion-based music player") #Create parser object
parser.add_argument("--update", help="Call to grab new images and update the model accordingly", action="store_true") #Add --update argument
args = parser.parse_args() #Store any given arguments in an object

facedict = {}
emotions = ["angry", "happy", "sad", "neutral"]

def crop_face(clahe_image, face):
    for (x, y, w, h) in face:
        faceslice = clahe_image[y:y+h, x:x+w]
        faceslice = cv2.resize(faceslice, (350, 350))
    facedict["face%s" %(len(facedict)+1)] = faceslice
    return faceslice

def update_model(emotions):
    print("Model update mode active")
    check_folders(emotions)
    for i in range(0, len(emotions)):
        save_face(emotions[i])
    print("collected images, looking good! Now updating model...")
    Update_Model.update(emotions)
    print("Done!")

def check_folders(emotions): #check if folder infrastructure is there, create if absent
    for x in emotions:
        if os.path.exists("dataset\\%s" %x):
            pass
        else:
            os.makedirs("dataset\\%s" %x)

def save_face(emotion):
    print("\n\nplease look " + emotion + " when the timer expires and keep the expression stable until instructed otherwise.")
    for i in range(0,5):#Timer to give you time to read what emotion to express
        print(5-i)
        time.sleep(1)
    while len(facedict.keys()) < 16: #Keep detecting until 16 face images have been collected for this emotion
        detect_face()
    for x in facedict.keys(): #save contents of dictionary to files
        cv2.imwrite("dataset\\%s\\%s.jpg" %(emotion, len(glob.glob("dataset\\%s\\*" %emotion))), facedict[x])
    facedict.clear() #clear dictionary so that the next emotion can be stored

def recognize_emotion():
    predictions = []
    confidence = []
    for x in facedict.keys():
        pred, conf = fishface.predict(facedict[x])
        cv2.imwrite("images\\%s.jpg" %x, facedict[x])
        predictions.append(pred)
        confidence.append(conf)
    print("I think you're %s" %emotions[max(set(predictions), key=predictions.count)])

def grab_webcamframe():
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    clahe_image = clahe.apply(gray)
    return clahe_image

def detect_face():
    clahe_image = grab_webcamframe()
    face = facecascade.detectMultiScale(clahe_image, scaleFactor=1.1, minNeighbors=15, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)
    if len(face) == 1: 
        faceslice = crop_face(clahe_image, face)
        return faceslice
    else:
        print("no/multiple faces detected, passing over frame")

while True:
    detect_face()
    if args.update: #If update flag is present, call update function
        update_model(emotions)
        break
    elif len(facedict) == 10: #otherwise it's a regular run; continue normally with the emotion detection functionality
        recognize_emotion()
        break

 

I’ve streamlined the model training code from the previous tutorial a bit; download the Update_Model.py file and put it in the same folder as the main file you’re working on. Follow the previous tutorial if you want more details on how this works.
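
For reference, here is a rough sketch of what that helper does, based on the training code from the previous post: it reads every image saved under dataset\<emotion>, trains a Fisher Face recognizer on them, and saves the result as “trained_emoclassifier.xml”. The downloadable Update_Model.py is the one to actually use; this sketch only illustrates the idea.

~

#Update_Model.py -- rough sketch of the training helper, based on the previous tutorial
import cv2
import glob
import numpy as np

fishface = cv2.createFisherFaceRecognizer()

def make_sets(emotions):
    training_data = []
    training_labels = []
    for emotion in emotions:
        for item in glob.glob("dataset\\%s\\*" %emotion): #every image collected for this emotion
            image = cv2.imread(item)
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            training_data.append(gray)
            training_labels.append(emotions.index(emotion)) #the label is the index in the emotions list
    return training_data, training_labels

def update(emotions):
    training_data, training_labels = make_sets(emotions)
    fishface.train(training_data, np.asarray(training_labels)) #train the Fisher Face recognizer
    fishface.save("trained_emoclassifier.xml") #the main program loads this file at startup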

Once the file is downloaded, call the main program in “update mode” like this:

~

#To activate update mode, pass the --update flag
python <filename.py> --update

#To get help, use the -h or --help flag
python <filename.py> -h
python <filename.py> --help

 

This will present you with instructions on what emotion to express. The collected images are added to the dataset folders and the model automatically updates itself. Extra emotions are also picked up automatically if you update the ‘emotions’ list at the top of the file. Cool! Now you can try and see if you can detect “resting bitch face” as an emotion.
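
For example, adding a fifth expression is just a matter of extending that list and running update mode again (the extra label below is hypothetical; use whatever you want to detect):

~

#Extend the list at the top of the file, then run the program with --update again
emotions = ["angry", "happy", "sad", "neutral", "surprised"]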

Note that if you train and test right away, you will likely get high accuracy. However, running it a day later might give noticeably worse results. Why? Mostly because a day later there is a good chance that there are more differences between the images than just the emotions, which can throw off the classifier. Remember that the classifier looks for differences in patterns, not emotions. The fact that we can differentiate between emotions at all is because different emotions induce predictable differences in the pixel data of the images. So, the next day your hair may be different, you may be in a different room if you use a laptop, the angle of your face to the webcam may be different, the lighting conditions may be different, a combination of these may play a part, or there may be other reasons such as unicorns. This all introduces extra variance in the pixel data that is not related to the actual emotions.

The best way to counter this and get robust performance is to add images for each emotion daily for about a week and re-train the model each time, so that you get a well-rounded representation with different types of images representing the same emotion. If you use a laptop, train the model in the different rooms where you frequently work. After some time the variance induced by day-to-day variation will become less relevant to the model. In other words, it will learn which features are irrelevant and have nothing to do with emotion, and learn to ignore these.

So be sure to use the ‘--update’ flag and follow the instructions daily for about a week, or until the performance is satisfactory.

 


Selecting and playing the right music for you
Now that we can detect the emotion on your face, and have done some basic error reduction, playing the right music is actually not that hard. The best way to open music files is to use the infrastructure already available on the computer the software runs on, rather than some Python module that can open MP3s. Using modules might require extra installation on the target computer, and generally gives the user less (friendly) control over what happens after the file has been opened. We could use os.startfile(), but that is Windows-only, and although that is the largest ecosystem by far, it’s not very nice to Apple or Linux users. Using a suggestion from Stack Overflow (credited in the code below), a simple solution that opens files on Windows, Apple and Linux platforms alike is implemented in the open_stuff() function.

~


import sys, os, subprocess

def open_stuff(filename): #Open the file, credit to user4815162342, on the stackoverflow link in the text above
    if sys.platform == "win32":
        os.startfile(filename)
    else:
        opener ="open" if sys.platform == "darwin" else "xdg-open"
        subprocess.call([opener, filename])
 

 

To allow easy customization of the music attached to each emotion we will use an Excel file (EmotionLinks.xlsx) to list the files to open. In the columns you can put relative or absolute paths to the files or links you want opened when a certain emotion is detected. Use backslashes, not forward slashes, for the folder structure.
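
If you’d rather generate a starter EmotionLinks.xlsx from code than build it by hand, a minimal sketch like the following would do. The file paths are placeholders, the column names are assumed to match the ‘emotions’ list used in the main script, and pandas.Series is used because the columns may have different lengths. Writing .xlsx files requires an Excel writer module such as openpyxl.

~

import pandas

#Hypothetical starter file: replace the paths with your own music files or playlists
links = {
    "angry": pandas.Series(["music\\calm_song.mp3", "music\\metal_song.mp3"]),
    "happy": pandas.Series(["music\\upbeat_song.mp3"]),
    "sad": pandas.Series(["music\\comfort_song.mp3"]),
    "neutral": pandas.Series(["music\\background_playlist.m3u"]),
}
pandas.DataFrame(links).to_excel("EmotionLinks.xlsx", index=False) #write the spreadsheet

The main script then reads this file back in and builds a dictionary of actions per emotion: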

~

import pandas, random

df = pandas.read_excel("EmotionLinks.xlsx") #open Excel file
actions = {} #Create dictionary that will hold the list of files/links per emotion
actions["angry"] = [x for x in df.angry.dropna()] #We need the dropna() when columns are uneven in length, which creates NaN values at the empty cells. The OS won't know what to do with these if we try to open them.
actions["happy"] = [x for x in df.happy.dropna()]
actions["sad"] = [x for x in df.sad.dropna()]
actions["neutral"] = [x for x in df.neutral.dropna()]

#And we alter recognize_emotion() to retrieve the appropriate action list and pick a random item:
def recognize_emotion():
    predictions = []
    confidence = []
    for x in facedict.keys():
        pred, conf = fishface.predict(facedict[x])
        cv2.imwrite("images\\%s.jpg" %x, facedict[x])
        predictions.append(pred)
        confidence.append(conf)
    recognized_emotion = emotions[max(set(predictions), key=predictions.count)]
    print("I think you're %s" %recognized_emotion)
    actionlist = [x for x in actions[recognized_emotion]] #<----- get list of actions/files for detected emotion
    random.shuffle(actionlist) #<----- Randomly shuffle the list
    open_stuff(actionlist[0]) #<----- Open the first entry in the list

 

Some may have noticed that this code is actually not restricted to music files at all. Because we use a general “open file” method, we basically let the OS figure out how to open whatever we give it. This gives a very cool (at least I think it is) added benefit: we can open anything the OS knows how to handle, so not just music but also images, videos, documents, links to webpages (YouTube!), scripts, video games, etc. Suddenly the emotion-based music player can also be configured to cheer you up by, for example, changing your desktop wallpaper to happy images if you’re sad; if you’re angry it may change it to calming images; if you’re happy you can get a unicorn overload if you so desire. It may even detect that you want to let off some steam and fire up Killing Floor 2 or a similar game.

The final code will look something like this:

~

import cv2, numpy as np, argparse, time, glob, os, sys, subprocess, pandas, random, Update_Model, math

#Define variables and load classifier
camnumber = 0
video_capture = cv2.VideoCapture()
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
fishface = cv2.createFisherFaceRecognizer()
try:
    fishface.load("trained_emoclassifier.xml")
except:
    print("no trained xml file found, please run program with --update flag first")
parser = argparse.ArgumentParser(description="Options for the emotion-based music player")
parser.add_argument("--update", help="Call to grab new images and update the model accordingly", action="store_true")
args = parser.parse_args()
facedict = {}
actions = {}
emotions = ["angry", "happy", "sad", "neutral"]
df = pandas.read_excel("EmotionLinks.xlsx") #open Excel file
actions["angry"] = [x for x in df.angry.dropna()] #We need de dropna() when columns are uneven in length, which creates NaN values at missing places. The OS won't know what to do with these if we try to open them.
actions["happy"] = [x for x in df.happy.dropna()]
actions["sad"] = [x for x in df.sad.dropna()]
actions["neutral"] = [x for x in df.neutral.dropna()]

def open_stuff(filename): #Open the file, credit to user4815162342, on the stackoverflow link in the text above
    if sys.platform == "win32":
        os.startfile(filename)
    else:
        opener ="open" if sys.platform == "darwin" else "xdg-open"
        subprocess.call([opener, filename])

def crop_face(clahe_image, face):
    for (x, y, w, h) in face:
        faceslice = clahe_image[y:y+h, x:x+w]
        faceslice = cv2.resize(faceslice, (350, 350))
    facedict["face%s" %(len(facedict)+1)] = faceslice
    return faceslice

def update_model(emotions):
    print("Model update mode active")
    check_folders(emotions)
    for i in range(0, len(emotions)):
        save_face(emotions[i])
    print("collected images, looking good! Now updating model...")
    Update_Model.update(emotions)
    print("Done!")

def check_folders(emotions):
    for x in emotions:
        if os.path.exists("dataset\\%s" %x):
            pass
        else:
            os.makedirs("dataset\\%s" %x)

def save_face(emotion):
    print("\n\nplease look " + emotion + ". Press enter when you're ready to have your pictures taken")
    raw_input() #Wait until enter is pressed with the raw_input() method
    video_capture.open(camnumber)
    while len(facedict.keys()) < 16:
        detect_face()
    video_capture.release()
    for x in facedict.keys():
        cv2.imwrite("dataset\\%s\\%s.jpg" %(emotion, len(glob.glob("dataset\\%s\\*" %emotion))), facedict[x])
    facedict.clear() 
    
def recognize_emotion():
    predictions = []
    confidence = []
    for x in facedict.keys():
        pred, conf = fishface.predict(facedict[x])
        cv2.imwrite("images\\%s.jpg" %x, facedict[x])
        predictions.append(pred)
        confidence.append(conf)
    recognized_emotion = emotions[max(set(predictions), key=predictions.count)]
    print("I think you're %s" %recognized_emotion)
    actionlist = [x for x in actions[recognized_emotion]] #get list of actions/files for detected emotion
    random.shuffle(actionlist) #Randomly shuffle the list
    open_stuff(actionlist[0]) #Open the first entry in the list

def grab_webcamframe():
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    clahe_image = clahe.apply(gray)
    return clahe_image

def detect_face():
    clahe_image = grab_webcamframe()
    face = facecascade.detectMultiScale(clahe_image, scaleFactor=1.1, minNeighbors=15, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)
    if len(face) == 1: 
        faceslice = crop_face(clahe_image, face)
        return faceslice
    else:
        print("no/multiple faces detected, passing over frame")

def run_detection():
    while len(facedict) != 10:
        detect_face()
    recognize_emotion()

if args.update:
    update_model(emotions)
else:
    video_capture.open(camnumber)
    run_detection()

 

Note that the full listing in the bonus section below also adds another flag: “--retrain”. This is for when you manually add photos to the training directories; the flag re-trains the model on everything it finds in these folders. Another few changes have been made as well, for example the timer in update mode has been replaced by a prompt asking you to press enter when you’re ready to have your picture taken. This gives the user a bit more control over when they are ready for it.
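
For reference, wiring in that flag only takes two small additions, both of which appear in the bonus-section listing further down:

~

#Extra argparse flag:
parser.add_argument("--retrain", help="Call to re-train the model based on all images in training folders", action="store_true")

#...and at the bottom of the file, an extra branch before the normal run:
if args.update:
    update_model(emotions)
elif args.retrain:
    Update_Model.update(emotions)
else:
    video_capture.open(camnumber)
    run_detection()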

 


Bonus section: changing wallpaper to balance mood
Before we start, note that this functionality is Windows-only (although making a Linux or OS X version should not be difficult). One use case is having your desktop wallpaper adapt to your mood, either to reinforce positive moods or to lessen negative ones. Using the above model (with a well-trained classifier) and a few small adaptations, something can be cooked up. Here is an example that does a few things:

  • It adds a timer function that checks with settable intervals what your emotion is;
  • It checks the folder containing wallpapers fitting the detected mood, and randomly picks one;
  • It calls a batch file that does the actual changing of the wallpaper.

Store your wallpaper files in a folder structure “wallpapers\emotion”. Make sure the emotion names match the ones in the code (angry, happy, sad, neutral), so that you end up with “wallpapers\angry”, “wallpapers\happy”, etc. Put the wallpapers that you want to see when you’re in a particular mood in these folders. We add three functions to make it work (wallpaper_timer(), change_wallpaper(), and setWallpaperWithCtypes()), change a few others and add another flag for argparse to listen to: “--wallpaper”. The full code will then look something like this:

~

import cv2, numpy as np, argparse, time, glob, os, sys, subprocess, pandas, random, Update_Model, math, ctypes, win32con

#Define variables and load classifier
camnumber = 0
video_capture = cv2.VideoCapture()
facecascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
fishface = cv2.createFisherFaceRecognizer()
try:
    fishface.load("trained_emoclassifier.xml")
except:
    print("no trained xml file found, please run program with --update flag first")
parser = argparse.ArgumentParser(description="Options for the emotion-based music player")
parser.add_argument("--update", help="Call to grab new images and update the model accordingly", action="store_true")
parser.add_argument("--retrain", help="Call to re-train the the model based on all images in training folders", action="store_true") #Add --update argument
parser.add_argument("--wallpaper", help="Call to run the program in wallpaper change mode. Input should be followed by integer for how long each change cycle should last (in seconds)", type=int) #Add --update argument
args = parser.parse_args()
facedict = {}
actions = {}
emotions = ["angry", "happy", "sad", "neutral"]
df = pandas.read_excel("EmotionLinks.xlsx") #open Excel file
actions["angry"] = [x for x in df.angry.dropna()] #We need de dropna() when columns are uneven in length, which creates NaN values at missing places. The OS won't know what to do with these if we try to open them.
actions["happy"] = [x for x in df.happy.dropna()]
actions["sad"] = [x for x in df.sad.dropna()]
actions["neutral"] = [x for x in df.neutral.dropna()]

def open_stuff(filename): #Open the file, credit to user4815162342, on the stackoverflow link in the text above
    if sys.platform == "win32":
        os.startfile(filename)
    else:
        opener ="open" if sys.platform == "darwin" else "xdg-open"
        subprocess.call([opener, filename])

def crop_face(clahe_image, face):
    for (x, y, w, h) in face:
        faceslice = clahe_image[y:y+h, x:x+w]
        faceslice = cv2.resize(faceslice, (350, 350))
    facedict["face%s" %(len(facedict)+1)] = faceslice
    return faceslice

def update_model(emotions):
    print("Model update mode active")
    check_folders(emotions)
    for i in range(0, len(emotions)):
        save_face(emotions[i])
    print("collected images, looking good! Now updating model...")
    Update_Model.update(emotions)
    print("Done!")

def check_folders(emotions):
    for x in emotions:
        if os.path.exists("dataset\\%s" %x):
            pass
        else:
            os.makedirs("dataset\\%s" %x)

def save_face(emotion):
    print("\n\nplease look " + emotion + ". Press enter when you're ready to have your pictures taken")
    raw_input() #Wait until enter is pressed with the raw_input() method
    video_capture.open(camnumber)
    while len(facedict.keys()) < 16:
        detect_face()
    video_capture.release()
    for x in facedict.keys():
        cv2.imwrite("dataset\\%s\\%s.jpg" %(emotion, len(glob.glob("dataset\\%s\\*" %emotion))), facedict[x])
    facedict.clear() 
    
def recognize_emotion():
    predictions = []
    confidence = []
    for x in facedict.keys():
        pred, conf = fishface.predict(facedict[x])
        cv2.imwrite("images\\%s.jpg" %x, facedict[x])
        predictions.append(pred)
        confidence.append(conf)
    recognized_emotion = emotions[max(set(predictions), key=predictions.count)]
    print("I think you're %s" %recognized_emotion)
    return recognized_emotion

def grab_webcamframe():
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    clahe_image = clahe.apply(gray)
    return clahe_image

def detect_face():
    clahe_image = grab_webcamframe()
    face = facecascade.detectMultiScale(clahe_image, scaleFactor=1.1, minNeighbors=15, minSize=(10, 10), flags=cv2.CASCADE_SCALE_IMAGE)
    if len(face) == 1: 
        faceslice = crop_face(clahe_image, face)
        return faceslice
    else:
        print("no/multiple faces detected, passing over frame")

def run_detection():
    while len(facedict) != 10:
        detect_face()
    recognized_emotion = recognize_emotion()
    return recognized_emotion

def wallpaper_timer(seconds):
    video_capture.release()
    time.sleep(int(seconds))
    video_capture.open(camnumber)
    facedict.clear()

def change_wallpaper(emotion):
    files = glob.glob("wallpapers\\%s\\*.bmp" %emotion)
    current_dir = os.getcwd()
    random.shuffle(files)
    file = "%s\%s" %(current_dir, files[0])
    setWallpaperWithCtypes(file)

def setWallpaperWithCtypes(path): #Taken from http://www.blog.pythonlibrary.org/2014/10/22/pywin32-how-to-set-desktop-background/
    cs = ctypes.c_buffer(path)
    ok = ctypes.windll.user32.SystemParametersInfoA(win32con.SPI_SETDESKWALLPAPER, 0, cs, 0)

if args.update:
    update_model(emotions)
elif args.retrain:
    Update_Model.update(emotions)
elif args.wallpaper:
    cycle_time = args.wallpaper
    while True:
        wallpaper_timer(cycle_time)
        recognized_emotion = run_detection()
        change_wallpaper(recognized_emotion)
else:
    video_capture.open(camnumber)
    recognized_emotion = run_detection()
    actionlist = [x for x in actions[recognized_emotion]] #get list of actions/files for detected emotion
    random.shuffle(actionlist) #Randomly shuffle the list
    open_stuff(actionlist[0]) #Open the first entry in the list

 

Call it with the ‘--wallpaper’ flag. Note that you also need to give an integer with the wallpaper flag, like so:

~

python <filename.py> --wallpaper 300 #this will update the wallpaper every 5 minutes (300 sec)
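
Note that change_wallpaper() only looks for .bmp files, so save your wallpapers in that format. If the wallpaper folders don’t exist yet, a few lines like these (a one-off hypothetical helper, not part of the tutorial files) will create the structure described above:

~

import os

#Create the wallpapers\<emotion> folder structure the wallpaper mode expects
for emotion in ["angry", "happy", "sad", "neutral"]:
    if not os.path.exists("wallpapers\\%s" %emotion):
        os.makedirs("wallpapers\\%s" %emotion)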

 


Wrapping up
So far we have come quite a way and learned some pretty cool things. You can now detect faces on a webcam stream (or on a video file, by changing a line or two), extract them, preprocess them, predict emotions from them, dynamically update the detection model, and execute emotion-related programs/actions.
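
For instance, running the detection on a video file instead of the live webcam only requires pointing VideoCapture at the file (the path below is a placeholder):

~

#Open a video file instead of the webcam; the rest of the pipeline stays the same
video_capture = cv2.VideoCapture("some_recording.mp4")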

In my tests, after three days of training three times a day (once in the morning, afternoon and evening) the model made only a single mistake in 70 trials. I did make my expressions rather clear. If you want the model to recognize more subtle expressions this is also very possible, but you might get a few more mistakes. Keep in mind that in this case the training material you supply should also be more subtle, and it will likely take more training runs before the model gets most of your subtler expressions right.

As always, you are free to use the code in this tutorial within the restrictions of the license mentioned at the top. Please also let me know what cool things you’ve made! I hope you enjoyed this tutorial.


29 Comments

  • Bas

    20th June 2016

    You should add voice recognition in there somewhere. “Computer, what video game am I in the mood for?”

    • Paul van Gent

      20th June 2016

      Actually I’m working on something similar. Keep an eye on the blog!

  • Junior

    23rd August 2016

    Hie Paul.
    Thank you for the tutorial. I do not have background in machine learning but i have some understanding of programming and i was able to understand the flow of your program. However, i encountered an error when i tried to resize the image captured from the video. It says src is not numpy array or a scalar.

    • Paul van Gent

      23rd August 2016

      Hi Artwel, you could try recasting the img as an array using numpy.asarray(). The documentation here explains how the function works. You could also mail me your code and I’ll have a look.

      • Junior

        23rd August 2016

        Ok. I will have a look.
        Thank you

  • Stranger

    19th September 2016

    You need to update the names of emotions if you want to be consonant with previous article. There were anger not angry and sadness not sad

    • Paul van Gent

      19th September 2016

      Hi Stranger,
      That is correct. However the tutorial is separate from the previous one in terms of code, apart from using the same technique. You could actually use any labels you desire!

  • bilal rafique

    4th April 2017

    Paul,
    when i run the code of “Detecting the emotion in a face”, i get this error
    no xml found. Using –update will create one.

    Traceback (most recent call last):
    File “F:\In Shaa Allah\Ex3.py”, line 89, in
    recognize_emotion()
    File “F:\In Shaa Allah\Ex3.py”, line 61, in recognize_emotion
    pred, conf = fishface.predict(facedict[x])
    error: ..\..\..\..\opencv\modules\contrib\src\facerec.cpp:620: error: (-5) This Fisherfaces model is not computed yet. Did you call Fisherfaces::train? in function cv::Fisherfaces::predict

    • Paul van Gent

      8th April 2017

      Please read the tutorial and code comments, explanation of the function is in there. Don’t just paste the code!

  • Silver

    2nd May 2017

    Hi Paul.
    Thank you for the tutorial. I’m sorry to bother you, as my experience in programming is actually poor, even i have read this tutorial several times, i still have some problems.

    1. could you please tell me what IDE you are using?
    2. where should i put these codes? in the “update_model” or the main program?

    #To activate update mode, pass the –update flag
    python –update

    #To get help, use the -h or –help flag
    python -h
    python –help

    Thank you!

    • Paul van Gent

      4th May 2017

      Hi Silver. I use Visual Studio 2015, although you could use many different IDE’s, so try a few and see what you like. All the code goes in the same folder. However, if you’re still new to programming I would recommend you try a few more basic Python tutorials first, as this one assumes intermediate Python knowledge.

  • mostafa

    13th May 2017

    what is “import Update_Model “

  • Xcross

    24th July 2017

    Sir,please send me the source code.

    • Paul van Gent

      24th July 2017

      All you need is on here. Just follow the tutorial and you should be fine :).

      • Xcross

        24th July 2017

        i follow all the steps … but some how i lost my working file..
        sir , if you give me the source file then it will be very helpful for me.. please sir

  • wiem

    30th July 2017

    Dear Paul van Gent,
    Hello ! thanks alot for sharing your code. I studied your work in emotion-analysis. I’m working too on facial expressions using SVM, as I see your code extract face landmarks using DLIB and train a multi-class SVM classifier to recognize facial expressions (emotions). After that you trained a SVM classifier to generate a “emo-analysis.pkl” file.
    Now I want to do training with my own dataset. However when I run the training it gives me this error:

    Enter 1 to train and 2 to predict
    1
    Making sets 0
    done
    training SVM linear 0
    /usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
    DeprecationWarning)
    Traceback (most recent call last):
    File “faceDetectionDlib.py”, line 168, in
    main()
    File “faceDetectionDlib.py”, line 135, in main
    clf.fit(npar_train, training_labels)
    File “/usr/local/lib/python3.5/dist-packages/sklearn/svm/base.py”, line 151, in fit
    X, y = check_X_y(X, y, dtype=np.float64, order=’C’, accept_sparse=’csr’)
    File “/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py”, line 521, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
    File “/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py”, line 424, in check_array
    context))
    ValueError: Found array with 0 feature(s) (shape=(1, 0)) while a minimum of 1 is required.
    Would you help me please and guide me how could I solve this error and train the this code using other datasets ????
    Thank you.

    • Paul van Gent

      30th July 2017

      Hi Wiem. The last bit of the error tells what’s going wrong: “ValueError: Found array with 0 feature(s) (shape=(1, 0)) while a minimum of 1 is required.”. Apparently you’re feeding an empty object to the classifier. Try debugging why no images are loaded (is path correct? can it find files? are permissions set ok?)
      -Paul

  • d.Williams

    7th August 2017

    hey its very useful post,i successfully run your code but i have one problem
    if i add more than one song in excel file,it just play one song i want to know how one by one song play ??
    Thank you

    • Paul van Gent

      13th August 2017

      Simple solution would be to create a .m3u playlist with all songs, and put only that in the excel.

      Or change the code to just load that playlist whenever a specific emotion is detected. The excel is only there to make editing lists easier. Sounds like you don’t need that

  • pooneh

    25th August 2017

    hi paul may i ask you whats the version of your python interpreter and your open_cv? mine is 2.7.5 and 2.4.9 ,respectively which Dont know “Update_Model” module!!!
    thanks alot 🙂

    • Paul van Gent

      25th August 2017

      Take a look at the text as well, it explains how it all works. You need to download a little extra file for this.

  • pooneh

    25th August 2017

    oh! yeah thanks:)

  • Thari

    26th September 2017

    Could you please explain how this part of the code is works?

    face_slicedr = clahe_imager[y:y+h, x:x+w]
    face_slicedr = cv2.resize(face_slicedr, (350, 350))
    face_dictionary[“face%s” %(len(face_dictionary)+1)] = face_slicedr
    predictions = []
    confidence = []
    for x in face_dictionary.keys():
    pred, conf = fishface.predict(face_dictionary[x])
    predictions.append(pred)
    confidence.append(conf)
    recognised_emotionr = emotions[max(set(predictions), key=predictions.count)]
    print(” you’re %s” %recognised_emotionr)

    • Paul van Gent

      27th September 2017

      Hi Thari. The first bit of the code you reference does not seem to occur on the page (did you maybe already adapt it?). Starting from “predictions = []” seems to be the “recognize_emotion()” function. What this does is get all the cropped face images from the created dictionary, predict the emotion on each face, and write down both the emotion prediction and the corresponding confidence.

      • Thari

        28th September 2017

        Hi Paul,

        Yes, I adopted it.
        Thank you for your explanation.
        Could you please further explain how predict the emotion and get corresponding confidence values works or any other source to look for ?
        It takes 3 to 4 seconds to detect a emotion and sometimes it displays the same detected emotion over and over again. why is that ?

  • limit

    1st October 2017

    i am getting this error
    OpenCV Error: Unspecified error (File can’t be opened for writing!) in cv::FaceRecognizer::load, file ..\..\..\..\opencv\modules\contrib\src\facerec.cpp, line 398
    no trained xml file found, please run program with –update flag first
    Traceback (most recent call last):
    File “player.py”, line 18, in
    df = pandas.read_excel(“EmotionLinks.xlsx”) #open Excel file
    File “C:\Python27\lib\site-packages\pandas\io\excel.py”, line 203, in read_excel
    io = ExcelFile(io, engine=engine)
    File “C:\Python27\lib\site-packages\pandas\io\excel.py”, line 232, in __init__
    import xlrd # throw an ImportError if we need to
    ImportError: No module named xlrd

    • Paul van Gent

      2nd October 2017

      “python -m pip install xlrd” in terminal or command prompt.

      Make sure you have elevated privileges (run as admin on windows, put “sudo” in front of the command on linux or osx).

      • limit

        11th October 2017

        thanks..

