Emotion Recognition using Facial Landmarks, Python, DLib and OpenCV


Let’s improve on the emotion recognition from a previous article about FisherFace Classifiers. We will be using facial landmarks and a machine learning algorithm, and see how well we can predict emotions in different individuals, rather than on a single individual like in another article about the emotion recognising music player.


Important: The code in this tutorial is licensed under the GNU 3.0 open source license and you are free to modify and redistribute the code, given that you give others you share the code with the same right, and cite my name (use citation format below). You are not free to redistribute or modify the tutorial itself in any way. By reading on you agree to these terms. If you disagree, please navigate away from this page.

Troubleshooting: I assume intermediate knowledge of Python for these tutorials. If you don’t have this, please try a few more basic tutorials first or follow an entry-level course on coursera or something similar. This also means you know how to interpret errors. Don’t panic but first read the thing, google if you don’t know the solution, only then ask for help. I’m getting too many emails and requests over very simple errors. Part of learning to program is learning to debug on your own as well. If you really can’t figure it out, let me know.

Citation format
van Gent, P. (2016). Emotion Recognition Using Facial Landmarks, Python, DLib and OpenCV. A tech blog about fun things with Python and embedded electronics. Retrieved from: http://www.paulvangent.com/2016/08/05/emotion-recognition-using-facial-landmarks/




Introduction and getting started
Using facial landmarks is another approach to detecting emotions, more robust and powerful than the FisherFace classifier we used earlier, but it also requires some more code and modules. Nothing insurmountable though. We need to do a few things:

  • Get images from a webcam
  • Detect Facial Landmarks
  • Train a machine learning algorithm (we will use a linear SVM)
  • Predict emotions

Those who followed the two previous posts about emotion recognition will know that the first step is already done.

Also we will be using:

  • Python (2.7 or higher is fine, anaconda + jupyter notebook is a nice combo-package)
  • OpenCV (I still use 2.4.9……so lazy, grab here)
  • SKLearn (if you installed anaconda, it is already there, otherwise get it with pip install sklearn)
  • Dlib (a C++ library for extracting the facial landmarks, see below for instructions)
  • Visual Studio 2015 (get the community edition here, also select the Python Tools and the Common tools for visual c++ in the installation dialog)

Installing and building the required libraries

I am on Windows, and building libraries on Windows tends to leave a bad taste in many people’s mouths. I can understand why; however, it’s not all bad, and often the problems people run into are solved by correctly setting PATH variables, providing the right compiler, or reading the error messages and installing the right dependencies. I will walk you through the process of compiling and installing Dlib.

First install CMake. This should be straightforward, download the windows installer and install. Make sure to select the option “Add CMake to the system PATH” during the install. Choose whether you want this for all users or just for your account.

Download Boost-Python and extract the package. I extracted it into C:\boost but it can be anything. Fire up a command prompt and navigate to the directory. Then do:

~

REM First run the bootstrap.bat file supplied with boost-python
bootstrap.bat

REM Once it finishes, invoke the install process of boost-python (this can take a while, go get a coffee)
b2 install

REM Once this finishes, build the python modules (again, this takes a while; reward yourself and get another coffee)
b2 -a --with-python address-model=64 toolset=msvc runtime-link=static

Once all is done you will find a folder named bin, or bin.v2, or something like this in your boost folder. Now it’s time to build Dlib.

Download Dlib and extract it somewhere. I used C:\Dlib but you can do it anywhere. Go back to your command prompt, or open a new one if you closed it, and navigate to your Dlib folder. Do this sequentially:

~

REM Set two variables so that CMake knows where to find the boost-python libraries
REM Make sure to set these to the path you extracted boost-python to!
set BOOST_ROOT=C:\boost
set BOOST_LIBRARYDIR=C:\boost\stage\lib

REM Create and navigate into a directory to build into
mkdir build
cd build

REM Build the dlib tools
cmake ..

REM Navigate up one level and run the python setup program
REM (this takes some time as well. GO GET ANOTHER COFFEE TIGER!)
cd ..
python setup.py install

 

Open your Python interpreter and type "import dlib". If you receive no error messages, you’re good to go! Nice.
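
If you want to check the rest of the stack in one go, here is a minimal sanity-check sketch (the printed version numbers will differ per installation; the getattr fallback is there because I’m not certain every dlib build exposes a version attribute):

~

#Quick sanity check of the installed stack
import cv2
import dlib
import sklearn

print("OpenCV: %s" % cv2.__version__)
print("scikit-learn: %s" % sklearn.__version__)
print("dlib: %s" % getattr(dlib, "__version__", "imported OK, version attribute not available"))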


Testing the landmark detector
Before diving into much of the coding (which probably won’t be much because we’ll be recycling), let’s test the DLib installation on your webcam. For this you can use the following snippet. If you want to learn how this works, be sure to also compare it with the first script under “Detecting your face on the webcam” in the previous post. Much of the OpenCV code (talking to your webcam, converting the image to grayscale, optimising the contrast with adaptive histogram equalisation, and displaying the result) is the same as what we did there.

~

#Import required modules
import cv2
import dlib

#Set up some required objects
video_capture = cv2.VideoCapture(0) #Webcam object
detector = dlib.get_frontal_face_detector() #Face detector
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat") #Landmark identifier. Set the filename to whatever you named the downloaded file

while True:
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    clahe_image = clahe.apply(gray)

    detections = detector(clahe_image, 1) #Detect the faces in the image

    for k,d in enumerate(detections): #For each detected face
        
        shape = predictor(clahe_image, d) #Get coordinates
        for i in range(0, 68): #There are 68 landmark points on each face
            cv2.circle(frame, (shape.part(i).x, shape.part(i).y), 1, (0,0,255), thickness=2) #For each point, draw a red circle with thickness 2 on the original frame

    cv2.imshow("image", frame) #Display the frame

    if cv2.waitKey(1) & 0xFF == ord('q'): #Exit program when the user presses 'q'
        break

 

This will show your webcam feed with a lot of dots outlining the shape of your face and all the “moveable parts”. The latter is of course important, because it is what makes emotional expressions possible.

Unknown-3Unknown-2

Note: if you have no webcam and/or would rather try this on a static image, replace the line ret, frame = video_capture.read() with something like frame = cv2.imread("filename") and comment out the line where we define the video_capture object (a standalone sketch of this variant follows below the images). You will get something like:

33

 

my face has dots
people tell me my face has nice dots
experts tell me these are the best dots
I bet I have the best dots
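
For completeness, here is a minimal standalone sketch of the static-image variant mentioned above (the filename my_face.jpg is just a stand-in for your own image):

~

#Static-image variant of the webcam snippet above
import cv2
import dlib

detector = dlib.get_frontal_face_detector() #Face detector
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat") #Landmark identifier

frame = cv2.imread("my_face.jpg") #Load an image from disk instead of grabbing webcam frames
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
clahe_image = clahe.apply(gray)

for d in detector(clahe_image, 1): #For each detected face
    shape = predictor(clahe_image, d) #Get landmark coordinates
    for i in range(0, 68):
        cv2.circle(frame, (shape.part(i).x, shape.part(i).y), 1, (0,0,255), thickness=2)

cv2.imshow("image", frame)
cv2.waitKey(0) #Wait for a keypress instead of looping
cv2.destroyAllWindows()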

 

 


Extracting features from the faces
The first thing to do is find ways to transform these nice dots overlaid on your face into features to feed the classifier. Features are little bits of information that describe the object or object state that we are trying to divide into categories. Is this description a bit abstract? Imagine you are in a room without windows with only a speaker and a microphone. I am outside this room and I need to make you guess whether there is a cat, dog or a horse in front of me. The rule is that I can only use visual characteristics of the animal, no names or comparisons. What do I tell you? Probably whether the animal is big or small, that it has fur, that the fur is long or short, that it has claws or hooves, whether it has a tail made of flesh or just of hair, etcetera. Each bit of information I pass you can be considered a feature, and based on the same feature set for each animal, you would be pretty accurate if I chose the features well.

How you extract features from your source data is actually where a lot of the research effort goes; it’s not just about creating better classifying algorithms, but also about finding better ways to collect and describe data. The same classifying algorithm might function tremendously well or not at all depending on how well the information we feed it is able to discriminate between different objects or object states. If, for example, we would extract eye colour and number of freckles on each face, feed them to the classifier, and then expect it to be able to predict what emotion is expressed, we would not get far. However, the facial landmarks from the same image material describe the position of all the “moving parts” of the depicted face, the things you use to express an emotion. This is certainly useful information!

To get started, let’s take the code from the example above and change it so that it fits our current needs, like this:

~

import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def get_landmarks(image):
    landmarks = [] #Will hold all landmark coordinates in the format x1,y1,x2,y2,etc.
    detections = detector(image, 1)
    for k,d in enumerate(detections): #For all detected face instances individually
        shape = predictor(image, d) #Draw Facial Landmarks with the predictor class
        xlist = []
        ylist = []
        for i in range(0, 68): #Store X and Y coordinates in two lists (all 68 points, indexed 0-67)
            xlist.append(float(shape.part(i).x))
            ylist.append(float(shape.part(i).y))

        for x, y in zip(xlist, ylist): #Store all landmarks in one list in the format x1,y1,x2,y2,etc.
            landmarks.append(x)
            landmarks.append(y)
    if len(detections) > 0:
        return landmarks
    else: #If no faces are detected, return an error message for the calling function to handle
        return "error"

 

The .dat file mentioned can be found in the DLIB zip file you downloaded, or alternatively on this link.

Here we extract the coordinates of all face landmarks. These coordinates are the first collection of features, and this might be the end of the road. You might also continue and try to derive other measures from them that will tell the classifier more about what is happening on the face. Whether this is necessary or not depends. For now let’s assume it is necessary, and look at ways to extract more information from what we have. Feature generation is always a good thing to try, if only because it brings you closer to the data and might give you ideas or alternative views of it because you’re getting your hands dirty. Later on we’ll see whether it was really necessary at a classification level.

To start, look at the coordinates. They may change as my face moves to different parts of the frame. I could be expressing the same emotion in the top left of an image as in the bottom right of another image, but the resulting coordinate matrix would express different numerical ranges. However, the relationships between the coordinates will be similar in both matrices so some information is present in a location invariant form, meaning it is the same no matter where in the picture my face is.

Maybe the most straightforward way to remove numerical differences originating from faces in different places of the image would be normalising the coordinates between 0 and 1. This is easily done by x_norm = (x - min(x)) / (max(x) - min(x)) (and likewise for y), or to put it in code:

~

xnorm = [(i-min(xlist))/(max(xlist)-min(xlist)) for i in xlist]
ynorm = [(i-min(ylist))/(max(ylist)-min(ylist)) for i in ylist]

 

However, there is a problem with this approach because it fits the entire face in a square with both axes ranging from 0 to 1. Imagine one face with its eyebrows up high and mouth open; the person could be surprised. Now imagine an angry face with eyebrows down and mouth closed. If we normalise the landmark points on both faces to 0-1 and put them next to each other, we might see two very similar faces: because the distinguishing features lie at the edges of the face, normalising pushes both back into a very similar shape. Take a moment to appreciate what we have done; we have thrown away most of the variation that would have allowed us to tell the two emotions apart in the first place! This probably will not work. Of course some variation remains from the open mouth, but it would be better not to throw so much away.

A less destructive way could be to calculate the position of all points relative to each other. To do this we calculate the mean of both axes, which results in the point coordinates of the sort-of “centre of gravity” of all face landmarks. We can then get the position of all points relative to this central point. Let me show you what I mean. Here’s my face with landmarks overlaid:

face2

First we add a “centre of gravity”, shown as a blue dot on the image below:

face3

Lastly we draw a line between the centre point and each other facial landmark location:

face4

Note that each line has both a magnitude (distance between both points) and a direction (angle relative to image where horizontal=0°), in other words, a vector.
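
For a single landmark this boils down to two numbers (a hedged example with made-up coordinates, just to illustrate the idea):

~

import math

xmean, ymean = 250.0, 300.0 #Hypothetical centre of gravity of all landmarks
x, y = 270.0, 260.0         #Hypothetical position of one landmark

magnitude = math.hypot(x - xmean, y - ymean)           #Length of the line between the two points
angle = math.degrees(math.atan2(y - ymean, x - xmean)) #Direction relative to the image's horizontal
print("magnitude: %.2f pixels, angle: %.2f degrees" % (magnitude, angle))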

But, you may ask, why don’t we take for example the tip of the nose as the central point? This would work as well, but would also throw extra variance in the mix due to short, long, high- or low-tipped noses. The “centre point method” also introduces extra variance; the centre of gravity shifts when the head turns away from the camera, but I think this is less than when using the nose-tip method because most faces more or less face the camera in our sets. There are techniques to estimate head pose and then correct for it, but that is beyond this article.

There is one last thing to note. Faces may be tilted, which might confuse the classifier. We can correct for this rotation by assuming that the bridge of the nose in most people is more or less straight, and offset all calculated angles by the angle of the nose bridge. This rotates the entire vector array so that tilted faces become similar to non-tilted faces with the same expression. Below are two images, the left one illustrates what happens in the code when the angles are calculated, the right one shows how we can calculate the face offset correction by taking the tip of the nose and finding the angle the nose makes relative to the image, and thus find the angular offset β we need to apply.

anglecalc

Now let’s look at how to implement what I described above in Python. It’s actually fairly straightforward. We just slightly modify the get_landmarks() function from above.

~

def get_landmarks(image):
    detections = detector(image, 1)
    for k,d in enumerate(detections): #For all detected face instances individually
        shape = predictor(image, d) #Draw Facial Landmarks with the predictor class
        xlist = []
        ylist = []
        for i in range(0, 68): #Store X and Y coordinates in two lists (all 68 points, indexed 0-67)
            xlist.append(float(shape.part(i).x))
            ylist.append(float(shape.part(i).y))
            
        xmean = np.mean(xlist) #Find both coordinates of centre of gravity
        ymean = np.mean(ylist)
        xcentral = [(x-xmean) for x in xlist] #Calculate distance centre <-> other points in both axes
        ycentral = [(y-ymean) for y in ylist]
        
        landmarks_vectorised = []
        for x, y, w, z in zip(xcentral, ycentral, xlist, ylist):
            landmarks_vectorised.append(w)
            landmarks_vectorised.append(z)
            meannp = np.asarray((ymean,xmean))
            coornp = np.asarray((z,w))
            dist = np.linalg.norm(coornp-meannp)
            landmarks_vectorised.append(dist)
            landmarks_vectorised.append((math.atan2(y, x)*360)/(2*math.pi))

        data['landmarks_vectorised'] = landmarks_vectorised
    if len(detections) < 1: 
        data['landmarks_vectorised'] = "error"
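
One thing this function does not yet do is the rotation correction described above: the angles it stores are still relative to the image, not offset by the nose bridge. Purely as an illustration, a minimal sketch of how such an offset could be computed is shown below. The landmark indices (27 for the top of the nose bridge, 30 for the nose tip) and the sign convention are my own assumptions, not code from this article, so treat it as a starting point rather than the method used for the results below.

~

import math

def get_nose_bridge_offset(xlist, ylist):
    #Vector from the nose tip (point 30) up along the bridge to its top (point 27).
    #For an upright face this points straight up, which in image coordinates
    #(y increases downwards) corresponds to an angle of -90 degrees.
    angle = math.degrees(math.atan2(ylist[27] - ylist[30], xlist[27] - xlist[30]))
    return angle + 90 #Deviation from vertical, i.e. the tilt of the face

#Inside the landmark loop you could then store the corrected angle instead:
#landmarks_vectorised.append(math.degrees(math.atan2(y, x)) - get_nose_bridge_offset(xlist, ylist))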

 

That was actually quite manageable, no? Now it’s time to put all of the above together with some stuff from the first post. The goal is to read the existing dataset into a training and a prediction set with corresponding labels, train the classifier (we use a Support Vector Machine with linear kernel from SKLearn, but feel free to experiment with other available kernels such as polynomial or rbf, or with other classifiers!), and evaluate the result. This evaluation will be done in two steps: first we get an overall accuracy over ten different data segmentation, training and prediction runs; second, we evaluate the predictive probabilities.


Déjà Vu All Over Again

The next thing we will be doing is returning to the two datasets from the original post. Let’s see how this approach stacks up.

First let’s write some code. The approach is to first extract facial landmark points from the images, randomly divide 80% of the data into a training set and 20% into a test set, then feed these into the classifier and train it on the training set. Finally we evaluate the resulting model by predicting what is in the test set to see how the model handles the unknown data. Basically a lot of the steps are the same as what we did earlier.

The quick and dirty (I will clean and ‘pythonify’ the code later, when there is time) solution based off of earlier code could be something like:

~

import cv2
import glob
import random
import math
import numpy as np
import dlib
import itertools
from sklearn.svm import SVC

emotions = ["anger", "contempt", "disgust", "fear", "happiness", "neutral", "sadness", "surprise"] #Emotion list
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat") #Or set this to whatever you named the downloaded file
clf = SVC(kernel='linear', probability=True, tol=1e-3)#, verbose = True) #Set the classifier as a support vector machine with linear kernel

data = {} #Make dictionary for all values
#data['landmarks_vectorised'] = []

def get_files(emotion): #Define function to get file list, randomly shuffle it and split 80/20
    files = glob.glob("dataset\\%s\\*" %emotion)
    random.shuffle(files)
    training = files[:int(len(files)*0.8)] #get first 80% of file list
    prediction = files[-int(len(files)*0.2):] #get last 20% of file list
    return training, prediction

def get_landmarks(image):
    detections = detector(image, 1)
    for k,d in enumerate(detections): #For all detected face instances individually
        shape = predictor(image, d) #Draw Facial Landmarks with the predictor class
        xlist = []
        ylist = []
        for i in range(0, 68): #Store X and Y coordinates in two lists (all 68 points, indexed 0-67)
            xlist.append(float(shape.part(i).x))
            ylist.append(float(shape.part(i).y))
            
        xmean = np.mean(xlist)
        ymean = np.mean(ylist)
        xcentral = [(x-xmean) for x in xlist]
        ycentral = [(y-ymean) for y in ylist]

        landmarks_vectorised = []
        for x, y, w, z in zip(xcentral, ycentral, xlist, ylist):
            landmarks_vectorised.append(w)
            landmarks_vectorised.append(z)
            meannp = np.asarray((ymean,xmean))
            coornp = np.asarray((z,w))
            dist = np.linalg.norm(coornp-meannp)
            landmarks_vectorised.append(dist)
            landmarks_vectorised.append((math.atan2(y, x)*360)/(2*math.pi))

        data['landmarks_vectorised'] = landmarks_vectorised
    if len(detections) < 1: 
        data['landmarks_vectorised'] = "error"

def make_sets():
    training_data = []
    training_labels = []
    prediction_data = []
    prediction_labels = []
    for emotion in emotions:
        print(" working on %s" %emotion)
        training, prediction = get_files(emotion)
        #Append data to training and prediction list, and generate labels 0-7
        for item in training:
            image = cv2.imread(item) #open image
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) #convert to grayscale
            clahe_image = clahe.apply(gray)
            get_landmarks(clahe_image)
            if data['landmarks_vectorised'] == "error":
                print("no face detected on this one")
            else:
                training_data.append(data['landmarks_vectorised']) #append landmark feature vector to the training data list
                training_labels.append(emotions.index(emotion))
    
        for item in prediction:
            image = cv2.imread(item)
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            clahe_image = clahe.apply(gray)
            get_landmarks(clahe_image)
            if data['landmarks_vectorised'] == "error":
                print("no face detected on this one")
            else:
                prediction_data.append(data['landmarks_vectorised'])
                prediction_labels.append(emotions.index(emotion))

    return training_data, training_labels, prediction_data, prediction_labels   

accur_lin = []
for i in range(0,10):
    print("Making sets %s" %i) #Make sets by random sampling 80/20%
    training_data, training_labels, prediction_data, prediction_labels = make_sets()

    npar_train = np.array(training_data) #Turn the training set into a numpy array for the classifier
    npar_trainlabs = np.array(training_labels)
    print("training SVM linear %s" %i) #train SVM
    clf.fit(npar_train, npar_trainlabs)

    print("getting accuracies %s" %i) #Use score() function to get accuracy
    npar_pred = np.array(prediction_data)
    pred_lin = clf.score(npar_pred, prediction_labels)
    print "linear: ", pred_lin
    accur_lin.append(pred_lin) #Store accuracy in a list

print("Mean value lin svm: %s" %np.mean(accur_lin)) #FGet mean accuracy of the 10 runs

 

Remember that in the previous post, for the standard set at 8 categories we managed to get 69.3% accuracy with the FisherFace classifier. This approach yields 84.1% on the same data, a lot better!

We then reduced the set to 5 emotions (leaving out contempt, fear and sadness), because the 3 categories had very few images, and got 82.5% correct. This approach gives 92.6%, also much improvement.

After adding the less standardised and more difficult images from google, we got 61.6% correct when predicting 7 emotions (the contempt category remained very small so we left that out). This is now 78.2%, also quite an improvement. This remains the lowest accuracy, showing that for a more diverse dataset the problem is also more difficult. Keep in mind that the dataset I use is still quite small in machine learning terms, containing about 1000 images spread over 8 categories.


Looking at features
We derived different features from the data, but weren’t sure whether doing so was strictly necessary. So, was it? It depends! It depends on whether it adds more unique variance related to what you’re trying to predict, it depends on what classifier you use, etc.

Let’s run different feature combinations as inputs through different classifiers and see what happens. I’ve run all iterations on the same slice of data with 4 emotion categories of comparable size (so that running the same settings again yields the same predictive value).
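
As an illustration of how such a comparison could be set up, here is a sketch under my own assumptions (it reuses make_sets() from the code above and does not reproduce the exact data slice or the four-category restriction used for the numbers below):

~

from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
import numpy as np

classifiers = {
    "Linear SVM": SVC(kernel='linear', probability=True, tol=1e-3),
    "Polynomial SVM": SVC(kernel='poly', probability=True, tol=1e-3),
    "Random Forest": RandomForestClassifier(n_estimators=100),
}

training_data, training_labels, prediction_data, prediction_labels = make_sets()
for name, classifier in classifiers.items():
    classifier.fit(np.array(training_data), np.array(training_labels)) #Train on the 80% split
    score = classifier.score(np.array(prediction_data), np.array(prediction_labels)) #Evaluate on the 20% split
    print("%s: %.1f%%" % (name, score * 100))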

Using all of the features described so far leads to:
Linear SVM: 93.9%
Polynomial SVM: 83.7%
Random Forest Classifier: 87.8%

Now using just the vector length and angle:
Linear SVM: 87.8%
Polynomial SVM: 87.8%
Random Forest Classifier: 79.6%

Now using just the raw coordinates:
Linear SVM: 91.8%
Polynomial SVM: 89.8%
Random Forest Classifier: 59.2%

Now replacing all training data with zeros:
Linear SVM: 32.7%
Polynomial SVM: 32.7%
Random Forest Classifier: 32.7%

Now this is interesting! First note that there isn’t much difference in the accuracy of the support vector machine classifiers when using the extra features we generate. This type of classifier already preprocesses the data quite extensively. The extra data we generate does not contain much, if any, extra information for this classifier, so it only marginally improves the performance of the linear kernel, and actually hurts the polynomial kernel, because data with a lot of overlapping variance can also make a classification task more difficult (here, it probably results in overfitting the training data). By the way, this is a nice 2D visualisation of what an SVC tries to achieve; complexity escalates quickly when adding even one dimension. Now remember that the SVC operates in an N-dimensional space, and try to imagine what a set of hyperplanes in 4, 8, 12, 36 or more dimensions would look like. Don’t drive yourself crazy.

Random Forest Classifiers do things a lot differently. Essentially they are a forest of decision trees. Simplified, each tree is a long list of yes/no questions, and answering all questions leads to a conclusion. In the forest the correlation between each tree and the others is kept as low as possible, which ensures every tree brings something unique to the table when explaining patterns in the data. Each tree then votes on what it thinks the answer is, and most votes win. This approach benefits extensively from the new features we generated, jumping from 59.2% to 87.8% accuracy as we combine all derived features with the raw coordinates.

So you see, the answer you likely get when you ask any scientist a direct question holds true here as well: it depends. Check your data, think twice and don’t be afraid to try a few things.

The last thing to note is that, when not adding any data at all and instead presenting the classifiers with a matrix of zeros, they still perform slightly above the expected chance level of 25% for four categories. This is because the categories are not identically sized.
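
A quick way to see this for yourself is to compute the share of the largest category in the prediction set (a sketch, assuming prediction_labels from make_sets() above); with uninformative features the classifier tends to fall back on the most common label, so its accuracy approaches this share rather than 1 divided by the number of categories:

~

from collections import Counter

counts = Counter(prediction_labels) #How many samples each category has in the prediction set
baseline = max(counts.values()) / float(len(prediction_labels))
print("largest-category baseline: %.3f" % baseline)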


Looking at mistakes
Lastly, let’s take a look at where the model goes wrong. Often this is where you can learn a lot, for example this is where you might find that a single category just doesn’t work at all, which can lead you to look critically at the training material again.

One advantage of the SVM classifier we use is that it is probabilistic. This means that it assigns probabilities to each category it has been trained for (and you can get these probabilities if you set the ‘probability’ flag to True). So, for example, a single image might be “happy” with 85% probability, “angry” with 10% probability, etc.

To get the classifier to return these things you can use its predict_proba() function. You give this function either a single data row to predict or feed it your entire dataset. It will return a matrix where each row corresponds to one prediction, and each column represents a category. I wrote these probabilities to a table and included the source image and label. Looking at some mistakes, here are some notable things that were classified incorrectly (note there are only images from my google set, the CK+ set’s terms prohibit me from publishing images for privacy reasons):
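
Before looking at the examples, here is a minimal sketch of what that probability extraction could look like (it assumes the clf, prediction_data, prediction_labels and emotions variables from the training code above, and a classifier created with probability=True):

~

import numpy as np

proba = clf.predict_proba(np.array(prediction_data)) #One row per sample, one column per category
for probabilities, true_label in zip(proba, prediction_labels):
    #clf.classes_ holds the label (emotion index) belonging to each column
    ranked = sorted(zip(clf.classes_, probabilities), key=lambda pair: pair[1], reverse=True)
    print("true: %s" % emotions[true_label])
    for label, p in ranked:
        print("  %s: %.4f" % (emotions[int(label)], p))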

google_140

anger: 0.03239878
contempt: 0.13635423
disgust: 0.0117559
fear: 0.00202098
neutral: 0.7560004
happy: 0.00382895
sadness: 0.04207027
surprise: 0.0155695

The correct answer is contempt. To be honest I would agree with the classifier, because the expression really is subtle. Note that contempt is the second most likely according to the classifier.

 

 

 

google_048

anger: 0.0726657
contempt: 0.24655082
disgust: 0.06427896
fear: 0.02427595
neutral: 0.20176133
happy: 0.03169822
sadness: 0.34911036
surprise: 0.00965867

The correct answer is disgust. Again I can definitely understand the mistake the classifier makes here (I might make the same mistake..). Disgust would be my second guess, but not the classifier’s. I have removed this image from the dataset because it can be ambiguous.

 

 

 

google_020

anger: 0.00304093
contempt: 0.01715202
disgust: 0.74954754
fear: 0.04916257
neutral: 0.00806644
happy: 0.13546932
sadness: 0.02680473
surprise: 0.01075646

The correct answer is obviously happy. This is a mistake that is less understandable but still the model is quite sure (~75%). There definitely is no hint of disgust in her face. Do note however, that happiness would be the classifier’s second guess. More training material might rectify this situation.

 

 

google_168

anger: 0.0372873
contempt: 0.08705531
disgust: 0.12282577
fear: 0.16857784
neutral: 0.09523397
happy: 0.26552763
sadness: 0.20521671
surprise: 0.01827547

The correct answer is sadness. Here the classifier is not sure at all (~27%)! Like in the previous image, the second guess (~20%) is the correct answer. This may very well be fixed by having more (and more diverse) training data.

 

 

 

google_034

anger: 0.01440529
contempt: 0.15626157
disgust: 0.01007962
fear: 0.00466321
neutral: 0.378776
happy: 0.00554828
sadness: 0.07485257
surprise: 0.35541345

The correct answer is surprise. Again a near miss (~38% vs ~36%)! Also note that this is particularly difficult because there are few baby faces in the dataset. When I said earlier that the extra google images are very challenging for a classifier, I meant it!


Upping the game – the ultimate challenge
Although the small google dataset I put together is more challenging than the lab-conditions of the CK/CK+ dataset, it is still somewhat controlled. For example I filtered out faces that were more sideways than frontal-facing, where the emotion was very mixed (happily surprised for example), and also where the emotion was so subtle that even I had trouble identifying it.

A far greater (and more realistic) challenge is the SFEW/AFEW dataset, put together from a large collection of movie scenes. Read more about it here. The set is not publicly available, but the author was generous enough to share it with me so that I could evaluate the approach taken here further.

Guess what: it fails miserably! It attained about 44.2% on the images when training on 90% and validating on 10% of the set. Although this is on par with what is mentioned in the paper, it shows there is still a long way to go before computers can recognize emotions with a high enough accuracy in real-life settings. The set also includes video clips, on which we will spend another post (together with convolutional neural nets) at a later time.
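
For reference, the 90/10 split mentioned above only requires changing the slicing in get_files(); a sketch, not the exact evaluation code used for that number:

~

def get_files(emotion): #90/10 variant of the earlier function
    files = glob.glob("dataset\\%s\\*" %emotion)
    random.shuffle(files)
    training = files[:int(len(files)*0.9)] #First 90% of the file list as training data
    prediction = files[int(len(files)*0.9):] #Remaining 10% as validation data
    return training, prediction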

This set is particularly difficult because it contains different expressions and facial poses and rotations for similar emotions. This was the purpose of the authors: techniques by now are good enough to recognise emotions on controlled datasets with images taken in lab-like conditions, approaching the upper 90% range in many recent works (even our relatively simple approach reached the low 90s). However, these sets do not represent real-life settings very well, except maybe when using laptop webcams, because you always more or less face this device and sit at a comparable distance when using the laptop. This means that for applications in marketing and similar fields the technology is already usable, albeit with much room for improvement still available, and requiring some expertise to implement it correctly.


Final reflections
Before concluding I want you to take a moment, relax, sit back and think. Take for example the SFEW set with real-life examples, accurate classification of which quickly gets terribly difficult. We humans perform this recognition task remarkably well thanks to our highly complex visual system, which has zero problems with object rotation in all planes, different face sizes, different facial characteristics, extreme changes in lighting conditions or even partial occlusion of a face. Your first response might be “but that’s easy, I do it all the time!”, but it’s really, really, really not. Think for a moment about what an enormously complex problem this really is. I can show you a mouth and you would already be quite good at seeing an emotion. I can show you about 5% of a car and you could recognize it as a car easily; I can even warp and destroy the image and your brain would laugh at me and tell me “easy, that’s a car bro”. This is a task that you solve constantly and in real-time, without conscious effort, with virtually 100% accuracy, while only using the equivalent of ~20 watts for your entire brain (not just the visual system). The average still-not-so-good-at-object-recognition CPU+GPU home computer uses 350-450 watts when computing. Then there are supercomputers like the TaihuLight, which requires about 15,300,000 watts (using in one hour what the average Dutch household uses in 5.1 years). At least at visual tasks, you still outperform these things by quite a large margin while using only 0.00013% of their energy budget. Well done, brain!

Anyway, to try and tackle this problem digitally we need another approach. In another post we will look at various forms of neural nets (modeled after your brain) and how these may or may not solve the problem, and also at some other feature extraction techniques.


The CK+ dataset was used for validating and training of the classifier in this article, references to the set are:

  • Kanade, T., Cohn, J. F., & Tian, Y. (2000). Comprehensive database for facial expression analysis. Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition (FG’00), Grenoble, France, 46-53.
  • Lucey, P., Cohn, J. F., Kanade, T., Saragih, J., Ambadar, Z., & Matthews, I. (2010). The Extended Cohn-Kanade Dataset (CK+): A complete expression dataset for action unit and emotion-specified expression. Proceedings of the Third International Workshop on CVPR for Human Communicative Behavior Analysis (CVPR4HB 2010), San Francisco, USA, 94-101.

The SFEW/AFEW dataset used for evaluation is authored by and described in:

  • A. Dhall, R. Goecke, S. Lucey and T. Gedeon, “Collecting Large, Richly Annotated Facial- Expression Databases from Movies”, IEEE MultiMedia 19 (2012) 34-41.
  • A. Dhall, R. Goecke, J. Joshi, K. Sikka and T. Gedeon, “ Emotion Recognition In The Wild Challenge 2014: Baseline, Data and Protocol”, ACM ICMI 2014.

138 Comments

  • Bas

    11th August 2016

    This software is going to be Huuuuggeeee!

    Reply
  • Zach

    6th October 2016

    This is amazing – thank you for providing a human readable walkthrough! I was not learning much reading the many post-doc borg machine code style walkthroughs Google keeps pointing me to.

    Reply
    • Paul van Gent

      6th October 2016

      Thanks :)! This is exactly why I decided to work out something myself and share it. Glad it helped.

      Reply
  • jettin

    7th October 2016

    sir,
    I was unable to install boost python i am getting an error like : mscv.jam no such a file or directory. i have vs2015

    Reply
    • Paul van Gent

      20th October 2016

      Hi Jettin,

      I’ve never seen this error. mscv.jam should be msvc.jam though. Are you setting your compiler name correctly?

      Reply
  • Jone

    20th October 2016

    When I run the solution code above, the pred_lin I got was always 1.0 no matter how I changed the ratio of the training and test set. I just used the CK+ dataset, and put them into 8 respective folders. Did I miss something, or something wrong with the code?

    Reply
    • Paul van Gent

      20th October 2016

      Hi Jone,

      If I run it it functions fine. What likely happens is that you have overlap between your testing and training data. For example if I set the size of the training set to 1.0 and leave the testing set at 0.1 it also returns 100% accuracy. This is because the model easily remembers what it has already seen, but then you still have no information about how well it generalizes. Use the settings from the tutorial.

      Also, I’ve written the train/test split out because it gives a more clear image of what happens. If you can’t get it to work, look for a simpler approach such as SKLearns train_test_split() function.

      Good luck!

      Reply
  • Stanislav

    9th November 2016

    Hi all !
    My configuration is : win7 – 32, Microsoft Visual Studio 15 (community), Phyton3.5, Cmake3.7.0, Boost1.6.2.0

    I try this instructions and find small bug , in string
    b2 -a –with-python address-model=64 toolset=mscv runtime-link=static
    Need replace the string toolset=mscv, to toolset=msvc !!! Its not a joke , i find this rightly string option in the bootstrap.bat
    For win32 my string is:
    b2 -a –with-python address-model=32 toolset=msvc runtime-link=static
    work fine !!

    Reply
    • Paul van Gent

      9th November 2016

      Thanks Stanislav for catching that, seems to be a typo! It should indeed be msvc (microsoft visual c++). I’ve updated it.

      Reply
      • Stanislav

        9th November 2016

        Thanks.. And I remember. The file in boost library /dlib/cmake_utils/add_python_modue caused error – “Not find header for python-py34”.
        Replace in file add_python_modue next :
        FIND_PACKAGE(Boost boost-python COMPONENTS python-py34 )
        if (NOT Boost_FOUND)
        FIND_PACKAGE(Boost boost-python COMPONENTS python-py35)
        endif()
        if (NOT Boost_FOUND)
        FIND_PACKAGE(Boost COMPONENTS python3)
        endif()
        if (NOT Boost_FOUND)
        FIND_PACKAGE(Boost COMPONENTS python)
        endif()

        to

        FIND_PACKAGE(Boost COMPONENTS system)
        if (NOT Boost_FOUND)
        FIND_PACKAGE(Boost COMPONENTS thread )
        endif()
        if (NOT Boost_FOUND)
        FIND_PACKAGE(Boost COMPONENTS python)
        endif()
        if (NOT Boost_FOUND)
        FIND_PACKAGE(Boost COMPONENTS REQUIRED )
        endif()

        And everything will be work fine ! Sorry for my bad english..

        Reply
  • Francisco Gonzalez Hernandez

    24th November 2016

    Hi Paul, I’ve used your code and I’ve obtained some good results, your work is fantastic. By the way, I want to cite you on a scientific paper, do you have any scientific paper where you want to be citated?, also, I don’t know if you can give me more information about the used works to create this work. I look forward to reading from you soon and thanks.

    Reply
    • Paul van Gent

      25th November 2016

      Hi Francisco,

      I’m happy to hear this! I’ve sent you a mail with detailed information. I don’t have the papers describing this methodology handy right now, but a quick search for “facial landmark generation” and “emotion recognition using support vector machines” should return a good overview of the field.

      Cheers,
      Paul

      Reply
  • Ying

    29th November 2016

    Hi, Paul,
    Thanks for your work and your post. I am working on a project of emotion recognition right now and your post is a saver.
    I have some doubts though, wondering if you have an answer.
    Do you think it is possible to do all this work in Linux (Ubuntu)? Even more in Raspberry Pi (also Ubuntu)?
    Thanks,
    Ying

    Reply
    • Paul van Gent

      29th November 2016

      Hi Ying,
      The great thing about Python is that it is very cross-platform. Make sure you install the right dependencies. Also the code for reading from the webcam might be a bit different (in linux it lives under /dev/video). What are you making?

      Reply
      • Ying

        5th December 2016

        I am making a facial expression (emotions) recognition algorithm which will associate users’ emotions with some specific pieces of music. Cause I am cooperating with some music artists. They will study the relation between emotions and music.

        Reply
      • Ying

        5th December 2016

        By the way, I have sent my email address for demanding the dataset of face emotions you used in your previous post, always no reply.

        Reply
  • Kazuno

    21st December 2016

    Hi, Paul…
    Currently, I am working on a project of emotion recognition through webcam. I’ve used your code and it’s really saved my life. Thanks for your work. But, I’m little bit confused with clf.predict. I didn’t know how to show emotion label for each video frame. Please help me out.

    Reply
    • Paul van Gent

      27th December 2016

      Glad to hear it helped. I’m not sure what you mean. Do you want to overlay the emotion label on a real-time video stream?

      Reply
      • Kazuno

        12th January 2017

        Owh, sorry. My bad. That’s not what I meant. Regarding your tutorial, you only show the training part and fit training set to classifier. But doesn’t show how to predict emotion through webcam based on facial landmark using ctf.predict. I know how to used fishercascade prediction based on your previous tutorial, but I just don’t know how to implement ctf.predict in this tutorial. Please help me out. Thank you.

        Reply
        • Paul van Gent

          8th February 2017

          No I didn’t use the fishercascade prediction from the previous tutorial. Here I use DLib to mark face features and use different classifiers (random forest, SVM) to predict emotions.

          Once the model is trained as shown in the tutorial, you can feed it images from a webcam just as well as images from a dataset. Based on this and the previous tutorial, you should be able to figure out how to do that :).

          You can view the SKLearn documentation to see how the classifiers work. Some use a .predict() function, others a .score().

          Reply
  • Shivam

    28th January 2017

    Hi, Paul…
    Currently, I am working on a project of emotion recognition and i am facing problem that it’s not able to run bootstrap.bat file as it’s showing an error (there is no file named bootstrap.bat supplied with boost-python
    )
    NameError: name ‘bootstrap’ is not defined
    Can you please help me out!
    thanks

    Reply
    • Paul van Gent

      8th February 2017

      Hi Shivam. Wat OS are you on?

      Reply
      • Shivam

        24th February 2017

        Sir , I am using OS- windows 8.1.
        Could you please help me out.?

        Reply
  • Gaurav

    7th February 2017

    Hi Paul

    Which version of SKlearn did you use?

    Reply
    • Paul van Gent

      8th February 2017

      Hi Gaurav. I use 0.18.

      Reply
  • Blackwood

    11th February 2017

    Hi,Paul
    This an amazing idea. But when i use visual studio 2013 to do it ,if find the predict result is very bad. the probility event not up to 20%.
    I use libsvm to train the model useing “C_SCV, LINEAR”,every sample have 272(68*4)features.and the model file is about 170Mbytes.
    is this right?
    thank you.

    Reply
    • Paul van Gent

      13th February 2017

      Hi Blackwood. I don’t know if the model size is right. What kind of dataset do you use to train the classifier? My guess is that either the dataset is too small, doesn’t contain a lot (or too little) variance, or that the feature extraction somewhere goes wrong.

      Reply
      • Blackwood

        13th February 2017

        Hi, Paul
        I use CK/CK+ dataset,and pick the first and the last picture of each emotion sequences.
        The first picture is the netural and the last picture is the emotion of the other 7 types.
        There are about 650 picture in training.

        the dataset is too small ?
        Is each sample has 272 (68*4) features?
        What is you dataset size?

        thank you .

        Reply
        • Paul van Gent

          15th February 2017

          Hi Blackwood,
          So strange, my set is also about 650 images (647). Could you send me your full code so I can have a look (info@paulvangent.com)? If I run my code it will still attain upper .8, lower .9 accuracy.
          Cheers

          Reply
          • Blackwood

            16th February 2017

            Thank you Paul.
            I check code,and find the svm parameter is not right and i changed it.
            now, the predict result is up to 85%. I have emailed the c++ code to you mailbox,
            Do you plan to do it using dnn?

          • Paul van Gent

            16th February 2017

            Hi Blackwood,
            Good to hear you found the issue. There is a deep learning tutorial planned for somewhere in the near future yes, but I will need to see when I find the time to make it :). Stay tuned!
            Cheers

  • Sujay Angadi

    14th February 2017

    plz provide proper links to download and install Cmake and python.boost

    Reply
    • Paul van Gent

      14th February 2017

      At the time of writing of the tutorial the links worked. As far as I can see they still do. What exactly do you mean?

      Also, google is your friend 🙂

      Reply
      • Paul van Gent

        14th February 2017

        Ah I see the CMake link pointed to jupyter, strange. I updated it.

        Reply
  • Ashwin

    14th February 2017

    Hi Paul!

    Great tutorial. I’m using your tutorial to find emotion using SVM as its is a part of my project.
    My Configuration is as follows –
    -Windows 10 64-bit
    -Visual Studio 2015
    -Python 2.7
    -Opencv 2.4.9
    -Cmake-3.8.0-rc1-win64-x64.msi

    When I run the command – python setup.py install. It returns the following error –
    libboost_python-vc140-mt-s-1_63.lib(errors.obj) : fatal error LNK1112: module machine type ‘x64’ conflicts with target machine type ‘X86’ [C:\Dlib\tools\python\build\dlib_.vcxproj]
    Done Building Project “C:\Dlib\tools\python\build\dlib_.vcxproj” (default targets) — FAILED.
    Done Building Project “C:\Dlib\tools\python\build\ALL_BUILD.vcxproj” (default targets) — FAILED.
    Done Building Project “C:\Dlib\tools\python\build\install.vcxproj” (default targets) — FAILED.
    Build FAILED.
    “C:\Dlib\tools\python\build\install.vcxproj” (default target) (1) ->
    “C:\Dlib\tools\python\build\ALL_BUILD.vcxproj” (default target) (3) ->
    “C:\Dlib\tools\python\build\dlib_.vcxproj” (default target) (5) ->
    (Link target) ->
    libboost_python-vc140-mt-s-1_63.lib(errors.obj) : fatal error LNK1112: module machine type ‘x64’ conflicts with target machine type ‘X86’ [C:\Dlib\tools\python\build\dlib_.vcxproj]
    0 Warning(s)
    1 Error(s)
    Time Elapsed 00:07:04.04
    error: cmake build failed!

    So I’m not able to move ahead. I really need your help in this.

    Reply
    • Paul van Gent

      14th February 2017

      Hi Ashwin,

      Thanks, glad it’s helping!
      The first bit and last bit of the error is what it’s about:
      “libboost_python-vc140-mt-s-1_63.lib(errors.obj) : fatal error LNK1112: module machine type ‘x64’ conflicts with target machine type ‘X86’ [C:\Dlib\tools\python\build\dlib_.vcxproj]”

      One of the things in your list seems to be 32-bit. You cannot mix 32 and 64 bit architectures. Verify that Python is 64 bit, Boost is 64 bit, Dlib is 64 bit. Did you build the boost library with the 64-bit flag?

      Reply
      • Ashwin

        15th February 2017

        Yes, I built the boost library with the 64-bit flag. By the way which version of Boost, Dlib and Cmake did you use in this tutorial?

        Reply
        • Paul van Gent

          15th February 2017

          Are they all 64-bit? What about your python distro? I used:
          – Boost 1.61
          – Dlib 19.2
          – Cmake 3.7.1

          However, I highly doubt that versions matter. The error is quite specific about there being an instruction set incompatibility.

          Reply
          • Ashwin

            16th February 2017

            Hi Paul. Thanks. I did install all 64 bit modules and it worked. But now when I execute the first code it gives me the following error-
            predictor = dlib.shape_predictor(“shape_predictor_68_face_landmarks.dat”) #Landmark identifier. Set the filename to whatever you named the downloaded file
            RuntimeError: Unable to open shape_predictor_68_face_landmarks.dat

            So do i need to install that particular file ?

            Sorry I’m new to python.

          • Paul van Gent

            16th February 2017

            Yes, this file is the trained model for the face detector in Dlib. Without it, it doesn’t know what a face looks like.
            I think the download link is in the tutorial.

            Good luck :)!

          • Yuki

            21st March 2017

            Hi Paul.
            I have the same problem as Ashwin.
            I used:
            -Python 2.7.13 64bit.
            -dlib 19.1 (I don’t know how to check is it 64bit or not)
            -Boost 1.63.0 (with the flag “address-model=64”)
            -cmake 3.6.1
            This error had trouble me more 1 week, tried a lot method and still can’t solve it QAQ

  • Tony

    15th February 2017

    Hi Paul,
    Excellent tutorials. I have a doubt. The accuracy of both landmark as well as the fishare face for me are quite low, around the low thirties. Any idea why ? I am using the same data set and same algorithm.

    Any idea regarding this ?

    Thanks

    Reply
    • Paul van Gent

      15th February 2017

      Hi Tony,
      Unfortunately I have no idea from a distance. Could you send me your full code (info@paulvangent.com)? I’ll have a look.
      Cheers

      Reply
  • Blackwood

    17th February 2017

    Hi Paul:
    I hope to use more pictures to train the model . but i am not as lucky as you to get the SFEW/AFEW dataset. I have email the AFEW/SFEW downloading requirement few days ago, but no reply comes back.
    Can you tell me how and where i can get the dataset else ?
    thank you.

    Reply
    • Paul van Gent

      17th February 2017

      There is no other place to get that one, however you could also try making your own. Extract faces from google image search, movies, art, etc. It’s more work but you have control over how big and how varied you want your dataset to be.

      Reply
  • Luis

    1st March 2017

    Hello. I followed both tutorials for emotion recognition and everything worked smoothly 🙂 Now I’m looking to implement it by using deep learning and/or neural networks. Could you please recommend me how to start? I mean, what could I use as inputs in a neural network? Could it be 5 images (one emotion each)? What would be the next step? I’m a bit lost here 😛 Thanks!

    Reply
    • Paul van Gent

      13th March 2017

      Hi Luis,

      You could read up on Google’s tensorflow. Theano in Python is another popular deep learning framework. Or you could wait a week or two. I think I’ll have a deep learning emotion recognition tutorial ready by then :).

      Cheers

      Reply
  • Raaz

    1st April 2017

    Hy Paul, I need help please, when running the comand cmake.. it gives me the error ‘cmake..’ is not recognized as an internal or external command, operable program or batch file. What should I do? Thanks, great job with your website

    Reply
    • Paul van Gent

      8th April 2017

      Hi Raaz. When installing CMAke you need to check the box to “add it to your system path”, or manually add it to your system PATH variable.

      Reply
  • John

    2nd April 2017

    Can be a train model saved and used after?

    Reply
    • Paul van Gent

      8th April 2017

      Hi John. Sure.

      import cv2
      fishface = cv2.createFisherFaceRecognizer()
      [train model]
      fishface.save(“filename.xml”)

      load it with fishface.load(“filename.xml”).

      Reply
      • John

        8th April 2017

        I mean the trained SVM ? With fisherface i saved my model, but i want to make a model with SVM

        Reply
        • Paul van Gent

          8th April 2017

          Check the docs for the module you use for your SVM model. It’s always in the docs.

          Reply
  • Aparna

    7th April 2017

    Hello. I keep getting the following error:
    CMake Error at C:/dlib-19.4/dlib/cmake_utils/add_python_module:116 (message):
    Boost python library not found.
    Call Stack (most recent call first):
    CMakeLists.txt:6 (include)
    — Configuring incomplete, errors occurred!
    See also “C:/dlib-19.4/tools/python/build/CMakeFiles/CMakeOutput.log”.
    error: cmake configuration failed!

    I have tried everything I could find on internet, but haven’t been successful in installing dlib. Any suggestions would be appreciated.
    Thanks,
    Aparna

    Reply
    • Paul van Gent

      8th April 2017

      Hi Aparna. It seems boost-python is not correctly installed or located. Did it build and install correctly? Did you set the environment variables to BOOST_ROOT and BOOST_LIBRARYDIR?

      Reply
      • Aparna

        8th April 2017

        Yes I did. I tried uninstalling and installing it thrice. It gets successfully installed without any errors or warning.

        Reply
        • Aparna

          8th April 2017

          And Yes, I did setup the BOOST_ROOT and BOOST_LIBRARYDIR path variables. But no luck yet.

          Reply
          • Paul van Gent

            8th April 2017

            You have me at a loss I’m afraid. What system do you use?

          • Mani Kumar

            26th May 2017

            Hello Aparna,

            Even I faced the same issue. After searching for answers on Google
            and trying every answer for over a month, finally I found a solution that works for me.

            I installed the dlib using its wheel file.
            Download the wheel file from this link .
            I used the dlib 18.17 and not 19.4 which is the latest version.
            If you check the pypi it shows there’s no dlib 19.4 package for python 2.7.
            Please check this link .

            And make sure you have consistent installations.
            All the programs in this tutorial work on my system.
            And my system configurations:

            OS – Windows 10 64bit.
            Python 2.7 (anaconda) – 64bit
            OpenCV 2.4 – 64bit
            dlib 18.17.100 – cp27, win_amd64bit => should be used with python 2.7 – 64bit.

            Regards,
            Mani

          • Paul van Gent

            26th May 2017

            Hi Mani. Thanks for linking to the pre-built wheel files, that is indeed also a solution. It seems your link was eaten by my spam protection. I think you mean this one: http://www.lfd.uci.edu/~gohlke/pythonlibs/

  • Aparna

    8th April 2017

    I am using Windows 10 64-Bit. The other software versions are as follows:

    Boost_1.61.0
    dlib_19.2
    Visual Studio 2015
    CMAKE_3.72
    Anaconda2-4.3.1

    All the above mentioned softwares are 64-bit versions.

    I did follow all the steps that you have mentioned for installation.

    Is there anything I am missing?

    Reply
    • Aparna

      12th April 2017

      Hey Paul,

      Seems like there was some problem with my Windows. I tried installing it on Ubuntu and was successful at the setup and running your code. Great work with the article. Thanks for the awesome work. I get accuracy around 45-48% and I am not sure why. Any help is appreciated.

      Thanks

      Reply
      • wiem

        30th July 2017

        Hi ! I’m trying installing it on Ubuntu also ! However i had some issues when I train thaa code ! would you helpe me please ??
        this is my email grinawiem@gmail.com
        Thank you a lot

        Reply
        • Paul van Gent

          30th July 2017

          Hi Wiem. Please send me a message with the issues you’re facing and the code. The address is info@paulvangent.com
          -Paul

          Reply
  • Prakhar

    23rd April 2017

    Hey Paul,

    On running bootstrap.bat i keep getting the error that cl is not recognised as an external of internal command. Where am I going wrong?

    Reply
  • Rafa

    25th April 2017

    Hello Paul I need your help,

    I have followed the instructions given to work with dlib in python. I think everything work fine untill python setup.py install, obviously something is not working well.

    I have python 3.6 with conda running in window 7 64 bit

    you can see the result of my command prompt here
    http://imgur.com/QJZyh0c

    thanks in advance

    Reply
    • Paul van Gent

      25th April 2017

      Hi Rafa. It seems your C++ compiler is not found. Did you install Visual Studio? If you did, go to control panel -> programs and features -> right click on the “microsoft visual studio” entry and select “change”. wait for it to initialise and check the “common tools for C++” under “visual C++”. You should be good to go then!

      Reply
  • rafa

    25th April 2017

    Thanks for your quick response.

    You are right, there was a problem with Visual Studio 2017 (it is buggy and won't compile C++11 code, which is a requirement to use dlib), so I installed Visual Studio 2015 with the C++ packs.

    However, now I have another problem:
    http://imgur.com/5DkUTJE

    Do you have any idea how I can solve it?

    Reply
    • Paul van Gent

      25th April 2017

      Hi Rafa. The “SET” commands only apply to a command prompt session, so each time you close it, it forgets them. Before compiling dlib you need to do the SET BOOST_ROOT and SET BOOST_LIBRARYDIR again.

      Reply
    • julio

      4th May 2017

      Hi, I tried to install dlib on Windows 8 32-bit (CMake 3.8 win32-x86, Python 2.7, Visual Studio 2012, dlib 19.4) but I get an error like this:
      the Visual Studio you are using is too old and doesn't support C++11, you need Visual Studio 2015 or newer.
      My question is: should I just update to 2015?

      Reply
      • Paul van Gent

        4th May 2017

        Hi Julio. That is correct, and it is why VS2015 is indicated under “Getting started”. Happy to hear you found the issue :).

        Reply
  • joy

    2nd May 2017

    Hello Paul, I’m new to machine learning and I’m looking to execute this program you’ve written. However, I’m not clear how we’re reading the dataset. Like the previous post do we create two folders ‘source_emotion’ and ‘source_images’? If not then it would be great if you could explain how you’re doing this. Pardon me if it’s a silly question. Thank you.

    Reply
    • Paul van Gent

      4th May 2017

      Hi Joy. If you’ve followed the previous tutorial you can use the same folder structure. The “source” folders are not necessary, rather the “dataset” folder with the emotion labels as subfolders. The code should pick up the same structure without problem. Please let me know if you run into problems.

      Reply
  • Anil

    3rd May 2017

    Hi Paul, how can I use the predict_proba() function with the above code to get the emotion label scores?

    Reply
    • Paul van Gent

      8th May 2017

      Hi Anil,

      Make sure that when you define the SVM classifier you set probability to True. In the tutorial it looks like:
      “clf = SVC(kernel=’linear’, probability=True, tol=1e-3)”
      Afterwards, “clf” becomes the classifier object. Call clf.predict_proba() with an array of shape (n_samples, n_features).

      Also see the docs at:
      http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
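
      If it helps, here is a minimal sketch of the whole flow, assuming the training_data, training_labels and prediction_data arrays that come out of the tutorial's make_sets() step:

      import numpy as np
      from sklearn.svm import SVC

      clf = SVC(kernel='linear', probability=True, tol=1e-3) # probability=True is required for predict_proba()
      clf.fit(np.array(training_data), training_labels) # train on the landmark feature vectors
      probabilities = clf.predict_proba(np.array(prediction_data)) # one row per sample, one column per emotion
      print(probabilities)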

      Cheers,
      Paul

      Reply
      • Anil

        27th May 2017

        After my implementation, it produced the following probabilities in an array: (‘prob score’, array([[ 0.12368627, 0.77254657, 0.01258662, 0.09118054]])). There are four probabilities here, but I have five emotions. I think it should include five probabilities. What am I missing, where am I wrong?
        Thanks for the reply.

        Reply
        • Paul van Gent

          27th May 2017

          Hi Anil. I also mailed you, but for the benefit of others my message here as well:

          If I had to make a guess, the training data does not contain any examples of the missing fifth emotion. This way, the model will calibrate itself on only four emotions.

          – Are all the emotion folders filled with images?
          – Are the emotion labels in the list at the top of the code exactly the same as the folder names? An empty list (and thus no training data) is returned for any emotion whose folder name does not match; a quick check like the sketch below will show this.
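
          A small check along these lines prints how many images glob finds per folder, using the same "dataset/%s/*" pattern as the tutorial (the emotion list here is just an example, use your own labels):

          import glob

          emotions = ["anger", "contempt", "disgust", "fear", "happiness"] # example list, replace with your own labels
          for emotion in emotions:
              files = glob.glob("dataset/%s/*" % emotion)
              print("%s: %i images found" % (emotion, len(files))) # 0 means the folder name does not match the label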

          Good luck!
          Paul

          Reply
  • pirouz

    16th May 2017

    Hi,
    thanks for your remarkable work and for sharing it with us.
    Would you please explain more about installing dlib? For example, what should I put after the cmake command?
    I tried with the CMake GUI to configure and generate into the build folder, but at the end (python setup.py install) I get an error.

    cheers
    pirouz

    Reply
    • Paul van Gent

      18th May 2017

      Hi Pirouz. You type “cmake ..” when in the build folder, indicating you want to build the source from its parent directory.

      What error are you getting? Any help I can offer is very much dependent on that.

      Reply
  • Mani Kumar

    26th May 2017

    Hi Paul,

    I have gone through all your tutorials regarding emotion detection. All the code in
    your tutorials works on my system. Thank you for the good tutorials.
    I am experimenting with my own ideas and methods to extract information
    related to emotion. I am a beginner to machine learning, so I am not sure which method
    or which library is good for my ideas.

    I have a question:
    why are you using sklearn's SVM and not dlib's SVM or OpenCV's SVM to train and predict?

    The reason for the question is to reduce the dependency on external libraries.

    Thank you,
    Mani.

    Reply
    • Paul van Gent

      26th May 2017

      Hi Mani. I use SKLearn because of their scientific integrity regarding the inclusion of algorithms (see: http://scikit-learn.org/stable/faq.html). Additionally, it is very versatile and contains many more ML algorithms than just support vector machines. I like to work with a package that has all I need, rather than select from different packages.

      This is personal taste, you can achieve similar goals with different packages.

      If you like to exchange ideas on what algorithm to use for your purposes, send me a mail: info@paulvangent.com.

      Reply
      • Mani Kumar

        29th May 2017

        Hi Paul,

        Thank you for the reply.

        I want to know whether SKLearn is portable across platforms,
        for example Android.

        And is the SKLearn API accessible from C++?

        Regards,
        Mani

        Reply
  • Mjay

    2nd June 2017

    Hi Paul,
    I used your code and ran into this problem:
    anglerelative = (math.atan((z-ymean)/(w-xmean))*180/math.pi) - anglenose
    RuntimeWarning: divide by zero encountered in double_scalars

    Where did I make a mistake?
    Thanks for the reply, the tutorial is very good 🙂

    Reply
    • Paul van Gent

      12th June 2017

      It means that there is a division by zero (you can’t divide by zero..). However, numpy should step over it and return a NaN value. You can try catching the error and removing the data entry from the dataset if you so wish. Good luck!
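
      If you prefer to sidestep the division altogether, math.atan2 handles a zero denominator. A rough sketch using the variable names from the line you pasted (note that atan2 returns the full -180 to 180 degree range, whereas atan only returns -90 to 90, so check that this fits your use):

      import math

      # atan2 never divides, so (w - xmean) == 0 no longer produces a warning
      anglerelative = math.degrees(math.atan2(z - ymean, w - xmean)) - anglenose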
      -Paul

      Reply
  • Juergen

    11th June 2017

    Hello Paul,

    thank you for this tutorial, this is excellent work!
    Just in case anyone was searching for the shape_predictor_68_face_landmarks like I did (probably I am blind), you can find it here:
    http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2

    Reply
    • Paul van Gent

      12th June 2017

      It seems I didn’t mention this in the tutorial, thanks for noticing it. I’ll add it.
      -Paul

      Reply
  • simux

    20th June 2017

    Hello Paul,
    Thank you for your work, it is helpful.
    I used your code to extract feature vectors, but I got values with a negative sign and in this format: -1.254179104477611872e+02.
    Is my work correct?

    Thank you

    Reply
    • Paul van Gent

      20th June 2017

      Hi Simux. That depends on what feature vectors you are extracting. If you’re following the code from the tutorial, from the top of my head they can be negative (going from -180 to 180 for the angles). However, if you’re extracting other features, you need to tell me a bit more about what exactly you’re doing.
      -Paul

      Reply
      • simux

        21st June 2017

        Hi Paul,
        yes, I am following your tutorial, so I used your method of computing the Euclidean distance. The output vector has a dimension of 268. However, in your tutorial you computed the distance between the centre of gravity and each of the 68 points, so it should have dimension 136.
        Why am I getting 268?

        Thank you

        Reply
  • Aniket More

    5th July 2017

    Hi Paul, here also I am getting a mean accuracy of 35%. Maybe the issue is with the version of OpenCV.

    Reply
    • Paul van Gent

      5th July 2017

      That is strange indeed. This tutorial doesn’t rely on opencv except for accessing your webcam and displaying a video feed. The problem must be in the data then. How many images are in your dataset?

      Could you send me your code to info@paulvangent.com? I’ll run it on my set..

      Reply
      • Aniket More

        5th July 2017

        The dataset has 652 images. I am using the same code without any modification, but I will still mail it to you. Thank you.

        Reply
  • KVRanathunga

    12th July 2017

    Sir,
    I get the following error when executing the final code you have given.

    “Warning (from warnings module):
    File “C:\Python27\lib\site-packages\sklearn\utils\validation.py”, line 395
    DeprecationWarning)
    DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.

    Traceback (most recent call last):
    File “C:\Users\Cowshalya\Desktop\New folder\final.py”, line 98, in
    clf.fit(npar_train, training_labels)
    File “C:\Python27\lib\site-packages\sklearn\svm\base.py”, line 151, in fit
    X, y = check_X_y(X, y, dtype=np.float64, order=’C’, accept_sparse=’csr’)
    File “C:\Python27\lib\site-packages\sklearn\utils\validation.py”, line 521, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
    File “C:\Python27\lib\site-packages\sklearn\utils\validation.py”, line 424, in check_array
    context))
    ValueError: Found array with 0 feature(s) (shape=(1, 0)) while a minimum of 1 is required.”

    I have no idea what to do next. Sir, please help me.

    Reply
    • Paul van Gent

      14th July 2017

      The error is trying to tell you that the arrays passed to check_X_y() are empty. Try debugging why this is the case. Are the files correctly read? Is the pixel data correctly stored in an array? Are the data and the label appended correctly to the X and y arrays?
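
      A quick way to pin that down is to print the array shapes right before the fit; a small sketch using the names from your traceback (npar_train, training_labels):

      import numpy as np

      print(np.array(npar_train).shape) # (1, 0) as in your error means no landmark features were extracted
      print(len(training_labels)) # should equal the number of training samples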

      Reply
  • wiem

    30th July 2017

    Hi Sir,
    I have got an error as follows when I’m executing the final code you have given.
    Enter 1 to train and 2 to predict
    1
    Making sets 0
    training SVM linear 0
    /usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
    DeprecationWarning)
    Traceback (most recent call last):
    File “faceDetectionDlib.py”, line 168, in
    main()
    File “faceDetectionDlib.py”, line 135, in main
    clf.fit(npar_train, training_labels)
    File “/usr/local/lib/python3.5/dist-packages/sklearn/svm/base.py”, line 151, in fit
    X, y = check_X_y(X, y, dtype=np.float64, order=’C’, accept_sparse=’csr’)
    File “/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py”, line 521, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
    File “/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py”, line 424, in check_array
    context))
    ValueError: Found array with 0 feature(s) (shape=(1, 0)) while a minimum of 1 is required.
    ________________________________________________________

    Would you please help me and guide me on how to solve this error and how to train this code using other datasets?

    Reply
    • Paul van Gent

      30th July 2017

      As stated to your question on the other posts: the last bit of the error tells what’s going wrong: “ValueError: Found array with 0 feature(s) (shape=(1, 0)) while a minimum of 1 is required.”. Apparently you’re feeding an empty object to the classifier. Try debugging why no images are loaded (is path correct? can it find files? are permissions set ok?)
      -Paul

      Reply
  • wiem

    12th August 2017

    Hi Paul! I'm using the CK+ dataset, however I am getting a mean accuracy of 43% and I have no idea why it is so low. Could you tell me where the issue is?
    Thank you

    Reply
    • Paul van Gent

      13th August 2017

      No, I cannot tell from a distance. You can check a few things:

      – Are you using the correct versions of all packages?
      – Where are the mistakes made? Is there a pattern?
      – Are all paths correct? Are all images in the correct folders?
      – Some users have reported glob.glob to behave differently in Python 3; make sure lists are sorted properly when creating the dataset (see the sketch below)
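
      For the sorting point, forcing a deterministic order is a one-line change; a sketch assuming the same glob call as the tutorial (it only matters where your dataset-building step depends on file order):

      import glob

      files = sorted(glob.glob("dataset/%s/*" % emotion)) # sorted() gives the same order on every platform and Python version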

      Reply
      • wiem

        14th August 2017

        Hi Paul! Thanks a lot for your quick answer. I sent you the code I used that gives me 43% accuracy. Will you please check it and tell me where I went wrong? My email is grinawiem@gmail.com.
        Thanks

        Reply
  • wiem

    14th August 2017

    Hi Paul! I tried to train the code with only one emotion in one folder, so the accuracy became 50%.
    I think the main problem is the glob.glob call:
    ————————————————————————————————-

    files = glob.glob(“/home/wiem/Bureau/CK+/train/*/*” )

    ————————————————————————————————-
    In the original code it is written: files = glob.glob(“dataset/%s/*” %emotion)
    However, when I use %emotion it gives me this error:
    —————————————————————————————————–
    Enter 1 to train and 2 to predict
    1
    Making sets 0
    working on anger
    Traceback (most recent call last):
    File “test8.py”, line 128, in
    main()
    File “test8.py”, line 103, in main
    training_data, training_labels, prediction_data, prediction_labels = make_sets()
    File “test8.py”, line 64, in make_sets
    training, prediction = get_files(“/home/wiem/Bureau/rafd/Evaluation/train/” )
    File “test8.py”, line 22, in get_files
    files = glob.glob(“/home/wiem/Bureau/CK+/train/*/*” %emotion)
    TypeError: not all arguments converted during string formatting
    —————————————————————————————————-
    Would you please tell me what is wrong?

    Reply
    • Paul van Gent

      14th August 2017

      Hi Wiem. With only one emotion SKLearn throws an error: you cannot fit a discriminatory model on one outcome! I expect something goes wrong there. As I mentioned at your other post: the likely issue is glob.glob sorting behaviour.

      Seems you’re running into some basic Python problems such as string formatting. You really need to have a grasp of these things and other python concepts before trying more complex exercises such as this tutorial.

      I recommend you follow some basic tutorials. A good one is https://learnpythonthehardway.org

      Good luck!

      Reply
  • wiem

    17th August 2017

    Hi Sir,
    Thank you a lot for your help! I figured out what was wrong. It's just like you said: the path of the dataset was wrong, and now the training works very well. However, would you please explain how I could test and evaluate the trained model and get its accuracy?
    Thanks

    Reply
  • Neeraj Panchbhai

    17th August 2017

    I'm not able to install dlib successfully. Please help.

    Reply
    • Paul van Gent

      18th August 2017

      Never, ever ask for help without:

      – a detailed description
      – what goes wrong
      – what you have tried

      Reply
  • wiem

    21st August 2017

    Hi Paul! I am wondering about the accuracy given after the 10 runs in:
    ____________________________________________________________________________________________________
    print(“Mean value lin svm: %s” %np.mean(accur_lin)) # Get mean accuracy of the 10 runs
    ____________________________________________________________________________________________________
    Is that the training accuracy or the test accuracy of the created model? Because this code gives me an accuracy of 0.96 with the MUG dataset, so will you explain this to me?
    Thanks

    Reply
    • Paul van Gent

      21st August 2017

      Hi Wiem. It’s the test accuracy: the accuracy on the training dataset is not particularly useful, since it doesn’t tell you anything about model performance, only model retention. The different accuracy with the MUG dataset is most likely because the data is structured differently, and as I mentioned before, there is likely some issue on your system with glob.glob sorting the returned images in the CK+ set.

      However, neither accuracy tells you much about how the model will perform in practice: always collect some data that is similar to what your system will see when it is actually in use.

      As mentioned before, I truly recommend you dig into Python (and machine learning theory) a bit more before attempting these kinds of projects. This will help you find faults easier.

      Reply
  • wiem

    21st August 2017

    Thank you very much, Paul, for your explanation and your advice; I appreciate your help. I will follow your suggestions carefully.
    Thanks

    Reply
  • Sergio

    25th August 2017

    Thanks for the guide!! Finally I could install dlib 19.4 for Python on Windows without errors.

    Reply
  • Thari

    26th August 2017

    Everything worked fine until ”python setup.py install”. I tried this many times, but the install process did not go further after ”face_recognition.cpp”. I waited more than 2 hours.

    My system is Windows 10, Visual Studio 2017, Python 2.7 (64-bit), 8 GB RAM,
    CMake version 3.9.1, dlib 19.4.

    Here I attached the command window in a text file for your reference.

    http://textuploader.com/d64sp

    Reply
  • nadhir

    20th September 2017

    Hi Paul, I’ve used your code and I’ve obtained some good results; your work is fantastic. Thank you for sharing it.
    I want to ask whether you used the original image size from the CK+ dataset, or whether there is an optimal size that gives better results. Also, I want to cite you in a scientific paper as a reference: do you have a scientific paper you would like to be cited? And could you give me more information about the works used to create this one? I look forward to reading from you soon, and thanks.
    Cordially

    Reply
    • Paul van Gent

      22nd September 2017

      Hi Nadhir. Great! The citation can be to the article on the website, its format is at the top of the post, it is:

      “van Gent, P. (2016). Emotion Recognition Using Facial Landmarks, Python, DLib and OpenCV. A tech blog about fun things with Python and embedded electronics. Retrieved from: http://www.paulvangent.com/2016/08/05/emotion-recognition-using-facial-landmarks/”

      As far as the size of the images goes, it was 350×350 pixels as stated in the other tutorial where the pre-processing was done. I’m not sure about the absolute optimal size for this, but I’m sure good performance can also be had with smaller images. Of course the larger the image, the smaller the facial movement you can quantify, but for the purposes of these tutorials (archetypical emotional expressions) the size was more than enough.

      Good luck. If you want you can share the paper after publication and we can put it up on the article as well.

      Reply
  • Phillemon

    24th September 2017

    This is great, however I am finding it difficult to obtain the CK+ dataset. Can you please send it to my email: phillemonrasekgwalo@gmail.com? Thank you.

    Reply
    • Paul van Gent

      26th September 2017

      Hi Phillemon. I’m sorry, the terms of the dataset prohibit me sharing it. You need to obtain it from the original source, or find another one I’m afraid.

      Reply
  • Randall Theuns

    13th October 2017

    Hey Paul,

    Great work on this. I'm currently busy implementing this for a minor I'm doing. I've prepared the dataset and trained the model, and it seems to give good accuracy (~85% with CK+). Right now I'm trying to add webcam support to allow for semi-real-time emotion recognition, but whenever I use SVC.predict on a vectorised face detection, I only get either 5 or 7 as predictions. If I use predict_proba instead, I get an array with only 7 probabilities.

    Do you have any clue why this happens?

    The code is available on github: https://github.com/drtheuns/minor_riet
    In particular, src/webcam.py and src/frames.py matter.

    Reply
    • Paul van Gent

      14th October 2017

      Hi Randall. Several things might cause the prediction to revert to only two classes:

      – Make sure you keep everything standardised. Is the face in your webcam image much larger or much smaller in terms of pixel size than the training data? Resize it before working with the landmark coordinates (see the sketch after this list).
      – What’s happening in the 15%? Are there one or two categories that host most mistakes?
      – Are you expressing extreme emotional expressions? The CK+ dataset has extreme expressions, and an ML model only correctly classifies the type of data you train it on.
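
      A rough sketch of the resizing step, assuming OpenCV and a cropped face image called face_crop (that name and the 350x350 size from the earlier preprocessing tutorial are assumptions here):

      import cv2

      face = cv2.resize(face_crop, (350, 350)) # bring the webcam crop to the same size as the training images before extracting landmarks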

      SVC.predict_proba() works just like that: it outputs a list of decision probabilities based on which the classification is made. If you feed SVC.predict_proba() an array of image data, it gives a matrix of predictions back.

      You could also try making your own dataset to append to the CK+ one. Maybe you can get 10 of your friends to each have a few photos made for each emotional expression. This might help predictions improve as well, since it trains the model explicitly on data gathered from the same sensor as from which predictions are made (your webcam).

      Lastly, please do me a favour and cite my blog in the readme.md. This helps me get more traffic and in turn helps me produce more content.

      -Paul

      Reply
  • Randall Theuns

    14th October 2017

    Hi Paul,

    Thanks for the swift and detailed reply.

    There's a good chance that the size of the face in the webcam image is the problem. I'll have to look into that. Due to some time constraints and deadlines, I don't have too much time to troubleshoot the 15% (I only have a total of ~8 weeks, of which 4 remain to create this prototype, and I still have to visualise it).

    The predict_proba ‘issue’ I was talking about was more about the number of probabilities it returns (7, even though it was trained with 8 emotions), but this might have to do with too low probability, or just the same issue as above.

    I’ll see if I can increase the dataset a bit.

    You were already cited in the readme! https://github.com/drtheuns/minor_riet#citations

    Thanks!
    – Randall

    Reply
    • Paul van Gent

      15th October 2017

      Hi Randall. Are you sure that the model wasn’t trained on 7 emotions? It should return probabilities for the same number of classes as on which it has been trained, no matter the low probabilities..

      Don’t put too much thought into the remaining 15%, you will never reach 100% under realistic conditions (an aggregate of 60-80% would already be very good performance in real-world settings).

      Thanks for the citation, I must have missed that

      -Paul

      Reply
      • Randall Theuns

        15th October 2017

        Hey Paul,

        Just as a quick reply, and perhaps a hint for other people: the code used to sort and prepare the dataset assumed 8 emotions, including *happy*. The code to predict emotions, however, assumes the same 8 emotions but with *happiness*. This means that, to train the happy emotion, the model was looking for a folder called happiness rather than happy.

        Fixing this simple issue seems to have fixed the 7-or-8 predict_proba issue. Another quick note about the above code:
        in the landmark pieces of code, range(1, 68) is used, therefore only grabbing 67 of the 68 landmarks.

        Thank you for the article and quick replies.
        – Randall

        Reply
  • Johnny

    16th October 2017

    Hi Paul. Very nice job.
    But I'm struggling with the following:
    I want to classify one sample from my webcam, and I do not know which function to use and which parameters to give.
    I mean, after clf.fit (training) I want to predict on a frame from the webcam. I used clf.predict_proba, but the parameter it expects must be equal in size to the training data (this is the error received).
    Do you know how to proceed to classify one frame from the webcam?
    Br

    Reply
  • Johnny

    16th October 2017

    Solved with predict_proba()

    Reply
  • Víctor

    19th October 2017

    Hello, I have two questions from a line of the code.
    1) I do not understand why, to calculate the angle of each landmark point, you do:
    landmarks_vectorised.append(int(math.atan((y - ymean) / (x - xmean)) * 360 / math.pi))
    This would make sense to me if x and y were the coordinates of each landmark point. Nevertheless, x is xcentral and y is ycentral, and xcentral is x-xmean. So in the line above, I understand that you are subtracting the mean twice.

    2) In the same line of code as before:
    landmarks_vectorised.append(int(math.atan((y - ymean) / (x - xmean)) * 360 / math.pi))
    I do not understand why, to convert the angle to degrees, you multiply by 360/pi and not 360/(2*pi), which is what I was expecting.

    Reply
    • Paul van Gent

      23rd October 2017

      Hi Victor. Thanks for catching that, and you’re right of course. I must’ve been asleep when writing that line I guess! I’ve updated it:
      landmarks_vectorised.append((math.atan2(y, x)*360)/(2*math.pi))
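
      The same thing can also be written with math.degrees; just a sketch of an equivalent formulation (atan2 also avoids the divide-by-zero case when x is zero):

      import math

      landmarks_vectorised.append(math.degrees(math.atan2(y, x))) # identical to (atan2(y, x) * 360) / (2 * pi)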

      -Paul

      Reply
  • Mun

    23rd October 2017

    Hi Paul
    I followed all steps from ‘http://www.paulvangent.com/2016/08/05/emotion-recognition-using-facial-landmarks/’ and also ‘http://www.paulvangent.com/2016/04/01/emotion-recognition-with-python-opencv-and-a-face-dataset/’.

    Anyway, I wondered how to show results such as “anger: 0.03239878
    contempt: 0.13635423 disgust: 0.0117559 fear: 0.00202098 neutral: 0.7560004
    happy: 0.00382895 sadness: 0.04207027 surprise: 0.0155695”.

    You mentioned using the predict_proba() function to show the emotion results. Do I need to make a new script apart from the main code, or just add something like
    print(“Emotion: \n{}”.format(np.argmax(gbrt.predict_proba(testvalue), axis)))?

    Reply
    • Paul van Gent

      25th October 2017

      Hi Mun. Yes, predict_proba() is a function of the SKLearn classifier:

      clf = SVC(kernel='linear', probability=True, tol=1e-3) # Here make sure you set "probability" to True, otherwise the model cannot return the decision weights later on.
      # train model here
      clf.predict_proba(testvalue)
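
      If testvalue is a single feature vector (one frame), newer SKLearn versions expect a 2D array, so reshape it first; a small sketch with the names from the lines above:

      import numpy as np

      sample = np.array(testvalue).reshape(1, -1) # shape (1, n_features) for a single sample
      print(clf.predict_proba(sample)) # one probability per trained emotion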

      – Paul

      Reply
      • Mun

        1st November 2017

        Thank you for you kindness! I will try it 🙂

        Reply
  • rob

    2nd November 2017

    Good day Sir, I am facing some issues installing Boost.Python. I downloaded the latest version and ran bootstrap.bat in my cmd, but I am facing this error:

    cl is not recognized as an internal or external command
    failed to build Boost.Build engine

    Reply
  • Rob

    3rd November 2017

    Hello sir, I downloaded the latest version of Boost.Python, and when trying to build it in the command prompt I got this error:

    c:\boost>build.bat
    ‘build.bat’ is not recognized as an internal or external command,
    operable program or batch file.

    c:\boost>bootstrap.bat
    Building Boost.Build engine

    Failed to build Boost.Build engine.
    Please consult bootstrap.log for further diagnostics.

    You can try to obtain a prebuilt binary from

    http://sf.net/project/showfiles.php?group_id=7586&package_id=72941

    Also, you can file an issue at http://svn.boost.org
    Please attach bootstrap.log in that case.
    Please, I need your help on this, Sir. What do you think I am doing wrong? I am using the Visual Studio 2017 developer command prompt.

    Reply
    • Paul van Gent

      8th November 2017

      Did you install CMake and add it to the PATH variable?

      Reply
      • rob

        8th November 2017

        I already did that but I am still having the same error.

        Reply
  • Oran

    8th November 2017

    Hi,
    and thanks for your work and for publishing it.

    I just wanted to mention that I was able to get the CK+ and CK databases here:
    http://www.consortium.ri.cmu.edu/ckagree/

    Reply
  • Kooper

    25th November 2017

    Hi.
    I'm trying to follow this tutorial but I'm stuck at the beginning.
    I don't know why I can't download the CK+ dataset.
    I end up getting 403 Forbidden as a response.
    Is there anyone who can help me?

    Reply
    • Paul van Gent

      4th December 2017

      Hi Kooper. The availability of the dataset is intermittent. Unfortunately there is not much we can do about this. You could look at alternatives such as the Yale Face set.

      Reply
