Let’s improve on the emotion recognition from a previous article about FisherFace classifiers. This time we will use facial landmarks and a machine learning algorithm, and see how well we can predict emotions across different individuals, rather than for a single individual as in another article about the emotion recognising music player.
Important: The code in this tutorial is licensed under the GNU 3.0 open source license and you are free to modify and redistribute the code, given that you give others you share the code with the same right, and cite my name (use citation format below). You are not free to redistribute or modify the tutorial itself in any way. By reading on you agree to these terms. If you disagree, please navigate away from this page.
Troubleshooting: I assume intermediate knowledge of Python for these tutorials. If you don’t have this, please try a few more basic tutorials first or follow an entry-level course on coursera or something similar. This also means you know how to interpret errors. Don’t panic, but first read the error, google it if you don’t know the solution, and only then ask for help. I’m getting too many emails and requests over very simple errors. Part of learning to program is learning to debug on your own as well. If you really can’t figure it out, let me know.
Citation format
van Gent, P. (2016). Emotion Recognition Using Facial Landmarks, Python, DLib and OpenCV. A tech blog about fun things with Python and embedded electronics. Retrieved from: http://www.paulvangent.com/2016/08/05/emotion-recognition-using-facial-landmarks/
IE users: I’ve gotten several reports that sometimes the code blocks don’t display correctly or at all on Internet Explorer. Please refresh the page and they should display fine.
Introduction and getting started
Using facial landmarks is another approach to detecting emotions, one that is more robust and powerful than the FisherFace classifier we used earlier, but that also requires some more code and modules. Nothing insurmountable though. We need to do a few things:
- Get images from a webcam
- Detect Facial Landmarks
- Train a machine learning algorithm (we will use a linear SVM)
- Predict emotions
Those who followed the two previous posts about emotion recognition will know that the first step is already done.
Also we will be using:
- Python (2.7 or higher is fine, anaconda + jupyter notebook is a nice combo-package)
- OpenCV (I still use 2.4.9… so lazy; grab it here)
- SKLearn (if you installed anaconda, it is already there, otherwise get it with pip install sklearn)
- Dlib (a C++ library for extracting the facial landmarks, see below for instructions)
- Visual Studio 2015 (get the community edition here, also select the Python Tools and the Common tools for visual c++ in the installation dialog)
Installing and building the required libraries
I am on Windows, and building libraries on Windows always leaves a bad taste in many people’s mouths. I can understand why, but it’s not all bad: often the problems people run into are solved by correctly setting PATH variables, providing the right compiler, or reading the error messages and installing the right dependencies. I will walk you through the process of compiling and installing Dlib.
First install CMake. This should be straightforward, download the windows installer and install. Make sure to select the option “Add CMake to the system PATH” during the install. Choose whether you want this for all users or just for your account.
Download Boost-Python and extract the package. I extracted it into C:\boost but it can be anything. Fire up a command prompt and navigate to the directory. Then do:
bootstrap.bat #First run the bootstrap.bat file supplied with boost-python
#Once it finished invoke the install process of boost-python like this:
b2 install #This can take a while, go get a coffee
#Once this finishes, build the python modules like this
b2 -a --with-python address-model=64 toolset=msvc runtime-link=static #Again, this takes a while, reward yourself and get another coffee.
Once all is done you will find a folder named bin, or bin.v2, or something like this in your boost folder. Now it’s time to build Dlib.
Download Dlib and extract it somewhere. I used C:\Dlib but you can do it anywhere. Go back to your command prompt, or open a new one if you closed it, and navigate to your Dlib folder. Do this sequentially:
# Set two flags so that the CMake compiler knows where to find the boost-python libraries
set BOOST_ROOT=C:\boost #Make sure to set this to the path you extracted boost-python to!
set BOOST_LIBRARYDIR=C:\boost\stage\lib #Same as above
# Create and navigate into a directory to build into
mkdir build
cd build
# Build the dlib tools
cmake ..
#Navigate up one level and run the python setup program
cd ..
python setup.py install #This takes some time as well. GO GET ANOTHER COFFEE TIGER!
Open your Python interpreter and type “import dlib”. If you receive no messages, you’re good to go! Nice.
Testing the landmark detector
Before diving into much of the coding (which probably won’t be much because we’ll be recycling), let’s test the Dlib installation on your webcam. For this you can use the following snippet. If you want to learn how this works, be sure to also compare it with the first script under “Detecting your face on the webcam” in the previous post; much of the OpenCV code used there to talk to your webcam, convert the image to grayscale, optimise the contrast with adaptive histogram equalisation and display the result is reused here.
#Import required modules
import cv2
import dlib

#Set up some required objects
video_capture = cv2.VideoCapture(0) #Webcam object
detector = dlib.get_frontal_face_detector() #Face detector
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat") #Landmark identifier. Set the filename to whatever you named the downloaded file

while True:
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
    clahe_image = clahe.apply(gray)

    detections = detector(clahe_image, 1) #Detect the faces in the image

    for k,d in enumerate(detections): #For each detected face
        shape = predictor(clahe_image, d) #Get coordinates
        for i in range(0, 68): #There are 68 landmark points on each face
            cv2.circle(frame, (shape.part(i).x, shape.part(i).y), 1, (0,0,255), thickness=2) #For each point, draw a red circle with thickness 2 on the original frame

    cv2.imshow("image", frame) #Display the frame

    if cv2.waitKey(1) & 0xFF == ord('q'): #Exit program when the user presses 'q'
        break
This will show your webcam feed with your face covered in dots outlining its shape and all the “moveable parts”. The latter is of course important, because the moveable parts are what make emotional expressions possible.
Note: if you have no webcam or would rather try this on a static image, replace the video_capture.read() call with something like frame = cv2.imread("filename") and comment out the line where we define the video_capture object (see the sketch after the images below). You will get something like:
[Images: my webcam frame with the detected landmarks drawn as red dots. Captions: “my face has dots”, “people tell me my face has nice dots”, “experts tell me these are the best dots”, “I bet I have the best dots”.]
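If you want to try the static-image variant mentioned above, a minimal sketch could look like this (the filename is a placeholder, not a file supplied with the tutorial):

import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))

frame = cv2.imread("my_image.jpg") #replaces video_capture.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
clahe_image = clahe.apply(gray)

for d in detector(clahe_image, 1): #for each detected face
    shape = predictor(clahe_image, d)
    for i in range(0, 68):
        cv2.circle(frame, (shape.part(i).x, shape.part(i).y), 1, (0,0,255), thickness=2)

cv2.imshow("image", frame)
cv2.waitKey(0) #wait for a keypress instead of looping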
Extracting features from the faces
The first thing to do is find ways to transform these nice dots overlaid on your face into features to feed the classifier. Features are little bits of information that describe the object or object state that we are trying to divide into categories. Is this description a bit abstract? Imagine you are in a room without windows with only a speaker and a microphone. I am outside this room and I need to make you guess whether there is a cat, dog or a horse in front of me. The rule is that I can only use visual characteristics of the animal, no names or comparisons. What do I tell you? Probably whether the animal is big or small, that it has fur, that the fur is long or short, that it has claws or hooves, whether it has a tail made of flesh or just of hair, etcetera. Each bit of information I pass you can be considered a feature, and based on the same feature set for each animal, you would be pretty accurate if I chose the features well.
How you extract features from your source data is actually where a lot of research effort goes: it’s not just about creating better classifying algorithms, but also about finding better ways to collect and describe data. The same classifying algorithm might function tremendously well or not at all depending on how well the information we feed it is able to discriminate between different objects or object states. If, for example, we extracted eye colour and the number of freckles on each face, fed that to the classifier, and then expected it to predict what emotion is expressed, we would not get far. However, the facial landmarks from the same image material describe the position of all the “moving parts” of the depicted face, the things you use to express an emotion. This is certainly useful information!
To get started, let’s take the code from the example above and change it so that it fits our current needs, like this:
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def get_landmarks(image):
    detections = detector(image, 1)
    landmarks = [] #Will hold all landmarks in one list in the format x1,y1,x2,y2,etc.
    for k,d in enumerate(detections): #For all detected face instances individually
        shape = predictor(image, d) #Draw Facial Landmarks with the predictor class
        xlist = []
        ylist = []
        for i in range(0, 68): #Store X and Y coordinates in two lists
            xlist.append(float(shape.part(i).x))
            ylist.append(float(shape.part(i).y))

        for x, y in zip(xlist, ylist): #Store all landmarks in one list in the format x1,y1,x2,y2,etc.
            landmarks.append(x)
            landmarks.append(y)

    if len(detections) > 0:
        return landmarks
    else: #If no faces are detected, return error message to other function to handle
        landmarks = "error"
        return landmarks
The .dat file mentioned can be found in the Dlib zip file you downloaded, or alternatively at http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2.
Here we extract the coordinates of all face landmarks. These coordinates are the first collection of features, and this might be the end of the road. You might also continue and try to derive other measures that will tell the classifier more about what is happening on the face. Whether this is necessary or not depends. For now let’s assume it is necessary, and look at ways to extract more information from what we have. Feature generation is always a good thing to try, if only because it brings you closer to the data and might give you ideas or alternative views of it because you’re getting your hands dirty. Later on we’ll see whether it was really necessary at a classification level.
To start, look at the coordinates. They may change as my face moves to different parts of the frame. I could be expressing the same emotion in the top left of an image as in the bottom right of another image, but the resulting coordinate matrix would express different numerical ranges. However, the relationships between the coordinates will be similar in both matrices so some information is present in a location invariant form, meaning it is the same no matter where in the picture my face is.
Maybe the most straightforward way to remove numerical differences originating from faces in different places of the image would be normalising the coordinates between 0 and 1. This is easily done by x_norm = (x - min(x)) / (max(x) - min(x)) and y_norm = (y - min(y)) / (max(y) - min(y)), or to put it in code:
xnorm = [(i-min(xlist))/(max(xlist)-min(xlist)) for i in xlist]
ynorm = [(i-min(ylist))/(max(ylist)-min(ylist)) for i in ylist]
However, there is a problem with this approach: it squeezes the entire face into a square with both axes ranging from 0 to 1. Imagine one face with its eyebrows up high and mouth open; the person could be surprised. Now imagine an angry face with eyebrows down and mouth closed. If we normalise the landmark points on both faces to 0-1 and put them next to each other, we might see two very similar faces: because the distinguishing features lie at the extremes of each face, normalising pushes both back into a very similar shape. Take a moment to appreciate what we have done; we have thrown away most of the variation that would have allowed us to tell the two emotions apart in the first place! This probably will not work. Of course some variation remains from the open mouth, but it would be better not to throw so much away.
A less destructive way could be to calculate the position of all points relative to each other. To do this we calculate the mean of both axes, which results in the point coordinates of the sort-of “centre of gravity” of all face landmarks. We can then get the position of all points relative to this central point. Let me show you what I mean. Here’s my face with landmarks overlaid:
First we add a “centre of gravity”, shown as a blue dot on the image below:
Lastly we draw a line between the centre point and each other facial landmark location:
Note that each line has both a magnitude (distance between both points) and a direction (angle relative to image where horizontal=0°), in other words, a vector.
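As a tiny worked example (with made-up numbers, purely to illustrate the vector idea): a landmark at (120, 80) with the centre of gravity at (100, 100) gives the following magnitude and direction:

import math
import numpy as np

#Made-up example point and centre of gravity, just to show the vector idea
dx, dy = 120 - 100, 80 - 100
magnitude = np.linalg.norm((dx, dy))          #about 28.3 pixels
direction = math.degrees(math.atan2(dy, dx))  #about -45 degrees (y grows downward in image coordinates)
print(magnitude, direction)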
But, you may ask, why don’t we take for example the tip of the nose as the central point? This would work as well, but it would also throw extra variance into the mix due to short, long, high- or low-tipped noses. The “centre point method” also introduces extra variance (the centre of gravity shifts when the head turns away from the camera), but I think this effect is smaller than with the nose-tip method, because most faces more or less face the camera in our sets. There are techniques to estimate head pose and correct for it, but that is beyond this article.
There is one last thing to note. Faces may be tilted, which might confuse the classifier. We can correct for this rotation by assuming that the bridge of the nose in most people is more or less straight, and offsetting all calculated angles by the angle of the nose bridge. This rotates the entire vector array so that tilted faces become similar to non-tilted faces with the same expression. Below are two images: the left one illustrates what happens in the code when the angles are calculated; the right one shows how we can calculate the offset correction by taking the nose bridge, finding the angle it makes relative to the image, and thus finding the angular offset β we need to apply.
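Note that the get_landmarks() version shown below does not apply this correction yet. A minimal sketch of how the offset could be estimated, assuming the standard 68-point layout where part 27 is the top of the nose bridge and part 30 is the nose tip (these indices are my assumption, not taken from the original code):

import math

def nose_bridge_offset(xlist, ylist):
    #Sketch: estimate head tilt from the nose bridge (assumed points 27 = top, 30 = tip)
    dx = xlist[27] - xlist[30]
    dy = ylist[27] - ylist[30]
    angle = math.degrees(math.atan2(dy, dx)) #angle of the nose bridge relative to the image
    #An upright nose bridge points straight up, which is -90 degrees with this convention,
    #so the deviation from -90 is the tilt offset
    return angle + 90

Subtracting this offset from each landmark angle would rotate the whole vector array so that tilted faces line up with upright ones.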
Now let’s look at how to implement what I described above in Python. It’s actually fairly straightforward. We just slightly modify the get_landmarks() function from above.
import math
import numpy as np

data = {} #Make dictionary to hold the extracted features

def get_landmarks(image):
    detections = detector(image, 1)
    for k,d in enumerate(detections): #For all detected face instances individually
        shape = predictor(image, d) #Draw Facial Landmarks with the predictor class
        xlist = []
        ylist = []
        for i in range(0, 68): #Store X and Y coordinates in two lists
            xlist.append(float(shape.part(i).x))
            ylist.append(float(shape.part(i).y))

        xmean = np.mean(xlist) #Find both coordinates of centre of gravity
        ymean = np.mean(ylist)
        xcentral = [(x-xmean) for x in xlist] #Calculate distance centre <-> other points in both axes
        ycentral = [(y-ymean) for y in ylist]

        landmarks_vectorised = []
        for x, y, w, z in zip(xcentral, ycentral, xlist, ylist):
            landmarks_vectorised.append(w)
            landmarks_vectorised.append(z)
            meannp = np.asarray((ymean,xmean))
            coornp = np.asarray((z,w))
            dist = np.linalg.norm(coornp-meannp)
            landmarks_vectorised.append(dist) #Magnitude of the vector centre <-> point
            landmarks_vectorised.append((math.atan2(y, x)*360)/(2*math.pi)) #Direction of the vector in degrees
        data['landmarks_vectorised'] = landmarks_vectorised

    if len(detections) < 1:
        data['landmarks_vectorised'] = "error" #If no faces are detected, flag this so the calling code can skip the image
That was actually quite manageable, no? Now it’s time to put all of the above together with some stuff from the first post. The goal is to read the existing dataset into a training and prediction set with corresponding labels, train the classifier (we use Support Vector Machines with a linear kernel from SKLearn, but feel free to experiment with other available kernels such as polynomial or rbf, or other classifiers!), and evaluate the result. This evaluation will be done in two steps: first we get an overall accuracy over ten different data segmentation, training and prediction runs; second we will evaluate the predictive probabilities.
Déjà Vu All Over Again
The next thing we will be doing is returning to the two datasets from the original post. Let’s see how this approach stacks up.
First let’s write some code. The approach is to first extract facial landmark points from the images, randomly divide 80% of the data into a training set and 20% into a test set, then feed these into the classifier and train it on the training set. Finally we evaluate the resulting model by predicting what is in the test set to see how the model handles the unknown data. Basically a lot of the steps are the same as what we did earlier.
The quick and dirty (I will clean and ‘pythonify’ the code later, when there is time) solution based on the earlier code could be something like:
import cv2
import glob
import random
import math
import numpy as np
import dlib
from sklearn.svm import SVC

emotions = ["anger", "contempt", "disgust", "fear", "happiness", "neutral", "sadness", "surprise"] #Emotion list
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat") #Or set this to whatever you named the downloaded file
clf = SVC(kernel='linear', probability=True, tol=1e-3)#, verbose = True) #Set the classifier as a support vector machine with linear kernel

data = {} #Make dictionary for all values
#data['landmarks_vectorised'] = []

def get_files(emotion): #Define function to get file list, randomly shuffle it and split 80/20
    files = glob.glob("dataset\\%s\\*" %emotion)
    random.shuffle(files)
    training = files[:int(len(files)*0.8)] #get first 80% of file list
    prediction = files[-int(len(files)*0.2):] #get last 20% of file list
    return training, prediction

def get_landmarks(image):
    detections = detector(image, 1)
    for k,d in enumerate(detections): #For all detected face instances individually
        shape = predictor(image, d) #Draw Facial Landmarks with the predictor class
        xlist = []
        ylist = []
        for i in range(0, 68): #Store X and Y coordinates in two lists
            xlist.append(float(shape.part(i).x))
            ylist.append(float(shape.part(i).y))

        xmean = np.mean(xlist)
        ymean = np.mean(ylist)
        xcentral = [(x-xmean) for x in xlist]
        ycentral = [(y-ymean) for y in ylist]

        landmarks_vectorised = []
        for x, y, w, z in zip(xcentral, ycentral, xlist, ylist):
            landmarks_vectorised.append(w)
            landmarks_vectorised.append(z)
            meannp = np.asarray((ymean,xmean))
            coornp = np.asarray((z,w))
            dist = np.linalg.norm(coornp-meannp)
            landmarks_vectorised.append(dist)
            landmarks_vectorised.append((math.atan2(y, x)*360)/(2*math.pi))
        data['landmarks_vectorised'] = landmarks_vectorised

    if len(detections) < 1:
        data['landmarks_vectorised'] = "error"

def make_sets():
    training_data = []
    training_labels = []
    prediction_data = []
    prediction_labels = []
    for emotion in emotions:
        print(" working on %s" %emotion)
        training, prediction = get_files(emotion)
        #Append data to training and prediction lists, and generate labels 0-7
        for item in training:
            image = cv2.imread(item) #open image
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) #convert to grayscale
            clahe_image = clahe.apply(gray)
            get_landmarks(clahe_image)
            if data['landmarks_vectorised'] == "error":
                print("no face detected on this one")
            else:
                training_data.append(data['landmarks_vectorised']) #append feature vector to training data list
                training_labels.append(emotions.index(emotion))

        for item in prediction:
            image = cv2.imread(item)
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            clahe_image = clahe.apply(gray)
            get_landmarks(clahe_image)
            if data['landmarks_vectorised'] == "error":
                print("no face detected on this one")
            else:
                prediction_data.append(data['landmarks_vectorised'])
                prediction_labels.append(emotions.index(emotion))

    return training_data, training_labels, prediction_data, prediction_labels

accur_lin = []
for i in range(0,10):
    print("Making sets %s" %i) #Make sets by random sampling 80/20%
    training_data, training_labels, prediction_data, prediction_labels = make_sets()

    npar_train = np.array(training_data) #Turn the training set into a numpy array for the classifier
    npar_trainlabs = np.array(training_labels)
    print("training SVM linear %s" %i) #train SVM
    clf.fit(npar_train, training_labels)

    print("getting accuracies %s" %i) #Use score() function to get accuracy
    npar_pred = np.array(prediction_data)
    pred_lin = clf.score(npar_pred, prediction_labels)
    print("linear: %s" %pred_lin)
    accur_lin.append(pred_lin) #Store accuracy in a list

print("Mean value lin svm: %s" %np.mean(accur_lin)) #Get mean accuracy of the 10 runs
Remember that in the previous post, for the standard set at 8 categories we managed to get 69.3% accuracy with the FisherFace classifier. This approach yields 84.1% on the same data, a lot better!
We then reduced the set to 5 emotions (leaving out contempt, fear and sadness), because these three categories had very few images, and got 82.5% correct. This approach gives 92.6%, also a big improvement.
After adding the less standardised and more difficult images from google, we got 61.6% correct when predicting 7 emotions (the contempt category remained very small so we left that out). This is now 78.2%, also quite an improvement. This remains the lowest accuracy, showing that for a more diverse dataset the problem is also more difficult. Keep in mind that the dataset I use is still quite small in machine learning terms, containing about 1000 images spread over 8 categories.
Looking at features
So we derived different features from the data, but we weren’t sure whether this was strictly necessary. So, was it? It depends! It depends on whether doing so adds unique variance related to what you’re trying to predict, on what classifier you use, etc.
Let’s run different feature combinations as inputs through different classifiers and see what happens. I’ve run all iterations on the same slice of data with 4 emotion categories of comparable size (so that running the same settings again yields the same predictive value).
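For reference, a minimal sketch of how such feature subsets could be pulled out of the vector built by get_landmarks(), assuming it stores four values per landmark in the order x, y, distance, angle (as in the code above):

all_features = data['landmarks_vectorised']
per_point = [all_features[i:i+4] for i in range(0, len(all_features), 4)] #group the values per landmark
raw_coordinates = [v for x, y, dist, angle in per_point for v in (x, y)] #just the raw coordinates
vector_only = [v for x, y, dist, angle in per_point for v in (dist, angle)] #just vector length and angle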
Using all of the features described so far leads to:
Linear SVM: 93.9%
Polynomial SVM: 83.7%
Random Forest Classifier: 87.8%
Now using just the vector length and angle:
Linear SVM: 87.8%
Polynomial SVM: 87.8%
Random Forest Classifier: 79.6%
Now using just the raw coordinates:
Linear SVM: 91.8%
Polynomial SVM: 89.8%
Random Forest Classifier: 59.2%
Now replacing all training data with zeros:
Linear SVM: 32.7%
Polynomial SVM: 32.7%
Random Forest Classifier: 32.7%
Now this is interesting! First note that there isn’t much difference in the accuracy of the support vector machine classifiers when using the extra features we generate. This type of classifier already preprocesses the data quite extensively. The extra data we generate does not contain much, if any, extra information for this classifier, so it only marginally improves the performance of the linear kernel, and actually hurts the polynomial kernel, because data with a lot of overlapping variance can also make a classification task more difficult (here, it probably results in overfitting the training data). By the way, this is a nice 2D visualisation of what an SVC tries to achieve; complexity escalates when adding even one dimension. Now remember that the SVC operates in an N-dimensional space, and try to imagine what a set of hyperplanes in 4, 8, 12, 36 or more dimensions would look like. Don’t drive yourself crazy.
Random Forest Classifiers do things a lot differently. Essentially they are a forest of decision trees. Simplified, each tree is a long list of yes/no questions, and answering all of them leads to a conclusion. In the forest, the correlation between each tree and the others is kept as low as possible, which ensures every tree brings something unique to the table when explaining patterns in the data. Each tree then votes on what it thinks the answer is, and the answer with the most votes wins. This approach benefits extensively from the new features we generated, jumping from 59.2% to 87.8% accuracy as we combine all derived features with the raw coordinates.
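If you want to try this yourself, a minimal sketch of swapping in SKLearn’s Random Forest (the n_estimators value here is an arbitrary choice of mine, not taken from the runs above):

from sklearn.ensemble import RandomForestClassifier

rf_clf = RandomForestClassifier(n_estimators=100) #a forest of 100 decision trees
rf_clf.fit(npar_train, training_labels) #train on the same feature arrays as the SVM
print("Random Forest: %s" %rf_clf.score(npar_pred, prediction_labels))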
So you see, the answer you likely get when you ask any scientist a direct question holds true here as well: it depends. Check your data, think twice and don’t be afraid to try a few things.
The last thing to notice is that, when not adding any data at all and instead presenting the classifiers with a matrix of zeros, they still perform slightly above the expected chance level of 25%. This is because the categories are not identically sized.
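To see what baseline your particular split actually corresponds to, you can compute the majority-class accuracy yourself; a small sketch, assuming the label lists from the code above:

import numpy as np

labels = np.array(prediction_labels)
counts = np.bincount(labels) #how many test samples there are per category
print("majority-class baseline: %s" %(counts.max() / float(len(labels))))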
Looking at mistakes
Lastly, let’s take a look at where the model goes wrong. Often this is where you can learn a lot; for example, you might find that a single category just doesn’t work at all, which can lead you to look critically at the training material again.
One advantage of the SVM classifier we use is that it is probabilistic. This means that it assigns probabilities to each category it has been trained for (and you can get these probabilities if you set the ‘probability’ flag to True). So, for example, a single image might be “happy” with 85% probability, “angry” with 10% probability, etc.
To get the classifier to return these probabilities you can use its predict_proba() function. You give this function either a single data row to predict, or feed it your entire dataset. It will return a matrix where each row corresponds to one prediction and each column represents a category. I wrote these probabilities to a table and included the source image and label. Looking at some mistakes, here are some notable things that were classified incorrectly (note these are only images from my google set; the CK+ set’s terms prohibit me from publishing its images for privacy reasons):
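For reference, this is roughly how such a probability table can be produced; a minimal sketch, assuming the classifier was created with probability=True and trained as above. The misclassified examples follow right after it.

proba = clf.predict_proba(npar_pred) #one row per test image, one column per trained category
for row, true_label in zip(proba, prediction_labels):
    ranked = sorted(zip(clf.classes_, row), key=lambda pair: pair[1], reverse=True) #most likely first
    print("true emotion: %s" %emotions[true_label])
    for class_index, p in ranked:
        print("  %s: %.3f" %(emotions[int(class_index)], p))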
anger: 0.03239878
contempt: 0.13635423
disgust: 0.0117559
fear: 0.00202098
neutral: 0.7560004
happy: 0.00382895
sadness: 0.04207027
surprise: 0.0155695
The correct answer is contempt. To be honest I would agree with the classifier, because the expression really is subtle. Note that contempt is the second most likely according to the classifier.
anger: 0.0726657
contempt: 0.24655082
disgust: 0.06427896
fear: 0.02427595
neutral: 0.20176133
happy: 0.03169822
sadness: 0.34911036
surprise: 0.00965867
The correct answer is disgust. Again I can definitely understand the mistake the classifier makes here (I might make the same mistake..). Disgust would be my second guess, but not the classifier’s. I have removed this image from the dataset because it can be ambiguous.
anger: 0.00304093
contempt: 0.01715202
disgust: 0.74954754
fear: 0.04916257
neutral: 0.00806644
happy: 0.13546932
sadness: 0.02680473
surprise: 0.01075646
The correct answer is obviously happy. This is a mistake that is less understandable, but the model is still quite sure (~75%). There definitely is no hint of disgust in her face. Do note, however, that happiness would be the classifier’s second guess. More training material might rectify this situation.
anger: 0.0372873
contempt: 0.08705531
disgust: 0.12282577
fear: 0.16857784
neutral: 0.09523397
happy: 0.26552763
sadness: 0.20521671
surprise: 0.01827547
The correct answer is sadness. Here the classifier is not sure at all (~27%)! Like in the previous image, the second guess (~20%) is the correct answer. This may very well be fixed by having more (and more diverse) training data.
anger: 0.01440529
contempt: 0.15626157
disgust: 0.01007962
fear: 0.00466321
neutral: 0.378776
happy: 0.00554828
sadness: 0.07485257
surprise: 0.35541345
The correct answer is surprise. Again a near miss (~38% vs ~36%)! Also note that this is particularly difficult because there are few baby faces in the dataset. When I said earlier that the extra google images are very challenging for a classifier, I meant it!
Upping the game – the ultimate challenge
Although the small google dataset I put together is more challenging than the lab-conditions of the CK/CK+ dataset, it is still somewhat controlled. For example I filtered out faces that were more sideways than frontal-facing, where the emotion was very mixed (happily surprised for example), and also where the emotion was so subtle that even I had trouble identifying it.
A far greater (and more realistic) challenge is the SFEW/AFEW dataset, put together from a large collection of movie scenes. Read more about it here. The set is not publicly available, but the author was generous enough to share it with me so that I could evaluate the approach further.
Guess what: it fails miserably! It attained about 44.2% on the images when training on 90% and validating on 10% of the set. Although this is on par with what is mentioned in the paper, it shows there is still a long way to go before computers can recognise emotions with high enough accuracy in real-life settings. The set also contains video clips, on which we will spend another post, together with convolutional neural nets, at a later time.
This set is particularly difficult because it contains different expressions, facial poses and rotations for similar emotions. This was the authors’ purpose: techniques by now are good enough to recognise emotions on controlled datasets with images taken in lab-like conditions, approaching upper-90% accuracy in many recent works (even our relatively simple approach reached the low 90s). However, these sets do not represent real-life settings very well, except maybe the laptop-webcam setting, because you always more or less face that device and sit at a comparable distance when using a laptop. This means that for applications in marketing and similar fields the technology is already usable, albeit with much room for improvement and some expertise required to implement it correctly.
Final reflections
Before concluding I want you to take a moment, relax, sit back and think. Take for example the SFEW set with real-life examples, accurate classification of which quickly gets terribly difficult. We humans perform this recognition task remarkably well thanks to our highly complex visual system, which has zero problems with object rotation in all planes, different face sizes, different facial characteristics, extreme changes in lighting conditions or even partial occlusion of a face. Your first response might be “but that’s easy, I do it all the time!”, but it’s really, really, really not. Think for a moment about what an enormously complex problem this really is. I can show you a mouth and you would already be quite good at seeing an emotion. I can show you about 5% of a car and you could recognise it as a car easily; I can even warp and destroy the image and your brain would laugh at me and tell me “easy, that’s a car bro”. This is a task that you solve constantly and in real-time, without conscious effort, with virtually 100% accuracy and while using only the equivalent of ~20 watts for your entire brain (not just the visual system). The average still-not-so-good-at-object-recognition CPU+GPU home computer uses 350-450 watts when computing. Then there are supercomputers like the TaihuLight, which requires about 15,300,000 watts (using in one hour what the average Dutch household uses in 5.1 years). At least at visual tasks, you still outperform these things by quite a large margin with only 0.00013% of their energy budget. Well done, brain!
Anyway, to try and tackle this problem digitally we need another approach. In another post we will look at various forms of neural nets (modeled after your brain) and how these may or may not solve the problem, and also at some other feature extraction techniques.
The CK+ dataset was used for training and validating the classifier in this article. References to the set are:
- Kanade, T., Cohn, J. F., & Tian, Y. (2000). Comprehensive database for facial expression analysis. Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition (FG’00), Grenoble, France, 46-53.
- Lucey, P., Cohn, J. F., Kanade, T., Saragih, J., Ambadar, Z., & Matthews, I. (2010). The Extended Cohn-Kanade Dataset (CK+): A complete expression dataset for action unit and emotion-specified expression. Proceedings of the Third International Workshop on CVPR for Human Communicative Behavior Analysis (CVPR4HB 2010), San Francisco, USA, 94-101.
The SFEW/AFEW dataset used for evaluation is authored by and described in:
- A. Dhall, R. Goecke, S. Lucey and T. Gedeon, “Collecting Large, Richly Annotated Facial-Expression Databases from Movies”, IEEE MultiMedia 19 (2012) 34-41.
- A. Dhall, R. Goecke, J. Joshi, K. Sikka and T. Gedeon, “Emotion Recognition In The Wild Challenge 2014: Baseline, Data and Protocol”, ACM ICMI 2014.
Bas
August 11, 2016This software is going to be Huuuuggeeee!
Zach
October 6, 2016This is amazing – thank you for providing a human readable walkthrough! I was not learning much reading the many post-doc borg machine code style walkthroughs Google keeps pointing me to.
Paul van Gent
October 6, 2016Thanks :)! This is exactly why I decided to work out something myself and share it. Glad it helped.
jettin
October 7, 2016sir,
I was unable to install boost python i am getting an error like : mscv.jam no such a file or directory. i have vs2015
Paul van Gent
October 20, 2016Hi Jettin,
I’ve never seen this error. mscv.jam should be msvc.jam though. Are you setting your compiler name correctly?
Jone
October 20, 2016When I run the solution code above, the pred_lin I got was always 1.0 no matter how I changed the ratio of the training and test set. I just used the CK+ dataset, and put them into 8 respective folders. Did I miss something, or something wrong with the code?
Paul van Gent
October 20, 2016Hi Jone,
If I run it it functions fine. What likely happens is that you have overlap between your testing and training data. For example if I set the size of the training set to 1.0 and leave the testing set at 0.1 it also returns 100% accuracy. This is because the model easily remembers what it has already seen, but then you still have no information about how well it generalizes. Use the settings from the tutorial.
Also, I’ve written the train/test split out because it gives a more clear image of what happens. If you can’t get it to work, look for a simpler approach such as SKLearns train_test_split() function.
Good luck!
Stanislav
November 9, 2016Hi all !
My configuration is : win7 – 32, Microsoft Visual Studio 15 (community), Phyton3.5, Cmake3.7.0, Boost1.6.2.0
I try this instructions and find small bug , in string
b2 -a --with-python address-model=64 toolset=mscv runtime-link=static
Need replace the string toolset=mscv, to toolset=msvc !!! Its not a joke , i find this rightly string option in the bootstrap.bat
For win32 my string is:
b2 -a --with-python address-model=32 toolset=msvc runtime-link=static
work fine !!
Paul van Gent
November 9, 2016Thanks Stanislav for catching that, seems to be a typo! It should indeed be msvc (microsoft visual c++). I’ve updated it.
Stanislav
November 9, 2016Thanks.. And I remember. The file in boost library /dlib/cmake_utils/add_python_modue caused error – “Not find header for python-py34”.
Replace in file add_python_modue next :
FIND_PACKAGE(Boost boost-python COMPONENTS python-py34 )
if (NOT Boost_FOUND)
FIND_PACKAGE(Boost boost-python COMPONENTS python-py35)
endif()
if (NOT Boost_FOUND)
FIND_PACKAGE(Boost COMPONENTS python3)
endif()
if (NOT Boost_FOUND)
FIND_PACKAGE(Boost COMPONENTS python)
endif()
to
FIND_PACKAGE(Boost COMPONENTS system)
if (NOT Boost_FOUND)
FIND_PACKAGE(Boost COMPONENTS thread )
endif()
if (NOT Boost_FOUND)
FIND_PACKAGE(Boost COMPONENTS python)
endif()
if (NOT Boost_FOUND)
FIND_PACKAGE(Boost COMPONENTS REQUIRED )
endif()
And everything will be work fine ! Sorry for my bad english..
Francisco Gonzalez Hernandez
November 24, 2016Hi Paul, I’ve used your code and I’ve obtained some good results, your work is fantastic. By the way, I want to cite you on a scientific paper, do you have any scientific paper where you want to be citated?, also, I don’t know if you can give me more information about the used works to create this work. I look forward to reading from you soon and thanks.
Paul van Gent
November 25, 2016Hi Francisco,
I’m happy to hear this! I’ve sent you a mail with detailed information. I don’t have the papers describing this methodology handy right now, but a quick search for “facial landmark generation” and “emotion recognition using support vector machines” should return a good overview of the field.
Cheers,
Paul
Ying
November 29, 2016Hi, Paul,
Thanks for your work and your post. I am working on a project of emotion recognition right now and your post is a saver.
I have some doubts though, wondering if you have an answer.
Do you think it is possible to do all this work in Linux (Ubuntu)? Even more in Raspberry Pi (also Ubuntu)?
Thanks,
Ying
Paul van Gent
November 29, 2016
Hi Ying,
The great thing about Python is that it is very cross-platform. Make sure you install the right dependencies. Also the code for reading from the webcam might be a bit different (in linux it lives under /dev/video). What are you making?
Ying
December 5, 2016I am making a facial expression (emotions) recognition algorithm which will associate users’ emotions with some specific pieces of music. Cause I am cooperating with some music artists. They will study the relation between emotions and music.
Ying
December 5, 2016By the way, I have sent my email address for demanding the dataset of face emotions you used in your previous post, always no reply.
Kazuno
December 21, 2016Hi, Paul…
Currently, I am working on a project of emotion recognition through webcam. I’ve used your code and it’s really saved my life. Thanks for your work. But, I’m little bit confused with clf.predict. I didn’t know how to show emotion label for each video frame. Please help me out.
Paul van Gent
December 27, 2016Glad to hear it helped. I’m not sure what you mean. Do you want to overlay the emotion label on a real-time video stream?
Kazuno
January 12, 2017Owh, sorry. My bad. That’s not what I meant. Regarding your tutorial, you only show the training part and fit training set to classifier. But doesn’t show how to predict emotion through webcam based on facial landmark using ctf.predict. I know how to used fishercascade prediction based on your previous tutorial, but I just don’t know how to implement ctf.predict in this tutorial. Please help me out. Thank you.
Paul van Gent
February 8, 2017No I didn’t use the fishercascade prediction from the previous tutorial. Here I use DLib to mark face features and use different classifiers (random forest, SVM) to predict emotions.
Once the model is trained as shown in the tutorial, you can feed it images from a webcam just as well as images from a dataset. Based on this and the previous tutorial, you should be able to figure out how to do that :).
You can view the SKLearn documentation to see how the classifiers work. Some use a .predict() function, others a .score().
Shivam
January 28, 2017Hi, Paul…
Currently, I am working on a project of emotion recognition and i am facing problem that it’s not able to run bootstrap.bat file as it’s showing an error (there is no file named bootstrap.bat supplied with boost-python)
NameError: name ‘bootstrap’ is not defined
Can you please help me out!
thanks
Paul van Gent
February 8, 2017Hi Shivam. What OS are you on?
Shivam
February 24, 2017Sir , I am using OS- windows 8.1.
Could you please help me out.?
Paul van Gent
February 24, 2017Be sure to download the correct boost file. The most current is here:
https://sourceforge.net/projects/boost/files/boost/1.63.0/
If I download and extract the zip, the bootstrap.bat is there.
shivam
April 17, 2017Sir I have downloaded the correct boost file only from https://sourceforge.net/projects/boost/files/boost/1.63.0/
But the bootstrap.bat file is not available in boost/1.63.0 folder.
Could you please help me out with any other method to build this boost python so that I can Continue with my project??
thanks.
Gaurav
February 7, 2017Hi Paul
Which version of SKlearn did you use?
Paul van Gent
February 8, 2017Hi Gaurav. I use 0.18.
Blackwood
February 11, 2017Hi,Paul
This an amazing idea. But when i use visual studio 2013 to do it ,if find the predict result is very bad. the probility event not up to 20%.
I use libsvm to train the model useing “C_SCV, LINEAR”,every sample have 272(68*4)features.and the model file is about 170Mbytes.
is this right?
thank you.
Paul van Gent
February 13, 2017Hi Blackwood. I don’t know if the model size is right. What kind of dataset do you use to train the classifier? My guess is that either the dataset is too small, doesn’t contain a lot (or too little) variance, or that the feature extraction somewhere goes wrong.
Blackwood
February 13, 2017Hi, Paul
I use CK/CK+ dataset,and pick the first and the last picture of each emotion sequences.
The first picture is the netural and the last picture is the emotion of the other 7 types.
There are about 650 picture in training.
the dataset is too small ?
Is each sample has 272 (68*4) features?
What is you dataset size?
thank you .
Paul van Gent
February 15, 2017Hi Blackwood,
So strange, my set is also about 650 images (647). Could you send me your full code so I can have a look (info@paulvangent.com)? If I run my code it will still attain upper .8, lower .9 accuracy.
Cheers
Blackwood
February 16, 2017Thank you Paul.
I check code,and find the svm parameter is not right and i changed it.
now, the predict result is up to 85%. I have emailed the c++ code to you mailbox,
Do you plan to do it using dnn?
Paul van Gent
February 16, 2017Hi Blackwood,
Good to hear you found the issue. There is a deep learning tutorial planned for somewhere in the near future yes, but I will need to see when I find the time to make it :). Stay tuned!
Cheers
Sujay Angadi
February 14, 2017plz provide proper links to download and install Cmake and python.boost
Paul van Gent
February 14, 2017At the time of writing of the tutorial the links worked. As far as I can see they still do. What exactly do you mean?
Also, google is your friend 🙂
Paul van Gent
February 14, 2017Ah I see the CMake link pointed to jupyter, strange. I updated it.
Ashwin
February 14, 2017Hi Paul!
Great tutorial. I’m using your tutorial to find emotion using SVM as its is a part of my project.
My Configuration is as follows –
-Windows 10 64-bit
-Visual Studio 2015
-Python 2.7
-Opencv 2.4.9
-Cmake-3.8.0-rc1-win64-x64.msi
When I run the command – python setup.py install. It returns the following error –
libboost_python-vc140-mt-s-1_63.lib(errors.obj) : fatal error LNK1112: module machine type ‘x64’ conflicts with target machine type ‘X86’ [C:\Dlib\tools\python\build\dlib_.vcxproj]
Done Building Project “C:\Dlib\tools\python\build\dlib_.vcxproj” (default targets) — FAILED.
Done Building Project “C:\Dlib\tools\python\build\ALL_BUILD.vcxproj” (default targets) — FAILED.
Done Building Project “C:\Dlib\tools\python\build\install.vcxproj” (default targets) — FAILED.
Build FAILED.
“C:\Dlib\tools\python\build\install.vcxproj” (default target) (1) ->
“C:\Dlib\tools\python\build\ALL_BUILD.vcxproj” (default target) (3) ->
“C:\Dlib\tools\python\build\dlib_.vcxproj” (default target) (5) ->
(Link target) ->
libboost_python-vc140-mt-s-1_63.lib(errors.obj) : fatal error LNK1112: module machine type ‘x64’ conflicts with target machine type ‘X86’ [C:\Dlib\tools\python\build\dlib_.vcxproj]
0 Warning(s)
1 Error(s)
Time Elapsed 00:07:04.04
error: cmake build failed!
So I’m not able to move ahead. I really need your help in this.
Paul van Gent
February 14, 2017Hi Ashwin,
Thanks, glad it’s helping!
The first bit and last bit of the error is what it’s about:
“libboost_python-vc140-mt-s-1_63.lib(errors.obj) : fatal error LNK1112: module machine type ‘x64’ conflicts with target machine type ‘X86’ [C:\Dlib\tools\python\build\dlib_.vcxproj]”
One of the things in your list seems to be 32-bit. You cannot mix 32 and 64 bit architectures. Verify that Python is 64 bit, Boost is 64 bit, Dlib is 64 bit. Did you build the boost library with the 64-bit flag?
Ashwin
February 15, 2017Yes, I built the boost library with the 64-bit flag. By the way which version of Boost, Dlib and Cmake did you use in this tutorial?
Paul van Gent
February 15, 2017Are they all 64-bit? What about your python distro? I used:
– Boost 1.61
– Dlib 19.2
– Cmake 3.7.1
However, I highly doubt that versions matter. The error is quite specific about there being an instruction set incompatibility.
Ashwin
February 16, 2017Hi Paul. Thanks. I did install all 64 bit modules and it worked. But now when I execute the first code it gives me the following error-
predictor = dlib.shape_predictor(“shape_predictor_68_face_landmarks.dat”) #Landmark identifier. Set the filename to whatever you named the downloaded file
RuntimeError: Unable to open shape_predictor_68_face_landmarks.dat
So do i need to install that particular file ?
Sorry I’m new to python.
Paul van Gent
February 16, 2017Yes, this file is the trained model for the face detector in Dlib. Without it, it doesn’t know what a face looks like.
I think the download link is in the tutorial.
Good luck :)!
Yuki
March 21, 2017Hi Paul.
I have the same problem as Ashwin.
I used:
-Python 2.7.13 64bit.
-dlib 19.1 (I don’t know how to check is it 64bit or not)
-Boost 1.63.0 (with the flag “address-model=64”)
-cmake 3.6.1
This error had trouble me more 1 week, tried a lot method and still can’t solve it QAQ
Tony
February 15, 2017Hi Paul,
Excellent tutorials. I have a doubt. The accuracy of both landmark as well as the fishare face for me are quite low, around the low thirties. Any idea why ? I am using the same data set and same algorithm.
Any idea regarding this ?
Thanks
Paul van Gent
February 15, 2017Hi Tony,
Unfortunately I have no idea from a distance. Could you send me your full code (info@paulvangent.com)? I’ll have a look.
Cheers
Blackwood
February 17, 2017Hi Paul:
I hope to use more pictures to train the model . but i am not as lucky as you to get the SFEW/AFEW dataset. I have email the AFEW/SFEW downloading requirement few days ago, but no reply comes back.
Can you tell me how and where i can get the dataset else ?
thank you.
Paul van Gent
February 17, 2017There is no other place to get that one, however you could also try making your own. Extract faces from google image search, movies, art, etc. It’s more work but you have control over how big and how varied you want your dataset to be.
Luis
March 1, 2017Hello. I followed both tutorials for emotion recognition and everything worked smoothly 🙂 Now I’m looking to implement it by using deep learning and/or neural networks. Could you please recommend me how to start? I mean, what could I use as inputs in a neural network? Could it be 5 images (one emotion each)? What would be the next step? I’m a bit lost here 😛 Thanks!
Paul van Gent
March 13, 2017Hi Luis,
You could read up on Google’s tensorflow. Theano in Python is another popular deep learning framework. Or you could wait a week or two. I think I’ll have a deep learning emotion recognition tutorial ready by then :).
Cheers
Rachnaa R
February 21, 2018Dear Sir,
First of all thank you for your efforts! Believe me it helps a lot. I am a beginner to this environment.
Secondly, I coudnt find the deep learning emotion recognition tutorial you were talking about. Do you mind providing me the link of the same?
Regards,
Rachnaa R.
Raaz
April 1, 2017Hy Paul, I need help please, when running the comand cmake.. it gives me the error ‘cmake..’ is not recognized as an internal or external command, operable program or batch file. What should I do? Thanks, great job with your website
Paul van Gent
April 8, 2017Hi Raaz. When installing CMAke you need to check the box to “add it to your system path”, or manually add it to your system PATH variable.
John
April 2, 2017Can be a train model saved and used after?
Paul van Gent
April 8, 2017Hi John. Sure.
import cv2
fishface = cv2.createFisherFaceRecognizer()
[train model]
fishface.save(“filename.xml”)
load it with fishface.load(“filename.xml”).
John
April 8, 2017I mean the trained SVM ? With fisherface i saved my model, but i want to make a model with SVM
Paul van Gent
April 8, 2017Check the docs for the module you use for your SVM model. It’s always in the docs.
Aparna
April 7, 2017Hello. I keep getting the following error:
CMake Error at C:/dlib-19.4/dlib/cmake_utils/add_python_module:116 (message):
Boost python library not found.
Call Stack (most recent call first):
CMakeLists.txt:6 (include)
— Configuring incomplete, errors occurred!
See also “C:/dlib-19.4/tools/python/build/CMakeFiles/CMakeOutput.log”.
error: cmake configuration failed!
I have tried everything I could find on internet, but haven’t been successful in installing dlib. Any suggestions would be appreciated.
Thanks,
Aparna
Paul van Gent
April 8, 2017Hi Aparna. It seems boost-python is not correctly installed or located. Did it build and install correctly? Did you set the environment variables to BOOST_ROOT and BOOST_LIBRARYDIR?
Aparna
April 8, 2017Yes I did. I tried uninstalling and installing it thrice. It gets successfully installed without any errors or warning.
Aparna
April 8, 2017And Yes, I did setup the BOOST_ROOT and BOOST_LIBRARYDIR path variables. But no luck yet.
Paul van Gent
April 8, 2017You have me at a loss I’m afraid. What system do you use?
Mani Kumar
May 26, 2017Hello Aparna,
Even I faced the same issue. After searching for answers on Google
and trying every answer for over a month, finally I found a solution that works for me.
I installed the dlib using its wheel file.
Download the wheel file from this link .
I used the dlib 18.17 and not 19.4 which is the latest version.
If you check the pypi it shows there’s no dlib 19.4 package for python 2.7.
Please check this link .
And make sure you have consistent installations.
All the programs in this tutorial work on my system.
And my system configurations:
OS – Windows 10 64bit.
Python 2.7 (anaconda) – 64bit
OpenCV 2.4 – 64bit
dlib 18.17.100 – cp27, win_amd64bit => should be used with python 2.7 – 64bit.
Regards,
Mani
Paul van Gent
May 26, 2017Hi Mani. Thanks for linking to the pre-built wheel files, that is indeed also a solution. It seems your link was eaten by my spam protection. I think you mean this one: http://www.lfd.uci.edu/~gohlke/pythonlibs/
Aparna
April 8, 2017I am using Windows 10 64-Bit. The other software versions are as follows:
Boost_1.61.0
dlib_19.2
Visual Studio 2015
CMAKE_3.72
Anaconda2-4.3.1
All the above mentioned softwares are 64-bit versions.
I did follow all the steps that you have mentioned for installation.
Is there anything I am missing?
Aparna
April 12, 2017Hey Paul,
Seems like there was some problem with my Windows. I tried installing it on Ubuntu and was successful at the setup and running your code. Great work with the article. Thanks for the awesome work. I get accuracy around 45-48% and I am not sure why. Any help is appreciated.
Thanks
wiem
July 30, 2017Hi ! I’m trying installing it on Ubuntu also ! However i had some issues when I train thaa code ! would you helpe me please ??
this is my email grinawiem@gmail.com
Thank you a lot
Paul van Gent
July 30, 2017Hi Wiem. Please send me a message with the issues you’re facing and the code. The address is info@paulvangent.com
-Paul
Prakhar
April 23, 2017Hey Paul,
On running bootstrap.bat i keep getting the error that cl is not recognised as an external of internal command. Where am I going wrong?
Paul van Gent
April 25, 2017Hi Prakhar.
http://stackoverflow.com/questions/8800361/cl-is-not-recognized-as-an-internal-or-external-command
Also, if you didn’t do it yet, install ‘common tools for C++’ in visual studio.
maymuna
May 6, 2017hi, facing same error, i dont have any microsoft visual studio? how do i solve this error?
Paul van Gent
May 8, 2017Install visual studio + common tools for visual C++
Rafa
April 25, 2017Hello Paul I need your help,
I have followed the instructions given to work with dlib in python. I think everything work fine untill python setup.py install, obviously something is not working well.
I have python 3.6 with conda running in window 7 64 bit
you can see the result of my command prompt here
http://imgur.com/QJZyh0c
thanks in advance
Paul van Gent
April 25, 2017Hi Rafa. It seems your C++ compiler is not found. Did you install Visual Studio? If you did, go to control panel -> programs and features -> right click on the “microsoft visual studio” entry and select “change”. wait for it to initialise and check the “common tools for C++” under “visual C++”. You should be good to go then!
rafa
April 25, 2017Thanks for your quick response.
You are right, there was a problem with visual studio 2017 ( is buggy and won’t compile C++11 code, which is a requirement to use dlib) so I installed visual studio 2015 with c++ packs.
However now I have other problem:
http://imgur.com/5DkUTJE
Has you any idea how I can solved it?
Paul van Gent
April 25, 2017Hi Rafa. The “SET” commands only apply to a command prompt session, so each time you close it, it forgets them. Before compiling dlib you need to do the SET BOOST_ROOT and SET BOOST_LIBRARYDIR again.
julio
May 4, 2017hi i tried install dlib on windows 8 32 bits-cmake 3.8 win32-x86 – python2.7 – visualstudio 2012-dlib 19.4 but i have error like that:
visual studio are using is too old and doent support c++11 you need viaual studio 2015 or newer..
my question is …i just update 2015
Paul van Gent
May 4, 2017Hi Julio. That is correct, and why the VS2015 is indicated under “Getting started”. Happy to hear you found the issue :).
joy
May 2, 2017Hello Paul, I’m new to machine learning and I’m looking to execute this program you’ve written. However, I’m not clear how we’re reading the dataset. Like the previous post do we create two folders ‘source_emotion’ and ‘source_images’? If not then it would be great if you could explain how you’re doing this. Pardon me if it’s a silly question. Thank you.
Paul van Gent
May 4, 2017Hi Joy. If you’ve followed the previous tutorial you can use the same folder structure. The “source” folders are not necessary, rather the “dataset” folder with the emotion labels as subfolders. The code should pick up the same structure without problem. Please let me know if you run into problems.
Anil
May 3, 2017Hi Paul, how can i implement predict_proba() function above code to get that emotion label scores.
Paul van Gent
May 8, 2017Hi Anil,
Make sure than when you define the SVM classifier, that you set probability to True. In the tutorial it looks like:
“clf = SVC(kernel=’linear’, probability=True, tol=1e-3)”
Afterwards, “clf” becomes the classifier object. Call it with clf.predict_proba(n_samples, n_features).
Also see the docs at:
http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
Cheers,
Paul
Anil
May 27, 2017After my implementation, it procuded me following probabilities in array (‘prob score’, array([[ 0.12368627, 0.77254657, 0.01258662, 0.09118054]])). Here there are four probabilities but I have five emotions.I think it should included five probabilities.What am I missing, where am I wrong ?
Thx for reply.
Paul van Gent
May 27, 2017Hi Anil. I also mailed you, but for the benefit of others my message here as well:
If I had to make a guess, the training data does not contain any emotions from the missing fifth emotion. This way, the model will calibrate itself only for four emotions.
– Are all the emotion folders filled with images?
– Are the emotion labels in the list at the top of the code the exact same as the folder names? An empty list (and thus: no training data) will be returned if one of the folder names does not correspond with the emotion names.
Good luck!
Paul
Kowsi1997
February 26, 2018hi paul,
what is n_samples and n_features in the above code inorder to use predict_proba() function for the above code??
Paul van Gent
February 26, 2018Hi Kowsi. I refer to a 2d ndarray or list, with shape “all samples, all features”. So, if you have 50 pictures with 20 features on each picture, you feed predict_proba an array of shape(50,20)
pirouz
May 16, 2017HI
thanks for your remarkable job and sharing it with us
would you please explain more about the installing dlib ?? for example what should i put after cmake command ??
i tried with cmake (GI) to configure and generate into build folder but at the end (python setup.py install) i get error
cheers
pirouz
Paul van Gent
May 18, 2017Hi Pirouz. You type “cmake ..” when in the build folder, indicating you want to build the source from its parent directory.
What error are you getting? Any help I can offer is very much dependent on that.
Mani Kumar
May 26, 2017Hi Paul,
I have gone through all your tutorials regarding the emotion detection. All the code in
your tutorials work on my system. Thank you for the good tutorial.
I am experimenting with my own ideas and methods to extract the information
related to emotion. And I am beginner to machine learning so I am not sure which method
or which library is good for my ideas.
I have a question.
Why are you using sklearn’s svm and not dlib’s svm or opencv’s svm to train and predict?
Reason for the question.
To reduce the dependency on external libraries.
Thank you,
Mani.
Paul van Gent
May 26, 2017 Hi Mani. I use SKLearn because of their scientific integrity for inclusion of algorithms (see: http://scikit-learn.org/stable/faq.html). Additionally, it is very versatile and contains many more ML algorithms than just support vector machines. I like to work with a package that has all I need, rather than select from different packages.
This is personal taste, you can achieve similar goals with different packages.
If you like to exchange ideas on what algorithm to use for your purposes, send me a mail: info@paulvangent.com.
Mani Kumar
May 29, 2017Hi Paul,
Thank you for the reply.
I want to know if SKLearn is portable across platforms, for example Android.
And is the SKLearn API accessible from C++?
Regards,
Mani
Mjay
June 2, 2017Hi Paul,
I used your code and ran into this problem:
anglerelative = (math.atan((z-ymean)/(w-xmean))*180/math.pi) – anglenose
RuntimeWarning: divide by zero encountered in double_scalars
Where did I make a mistake?
Thanks for reply, tutorial is very good 🙂
Paul van Gent
June 12, 2017 It means that there is a division by zero (you can't divide by zero). However, numpy should step over it and return a NaN value. You can try catching the error and removing the data entry from the dataset if you so wish; one way to guard against it is sketched below. Good luck!
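A minimal sketch of such a guard (not the tutorial's original code; math.atan2 handles the x-difference being zero and also returns the correct quadrant):

import math

def relative_angle(x, y, xmean, ymean, anglenose):
    # Angle of landmark (x, y) relative to the point-cloud centre, corrected
    # for the nose angle. Plain atan((y - ymean) / (x - xmean)) divides by
    # zero when x == xmean; atan2 does not.
    return (math.atan2(y - ymean, x - xmean) * 180 / math.pi) - anglenose

print(relative_angle(0.0, 1.0, 0.0, 0.0, 0.0))  # 90.0, instead of an error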
-Paul
Juergen
June 11, 2017Hello Paul,
thank you for this tutorial, this is excellent work!
Just in case anyone was searching for the shape_predictor_68_face_landmarks like I did (probably I am blind), you can find it here:
http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
Paul van Gent
June 12, 2017It seems I didn’t mention this in the tutorial, thanks for noticing it. I’ll add it
-Paul
simux
June 20, 2017Hello Paul,
Thank you for your job, it is helpful.
I used your code to extract feature vectors, but I got values with a negative sign and in this format: -1.254179104477611872e+02
Is my work correct?
Thank you
Paul van Gent
June 20, 2017Hi Simux. That depends on what feature vectors you are extracting. If you’re following the code from the tutorial, from the top of my head they can be negative (going from -180 to 180 for the angles). However, if you’re extracting other features, you need to tell me a bit more about what exactly you’re doing.
-Paul
simux
June 21, 2017Hi Paul,
Yes, I am following your tutorial, so I used your method of computing the Euclidean distance. The output vector has a dimension of 268. However, in your tutorial you computed the distance between the center of gravity and each of the 68 points, so it should have a dimension of 136.
Why am I getting 268?
Thank you
Aniket More
July 5, 2017 Hi Paul, here too I am getting a mean accuracy of 35%. Maybe the issue is with the version of OpenCV.
Paul van Gent
July 5, 2017That is strange indeed. This tutorial doesn’t rely on opencv except for accessing your webcam and displaying a video feed. The problem must be in the data then. How many images are in your dataset?
Could you send me your code to info@paulvangent.com? I’ll run it on my set..
Aniket More
July 5, 2017 The dataset has 652 images. I am using the same code without any modification; still, I will mail it to you. Thank you.
KVRanathunga
July 12, 2017Sir,
I have got the following error when executing the final code you have posted:
“Warning (from warnings module):
File “C:\Python27\lib\site-packages\sklearn\utils\validation.py”, line 395
DeprecationWarning)
DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
Traceback (most recent call last):
File “C:\Users\Cowshalya\Desktop\New folder\final.py”, line 98, in
clf.fit(npar_train, training_labels)
File “C:\Python27\lib\site-packages\sklearn\svm\base.py”, line 151, in fit
X, y = check_X_y(X, y, dtype=np.float64, order=’C’, accept_sparse=’csr’)
File “C:\Python27\lib\site-packages\sklearn\utils\validation.py”, line 521, in check_X_y
ensure_min_features, warn_on_dtype, estimator)
File “C:\Python27\lib\site-packages\sklearn\utils\validation.py”, line 424, in check_array
context))
ValueError: Found array with 0 feature(s) (shape=(1, 0)) while a minimum of 1 is required.”
What should I do now?
Paul van Gent
July 14, 2017The error is trying to tell you that the arrays passed to check_X_y() are empty. Try debugging why this is the case. Are the files correctly read? Is the pixel data correctly stored in an array? Are the data and the label appended correctly to the X and y arrays?
wiem
July 30, 2017Hi Sir,
I have got an error as follows when I’m executing the final code you have given.
Enter 1 to train and 2 to predict
1
Making sets 0
training SVM linear 0
/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
DeprecationWarning)
Traceback (most recent call last):
File “faceDetectionDlib.py”, line 168, in
main()
File “faceDetectionDlib.py”, line 135, in main
clf.fit(npar_train, training_labels)
File “/usr/local/lib/python3.5/dist-packages/sklearn/svm/base.py”, line 151, in fit
X, y = check_X_y(X, y, dtype=np.float64, order=’C’, accept_sparse=’csr’)
File “/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py”, line 521, in check_X_y
ensure_min_features, warn_on_dtype, estimator)
File “/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py”, line 424, in check_array
context))
ValueError: Found array with 0 feature(s) (shape=(1, 0)) while a minimum of 1 is required.
________________________________________________________
Would you please help me and guide me on how I could solve this error and train this code using other datasets?
Paul van Gent
July 30, 2017As stated to your question on the other posts: the last bit of the error tells what’s going wrong: “ValueError: Found array with 0 feature(s) (shape=(1, 0)) while a minimum of 1 is required.”. Apparently you’re feeding an empty object to the classifier. Try debugging why no images are loaded (is path correct? can it find files? are permissions set ok?)
-Paul
wiem
August 12, 2017 Hi Paul! I'm using the CK+ dataset, however I am getting a mean accuracy of 43% and I have no idea why it is so low. Could you tell me where the issue is?
Thank you
Paul van Gent
August 13, 2017No I cannot tell at a distance. You can check a few things:
– Are you using the correct versions of all packages?
– Where are the mistakes made? Is there a pattern?
– Are all paths correct? Are all images in the correct folders?
– Some users have reported glob.glob to behave differently in python 3, make sure lists are sorted properly when creating the dataset
wiem
August 14, 2017 Hi Paul! Thanks a lot for your quick answer. I sent you the code I used, which gave me 43% accuracy. Will you please check it and tell me where I went wrong? My email is grinawiem@gmail.com.
Thanks
wiem
August 14, 2017 Hi Paul! I tried to train the code with only one emotion in one folder, so the accuracy became 50%.
I think the main problem is glob.glob:
————————————————————————————————-
files = glob.glob(“/home/wiem/Bureau/CK+/train/*/*” )
————————————————————————————————-
in the original code it is written: files = glob.glob("dataset/%s/*" %emotion)
however, when I use %emotion it gives me this error:
—————————————————————————————————–
Enter 1 to train and 2 to predict
1
Making sets 0
working on anger
Traceback (most recent call last):
File “test8.py”, line 128, in
main()
File “test8.py”, line 103, in main
training_data, training_labels, prediction_data, prediction_labels = make_sets()
File “test8.py”, line 64, in make_sets
training, prediction = get_files(“/home/wiem/Bureau/rafd/Evaluation/train/” )
File “test8.py”, line 22, in get_files
files = glob.glob(“/home/wiem/Bureau/CK+/train/*/*” %emotion)
TypeError: not all arguments converted during string formatting
—————————————————————————————————-
Would you please tell me what is wrong?
Paul van Gent
August 14, 2017Hi Wiem. With only one emotion SKLearn throws an error: you cannot fit a discriminatory model on one outcome! I expect something goes wrong there. As I mentioned at your other post: the likely issue is glob.glob sorting behaviour.
Seems you’re running into some basic Python problems such as string formatting. You really need to have a grasp of these things and other python concepts before trying more complex exercises such as this tutorial.
I recommend you follow some basic tutorials. A good one is https://learnpythonthehardway.org
Good luck!
wiem
August 17, 2017Hi Sir,
Thank you a lot for your help! I figured out what was wrong. It’s just like you said the path of the data set was wrong. And, now the training works very well. However, would you please explain to me how I could Test and evaluate the training model and get its accuracy ?
Thanks
Neeraj Panchbhai
August 17, 2017 I'm not able to install dlib successfully, please help.
Paul van Gent
August 18, 2017Never, ever ask for help without
– a detailed description
– what goes wrong
– what you have tried
wiem
August 21, 2017 Hi Paul! I wonder about the accuracy given after the 10 runs in:
____________________________________________________________________________________________________
print("Mean value lin svm: %s" %np.mean(accur_lin)) # Get mean accuracy of the 10 runs
____________________________________________________________________________________________________
Is that the training accuracy, or the prediction accuracy of the created model on the test set? This code gives me an accuracy of 0.96 with the MUG dataset, so will you explain this to me?
Thanks
Paul van Gent
August 21, 2017 Hi Wiem. It's the test accuracy: the accuracy on the training dataset is not particularly useful, since this doesn't tell anything about model performance, only model retention. The different accuracy with the MUG dataset is most likely because the data is structured differently, and as I mentioned before, there is likely some issue on your system with glob.glob sorting the returned images in the CK+ set.
However, neither accuracies tell you much about how the model will perform: always collect some data that is similar to that your system of application will use when functioning.
As mentioned before, I truly recommend you dig into Python (and machine learning theory) a bit more before attempting these kinds of projects. This will help you find faults easier.
wiem
August 21, 2017 Thank you very much Paul for your explanation and your advice, I appreciate your help. I will follow your suggestions carefully.
Thanks
Sergio
August 25, 2017 Thanks for the guide! Finally I could install dlib 19.4 for Python on Windows without errors.
Thari
August 26, 2017 Everything worked fine until "python setup.py install". I tried this many times but the install process did not get further than "face_recognition.cpp". I waited more than 2 hours.
My system is windows 10, visual studio 2017, python 2.7 (64-bit), RAM- 8GB
cmake version 3.9.1 Dlib 19.4
Here I attached the command window in a text file for your reference.
http://textuploader.com/d64sp
nadhir
September 20, 2017Hi Paul, I’ve used your code and I’ve obtained some good results, your work is fantastic. Thank you for sharing it.
I want to ask if you used the original image size of the CK+ dataset, or whether there is an optimal size you use to get better results. Also, I want to cite you in a scientific paper as a reference; do you have any scientific paper you would like to be cited? Finally, could you give me more information about the works used to create this work? I look forward to reading from you soon, and thanks.
Cordially
Paul van Gent
September 22, 2017Hi Nadhir. Great! The citation can be to the article on the website, its format is at the top of the post, it is:
“van Gent, P. (2016). Emotion Recognition Using Facial Landmarks, Python, DLib and OpenCV. A tech blog about fun things with Python and embedded electronics. Retrieved from: http://www.paulvangent.com/2016/08/05/emotion-recognition-using-facial-landmarks/”
As far as the size of the images goes, it was 350×350 pixels as stated in the other tutorial where the pre-processing was done. I’m not sure about the absolute optimal size for this, but I’m sure good performance can also be had with smaller images. Of course the larger the image, the smaller the facial movement you can quantify, but for the purposes of these tutorials (archetypical emotional expressions) the size was more than enough.
Good luck. If you want you can share the paper after publication and we can put it up on the article as well.
Phillemon
September 24, 2017 This is great, however I am finding it difficult to obtain the CK+ dataset. Can you please send it to my email? phillemonrasekgwalo@gmail.com Thank you.
Paul van Gent
September 26, 2017Hi Phillemon. I’m sorry, the terms of the dataset prohibit me sharing it. You need to obtain it from the original source, or find another one I’m afraid.
Randall Theuns
October 13, 2017Hey Paul,
Great work on this. I’m currently busy implementing this for a minor i’m doing. I’ve prepared the dataset, trained the model, it seems to give good accuracy (~85% with CK+). Right now I’m to add webcam support to allow for semi-real-time emotion recognition, but whenever I use SVC.predict on a vectorized facial detection, I only get either 5 or 7 as predictions. If I use predict_proba instead, I get an array with only 7 probabilities.
Do you have any clue why this happens?
The code is available on github: https://github.com/drtheuns/minor_riet
In particular, src/webcam.py and src/frames.py matter.
Paul van Gent
October 14, 2017Hi Randall. Several things might cause the prediction to revert to only two classes:
– Make sure you keep everything standardised. Is your face on the webcam image much larger or much smaller in terms of pixel size than the training data? Resize it before working with landmark coordinates
– What’s happening in the 15%? Are there one or two categories that host most mistakes?
– Are you expressing extreme emotional expressions? The CK+ dataset has extreme expressions. an ML model classifies correctly only the type of data you train it on.
SVC.predict_proba() works just like that: it outputs a list of decision probabilities based on which the classification is made. If you feed SVC.predict_proba() an array of image data, it gives a matrix of predictions back.
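For instance, a minimal sketch of that standardise-then-predict idea (hypothetical names: "extract_features" stands for the landmark feature code from the tutorial, "clf" for the trained classifier):

import cv2

def classify_face(face_crop, clf, extract_features, size=(350, 350)):
    # Standardise the webcam face crop to the training image size before
    # extracting landmarks, then return the class probabilities.
    face_crop = cv2.resize(face_crop, size)
    gray = cv2.cvtColor(face_crop, cv2.COLOR_BGR2GRAY)
    features = extract_features(gray)
    return clf.predict_proba([features])  # shape: (1, n_classes)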
You could also try making your own dataset to append to the CK+ one. Maybe you can get 10 of your friends to each have a few photos made for each emotional expression. This might help predictions improve as well, since it trains the model explicitly on data gathered from the same sensor as from which predictions are made (your webcam).
Lastly, please do me a favour and cite my blog in the readme.md. This helps me get more traffic and in turn helps me produce more content.
-Paul
Randall Theuns
October 14, 2017Hi Paul,
Thanks for the swift and detailed reply.
There's a good chance that the size of the face in the webcam is the problem. I'll have to look into that. Due to some time constraints and deadlines, I don't have too much time to troubleshoot the 15% (I only have a total of ~8 weeks, of which 4 remain to create this prototype, and I still have to visualise it).
The predict_proba ‘issue’ I was talking about was more about the number of probabilities it returns (7, even though it was trained with 8 emotions), but this might have to do with too low probability, or just the same issue as above.
I’ll see if I can increase the dataset a bit.
You were already cited in the readme! https://github.com/drtheuns/minor_riet#citations
Thanks!
– Randall
Paul van Gent
October 15, 2017Hi Randall. Are you sure that the model wasn’t trained on 7 emotions? It should return probabilities for the same number of classes as on which it has been trained, no matter the low probabilities..
Don’t put too much thought into the remaining 15%, you will never reach 100% under realistic conditions (an aggregate of 60-80% would already be very good performance in real-world settings).
Thanks for the citation, I must have missed that
-Paul
Randall Theuns
October 15, 2017Hey Paul,
Just as a quick reply and perhaps a hint for other people. The code used to sort and prepare the dataset assumed 8 emotions, including happy. The code to predict emotions, however, assumes the same 8 emotions, including *happyiness*. This means that, to train the happy emotion on the model, it was looking for a folder called happiness, rather than happy.
Fixing this simple issue seemed to have fixed the 7 or 8 prediction_proba issue. Another quick not about the above code:
In the landmark pieces of code, range(1, 68) is used, therefore only grabbing 67 of the 68 landmarks.
Thank you for the article and quick replies.
– Randall
Johnny
October 16, 2017Hi Paul. Very nice job.
But I'm struggling with the following:
I want to classify one sample from my webcam, but I do not know what function to use and what parameter to give it.
I mean, after clf.fit (training) I want to predict a frame from the webcam. I used clf.predict_proba, but the expected parameter must be equal in size to the training data (this is the error received).
Do you know how to proceed to classify one frame from the webcam?
Br
Johnny
October 16, 2017Solved with predict_proba()
Víctor
October 19, 2017Hello, I have two questions from a line of the code.
1) I do not understand why, to calculate the angle of each landmark point, you do:
landmarks_vectorised.append(int(math.atan((y - ymean) / (x - xmean)) * 360 / math.pi))
This would make sense to me if x and y were the coordinates of each landmark point. Nevertheless, x is xcentral and y is ycentral, and xcentral is x - xmean. In the line I mentioned before, I understand that you are subtracting the mean twice.
2) In the same line of code as before:
landmarks_vectorised.append(int(math.atan((y - ymean) / (x - xmean)) * 360 / math.pi))
I do not understand why, to convert the angle to degrees, you multiply by 360/pi and not by 360/(2*pi), which is what I was expecting.
Paul van Gent
October 23, 2017Hi Victor. Thanks for catching that, and you’re right of course. I must’ve been asleep when writing that line I guess! I’ve updated it:
landmarks_vectorised.append((math.atan2(y, x)*360)/(2*math.pi))
-Paul
Mun
October 23, 2017Hi Paul
I followed all steps from ‘http://www.paulvangent.com/2016/08/05/emotion-recognition-using-facial-landmarks/’ and also ‘http://www.paulvangent.com/2016/04/01/emotion-recognition-with-python-opencv-and-a-face-dataset/’.
Anyway, I wondered how to show the results such as "anger: 0.03239878
contempt: 0.13635423 disgust: 0.0117559 fear: 0.00202098 neutral: 0.7560004
happy: 0.00382895 sadness: 0.04207027 surprise: 0.0155695”
You mentioned using the 'predict_proba()' function to show the emotion results. Do I need to make a new script apart from the main code, or just add something like
print("Emotion: \n{}".format(np.argmax(gbrt.predict_proba(testvalue), axis))), like this?
Paul van Gent
October 25, 2017Hi Mun. Yes, predict_proba() is a function of the SKLearn classifier:
clf = SVC(kernel='linear', probability=True, tol=1e-3) # Here make sure you set "probability" to True, otherwise the model cannot return the decision weights later on.
#train model here
clf.predict_proba(testvalue)
– Paul
Mun
November 1, 2017Thank you for you kindness! I will try it 🙂
rob
November 2, 2017 Good day Sir, I am facing some issues installing Boost-Python. I downloaded the latest version and ran bootstrap.bat in my cmd, but I am facing this error:
cl not recognized as an internal or external command
failed to build boost.build engine
Rob
November 3, 2017 Hello sir, I downloaded the latest version of Boost-Python and when trying to build it in the command prompt I got this error:
c:\boost>build.bat
‘build.bat’ is not recognized as an internal or external command,
operable program or batch file.
c:\boost>bootstrap.bat
Building Boost.Build engine
Failed to build Boost.Build engine.
Please consult bootstrap.log for further diagnostics.
You can try to obtain a prebuilt binary from
http://sf.net/project/showfiles.php?group_id=7586&package_id=72941
Also, you can file an issue at http://svn.boost.org
Please attach bootstrap.log in that case.
Please, I need your help on this Sir. What do you think I am doing wrong? I am using the Visual Studio 2017 developer command prompt.
Paul van Gent
November 8, 2017 Did you install CMake and add it to the PATH variable?
rob
November 8, 2017 I already did that but I'm still having the same error.
Oran
November 8, 2017hi,
and thanks for your work and for publishing it.
I just wanted to mention that I was able to get the CK+ and CK databases here:
http://www.consortium.ri.cmu.edu/ckagree/
Kooper
November 25, 2017Hi.
I'm trying to follow this tutorial but I'm stuck at the beginning.
I don't know why I can't download the CK+ dataset; I end up getting 403 Forbidden as a response.
Is there anyone who can help me?
Paul van Gent
December 4, 2017Hi Kooper. The availability of the dataset is intermittent. Unfortunately there is not much we can do about this. You could look at alternatives such as the Yale Face set.
Nico
January 6, 2018 I have a few questions.
1. Why are the distances not scaled? For example, if you have a face near the camera you will get big distances from the center of gravity; if the face is far away, the distances are smaller.
2. landmarks_vectorised.append(w): what if the face is tilted? The difference x - xcentral will change. Why is no correction applied? And why is it not scaled (question 1) when the face is near the camera or far away?
3. landmarks_vectorised.append((math.atan2(y, x)*360)/(2*math.pi)): why is the correction angle for a tilted face not applied?
4. landmarks_vectorised.append((math.atan2(y, x)*360)/(2*math.pi)): what should we do around 180 degrees? If we have two photos of the same emotion and take the same point in both images, in one image the calculated angle could be -178 degrees and in the other +178. That is a big difference in value, but the points are very close to each other.
(E.g. the x difference for the first image is 0.002 and the x difference for the second is -0.0002; let's suppose the y difference is positive and the same for both pictures.)
Thanks!
I appreciate your work !
Paul van Gent
January 25, 2018Hi Nico. Apologies for the late reply, I’ve been at an international conference followed by a holiday. To answer your questions:
1. In the two tutorials here I use standardised images. I agree that this is not really that clear from the text. I definitely recommend that when using non-standardised sources such as a webcam that you detect the face first, then crop the image to a standardised size, before attempting landmark points detection.
2. & 3. It seems the tilt correction has dropped from the code. Some time ago I migrated to another plugin for all the code, since my old one became unstable with newer wordpress versions.. Thanks for noticing this. I’ll upload the full code tomorrow evening when I’m back home.
4. That’s a valid point. Some of these issues are prevented by rotating the point matrix, but some will remain. I’m unsure of a direct fix here, apart from having a larger dataset where both instances occur with reasonable frequency… If you have any ideas let me know!
Cheers,
Paul
Nico
January 29, 2018 Ok, thanks for the answers. You can also change the regularization parameter C in the case of a linear kernel for a better score, and gamma for poly or rbf kernels. I have obtained a 2% improvement by tuning these (I used Optunity).
I have also observed that the emotion sadness is not recognized very well (68%; this emotion brings the rate down for me). Now I am thinking about how to handle this for a better score on this emotion. For 6 emotions I got 87% for poly and 86% for linear. I tried to make one-vs-all models but I cannot improve this rate (and the recognition time increases). Maybe I will try to train a sadness-vs-all model with LBPH or eigenvectors. I also want to try the Gabor wavelet method, but I must read more about it.
Paul van Gent
February 21, 2018Yes hyperparameter tuning was not applied, so there is definitely a gain to get there! Keep me updated with your progress.
– Paul
Leonardo
January 25, 2018Hi!! this is awesome! congrats and thank you!!
I would like to suggest you something:
You could make a bottleneck folder where you put all your landmark detection files before you start.
Next you run getFiles from these landmark files.
In the end, you only detect all the landmarks once, and avoid a lot of computing!
Thanks!!
Paul van Gent
January 25, 2018You’re absolutely right, this would be a great approach when fine-tuning the algorithm :).
– Paul
ali
January 26, 2018 I have a problem with dlib.shape_predictor("shape_predictor_68_face_landmarks.dat"). I saved my test script in the same folder as dlib, but I don't know where the problem is. The error is: Traceback (most recent call last):
File “C:\Python27\alimcheik\TestinLandmarkDetecting.py”, line 7, in
predictor = dlib.shape_predictor(“shape_predictor_68_face_landmarks.dat”)
RuntimeError: Unable to open shape_predictor_68_face_landmarks.dat
Paul van Gent
January 26, 2018Hi Ali. Strange! I would say make sure you have the filename 100% right, and that you have permission to read the file (although I can not really imagine why this would be a problem..).
You can also try calling dlib.shape_predictor() with the full path: dlib.shape_predictor("C:\\Python27\\alimcheik\\shape_predictor_68_face_landmarks.dat"). Be sure to use double backslashes because a single one is interpreted as a character escape.
– Paul
Noemi
February 5, 2018Hi, I’m learning python and when I run your first code I get an error. How can I fix it?
File “C:\Users\noemi\Documents\Python\LandMarks.py”, line 22
for i in range(1,68): #There are 68 landmark points on each face
^
IndentationError: unexpected indent
[Finished in 0.3s]
Paul van Gent
February 14, 2018Hi Noemi. These kinds of errors indicate wrong indentation. See this link for more info.
If the indents look good visually and you still get the error, you’re likely mixing spaces and tabs.
– Paul
Kowsalya
February 21, 2018Hi Paul,
I'm getting the following error on running the last code. Also, I have given the path like this (I'm using the CK dataset):
files = glob.glob("\\F:\\proj\\emotion\\cohn-kanade\\S010\\%s\\*" %emotion)
Or do I need to split the dataset into different emotion folders (happy, sad, neutral, ...)?
Making sets 0
working on anger
working on contempt
working on disgust
working on fear
working on happiness
working on neutral
working on sadness
working on surprise
training SVM linear 0
Traceback (most recent call last):
File “emo.py”, line 96, in
clf.fit(npar_train, training_labels)
File “G:\ana\envs\keras_env\lib\site-packages\sklearn\svm\base.py”, line 149, in fit
X, y = check_X_y(X, y, dtype=np.float64, order=’C’, accept_sparse=’csr’)
File “G:\ana\envs\keras_env\lib\site-packages\sklearn\utils\validation.py”, line 573, in check_X_y
ensure_min_features, warn_on_dtype, estimator)
File “G:\ana\envs\keras_env\lib\site-packages\sklearn\utils\validation.py”, line 441, in check_array
“if it contains a single sample.”.format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Paul van Gent
February 21, 2018Hi Kowsalya. From what it seems, you need to pre-process the data. Be sure you follow the data preprocessing steps in the previous post before delving in here.
– Paul
Kowsi1997
February 26, 2018thank you so much paul !!!
Rachnaa R
February 23, 2018 "b2 install" is not being executed by CMD; it says "b2 is not recognized as an internal or external command".
If anyone knows how to fix this, please help.
Nandy
March 1, 2018 What is meant by features? What are the features extracted in the above code?
Paul van Gent
March 1, 2018Hi Nandy. A ‘feature’ is a machine learning term for a particular property in the source data that is expressed numerically. For example if you want to recognise a cat vs a human on a photo, you could have a variable that states whether the thing on the photo has a tail or not. This variable ‘tail’ (yes/no) could be considered a feature.
See also this link for more info.
– Paul
Nandy
March 2, 2018 Thank you Paul! What are the features in the above code?
Paul van Gent
March 4, 2018The landmarks detected on the face.
Adarsh
March 6, 2018Hey Paul,
how can we adapt this code to input just a single image to test and return the appropriate emotion label as the output?
Paul van Gent
March 6, 2018 After you've trained the model (you can save it and re-load it later if necessary), you can just use its 'predict()' function and pass it an array with the image's features, as in the sketch below.
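A minimal sketch (hypothetical names: "clf" is the fitted classifier from the tutorial, "feature_vector" the landmark features of one image, "emotions" the label list used during training):

def predict_emotion(clf, feature_vector, emotions):
    # predict() expects a 2D array, hence the brackets around the single sample
    label = clf.predict([feature_vector])[0]
    return emotions[label]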
Shobi
March 6, 2018Hello Paul,
I'm using the CK+ dataset; what might n_samples and n_features be in order to use predict()? Please help me out.
Paul van Gent
March 7, 2018You feed it the features you generate from the data. In the case of this tutorial those are the coordinates of the detected landmarks.
More info on predict() can be found here:
http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC.predict
Nitin Agarwal
March 7, 2018Hi Paul,
Without cleaning the neutral folder I got an accuracy of around 85%, but after cleaning it the accuracy drops to 75%. Can you explain why?
Paul van Gent
March 21, 2018In the Neutral folder, before cleaning, there’s a lot of repeating images from the same person. This biases the classifier upwards because the total variance in the set is less (images with the same person with the same expression differ very little from each other).
– Paul
Liam Ure
April 1, 2018Hi,
I am currently attempting to do this for a university project. Unfortunately I missed most of the semester due to illness, so I am on my own now! I am currently pulling my hair out trying to figure out where to go from here. I have everything above working, but what I want to do is use the webcam to detect the user's emotion in real(ish) time. I have looked at the logs and some of your comments above but I am at a complete loss of what to do. Can you advise? If you help me out I'll certainly buy you a beer!
Feel free to contact me via email if it is easier for you, liamure@yahoo.co.uk
Paul van Gent
April 1, 2018Hi Liam. Almost all the things you need are in the tutorial then. Take a look here, here and here. These are some examples of getting frames from a webcam. Put them in some sort of loop to do the real-time-ish detection.
I would really recommend caching at least 5-10 predictions before having your algorithm make a decision; see the sketch below. If you can, an extra dataset cannot hurt either.
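A minimal sketch of such a loop (hypothetical names: "extract_features" stands for the landmark feature code from the tutorial, "clf" for the trained classifier):

import cv2

video = cv2.VideoCapture(0)          # open the default webcam
recent = []                          # cache of the last few predictions
while True:
    ret, frame = video.read()
    if not ret:
        break
    features = extract_features(frame)
    if features is not None:
        recent = (recent + [clf.predict([features])[0]])[-10:]
        if len(recent) == 10:
            print(max(set(recent), key=recent.count))  # most common recent prediction
    cv2.imshow("webcam", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
video.release()
cv2.destroyAllWindows()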
– Paul
Liam
April 2, 2018 That's a great help. The part I am having trouble with is actually running the prediction against the trained model; here is some pseudocode to explain what I am attempting to do:
1. Train model as shown above.
2. Open webcam take a frame
3. Run that frame against the trained model that was done in step 1
4. Print out the emotion in console along with percentage
The part I am having trouble with is part 3; I just don't know how to take the frame, run it through the model and then print out the result.
I have taken a look at the answers above but I just cannot make sense of the documentation at all. Any help would be appreciated
Liam Ure
April 2, 2018 I'll clarify that I now know how to capture the frame; I do not know how to run the captured frame against the model to provide a prediction.
hassan
April 9, 2018Please help me.
Why is just one face detected in live video streaming? Although there are many faces in the live video, dlib detects only one face at a time. How can I get rid of that issue? I am following your code exactly. Please help me.
Paul van Gent
April 11, 2018Hi Hassan. In my example I detect single faces, since the training and testing data consist of this. If you want to detect more faces in a single frame, you need to detect and crop all faces individually first. You can then pass the crops to the classifier one at a time to classify each face independently. Note that execution time will increase when there are more faces in a frame.
You will also need to handle what to do with the extra face detections.
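A minimal sketch of that approach (hypothetical names again: "extract_features" is the tutorial's feature code, "clf" the trained classifier):

import cv2
import dlib

detector = dlib.get_frontal_face_detector()

def classify_all_faces(frame, clf, extract_features):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    predictions = []
    for rect in detector(gray, 1):                      # one rectangle per detected face
        crop = gray[rect.top():rect.bottom(), rect.left():rect.right()]
        crop = cv2.resize(crop, (350, 350))             # match the training image size
        features = extract_features(crop)
        if features is not None:
            predictions.append(clf.predict([features])[0])
    return predictions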
– Paul
iqra
April 19, 2018Hi Paul
I get this error when training on my data using the code given above:
Making sets 0
working on contempt
Traceback (most recent call last):
File “C:\coding\traininClassificationSet.py”, line 92, in
training_data, training_labels, prediction_data, prediction_labels = make_sets()
File “C:\coding\traininClassificationSet.py”, line 67, in make_sets
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) #convert to grayscale
cv2.error: C:\projects\opencv-python\opencv\modules\imgproc\src\color.cpp:11111: error: (-215) scn == 3 || scn == 4 in function cv::cvtColor
Paul van Gent
April 23, 2018Hi Iqra. You get this error most often if no image data is loaded. Make sure the paths point to the correct folders, and don’t use backslashes if you’re on linux.
If you see no problems there, verify that image data exists in the data object.
– Paul
john
July 22, 2019 Just change the code a little bit to use the extension *.jpg, that's it.
Matias Aguirre
May 4, 2018Hello Paul, first of all thank you for the publication.
I am trying to translate the code to C# with EmguCV, using OpenCV's own Facemark and SVM libraries.
I have a question about a part of the code:
xcentral is an array of each element of xlist with xmean subtracted, so why on lines 45, 46 and 47 is an array created from the values of xlist and ylist with ymean and xmean subtracted again?
Could that subtraction be replaced by the values of xcentral and ycentral?
Maybe it’s an obvious question but I’m too new to Python.
Paul van Gent
May 10, 2018Hi Matias,
It has to do with adding as much information for the classifier as possible:
In line 36-39 I standardise by subtracting the pointcloud’s mean from each coordinate, so that the resulting array describes the facial landmarks relative to their center, rather than relative to the position in the image of the face. These are added to the feature list in line 43,44.
Additionally, in lines 45-46 another feature set is generated by describing the vector norm relative to the center (the resulting xcentral, ycentral arrays are very different from what results from np.linalg.norm(coornp - meannp)).
Finally line 49 describes the vector angles relative to the image plane. I will add code to rotate the face tonight or tomorrow, so that it will instead describe the vector angles relative to the facial orientation.
So in essence the feature set describes:
– the landmark coordinates relative to point cloud center
– the vector length between each coordinate and the point cloud center
– the vector angle relative to the image plane (as said, I'll upload the rotation bit so that it becomes relative to the facial orientation).
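A minimal sketch of those three feature groups (assuming xlist and ylist hold the 68 landmark coordinates of one face; no tilt correction shown):

import math
import numpy as np

def vectorise_landmarks(xlist, ylist):
    xmean, ymean = np.mean(xlist), np.mean(ylist)
    features = []
    for x, y in zip(xlist, ylist):
        xc, yc = x - xmean, y - ymean                     # coordinates relative to centre
        features.extend([xc, yc])
        features.append(np.linalg.norm([xc, yc]))          # vector length to centre
        features.append((math.atan2(yc, xc) * 360) / (2 * math.pi))  # angle vs image plane
    return features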
Hope this clarifies it for you. Otherwise let me know if I can help
Cheers,
Paul
Mutayyba Waheed
May 27, 2018 Hi Sir!
I need help from you; can you please guide me on how I can solve this error?
Traceback (most recent call last):
File “E:\opencv_data\Facial_landmark_detection_test-master\Facial_landmark_detection\test\E.py”, line 8, in
from sklearn.svm import SVC
File “C:\Python27\lib\site-packages\sklearn\__init__.py”, line 134, in
from .base import clone
File “C:\Python27\lib\site-packages\sklearn\base.py”, line 13, in
from .utils.fixes import signature
File “C:\Python27\lib\site-packages\sklearn\utils\__init__.py”, line 11, in
from .validation import (as_float_array,
File “C:\Python27\lib\site-packages\sklearn\utils\validation.py”, line 18, in
from ..utils.fixes import signature
File “C:\Python27\lib\site-packages\sklearn\utils\fixes.py”, line 144, in
from scipy.sparse.linalg import lsqr as sparse_lsqr # noqa
File “C:\Python27\lib\site-packages\scipy\sparse\linalg\__init__.py”, line 114, in
from .isolve import *
File “C:\Python27\lib\site-packages\scipy\sparse\linalg\isolve\__init__.py”, line 6, in
from .iterative import *
File “C:\Python27\lib\site-packages\scipy\sparse\linalg\isolve\iterative.py”, line 7, in
from . import _iterative
ImportError: DLL load failed: %1 is not a valid Win32 application.
Paul van Gent
May 30, 2018From what I can gather there’s several options that might work. See them here:
https://stackoverflow.com/questions/19019720/importerror-dll-load-failed-1-is-not-a-valid-win32-application-but-the-dlls
Cheers
– Paul
Amira
June 17, 2018 Can you please help me to save and load the trained model?
Paul van Gent
June 20, 2018 Hi Amira. You can use pickle to dump a Python object to disk; it also works with classifier models. A minimal sketch is below.
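For example (assuming "clf" is the fitted classifier from the tutorial; the filename is arbitrary):

import pickle

# Save the trained classifier to disk
with open("emotion_svm.pkl", "wb") as f:
    pickle.dump(clf, f)

# Later: load it back and use it for predictions again
with open("emotion_svm.pkl", "rb") as f:
    clf = pickle.load(f)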
– Paul
Amira
June 18, 2018 When I run the project I keep getting this annoying error, please help:
Traceback (most recent call last):
File “C:/Users/HOME/PycharmProjects/Emojis/main.py”, line 58, in
prediction = model.predict(faces)
File “C:\Users\HOME\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\svm\base.py”, line 548, in predict
y = super(BaseSVC, self).predict(X)
File “C:\Users\HOME\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\svm\base.py”, line 308, in predict
X = self._validate_for_predict(X)
File “C:\Users\HOME\AppData\Local\Programs\Python\Python35\lib\site-packages\sklearn\svm\base.py”, line 459, in _validate_for_predict
(n_features, self.shape_fit_[1]))
ValueError: X.shape[1] = 272 should be equal to 268, the number of features at training time
Paul van Gent
June 20, 2018You’re using input images of different sizes. You need to resize them to the same sizes as the training data before classifying.
Anh
June 23, 2018Hello Paul,
I'm currently stuck on separating the images into the correct folders. I grabbed the CK+ dataset from the main source and put it in the correct folders as you mentioned above.
I tried your code in both a Windows and a Linux environment and both showed the same error (I changed \\ into / for Linux). Both returned a "list index out of range" error, either when taking the [-1] index of sourcefile_emotion or the [0] index of sourcefile_neutral.
I tried to print out all the variables and everything worked correctly up to the line where you glob the source images.
Any help from you would be great.
Thank you in advance
Paul van Gent
June 25, 2018 Hi Anh. Both of the index errors you encountered occur because the generated list is empty. When using glob(), ensure the correct path is passed to the function.
Also, if you're on anything but Windows, wrap the call in sorted() to ensure the list is returned sorted; Windows does so automatically.
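For example (the path here is just a placeholder):

import glob

# sort explicitly so the file order is the same on every operating system
files = sorted(glob.glob("dataset/happy/*"))

– Paul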
Ali
August 17, 2018 Hi Paul. I downloaded the dataset and put it beside the script. When I run the code I get this error:
Making sets 0
working on anger
working on contempt
working on disgust
working on fear
working on happiness
working on neutral
working on sadness
working on surprise
training SVM linear 0
Traceback (most recent call last):
File “SVM1.py”, line 90, in
clf.fit(npar_train, training_labels)
File “C:\Users\Ali\Anaconda3\envs\opencv-env\lib\site-packages\sklearn\svm\base.py”, line 149, in fit
X, y = check_X_y(X, y, dtype=np.float64, order=’C’, accept_sparse=’csr’)
File “C:\Users\Ali\Anaconda3\envs\opencv-env\lib\site-packages\sklearn\utils\validation.py”, line 573, in check_X_y
ensure_min_features, warn_on_dtype, estimator)
File “C:\Users\Ali\Anaconda3\envs\opencv-env\lib\site-packages\sklearn\utils\validation.py”, line 441, in check_array
“if it contains a single sample.”.format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
palkab
August 18, 2018Hi Ali. I think the paths are not correct or the images are not loaded properly (do you have one folder with a single example?).
The error means that the input shapes are not correct. Usually they need to have shape (n, dimensions_x, dimensions_y, n_channels), so check your paths and what you’re feeding the network.
– Paul
Thân Trọng Quý
September 17, 2018 Big thanks for providing this wonderful piece.
I have a question. When I installed dlib, I got this error: "can't open file 'setup.py': [Errno 2] No such file or directory".
I had navigated to my dlib folder.
Can you please help me with this? Thank you very much.
palkab
September 18, 2018How did you install dlib?
wiem
September 17, 2018 Hi Paul, thank you very much for your helpful tutorial. I am now trying to do emotion recognition and face authentication at the same time, so that the system returns the given image with its emotion and the name of the person. I wonder if I may use the same model trained for emotion recognition using landmarks and also train it to do face recognition using landmarks?
Any help from you would be great.
Thank you in advance
palkab
September 18, 2018Hi Wiem. Face recognition based on landmarks likely will not be accurate. If you’re in a hurry you can use the facerecogniser classes from opencv.
If you want to do it yourself, generally a deep net is used but there are different approaches. Do you have more details on the situation you want to apply it in?
wiem
September 18, 2018 Thank you very much, Paul, for your reply. In this tutorial you used facial landmarks to do emotion recognition, and I am trying to do emotion recognition and face identification at the same time. So I am wondering whether it is feasible to train the system with two classifications (classification of emotion and classification of identity) using those landmarks, so that in the prediction the system returns both the emotion and the identity of the detected face.
Any help from you would be great.
Thank you in advance
palkab
September 28, 2018Hi Wiem. Sorry for the later reply.
You could do that, but the question is if that is the best way to go. Two separate classifiers will probably be more accurate. You could use the face recognition classifier that comes with OpenCV. There’s also various ways of accomplishing this efficiently with convnets. Contact me if you want more info, might be quicker: info@paulvangent.com
Cheers
Daniel
October 22, 2018Very nice examples Paul, I am form Brazil and studying Electrical Engineer. I was happy for find your blog and see you jobs. Continue with this and motivate people. Thanks a lot!
Rohan Sathasivam
November 1, 2018Amazing work Paul. Just for one of my research projects on emotional recognition, I used your method to experiment and I got the same results as you got. I also experimented by increasing the number of pictures (4 times of every emotion except ‘neutral’), and this produced around 95% accuracy.
In another experiment using CNN, I connected the landmark dots using a white line, extracted this line and placed it over a black background image to reduce the noise. This method yields me 97% accuracy (Note: I increased the number of images by 4 folds )
palkab
November 1, 2018Hi Rohan, Nice work! You might also try training your own landmark detector, strip the output layer afterwards and append a perceptron to it and train it on the emotion picture dataset. Don’t forget to freeze the weights of the actual CNN layers in the network!
I’m writing a post on building your own landmark detector now. Give it a few days. Afterwards if there’s time (the main limiting factor I’m afraid…) I might do a post on what I mention above.
Cheers,
Paul
Rohan
November 3, 2018Thank you very much for your precious time and the notion. It would be great to be able to customize the landmarks.
Thankfully,
Rohan
Bilal Shoaib
November 3, 2018 I have a problem with the installation of the dlib library. Kindly help me.
palkab
November 4, 2018What is the issue?
Maz
November 7, 2018Hi Paul,
I was looking to take a one-vs-all approach for classification, which I believe the SVM does behind the scenes in this implementation. Do you know how I could get the probability of an image being each of the expressions, e.g. happy: 73%, sad: 2%, angry: 62%, etc.?
I want to be able to compare the probability of it being each type of emotion, to understand the similarities between classes
Maz
November 7, 2018Sorry, I realise you’ve already answered my previous question, I just somehow missed that part of your post. Great job
palkab
November 8, 2018No worries, happy coding!
Jay Trivedi
December 31, 2018Hi Sir ,
I was looking for face recognition code and came across this; it's very interesting, as face recognition is my study project. Will you help me, sir? I'm facing some errors while executing the code you have given.
palkab
January 1, 2019So what’s the error?
Kaushalya Sandamali
January 25, 2019Hi sir,
This tutorial is very interesting and very helpful for my FYP. It runs perfectly on the CK+ dataset.
When I try this code on testing data (which is not in the CK+ dataset), it hardly detects an emotion; however, when I use images from the training dataset it detects emotions accurately. What could be the possible reasons for this?
Thanks in advance.
palkab
February 6, 2019Hi Kaushalya,
Likely the test set is too different from the training set. I recommend enhancing the training set with a few hundred labeled images representative of the test set, so that the model can learn to differentiate better.
– Paul
Chithira Raj
January 30, 2019 Sir, are there any transactions or journal papers regarding this technique? If possible, kindly share.
Thank you
palkab
February 6, 2019Yes there are but I don’t have them on this computer. If you take a look on google scholar you’ll find plenty.
– Paul
Hoai Duc
April 24, 2019 Sir, I have this problem, how can I solve it: 'ascii' codec can't decode byte 0x81 in position 356: ordinal not in range(128)
Anna Tabalan
June 10, 2019Hello sir! I was directed to this tutorial from your past ones that used Fisherface. Is it possible to get the probabilities for each emotion using that algorithm instead of SVM?
Qandeel
June 19, 2019 Sir, this tutorial is very interesting and very helpful for my FYP, but I have a problem. Please let me know how to take the frame, run it through the model and then print out the result.
EZZA
July 28, 2019 I am facing this error, can anyone help me?
model = dlib.shape_predictor(“shape_predictor_68_face_landmarks.dat”)
RuntimeError: Unable to open shape_predictor_68_face_landmarks.dat
Gaurav
September 8, 2019 Hi Paul sir,
Greetings for the day.
I am a Ph.D. student.
My research proposal is emotion recognition on static images.
I am a bit confused about how to start the work, as I am not finding a starting point.
Which tool is best, and where should I start? I know the basics of Python, so can you please help me, sir, to start my work on the given proposal?
srikanth
December 4, 2019 I need the same thing with TensorFlow.