Analyzing a Discrete Heart Rate Signal Using Python – Part 1

HB

This the first part in a four part series about how to analyse a continuous heart rate signal, extract those measures often used in scientific publications, and filter the signal and detect error.

Important: The code in this tutorial is licensed under the GNU 3.0 open source license and you are free to modify and redistribute the code, given that you give others you share the code with the same right, and cite my name (use citation format below). You are not free to redistribute or modify the tutorial itself in any way. By reading on you agree to these terms. If you disagree, please navigate away from this page.

Citation format
van Gent, P. (2016). Analyzing a Discrete Heart Rate Signal Using Python. A tech blog about fun things with Python and embedded electronics. Retrieved from: http://www.paulvangent.com/2016/03/15/analyzing-a-discrete-heart-rate-signal-using-python-part-1/

Github rep: Github repository located here


 

The following tutorial assumes intermediate knowledge of the Python programming language, FIR-filters and fast fourier transform methods. The tutorial should be suitable for those with intermediate levels of Python skill. It is divided into separate parts so that you can easily skip over those parts you understand. The data was collected with a simple data logger based on an ATTiny45, as described in another post.


Some theory and background
The heart rate contains a lot of information about you. Apart from the obvious number of beats per minute there is information about your health, physical fitness and even mental activity in there. If you got yourself a heart rate sensor and some data, you could of course buy an analysis package or try some of the open source things available, but not all of them might do the things you want or need. This was the case for me as well. So, why not try to make one yourself? If you’re reading this, chances are you want to try this. This tutorial series is for those interested in learning more about heart rate analysis and how to write a simple but effective analysis algorithm in Python using a few basic modules.

For the first part of this series there isn’t a lot of relevant theory. Just know that when talking about the shape of the heart rate signal, each beat is characterized by a QRS-complex as shown below. The R component is the large peak. This obviously is the one of interest to us:
components


First things first
First let’s download the dataset and plot the signal, just to get a feel for the data and start finding ways of meaningfully analysing it. I use pandas for most of my data tasks, and matplotlib for most plotting needs.

~

import pandas as pd
import matplotlib.pyplot as plt

dataset = pd.read_csv("data.csv") #Read data from CSV datafile

plt.title("Heart Rate Signal") #The title of our plot
plt.plot(dataset.hart) #Draw the plot object
plt.show() #Display the plot

step1_output
The signal looks nice and clean. Keep in mind that this will not always be the case, even with good sensors and especially not when measuring outside the lab! More on dealing with noise and automatically determining signal quality in part 3.


Detecting the first peaks
The first step is to find the position of all the R-peaks. To do this we need to determine Regions of Interest (ROI’s), namely for each R-peak in the signal. After we have these, we need to determine their maxima. There are a few ways to go about doing this:

  • Fit a curve on the ROI datapoints, solve for the x-position of the maximum;
  • Determine the slope between each set of points in ROI, find the set where the slope reverses;
  • Mark datapoints within ROI, find the position of the highest point.

The first provides an exact mathematical solution (more accurately: as exact as you want), but is also the most expensive computationally. With our example dataset the difference does not matter, but at the time I was developing this algorithm I had to go through almost half a billion datapoints quickly, and future datasets seemed only to get larger. Most of this code will also be re-written to C for use on an embedded ARM chip, so we need to keep it simple and efficient.
The latter two methods are computationally much less expensive, but also less precise. High precision with these methods relies much more strongly on a high sampling rate than it does with the curve fitting method. After all, the actual maximum of the curve will probably lie somewhere between two data points rather than on an actual data point, with a higher sampling rate this error margin will decrease. In Part 4 (currently in writing) we will also work with the curve fitting method for a more accurate approximation of the R-peaks, for now we’ll use the position of the highest point in the ROI as the position of the beat.

Now to work: first separate the different peaks from one another. For this we draw a moving average, mark ROI’s where the heart rate signal lies above the moving average, and finally find the highest point in each ROI as such:

~

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import math

dataset = pd.read_csv("data.csv")

#Calculate moving average with 0.75s in both directions, then append do dataset
hrw = 0.75 #One-sided window size, as proportion of the sampling frequency
fs = 100 #The example dataset was recorded at 100Hz
mov_avg = pd.rolling_mean(dataset.hart, window=(hrw*fs)) #Calculate moving average

#Impute where moving average function returns NaN, which is the beginning of the signal where x hrw
avg_hr = (np.mean(dataset.hart)) 
mov_avg = [avg_hr if math.isnan(x) else x for x in mov_avg]

mov_avg = [x*1.2 for x in mov_avg] #For now we raise the average by 20% to prevent the secondary heart contraction from interfering, in part 2 we will do this dynamically
dataset['hart_rollingmean'] = mov_avg #Append the moving average to the dataframe

#Mark regions of interest
window = []
peaklist = []
listpos = 0 #We use a counter to move over the different data columns
for datapoint in dataset.hart:
    rollingmean = dataset.hart_rollingmean[listpos] #Get local mean
    
    if (datapoint < rollingmean) and (len(window) < 1): #If no detectable R-complex activity -> do nothing
        listpos += 1

    elif (datapoint > rollingmean): #If signal comes above local mean, mark ROI
        window.append(datapoint)
        listpos += 1
        
    else: #If signal drops below local mean -> determine highest point
        maximum = max(window)
        beatposition = listpos - len(window) + (window.index(max(window))) #Notate the position of the point on the X-axis
        peaklist.append(beatposition) #Add detected peak to list
        window = [] #Clear marked ROI
        listpos += 1

ybeat = [dataset.hart[x] for x in peaklist] #Get the y-value of all peaks for plotting purposes

plt.title("Detected peaks in signal")
plt.xlim(0,2500)
plt.plot(dataset.hart, alpha=0.5, color='blue') #Plot semi-transparent HR
plt.plot(mov_avg, color ='green') #Plot moving average
plt.scatter(peaklist, ybeat, color='red') #Plot detected peaks
plt.show()

Firstpeaks

We have marked the highest point in each R-complex in our signal, not bad! However, this is an idealized signal. In part 3 of this series we will see how to deal with noise, measurement error, detection error and prevent our module from throwing errors at poor quality signal sections.

Some may ask: why the moving average, why not just use a horizontal line at around 700 as a threshold? This is a valid question and would work fine with this signal. Why we use a moving average has to do with less idealized signals. The amplitude of the R-peaks can change over time, especially when the sensor moves a bit. The amplitude of the smaller secondary peak can also change independently of the amplitude of the R-peak, sometimes having almost the same amplitude. One approach to reducing incorrectly detected peaks discussed in part 3 is by fitting the moving average at different heights and determining which fit seems best. More on this later.


Calculating heart rate
We know the position of each peak in time, so calculating the average ‘beats per minute’ (BPM) measure over this signal is straightforward. Just calculate the distance between the peaks, take the average and convert to a per minute value, like so:

~

RR_list = []
cnt = 0
while (cnt < (len(peaklist)-1)):
    RR_interval = (peaklist[cnt+1] - peaklist[cnt]) #Calculate distance between beats in # of samples
    ms_dist = ((RR_interval / fs) * 1000.0) #Convert sample distances to ms distances
    RR_list.append(ms_dist) #Append to list
    cnt += 1

bpm = 60000 / np.mean(RR_list) #60000 ms (1 minute) / average R-R interval of signal
print "Average Heart Beat is: %.01f" %bpm #Round off to 1 decimal and print

Also update the plot method to show the BPM in the legend:

~

plt.title("Detected peaks in signal")
plt.xlim(0,2500)
plt.plot(dataset.hart, alpha=0.5, color='blue', label="raw signal") #Plot semi-transparent HR
plt.plot(mov_avg, color ='green', label="moving average") #Plot moving average
plt.scatter(peaklist, ybeat, color='red', label="average: %.1f BPM" %bpm) #Plot detected peaks
plt.legend(loc=4, framealpha=0.6)
plt.show()

BPM2

Pretty cool! You have taken the first step in analyzing the heart rate signal. The number of beats per minute is a useful measure that is very often used in scientific studies and has many non-scientific uses as well, but the signal contains much, much more information. Part 2 of this series will deal with extracting more complex information from the heart rate signal.


Rounding up
Finally let’s tidy up our code and put it in callable functions. This will make our life much easier in the next part, and our code much more organized and re-usable. Note that probably the tidy thing to do is to make the functions part of a Class, but to keep the tutorial accessible also to those less experienced in Python (and perhaps not familiar or confident with classes), I’ve chosen to omit this from all code in this tutorial series.

Let’s put the BPM value and the lists we calculate in a dictionary that we can call, and can append with the measures we will calculate in part 2. Also let’s write a wrapper function process() so that we can call our analysis with as little code as possible:

~

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import math

measures = {}

def get_data(filename):
    dataset = pd.read_csv(filename)
    return dataset

def rolmean(dataset, hrw, fs):
    mov_avg = pd.rolling_mean(dataset.hart, window=(hrw*fs))
    avg_hr = (np.mean(dataset.hart)) 
    mov_avg = [avg_hr if math.isnan(x) else x for x in mov_avg]
    mov_avg = [x*1.2 for x in mov_avg]
    dataset['hart_rollingmean'] = mov_avg

def detect_peaks(dataset):
    window = []
    peaklist = []
    listpos = 0 
    for datapoint in dataset.hart:
        rollingmean = dataset.hart_rollingmean[listpos]
        if (datapoint < rollingmean) and (len(window) < 1):
            listpos += 1
        elif (datapoint > rollingmean):
            window.append(datapoint)
            listpos += 1
        else:
            maximum = max(window)
            beatposition = listpos - len(window) + (window.index(max(window)))
            peaklist.append(beatposition)
            window = []
            listpos += 1
    measures['peaklist'] = peaklist
    measures['ybeat'] = [dataset.hart[x] for x in peaklist]

def calc_RR(dataset, fs):
    RR_list = []
    peaklist = measures['peaklist']
    cnt = 0
    while (cnt < (len(peaklist)-1)):
        RR_interval = (peaklist[cnt+1] - peaklist[cnt])
        ms_dist = ((RR_interval / fs) * 1000.0)
        RR_list.append(ms_dist)
        cnt += 1
    measures['RR_list'] = RR_list
    
def calc_bpm():
    RR_list = measures['RR_list']
    measures['bpm'] = 60000 / np.mean(RR_list)

def plotter(dataset, title):
    peaklist = measures['peaklist']
    ybeat = measures['ybeat']
    plt.title(title)
    plt.plot(dataset.hart, alpha=0.5, color='blue', label="raw signal")
    plt.plot(dataset.hart_rollingmean, color ='green', label="moving average")
    plt.scatter(peaklist, ybeat, color='red', label="average: %.1f BPM" %measures['bpm'])
    plt.legend(loc=4, framealpha=0.6)
    plt.show() 

def process(dataset, hrw, fs): #Remember; hrw was the one-sided window size (we used 0.75) and fs was the sample rate (file is recorded at 100Hz)
    rolmean(dataset, hrw, fs)
    detect_peaks(dataset)
    calc_RR(dataset, fs)
    calc_bpm()
    plotter(dataset, "My Heartbeat Plot")

Call it like this:

~

import heartbeat as hb #Assuming we named the file 'heartbeat.py'

dataset = hb.get_data("data.csv")
hb.process(dataset, 0.75, 100)

#We have imported our Python module as an object called 'hb'
#This object contains the dictionary 'measures' with all values in it
#Now we can also retrieve the BPM value (and later other values) like this:
bpm = hb.measures['bpm']

#To view all objects in the dictionary, use "keys()" like so:
print hb.measures.keys()

Note that we have kept the get_data() function separate from our wrapper. This way our module can also accept dataframe objects already stored in memory, useful for example when these are generated by another program. This keeps our module flexible.

Pt1_finalplot2


53 Comments

  • Tu

    17th January 2017

    Is the data.csv file have inter beat interval values in millisecond ?

    Reply
    • Paul van Gent

      17th January 2017

      No, it is the raw heart rate signal.

      Reply
      • Tu

        17th January 2017

        You mean the heart beats ? can you please tell the unit?

        Reply
        • Paul van Gent

          17th January 2017

          What you see is the change in sensor conductance in response to the discoloration of the skin as the heart beats. There is an arbitrary time scale, but that is not relevant to the tutorial. Follow the series and it should become clear!

          Reply
  • Manujaya

    21st April 2017

    sir when i compile this code there is an error given

    raise ValueError(“window must be an integer”)
    ValueError: window must be an integer”
    can you please help me to solve this problem. This is really need for my project
    thank you

    Reply
    • Paul van Gent

      21st April 2017

      Hi Manujaya. The error should tell you all you need to fix it. I suggest you start with a few more basic Python​ tutorials. This one assumes at least an intermediate familiarity.

      Reply
  • Pingback: Start coding for Bobbi – the open source ECG monitor! | Pavlov | A dash of code, a pinch of science

  • Kamil

    25th May 2017

    Could you tell me how to process the ECG signal, which I have stored as a ndarray? It’s a standard numpy ndarray with one row (but there isn’t any key like ‘hart’). I wanted to pass it to the processing functions.

    Reply
  • Eina

    14th June 2017

    Hai Mr. van Gent,

    I’m doing this project for my thesis. I have some question that I want to ask.
    1. Is the data in csv raw (not in Hz)? If yes, why it’s so high? is it like the frequency of the pulse or the pulse it self? Because I’m trying to implement your code in my data that I get.
    2. If you get the raw pulse and you change it to frequency (in Hz), how did you change it? Because I haven’t found it yet.

    Thanks

    Reply
    • Paul van Gent

      14th June 2017

      Hi Eina.
      1. Yes it’s raw sensordata. I’m not sure what you mean here by ‘why is it so high’. I would suggest you follow the tutorial and plot the data, then you see why it’s sensordata. What sensor are you using and what data comes out of it?
      2. What do you mean with ‘raw pulse’? If you mean the raw signal, that’s what the tutorial series is about. If you mean you already have BPM but want Hz, well, BPM=beats per minute, think of Hz as beats-per-second.
      Let me know.
      -Paul

      Reply
  • Marcus Holzberger

    7th July 2017

    Very nice . Thank you

    Reply
  • Santosh

    8th July 2017

    Hai Mr. van Gent,

    i am research student i am working on real time heart rate monitoring and analysis signal and calculate RR interval ,can i do real time data analysis code using in python
    for access real time data using pulse sensor please help me how can access? .

    Reply
    • Paul van Gent

      8th July 2017

      Santosh, I’m not sure what you mean exactly. Do you want to read the sensor? Do you want help with the algorithm? Let me know
      -Paul

      Reply
      • Santosh

        17th November 2017

        Hai Mr. van Gent
        i am working disease prediction using HRV , so my question is how to predict diseases using real time HRv and which algorithm is best for prediction

        Reply
    • Wadaane

      21st July 2017

      Hello,
      Me too, I also wanted to do real time analyse and graph, but matplotlib animations are too slow.
      I gave up the idea to draw all datapoint, I think to reduce the number of samples to draw.
      You get me ?
      Like 10 beats take almost 4000 samples at 500Hz, I’ll use the 4000 samples for analysing, but only draw half, or quarter, like draw every fourth datapoint.
      I didn’t try it yet, but I think that’ll speed things up.

      Reply
      • Paul van Gent

        21st July 2017

        That probably won’t work, the library isn’t fast enough for anything real-time. You could look at some c or c++ libs. An alternative is to buffer 5 sec, analyse&plot, and repeat

        Reply
  • Anam

    8th July 2017

    Hello Paul, could you explain why you took 0.75s as your window size? What proportion are you talking about?

    Reply
    • Paul van Gent

      8th July 2017

      Hi Anam. I see I’ve made a mistake in the code. It’s not 0.75s (unless your sampling rate is 60Hz), but rather 0.75*sampling rate in both directions. The reason for this is that the larger this threshold, the less like the HR signal the rolling average will be, the smaller, the more similar. I would recommend you plot the signal + rolling average with different window sizes. See what happens, and why in this case 0.75 (or similar sizes) work.
      -Paul

      Reply
      • Anam

        8th July 2017

        Yep my signal is actually quite noisy and with 0.75, it actually considered the T-peaks to be R as well! I experimented with the value a little and 0.45-0.5 helped. Another problem I’m having is that my ECG sensor gives me the signal shape in a negative range of voltages sometimes (I think I need to read up on lead configurations a little) so I actually have to multiply the moving average by a value lesser than 1 instead of 1.2 or 1.1 like you’ve used. So I guess I’m just stuck with good ol’ trial and error!

        Reply
        • Paul van Gent

          8th July 2017

          Before tweaking I recommend you pass through part 2 and part 3 as well. There are some bits about signal filtering there, which is (always) a good idea, especially if you are having troubles with noise.

          Alternatively, if you don’t care about how it works, look at my repo: https://github.com/paulvangentcom/heartrate_analysis_python

          Reply
        • Paul van Gent

          8th July 2017

          Regarding the negative range: you can consider shifting it all up by some amount so it’s all positive, or cut off the negative ranges (if these don’t contain peak data, which I suspect based on your explanation).

          Reply
  • Becky

    31st August 2017

    Hi,

    I tried the final version on github for my data-set but was receiving an error and just decided to walk through the tutorial and make sure I was following. I know there is more to come but wanted to point out a couple of adjustments I had to make at this point for things to work:

    1. In the peak detection loop, I had to add in a check for ‘nan’. I also added in the option to start the peak detection at a later sample because a had a lot of nan values at the start that were throwing things off.

    if ((datapoint < rollingmean) and (len(window) < 1)) or (math.isnan(datapoint)) or (listpos<3000):

    2. Negative values seemed to be an issue so I subtracted off the min if it was negative:

    ECGmin=np.nanmin(dataset.ECG)
    if ECGmin<1: dataset.ECG=dataset.ECG-ECGmin

    I'm just curious if there was something I missed in adding these work arounds or thought I would share if they are helpful to others.

    This library is great, thanks for sharing!

    Reply
    • Paul van Gent

      31st August 2017

      Hi Becky. Thanks for your suggestions!

      1. That is correct, NaN’s are not handled. I Recommend interpolating your signal to fill in the NaN values, rather than handling them separately in the detection. I’ll put an interpolation preprocessing handler on the to-do list for github
      2. Correct, I recently ran into this when using a different ECG device as well, as well as a device where the signal needed to be flipped in its entirety. The fixes are there but not merged to github yet, on the to-do list.

      Reply
  • Meera

    6th September 2017

    Hi,
    Your project is amazing. Thank you. Is it also possible to apply the moving average filter in real-time? As in when I’m acquiring the signal at 20 Hz i would like to see my raw data, filtered data and beats per minute.

    Reply
    • Paul van Gent

      11th September 2017

      Hi Meera. 20Hz is much too low to get any meaningful results, I would look at 100Hz minimum. Python is not fast enough for this kind of real-time thing. You could, however, generate one plot each second..

      Reply
  • Amelia

    15th September 2017

    Hello Paul, your work is superb… I came across it while I was working on a project of wireless ecg transmission, and thought of using the same to verify whether the ecg has been received at the receiver side correctly- using the total number of beats and bpm…..Unfortunately I had some trouble with the python language and sorry to ask this but the .hart parameter – what exactly is that – when I run the code I often get errors saying the hart parameter is not available for the dataset etc.

    <the portions I am refering to are:

    plt.plot(dataset.hart)

    ….
    mov_avg = pd.rolling_mean(dataset.hart, window=(hrw*fs))
    ….

    etc…. Is it something that is available only to .csv files with both time and hr columns?
    (Sorry if the question is inapproprriate w.r.t to the topic…but it would be really helpful if you could answer this)

    Reply
    • Paul van Gent

      19th September 2017

      Hi Amelia. Thanks for the kind words!

      the .hart parameter refers to a specific column in the dataframe, so if you name your dataframe “df”, then you can access the column “hart” by calling df.hart. Does that make sense?

      As a reference, take a look at the github version, this drops the Pandas dependency and adds some optimizations. You might find using this one + documentation easier than following the tutorial if you’re not that familiar with Python.

      Reply
  • Amelia

    15th September 2017

    (also my sampling frequency is 244 Hz….will that sampling frequency have any negative effects on the calculation?
    -also how large the sample dataset can be (i.e. the minimum maximum allowed dataset size) for the code to work well..?

    Reply
    • Paul van Gent

      19th September 2017

      That should be fine. Generally a higher sampling rate is better as the signal will contain more data (resulting in more precision) to base measures on.

      Reply
  • amulya

    11th October 2017

    hi
    at step 2 plotting peaks….yea……im getting an error like
    validate
    raise ValueError(“window must be an integer”)

    ValueError: window must be an integer

    what do i do

    Reply
    • Paul van Gent

      12th October 2017

      What’s happening is that the Pandas function ‘rolling_mean’ expects the window size to be an integer. You can ensure an integer either by setting the window size parameters appropriately (lines 9,10), or force it like so:

      mov_avg = pd.rolling_mean(dataset.hart, int(window=(hrw*fs)))

      -Paul

      Reply
      • emily

        24th October 2017

        i seem to have a issue : window in an undefined function.

        Reply
        • Paul van Gent

          25th October 2017

          Emily, please share the details. With this I have no idea what’s happening. You can either post it here or mail me your code and a screenshot of the full error message to info@paulvangent.com

          -Paul

          Reply
    • M. Chowdhury

      30th October 2017

      mov_avg = pd.rolling_mean(dataset.hart, window=int(hrw*fs))

      worked for me, even though I got some warning. You can try and update us.

      Reply
  • amulya

    24th October 2017

    ohhh okay thats a lot of help thank you so much and for this blog.

    Reply
  • General

    1st November 2017

    Amazing tutorial THANK YOU SO MUCH!

    Reply
  • meredith

    3rd November 2017

    the code
    ***
    dataset = pd.read_csv(“data.csv”)

    #Calculate moving average with 0.75s in both directions, then append do dataset
    hrw = 0.75 #One-sided window size, as proportion of the sampling frequency
    fs = 100 #The example dataset was recorded at 100Hz
    mov_avg = pd.rolling_mean(dataset.hart, int(window=(hrw*fs)))

    #Impute where moving average function returns NaN, which is the beginning of the signal where x hrw
    avg_hr = (np.mean(dataset.hart))
    mov_avg = [avg_hr if math.isnan(x) else x for x in mov_avg]

    mov_avg = [x*1.2 for x in mov_avg] #For now we raise the average by 20% to prevent the secondary heart contraction from interfering, in part 2 we will do this dynamically
    dataset[‘hart_rollingmean’] = mov_avg #Append the moving average to the dataframe

    #Mark regions of interest
    window = []
    peaklist = []
    listpos = 0 #We use a counter to move over the different data columns
    for datapoint in dataset.hart:
    rollingmean = dataset.hart_rollingmean[listpos] #Get local mean

    if (datapoint < rollingmean) and (len(window) do nothing
    listpos += 1

    elif (datapoint > rollingmean): #If signal comes above local mean, mark ROI
    window.append(datapoint)
    listpos += 1

    else: #If signal drops below local mean -> determine highest point
    maximum = max(window)
    beatposition = listpos – len(window) + (window.index(max(window))) #Notate the position of the point on the X-axis
    peaklist.append(beatposition) #Add detected peak to list
    window = [] #Clear marked ROI
    listpos += 1

    ybeat = [dataset.hart[x] for x in peaklist] #Get the y-value of all peaks for plotting purposes

    plt.title(“Detected peaks in signal”)
    plt.xlim(0,2500)
    plt.plot(dataset.hart, alpha=0.5, color=’blue’) #Plot semi-transparent HR
    plt.plot(mov_avg, color =’green’) #Plot moving average
    plt.scatter(peaklist, ybeat, color=’red’) #Plot detected peaks
    plt.show()
    ***

    i did this it says window in mov_avg = pd.rolling_mean(dataset.hart, int(window=(hrw*fs)))

    TypeError: ‘window’ is an invalid keyword argument for this function

    please help im new to python but i need to code for learning purposes i tried my best.

    Reply
    • Paul van Gent

      8th November 2017

      Hi Meredith, what version of pandas are you using? I believe the “rolling_mean()” function was planned to change to “rolling()” in newer versions.

      Reply
      • Paul van Gent

        11th November 2017

        If you’re using the VS command prompt, make sure you use the correct instruction set. Use the 64-bit VS prompt if you’re trying to build a 64-bit library, etc.

        Reply
  • meredith

    10th November 2017

    hi paul thanks for replying my pandas version is ‘0.18.1’

    Reply
    • Paul van Gent

      11th November 2017

      Try calling rolling() in stead of rolling_mean.
      – Paul

      Reply
  • meredith

    13th November 2017

    it says module pandas has no attribute rolling…..so it does not contain the function?

    thanks paul

    Reply
  • ann

    15th November 2017

    hi paul,
    in 2nd code i just changed hrw= any integer without fractions and i get the output otherwise no, why does this happen?

    Reply
    • Paul van Gent

      4th December 2017

      Hi Ann. This is an issue with the current implementation. As soon as my paper about the algorithm is published, I’ll update everything (and put part 4 online). For now use floor division (“//” in stead of “/”), or wrap the division in int(). It doesn’t work because the rolling window function expects an integer.

      Reply
  • Sree

    20th November 2017

    Hi,
    Great Tutorial!
    How would I modify this code to read in input from audio source (say I have a heart sound recording).
    Could I use:
    from scipy.io import wavfile as wav
    rate, data = wav.read(‘heart_sound.wav’)
    In this case would my data be equivalent to dataset.hart?

    Thanks again for the great tutorial!

    Reply
    • Paul van Gent

      4th December 2017

      Hi Sree, We talked on Github. I will try to implement a quick fix for you asap.

      Reply
  • anu

    21st November 2017

    can i please have your mail paul cause i could send screenshots of the signals i have cause there’s a question i have

    Reply
  • Samer

    3rd December 2017

    Hi Paul!
    I am loading an already recorded raw ECG signal and filtering it:
    x = []
    with open(‘C:/Users/PC/Downloads/SamirECG(Win).txt’,’r’) as csvfile:
    dat = csv.reader(open(‘C:/Users/PC/Downloads/SamirECG(Win).txt’,’r’))
    for row in dat:
    x.append(float(row[0]))
    x = np.transpose(x)
    dataset = x[50:2000]
    f2 = np.convolve(dataset, np.ones((7,)) / 7, mode = ‘valid’)

    Things are working well regarding this issue, but when I try to run the first part of your code, which does the moving average calculation:
    hrw = 0.75 #One-sided window size, as proportion of the sampling frequency
    fs =500 #The example dataset was recorded at 100Hz
    mov_avg = pd.rolling_mean(f2.hart, window=(hrw*fs))

    I keep getting the following error:
    AttributeError: ‘numpy.ndarray’ object has no attribute ‘hart’

    Could you please help me fix this problem?
    Your help is appreciated,
    thanks!

    Reply
    • Paul van Gent

      4th December 2017

      Hi Samer. The implementation on this site uses Pandas DataFrame objects as data handlers. These, unlike numpy arrays that you’re using, can use named data retrieval. In the case of pandas: f2.hart, or similarly, f2[‘hart’].values, would return the values in the column with the header ‘hart’.

      The quickest way for you to get moving is to clone the numpy-based code from my github (https://github.com/paulvangentcom/heartrate_analysis_python). This code natively handles numpy arrays. Alternatively, you can open your csv using pandas and put the ECG data in a column named ‘hart’.

      Note that if you’re using raw ECG values, you need to transpose the entire signal up so that there are no negative values. This is addressed in the final part of the tutorial which will go online early 2018.

      – Paul

      Reply
  • John

    15th December 2017

    Hi Paul:
    (Please, see my thanks for your work on the comments section of part 2!)

    Many reading your code have been complaining about an integer error on parameter ‘window’ in pandas.rolling_mean. There seems to be a glitch of some sort. I was getting the error too.
    Here is what I’ve noticed:
    1. forcing window=int(hrw*fs) gives the same error
    2. typing “mov_avg = pd.rolling_mean(dataset.hart, window=int(hrw*fs))” on ipython works. Same goes for “mov_avg=dataset[‘hart’].rolling(window=75,center=False).mean()”
    3. calling the functions from the main body of heartbeat.py also works fine when window=int(hrw*fs).

    My recommendation to ones having this problem is calling the functions like this:
    >>import heartbeat
    >>dataset=get_data(‘data.csv’)
    >>process(dataset,0.75,100)
    >>bpm=measures[‘bpm’]
    >>print measures.keys()

    I’m using Python2.7 and tested the above on Spyder 3.1.4 under Anaconda2 distribution.
    I’m not really sure where the heart of the problem is. Just seem to have found a workaround.

    Thank you (and any reading),
    John

    Reply
    • Paul van Gent

      16th December 2017

      Thanks. I hop others will benefit from this as well. Pretty soon I hope my software paper about this is published and I can release part 4, which contains many changes including dropping pandas. You can sneak peak in my github for few changes.
      – Paul

      Reply

Leave a Reply