Analyzing a Discrete Heart Rate Signal Using Python – Part 2

This the second part in a four part series about how to use Python for heart rate analysis. In this part you will learn about more complex information embedded in the heart rate signal, and how to extract it using Python.

Important: The code in this tutorial is licensed under the GNU 3.0 open source license and you are free to modify and redistribute the code, given that you give others you share the code with the same right, and cite my name (use citation format below). You are not free to redistribute or modify the tutorial itself in any way. By reading on you agree to these terms. If you disagree, please navigate away from this page.
Citation format
van Gent, P. (2016). Analyzing a Discrete Heart Rate Signal Using Python. A tech blog about fun things with Python and embedded electronics. Retrieved from:

Github rep: Github repository located here

Find the other parts at these pages:

The following tutorial assumes intermediate knowledge of the Python programming language, FIR-filters and fast fourier transform methods. Highschool mathematics and basic statistics will help in understanding the theory and background, but are not required for the coding.
The tutorial should be suitable for those with intermediate levels of Python skill. It is divided into separate parts so that you can easily skip over those parts you understand. The data was collected with a simple data logger based on an ATTiny45, as described in another post.

Some Theory and Background
The heart rate signal contains a lot of information, not just about your heart but also about your breathing, short-term blood pressure regulation, body temperature regulation and hormonal blood pressure regulation (long term). It has also (though not always consistently) been linked to mental effort, and not surprisingly so since your brain is a very hungry organ, using up to 25% of your total glucose and 20% of your oxygen consumption. If its activity increases your heart needs to work harder to keep it supplied.
The measures we are interested in can be split into time-series data en frequency domain data. If you are familiar with fourier transforms the frequency part will make sense. If not, the wikipedia page has a good explanantion, and also a very nice visualisation of the process. The basic idea is that you take a signal that repeats over time (such as the heart rate signal), and determine what frequencies make up the signal. You ‘transform the signal from the time domain to the frequency domain’. Here is another visualisation I particularly love, which clearly shows how a repeating signal can be approximated as the sum of different sine waves repeating over time.
Time Series Data
For the time-series portion of the heart rate signal we are mostly concerned with the intervals between the heart beats and how they vary over time. We want the position of all R-complexes (R1, R2,…Rn), the intervals between them (RR1, RR2,…RRn, defined as ) and the differences between adjacent intervals (RRdiff-1,…RRdiff-n, defined as ).
To visualize:

The time series measures often found in the scientific literature are:

The IBI, SDNN, SDSD, RMSSD en pNNx (and also frequency domain measures) are often grouped under “Heart Rate Variability” (HRV) measures, because they give information about how the heart rate varies over time.
Frequency Domain Data
On the frequency side of the heart rate signal the most often found measures are called the HF (High Frequency), MF (Mid Frequency) and LF (Low Frequency) bands, an eternal testament to the level of creative naming found in science. The MF en HF bands are often taken together and just labeled ‘HF’. LF and HF roughly correspond to 0.04-0.15Hz for the LF band and 0.16-0.5Hz for the HF band. The LF band seems related to short-term blood pressure variation, the HF band to breathing rate.
The frequency spectrum is calculated performing a Fast Fourier Transform over the R-R interval dataseries. This method is, as the name implies, fast compared to the Discrete Fourier Transform method. The larger the dataset, the larger the speed difference between the methods.
We will calculate the measure for HF and LF by first to re-sampling the signal so we can estimate the spectrum, then transforming the re-sampled signal to the frequency domain, then integrating the area under the curve at the given intervals.

Time Domain Measures – Getting Started
Looking at the above time-series measures, we need a few ingredients to easily calculate all of them:

  • A list of the positions of all R-peaks;
  • A list of the intervals between all subsequent R-R pairs (RR1, RR2,.. RR-n);
  • A list of the differences between all subsequent intervals between R-R pairs (RRdiff1,… RRdiffn);
  • A list of the squared differences between all subsequent differences between R-R pairs.

We already have the list of all R-peak positions from the detect_peaks() function from the first part, which is contained in dict[‘peaklist’]. We also have a list of R-R pair differences from the calc_RR() function, which is in dict[‘RR_list’]. Great! No additional code written, already 50% of the way there.
To get the final two ingredients we use dict[‘RR_list’] and calculate both the differences and the squared differences between adjacent values:

RR_diff = []
RR_sqdiff = []
RR_list = measures['RR_list']
cnt = 1 #Use counter to iterate over RR_list

while (cnt < (len(RR_list)-1)): #Keep going as long as there are R-R intervals
    RR_diff.append(abs(RR_list[cnt] - RR_list[cnt+1])) #Calculate absolute difference between successive R-R interval
    RR_sqdiff.append(math.pow(RR_list[cnt] - RR_list[cnt+1], 2)) #Calculate squared difference
    cnt += 1

print(RR_diff, RR_sqdiff)

Calculating Time Domain Measures
Now that the ingredients are there, it’s easy to calculate all measures:

ibi = np.mean(RR_list) #Take the mean of RR_list to get the mean Inter Beat Interval
print("IBI:", ibi)

sdnn = np.std(RR_list) #Take standard deviation of all R-R intervals
print("SDNN:", sdnn)

sdsd = np.std(RR_diff) #Take standard deviation of the differences between all subsequent R-R intervals
print("SDSD:", sdsd)

rmssd = np.sqrt(np.mean(RR_sqdiff)) #Take root of the mean of the list of squared differences
print("RMSSD:", rmssd)

nn20 = [x for x in RR_diff if (x>20)] #First create a list of all values over 20, 50
nn50 = [x for x in RR_diff if (x>50)]
pnn20 = float(len(NN20)) / float(len(RR_diff)) #Calculate the proportion of NN20, NN50 intervals to all intervals
pnn50 = float(len(NN50)) / float(len(RR_diff)) #Note the use of float(), because we don't want Python to think we want an int() and round the proportion to 0 or 1
print("pNN20, pNN50:", pnn20, pnn50)

That’s it for the time series measures. Let’s wrap them up in callable functions. We expand the calc_RR() function to calculate our extra ingredients and append them to the dictionary object, and also merge calc_bpm() with the other time series measurements into a new function calc_ts_measures() and append them to the dictionary:
def calc_RR(dataset, fs):
    peaklist = measures['peaklist']
    RR_list = []
    cnt = 0
    while (cnt < (len(peaklist)-1)):
        RR_interval = (peaklist[cnt+1] - peaklist[cnt])
        ms_dist = ((RR_interval / fs) * 1000.0)
        cnt += 1
    RR_diff = []
    RR_sqdiff = []
    cnt = 0
    while (cnt < (len(RR_list)-1)):
        RR_diff.append(abs(RR_list[cnt] - RR_list[cnt+1]))
        RR_sqdiff.append(math.pow(RR_list[cnt] - RR_list[cnt+1], 2))
        cnt += 1
    measures['RR_list'] = RR_list
    measures['RR_diff'] = RR_diff
    measures['RR_sqdiff'] = RR_sqdiff

def calc_ts_measures():
    RR_list = measures['RR_list']
    RR_diff = measures['RR_diff']
    RR_sqdiff = measures['RR_sqdiff']
    measures['bpm'] = 60000 / np.mean(RR_list)
    measures['ibi'] = np.mean(RR_list)
    measures['sdnn'] = np.std(RR_list)
    measures['sdsd'] = np.std(RR_diff)
    measures['rmssd'] = np.sqrt(np.mean(RR_sqdiff))
    NN20 = [x for x in RR_diff if (x>20)]
    NN50 = [x for x in RR_diff if (x>50)]
    measures['nn20'] = NN20
    measures['nn50'] = NN50
    measures['pnn20'] = float(len(NN20)) / float(len(RR_diff))
    measures['pnn50'] = float(len(NN50)) / float(len(RR_diff))

#Don't forget to update our process() wrapper to include the new function
def process(dataset, hrw, fs):
    rolmean(dataset, hrw, fs)
    calc_RR(dataset, fs)

Call it like this:
import heartbeat as hb #Assuming we named the file ''

dataset = hb.get_data('data.csv')

hb.process(dataset, 0.75, 100)
#The module dict now contains all the variables computed over our signal:
#Remember that you can get a list of all dictionary entries with "keys()":

Analysing the signal and calculating all measures in a single thread takes about 25ms on my computer. That’s pretty quick!

Frequency Domain Measures – Getting Started
The calculation of the frequency domain measures is a bit more tricky. The main reason is that we do not want to transform the heart rate signal to the frequency domain (doing so would only return a strong frequency equal to BPM/60, the heart beat expressed in Hz). Rather, we want to transform the R-R intervals to the frequency domain. Difficult to understand? Think about it like this: your heart rate varies over time as your heart speeds up and slows down in response to the changing demands of your body. This variation is expressed in the changing distances between heart beats over time (the R-R intervals we calculated earlier). The distances between R-R peaks vary over time with their own frequency. To visualize, plot the R-R intervals we calculated earlier:
From the plot you can clearly see that the R-R intervals do not change abrubtly per heart beat, but vary over time in a sine-wave like pattern (more precisely: a combination of different sine waves). We want to find the frequencies that make up this pattern.
However, any fourier transform method depends on evenly spaced data, and our R-R intervals are most certainly not evenly spaced in time. This is because the position in time of the intervals is dependent on their length, which is different for each interval. I hope this makes sense.
To find our measures, we need to:

  • Create an evenly spaced timeline with the R-R intervals on it;
  • Interpolate the signal, which serves to both create an evenly spaced time-series and increase the resolution;
    • This interpolation step is also called re-sampling in some studies.
  • Transform the signal to the frequency domain;
  • Integrate the area under the LF and HF portion of the spectrum.

Calculating Frequency Domain Measures
First we create an evenly spaced timeline for the R-R intervals. To do this we take the sample position of all R-peaks, which is housed in the list dict[‘peaklist’] that we calculated in part 1. We then assign the y-values to these sample positions from the list dict[‘RR_list’], which contains the duration of all R-R intervals. Finally we interpolate the signal.

from scipy.interpolate import interp1d #Import the interpolate function from SciPy

peaklist = measures['peaklist'] #First retrieve the lists we need

RR_list = measures['RR_list']
RR_x = peaklist[1:] #Remove the first entry, because first interval is assigned to the second beat.
RR_y = RR_list #Y-values are equal to interval lengths
RR_x_new = np.linspace(RR_x[0],RR_x[-1],RR_x[-1]) #Create evenly spaced timeline starting at the second peak, its endpoint and length equal to position of last peak
f = interp1d(RR_x, RR_y, kind='cubic') #Interpolate the signal with cubic spline interpolation

Note that the time series does not start at the first peak, but at the sample position of the second R-peak. Because we work with the intervals, the first interval is available at the second peak.
We can now use the created function f() to find the y value of any x-position within our signal range:
print f(250)
#Returns 997.619845418, the Y value at x=250

Simlarly, we can pass our entire timeseries RR_x_new to the function and plot it:
plt.title("Original and Interpolated Signal")
plt.plot(RR_x, RR_y, label="Original", color='blue')
plt.plot(RR_x_new, f(RR_x_new), label="Interpolated", color='red')

Now to find the frequencies that make up the interpolated signal, use numpy’s fast fourier transform np.fft.fft() method, calculate sample spacing, convert sample bins to Hz and plot:
#Set variables
n = len(dataset.hart) #Length of the signal
frq = np.fft.fftfreq(len(dataset.hart), d=((1/fs))) #divide the bins into frequency categories
frq = frq[range(n/2)] #Get single side of the frequency range

Y = np.fft.fft(f(RR_x_new))/n #Calculate FFT
Y = Y[range(n/2)] #Return one side of the FFT

plt.title("Frequency Spectrum of Heart Rate Variability")
plt.xlim(0,0.6) #Limit X axis to frequencies of interest (0-0.6Hz for visibility, we are interested in 0.04-0.5)
plt.ylim(0, 50) #Limit Y axis for visibility
plt.plot(frq, abs(Y)) #Plot it
plt.xlabel("Frequencies in Hz")

Nice! You can clearly see the LF and HF frequency peaks in the signal.
The last thing remaining is to integrate the area under curve at the LF (0.04 – 0.15Hz) and HF (0.16 – 0.5Hz) frequency bands. We need to find the data points corresponding to the frequency range we’re interested in. During the FFT we calculated the one-sided frequency range frq, so we can search this for the required data point positions.
lf = np.trapz(abs(Y[(frq>=0.04) & (frq<=0.15)])) #Slice frequency spectrum where x is between 0.04 and 0.15Hz (LF), and use NumPy's trapezoidal integration function to find the area
print("LF:", lf)

hf = np.trapz(abs(Y[(frq>=0.16) & (frq<=0.5)])) #Do the same for 0.16-0.5Hz (HF)
print("HF:", hf)

This returns:
LF: 38.8900414093
HF: 47.8933830871

These are the areas under the frequency plot at the frequency spectra of interest. Remember from the theory that HF is related to breathing and LF to short-term blood pressure regulation. The measures have also been implicated in increased mental activity.

Wrapping It All Up
We’ve come a long way and can extract many meaningful measures from a heart rate signal now. However, chances are the module will fail if you input other heart rate data, because it may be more noisy than our ideal data, or may contain artefacts. The third part of this series will deal with signal filtering, error detection and outlier rejection.


  • santosh pandure

    August 29, 2016

    respected sir,
    can you give me your email id for communication. i need your help.

    • Paul van Gent

      August 29, 2016

      What can I help you with? Please ask the questions in the comments as they may help others as well.

  • Anupam

    November 29, 2016

    Dataset values are ranging from 400 to 700 but the arduino output is ranging from 40 to 70. Should I just multiply it by 10? Do you happen to have an optimized code of Pulse sensor for the Arduino?

    • Paul van Gent

      November 29, 2016

      That depends on how you’re reading from the Arduino. Check the code you are you using there. If you use an analog read it will have a range of 0-1023. Never change values in a dataset because they are not what you expect!

  • Alex

    January 11, 2017

    Thank you for the nice tutorial. I have a question, is it possible to use heart rate directly to calculate these features? or are there any other features that can be calculated directly from heart rate and not from IBI?

    • Paul van Gent

      January 12, 2017

      I’m not completely sure what you mean. Do you mean you have a BPM value and want to calculate the other measures? You can use the BPM value to get the average IBI over a section, but for most measures you need each peak position.

      • Alex

        January 12, 2017

        Yes exactly! Is it possible for frequency domain features: If I first convert BPM to RR intervals, detect peaks in that signal and then calculate frequency domain features? or this approach is wrong?

        • Paul van Gent

          January 17, 2017

          Hi Alex,
          Unfortunately I would not recommend this in most cases, but it depends on the resolution of the BPM data.
          The BPM value is usually an aggregate measure over multiple beats. Converting to RR-intervals means guessing what RR-intervals make up a segment of several beats. This state is unknown and has many possible permutations, estimating it will likely lead to unreliable results.
          You can quite easily create an arduino heart rate measurement system. The easiest is to use a Pulse Sensor or similar optical HR sensor. You could also create an electrical heart rate sensor with two contacts, a few op-amps and an optocoupler (always electrically de-couple circuitry attached to your body!)

  • Alex

    January 18, 2017

    Hi Paul,
    Thank you for your detailed response. Can you tell me, what steps I can follow from this tutorial to calculate HRV features if I have inter-beat-interval data?

    • Paul van Gent

      January 18, 2017

      If you have all inter-beat-intervals it is quite easy. You can do two things:
      – You can take a look at the formulas for the varying HRV measures in this tutorial or on other places on the web
      – If you want to use the code from this tutorial series, you can synthesize an R-R list from the IBI intervals. Example, if you have an IBI list of [500, 450, 500], this would mean you have data for 4 beats. These are at 0, 500, 950 and 1450ms respectively. Note that you need a per-beat-IBI, not an aggregate!
      Good luck 🙂

  • Ken

    January 23, 2017

    May I ask a question?
    I’ve been analysing ECG data using python(so this page is really helpful for me, thank you). I found that other research articles or web pages about HRV always use PSD(Power Spectral Density) to calculate LF and HF(In this page, you’re using amplitude spectrum, aren’t you?). What I want to know is can we use amplitude spectrum to calculate LF(and HF)?

    • Paul van Gent

      January 23, 2017

      Hi Ken,
      That is correct and I applaud you for noticing. Indeed I use the Amplitude Spectrum in this part. Doing a PSD can get quite complicated. In practice you can approximate the PSD with the absolute Fourier Transform magnitude squared. So if your Fourier function is F(x), then an approximation of the PSD is |F(x)|^2.
      Another approach is to use a periodogram, see this link.
      Both are quick and dirty estimators of the PSD, and both suffer from similar drawbacks. To convolute matters people often use the term PSD for the above or more complex methods interchangably.
      I’m stil writing part 4 in this series (have been for a while, I seem to be stuck in a time-vortex), where I actually go into this among other things. Before that, however, I need to increase my grasp of the methods as well :).

      • Ken

        January 24, 2017

        Thank you for the early reply, Paul!
        So in this case, squared Y means the rough values of PSD, right?
        example code:
        PSD = (np.fft.fft(f(RR_x_new))/n) ** 2 # Is this correct?
        I actually tried to use periodogram, but the output looked different from what I expected. I couldn’t change any optional parameters except fs at that time. So do we need to change any other options to calculate PSD?(Sorry for asking many questions. I am not familiar with wave analysis…)

        • Paul van Gent

          January 24, 2017

          You need to square every bin in the |FFT| spectrum. I’m not in the position to test your code right now. Plot them both and see if it seems right.
          However, note that the |F(x)|^2 is a very dirty way of estimating the PSD. One of the main problems is that adding more data does not add more variance, while a PSD refers to the spectrum energy per time unit. You can now see why it’s a dirty approximation. Doing a PSD the correct way is quite complex. However many papers that mention PSD seem to use the “absolute FFT squared” method, depending on what software they used to analyse the HRV data.

          • Ken

            January 25, 2017

            Oh I forgot to add the absolute function,
            PSD = np.abs((np.fft.fft(f(RR_x_new))/n)) ** 2 # This is the right one, I think.
            The output using this code seemed right(values of lf and hf are reasonable). But there is still difficulty using periodogram one(both lf and hf are nearly 0). So I am going to use fft this time. Thank you for your advice.

          • Paul van Gent

            January 26, 2017

            You’re welcome. Glad it was of help.

  • Marcus Holzberger

    July 7, 2017

    Hello Paul,
    first: very nice tutorial.
    I am not sure for which reason you calculate the rolling mean. You detect the Peak indexes from this. Is there any disadvantage to detect the peaks from the real signal? I do this with peakutils:
    with this peakutil.indexes it is very easy to make own thresholds, so that you can also detect the peaks in the T-waves for example.

    • Paul van Gent

      July 7, 2017

      Hi Marcus,
      A standard way of detecting peaks mathematically is to find intersections with some threshold, then solve for the maximum between two intersections. The rolling mean functions as this threshold in the code.
      Peak detection with other libraries is of course also a possibility, but in a tutorial series that is mostly about signal analysis I prefer to be more explicit in how things are calculated, and not rely on external libraries.
      Be sure to keep an eye on the site, I’ve almost finished part 4 about optimising and improving the whole thing.

      • Marcus Holzberge

        July 8, 2017

        Ah Ok, in part 3 it becomes reasonable for me. I should have read this before. I’m not sure if my code with peakutils can handle this.

        • Paul van Gent

          July 8, 2017

          Good to hear part 3 helped.
          As a general tip I would always recommend you to develop your own solution first before jumping onto modules (given you have the time). If anything, this will give you a good grasp of the problem and the possible solutions. This will help prevent things breaking later on.
          Once you finish the series I recommend you take a look at the repo, there’s some optimisations and generalisations there:

  • Dean

    October 7, 2017

    Hi Marcus, may I ask a question?
    In code, Is NN50 = nn50 ??
    Thanks for your tutorial!!
    nn20 = [x for x in RR_diff if (x>20)]
    nn50 = [x for x in RR_diff if (x>50)]
    pnn20 = float(len(NN20)) / float(len(RR_diff))
    pnn50 = float(len(NN50)) / float(len(RR_diff))

    • Paul van Gent

      October 10, 2017

      Hi Dean, yes I think that should indeed refer to nn50

  • ria

    November 16, 2017

    i completed part 1 and it was amazing and in part 2 peaklist gives me a key error should i add this code with or seperately?

  • John

    December 15, 2017

    Hi Paul, thank you for the tutorial and the hard work to put it together.
    I’d appreciate if you’d allow me to suggest a more “pythonian” way to calculate pNNx
    def pNN(x,t=50):
    “”” The proportion of differences greater than t ms. t defaults to 50ms; x is list of RR differences. “””
    pnnt=[i for i in x if (i>t)]
    return float(len(pnnt))/float(len(RR_diff))
    Then call pNN20(RR_diff, 20), pNN50(RR_diff), pNN80(RR_diff, 80) etc. This way other percentages/ratios are also covered without being explicitly calculated.
    Thank you for this excellent piece of work!

    • Paul van Gent

      December 16, 2017

      Hi John, thanks for the suggestion. That is indeed more pythonian!
      – Paul

  • Nuria

    March 9, 2018

    #Set variables
    Hello Paul,
    the post is marvelous. I have a doubt in the part where you transform the signal to the frequency domain. Can you explain more deeply what the code do?
    n = len(dataset.hart) #Length of the signal
    frq = np.fft.fftfreq(len(dataset.hart), d=((1/fs))) #divide the bins into frequency categories
    frq = frq[range(n/2)] #Get single side of the frequency range
    #Do FFT
    Y = np.fft.fft(f(RR_x_new))/n #Calculate FFT
    Y = Y[range(n/2)] #Return one side of the FFT

    • Paul van Gent

      March 21, 2018

      Hi Nuria. This is where the fast fourier transform is performed. I create the frequency bins (which frequencies belong to each datapoint-datapoint interval), then take a one-sided part of this (fft creates a signal mirrored across x=0). The fft itself is a standard numpy function.

  • Eduardo

    April 19, 2018

    Hi Paul, one question about normal range for LF/HF values: I have applied your method to my data and I have obtain a completely different value range, the thing is that I don’t know if Im doing something wrong or it is due a bad adaptation of your code, the thing is that I have a bvp signal from a plethismograph, adapting your code I obtain the same RRI_DF array and then I apply:
    #Set variables
    n = len(BVP_DF[“spData”]) #Length of the original bvp signal
    frq = np.fft.fftfreq(len(BVP_DF[“spData”]), d=((1/64)))
    frq = frq[range(int(n/2))] #Get single side of the frequency range
    #Do FFT
    Y = np.fft.fft(f(RR_x_new))/n #Calculate FFT
    Y = Y[range(int(n/2))] #Return one side of the FFT
    The thing is that my results are much lower than the you are showing above, my numbers for HF y LF are: LF: 1.07319855346
    HF: 1.84067854321
    And I don’t know if these could be correct or if they are to low for a real LF/HF value.
    Thanks for your article, really interesting!

    • Paul van Gent

      April 23, 2018

      Hi Eduardo. That seems correct. The thing with LFHF is that you use the ratio, so you need to divide: LF/HF. This ratio is what is used in the literature. I’ll make some adjustments to the tutorial to reflect this. I do use a simplistic method of approximating the power spectral density here (squared FFT). There are better ways.
      Stay tuned for a long-ago-announced part four. I keep saying it’s coming, and it is, but I need to await a publication procedure before I can put it online.
      – Paul

  • Raphael

    May 10, 2018

    Hi Paul
    You you interpolate the RR intervals to convert to continuous RR interval before extracting the frequency domain features. You are doing this using a third order (i.e. cubic) spline. I assume that the way you calculate the frequency features needs a continuous signal. Would it not also be possible to compute frequency features without interpolation?
    However, why are you using this cubic interpolation instead of a linear interpolation? I think with a cubic interpolation the signal can be overestimated or underestimated, especially around the RR values.

    • Paul van Gent

      May 14, 2018

      Hi Raphael,
      Apologies for not replying sooner, somehow I missed your comment. I upsample because of the analysis: when using a FFT to extract frequency domain features, the width of the frequency bins depends on the sampling rate of the underlying signal. Here upsampling means we can extract a bit more information.
      I use a cubic spline because I want to approximate the natural variation in the heart rate as you breathe. This variation happens gradually in a sinusoidal fashion, so using a cubic spline will approximate the underlying signal much better than a linear interpolation. The latter would leave the shape described by the sample datapoints intact and just upsample the number of points, in other words it would not add any information.
      Your question does make sense though. Always be careful when interpolating to make sure it makes sense. Adding information that is useless will only create problems.
      – Paul

  • Heedong

    June 29, 2018

    Hello Paul Your code has been a great help to the project we are doing right now.
    I would like to run the code of the whole process up to the contents of Part 2 by deep running, but is there a decent model? We are thinking of RNN-LSTM now, but there is no specific plan.

    • palkab

      June 29, 2018

      Hi Heedong. Why? Using deep learning on such a problem is likely not the way to go.

      An exact solution like what is done by the algorithm on the blog and on the github page is faster, requires only a fraction of an MB of ram and likely preciser.

      Something like an LSTM will already require several hundred MB of ram, larger models will require even more, and likely a GPU.

      Deep learning is powerful for sure, but not the answer to everything!

  • Yang

    June 29, 2018

    Hello Paul Your code has been a great help to the project we are doing right now.
      I would like to run the code of the whole process up to the contents of Part 2 by deep running, but is there a decent model? We are thinking of RNN-LSTM now, but there is no specific plan.

  • Raul

    February 19, 2019

    I want to congratulate you for the work done and the great contribution to the community. Also, I would like to apologize because my English is probably not correct.

    First of all, I would like to understand what the calculation of the area in the LF and HF bands is for, and how this relates to breathing. I understand that the process of inhalation and exhalation in the respiratory system causes slight variations on the IBI. Therefore, I understand that this information can be extracted from the FFT of the interpolated signal. But my intuition says that the frequency of breathing is in the frequency of the corresponding band, as long as the FFT is performed on temporary windows that can be included in the slower cycles of breathing. If it is done on the total of the session, a global characteristic of the frequencies is produced.

    Therefore, my question is what do these areas represent in the LF and HR bands?

    In the second place he affirms that: “the VLF and ULF band are often connected to physical activity, thermal regulation and hormonal blood pressure regulation”, I do not discuss it but I think that when he states that: “The high peak left in the graph consists of the VLF (Very Low Frequency) and ULF (Ultra Low Frequency) band in the HRV “is not true, I would say that in this case it is the continuous component of the interpolated signal (which will be around a value of 1000), it is say, it represents the average value of the interpolated signal. To say this, I am based on the fact that I see that it has around 2500 samples and that you normally use 100Hz as the sampling rate. This gives you a frequency resolution after the FFT of 0.04Hz (just the first value of the LF band) that closely resembles the jumps between points of the FFT graph. I think then, that if you wanted to have such a VLF band (or ULF) you would need to increase the number of samples without altering the sampling frequency, this would increase the frequency resolution of the FFT. Perhaps another option would be to decimate the interpolated signal.

    • palkab

      February 28, 2019

      Hi Raul. Thanks for your remarks and comments.

      Regarding the LF and HF bands, there is a complex interplay between breathing, parasympathetic nervous system activity, baroreceptors/blood pressure, and heart rate. These are expressed in the variation of peak-peak intervals over time. We interpolate the peak-peak intervals to get a measure of the rate of change of the heart rate over time. You can find an excellent paper outlining the physiological relationships here:

      Regarding the VLF/ULF bands, it seems you are correct. I will remove that bit of text from the page. Thanks!


  • duhita

    March 2, 2019

    Hey Paul,

    in this snippet of code,
    from scipy.interpolate import interp1d
    meaures = [peaklist]
    print(‘frequency domain maesures’, peaklist)
    RR_x = peaklist[1:]
    print(‘RR_X IS : ‘, RR_x)

    RR_y = RR_interval_list
    print(‘RR_Y IS : ‘, RR_y)

    RR_x_new = np.linspace(RR_x[0], RR_x[-1],RR_x[-1])
    print(“RR_X_NEW IS : “, RR_x_new)
    f = interp1d(RR_x_new, RR_y, kind=’cubic’)
    print(‘f is : ‘,f(250))

    I am getting an error as:

    this peaklist is for measuring of frequency domain maesures [63, 165, 264, 360, 460, 565, 674, 773, 863, 953, 1048, 1156, 1272, 1385, 1487, 1592, 1698, 1803, 1897, 1994, 2097, 2206, 2308, 2406]
    RR_X IS : [165, 264, 360, 460, 565, 674, 773, 863, 953, 1048, 1156, 1272, 1385, 1487, 1592, 1698, 1803, 1897, 1994, 2097, 2206, 2308, 2406]
    RR_Y IS : [1020.0, 990.0, 960.0, 1000.0, 1050.0, 1090.0, 990.0, 900.0, 900.0, 950.0, 1080.0, 1160.0, 1130.0, 1020.0, 1050.0, 1060.0, 1050.0, 940.0, 970.0, 1030.0, 1090.0, 1020.0, 980.0]
    RR_X_NEW IS : [ 165. 165.93180873 166.86361746 … 2404.13638254 2405.06819127
    2406. ]
    Traceback (most recent call last):
    File “C:/Users/Dee1/PycharmProjects/HeartAna1.1/”, line 235, in
    f = interp1d(RR_x_new, RR_y, kind=’cubic’)
    File “C:\ProgramData\Anaconda3\envs\HeartAna1.1\lib\site-packages\scipy\interpolate\”, line 433, in __init__
    _Interpolator1D.__init__(self, x, y, axis=axis)
    File “C:\ProgramData\Anaconda3\envs\HeartAna1.1\lib\site-packages\scipy\interpolate\”, line 60, in __init__
    self._set_yi(yi, xi=xi, axis=axis)
    File “C:\ProgramData\Anaconda3\envs\HeartAna1.1\lib\site-packages\scipy\interpolate\”, line 125, in _set_yi
    raise ValueError(“x and y arrays must be equal in length along ”
    ValueError: x and y arrays must be equal in length along interpolation axis.

    Process finished with exit code 1

    I would really appreciate your help and thank you so much for posting such an amazing article

    • palkab

      March 5, 2019

      Hi Duhita. I remember running into this issue but don’t recall exactly what the circumstances were. Would you be willing to share the datafile you use and the exact code to, so I can have a look?

      If time is critical, download heartpy from my github and install it: , it is the more mature package version of the tutorial basics. Your issue is fixed in that one.


  • duhita

    March 7, 2019

    hey Paul,

    thank for guiding me. I will try the heartpy from GitHub. right now I am stuck at interpolation. for the frequency domain analysis. I am searching web like crazy for how to interpolate.

    • palkab

      March 16, 2019

      Scipy.signal has a module ‘resample’ that does what you need. Heartpy already interpolates the signal for frequency analysis

  • Hamid Reza Maneshti

    June 1, 2019

    Hi Pual,
    thanks for this tutorial it’s very helpful for me. I have a question, I need to analyze data in real-time, in detail I wanna read data from Arduino and send those data to my pc for doing analyze. after that show analyzed data on my pc in real-time. how can I do it?

  • Noman marwat

    August 6, 2019

    Hi Paul Van Gent, I hope my comment will finds you well. Thank you so much you made my life easy. Part 3 and 4 links are not available. Could you please solve this issue? I need those parts of this series for my project. Thanks in advance

  • BestBen

    August 11, 2019

    I have noticed you don’t monetize, don’t waste your traffic, you can earn extra bucks every
    month with new monetization method. This is the best adsense alternative for any type of website (they approve all sites), for
    more info simply search in gooogle: murgrabia’s tools


Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.