Heart rate and respiratory rate detection algorithm based on the Kinect for Windows v2

Paul Hofland, 10558969
July 7, 2016
15 EC Bachelor Thesis Physics & Astronomy, conducted between 01-03-2016 and 07-07-2016
Supervisors: Prof. Dr. M.C.G. Aalders & Ir. M.J. Brandt
Second evaluator: Prof. Dr. A.G.J.M. van Leeuwen
Academisch Medisch Centrum
Faculteit der Natuurwetenschappen, Wiskunde en Informatica
Universiteit van Amsterdam

Abstract

An algorithm based on the Kinect for Windows v2 that performs human heart and respiratory rate measurements without physical contact is developed. Both the theoretical methods and the experimental results of this algorithm are presented. The respiratory rate is determined by tracking the subject's chest while measuring its movements with the RGB-D depth camera. The heart rate is determined by tracking the subject's face while measuring its RGB color values, which allows a photoplethysmography method to be used. The green channel values are subsequently analyzed in LabVIEW to determine the heart rate. To experimentally validate this algorithm, fourteen tests with different operating conditions were performed on eight subjects. The experimental results show that both the respiratory rate and the heart rate can be determined. This makes the Kinect v2 a viable alternative to other respiratory and heart rate measurement devices, while it still has more potential functionalities to offer.

Measuring heart rate and breathing with a Kinect v2

Heart rate and breathing are normally measured in hospitals with instruments that are in physical contact with the patient. This allows accurate measurements, but physical contact also causes stress or discomfort in the patient, which can influence the measurement. To avoid this problem, the measurement has to be performed without physical contact. This research uses the Microsoft Kinect for Windows v2.
The Kinect v2 has an RGB color camera, an RGB-D depth camera and an infrared camera that uses an infrared light source. These sensors allow image and depth information to be measured simultaneously, and the Kinect v2 can perform various measurements with them; the emotions of the patient could, for example, also be measured. It is therefore important to investigate what the Kinect v2 is capable of. This research presents an algorithm for the Kinect v2 that can determine the heart rate and breathing without physical contact.

The breathing is determined by continuously tracking a few points on the body that together form a rectangle over the chest. During every frame, the average distance between the enclosed pixels and the Kinect v2 is measured. The algorithm stores this value and compares it with the distances measured in previous frames to determine how fast the chest is moving forward or backward. By measuring this movement, the algorithm can determine the breathing.

To determine the heart rate, a few points on the face are located and tracked that together form two small rectangles over the cheeks. We look at the cheeks because the blood flow is clearly visible there, owing to the blood vessels that run just under the skin. The average primary colors (red, green and blue) of the enclosed pixels are then stored. The oxygen-rich blood coming from the heart absorbs more red light than the oxygen-poor blood. This is because different molecules absorb different colors of light: just as with black or red paint, it is the molecules that determine which colors of light are absorbed and which are not. This makes it possible to measure the difference between oxygen-rich and oxygen-poor blood even though it lies under the skin.
To validate the proposed algorithm, fourteen measurements were performed under different conditions, during which the algorithm determined the heart rate and breathing of the user. From this it can be concluded that breathing and heart rate can be measured under ideal conditions. With this, the Kinect v2 proves itself as an alternative to other measurement instruments, while offering many more possibilities.

Contents

1 Introduction
2 System configuration
  2.1 Microsoft Kinect for Windows v2
  2.2 Software
  2.3 Crime-lite 2
3 Rate detection methods
  3.1 Respiratory rate detection
  3.2 Heart rate detection
4 Experimental work
  4.1 Experimental setup
  4.2 Work method
5 Results
  5.1 Respiratory rate
  5.2 Heart rate
6 Discussion
7 Conclusion
8 Acknowledgements
Appendices
A Microsoft Kinect for Windows v2 system requirements
B Respiratory rate detection code
C Heart rate data analysis graphical code

1 Introduction

Heartbeat and breathing are two fundamental processes in living organisms. There are many respiratory and heart diseases that require monitoring. Most monitoring methods use medical devices that require physical contact, such as plethysmography [7] and pulse oximetry [8]. These methods may cause unease in the user and can therefore influence the measurement.
Non-contact monitoring methods have also been researched [6][9]. These methods work with great accuracy, but they can only measure the respiratory rate and/or the heart rate. In recent years, the field of computer vision has developed rapidly for both the commercial and non-commercial market. The Kinect v2 is a product of this development and delivers high-accuracy sensors at a low consumer-grade cost [1]. The Kinect v2 offers an RGB color camera, an RGB-D depth camera and an infrared camera with an infrared light source, which makes it capable of capturing depth and image data simultaneously. These features have the potential to measure more than only the respiratory and heart rate. The SDK 2.0 from Microsoft allows new programmers, such as myself, to investigate the possibilities that the Kinect v2 has to offer. Simultaneously detecting bruises and the emotions of the user are potential functions of the Kinect v2. Combining these possibilities might even lead to a lie detector for abused children. The capabilities of such a device are therefore worth researching.

In this paper, we present a heart and respiratory rate detection algorithm based on the Kinect v2. The respiratory rate is determined by tracking the chest and using the RGB-D depth camera to measure its distance; a similar method is presented in [6]. To determine the heart rate, a remote photoplethysmography (rPPG) method is used, which looks at subtle color changes on the skin's surface [11]. To experimentally validate the algorithm, several tests with different operating conditions were performed and are presented.

The Kinect for Windows v2, the software used to create the proposed algorithm and the light source are introduced in chapter two. Both methods of calculating the heart and respiratory rate are explained in chapter three. In chapter four, the experimental work is explained. The experimental results are presented in chapter five.
Finally, chapters six and seven contain the discussion and the conclusion.

2 System configuration

The proposed algorithm is based on the following hardware and software.

2.1 Microsoft Kinect for Windows v2

Microsoft's Kinect for Windows v2 features an RGB color camera at a resolution of 1920 x 1080 pixels, an RGB-D depth camera at a resolution of 512 x 424 pixels and an infrared camera at a resolution of 512 x 424 pixels that uses an infrared light source, as shown in Figure 1. Microsoft states that the Kinect v2 runs at 30 frames per second [3], but in our case an average sample rate of 20.1 frames per second was achieved on the computer used for the experimental measurements, probably because the algorithm is performance demanding. For the Kinect v2 system requirements and price, see appendix A.

Figure 1: Microsoft Kinect for Windows v2. (left): Kinect v2 (front view). (right): Camera configuration (front view). Copyright of IEEE Sensors Journal.

To determine the depth of objects, the Kinect v2 utilizes a Time-of-Flight method, calculating the round-trip time of a pulse of light [2]. This method gives a better depth image quality than its predecessor, which used a structured-light method. The depth accuracy and resolution are not uniform, as they depend on distance, viewing angle and edge noise [1]. The resulting errors on the measurements are assumed not to be significant for this research.

2.2 Software

The proposed algorithm is written in C# using Microsoft Visual Studio 2015. The Microsoft Kinect for Windows SDK 2.0 was used for its tracking abilities and libraries. NI LabVIEW 2015 was used to analyze the heart rate data.

2.3 Crime-lite 2

The Crime-lite 2 is a light emitting diode (LED) lamp that has been used as the only light source for all measurements. It provides a uniform white (400-700 nm) illumination that is independent of the mains frequency [4].
This avoids the flicker of a normal light source, which follows the mains frequency (50 Hz), influencing the measurements.

3 Rate detection methods

To determine the respiratory and heart rate, the following methods were used.

3.1 Respiratory rate detection

To determine the respiratory rate, the subject first has to be identified by the algorithm. This is done by recognizing several body parts (joints) on the human body and tracking them every frame. This operation is made possible by the functionalities of Microsoft's SDK 2.0 software, which can recognize up to 25 joints on the human body. A method of identifying different human body parts from a single depth image is researched and presented in [5]. For our purposes, only three joints on the human body are tracked: the two shoulders and the torso, as shown in Figure 2. These three points define the region of interest, the subject's chest.

Figure 2: Region of interest in red and tracked joints in yellow, achieved by using Microsoft's SDK 2.0. Image taken with the RGB-D depth camera of the Kinect v2.

After identifying and tracking the chest wall, the algorithm uses the RGB-D depth camera to measure the distance of every pixel inside the region of interest. The mean value of these distances is calculated and saved every frame, at an average rate of 20.1 frames per second. To improve the signal, the algorithm takes the average of the mean depth values of three consecutive frames. This results in an average data sample frequency of 6.7 frames per second, which is a good sample rate for detecting the respiratory rate [6]. The experimental results of this depth signal improvement are presented in chapter five. The algorithm then compares the averaged measurements with each other by calculating the difference between them.
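As an illustration, the averaging and differencing steps just described can be sketched in Python (the thesis algorithm itself is written in C#; the function names and the synthetic depth trace below are illustrative assumptions, not the thesis code):

```python
import numpy as np

def smooth_depth(mean_depths, group=3):
    """Average the per-frame mean chest distances over groups of
    three consecutive frames (20.1 fps -> roughly 6.7 fps)."""
    n = len(mean_depths) // group
    trimmed = np.asarray(mean_depths[:n * group], dtype=float)
    return trimmed.reshape(n, group).mean(axis=1)

def depth_differences(smoothed):
    """Differences between consecutive averaged samples; a negative
    sign means the chest moved toward the camera (inhaling)."""
    return np.diff(smoothed)

# Illustrative synthetic trace: ~15 s of a 0.25 Hz breathing
# oscillation (5 mm amplitude) on a 1 m baseline, plus sensor noise.
rng = np.random.default_rng(0)
t = np.arange(300) / 20.1
raw = 1.0 + 0.005 * np.sin(2 * np.pi * 0.25 * t) + rng.normal(0, 0.001, t.size)
smoothed = smooth_depth(raw)        # 100 samples at ~6.7 fps
diffs = depth_differences(smoothed)
```

The sign of each entry of `diffs` is what the inhale/exhale decision works on.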
The algorithm keeps track of the sign of these differences and decides that the subject is inhaling or exhaling when the sign stays negative or positive, respectively, three times in a row. This eliminates the irregularities of breathing. After deciding that the subject is inhaling, the algorithm will eventually find the subject exhaling. When this happens, the algorithm finds and saves the point in time at which the maximum distance between the chest and the Kinect v2 was measured. The algorithm only does this after the subject has first inhaled and then exhaled, which assures that only true maximum distances are taken into account. When three maximum distances are found, the algorithm can calculate the frequency of the chest movements by extrapolating the time between the three maxima (with maxima at times t1 < t2 < t3, the two breath intervals give a rate of 2/(t3 - t1)). Doing this gives an almost real-time respiratory rate of the subject. As the algorithm requires three maxima, it runs three breaths behind the real-time respiratory rate. This method was chosen instead of taking more measurements and calculating a weighted average over time, because that would give a statistical respiratory rate, as researched in [6], instead of an almost real-time one. The experimental results of this respiratory rate detection method are presented in chapter five. For the respiratory rate detection code, see appendix B.

3.2 Heart rate detection

To determine the heart rate, the subject's face has to be identified as well as the body. This is done by recognizing several key points on the face, such as the nose, eyes and mouth. A mesh is created to fit over the subject's face. This creates stationary points (vertices) on the face which the algorithm can track. This operation is again achieved with Microsoft's SDK 2.0 software. A method of creating a non-uniform face mesh from a three-dimensional image is researched and presented in [10].
Using the vertices positioned on the mouth corners and on the bottom of the eyes, two rectangular boxes over the cheeks are created. These vertices define two regions of interest, as shown in Figure 3. The algorithm then measures the RGB values of every pixel in the regions of interest. This allows a remote photoplethysmography (rPPG) method to be used, which looks at subtle color changes on the skin surface using the RGB color camera [11]. The heart rate detection of this method is based on the pulsatile flow of oxygenated blood, which changes the amount of hemoglobin molecules and proteins in the skin. These changes also affect the optical absorption of the light spectrum. It is therefore possible to identify the heart rate based on color changes in the reflections of the skin.

Figure 3: Regions of interest in red and tracked joints in yellow, achieved by using Microsoft's SDK 2.0. Image taken with the infrared camera of the Kinect v2.

Only the green channel is taken into account, as this channel has the best features for photoplethysmography [11]. In NI LabVIEW 2015 the signal is first resampled at a steady rate of 20 Hz; a steady sample rate instead of a varying one is required for the Fourier analysis applied later on. Low-cutoff and high-cutoff bandpass filters are then applied to ignore inhuman heart rate frequencies and to reduce noise. These filters are set to 0.667 Hz and 1.667 Hz, representing heart rates of 40 bpm and 100 bpm respectively. The assumption is made that the subjects will not have a heart rate outside of these values. A fast Fourier transformation is then applied, which gives a spectrum of frequencies. The results found with this method are presented in chapter five. For the heart rate data analysis graphical code, see appendix C.

4 Experimental work

To experimentally validate the proposed algorithm for the Kinect v2, fourteen tests with different operating conditions were performed on eight subjects.
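As a compact reference for the heart rate analysis chain of section 3.2 (resampling to a steady 20 Hz, restricting to the 0.667-1.667 Hz band, and locating the FFT peak), a Python sketch follows. The original analysis is NI LabVIEW graphical code, so the function name and the synthetic test signal here are illustrative assumptions; for simplicity, the two bandpass filters are folded into restricting the spectral peak search to the pass band.

```python
import numpy as np

def heart_rate_from_green(timestamps, green, fs=20.0, band=(0.667, 1.667)):
    """Estimate the heart rate (bpm) from a mean green channel trace:
    resample onto a steady 20 Hz grid, Fourier transform, and take the
    peak frequency inside the 40-100 bpm band."""
    t_uniform = np.arange(timestamps[0], timestamps[-1], 1.0 / fs)
    g = np.interp(t_uniform, timestamps, green)   # steady rate for the FFT
    g = g - g.mean()                              # remove the DC offset
    freqs = np.fft.rfftfreq(g.size, d=1.0 / fs)
    spectrum = np.abs(np.fft.rfft(g))
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    f_peak = freqs[in_band][np.argmax(spectrum[in_band])]
    return 60.0 * f_peak

# Synthetic check: a 1.1 Hz (66 bpm) pulse sampled at an uneven rate,
# as in the Kinect measurements, with added noise.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 50.0, 1000))
green = 120 + 0.3 * np.sin(2 * np.pi * 1.1 * t) + rng.normal(0, 0.2, t.size)
bpm = heart_rate_from_green(t, green)
```

The in-band spectral peak found here plays the same role as the red peak values reported in Figure 10.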
The experimental measurements were executed as follows.

4.1 Experimental setup

The Kinect v2 was positioned stationary on top of a desk, one meter away from the subject. The Crime-lite 2 light source was positioned behind the Kinect v2 to illuminate the subject in both sitting and standing positions. The room in which the measurements took place had no windows and was only illuminated by this light source. The subjects wore an M70 fingertip pulse oximeter, which was filmed by a camera that was also positioned on top of the desk.

4.2 Work method

The eight subjects consisted of both men and women between the ages of 20 and 30. To not influence the measurement, the subjects were not told what was being measured; instead they were given the instruction to read a short story on a tablet. Since the respiratory rate is easy to manipulate, this method was chosen to give a more natural and realistic result. The subjects also wore an M70 fingertip pulse oximeter. The measurements performed with this tool are used as a gold standard for the heart rate detection results, as shown in chapter five.

Every test done with the Kinect v2 consisted of 1000 frames, corresponding to approximately 50 seconds. In total, fourteen experimental tests were performed: nine in sitting position and five in standing position. Of the nine tests in sitting position, two were depth signal improvement tests, five were done in ideal conditions, one was done while the subject was wearing a jacket and one was done in bad lighting conditions. Of the five tests in standing position, four were done in ideal conditions and one was done while the subject was wearing a jacket.

5 Results

The results of the respiratory and heart rate measurements and the depth signal improvement are presented in this chapter.
5.1 Respiratory rate

First, a depth signal improvement measurement was done by averaging the mean distance values of three consecutive frames. This results in a sampling rate of 6.7 frames per second instead of 20.1 frames per second, which is a better sample rate for detecting the respiratory rate [6]. It also reduces the noise and makes the signal easier to analyze, as shown in Figure 4.

Figure 4: Depth signal improvement of the proposed algorithm. (left): Improved signal at 6.7 fps. (right): Raw signal at 20.1 fps.

Using the depth signal improvement, four tests were done in ideal conditions with the subjects in standing position. These results are shown in Figure 5, with the maximum distances found by the algorithm in red. The first maximum in all measurements was not found by the algorithm, because it has to detect the signal going down before it can set the boundaries between which it searches for a maximum distance. The results show that after the first found maximum, some maxima are not found by the algorithm. It is also notable that the signal is significantly worse when the subjects are in standing position. Results of the two tests that were performed in ideal conditions with the subjects in sitting position are shown in Figure 6.

Figure 5: Depth signal of the subjects in standing position, with the maxima found by the algorithm in red.

Figure 6: Depth signal of the subjects in sitting position, with the maxima found by the algorithm in red.

Clothing such as a jacket seems to significantly affect the signal only when the subject is in standing position, as shown in Figure 7. Again, a clear difference in signal quality is visible between the subjects in standing (a) and sitting (b) positions.

Figure 7: Depth signal of the subject with a jacket on, with the maxima found by the algorithm in red. (a): Subject in standing position. (b): Subject in sitting position.
In bad lighting conditions, the algorithm is still capable of finding the maximum distances, shown in red in Figure 8. The results also show that the signal is noticeably worse than the signal measured in ideal conditions (Figure 5).

Figure 8: Depth signal of the subject in standing position in bad lighting conditions, with the maxima found by the algorithm in red.

5.2 Heart rate

Figure 9 shows the green channel signals obtained by measuring the RGB values of the cheeks. These signals contain much more noise than the depth signals in Figures 5, 6, 7 and 8. Noise reduction and the removal of inhuman heart rate frequencies are necessary before a heart rate can be detected. The signals are analyzed in NI LabVIEW 2015, which applies two bandpass filters and a fast Fourier transformation. These results are shown in Figure 10.

Figure 9: Green channel signals of the subjects' cheeks, measured with the Kinect v2.

Figure 10 shows the frequency spectra with the peak value in red, which indicates the heart rate frequency. The heart rate determined with the proposed method, as well as the heart rate measured with the M70 fingertip pulse oximeter, are shown in Table 1.

Figure 10: Frequency spectra of the subjects' heart rates. In red the peak values, which indicate the heart rate frequency. (a): The peak indicates a heart rate of 55.4 bpm. (b): The peak indicates a heart rate of 67.2 bpm. (c): The peak indicates a heart rate of 53.7 bpm.

Subject   M70 oximeter heart rate (bpm)   Detected heart rate (bpm)
(a)       60                              55.4
(b)       70                              67.2
(c)       51                              53.7

Table 1: Heart rate measurement results.

6 Discussion

The average sampling rate was not consistent: the time between two frames varied between 39 ms and 126 ms. This may result in too few data points between breaths, in which case the algorithm will not detect a breath, as it needs at least four data points. Examples of this problem are visible in Figure 5.
The algorithm is not optimized, which makes it performance demanding and probably increases the effect of a varying sample rate. Using shapes instead of boxes to define the regions of interest has also been tried, but this method was even more performance demanding.

Figure 5 also shows the signal being more chaotic than in Figure 6. This is due to the subject's movements while standing. For example, when the subject moves forward while exhaling, the signal is reduced, as these movements cancel each other out when measuring depth. The same problem occurs when the subject moves backward while inhaling. A method in which the relative distance between the chest and the shoulders is analyzed could solve this problem, because the shoulders move along with the subject while being independent of the movements caused by breathing. Such a method is researched and presented in [6].

As mentioned before, the algorithm calculates the respiratory rate by extrapolating the time between three found maximum distances. This means that when the algorithm fails to locate a maximum distance, the error is significantly large. On the other hand, it only takes another three breaths to determine a new respiratory rate, since the algorithm is almost real-time. This problem could be partially solved by taking more measurements and calculating a weighted average; this method, however, would give a statistical respiratory rate [6] instead of an almost real-time one.

No respiratory measurements were done with other medical devices to use as a gold standard. This means that a comparison with the actual respiratory rate is not possible; only the depth signal can be used to confirm the respiratory rate detection. It also has to be noted that Microsoft does not state with which methods the body and face recognition are achieved. As mentioned before, methods to achieve this have been researched, but it is not certain that Microsoft uses similar methods [5][10].
All heart rate results were obtained by analyzing the measured data in LabVIEW 2015 and not in the algorithm itself. Although this could be implemented, due to a shortage of time this approach was chosen instead. The measurements done with the M70 fingertip oximeter were filmed, which makes the average heart rate that is used as a gold standard an estimated average of these measurements.

Table 1 shows the results of the heart rate detection and the results of the M70 fingertip pulse oximeter. The two measurements do not agree exactly, but they are quite similar. No errors are given on the detected heart rates, since these errors are difficult to determine: they depend on distance, viewing angle and edge noise [1]. We can therefore only conclude that the measurements are similar.

7 Conclusion

As mentioned before, the SDK 2.0 from Microsoft allows new programmers, such as myself, to investigate the possibilities of the Kinect v2. This research shows one of the applications that the Kinect v2 is capable of. The signals measured with the Kinect v2 show a lot of detail, which is promising for further research with this device. The results show that the respiratory rate could be determined, although the algorithm sometimes fails to recognize some maximum distances. Since the algorithm determines the respiratory rate almost in real time, such an error is discarded after three breaths. The heart rate was determined by analyzing the green channel color values in NI LabVIEW 2015. By applying two bandpass filters and a fast Fourier transformation, a heart rate could be determined that was similar to the gold standard measurements.

8 Acknowledgements

This bachelor thesis was a very exciting project for me because I started it without any programming skills. I am very grateful that my supervisor Maurice Aalders accepted me and supported me throughout this challenge.
I would also like to thank my second supervisor Martin Brandt for helping me solve all the bugs and problems that I encountered as a new programmer. He motivated and supported me every single time I was stuck on a problem.

Appendices

A Microsoft Kinect for Windows v2 system requirements

• Windows 8 (x64)
• 64 bit (x64) processor
• 4 GB memory (or more)
• i7 3.1 GHz (or higher)
• Built-in USB 3.0 host controller (Intel or Renesas chipset)
• DX11 capable graphics adapter

The current price of the Microsoft Kinect for Windows v2 is €99,-.

B Respiratory rate detection code

// Difference between the two most recent averaged depth samples
// ("rico" is Dutch for slope).
rico_check = z_values_total_average[frame_count - 1]
           - z_values_total_average[frame_count - 2];

// Count consecutive frames with the same sign of the slope.
if (rico_check > 0) { positive = positive + 1; negative = 0; }
if (rico_check < 0) { negative = negative + 1; positive = 0; }

// Three positive slopes in a row: the chest moves away, the subject is exhaling.
if (positive == 3)
{
    inhale = 0;
    exhale = 1;
    start_index = frame_count;
}

// Three negative slopes after an exhale: search the exhale interval
// for the maximum chest-to-Kinect distance and store its time.
if (negative == 3 && exhale == 1)
{
    max_value = 0;
    for (int index = start_index; index < frame_count; index++)
    {
        if (z_values_total_average[index] > max_value)
        {
            max_value = z_values_total_average[index];
            max_index = index;
        }
    }
    max_times.Add(tijd3[max_index]);
    exhale = 0;
    inhale = 1;
}

C Heart rate data analysis graphical code

This heart rate data analysis graphical code was made by Prof. Dr. M.C.G. Aalders.

References

[1] L. Yang, L. Zhang, H. Dong, A. Alelaiwi and A. El Saddik. Evaluating and improving the depth accuracy of Kinect for Windows v2. IEEE Sensors Journal, 2015.
[2] A. Kolb, E. Barth, R. Koch and R. Larsen. Time-of-flight sensors in computer graphics. Eurographics, 2009.
[3] Kinect v2. https://developer.microsoft.com/en-us/windows/kinect/hardware. Accessed on July 5, 2016.
[4] Crime-lite 2. http://www.fosterfreeman.com/forensic-light-sources/329-crime-lite-specs.
Accessed on July 5, 2016.
[5] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman and A. Blake. Real-time human pose recognition in parts from single depth images. IEEE Conference on Computer Vision and Pattern Recognition, 2011.
[6] F. Benetazzo, A. Freddi, A. Monteriù and S. Longhi. Respiratory rate detection algorithm based on RGB-D camera: theoretical background and experimental results. Healthcare Technology Letters, 2014.
[7] K.F. Whyte, M. Gugger, et al. Accuracy of respiratory inductive plethysmograph in measuring tidal volume during sleep. J. Appl. Physiol., 1991.
[8] J.A. Sukor, M.S. Mohktar, S.J. Redmond and N.H. Lovell. Signal quality measures on pulse oximetry and blood pressure signals acquired from self-measurement in a home environment. IEEE Journal of Biomedical and Health Informatics, 2014.
[9] G. Tabak and A.C. Singer. Non-contact heart rate detection via periodic signal detection methods. 49th Asilomar Conference on Signals, Systems and Computers, 2015.
[10] W.J. Chew, K.P. Seng and L. Ang. Optimal non-uniform face mesh for 3D face recognition. Intelligent Human-Machine Systems and Cybernetics, 2009.
[11] W. Verkruysse, L.O. Svaasand and J.S. Nelson. Remote plethysmographic imaging using ambient light. Opt. Express, vol. 16, no. 26, 2008.