Introduction

The aim of this project is to use the camera and processing power of modern cell phones to build an intuitive, user-friendly application that detects various biomarkers in images of blood samples and estimates their concentrations. It is planned for later use as a screening test for cancer. The application lets the user photograph the blood samples in a set format. The image is then segmented to detect the regions of interest. After noise removal, the intensity of each individual blob is calculated. A line is fit through the intensities and known concentrations of the standard samples, and the concentrations of the unknown samples are estimated from this standard curve, which quantifies the various molecules present in the sample. This year's goal was to develop the iOS version of ConcAnalyzer, which was developed for Android last year.

Demo

A video showing the entire usage flow of the final iOS application is shown below.

Work Done in GSoC 2017

This project continues last year's work, which can be found in this repository. Before last year's algorithm was implemented in iOS, it was modified to get better accuracy and to obtain results from all three channels of the input image. The modified algorithm was first added to the Android version of the application, and then the iOS version was developed. A list of commits can be found below.

Android

2c14d2b

iOS

For iOS, a new repository was created and a list of commits made this summer can be found here. Instructions on how to build the app can be found in the repository's README.

Details of the Algorithm Used

The algorithm used to detect the regions of interest is the same as last year's, which can be found here (Points 1-9). Only the algorithm for concentration estimation has changed; it is explained below.

For each blob \(B\) of the 7 blobs:
- Clip the rectangular portion bounding \(B\) from \(I_{raw}\) to get \(R_{clip}\).
- For each channel of \(R_{clip}\), generate the histogram and find the most frequent intensity. This gives 3 intensities for each blob, namely \(i_{red}\), \(i_{green}\), and \(i_{blue}\).

Use the intensity values (\(i_{red}\), \(i_{green}\), and \(i_{blue}\)) and the known concentration values of the five standard samples to train 3 linear regression models, one per channel, and obtain a slope and an intercept for each. Also calculate the \(R^2\) score of each regression model.

Calculate the concentration of the quality control and unknown samples using the slope and intercept values. This gives 3 concentration values for each sample, one per channel.
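
The concentration estimation steps can be sketched roughly as follows. This is a minimal Python illustration, not the code used in the Android or iOS apps. The function names (`channel_modes`, `fit_line`, `estimate_concentration`) and all sample numbers are hypothetical. The sketch also assumes that each model regresses concentration on intensity, i.e. concentration = slope × intensity + intercept; the text above does not state the direction of the fit, so the apps may instead fit intensity against concentration and invert the line.

```python
from collections import Counter
from statistics import mean


def channel_modes(pixels):
    """Return the most frequent intensity of each channel.

    `pixels` is a list of (r, g, b) tuples taken from the clipped
    rectangle around one blob (hypothetical input format).
    """
    modes = []
    for channel in range(3):
        # Counter plays the role of the per-channel histogram;
        # ties resolve to the value encountered first.
        histogram = Counter(p[channel] for p in pixels)
        modes.append(histogram.most_common(1)[0][0])
    return tuple(modes)  # (i_red, i_green, i_blue)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def estimate_concentration(intensity, slope, intercept):
    """Assumed model: concentration = slope * intensity + intercept."""
    return slope * intensity + intercept


# Hypothetical data for one channel: mode intensities of the five
# standard samples and their known concentrations.
standard_intensities = [200, 180, 160, 140, 120]
standard_concentrations = [0, 10, 20, 30, 40]

slope, intercept, r2 = fit_line(standard_intensities, standard_concentrations)
unknown = estimate_concentration(150, slope, intercept)
print(f"slope={slope}, intercept={intercept}, R^2={r2}, unknown={unknown}")
```

Running the same fit once per channel gives three slope/intercept pairs and three \(R^2\) scores, and therefore three concentration estimates for each quality control or unknown sample.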

Acknowledgment

I'm indebted to Dr. Tomas Helikar for giving me the opportunity to work on this amazing project. I would also like to thank Daniel Cohen Gindi & Philipp Jahoda, whose library, Charts, has been used in this project. It is released under the Apache License 2.0.

Also, thank you, Google. 😀