University of Derby
School of Computing & Mathematics
A project completed as part of the requirements for the
BSc (Hons) Computer Science
entitled:
Image Recognition: Investigating the possibilities of identifying a
spider from an image using popular feature detection algorithms
By Jamie West
In the years 2011 - 2015
Abstract
This project set out to investigate the ways in which applications can be used to identify animals, more specifically spiders. The investigation began by researching what already exists, finding that very few similar solutions are available. Going further in depth, this paper explains the steps of the three main feature detection algorithms and the advantages of each, so that the best-performing one could be used in a web application developed for this project. The application successfully identified a set of spiders, proving that a solution is possible. It was, however, not able to cope with large-scale demand in a timely fashion.
Contents Page

Abstract
Contents Page
1. Introduction
2. Literature Review
   2.1 Species of British Spiders and People's Perception
   2.2 What is already out there
   2.3 Image Recognition Techniques
   2.4 Image Recognition Algorithms
3. Methodology
   3.1 Implementation Technologies
   3.2 Architecture Design and Development
   3.3 The Algorithms
   3.4 Result Reviewing
4. Results, Analysis, and Critical Review
   4.1 Results
   4.2 Analysis
5. Conclusion
   5.1 Summary of key Issues
   5.2 Objectives
   5.3 Possible improvements and future research
6. References
7. Bibliography
8. Appendix
1. Introduction
Knowledge is one of the most important and powerful tools a human can have. Being able to instantly recognise and identify objects, including animals, is a key part of understanding our environment and surroundings. But what if we are unable to identify such things? We could ask other people who share this knowledge, or attempt to learn it ourselves. In the modern world, however, technology should be able to help us with this problem. Fortunately there are applications, such as encyclopedias and AI systems, that are smart enough to detect certain objects. Yet a widespread application that can detect animals from a picture remains unheard of. As a result, this dissertation will research the challenges of creating such software, to see if the findings explain why none exists. This project will also develop a piece of software that uses reverse image technology to provide information about common UK spiders, and assess how reliable it can be. Spiders have been chosen because developing an application to detect all animals would consume far too much time for the given deadline.
Algorithms already exist that aid in the detection and recognition of objects in images, so one of the major tasks is to explore which algorithm to use, as it will be implemented in the final web application. Consequently, this leads to the following hypothesis for this project:

One image feature detection algorithm should work better than the rest, and this should be used to develop a reverse image recognition app that can identify a certain species of spider without too great a degree of failure.
Aims/Objectives
The aim of this study is first to investigate and test the popular feature detection algorithms on various species of arachnid common to the British Isles. Once the results have been collected, the superior algorithm will be integrated into a useful and responsive web app that gives the user feedback about a spider, such as its name, where it is commonly found, and how aggressive it is. In order to achieve this aim, the following objectives follow the S.M.A.R.T. guidelines of being Specific, Measurable, Achievable, Realistic, and Timely:
1. To comprehensively research the algorithms needed to be able to develop a reverse
image search program, while also investigating what already exists before I start any
coding.
2. To research and provide information before development to identify the top five
common UK spiders so I can provide a suitable spread of data to use during the
development of the program.
3. To create a prototype project that has the ability to recognise simple shapes and
objects to see if the solution works before then adding the functionality to recognise
the five types of spiders that have been chosen as test data.
4. For each algorithm, its accuracy will be tested, as well as its time for completion, against a small sample of each spider species to determine which popular feature detection algorithm is most suitable for use.
5. To develop a user-facing web application that can identify a species of spider from a supplied image, returning the spider's name, the original image, and some other key facts.
6. To further build the web application to make it distributable and scalable by being able
to deal with a large amount of users as well as taking no longer than 6 seconds to bring
back a response.
Discussion
This paper is structured into four further parts. Part one is the literature review, an in-depth examination of previous work on similar applications and feature tracking algorithms. Part two is the methodology, in which an overview of available technologies is given, along with a design of the system architecture and a description of how data will be collected for result gathering. This leads straight into part three, results and analysis. This section lists all the data in a structured set of tables before drawing conclusions and transforming the data into a more visual representation using graphs. Finally, the paper ends with a conclusion discussing key points that went wrong, a summary of the above objectives, and future development or research.
2. Literature Review
2.1 Species of British Spiders and People's Perception
2.1.1 British Spiders
The proposed application will be designed and developed to identify British spiders. As this is only a study to see whether it is possible to create this application, only a small test sample is needed. Therefore the decision has been made to identify the five most common spiders found around the British Isles. The Natural History Museum (2013) supplies a visual document with six common spiders: Missing Sector Orb Web, House, Daddy Long Legs, Lace Web, Zebra Jumping, and False Widow. As this paper only uses five, the False Widow will be dropped.
The colours and shapes of the spiders will be important when deciding which image recognition technique to use. Their shapes are quite distinct, which is helpful, as it will make detecting the species a lot easier. The House Spider and the Lace Web Spider do look similar, though, so it would not be too surprising if some mismatch errors occur. It is also clear that matching images by colour will not work with spiders, as they are all a dark brown/grey colour.
2.1.2 People's Perception
Spiders are among the most feared animals in the UK: 30% of women and 20% of men act nervous or frightened when approached by one (Alpers, et al., 2009). Wagener and Zettle (2011, p.12) state in their investigation that one method to reduce the fear of spiders is an Information Based Approach (IBA), whose goal is to educate individuals by letting them gain as much information and knowledge about arachnids as possible. The proposed application will do exactly that, which is why it is important: once a spider has been identified, facts will be presented to the user educating them about that particular spider. This is not the first time software has been used to try to cure animal phobias; Alcaniz, et al. (2013) describe an application that uses augmented reality to make users see creatures around their hands, so that with time they gradually get used to being around the animals.
2.2 What is already out there
This part of the paper presents the research and findings about systems that already exist which can identify animals or recognise objects.
2.2.1 Reverse Image Search - Google
Google's reverse image search was released back in June 2007, with the ability to drag and drop images into the search bar while Google runs complex algorithms against trillions of websites to bring back a view of similar images (Google, 2015). Unfortunately, as of 2011, the API for this feature was deprecated, so there is no way to accurately access the metadata that Google calculates (Google, 2012). This is a problem, as the text of the spider's name will be needed in order to provide facts to the user; just showing the user more similar images would not be useful enough.
After lengthy exploration of the official app stores of the various operating systems (Google Play Store, App Store, and Windows Store), only a few apps were found that could be classed as similar. Some examples are Search Image (Future Mobile, 2015), Search By Image (Emilian Postolache, 2013), and PicFinder (SOFTDX, 2014). However, on closer inspection, these apps all simply integrate Google's image search. So currently, there are no officially released apps whose purpose is to identify animals or objects from an input image. This is another reason why developing this app can be really useful.
Nevertheless, this does prove it is achievable for a distributable system to compare images almost instantaneously against a huge set of data.
2.2.2 Other Software Implementations
Moving away from specific apps, there is a piece of software that captures animals from a video feed and identifies the species quite accurately (Huang, et al., 2013). Their research paper provides multiple tables showing how accurately the program distinguished each animal from every other animal in their database; the average success rate is 83%. An algorithm called SIFT (discussed in section 2.4.1) is used for the feature extraction, which is promising, as this project will also be taking SIFT into account during the development stage.
There is also a piece of software that performs a similar task, monitoring animal populations using a camera trap (Bolger, et al., 2013). This second program also uses the SIFT algorithm, which is demonstrated visually using a giraffe and its brown patterned spots. Overall, it looks very possible to create an image recognition application that identifies spiders of multiple species, as it has been done with other types of animals. Feature matching seems to be a good way of going about it, and section 2.4 of this paper goes in depth into three of the popular algorithms that could be used.
2.3 Image Recognition Techniques
2.3.1 Introduction
Computer vision consists of various methods of understanding images by retrieving, analysing, and processing them; but how does a computer manage to compare two images? The following discusses two of the popular methods, giving examples and explaining their relevance to this project where possible.
2.3.2 Histograms
In mathematics, a histogram is defined as a chart that shows the frequency distribution of a set of continuous data (Laerd Statistics, 2013). In computer vision, an image can have several different types of histogram. For example, the continuous data set can be the grayscale values of the image (Boucherkha, et al., 2009, p.1); other simple examples are the Red, Green, and Blue (RGB) channels. An example of three RGB histograms can be found in Boucherkha's paper.
Using these histograms, we can gather information about a picture and draw conclusions; for example, an image with a lot of blue could be a photo with the sky in the background. Histograms can also be used to compare image similarity, for example by comparing the RGB histograms of two images: if the histograms match closely, then a conclusion can be drawn that the images are similar. However, this method is not fully reliable, as two totally different images can contain the same distribution of colours, just in different pixel locations, which is why it cannot be fully relied upon to find objects within an image.
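To make the idea concrete, below is a minimal Python sketch (function names are illustrative, not from any library) that builds coarse grayscale histograms and compares them by histogram intersection. It also demonstrates the pitfall just described: shuffling the pixels leaves the histogram, and therefore the similarity score, unchanged.

```python
def grayscale_histogram(pixels, bins=8):
    """Count 0-255 intensity values into `bins` coarse buckets."""
    hist = [0] * bins
    for v in pixels:
        hist[min(v * bins // 256, bins - 1)] += 1
    return hist

def intersection(h1, h2):
    """Normalised histogram intersection: 1.0 means identical distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2)) / sum(h1)

img = [0, 10, 200, 255, 30, 40]
shuffled = [255, 40, 0, 30, 200, 10]   # same pixels, different locations
print(intersection(grayscale_histogram(img), grayscale_histogram(shuffled)))  # 1.0
```

The two "images" are clearly different pixel-for-pixel, yet their histograms match perfectly.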
There are multiple algorithms for matching two histograms that provide more accurate results. A well documented family, given by Pele and Werman (2010), is the Quadratic-Chi Histogram Distance Family, which builds on the quadratic-form distance between two histograms P and Q, with A a bin-similarity matrix. The quadratic-form distance is shown briefly below for further reading if needed:

QF^A(P, Q) = sqrt( (P − Q)^T A (P − Q) )
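As a sketch of how the quadratic-form distance works (pure Python, hypothetical names; the bin-similarity matrix A is chosen here purely for illustration), note that with A set to the identity matrix the measure collapses to plain Euclidean distance between the histograms:

```python
import math

def qf_distance(P, Q, A):
    """Quadratic-form distance sqrt((P-Q)^T A (P-Q)) between histograms P and Q,
    where A[i][j] encodes the similarity between bins i and j."""
    d = [p - q for p, q in zip(P, Q)]
    Ad = [sum(A[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return math.sqrt(sum(di * adi for di, adi in zip(d, Ad)))

# With A = identity the distance collapses to plain Euclidean distance.
n = 4
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
print(qf_distance([3, 0, 1, 0], [0, 3, 1, 0], identity))
```

A non-identity A lets neighbouring bins partially cancel, which is what makes the family more forgiving of small colour shifts than bin-by-bin comparison.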
6|Page
Unfortunately, the histogram method would not be of much use for the proposed spider application. The reason is that most British spiders are a similar dark brownish colour, as noted in section 2.1.1. Also, there is no way to be sure what the background will be when a user takes a photo of a spider.
2.3.3 Keypoint detection and feature matching
This image detection technique automatically examines an image to extract features (usually corners or edges) that are unique to the objects within it, in such a way that an object can be detected based on its features (Lowe, 1999). Lowe's paper explains that the process is usually split into three main stages: detection, description, and matching. Detection identifies interesting features within an image, known as keypoints, which should be detectable from multiple viewpoints of the same scene. Description gives every keypoint a description that is independent of properties such as scale and rotation. The final matching step takes the sets of descriptions from each image and determines which keypoints are similar; the more similar points, the more likely the two images match.
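The matching stage can be sketched as follows (Python; names and the 0.75 threshold are illustrative). It assumes descriptors are simple numeric values and applies Lowe's ratio test, accepting a match only when the nearest candidate is clearly closer than the second nearest:

```python
def match(desc_a, desc_b, distance, ratio=0.75):
    """For each descriptor in desc_a, find its nearest neighbour in desc_b and
    accept it only if it beats the second-nearest by the given ratio."""
    matches = []
    for i, da in enumerate(desc_a):
        scored = sorted((distance(da, db), j) for j, db in enumerate(desc_b))
        if len(scored) >= 2 and scored[0][0] < ratio * scored[1][0]:
            matches.append((i, scored[0][1]))
    return matches

dist = lambda a, b: abs(a - b)          # toy 1-D "descriptor" distance
print(match([0.0], [0.1, 5.0], dist))   # [(0, 0)]
```

Real descriptors are vectors (Euclidean distance) or binary strings (Hamming distance), but the ratio-test logic is the same.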
There are many algorithm implementations of this idea, and the following explains three of the more popular methods. The first is SIFT (Scale-Invariant Feature Transform), published by David Lowe in 1999 as noted in his paper. The second is SURF (Speeded-Up Robust Features), created in 2006 by Herbert Bay (Bay, et al., 2006). Finally, there is ORB (Oriented FAST and Rotated BRIEF), developed by a team at Willow Garage in 2011 (Bradski, et al., 2011); it is often chosen as an effective and efficient alternative to SIFT and SURF, as the title of its authors' paper suggests.
Two other feature matching algorithms that came up during the research are listed below. This paper will not give a detailed explanation of these, but acknowledges they are available, in case further reading is wanted or needed:

• FREAK (Fast Retina Keypoint), developed by Alahi, A., Ortiz, R., and Vandergheynst, P. (2012) to provide a fast way to calculate keypoint descriptions. This algorithm does not have a detection stage, unfortunately.

• BRISK (Binary Robust Invariant Scalable Keypoints), developed by Chli, M., Leutenegger, S., and Siegwart, R. (2011) to provide an effective and efficient way to generate keypoints. As a result, it focuses on keypoint detection, with no description stage available.
2.4 Image Recognition Algorithms
2.4.1 SIFT
Step one, referred to as Scale-space Extrema Detection, searches over all possible locations in the image and at different scales, using a cascade filtering approach so that the most expensive operations are applied only to interesting points that pass a given test (Lowe, 2004, p.2). Scale and the Gaussian function are very important concepts for understanding this algorithm, so the Gaussian function will be explained briefly first.
A Gaussian function is often referred to as Gaussian smoothing or Gaussian blur in image processing; its aim is to reduce the noise of an image (random variations of brightness or colour), resulting in enhanced blob detection (Guo, 2011, p.134). The most common blob detector is the Laplacian of the Gaussian (LoG) (Akakin, et al., 2013, p.1), where the input image is convolved with a Gaussian at a certain scale t. In the formula below, given by Lowe (2004, p.5), G(x, y, t) is the scale-space Gaussian, I(x, y) is the image, t is the scale, and L is the result:

L(x, y, t) = G(x, y, t) ∗ I(x, y)
The Laplacian function is very expensive to compute, so Lowe made SIFT use a Difference of Gaussians (DoG) as a faster approximation, a point Alex, et al. (2013, p.2) also agree on. The change is that instead of working out the LoG for each scale space, we take the difference between two Gaussians of the same image at neighbouring scales. The function, taken from Lowe's (2004, pp.5-6) paper, is:

D(x, y, t) = G(x, y, t_high) − G(x, y, t_low)
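The DoG idea can be sketched in one dimension (Python; the scales t = 1.0 and k = 1.6 are illustrative values, not prescribed by the paper): blur the same signal at two nearby scales and subtract; the response is strong near edges and near zero in flat regions, which is exactly what the extrema search exploits.

```python
import math

def gaussian_kernel(t, radius=4):
    """Sampled 1-D Gaussian at scale t, normalised to sum to 1."""
    k = [math.exp(-x * x / (2.0 * t * t)) for x in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def convolve(signal, kernel):
    """Convolution with the border clamped to the edge value."""
    r = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - r, 0), len(signal) - 1)
            acc += signal[idx] * w
        out.append(acc)
    return out

def difference_of_gaussians(signal, t, k=1.6):
    """D = G(t_high)*I - G(t_low)*I, both blurs applied to the same signal."""
    low = convolve(signal, gaussian_kernel(t))
    high = convolve(signal, gaussian_kernel(k * t))
    return [h - l for h, l in zip(high, low)]

# A step edge: the DoG response peaks at the transition, stays ~0 in flat areas.
dog = difference_of_gaussians([0.0] * 8 + [1.0] * 8, t=1.0)
```

In SIFT the same subtraction is done in 2-D between adjacent levels of the Gaussian pyramid.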
8|Page
The next task in the SIFT algorithm is that, for each DoG layer, a sample point is chosen and compared to its neighbours: the 8 surrounding it in the current image layer, plus the 9 in the scale layer above and the 9 in the scale layer below (26 in total). If this point has a larger or smaller value than all of its neighbours, it is chosen as a potential keypoint, as illustrated in Lowe's (2004, p.7) paper.
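The neighbour comparison can be sketched directly (Python, illustrative names; `stack` is a small list-of-lists standing in for three DoG layers):

```python
def is_extremum(stack, s, y, x):
    """Is D(s, y, x) greater (or smaller) than all 26 neighbours spanning the
    layer below, the current layer, and the layer above?"""
    centre = stack[s][y][x]
    neighbours = [
        stack[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return centre > max(neighbours) or centre < min(neighbours)

# A 3x3x3 stack whose centre value dominates every neighbour.
stack = [[[1, 2, 1], [2, 3, 2], [1, 2, 1]],
         [[2, 3, 2], [3, 9, 3], [2, 3, 2]],
         [[1, 2, 1], [2, 3, 2], [1, 2, 1]]]
print(is_extremum(stack, 1, 1, 1))  # True
```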
Step two, known as Keypoint Localisation, improves on these results to obtain a smaller set of more accurate keypoints, by discarding keypoints that have low contrast or that lie poorly along an edge (Lowe, 2004). To refine the keypoints by contrast, a Taylor expansion is applied around each one; if the resulting value is less than a given threshold (usually set at 0.03), the keypoint is rejected (Ofir, 2009, p.22). Having removed the low-contrast points, bad edge points are discarded by examining the principal curvatures of each keypoint. To do this, a Hessian matrix H (as shown below) is computed at the keypoint's location and scale; if the ratio of principal curvatures derived from it exceeds a specified threshold, the keypoint is also rejected (Hu, et al., 2008, pp.2-3).
H = [ D_xx  D_xy ]
    [ D_xy  D_yy ]
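The edge test built from H can be sketched as follows (Python; r = 10 is the curvature-ratio threshold Lowe suggests in his paper):

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if Tr(H)^2 / Det(H) < (r+1)^2 / r, i.e. the two
    principal curvatures are not too different (the point is not edge-like)."""
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        return False  # curvatures differ in sign: discard
    return tr * tr / det < (r + 1) ** 2 / r

print(passes_edge_test(1.0, 1.0, 0.0))    # blob-like: True
print(passes_edge_test(100.0, 1.0, 0.0))  # edge-like: False
```

Using the trace/determinant ratio avoids computing the eigenvalues themselves, which is the efficiency trick in Lowe's formulation.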
Step three, Orientation Assignment, assigns a direction to each keypoint so that its orientation does not matter, i.e. the object can still be detected if the image is rotated a quarter-turn clockwise, etc. (Battiato, et al., 2007, p.2). For each scale-space point L(x, y, t), as discussed above, its gradient magnitude and orientation are worked out. The formulas below, provided by Ofir (2009, p.30), show how:

m(x, y) = sqrt( (L(x+1, y) − L(x−1, y))^2 + (L(x, y+1) − L(x, y−1))^2 )

θ(x, y) = tan⁻¹( (L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y)) )
9|Page
The values are then put into a gradient histogram, as Ofir (2009, p.31) illustrates. The orientations whose bins lie within 80% of the highest peak are kept, so a keypoint can be duplicated with different directions (as long as each is within 80% of the peak). We now have keypoints that are independent of scale and orientation.
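The gradient and histogram computation above can be sketched as follows (Python, illustrative names; this builds one global 36-bin histogram rather than SIFT's per-keypoint Gaussian-weighted window):

```python
import math

def orientation_histogram(L, bins=36):
    """Accumulate per-pixel gradient magnitude m into orientation bins."""
    hist = [0.0] * bins
    h, w = len(L), len(L[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx = L[y][x + 1] - L[y][x - 1]
            dy = L[y + 1][x] - L[y - 1][x]
            m = math.sqrt(dx * dx + dy * dy)
            theta = math.atan2(dy, dx) % (2 * math.pi)
            hist[int(theta / (2 * math.pi) * bins) % bins] += m
    return hist

def dominant_orientations(hist):
    """Every bin within 80% of the highest peak yields a keypoint direction."""
    peak = max(hist)
    return [i for i, v in enumerate(hist) if v >= 0.8 * peak]

# A horizontal intensity ramp: every gradient points along +x (bin 0).
ramp = [[float(x) for x in range(5)] for _ in range(5)]
print(dominant_orientations(orientation_histogram(ramp)))  # [0]
```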
Step four, the shortest of the steps, known as Keypoint Descriptor, removes the effect of illumination on each keypoint and represents it as a vector/matrix (Hu, et al., 2008, p.3). A performance overview given by Ofir (2009, p.37) shows that the final vector gives an 80% matching performance when the images have around 10% noise or up to a 45-degree viewpoint change, across 1k-100k keypoints, which is summarised as robust.
2.4.2 SURF
The SURF algorithm is structured similarly to SIFT, with which it shares many theoretical similarities (Namit and Xu, 2008, p.3). It can, however, be explained in three steps instead of four.
Step one, keypoint detection, also uses the mathematical concept of taking the Gaussian of the image (Bay, et al., 2006). However, instead of computing the Difference of Gaussians (DoG), a different technique is used to make the process much quicker. The approach is called box filtering; Moura, et al. (2008, p.1) explain that it involves taking a point and making a grid/box around it, e.g. 9x9 with the point at the centre. The aim is to replace each point with the average (mean) of its surrounding neighbours, including itself. An example can be found in Moura, et al.'s paper (2008, p.1).
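A box filter can be sketched in a few lines (Python, illustrative names; real SURF accelerates this with integral images so each box sum costs four lookups):

```python
def box_filter(img, radius=1):
    """Replace each pixel with the mean of its (2r+1) x (2r+1) neighbourhood,
    clamping at the image border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out[y][x] = sum(vals) / len(vals)
    return out

img = [[9.0, 0.0, 0.0],
       [0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
print(box_filter(img)[1][1])  # mean of all nine pixels: 1.0
```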
10 | P a g e
The next part is to get the scale spaces of each point, i.e. obtain the keypoint at different scales so that matching images of different sizes works. In SIFT, the DoG method subtracts pyramid layers from one another, but Pederson (2011) explains that SURF instead up-scales the filter size: starting with 9x9, then using filters of size 15x15, 21x21, and 27x27. With this scale-space representation, SURF applies the same technique shown in the SIFT algorithm above, where a sample point is chosen in its 3-D space and compared with its neighbours (Namit and Xu, 2008, p.7). Again, if the point is higher than all of its neighbours, it is classed as a maximum and considered a candidate keypoint (see the diagram in Lowe, 2004, p.7).
Step two, orientation assignment, must happen in order to provide rotational invariance, i.e. detection no matter what the rotation is (Bay, et al., 2006, p.6). This process is very different from SIFT's, as Bay goes on to discuss. It first calculates the Haar wavelet responses in both the x and y directions within a radius of 6s, where s is the scale at which the keypoint was detected. These values are plotted in a space, as shown in Bay's paper. The authors describe how to calculate the dominant orientation by taking a window of angle 60 degrees and calculating the sum of all plotted values within it. The window then slides by a degree and the sum is recalculated; the dominant orientation is the direction in which the highest sum is found.
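The sliding-window search can be sketched as follows (Python, illustrative names; a brute-force loop over whole-degree start angles, where each response is an (angle, dx, dy) Haar sample):

```python
import math

def dominant_orientation(responses, window=math.pi / 3):
    """Slide a 60-degree window around the circle; the start angle whose summed
    (dx, dy) response vector is longest defines the dominant orientation."""
    best_len, best_angle = -1.0, 0.0
    for step in range(360):                      # brute force: 1-degree steps
        start = math.radians(step)
        sx = sy = 0.0
        for angle, dx, dy in responses:
            if (angle - start) % (2 * math.pi) < window:
                sx += dx
                sy += dy
        length = math.hypot(sx, sy)
        if length > best_len:
            best_len, best_angle = length, math.atan2(sy, sx)
    return best_angle

# All responses point along +x, so the dominant orientation comes out as 0 rad.
samples = [(0.0, 1.0, 0.0), (0.1, 1.0, 0.0), (6.2, 1.0, 0.0)]
print(dominant_orientation(samples))
```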
11 | P a g e
Step three, the final step, collates the information for each keypoint and stores it in a vector known as a descriptor (Pederson, p.4). For each keypoint, Pederson recommends taking a square region of size 20s x 20s, with s being the scale. This region is then broken into 4x4 sub-regions, and for each one the Haar wavelet responses are taken again in the horizontal direction (d_x) and the vertical direction (d_y). The vector, taken from Pederson's paper, is:

v = ( Σ d_x, Σ d_y, Σ |d_x|, Σ |d_y| )
2.4.3 ORB
ORB is markedly different from the techniques used in SIFT and SURF, as will become obvious, but because they are all feature matching algorithms they have two stages in common: keypoints and descriptors. In the previous two algorithms, the keypoints and descriptors were tightly bound together in a series of steps relying mainly on Gaussian functions. ORB clearly separates the two parts: finding the keypoints uses a method known as oFAST, and the descriptors are built using a method known as rBRIEF (Kim, et al., 2014, p.14).
Keypoint detection is the first stage, identifying the interesting keypoints. To do this, ORB makes use of the FAST (Features from Accelerated Segment Test) algorithm, which is used for high-speed corner detection (Drummond and Rosten, 2005, p.6). It works by selecting a pixel P and getting its intensity I. A threshold value t is chosen by the programmer, and a circle of a given radius around the pixel is examined (Drummond and Rosten, 2006, pp.4-5), as illustrated in Drummond and Rosten's paper.
12 | P a g e
Each of the 16 pixels lying on the circle is marked if it is brighter than I + t or darker than I − t. If the number of contiguous marked points is greater than a given value N, the centre point is marked as a corner (Drummond, et al., 2008, pp.13-14).
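The segment test can be sketched directly (Python, illustrative names; n = 12 is one commonly used run length for the 16-pixel circle):

```python
def fast_corner(circle, centre, t, n=12):
    """Segment test: True if >= n contiguous pixels on the 16-pixel circle are
    all brighter than centre + t or all darker than centre - t."""
    def flag(v):
        if v > centre + t:
            return 1      # brighter
        if v < centre - t:
            return -1     # darker
        return 0
    flags = [flag(v) for v in circle]
    doubled = flags + flags             # so runs can wrap around the ring
    for sign in (1, -1):
        run = 0
        for f in doubled:
            run = run + 1 if f == sign else 0
            if run >= n:
                return True
    return False

corner = [200] * 12 + [100] * 4   # 12 contiguous bright pixels
flat = [100] * 16
print(fast_corner(corner, 100, 50), fast_corner(flat, 100, 50))
```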
However, FAST does not compute orientation, so at this point the keypoints are not rotation invariant. ORB applies a technique known as Orientation by Intensity Centroid, resulting in the new method oFAST (Bradski, et al., 2011, p.2). Oxford Dictionaries (2015) defines a centroid in mathematics as the centre of mass of a shape, so an intensity centroid is the average location of the most intense pixels. According to Bradski, et al., the intensity centroid is usually offset from the centre of the chosen circle, and the offset between the two gives the keypoint a vector, i.e. a direction.
Keypoint description is stage two of ORB, where it makes use of an existing algorithm called BRIEF (Binary Robust Independent Elementary Features). Creating keypoint vectors for thousands of points can take a great deal of memory and computational time; as a result, BRIEF turns the keypoint descriptors into a shorter, more memory-efficient form known as a binary string (Calonder, et al., 2010). To do this, the test below (taken from Calonder's paper) is applied to pairs of pixels in a patch of set size, where p(x) is the patch intensity at a point x and p(y) is the intensity at y:

τ(p; x, y) = 1 if p(x) < p(y), 0 otherwise
The next stage is to choose the vector length N for the binary string; it is usually 128, 256, or 512. With this variable, the final binary string for the keypoint is worked out using:

f_N(p) := Σ_{1 ≤ i ≤ N} 2^(i−1) τ(p; x_i, y_i)
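The intensity test and bit-packing can be sketched as follows (Python, illustrative names; a random sampling pattern of 32 pairs stands in for BRIEF's learned pattern, and Hamming distance is how two such binary descriptors are compared):

```python
import random

def brief_descriptor(patch, pairs):
    """Apply tau(p; x, y) for each sampled pair and pack the bits into an int."""
    desc = 0
    for i, ((x1, y1), (x2, y2)) in enumerate(pairs):
        if patch[y1][x1] < patch[y2][x2]:
            desc |= 1 << i
    return desc

def hamming(a, b):
    """Distance between two binary descriptors = number of differing bits."""
    return bin(a ^ b).count("1")

random.seed(0)  # fixed sampling pattern shared by all descriptors
pairs = [((random.randrange(4), random.randrange(4)),
          (random.randrange(4), random.randrange(4))) for _ in range(32)]
patch = [[(x + 2 * y) * 10 for x in range(4)] for y in range(4)]
print(hamming(brief_descriptor(patch, pairs), brief_descriptor(patch, pairs)))  # 0
```

Packing the comparisons into an integer is what makes matching so cheap: a single XOR and popcount replaces a floating-point vector distance.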
Having the BRIEF descriptors is only part of the solution, because BRIEF handles rotation very poorly; the descriptors must therefore be 'steered' according to the orientation of the keypoints (Bradski, et al., 2011, p.3). For any feature set of n test points at locations (x_i, y_i), define a 2 x n matrix S containing those coordinates. Then, using the orientation θ, the corresponding rotation matrix R_θ is found and used to rotate S, giving the final steered (rotated) version S_θ:

S = [ x_1 ⋯ x_n ]
    [ y_1 ⋯ y_n ]

S_θ = R_θ S
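The steering step is a plain 2-D rotation of the sampling pattern, which can be sketched as (Python, illustrative names):

```python
import math

def steer(points, theta):
    """Apply the rotation matrix R_theta to every column (x_i, y_i) of S."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# Rotating (1, 0) by 90 degrees gives (0, 1).
rx, ry = steer([(1.0, 0.0)], math.pi / 2)[0]
```

In practice ORB discretises θ into 30-degree steps and precomputes the rotated patterns, so steering costs a table lookup rather than a per-keypoint rotation.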
2.4.4 Performance and Relevance to Work
In order to create a responsive web app for spider identification, an efficient image recognition algorithm must be chosen. One could argue that it makes sense to choose the most recent algorithm, but while this might seem logical, it is not a sufficient way of choosing. The following presents research findings on how efficient and effective these algorithms are, ahead of the experiments conducted in the methodology (section 3.3) of this paper.
To find the efficiency and accuracy of the algorithms, tests must be performed against suitable data. The following summarises conclusions from other papers, first for SIFT and SURF. Oyallon and Rabin (2013) performed extensive comparisons of each stage of the two algorithms and concluded that SURF is very useful as a fast tracking system, as it processes its operations much faster than SIFT. On the other hand, when computation time is not a factor, SIFT outperforms SURF on accuracy. This deduction is also supported by the test results in Panchal, et al.'s (2013, p.5) paper.
Having found that SIFT gives better accuracy and SURF gives better speed, it is time to see how they compare against the third algorithm, ORB. ORB's developers, Bradski, et al. (2011), ran a series of evaluations against SIFT and SURF, and their results showed ORB was significantly faster: for a set of 2686 images it took ORB 15.3ms, SURF 217ms, and SIFT 5228ms. This data comes from the creators' own paper, so it is worth noting it could be biased. However, Isik and Ozkan (2014) provide a well-summarised set of benchmarks for a wide variety of algorithms, including some not discussed in this paper. Their conclusion is that FAST+BRIEF is the best possible combination for response time, at 0.083ms; but as discussed in section 2.4.3, that combination on its own is not rotation invariant, which is incompatible with what we want to do. The runner-up, beating both SIFT and SURF, is ORB with a time of 0.236ms. From this research, a deduction can be made that ORB is the direction to go for fast response times that will make the application scalable in a distributed systems environment.
3. Methodology
3.1 Implementation Technologies
3.1.1 Server Set Up Including Security - (Linux, Windows, PHP, ASP)
In order to make the application publicly accessible, it will have to be hosted on at least one server. The two most popular server operating systems currently available are UNIX-like systems and Microsoft Windows, both offering different advantages (W3Techs.com, 2015). Looking at that data, it is clear that Linux is the popular choice for most web servers, and being open source helps its case too. As the core of this application will be a simple program hosted on a web server, the decision has been made to choose Linux (Ubuntu 14.04, to be precise). The reasoning, as stated by Cabrera (2009), is mainly that it is generally more secure and is the more popular choice for high-performance web applications.
As this web application will use an Ubuntu server, the scripting language that will be used is PHP. There are many arguments comparing PHP and ASP.NET, such as the one by Mikoluk (2013), but as Frederick (2013) notes, PHP is a programming language whereas ASP.NET is a framework, so a direct comparison makes little sense. Many web applications follow a stack known as LAMP (Linux, Apache, MySQL, and PHP), which has been widely adopted and tuned for web performance (Jaffe, 2005), so this is the route the spider application will take as well.
3.1.2 Application Language – (C++, Java, MATLAB, OpenCV)
MATLAB (Matrix Laboratory) is a special-purpose programming language developed in 1984
by MathWorks, as discussed in an origins video by its co-founder Moler (2004). Its purpose
is to provide a hassle-free way of coding elements such as mathematical functions, model
simulations, data processing, graphs, and algorithm development (Gilat, 2011). Using it in
the application would be beneficial, as it would help aid the implementation of the feature
detection algorithms.
Another option is OpenCV (Open Source Computer Vision) (Laganiere, 2011). It is a set of
libraries developed for multiple platforms and languages, such as C++, Java, Python, and
even MATLAB. This framework is massively useful as it has over 2500 built-in algorithms
(OpenCV, 2015), including SIFT, SURF, and ORB. As OpenCV was originally designed and
developed for C++, this is the language that will be used for the application.
3.1.3 Scalable and Distributable – (Cassandra, Hadoop, Spark, NodeJS, Azure, Puppet)
Making a system distributable in a commercial environment, by ensuring it has high
performance and is scalable, is an important step. Many tools and platforms are available
that can be used together to make this happen. Not all of them will be used in this project,
as they are not all needed and there are hardware restrictions because this is not a large
commercial product.
The first piece of software to discuss is Cassandra. Cassandra was originally developed by
Facebook and is now an Apache open source project. It is a Database Management System
(DBMS) for distributed systems, designed to handle large amounts of data across many
servers (Lakshman, et al., 2010). It has multiple features that make it a good choice: it is
scalable, it is fault-tolerant (data is replicated in case of server loss), and failed server nodes
can be replaced with zero downtime (Cassandra, 2015). This makes Cassandra a better fit
for this application than a more common SQL DBMS.
Another point worth noting for distributing a system is being able to handle many requests
and respond almost instantaneously. Three applications that can help achieve this will be
discussed: Hadoop, Spark, and Node.js. Hadoop is an Apache open source framework that
supports processing of large data sets and can easily be scaled up from a single computer to
multiple machines (Hadoop, 2015). Hadoop is primarily written in Java, and the feature most
useful for the spider application is MapReduce, for large-scale data processing, as the
application will need to compare many images at the same time. Unfortunately, Hadoop does
not support image files in its file system, so implementing this framework would add
complexity. Another challenge is that the application will be written in C++ whereas Hadoop
primarily supports Java; there is, however, a wrapper available to help with this. Spark is
also an Apache open source framework similar to Hadoop, as its purpose is large-scale data
processing (Spark, 2015). It boasts that it can run programs up to 100x faster than Hadoop
MapReduce in memory, or 10x faster on disk. Spark will therefore be the framework this
project uses for objective number 6, as briefed in section 1 of this paper. Node.js is a slightly
easier platform for building fast, scalable network applications (NodeJs, 2015). It works by
splitting requests across the cores available in the system; for example, on a regular 4-core
processor, four requests could be processed in parallel. In a distributed system, the hardware
can scale up and allocate more cores to keep up with performance and demand.
This brings the paper to the final part of this section: how to scale up and allocate hardware.
Puppet is a piece of software that can be used to easily and rapidly automate the setup of
virtual machines or web services (Puppet Labs, 2015). If more hardware were required in
order to expand and be more scalable, a server admin would just set up a new machine and,
using Puppet, bring it to life with a preloaded set of applications, such as a LAMP setup with
Spark configured to work with the network. Microsoft Azure (Microsoft, 2015) is another
popular way to make a product scalable. It is a cloud computing platform developed by
Microsoft for building, deploying, and managing applications. Its website states that it will
automatically grow and shrink its services so you only use what is needed, making the
pricing very flexible. Unfortunately, due to hardware and other limitations, this project is
unable to provide a fully scalable solution, as only one server with a 4-core processor is
available, but Spark will be used for distribution.
3.2 Architecture Design and Development
3.2.1 Application Design
The C++ application will be split into two programs. The first will extract an XML matrix
from the uploaded user image. The second will compare the uploaded user XML with each
of the server-side image XML files (which will already be pre-extracted and saved).
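The second program's final decision, picking the species whose stored XML produced the most good matches, can be sketched as a simple vote. This is an illustrative minimal version only; the function name and data layout are assumptions, not the actual implementation:

```cpp
#include <cassert>
#include <map>
#include <string>

// Given the number of good feature matches against each server-side
// species, return the species with the highest count.
std::string bestSpecies(const std::map<std::string, int>& matchCounts) {
    std::string winner;
    int best = -1;
    for (const auto& [species, count] : matchCounts) {
        if (count > best) {
            best = count;
            winner = species;
        }
    }
    return winner;
}
```

In the real application, each count would come from matching the user's descriptor XML against one species' pre-extracted XML file.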
The website will use Twitter Bootstrap to stay responsive on all modern browsers and
devices, so it will automatically resize and reshape the page if the user rotates their phone or
zooms in on a tablet. This is an important step to get right, as the user experience is a great
factor in keeping visitors coming back.
It is, however, worth pointing out that members of the public would be able to create their
own applications, web or native, that talk to the C++ algorithm. All they need to do is submit
an image POST request to the server, and all the data would be returned. This flexibility is
an added bonus for users wanting to implement their own graphical user interfaces.
3.2.2 Architecture Design
The architecture design is vital to get right, as it is the backbone of the entire project. This
section will first discuss the design as if the project were to become a commercial product at
an enterprise level, by providing a diagram and then discussing it. The next part covers how
this project will actually be implemented, again with a diagram and discussion.
The diagram below shows the proposed commercial web software product.
It starts by having a user access the Apache HTTP web server that contains the PHP/HTML
files. The user then makes a POST request by sending an image file. PHP first runs an exec()
function that extracts the user's image into an XML file. The image and XML file are then
saved on a separate media storage server for later use. PHP then runs a second exec()
function, which makes an Apache Spark request to compare the user's XML (from the media
server) against a collection of server-side XML files (from Cassandra, loaded in memory) on
an application stored on another server located in the server farm. This server farm is located
in the cloud so it can scale up and down depending on hardware demand. After the C++
application has been given the two XML files, they are compared using the chosen algorithm
from section 3.3.2. All results are then collected and sent back to the web server, where they
are returned in the POST response for the website to show the client. The user's image and
XML files are then deleted from the media server.
The diagram below is the proposed architecture design for the application while it is not a
large-scale public application. Only one server is used in this diagram, as money and
hardware are significant limitations. As a result, the single Linux server has been split into
three main subdirectories to mimic the design above. The Apache HTTP server is a directory
that contains all the front-end and back-end web files. The other folder, labelled NTFS, is
located elsewhere and contains all the source images and source XML files, and is also
where the user's image and XML file get temporarily stored. Designing the server like this
will also make it a lot easier to integrate into a professional commercial setup later.
Similar to the situation above, the user navigates to the webpage by connecting to the web
address of the server. They upload their image via a POST request, where PHP runs a first
exec() function to extract the XML from the image. The two resulting files are then saved in
the NTFS area. The second PHP function is then carried out by sending a Node.js request to
the C++ application, giving a user and source XML file as the input. Node.js has been chosen
instead of Spark because a successful Spark implementation proved too difficult. After the
C++ application has performed its algorithm on all source XML files, the results are
collected and returned to the user's webpage. The uploaded XML and image are then deleted.
3.3 The Algorithms
3.3.1 Testing Which Algorithm To Use
At this point there is a choice of making the user app use the SIFT, SURF, or ORB
algorithm. Section 2.4.4 concluded that ORB is the most suitable, on the basis of other
people's research and tests. However, in order to be fully certain ORB is the right choice,
objective 4 of section 1 will be completed: to develop an app that tests both the time of
completion and whether the spider is successfully identified, for a small sample set of
images, against all three algorithms.
The first step is to find a large collection of spider images to use. Appendix A shows the
large set of spider images that are stored server-side for comparisons, whereas Appendix B
is the collection of images that will be used to test the finished application. For example, a
spider image from Appendix B will be compared with all spider images from Appendix A.
In this experiment, five images will be tested, one from each species (Appendix B.1, B.4,
B.7, B.10, and B.13).
3.3.2 The Results
Each row of the table below corresponds to an image of a different species of spider. Five
images have been chosen as this app currently only supports five spiders. The time is recorded
in seconds from the algorithm start to finish, and accuracy is shown by the expected and
actual result.
Appendix  Expected  SIFT time (s)  SIFT actual  SURF time (s)  SURF actual  ORB time (s)  ORB actual
B.1       Zebra     45.20          Lace         4.414          Orb          5.172         Lace
B.4       House     26.29          Daddy        3.063          Orb          3.893         House
B.7       Orb       41.97          Orb          3.347          Orb          5.540         Orb
B.10      Daddy     31.45          Daddy        2.614          Orb          4.582         Daddy
B.13      Lace      33.67          Daddy        3.655          Orb          5.192         Lace
As the table shows, the accuracy of both SIFT and SURF is very low compared with ORB.
The SIFT algorithm is also significantly more time-consuming than SURF, though SURF is,
on average, slightly faster than ORB. Overall, the results align with the previous research
undertaken, and it has been decided that ORB will be implemented in the final user version
of the app.
3.4 Result Reviewing
3.4.1 Introduction
Before the final web application is developed, a suitable plan must be put forward to test,
review, and analyse the application. This section discusses five different performance tests
in detail: what each test involves and how the results will be recorded. The results
themselves are provided in the next chapter of this paper (section 4).
3.4.2PerformanceTest1-Statastics
The first performance test to be carried out is similar to the one seen in section 3.3.2. All
fifteen images from Appendix B will be run through the ORB algorithm, recording the time
for completion, the expected and actual result, and a new statistic: certainty. The formula
used for certainty is shown in the example below. All results are given to four significant
figures.

Suppose image x brings back five results: 442 matches with zebra, 12 matches with house,
14 matches with orb, 84 matches with daddy, and 1 match with lace. By eye we can be fairly
certain the winner is zebra, as it has the highest count; as a formula it is:

certainty = highestMatch × 100 / (zebraMatches + houseMatches + orbMatches + daddyMatches + laceMatches)

certainty = 442 × 100 / (442 + 12 + 14 + 84 + 1) = 79.93%

Therefore, there is a 79.93% certainty that the spider identified is a Zebra Spider.
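The certainty formula can be sketched directly in C++. This is an illustrative version only; the function name and input layout are assumptions rather than the project's actual code:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// matches: the number of good feature matches against each server-side
// species. Certainty is the highest count as a percentage of the total.
double certainty(const std::vector<int>& matches) {
    const int total = std::accumulate(matches.begin(), matches.end(), 0);
    const int highest = *std::max_element(matches.begin(), matches.end());
    return 100.0 * highest / total;
}
```

With the example counts above, certainty({442, 12, 14, 84, 1}) evaluates to roughly 79.93, matching the worked example.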
3.4.3 Performance Test 2 – Functionality
Now that the statistics of the ORB algorithm are known, the next step is to test the
application as a whole. To do this, every image from Appendix B will be submitted one at a
time to the application, and its time of completion recorded. Theoretically, the time should
be slightly higher than the algorithm alone, as the POST request and image upload time must
be taken into account. On the other hand, as noted in section 3.2.1, all server-side images
have already been converted and saved as XML matrix files, so using the web application
could be quicker. Whichever the case, all the results should hopefully be less than 6 seconds,
as requested in section 1, objective 6.
3.4.4 Performance Test 3 – Distributability
This test determines how responsive and distributable the application is. It fires multiple
requests at once to see how well the server copes, and the results can be used to determine
how many servers would be needed if the application were to go public. The average time of
completion is recorded per batch of requests: for example, the average time of 5
simultaneous requests, then the average time of 10 simultaneous requests, and so on. In
order to fire the requests all at once, a small edit to the JavaScript code will be made, as
shown in pseudocode below. This edit will be removed once testing has finished, as it is only
needed temporarily.
FOR i = 1 TO numberOfSimultaneousRequests
    send a request to the application (without waiting for the response)
ENDFOR
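The same batching idea can be sketched in C++ independently of the server. In the following hypothetical snippet, std::this_thread::sleep_for stands in for the real POST request plus algorithm time; the names and timings are illustrative assumptions only:

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Stand-in for one full round trip: upload the image, run the
// algorithm, receive the result.
double simulatedRequest() {
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return std::chrono::duration<double>(
        std::chrono::steady_clock::now() - start).count();
}

// Fire n requests concurrently and return the average completion
// time in seconds, mirroring the batch measurement in this test.
double averageCompletionTime(int n) {
    std::vector<std::future<double>> pending;
    for (int i = 0; i < n; ++i)
        pending.push_back(std::async(std::launch::async, simulatedRequest));
    double total = 0.0;
    for (auto& f : pending) total += f.get();  // wait for every request
    return total / n;
}
```

In the real test the requests go over HTTP, but the measurement shape is the same: launch the batch, wait for every response, then average.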
3.4.5 Performance Test 4 – Software and Hardware
The fourth performance test checks how the application functions on different devices and
systems. The following will be tested: mobile, tablet, PC, Android, iOS, IE, Chrome, and
Firefox. The latest version will be used where applicable (as of April 21st 2015). For each
test, the design and functionality of the application will be recorded: does the application
actually work, and is everything visible and clear?
3.4.6 Performance Test 5 – Network Speeds
This test compares the time for the webpage to finish downloading, and the time for the
application to finish, at different network speeds, to see how well the app copes when used
over a wireless network. This is important because the application is intended to be used
spontaneously: it is hard to predict when a spider will show up, so the most frequently used
device is likely to be a mobile phone. Unless indoors at home, most mobile users will be
accessing data over their network provider, which may or may not restrict their bandwidth.
There is, of course, nothing stopping the user from taking an image and saving it for later.
4. Results, Analysis, and Critical Review
4.1 Results
The following five sections present the results of the five tests carried out on the application,
as discussed in the methodology (section 3.4). This section only provides a tabular view of
the results; section 4.2 summarises the findings and draws conclusions based on the tests
carried out, including more visual data representation such as graphs.
4.1.1 Performance Test 1 – Statistics

Image (Appendix)  Certainty (%)  Time (s)  Expected  Actual
1 (B.1)           35.08          5.172     Zebra     Lace
2 (B.2)           20.48          5.100     Zebra     Zebra
3 (B.3)           41.54          7.150     Zebra     Zebra
4 (B.4)           41.29          3.847     House     House
5 (B.5)           34.71          4.044     House     House
6 (B.6)           39.01          4.770     House     House
7 (B.7)           36.70          5.694     Orb       Zebra
8 (B.8)           24.24          3.669     Orb       Orb
9 (B.9)           25.81          4.959     Orb       Orb
10 (B.10)         52.11          4.663     Daddy     Daddy
11 (B.11)         31.37          4.118     Daddy     Daddy
12 (B.12)         28.86          4.068     Daddy     Zebra
13 (B.13)         35.26          5.386     Lace      Lace
14 (B.14)         37.72          6.162     Lace      Lace
15 (B.15)         37.95          4.021     Lace      Lace
4.1.2 Performance Test 2 – Functionality

Image (Appendix)  Image Size (kb)  Image Upload Time (×10⁻⁴ s)  Program Completion Time (s)
1 (B.1)           73.1             2.890                        4.215
2 (B.2)           134              3.049                        4.894
3 (B.3)           160              2.272                        8.314
4 (B.4)           12.7             3.362                        2.305
5 (B.5)           22               2.730                        1.892
6 (B.6)           23.7             3.061                        4.611
7 (B.7)           42.9             2.241                        7.541
8 (B.8)           6.47             2.849                        2.303
9 (B.9)           78.6             2.840                        5.696
10 (B.10)         32.4             2.270                        3.584
11 (B.11)         35.9             2.809                        2.401
12 (B.12)         27.3             2.270                        2.759
13 (B.13)         72.7             3.030                        5.907
14 (B.14)         126              2.229                        6.579
15 (B.15)         7.62             2.220                        2.632
4.1.3 Performance Test 3 – Distributability

Simultaneous Requests  Average time taken to complete (s)
1                      4.915
2                      8.795
3                      13.88
4                      17.72
5                      22.54
10                     46.77
15                     70.23
20                     93.57
4.1.4 Performance Test 4 – Software and Hardware

Medium   Does the functionality work?  Is everything visible?  Proof (Appendix)
Mobile   Yes                           Yes                     C.1
Tablet   Yes                           Yes                     C.2
Desktop  Yes                           Yes                     C.3
Android  Yes                           Yes                     C.1
iOS      Yes                           Yes                     C.4
Chrome   Yes                           Yes                     C.3
IE       Yes                           Yes                     C.5
Firefox  Yes                           No                      C.6
4.1.5 Performance Test 5 – Network Speeds

Network Speed      Page Startup Time (s)  App Completion Time (s)
Wi-Fi – 30 Mbps    0.136                  4.40
4G – 4 Mbps        0.173                  4.62
3G – 750 Kbps      0.751                  5.50
2G – 250 Kbps      1.890                  7.76
GPRS – 50 Kbps     10.25                  16.59
4.2 Analysis
Now that the data has been collected, it can be represented more visually and conclusions
drawn from the findings. The analysis starts with time. Each set of test results will be
referred to by its test number, i.e. 4.1.1 is test 1 and 4.1.2 is test 2.
[Chart: Completion time at varying image sizes. Program completion time (s), 0 to 9, plotted against image size (kb), 0 to 180, from the test 2 results; one anomalous point is circled in orange.]
The chart above plots the size of each image in kb (x-axis) against the time the program took
to complete (y-axis). There is a clear positive correlation: the larger an image is, the longer
the algorithm takes to complete, because there is more data to work with. The image upload
time is so small at these sizes that it does not noticeably affect the totals. It is worth pointing
out, however, that there is one anomaly in the results, shown by the orange circle; after some
investigation, it is still unknown why this occurred. To conclude, if this application were to
go public, the reasonable upload size for an image should be between 10 and 80 kb, for
optimum speed and to meet objective number 6 in section 1.
[Pie chart: distribution of certainty results across the bands 20–30% (27% of results), 31–40% (53%), 41–50% (13%), and 51–60% (7%).]
The pie chart above shows how certain the application was when giving its results. The
31–40% certainty band is the most common (53% of the pie), with 20–30% (27% of the pie)
second. A certainty between 31 and 50 might seem low, but it is in fact not too bad: there
are five possible spider choices, so for one to stand out over all the rest, it only needs just
over 20% certainty. This reasoning is further supported by the fact that, out of all fifteen
tests, only three came back wrong.
[Chart: Completion time using different network speeds. Program completion time (s), 0 to 18, plotted against network speed (Mbps), 0 to 35, from the test 5 results.]
The graph above shows a negative correlation between network speed and the time it takes
for the program to complete. At the low end (GPRS and 2G) there is a steep decline in
completion time, but as network speed increases the slope flattens until it eventually appears
even. This is because beyond around 4 Mbps the network is no longer the bottleneck, and
other factors, such as the application's algorithm, consume the rest of the time. Linking back
to the objectives in section 1, optimum performance requires a device on 3G, 4G, or Wi-Fi.
[Chart: Completion time with multiple requests. Program completion time (s), 0 to 100, plotted against the number of instances requested, 0 to 25, from the test 3 results.]
The line graph shows a strong positive correlation between how many requests were made
to the application and how long it takes to respond. Beyond two simultaneous requests, it
takes over 10 seconds to return a result, which is unfortunately not the desired outcome.
After further investigation, by logging into the server and checking the process information
(as shown in Appendix D.1), the CPU usage is very high for even a single request, so when
no CPU is available Apache queues the requests, which is why the graph is a near perfect
straight line upwards (a strong correlation). This is also unfortunate because objective 6
states that the application should be scalable and distributable by handling large numbers of
user requests. During development, however, Apache Spark, which would have solved this
issue, could not be implemented, so the application is not beyond saving; it just needs to be
upgraded in the future to support Apache Spark.
5. Conclusion
5.1 Summary of Key Issues
One of the first key issues, considered quite early on, was how to gather photo data for the
five different species of spider. At first it seemed plausible to gather primary data by
photographing these spiders, as they are fairly common. However, this is much more
difficult than one would suspect, due to not having a good quality camera, not being able to
find the spiders, and not having them stay completely still. As a result, this paper has had to
use secondary source images, and permission was requested where required. This does have
a benefit for the project, as the images are much clearer, most of them being professionally
taken.
Another issue this project faced was trying to make the application distributable by
implementing Hadoop or Spark. Hadoop was tried first, but after many attempts it became
clear it has little support for images, and things got confusing very quickly. Spark,
meanwhile, has no C++ support, so the application was rewritten in Java and Spark was
successfully installed on the server, but unfortunately it could not be utilised. As a
consequence, multiple requests take proportionally longer, as the CPU usage is so high. In a
commercial environment, the decision would be made not to release the application to the
public until this was fixed.
A third key issue tackled during this project was the setup of the server, as little was known
beforehand about creating a LAMP setup. A lot of research and tinkering with the backend
took place before it started working. The hardest part was setting up Apache, but it was well
worth it, as this knowledge can be used and shared on future projects too.
5.2 Objectives
Objective number one was to provide an extensive literature review on feature tracking
algorithms and what has already been accomplished. This paper goes in depth explaining
how three of these algorithms work in section 2.3, and what has already been done in section
2.2. As a result, this objective has been met.
The second objective was to research and provide a suitable set of spiders, and information
about them, for use in the application. This is shown in the literature review (section 2.1)
and in the screenshots of the final application in Appendix C, where the spider statistics are
returned. This objective was also met.
The next objective was to create a prototype application to recognise shapes, so it could be
further developed at a later stage to recognise spiders. This was successfully created, even
with a small tweak to identify a wasp or a ladybird, although this version was ultimately not
needed. This objective was met with no problems.
The fourth objective was to test each algorithm with a small set of images to determine its
average accuracy and time. The results are shown in section 3.3, with ORB the winner. This
objective was therefore also completed.
The fifth objective, to produce a complete web application that determines which species of
spider has been submitted and brings back information about it, has also been successfully
completed, as proven in the results section of this paper.
The sixth objective states that the finished application would be built upon further to allow
for a scalable solution, so large numbers of users could use it at once while getting results
back in less than 6 seconds. Unfortunately, multiple attempts were made but none
succeeded, so this objective has not been met.
5.3 Possible Improvements and Future Research
The first definite improvement would be the ability for the application to handle large
numbers of user requests in a timely fashion. This was attempted in this project but failed,
so getting it working would be a real relief.
Another improvement that would be considered for future development is the ability for
users to add their submitted spider image to the collection of server-side images. This would
help the application become more accurate and would increase the certainty. There is a
disadvantage, however, as a moderator would have to be employed to make sure each image
is of the correct spider and that no bad images make it onto the server. Completion time
would not be a problem, as long as Apache Spark or Hadoop were being utilised.
A research area that would be interesting and useful to look into in the future is how
artificial intelligence (AI) could help with identifying animals: for example, could a program
distinguish the legs of an animal and make deductions such as 'it has 8 legs so it must be a
spider' and 'those legs have this pattern so it is this type of spider'?
Whatever the future holds for image recognition, it is certainly well established in this
modern age and will only continue to improve, solving problems we all long to make easier.
6. References
Akakin, H., Kong, H., and Sarma, S. (2013). A Generalised Laplacian of Gaussian Filter for Blob Detection and Its Applications. IEEE Transactions on Cybernetics, 43.6, pp.408-418.

Alahi, A., Ortiz, R., and Vandergheynst, P. (2012). FREAK: Fast Retina Keypoint. IEEE International Conference on Computer Vision and Pattern Recognition, pp.510-517.

Alcaniz, M., Botella, C., Breton, J., Brotons, D., Burkhardt, J., Lopez, J., and Ortega, M. (2013). The Therapeutic Lamp: Treating Small-Animal Phobias. IEEE Computer Graphics and Applications, 33.1, pp.80-86.

Alex, A., Asari, V., and Matthew, A. (2013). Local Difference of Gaussian Binary Pattern: Robust Features for Face Sketch Recognition. IEEE International Conference on Systems, Man, and Cybernetics, pp.1211-1216.

Alpers, G., Gerdes, A., and Uhl, G. (2009). Spiders are special: fear and disgust evoked by pictures of arthropods. Evolution and Human Behavior, 30.66.

Battiato, S., Gallo, G., Puglisi, G., and Scellato, S. (2007). SIFT Features Tracking for Video Stabilization. IEEE International Conference on Image Analysis and Processing, pp.825-830.

Bay, H., Ess, A., Van Gool, L., and Tuytelaars, T. (2006). Speeded-Up Robust Features (SURF). Computer Vision and Image Understanding, 110.3, pp.346-359.

Bolger, D., Farid, H., Lee, D., Morrison, T., and Vance, B. (2012). A computer-assisted system for photographic mark-recapture analysis. Methods in Ecology and Evolution, 3.5, pp.813-822.

Boucherkha, S., Chikhi, S., and Meskaldji, K. (2009). Colour Quantization and its Impact on Color Histogram Based Image Retrieval. NDT'09 First International Conference on Networked Digital Technologies, pp.515-517.

Bradski, G., Konolige, K., Rabaud, V., and Rublee, E. (2011). ORB: an efficient alternative to SIFT or SURF. 2011 IEEE International Conference on Computer Vision, pp.2564-2571.
Cabrera, J. (2009). Windows vs. Linux: A Comparative Study. [Online] Available at: http://zach.in.tu-clausthal.de/teaching/werkzeuge_literatur/LinuxvWindows.pdf [Last Accessed 22nd April 2015].

Calonder, M., Fua, P., Lepetit, V., and Strecha, C. (2010). BRIEF: Binary Robust Independent Elementary Features. ECCV'10 11th European Conference on Computer Vision, 6314, pp.778-792.

Cassandra. (2015). Welcome to Apache Cassandra. [Online] Available at: http://cassandra.apache.org/ [Last Accessed 12th March 2015].

Chli, M., Leutenegger, S., and Siegwart, R. (2011). BRISK: Binary Robust Invariant Scalable Keypoints. IEEE International Conference on Computer Vision, pp.2548-2555.

Drummond, T., and Rosten, E. (2005). Fusing Points and Lines for High Performance Tracking. IEEE International Conference on Computer Vision, 2, pp.17-21.

Drummond, T., and Rosten, E. (2006). Machine learning for high-speed corner detection. ECCV'06 Proceedings of the 9th European Conference on Computer Vision, 1, pp.430-443.

Drummond, T., Porter, R., and Rosten, E. (2008). Faster and better: a machine learning approach to corner detection. Transactions on Pattern Analysis and Machine Intelligence, 32.1, pp.105-119.

Emilian Postolache. (2013). Search By Image. [Computer Program]. Available at: https://play.google.com/store/apps/details?id=net.kaos.reverse [Last Accessed: 31st March 2015].

Frederick, J. (2014). PHP vs ASP.NET? What you should really be comparing instead... [Online] Available at: https://www.linkedin.com/pulse/2014111418263712880086-php-vs-asp-net-what-you-should-really-be-comparing-instead [Last Accessed 22nd April 2015].

Future Mobile. (2015). Search Image. [Computer Program]. Available at: https://play.google.com/store/apps/details?id=com.futuremobile.searchimage [Last Accessed: 31st March 2015].
Gilat, A., (2011). MATLAB An Introduction with Applications. 4th ed. New York: John
Wiley & Sons.
Google. (2012). Google Image Search API (Deprecated), [Online] Available at:
https://developers.google.com/image-search/?hl=it [Last Accessed: 31st Match 2015].
Google. (2015). Reverse Image Search, [Online] Available at:
https://support.google.com/websearch/answer/1325808?hl=en-GB [Last Accessed:
31st Match 2015].
Guo, H. (2011). A Simple Algorithm for Fitting a Gaussian Function. IEEE Signal
Processing Magazine, 28.5, pp.134-137.
Hadoop. (2015). Welcome to Apache Hadoop!, [Online] Available at:
https://hadoop.apache.org/ [Last Accessed: 22nd April 2015].
Hu, X., Tang, Y., and Zhang, Z. (2008). Video object matching based on SIFT
algorithm. IEEE International Conference on Neural Networks and Signal Processing,
pp.412-415.
Huang, T., Jansen, P., Kays, R., Wang, J., Wang, T., and Yu, X. (2013). Automated
identification of animal species in camera trap images. Journal on Image and Video
Processing 2013, 1.52.
Isik, S., and Ozkan, K. (2014). A Comparative Evaluation of Well-known Feature
Detection and Descriptors. International Journal of Applied Mathematics, Electronics,
and Computers, 3.1.
Jaffe, D. (2005). LAMP Quickstart for Red Hat Enterprise Linux 4 [PDF] Available at:
http://www.dell.com/downloads/global/solutions/lamp_quickstart_rhel4.pdf [Last
Accessed 22nd April 2015].
Kim, Y., Moon, I., Oh, C., and Park, J. (2014). Performance Analysis of ORB Image
Matching Based on Android. International Journal of Software Engineering and Its
Applications, 8.3, pp.11-20.
Laerd Statistics. (2013). Histograms. Lund Research, [Online] Available at:
https://statistics.laerd.com/statistical-guides/understanding-histograms.php [Last
Accessed: 26th March 2015].
Laganiere, R. (2011). OpenCV 2 Computer Vision Application Programming
Cookbook. [e-book] Birmingham: Packt Publishing. Available at:
https://books.google.co.uk/books?id=OC7jc8zWjlkC&printsec=frontcover [Last
Accessed: 12th March 2015].
Lakshman, A., and Malik, P. (2010). Cassandra: a decentralized structured storage
system. ACM SIGOPS Operating Systems Review, 44.2, pp.35-40.
Lowe, D.G. (2004). Distinctive Image Features from Scale-Invariant Keypoints.
International Journal of Computer Vision [Online], 60.2, pp.91-110. Available at:
https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf [Last Accessed: 26th March 2015].
Microsoft. (2015). What is Microsoft Azure?, [Online] Available at:
http://azure.microsoft.com/en-gb/overview/what-is-azure/ [Last Accessed: 22nd April
2015].
Mikoluk, K. (2013). PHP vs. ASP.NET: Costs, Scalability and Performance,
[Online]. Available at: https://blog.udemy.com/php-vs-asp-net/ [Last Accessed: 22nd
April 2015].
Moler, C. (2004). Origins of MATLAB, MathWorks. [Online Video Speech].
Available at: http://uk.mathworks.com/videos/origins-of-matlab-70332.html [Last
Accessed: 12th March 2015].
Moura, J., Pires, B., and Singh, K. (2011). Approximating Image Filters With Box
Filters. IEEE International Conference on Image Processing, pp.85-88.
Namit, G., and Xu, A. (2008). SURF: Speeded Up Robust Features. McGill
University [Online]. Available at: http://www.cim.mcgill.ca/~siddiqi/COMP-558-2008/AnqiGaurav.pdf
[Last Accessed: 26th March 2015].
Natural History Museum, (2013). Spiders in your home. NHM, [PDF] Available at:
http://www.nhm.ac.uk/resources-rx/files/spiders-in-your-home-id-guide-133363.pdf
[Last Accessed: 27th March 2015].
NodeJs. (2015). Nodejs, [Online] Available at: https://nodejs.org/ [Last Accessed:
22nd April 2015].
Ofir, P. (2009). SIFT – The Scale Invariant Feature Transform. [PDF] Berlin.
Available at: http://www.inf.fu-berlin.de/lehre/SS09/CV/uebungen/uebung09/SIFT.pdf
[Last Accessed: 26th March 2015].
OpenCV., (2015). About [Online Documentation]. Available at:
http://opencv.org/about.html [Last Accessed 12th March 2015].
Oxford Dictionaries. (2015). Centroid. Oxford University Press [Online] Available at:
http://www.oxforddictionaries.com/definition/english/centroid [Last Accessed: 26th
March 2015].
Oyallon, E., and Rabin, J. (2013). An analysis and implementation of the SURF
method, and its comparison to SIFT. Image Processing On Line, Preprint. Available
at: http://www.ipol.im/pub/pre/69/ [Last Accessed: 27th March 2015].
Panchal, P., Panchal, S., and Shah, S. (2013). A Comparison of SIFT and SURF.
International Journal of Innovative Research in Computer and Communication
Engineering, 1.2.
Pederson, J. (2011). Study group SURF: Feature detection & description. [PDF].
Available at: http://cs.au.dk/~jtp/SURF/report.pdf [Last Accessed: 26th March 2015].
Pele, O., and Werman, M. (2010). The Quadratic-Chi Histogram Distance Family.
ECCV’10 11th European Conference on Computer Vision, 6312, pp.749-762.
Puppet Labs. (2015). What is Puppet?, [Online] Available at:
https://puppetlabs.com/puppet/what-is-puppet [Last Accessed: 22nd April 2015].
SOFTDX, (2014). PicFinder – Image Search. [Computer Program]. Available at:
https://play.google.com/store/apps/details?id=com.softdx.picfinder [Last Accessed:
31st March 2015].
Spark. (2015). Apache Spark, [Online] Available at: https://spark.apache.org/ [Last
Accessed: 22nd April 2015].
W3techs.com. (2015). Usage Statistics and Market Share of Operating Systems for
Websites [Online] Available at:
http://w3techs.com/technologies/overview/operating_system/all [Last Accessed: 22nd
April 2015].
Wagener, A., and Zettle, R. (2011). Targeting Fear of Spiders With Control-,
Acceptance-, and Information-Based Approaches. The Psychological Record, 61.1.
7. Bibliography
• Ahmad, R., Al-Qershi, O., Hamid, N., and Yahya, A. (2012). A Comparison between
Using SIFT and SURF for Characteristic Region Based Image Steganography.
International Journal of Computer Science, 9.3.
• Borghesani, D., Cucchiara, R., Grana, C., and Manfredi, M. (2013). A fast approach
for integrating ORB descriptors in the bag of words model. Multimedia Content and
Mobile Devices, 8667, pp.
• Chrome. (2015). Device Mode & Mobile Emulation [Online Documentation]
Available at: https://developer.chrome.com/devtools/docs/device-mode [Last
Accessed: 21st April 2015].
• Draper, B., and O’Hara, S. (2010). Introduction to the Bag of Features paradigm for
image classification and retrieval [PDF] Available at:
http://arxiv.org/pdf/1101.3354.pdf [Last Accessed: 23rd April 2015].
• Dubuisson, S. (2010). The computation of the Bhattacharyya distance between
histograms without histograms. Image Processing Theory Tools and Applications,
pp.373-378.
• George, L., Hadi, R., and Sulong, G. (2014). Vehicle Detection and Tracking
Techniques: a concise review. Signal and Image Processing: An International
Journal, 5.1.
• Lajoie, J., Lippman, S., and Moo, B. (2013). C++ Primer. 5th edition. Addison-Wesley.
• Meade. (2009). Image Comparison – fast algorithm, StackOverflow [Online]
Available at: http://stackoverflow.com/questions/843972/image-comparison-fast-algorithm
[Last Accessed: 22nd April 2015].
• OpenCV. (2015). Feature Detection and Description [Online Documentation]
Available at:
http://docs.opencv.org/modules/features2d/doc/feature_detection_and_description.html
[Last Accessed: 19th April 2015].
8. Appendix
A – Server Side Images
1. Chris Buddle, http://thebuggeek.com/2012/06/
2. Silvia Reiche, http://www.silviareiche.com/spiders.html
3. G. Bradley, http://www.uksafari.com/zebraspiders.htm
4. Foxglove Covert LNR, http://www.foxglovecovert.org.uk/blog/zebra-spider/
5. Arrowguard, http://www.arrowguard.co.uk/pest-types/spiders/
6. Alamy, http://www.theguardian.com/environment/shortcuts/2014/sep/23/hairy-scary-lethal-hunters-how-dangerous-household-spiders
7. Alamy, http://www.theguardian.com/commentisfree/2014/sep/23/horny-male-spiders-scare-stories
8. Sam Carr, http://justthesam.com/2009/09/house-bath-spider-tegenaria-duellica-aka-tegenaria-gigantea/
9. Ray Wilson, http://www.raywilsonbirdphotography.co.uk/Galleries/Invertebrates/000_invert_images/Arachnid_images/2006-09-22_JS8Q4766_Araneus_diadematus.jpg
10. Planet Earth, http://planetearth.nerc.ac.uk/images/uploaded/custom/orb-web-spider.jpg
11. Andy Horton, http://www.glaucus.org.uk/Spider040.jpg
12. Jana Ripley, http://www.spiderzrule.com/spider053/100_1725_small.jpg
13. Granit, http://www.indianetzone.com/photos_gallery/31/GardenHarvestmen_22220.jpg
14. Vengolis, http://commons.wikimedia.org/wiki/File:Tailed_daddy_longlegs_spiders.jpg
15. Dalavich, http://commons.wikimedia.org/wiki/File:Harvestman_Spider.JPG
16. Gordon England, http://www.photography.gordonengland.co.uk/photo-gallery2/d/2416-6/harvestman-5282.JPG
17. Big Scary, https://c1.staticflickr.com/3/2451/3600845719_ba9d199db2.jpg
18. David Fenwick, http://www.aphotofauna.com/images/spiders/spider_amaurobius_fenestralis_08-06-13_1.jpg
19. Graham Calow, http://warehouse1.indicia.org.uk/upload/p17qea3hom10gpvbh1m2j8sq1ms94.jpg
20. Graham Calow, http://warehouse1.indicia.org.uk/upload/p189dqkf4h16gf106n1vhgdbm1mrsn.jpg
B – Test Images
1. Dylan Wrathall, http://2.bp.blogspot.com/kKaa3VqeZ58/UcAs_8VH0xI/AAAAAAAAEZs/4n_voc4-7N8/s1600/Zebra+Spider1+.jpg
2. BJ Schoenmakers, http://upload.wikimedia.org/wikipedia/commons/c/c6/Salticus_scenicus_%28zebra_spider%29%2C_Arnhem%2C_the_Netherlands.JPG
3. Dereila, http://www.dereilanatureinn.ca/woodlands/spiders-glance/imgs/24spider%201.jpg
4. Unknown, http://www.callnorthwest.com/wp-content/uploads/2012/12/House-Spider-247x300.jpg
5. Derek Hobson, https://primitivescrewheads.files.wordpress.com/2015/04/conversation-with-the-mrs.jpg?w=300&h=225
6. Adam Miller, http://www.lancing-nature.bn15.net/natimages/spiders/spider6902.jpg
7. Stockport Nature.Com, http://www.reddishvalecountrypark.com/communities/1/004/009/311/871/images/4553739050.swf
8. Carol Horner, http://m5.i.pbase.com/v3/64/512164/1/50204245.IMG_8528OrbWeaverSpider.jpg
9. Bob Kingsley, http://www.myminnesotawoods.umn.edu/wp-content/uploads/2013/06/orb-weaver-spider-flickr-bob-kingsley-e1370962184796.png
10. The Backyard Arthropod Project, http://somethingscrawlinginmyhair.com/wp-content/uploads/2008/08/full-leg-spread-harvestman1.jpg
11. Unknown, http://www.animalphotos.me/spider/spider-harv_files/harvestman.jpg
12. David Fenwick, http://www.aphotofauna.com/images/spiders/spider_dicranopalpus_ramosus_harvestman_08-08-13_2.jpg
13. Graham Calow, http://warehouse1.indicia.org.uk/upload/med-p17pgpvhjm15t12fu4i1vpk17hs5.jpg
14. Graham Calow, http://warehouse1.indicia.org.uk/upload/med-p18k7vq3ajed51t48qgt1nca1gor4.jpg
15. Unknown, http://news.bbcimg.co.uk/media/images/46439000/jpg/_46439615_truewindowlaceweaver(c)perrycornishwww.phocus-on.co.uk.jpg
C – Test Result Images
1 – Mobile, Android
2 – Tablet
3 – Desktop, Chrome
4 – iOS
5 – Internet Explorer
6 – Firefox
D – Further Investigation