MODELING AND QUERYING ALTERNATIVE PATHS IN KUANTAN
MAZLINA BT MOHAMAD SALLEH
A project report submitted in partial fulfillment
of the requirements for the award of the degree of
Master of Science (Computer Science)
Faculty of Computer Science and Information System
Universiti Teknologi Malaysia
NOVEMBER 2007
“I declare that I have read this project and in my opinion this project has
satisfied the scope and quality for the award of the degree of Master of Science
(Computer Science)”
Signature: ______________________
Name of Supervisor I: Assoc. Prof. Dr. Naomie Salim
Date: 26 NOVEMBER 2007
I declare that this project report is the result of my own research except as cited in
the references. The project report has not been accepted for any degree and is not
concurrently submitted in candidature of any other degree.
Signature: ……..………………………………….
Name of candidate: MAZLINA BT. MOHAMAD SALLEH
Date: 26 NOVEMBER 2007
To my beloved mom and dad, brothers, sister and friends, thanks for the endless
love and care,
to the most beloved kids, Daneal, Lissa and Azfar, you’re the light of my life
and a very special thanks
to my supervisor, Assoc. Prof. Dr. Naomie Salim, for the encouragement and support
ACKNOWLEDGEMENTS
First and foremost, I would like to thank ALLAH s.w.t for all the achievements that
I have gained today. I would like to express my gratitude to my supervisor,
Associate Professor Dr. Naomie Salim, for the attention, guidance, encouragement and
patience throughout this study. Not forgetting the other examiners, Dr.
Siti Zaiton Mohd Hashim and Dr. Ismail Mat Amin, for their comments and guidance
on this project, En Shafry for the ideas, and all the lecturers: please accept my
heartfelt thanks and sincere gratitude for your valuable kindness, and may Allah
reciprocate all your good deeds in the best way. I would also like to extend my
deepest appreciation to the staff at Jabatan Ukur dan Pemetaan Pahang, the staff at
Kuantan Municipal Council (GIS Department), the staff at Pahang Tourism Centre, and
En. Fikri Ismail and En. Firdaus Ali from Politeknik Sultan Haji
Ahmad Shah (Polisas) for sharing information, knowledge, ideas
and comments. To my family and friends, the endless love will remain forever and
only Allah can bless your kindness.
ABSTRACT
Route finding based on geodetic data addresses the growing data
management and analysis needs of spatial applications such as Geographic
Information Systems (GIS). Spatial databases are prominently used in GIS
applications such as digital maps. This study discusses the process of
modeling road network data from a map, consisting of points that include the
starting and ending points of roads and their intersections with other road
segments. The model enables the storage of a spatial dataset, or geographic
information system (GIS) data, for calculating the distance between points.
The route finding solution takes the distance information in the form of a
directed graph and works from a starting (source) node and an ending
(destination) node for the desired paths. The graph theory algorithm used in
this study is adapted from Floyd’s approach and combines the breadth-first
search (BFS) and depth-first search (DFS) strategies. The Structured Query
Language (SQL) is used for querying the database structures. A browser
interface for the system makes information dissemination easier.
ABSTRAK
Route finding based on geographic data has revealed many analysis needs
related to spatial applications such as Geographic Information Systems (GIS).
Spatial databases are widely used in GIS applications such as digital map
applications. This study discusses the process of modeling road network data
from map information that contains locations or points, such as the starting
and ending points of a road and the junctions between one road and another, so
that the information can be stored as a spatial dataset and distances can be
calculated between the required points. The route finding solution uses the
distance of each road segment from one point (node) to another in the form of
a directed graph. The graph theory algorithm used is based on Floyd’s approach
combined with breadth-first and depth-first search strategies. This study also
uses the Structured Query Language (SQL) for the database structure. A browser
interface eases the delivery of information and feedback to users.
TABLE OF CONTENTS
DECLARATION
DEDICATION
ACKNOWLEDGEMENTS
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER 1 OVERVIEW
1.1 Introduction
1.2 Problem Background
1.3 Problem Statement
1.4 Objectives
1.5 Project Scope
1.6 Significance of Study
1.7 Summary
CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
2.2 Graph Theory
2.2.1 Graph Characteristics and Models
2.2.2 Road Network Modelling
2.2.3 Graphs in Database
2.3 Path Algorithm
2.4 Searching Technique Analysis
2.4.1 Breadth-First Search (BFS)
2.4.2 Depth-First Search (DFS)
2.4.3 Best First Search
2.5 Structured Query Language
2.6 Spatial Database
2.7 Digital Map
2.8 Kuantan Digital Map
2.9 Discussion
CHAPTER 3 METHODOLOGY
3.1 Introduction
3.2 Research Framework
3.2.1 Extracting map information
3.2.2 Modeling selected data
3.2.3 Selecting Suitable Query and Algorithm
3.2.4 Result Analysis
3.3 Instrumentation
3.3 Conclusion
CHAPTER 4 RESULTS
4.1 Introduction
4.2 Initial Findings
4.3 Experiments
4.4 Findings
4.5 Conclusion
CHAPTER 5 CONCLUSION
REFERENCES
APPENDICES
Appendix A – List of nodes (places)
Appendix B – List of edges (source and destination)
LIST OF TABLES
TABLE NO.    TITLE
2.1    Terms in Graph
3.1    Sample data from table location
3.2    Sample data from road_graph
3.3    List of hardware used
4.1    Sample of retrieval query in details
4.2    Sample of retrieval query in details
4.3    Summary of 8 query processing
LIST OF FIGURES
FIGURE NO.    TITLE
2.1    Single and double direction road representation
2.2    Breadth-first Search (BFS)
2.3    Depth-first Search (DFS)
2.4    Example of topological data
2.5    Example of data storage for arc
2.6    Google facilities in assisting direction findings
3.1    Stages in database design
3.2    Research Framework
3.3    Kuantan Map
3.4    Map of Kuantan town
3.5    Kuantan town network road
3.6    Nodes and arc representation
3.7    Sample query from user through browser
4.1    Sample of data derived
LIST OF ABBREVIATIONS
GIS      Geographical Information System
SQL      Structured Query Language
RDBMS    Relational Database Management System
JUPEM    Jabatan Ukur dan Pemetaan Pahang
BFS      Breadth-First Search
DFS      Depth-First Search
CHAPTER 1
OVERVIEW
1.1 Introduction
Routes become an interesting topic when associated with travel.
Routes are generally associated with maps, road network models and geographic data.
A great number of studies have developed techniques for finding alternative
routes, the ‘best’ routes or the optimal routes within a road network,
depending on the weights given. For example, people tend to use the same
route to travel from home to the office because of the road length or other
conveniences, but sometimes they have to use alternative routes because of events
such as traffic congestion, construction work or road damage. In pre-trip planning,
for example, drivers may take one path for the outbound trip and another
for the return, for reasons such as different sightseeing or attending
functions at a location that lies on the selected return path. Then come the
questions: Which way should be taken? How can I get there along that path?
Which route is the shortest from A to B?
Route finding can be implemented by applying a set of algorithms based on
mathematical graph theory. This study presents a model of a road network
based on a graph structure stored in a MySQL database. The basic road data is
represented using nodes as location points and edges (or arcs) as paths. Each
path has a source node as its starting point and an end node as its destination.
This study also embeds spatial data such as the position (x and y) of certain
locations. With the development of geographic information system (GIS)
technology, network transportation analysis within a GIS environment has become
a common practice in many application areas (Roozbeh et al., 2003). For this
kind of query, the user has a definite destination in mind and wants the
optimal route leading to it. The queries can vary and can answer questions such
as “Which is the shortest route from Kuantan Airport to Cherating?” or “How many
routes can be taken from Teruntum Complex to Teluk Chempedak?”.
The motivation for this study came from the desire to manipulate spatial GIS
data using database technology and graph theory. This study describes one such
extension, where database
technology is used to implement path queries over a graph view of relational data.
Partial-path information is pre-computed and stored using Structured Query
Language (SQL). Path querying is implemented using SQL functions, so
the retrieved path tables can be manipulated within SQL queries in the same way as
standard relational tables.
1.2 Problem Background
Finding paths or routes has become an area of interest within GIS and spatial
database applications. Direction-finding techniques serve many applications,
for example in transportation networks and travel guidance. Most path-finding
applications, such as map applications, focus on finding the best or the
shortest path. Well-known web services such as Google or Yahoo retrieve paths
with a shortest path algorithm, where the best path is the one with the minimal
distance. In a normal route-finding situation, there can be more than one path
from one location to another, for example from location A to location B.
Suppose there are five paths: one could be the shortest in terms of distance,
another the fastest in terms of driving time, another could have free-flowing
traffic but cover a longer distance than the others, and the rest might offer
good sightseeing. Offering all of them lets users choose the path that suits
their needs and requirements throughout the trip, and assists road users in
making cost-effective decisions about time and distance.
As mentioned above, the algorithm used by most web services for path
finding is based on the minimal distance, which is considered the ‘best’ distance,
whereas consideration should also be given to users who would like to know all
the possible paths between a set of locations. In recent decades, road networks
have become complex and congested, which affects people’s convenience. This
situation motivated this study to find alternative solutions for users in path
finding using geodetic data.
According to JUPEM (Pahang branch), the GIS application for path finding on the
Kuantan digital map tends to give a single solution, based on
main roads and minimal distance. For example, to find a path from Teruntum Complex
to Teluk Chempedak, the system computes the ‘best’ path as the one that
has high priority, such as a main road, and minimal length, whereas there
could be several paths for the user to select from. This situation also
motivated this study to provide flexible path choices from a set of selected
locations.
1.3 Problem Statement
This research studies the feasibility and effectiveness of using the query
support in a relational database management system (RDBMS) for retrieving
routes or alternative paths between two entities, with adaptations of path
algorithms described in the literature. In this research, a framework for data
modeling, algorithm selection and query retrieval is developed.
1.4 Objectives
1. To build a data model for storing Kuantan map information on the road
network.
2. To apply a graph algorithm for finding alternative paths and their lengths
between two locations using the Floyd-Warshall approach on a spatial
dataset.
1.5 Project Scope
1. Storage and retrieval are supported by a relational
database management system (RDBMS) and Structured Query Language
(SQL).
2. The data model is based on a mathematical graph theory approach.
3. The graph theory algorithm is an adaptation of Floyd’s approach
embedded in the SQL schema.
4. The dataset and study area are based on part of the Kuantan town map,
consisting of 24 location points and 61 directed edges.
1.6 Significance of Study
This study supports the use of mathematical graph theory and
algorithms in path finding over a database. It is hoped to be
a contribution to data storage applications and information retrieval for
local geographic datasets. The algorithm for finding alternative paths
can be an additional function for existing path-finding applications, especially
for the Kuantan dataset.
1.7 Summary
This report consists of five chapters. This first chapter presents an overview
of the project, comprising a general introduction, the problem background and
the scope of the study. The problem background describes the current situation of
path retrieval in general and in Kuantan map applications, and thus explains the
motivation for this study. Chapter 2 reviews the literature related to the study,
Chapter 3 presents the project methodology and workflow, Chapter 4 analyzes
the findings and Chapter 5 gives the conclusion and suggestions for future work.
CHAPTER 2
LITERATURE REVIEW
2.1 Introduction
This chapter discusses the literature that sets the background of this
study. It reviews graph theory and its relation to road
network modeling, spatial database modeling, the foundations of path algorithms,
SQL usage and digital maps.
2.2 Graph Theory
The construction of the data is based on mathematical graph
theory. A network in graph theory is defined as a directed or undirected
graph and can be written as G = (N, A), consisting of:
• N: a set of nodes or vertices, consisting of discrete points.
• A: a set of edges or arcs, consisting of connections between the vertices, which
can be either directed or undirected and are always associated with numerical values.
Together with this structural definition, algorithms generally also need to know
the properties of these elements. For example, the length, travel time or general
cost of every edge needs to be known. Mathematically, this is denoted as a function
l which maps edges to real numbers, l: A -> R. For example, the length of an arc
connecting nodes i and j can be denoted as l(i, j). This is used to decide
how far one vertex is from another (Ahuja et al., 1993).
According to Celko (2004), graph theory is a branch of topology, the
study of geometric relations, and can be ideal for modeling hierarchies such as
family trees. Examples of applications of graph principles are organizational
charts, language rules and route maps. Figure 1 shows an example of a simple graph
consisting of nodes and the edges between them.
Figure 1 : A Simple Graph
Let the set of nodes in Figure 1 be N, the set of edges be L and the graph be G.
Then the ordered pair {N, L} can be defined as:
N = {A, B, C, D, E, F},
L = {AC, CD, CF, BE},
G = {N, L}
2.2.1 Graph Characteristics and Models
The terms in Table 2.1 are useful when referring to the graph application in this
study.
Term: Description
Nodes and edges: Two nodes are adjacent if there is an edge between them, and two
edges are adjacent if they connect to a common node. In a directed graph, the
number of edges entering a node is its in-degree and the number leaving is its
out-degree.
Path and cycle: A connected sequence of edges is a path; its length is the number
of edges traversed. Two nodes are connected if there is a path between them. If
there is a path connecting every pair of nodes, the graph is a connected graph. A
path in which no node repeats is a simple path, and a path which returns to its
own origin without crossing itself is a cycle or circuit. A graph with multiple
paths between at least one pair of nodes is reconvergent. A reconvergent graph may
be cyclic or acyclic.
Traversing graphs: There are two main approaches, breadth-first and depth-first.
Breadth-first traversal visits all of a node's siblings before moving on to the
next level, and typically uses a queue. Depth-first traversal follows edges down
to the leaves and back before proceeding to siblings, and typically uses a stack.
Sparsity: A graph in which the number of edges approaches the maximum, N², is
dense. When the number of edges is only a small multiple of N, much smaller than
N², the graph is considered sparse.
Trees: A tree is a connected graph with no cycles. It is also a graph where the
in-degree of the root node is 0 and the in-degree of every other node is 1. A tree
where every node has out-degree <= 2 is a binary tree. A forest is a graph in
which every connected component is a tree.
Euler paths: A path which traverses every edge in a graph exactly once is an Euler
path. An Euler path which is a circuit is an Euler circuit.
Table 2.1 Terms in Graph
2.2.2 Road Network Modeling
The modeling of road networks is strongly connected with graph theory
(Hofnman et al., 2003), and a road network can be represented as a directed graph.
According to Garofalakis et al. (2006), it is possible to design a graph that
corresponds to a road network by using the GIS data that represent the road
network of a city map.
According to the fundamental definitions of graph theory from Gibbons (1985),
a graph, or specifically a directed graph, can represent a map that depicts the road
network of a place such as a city or country. A directed graph or digraph G can
be defined as an ordered pair G := (N, E) with a set of nodes N with |N| = n and a
set of ordered pairs of nodes E with |E| = m, called directed edges. Each node of
the graph represents an intersection or terminal point of the roads. Each edge from
one node to another represents a directed link between two adjacent intersections
or between an intersection and a terminal point. A single-direction (one-way) road
is represented with one directed edge in that direction, while a two-way road is
represented with two edges, one in each direction, as shown in Figure 2.1.
Figure 2.1 Single and double direction road representation
A graph also carries a weight or cost for each edge. In this study, we assume
that the cost is the length of the edge from one node to
another. A path from a starting node to a destination node is a sequence
of adjacent nodes, and adding up the costs of all edges on the path
gives its total cost or length.
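The representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node names, the lengths (in metres) and the helper names add_road and path_length are hypothetical and are not taken from the Kuantan dataset.

```python
from typing import Dict, List

# Directed, weighted graph: graph[u][v] is the length of the edge u -> v.
Graph = Dict[str, Dict[str, float]]


def add_road(graph: Graph, a: str, b: str, length: float, two_way: bool = True) -> None:
    """Add a road from a to b; a two-way road becomes two directed edges."""
    graph.setdefault(a, {})[b] = length
    graph.setdefault(b, {})
    if two_way:
        graph[b][a] = length


def path_length(graph: Graph, path: List[str]) -> float:
    """Add up the edge costs along a sequence of adjacent nodes."""
    total = 0.0
    for u, v in zip(path, path[1:]):
        if v not in graph.get(u, {}):
            raise ValueError(f"no directed edge from {u} to {v}")
        total += graph[u][v]
    return total


# Hypothetical junctions and lengths, for illustration only.
road_graph: Graph = {}
add_road(road_graph, "A", "B", 400.0)                 # two-way road
add_road(road_graph, "B", "C", 250.0, two_way=False)  # one-way road
```

With this data, path_length(road_graph, ["A", "B", "C"]) adds 400.0 and 250.0, while the reverse path is rejected because the road from B to C is one-way.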
2.2.3 Graphs in Databases
According to Erwig and Gutting (1994), each data model has its own
facilities to represent relationships among objects. The reasons for considering
graphs in a data model are that real-life problems can be expressed directly
in terms of graph concepts (paths, spanning trees) and that most of these
problems can be solved by adapting suitable algorithms. In graph theory, data
can be represented as a series of nodes and edges, where a node
represents an object of interest and an edge indicates a relationship between two
nodes. Nodes and edges then become attributes, fields or objects that can be
modeled using a database schema.
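A minimal sketch of such a schema is given below. The table names location and road_graph follow the tables named in this report, but the column names, the sample rows and the helper neighbours are assumptions made for illustration. SQLite (through Python's standard sqlite3 module) stands in for MySQL so that the example is self-contained.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL database of the study.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (              -- one row per node (hypothetical columns)
    node_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE road_graph (            -- one row per directed edge
    source      INTEGER NOT NULL REFERENCES location(node_id),
    destination INTEGER NOT NULL REFERENCES location(node_id),
    length_m    REAL NOT NULL,
    PRIMARY KEY (source, destination)
);
""")

# Hypothetical sample rows: a two-way road 1 <-> 2 and a one-way road 2 -> 3.
conn.executemany("INSERT INTO location VALUES (?, ?)",
                 [(1, "Place A"), (2, "Place B"), (3, "Place C")])
conn.executemany("INSERT INTO road_graph VALUES (?, ?, ?)",
                 [(1, 2, 400.0), (2, 1, 400.0), (2, 3, 250.0)])


def neighbours(conn: sqlite3.Connection, node_id: int):
    """Return (name, length) for every node directly reachable from node_id."""
    rows = conn.execute(
        "SELECT l.name, r.length_m FROM road_graph r "
        "JOIN location l ON l.node_id = r.destination "
        "WHERE r.source = ? ORDER BY l.name",
        (node_id,),
    )
    return rows.fetchall()
```

Storing each direction as its own row keeps one-way and two-way roads in the same table, so a single query on the source column returns the outgoing edges of a node.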
Examples of datasets used in route planning are those
offered by MapQuest and other GIS applications; for example, the works of
Wong et al. (2004) and Okyere (2000) contributed to studies on using graph
theory for datasets. Graphs are also applied to other kinds of datasets,
such as life sciences or biological data.
According to Stephens et al. (2004), graphs enable complex networks to be
visualized in a straightforward manner that captures the structure of the system,
and they can support hierarchies of information that are well suited to modeling
the different biological levels.
The increasing size and complexity of data sources makes it
more important to manage them as graph representations in a relational database
management system (RDBMS), which offers users the ability to store data in a
secure, highly available and scalable environment.
2.3 Path Algorithm
Path algorithms have been widely used in route-finding applications, for example
to find the shortest routes, optimal routes, the ‘best’ routes and
alternative routes through a traversal search. Sometimes people look for
the best route with respect to a certain weight such as time, cost, distance or any
combination of these. Many studies have discussed the path
algorithms used in various applications. For example, Scott and Bernstein
(2000) discussed a constrained shortest path problem that can be used in
generating alternative paths, and Roozbeh et al. evaluated
three route-finding methods: Dijkstra’s algorithm, heuristic
methods and a genetic algorithm.
According to Saltenis (2001), path problems can be categorized as single
source, single pair and all pairs. Single source means finding paths
from one given source vertex to each of the other (destination) vertices. Single
pair means finding a path between two given vertices, a source and a destination.
All pairs means finding paths for every pair of vertices, normally by applying
a dynamic programming algorithm. This study concentrates on the single pair
variant, given an origin node and a destination node. Given an input
graph, a source node and a destination node, the single pair algorithm returns
the paths between the two nodes by traversing the graph.
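As an illustration of the single pair case, the sketch below lists every simple path (a path in which no node repeats) between a source and a destination, together with its total length. It is a plain depth-first enumeration written for this explanation and is not presented as the algorithm implemented in this study; the graph data and the name all_simple_paths are hypothetical.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]


def all_simple_paths(graph: Graph, source: str, target: str) -> List[Tuple[float, List[str]]]:
    """Return every simple path from source to target as (length, path), shortest first."""
    results: List[Tuple[float, List[str]]] = []

    def visit(node: str, path: List[str], length: float) -> None:
        if node == target:
            results.append((length, list(path)))
            return
        for nxt, cost in graph.get(node, {}).items():
            if nxt not in path:          # never revisit a node on the current path
                path.append(nxt)
                visit(nxt, path, length + cost)
                path.pop()               # backtrack and try the next edge

    visit(source, [source], 0.0)
    return sorted(results)


# Hypothetical directed graph with edge lengths.
demo: Graph = {
    "A": {"B": 2.0, "C": 5.0},
    "B": {"C": 1.0, "D": 4.0},
    "C": {"D": 1.0},
    "D": {},
}
```

For this graph the call all_simple_paths(demo, "A", "D") finds three alternatives, A-B-C-D with length 4.0 and A-B-D and A-C-D with length 6.0 each, so a user can be offered the alternatives as well as the shortest path.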
Hamill and Martin (2003) modified the path-finding algorithms
based on the Hierarchical Encoded Path View (HEPV) by Jing, Huang and
Rundensteiner and on Zhang’s disk-based Dijkstra’s algorithm (diskSP). This
algorithm takes three inputs: the graph, the source node and the destination
node, together with a bound n on the weight between related nodes, so that the
returned paths are reachable from the source node via a path no longer than n. In
each case, the input graph can be generated from ordinary relational data: nodes
correspond to entities in the database and paths correspond to the connections
between entities. This study follows this algorithm as a guideline, but adapts
the approach of the Floyd-Warshall algorithm.
2.3.1 Floyd-Warshall Fundamental
The Floyd–Warshall algorithm is a graph analysis algorithm for finding shortest
paths in a weighted, directed graph. It is
also known as the Roy–Floyd algorithm, since Bernard Roy described it in
1959 (Floyd and Warshall, 1962). The Floyd–Warshall algorithm is an example of
dynamic programming. It compares all possible paths
through the graph between each pair of vertices. To find the shortest path between
a pair of vertices, it incrementally improves an estimate of the shortest path
between the two vertices until the estimate is known to be optimal. The pseudo
code below is the heart of Floyd’s approach and is the basis for finding
alternative paths in this study. It assumes a directed graph G = (V, E), weighted
with edge costs, and considers all pairs of vertices (u, v). All weights are assumed
to be non-negative numbers, and the cost of a path is the sum of the costs of all
edges in the path. A cost c(u, v) is assigned to every possible
pair of vertices u and v in the graph. Let:
c (u, v) = the given edge cost if edge (u, v) exists
c (u, v) = infinity if there is no edge (u, v) in the graph
Assume that the vertices are labeled or indexed using integers ranging from 1 to n,
and let cost[i,j,k] hold the cost of the least-cost path between vertices i and j
with intermediate nodes chosen from vertices 1, 2, …, k. The pseudo code keeps only
a two-dimensional array cost[i,j], which is updated in place for each k. The
traversal checks each vertex for path finding and prints the nodes found starting
from the root node, skipping the second node if it is the same as the first, then
moves to the next node and repeats the same procedure. The pseudo code below was
implemented in the C language.
Floyd-Warshall Pseudo code:
for i := 1 to n do
    for j := 1 to n do
        cost[i,j] := c[i,j];    // let c[u,u] := 0
        next[i,j] := j;
for k := 1 to n do
    for i := 1 to n do
        for j := 1 to n do
            sum := cost[i,k] + cost[k,j];
            if (sum < cost[i,j]) then
                cost[i,j] := sum;
                next[i,j] := next[i,k];
// To write out the path from u to v (assumes cost[u,v] is finite):
w := u;
write w;
while w != v do
    w := next[w,v];
    write w;
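The pseudo code can be turned into a short runnable sketch. The version below is written in Python rather than the C used in this study and indexes vertices from 0 instead of 1. The function names floyd_warshall and write_path and the sample edges are hypothetical, and the check for unreachable pairs is an addition that the pseudo code leaves implicit.

```python
import math
from typing import Dict, List, Optional, Tuple


def floyd_warshall(n: int, edges: Dict[Tuple[int, int], float]):
    """Return (cost, nxt) for a directed graph with vertices 0..n-1."""
    # cost[i][j] starts as the edge cost, 0 on the diagonal, infinity otherwise.
    cost = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    # nxt[i][j] is the vertex that follows i on the best known path to j.
    nxt: List[List[Optional[int]]] = [[None] * n for _ in range(n)]
    for (u, v), c in edges.items():
        cost[u][v] = c
        nxt[u][v] = v
    for k in range(n):                    # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                total = cost[i][k] + cost[k][j]
                if total < cost[i][j]:
                    cost[i][j] = total
                    nxt[i][j] = nxt[i][k]
    return cost, nxt


def write_path(nxt: List[List[Optional[int]]], u: int, v: int) -> List[int]:
    """Rebuild the shortest path from u to v; an empty list means unreachable."""
    if u == v:
        return [u]
    if nxt[u][v] is None:
        return []
    path = [u]
    w = u
    while w != v:
        w = nxt[w][v]
        path.append(w)
    return path


# Hypothetical sample graph: (source, destination) -> length.
sample_edges = {(0, 1): 2.0, (0, 2): 5.0, (1, 2): 1.0, (1, 3): 4.0, (2, 3): 1.0}
cost, nxt = floyd_warshall(4, sample_edges)
```

For the sample edges, cost[0][3] is 4.0 and write_path(nxt, 0, 3) follows the next table through vertices 1 and 2 before reaching 3.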
2.4 Searching Technique Analysis
Graph traversal relies on search strategies to
retrieve the paths required by a query. The choice of
search strategy depends on the goal of the traversal. The
techniques used in this study are discussed below.
2.4.1 Breadth-First Search (BFS)
Breadth-first search (BFS) is a method that traverses a graph by visiting
all the nodes reachable from a given source node, level by level. It should not be
confused with best-first search, which expands the most promising node chosen
according to some rule: Pearl (1984) described best-first search as estimating the
promise of a node n by a heuristic evaluation function f(n) that depends on the
description of n, the description of the goal and the information gathered by the
search up to that point (see Section 2.4.3).
The BFS traversal starts from the source node, which is
assigned level 0. In the first stage, all nodes at level 1 are visited, followed by
the nodes at level 2 in the second stage, and so on for each subsequent level.
BFS searches the graph, visiting every node, until it finds its goal and
terminates. BFS normally labels each node with its distance (the number of
links) from the start node and uses a First In First Out (FIFO) queue to hold the
nodes waiting to be visited. A search sequence is described in Figure 2.2.
Assuming that A is the starting node, the traversal visits nodes B, C and D, as
these three nodes are at the same level. It then goes to node E, a child of
B, and next to node F, a child of C; E and F are at level 2.
The traversal then goes to G, a child of E, which is at level 3 together with H,
and finally ends at node I, the child of H, at the lowest
level.
Figure 2.2 Breadth-first Search (BFS)
The general algorithm of BFS can be written as below:
1. Put the root (source) node in the queue and define the destination node as
the goal.
2. Pull a node from the beginning of the queue and examine it.
a. If the searched goal is found in this node, terminate the search
and return the result.
b. Otherwise, push all the unexamined direct child nodes, if
any, to the end of the queue.
3. If the queue is empty, every node has been examined: quit the search and
return no result.
4. Otherwise, repeat from step 2.
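The steps above can be sketched as follows. The function name bfs_path, the parent map used to rebuild the path and the small sample graph are assumptions made for illustration; the sample graph is hypothetical and is not a full copy of Figure 2.2.

```python
from collections import deque
from typing import Dict, List, Optional


def bfs_path(graph: Dict[str, List[str]], source: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search; returns a path with the fewest edges, or None."""
    queue = deque([source])                 # step 1: the source enters the FIFO queue
    parent: Dict[str, Optional[str]] = {source: None}
    while queue:                            # step 3: an empty queue ends the search
        node = queue.popleft()              # step 2: take the node at the front
        if node == goal:                    # step 2a: goal found, rebuild the path
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for child in graph.get(node, []):   # step 2b: enqueue unexamined children
            if child not in parent:
                parent[child] = node
                queue.append(child)
    return None


# Hypothetical tree-shaped graph.
levels = {"A": ["B", "C", "D"], "B": ["E"], "C": ["F"], "E": ["G"]}
```

Because nodes leave the queue in the order they were added, all nodes at one level are examined before any node at the next level, so the first path found to the goal uses the fewest edges.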
2.4.2 Depth-First Search (DFS)
Depth-first search (DFS) is an algorithm for traversing or searching a tree,
tree structure or graph. In contrast to BFS, DFS starts at a
start node or root and explores as far as possible along each branch before
backtracking. Formally, DFS is an uninformed search that progresses by expanding
the first child node of the search tree and going deeper until a goal node is found
or until it reaches a node that has no children. For example, the traversal starts
at a start node S in G, which becomes the current node. The algorithm then
traverses the graph along any link (u, v) incident to the current node u. If the
link (u, v) leads to an already
visited node v, then the search backtracks to the current node u. If, on the other
hand, the link (u, v) leads to an unvisited node v, the algorithm moves to v, and v
becomes the current node. That is, it picks the next adjacent unvisited node until
it reaches a node that has no unvisited adjacent nodes. The search proceeds in this
manner until it reaches a dead end. At this point, the search starts backtracking,
and the process terminates when backtracking leads back to the start node.
Figure 2.3 shows a DFS applied to an undirected graph, with the nodes labeled in
the order they were explored.
Figure 2.3 Depth-first Search (DFS)
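A recursive sketch of this traversal is shown below. The function name dfs_order and the small undirected graph are hypothetical and are not the graph of Figure 2.3; each undirected link is listed in the adjacency lists of both of its end nodes.

```python
from typing import Dict, List, Set


def dfs_order(graph: Dict[str, List[str]], start: str) -> List[str]:
    """Return the nodes in the order a depth-first search first visits them."""
    order: List[str] = []
    seen: Set[str] = set()

    def explore(u: str) -> None:
        seen.add(u)
        order.append(u)
        for v in graph.get(u, []):
            if v not in seen:      # an already visited node is skipped
                explore(v)         # v becomes the current node
        # returning from explore(u) is the backtracking step

    explore(start)
    return order


# Hypothetical undirected graph: S-A, S-B, A-C, B-C.
undirected = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S", "C"],
    "C": ["A", "B"],
}
```

Starting from S, the search goes down through A and C to B before it backtracks, which gives the visiting order S, A, C, B for this graph.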
2.4.3 Best First Search
The Breadth-First search is able to find a solution without getting trapped in
dead-ends, while the depth-first algorithm finds a solution without computing all of
the nodes. The Best-First search allows us to switch between paths thus gaining the
benefit of both approaches. It is a combination of DFS and BFS, which optimizes
the search at each step by ordering all current adjacent nodes according to their
priority as determined by a heuristic evaluation function. The search then expands
the most promising node which has the highest priority. If the current node
generates adjacent nodes that are less promising, it is possible to choose another at
the same level. In effect, the search changes from depth to breadth. The heuristic
evaluation function predicts how close the end of the current path is to a solution.
Those paths that the function determines to be close to a solution are given priority
and are extended first. A priority queue is typically used to order the paths for
efficient selection of the best candidate for extension. In summary, since the DFS
and BFS exhaustively traverse the entire graph until they find the goal, they are
categorized as uninformed searches. In contrast, the Best-First search utilizes a
heuristic to reduce the search space and is able to find the goal more efficiently and
is categorized as informed search.
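The priority-queue idea can be illustrated with a greedy best-first sketch. The heuristic values, the graph and the name best_first_search below are hypothetical; a lower heuristic value is treated as a higher priority, meaning the node is estimated to be closer to the goal.

```python
import heapq
from typing import Dict, List, Optional


def best_first_search(graph: Dict[str, List[str]], heuristic: Dict[str, float],
                      start: str, goal: str) -> Optional[List[str]]:
    """Greedy best-first search: always expand the node with the lowest estimate."""
    frontier = [(heuristic[start], start)]      # priority queue ordered by estimate
    parent: Dict[str, Optional[str]] = {start: None}
    while frontier:
        _, node = heapq.heappop(frontier)       # most promising node first
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                heapq.heappush(frontier, (heuristic[nxt], nxt))
    return None


# Hypothetical graph and heuristic estimates of the remaining distance to G.
toy_graph = {"S": ["A", "B"], "A": ["G"], "B": ["G"], "G": []}
toy_heuristic = {"S": 3.0, "A": 2.0, "B": 1.0, "G": 0.0}
```

Here B is expanded before A because its estimate is lower, so the search returns the path S, B, G. A greedy search of this kind follows the heuristic only and does not guarantee the shortest path.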
2.5 Structured Query Language
A query language provides the means to access and manipulate data in a
database. Structured Query Language (SQL, sometimes pronounced “sequel”) was
developed by IBM in the 1970s and is now both the de facto and the de jure standard
for accessing relational databases. It is used in three ways: as standalone
queries, from high-level programming languages and embedded in other applications.
SQL is one of the popular query languages for expressing typical
spatial queries within GIS. The SQL/OGIS standard for spatial queries
has been adopted by many vendors such as Oracle, MySQL and
PostgreSQL; the products differ in syntax, while the choices of spatial data types
and operations are similar (R. Larson, 2007). This study uses the MySQL query
language for the database structure and the implementation of graph searching. The
spatial queries in this study, which require a network structure in the geographic
spatial data, focus on finding paths or routes between an origin and a destination,
such as “What is the shortest route from IKIP college to Kuantan airport?”
or “List the paths that can be taken from Teruntum Complex to Teluk Chempedak”.
2.6
Spatial Databases
Spatial data is data that describes the location of an object. A spatial database stores spatial
objects and the spatial relationships between these objects. A road map is a common
example of spatial data: it contains points, lines and polygons that represent cities,
roads and political boundaries such as provinces. Such data is used to project
the location of objects onto a two-dimensional plane, with support from
applications such as GIS for data storage, retrieval, updating and querying. In
general, a Geographic Information System may be defined as a computer-based
information system which attempts to capture, store, manipulate, analyze and
display spatially referenced and associated tabular attribute data for solving complex
research, planning and management problems (Fischer and Nijkamp, 1993). Other
types of spatial data arise in computer-aided design (CAD) and computer-aided
manufacturing (CAM) (Ravada, 2007). The emergence of spatial technology has seen
modern Database Management Systems (DBMS) used for multi-user access and
data sharing (Zhou et al., 2001).
Applying graph concepts in spatial databases mostly concerns modeling or
visualizing data connections, for example modeling the network of roads,
highways or other trails that is already kept as spatial data such as
geometries, points or polygons. According to Ewig and Gutting (1994), spatial
networks can be modeled in terms of graphs. Nodes and edges can carry
geometric information; for example, a point may be associated with a node and a
polygonal line with an edge. Explicit paths are available as entities
in a graph, which is important since objects often correspond to paths in a
network.
For this study, we look at relationships through the topology of the
spatial dataset. According to Foote and Huebner, topology is one of the most useful
relationships maintained in many spatial databases. It is defined as the mathematics
of connectivity or adjacency of points or lines that determines spatial relationships
in a GIS. The topological data structure logically determines exactly how and where
points and lines connect on a map by means of nodes (topological junctions). The
order of connectivity defines the shape of an arc or polygon. The computer stores
this information in various tables of the database structure and GIS manipulates,
analyzes, and uses topological data in determining data relationships.
Network analysis uses topological modeling for determining shortest paths
and alternate routes. For example, a GIS for emergency service dispatch may use
topological models to quickly ascertain optional routes for emergency vehicles.
Automobile commuters perform a similar mental task by altering their route to
avoid accidents and traffic congestion. Likewise an electrical utility GIS could
rapidly determine different circuit paths to route electricity when service is
interrupted by equipment damage. Similarly, political redistricting planners could
use certain algorithms to determine logical relationships between population groups
and areas for district boundaries.
Figure 2.4 shows an example of how topology is represented or
modeled and how connections between nodes are coded into a database. The first step is
to record the location of all "nodes", that is, the endpoints and intersections of lines and
boundaries. The example of topological data in Figure 2.4 consists of 5 nodes,
each with latitude and longitude attributes.
Figure 2.4 Example of topological data
Based upon these nodes, "arcs" are defined. Figure 2.5 shows the
relationship between the arcs and the nodes. These arcs have endpoints, but they are
also assigned a direction, indicated by the arrowheads. The starting point of the
vector is referred to as the "from node" and the destination as the "to node". The
orientation of a given vector can be assigned in either direction, as long as this
direction is recorded and stored in the database.
Figure 2.5 Example of data storage for arc
By keeping track of the orientation of arcs, it is possible to use this
information to establish routes from node to node or place to place. Thus, if one
wants to move from node 3 to node 1, we can locate the necessary connections in
the database.
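As a rough illustration of how stored arc orientations allow a route to be located, the following Python sketch follows arcs from their "from node" to their "to node". The arc rows are hypothetical, since the data of Figure 2.5 is not reproduced in the text, and the function name find_route is an assumption made for the example.

```python
from collections import deque

# Hypothetical arc table: (arc id, from node, to node). These rows are
# invented for illustration and are not the data of Figure 2.5.
arcs = [
    ("a1", 1, 2),
    ("a2", 2, 3),
    ("a3", 3, 4),
    ("a4", 4, 1),
    ("a5", 3, 5),
]

def find_route(arcs, start, goal):
    """Follow arcs in their stored direction, breadth first, and return
    the list of arc ids leading from start to goal (or None)."""
    outgoing = {}
    for arc_id, from_node, to_node in arcs:
        outgoing.setdefault(from_node, []).append((arc_id, to_node))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, route = queue.popleft()
        if node == goal:
            return route
        for arc_id, to_node in outgoing.get(node, []):
            if to_node not in seen:
                seen.add(to_node)
                queue.append((to_node, route + [arc_id]))
    return None

# Moving from node 3 to node 1 uses arc a3 (3 to 4) and then arc a4 (4 to 1).
print(find_route(arcs, 3, 1))
```

Because only the recorded direction is followed, node 5 in this made-up table has no route back to node 1, which shows why the orientation of each arc has to be stored.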
The implementation strategy is to offer special data structures for
representing graphs that allow traversal between nodes. Graph
operations are implemented on the basis of efficient graph algorithms for
spatially embedded networks.
2.7
Digital Map
A map can be described as a rule-based abstraction of reality that is
intended to convey information. It is the result of applying rules to objects on
the earth’s surface and translating them into a graphical and informational
representation (Browne and Jackson, 2004). Digital map data can also be
defined as map detail held in the form of national grid coordinate values and codes
that can be stored and manipulated on a computer (Ordnance Survey of UK, 2007).
As noted in the previous section, a digital map is always associated with, and
forms part of, a GIS application. The map information used in this study
consists of the road network and the positions of locations related to that
network.
2.8
Kuantan Digital Map
There are a few applications for retrieving information from the Kuantan
digital map over the internet, such as those provided by Jabatan Ukur dan Pemetaan
Malaysia (JUPEM), MapQuest (http://www.mapquest.com), Yahoo maps guide
(http://www.mapsguide.org), Kuantan Online (http://www.kuantanonline.com) and
the Google map application (Figure 2.6). There is also a GIS application for
retrieving Kuantan map information, developed with MapInfo features.
The path-finding technique in this application concentrates on the shortest path,
giving a minimal distance and driving time. The application covers the whole area of
Pahang and every type of road: highways, federal roads, main roads,
country roads, town roads and streets. Its disadvantage is that not all locations
are labelled and stored, which makes retrieval less accurate for certain location
searches. This motivates the present study, which is intended as an alternative
path-finding solution for a local map.
Figure 2.6 Google facilities in assisting direction findings
2.9
Discussion
This chapter reviewed the studies that contribute to this study. It covered
the mathematical graph theory that can be implemented within a database
structure, suitable path algorithms for solving the single-pair problem, the
analysis of suitable searching strategies, the use of SQL for the database structure and
an overview of digital maps. It also discussed the disadvantages
of the available path-finding systems for map applications. As
discussed above, existing path finding concentrates on the shortest path with
minimal distance and does not consider other possible paths. This motivates the present
study, which gives the flexibility to retrieve alternative paths for a set of
locations.
CHAPTER 3
METHODOLOGY
3.1
Introduction
This chapter discusses the methodology, including the steps taken in
modeling the spatial data, data pre-processing, graph representation within the SQL
database schema, and the selection of a query language to run the searching algorithm and
retrieve the results based on the parsed query. The steps start
from modeling the raw spatial resources that come from a digital
map of locations. This study focuses on spatial data for locations or points of
interest in a sample road network. The data used are based on
a certain area of Kuantan town.
3.2
Research Framework
The development of the database is based on the basic stages shown in
Figure 3.1 (Longley, 2005). The conceptual model captures the user’s view and
application requirements, and defines the objects and relationships according to the
geographic representation. At the logical model stage, the model is matched to
geographic database types and the geographic database structure is organized
using normalization. Lastly, it is designed as the
database schema of a specific physical model.
Figure 3.1 Stages in database design
This study is conducted according to the procedure in Figure 3.2. As
mentioned in the previous section, development starts
with the data model from the map, followed by the calculation of distances between
selected points, modeling the graph theory on the dataset using the database structure
and applying the searching techniques to find the alternative paths.
Extracting map information → Modeling selected data → Selecting Suitable Query and Algorithm → Result Analysis
Figure 3.2 Research Framework
3.2.1
Extracting map information
The information was based on maps from Jabatan Ukur dan Pemetaan
Malaysia, Pahang, Kuantan Municipal Council and Kuantan Tourism Information
Center. For this sample application, the information extracted from the Kuantan map
relates to the road network and the points of interest associated with the
selected roads. Figure 3.3 shows the original map covering the Kuantan area at a
scale of 1 : 90 000.
Figure 3.3 Kuantan map
Only a small part of the spatial information in the original map was selected. Since this
application concentrates on finding alternative routes from one destination to
another, the selected data concern the points of interest and the roads
between them. Figure 3.4 shows part of Kuantan town.
As in any other place, the road network connects one place
to another and, normally, there is more than one path or route connecting
two places, from the start to the end destination.
Figure 3.4 Map of Kuantan town
For the specific purpose of this study, the road network is based on the town area
and the data were modeled from a tourism point of view. Figure 3.5 shows the
related and important road network of the Kuantan area from the tourism perspective.
Points of interest such as hotels, road junctions, road intersections, traffic lights and
public spots are denoted as nodes, and the roads between points of interest are denoted as
edges or arcs.
Figure 3.5 Kuantan town network road
For this study, we chose 24 points of interest, denoted as 24 nodes, consisting of
important spots in Kuantan that are closely tied to the main roads of the town area.
They are joined by 34 edges or arcs carrying 61 road directions, where each direction
is a road segment between two nodes.
3.2.2
Modeling selected data
Points of interest in a map application consist of buildings, properties, tourist
attractions such as beaches, and public spots. The road network consists of the types
of road and the things associated with the roads. For this study, we
concentrate only on the spatial road attributes, such as the position of each
node along the roadside and other important geographic items of the respective
edges and nodes. The points of interest and road network in Figure 3.5 can
be represented as in Figure 3.6. Points of interest or locations are defined as nodes,
and the road paths between nodes are defined as edges or arcs connecting the nodes.
For this study, 24 important locations were marked as nodes, linked by 34 edges
carrying 61 directions, which consist of the single and double directions of the road segments.
The nodes are labelled A to X and represent road junctions, traffic
lights, shopping complexes, a mosque, a beach and hotels. Adjacency between nodes
is defined by the incoming and outgoing paths, representing
one-way or two-way roads. Within this scope, the single-direction
edges that represent one-way roads are DG, GH, HK, KL, LI, IH and IC.
Following graph theory fundamentals, the nodes and edges of
the graph can be written in the notation below.
Figure 3.6 Nodes and arc representation
Let N be the set of nodes in Figure 3.6, L the set of edge directions (single and
double) and G the graph. Then the ordered pair (N, L) can be defined as
:
N = { A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X },
L = { AB, BA, AC, CA, BE, EB, BD, DB, DC, CD, DG, EF, FE, EM, ME, FG, GF, FJ,
JF, GH, IH, IC, HK, KL, LI, MN, NM, JN, NJ, JK, KJ, LO, OL, NO, ON, MP,
PM, PS, SP, PQ, QP, QT, TQ, TV, VT, SV, VS, TU, UT, QR, RQ, OR, RO,
RU, UR, UW, WU, VW, WV, WX, XW },
Thus, G = (N, L)
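The counts quoted in this section (24 nodes, 61 directions, 34 edges and 7 one-way roads) can be cross-checked against the listing of L with a short Python sketch. The reverse direction of BE is taken to be EB here, which is what makes the total come to 61; the variable names are assumptions made for the example.

```python
# Directed road segments from the set L (each pair is source + destination).
directions = (
    "AB BA AC CA BE EB BD DB DC CD DG EF FE EM ME FG GF FJ "
    "JF GH IH IC HK KL LI MN NM JN NJ JK KJ LO OL NO ON MP "
    "PM PS SP PQ QP QT TQ TV VT SV VS TU UT QR RQ OR RO "
    "RU UR UW WU VW WV WX XW"
).split()

nodes = sorted({label for pair in directions for label in pair})
direction_set = set(directions)

# A segment is one-way when its reverse direction is absent from L.
one_way = sorted(d for d in direction_set if d[::-1] not in direction_set)

# An undirected edge is counted once, whatever its number of directions.
edges = {frozenset(d) for d in direction_set}

print(len(nodes), len(direction_set), len(edges))
print(one_way)
```

The one-way segments found this way are the same seven listed in the text: DG, GH, HK, KL, LI, IH and IC.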
The database model for the related data is represented as tables
with sample records, as below.
The first table, location, stores data about each
location: its name, a detailed description and its spatial data,
comprising the grid position (x and y). The second table, road_graph, stores the
node connectivity as an adjacency list of related nodes, following the
notation given above, comprising the source and destination nodes and a
length that represents the distance between the nodes. The distance is calculated with
the Euclidean formula:
distance = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}

locateID  lat_x  long_y  name
A         10     4.5     Kemunting
B         16     13.5    Traffic Light, Junction Jln Wong Ah Jang of Jln Penjara
C         14.2   4       Kuantan Esplanade
D         14     8       Teruntum Complex
Table 3.1 Sample data from table location
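As a worked illustration of the Euclidean formula, the sketch below applies it to the grid positions in Table 3.1. The results are in grid units: the scale that converts grid units to road distance is not stated in this section, so the values should not be expected to equal the lengths stored in road_graph. The function name grid_distance is an assumption made for the example.

```python
import math

# Grid positions (lat_x, long_y) copied from Table 3.1.
location = {
    "A": (10.0, 4.5),    # Kemunting
    "B": (16.0, 13.5),   # Traffic light, Jln Wong Ah Jang / Jln Penjara
    "C": (14.2, 4.0),    # Kuantan Esplanade
    "D": (14.0, 8.0),    # Teruntum Complex
}

def grid_distance(a, b):
    """Straight-line distance between two locations, in grid units."""
    (x1, y1), (x2, y2) = location[a], location[b]
    return math.hypot(x1 - x2, y1 - y2)

for source, destination in [("A", "B"), ("A", "C"), ("C", "D")]:
    print(source, destination, round(grid_distance(source, destination), 2))
```

The distance from a node to itself evaluates to 0 under this formula, and the distance is the same in both directions of a two-way segment.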
Standard SQL is used to create the tables listed
below, insert data and produce results for selected nodes. For this study, we
propose using MySQL version 5 to manage the database structure, along with
PHP scripts and browser capabilities to present the queries and results.
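-- One row per point of interest; lat_x and long_y hold the grid position.
CREATE TABLE location
( locateID varchar(10) NOT NULL,
name varchar(255) NOT NULL,
lat_x float (10,2) NOT NULL,
long_y float (10,2) NOT NULL,
PRIMARY KEY (locateID) ) ;
-- One row per road direction (adjacency list); length is the distance between the nodes.
CREATE TABLE road_graph
( source varchar(10) NOT NULL,
destination varchar(10) NOT NULL,
length float (10,2) NOT NULL,
PRIMARY KEY ( source, destination) ) ;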
The length from a node to itself is recorded as 0, to avoid
miscalculation between identical nodes. The source and destination fields of road_graph
refer to locateID in table location and take the labels A to X in this
study. The length field holds the distance between adjacent
nodes and is calculated with the distance formula given
above. Another related table, road, holds road details such as the road
name and its edges. The details are provided in the Appendix.
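To show how the adjacency list in road_graph can be queried, the sketch below loads the sample rows of Table 3.2 and lists the nodes adjacent to a given node. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, which is an assumption made so that the example runs without a server; the helper name neighbours is also hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE road_graph (
        source      TEXT NOT NULL,
        destination TEXT NOT NULL,
        length      REAL NOT NULL,
        PRIMARY KEY (source, destination)
    );
""")

# Sample rows copied from Table 3.2 (zero-length rows link a node to itself).
rows = [("A", "B", 7), ("A", "C", 5), ("A", "A", 0), ("B", "D", 0.5),
        ("B", "E", 1.5), ("B", "B", 0), ("C", "D", 0.2)]
conn.executemany("INSERT INTO road_graph VALUES (?, ?, ?)", rows)

def neighbours(node):
    """Adjacent nodes of `node`, nearest first, ignoring the self-loop row."""
    cursor = conn.execute(
        "SELECT destination, length FROM road_graph "
        "WHERE source = ? AND destination <> source ORDER BY length",
        (node,),
    )
    return cursor.fetchall()

print(neighbours("A"))
```

The composite primary key on (source, destination) means each road direction can be stored only once, so a two-way road needs two rows.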
source  destination  length
A       B            7
A       C            5
A       A            0
B       D            0.5
B       E            1.5
B       B            0
C       D            0.2
Table 3.2 Sample data from road_graph
3.2.3
Selecting Suitable Query and algorithm
Based on the road_graph table, we propose to store the results retrieved
from node searching and traversal in temporary tables that contain the path and the
calculated length. The algorithm is based on the Floyd-Warshall
procedure, adapted from its use in solving shortest path problems. This
study refers to the pseudo code discussed in the
previous chapter, adjusted to suit finding alternative paths. We
use the transitive closure of the Warshall algorithm together with the
breadth-first search (BFS) technique. The selected query focuses on finding all
possible paths between the source and destination nodes required by the user, so
the algorithm addresses only the single-pair path problem.
The working table for the search is assumed to be #graph. The
script assumes that the graph may not be a tree, so it checks all possible nodes
from the given source node and its adjacency list until it finds
the destination node. The search starts at the root node (the source) and takes the
nodes reachable from it in the adjacency list. It first checks the child nodes, that is, paths of
length 1 (single edges). If the destination is not found, it goes down to the sub-children until
the destination is found. It then returns to the root and turns to the next child or sibling at the same
level, goes down again, and so on until the iteration is finished. A simple
SQL statement is given below.
-- Transact-SQL style sketch: collect every node reachable from @root_id.
-- @root_id is assumed to be supplied by the caller.
create table #graph ( id int primary key )
insert into #graph values ( @root_id )
while ( @@rowcount > 0 ) begin
insert into #graph ( id ) select distinct e.child_id
from edges e join #graph p on p.id = e.parent_id
where e.child_id not in ( select id from #graph )
end
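The loop in the statement above keeps adding newly reached nodes until a pass adds nothing. A minimal Python sketch of the same idea is given here, using the road directions among nodes A to I from the set L in Section 3.2.2; treating these pairs as the parent and child columns of the edges table is an assumption, as is the function name reachable_from.

```python
# Parent/child pairs among nodes A to I, taken from the set L in Section 3.2.2
# (the edges table of the SQL sketch is assumed to hold the same pairs).
edges = [tuple(pair) for pair in
         "AB BA AC CA BE EB BD DB DC CD DG EF FE FG GF GH IH IC".split()]

def reachable_from(root):
    """Grow the reached set level by level until a pass adds no new node,
    mirroring the WHILE (@@rowcount > 0) loop."""
    reached = {root}
    added = 1
    while added > 0:
        new_nodes = {child for parent, child in edges
                     if parent in reached and child not in reached}
        reached |= new_nodes
        added = len(new_nodes)
    return reached

print(sorted(reachable_from("A")))
```

Within this 9-node subset, node I is not reached from A because its only incoming road (LI) starts outside the subset, while every node is reached when the search starts at I.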
For this study, SQL syntax suitable for MySQL is used, adapted from the
above procedure as shown below. Assume that the nodes involved in the
graph are 1 to N. The temporary table that stores the retrieved paths is denoted
paths, and the table that stores the adjacency list is road_graph, consisting of the fields
source, destination and length for each edge.
-- Template only: road_graph1 … road_graphN-1 stand for N-1 aliases of road_graph,
-- and each "…" stands for the terms omitted between the first and the last alias.
INSERT INTO paths
SELECT road_graph1.source, road_graph2.source, …, road_graphN-1.destination,
(road_graph1.length + road_graph2.length + … + road_graphN-1.length),
( CASE WHEN road_graph1.source
NOT IN
( road_graph2.source, road_graph3.source, …, road_graphN-1.source)
THEN 1 ELSE 0 END
+ …
+ CASE WHEN road_graphN-1.source
NOT IN
( road_graph1.source, road_graph2.source, …, road_graphN-2.source)
THEN 1 ELSE 0 END )
FROM road_graph AS road_graph1, road_graph AS road_graph2, …, road_graph AS road_graphN-1
WHERE road_graph1.source = 'source_node'
AND road_graph1.destination = road_graph2.source
AND …
AND road_graphN-1.destination = 'destination_node' ;
With this approach, each row placed in the table paths lists up to the maximum number
of nodes, as the final path taken in one traversal. The table paths holds the
retrieved results and the total length of each traversal or search. The
results are then refined and filtered to obtain the final alternative paths by eliminating
redundancy and paths that traverse a node more than once. A browser application
receives the query from the user, as shown in Figure 3.7.
Figure 3.7 Sample query from user through browser
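The effect of the refinement step, keeping only paths that visit no node more than once, can be sketched directly in Python. This is a plain breadth-first enumeration and not the SQL self-join used in this study; the subgraph is the part of the set L in Section 3.2.2 that lies among nodes A to I, and the function name alternative_paths is an assumption made for the example.

```python
from collections import deque

# Road directions among nodes A to I, taken from the set L in Section 3.2.2.
directions = "AB BA AC CA BE EB BD DB DC CD DG EF FE FG GF GH IH IC".split()

adjacency = {}
for source, destination in directions:
    adjacency.setdefault(source, []).append(destination)

def alternative_paths(source, destination):
    """Breadth-first enumeration of every path that visits no node twice."""
    found = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        for nxt in adjacency.get(path[-1], []):
            if nxt == destination:
                found.append(path + [nxt])
            elif nxt not in path:
                queue.append(path + [nxt])
    return found

for path in alternative_paths("A", "G"):
    print("".join(path), len(path))
```

For source A and destination G on this subgraph, the sketch finds four label sequences (ABDG, ACDG, ABEFG and ACDBEFG) with 4, 4, 5 and 7 nodes respectively. Path lengths are left out because only a sample of the length data is listed in this section.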
3.2.4
Result Analysis
Results from query processing are stored in a table and need to
be refined for the end user. Selected queries filter the desired
result using SQL functions. The result is grouped by its rank in path
length or in the total number of nodes traversed. The result analysis is explained
in detail in Chapter 4.
3.3
Instrumentation
The instrumentation involved in this study can be divided into two types,
hardware and software.
i) List of hardware used in this project:
Component   Specification
Processor   AMD Turion 64X2 (1.8 GHz)
Memory      1.0 GB
Hard Disk   120 GB
Monitor     13”
Table 3.5 List of hardware used
ii) List of software used in this project:
a. MySQL Version 5.1
b. PHP Admin
c. Webserver (Apache)
d. Microsoft Word (Office XP)
e. Microsoft Access(Office XP)
f. Microsoft Visual Studio
3.4
Summary
This chapter has discussed the methodology used in this study, based on the
research framework provided. The map information extraction stage
involved the tedious task of selecting the points of interest and identifying the
positions of road edges and intersections. Roads were selected
according to their priority and importance for the transport network of the Kuantan town area,
and SQL was selected for the flexibility with which it can be embedded in a
programming language. Each stage of the research framework plays an important role in
accomplishing this study.
CHAPTER 4
RESULTS
4.1
Introduction
This chapter discusses the process taken in accomplishing this
study, following the stages of the framework design discussed in the previous
chapter. The results are based on sample test nodes and on iteration
over selected nodes that connect the source and destination locations.
As mentioned in the previous chapter, this study focuses on finding alternative paths
between a set of locations. The tests with the algorithm therefore
address only a single pair, the
given source and destination nodes. The discussion of the experiments concerns
the suitability of the algorithm for finding alternative paths
and the pattern of searching that is reflected in the search result or
output.
4.2
Initial findings
4.2.1
Collecting data :
From the interview session with En. Hashim Ismail, Deputy Director of the
Jabatan Ukur dan Pemetaan (JUPEM) Kuantan branch, it was found that the map
application for public viewing is developed entirely with GIS.
The road network or path information is based on a
static application, either an online digital map viewer or a hard copy of the map.
The map of Kuantan tourism was obtained from the Kuantan Tourism
Information Center, with the cooperation of Kuantan Municipal Council. Alternative
path finding is based on the important locations along the main roads in part of the
Kuantan town area. Ideas for the development of the research content were also
contributed by En. Fikri Ismail and En. Firdaus Muhamad, staff of the GIS
department at POLISAS.
4.2.2
Technical study :
The technical study covered a review of the foundations of mathematical graph
theory, the database structure in SQL (MySQL) and the related
algorithm. The distance calculation was supervised and verified by
staff from JUPEM and POLISAS. For this study, the traversing technique of the
Floyd-Warshall algorithm was used and implemented in SQL.
4.3
Experiments
For testing purposes, the dataset from the graph was modeled as a directed
graph. A few queries were tested, and the results concern
the accuracy and the number of traversals from the source to the destination node. The
SQL syntax from the methodology discussed in the previous chapter was implemented.
The experiments were grouped by the maximum number of nodes tested,
that is 9, 12 and 24 nodes. Each experiment used a temporary table to keep the
nodes traversed in each iteration, from node 1 to n through the
adjacent nodes. The following sections explain the findings from two experiments,
followed by a summary of several query runs and an overall discussion of
the findings.
4.3.1
Experiment 1
Sample testing was run by applying the above queries between Kemunting as the
source and Taman Kerang as the target
destination. Kemunting is denoted as A and Taman Kerang as G. The experiment
involved 9 nodes, denoted A to I. Processing took
0.39 seconds with 1475 traversing iterations. After filtering and
refining, the results display the alternative paths that can be taken from Kemunting to
Taman Kerang. Table 4.1 shows the filtered result of
the query processing, with the denoted labels replaced by the actual location names.
Num  Path length (km)  Node/s visited  Alternative path
1    2.18              4               Kemunting → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Penjara → Kompleks Teruntum → Taman Kerang
2    1.33              4               Kemunting → Esplanade → Kompleks Teruntum → Taman Kerang
3    3.07              5               Kemunting → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Penjara → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Bukit Ubi → Hotel Shahzan off Jalan Gambut → Taman Kerang
4    3.31              7               Kemunting → Esplanade → Kompleks Teruntum → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Penjara → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Bukit Ubi → Hotel Shahzan off Jalan Gambut → Taman Kerang
Table 4.1 Sample of retrieval query in details
From the above result, the findings can be analyzed as follows:
i.
The number of alternative paths available for a single pair, that is
from the source to the destination node or location. In
the above result, 4 alternative paths were retrieved (without
considering traversals that pass through a node more than once).
ii.
The nodes visited from source to destination. The above result gives
the number of visited nodes for each alternative path:
path 1 visits 4 nodes, path 2 also visits 4,
path 3 visits 5 and path 4 visits the highest
number of nodes, 7.
iii.
The length of each retrieved path. For this test, distance is
in kilometers (km). From the above table, path
1 is 2.18 km, path 2 is 1.33 km, path 3 is 3.07 km and path 4
is 3.31 km. The result also shows the minimum (shortest) distance
and the maximum (longest) distance from the source to the destination location.
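The three observations above can be reproduced from the figures in Table 4.1 with a short Python sketch. Only the path number, length and node count from the table are used; the dictionary keys are assumptions made for the example.

```python
# Path number, length (km) and node count as reported in Table 4.1.
paths = [
    {"num": 1, "length_km": 2.18, "nodes": 4},
    {"num": 2, "length_km": 1.33, "nodes": 4},
    {"num": 3, "length_km": 3.07, "nodes": 5},
    {"num": 4, "length_km": 3.31, "nodes": 7},
]

# Rank by length, and by node count with length as the tie-breaker.
by_length = sorted(paths, key=lambda p: p["length_km"])
by_nodes = sorted(paths, key=lambda p: (p["nodes"], p["length_km"]))

shortest, longest = by_length[0], by_length[-1]
print("alternative paths:", len(paths))
print("shortest:", shortest["num"], shortest["length_km"])
print("longest:", longest["num"], longest["length_km"])
print("ranked by length:", [p["num"] for p in by_length])
print("ranked by nodes:", [p["num"] for p in by_nodes])
```

Paths 1 and 2 visit the same number of nodes but differ in length, so ranking by node count alone would not separate them.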
Figure 4.1 shows a sample of the browser interface, which uses
PHP and MySQL dataset retrieval to present the output.
The output shows the set of retrieved paths for the user's input, with
Kemunting (A) selected as the source node and Taman Kerang (G) as the destination node.
Figure 4.1 Sample of data retrieved
4.3.2
Experiment 2
This experiment tests a pair whose source and destination are the same
node, making the search return to the node itself. The traversal
involves cycles that begin and end at the node denoted as both source and destination.
We took Kemunting (A) as a sample and tried to find the paths that
can be traversed from Kemunting as the source node back to Kemunting itself as the
target destination. The experiment involved 9
nodes, denoted A to I. Processing took 0.38 seconds
with 2078 traversing iterations. After filtering and refining, the
results display the alternative paths that can be taken from Kemunting to itself, as
below. Table 4.2 shows the filtered result of the query processing, with the
denoted labels replaced by the actual location names.
Num  Path length (km)  Node/s visited  Alternative path
1    0                 1               Kemunting → Kemunting
2    2.16              2               Kemunting → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Penjara → Kemunting
3    0.84              2               Kemunting → Esplanade → Kemunting
4    2.49              4               Kemunting → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Penjara → Kompleks Teruntum → Esplanade → Kemunting
Table 4.2 Sample of retrieval query in details
The analysis described in Experiment 1 is clearly shown by the result
displayed. For this experiment, we made an exception to the rule against traversing a node
more than once for the first and last steps, because the source and
destination point to the same location. The count of nodes visited is based
on the distinct nodes visited. For path 1, for example, the
source and destination are the same node, so it counts as a single node. The same applies to path 2
and path 3, which have a node count of 2, meaning that a node reached again is
not counted.
4.3.3
Summary of Experiment 1 and Experiment 2
From Experiment 1, from location A to G or Kemunting to Taman
Kerang, we can summarize that 4 alternative paths were found, the
shortest being path 2 and the longest path 4. The number of
visited nodes does not determine the shortest path, because the weight of each road
segment is different. A path that visits more nodes may still be
shorter than a path that visits fewer nodes.
Experiment 2, with the same source and destination node, clearly shows that
the same number of visited nodes may give different lengths, depending on the
weights. The result can therefore be more helpful in
deciding which path to take for any kind of road
travel or planning.
4.4
Findings
A few queries were run to compare the response time of
iteration and the pattern of the retrieved results. The experiments used
MySQL version 5.0 both to store the datasets and to process the
queries. Tests were run with different total numbers of nodes and different locations,
to compare the
running time of each response against the number of nodes
involved and the iterations searched for the respective source and destination
nodes. The summary of testing for 8 single pairs of nodes is shown in Table 4.3.
Locations are denoted by the labels A to X. "Location involved" is the pair from a certain
location to the target location, written as source node → destination node.
"Number of nodes involved" is the maximum number of nodes represented
in the graph table. "Response time" is the query processing time.
"Number of iterations" is the number of traversal paths visited by the
search from and to each node. "Number of alternative paths refined" is
the number of paths that do not traverse any node more than once,
except where the source and target nodes are the same, as mentioned
in the analysis of the previous experiments.
Location   Number of nodes  Response time  Number of iterations/  Number of alternative paths
involved   involved         (seconds)      retrieved paths        refined (final result)
A → G      9                0.39           1474                   4
A → C      9                0.42           1916                   2
D → H      9                0.33           782                    2
A → A      9                0.38           2078                   4
M → X      12               3.98           23 771                 9
O → X      12               4.84           31 400                 10
O → X      24               15.72          100 000                14
M → X      24               10.02          70 721                 12
Table 4.3 Summary of 8 query processing
From the results, a few findings can be summarized as below:
i. Increasing the number of nodes checked or visited increases the time taken for
node traversal. For example, query
processing for the sets of 9 nodes takes between 0.33 and 0.42 seconds, the sets of
12 nodes take between 3 and 5 seconds, and the sets of
24 nodes take more than 10 seconds to complete the iteration. Response time
therefore rises with the number of nodes.
ii. The number of iterations in each traversal affects the response time
when the number of retrieved paths is large. The run with
100 000 iterations needs more than 15
seconds, compared with the run with 1474 iterations that needs only 0.39 seconds.
The comparison
between A → A and A → C goes the other way: the A → A query needs 0.38 seconds to produce
2078 lines, while A → C needs 0.42 seconds to process 1916 lines. Small
differences in the number of iterations therefore have little effect on the
response time.
iii. The search from the root node (source node) gives
priority to the right node or leaf of the parent node, then
checks the child nodes until it finds the destination, goes back to the root, takes
another child, and so on, as discussed in the previous chapter. Take
Experiment 1 and Experiment 2 as examples. In Experiment 1
(A → G), traversal starts from the root node A and
goes to B, the right leaf of the node. After finding the destination
node, it goes back to the root and moves to the other child, C, until it finds
the destination. It then goes back to the root and searches again from child B,
and then from child C, and so on. The same pattern appears in
Experiment 2, where the search goes from A to A, A to B, then A to C
until it finds the destination.
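Findings i and ii can be checked against the rows of Table 4.3 with a short Python sketch that averages the response time per graph size and divides each response time by its iteration count. The function names are assumptions made for the example, and the per-iteration figure is a derived quantity that is not reported in the table itself.

```python
# Rows of Table 4.3: (query, nodes involved, response time in s, iterations).
runs = [
    ("A->G", 9, 0.39, 1474), ("A->C", 9, 0.42, 1916),
    ("D->H", 9, 0.33, 782), ("A->A", 9, 0.38, 2078),
    ("M->X", 12, 3.98, 23771), ("O->X", 12, 4.84, 31400),
    ("O->X", 24, 15.72, 100000), ("M->X", 24, 10.02, 70721),
]

def mean_time_by_nodes(runs):
    """Average response time for each graph size."""
    grouped = {}
    for _, nodes, seconds, _ in runs:
        grouped.setdefault(nodes, []).append(seconds)
    return {nodes: sum(times) / len(times) for nodes, times in grouped.items()}

def ms_per_iteration(runs):
    """Milliseconds spent per traversing iteration for each query."""
    return [(query, nodes, 1000 * seconds / iterations)
            for query, nodes, seconds, iterations in runs]

for nodes, seconds in mean_time_by_nodes(runs).items():
    print(nodes, "nodes:", round(seconds, 2), "s on average")
for query, nodes, ms in ms_per_iteration(runs):
    print(query, nodes, round(ms, 3), "ms per iteration")
```

For the 12-node and 24-node runs the cost per iteration stays within roughly 0.14 to 0.17 ms, which suggests that response time follows the number of iterations more closely than the number of nodes as such.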
4.5 Conclusion
From the overall findings, we conclude that the algorithm can be applied to
a larger dataset, but the response time will increase because of the additional
nodes in the graph list. The search technique used in this study checks each node
visited and gives equal traversal to each level of parent and child nodes
involved. It therefore returns all possible paths for the given set of
locations. A web browser is used as the interface to make interaction between
the user and the proposed system easier.
CHAPTER 5
CONCLUSION
5.1 Introduction
This chapter concludes the work done to meet the project goal. The main
objective of this project is to show the effectiveness of the path algorithm in
finding the possible paths for a set of nodes or locations.
5.2 Contribution of Study
This study achieved its goal of finding a solution for retrieving alternative
paths for a set of locations. The findings presented in the previous chapter show
that each traversal process retrieved more than one possible path. The
algorithm used can contribute to the existing systems that are available.
5.3 Suggestion for Future Work
There is a driving force behind the growth of database development,
specifically in spatial database applications that support decision making or
other business processes. The path-finding solution can be extended by embedding
input data from fields other than pre-trip planning or route finding. The
proposed system can serve as a backbone for enhancements in business analysis,
such as the following examples:
1. Insurance risk assessment – to answer queries such as “What types of
accident happened within 500 meters of this intersection?” or “List the
accident cases that happened between the junctions of Kemunting and
Tanjung Lumpur in the past year.”
2. Retail site selection based on the number of site locations along the
selected road – to answer queries such as “Where should we open our
new store branch in Kuantan town, between Jalan Teluk Sisek and Jalan
Besar, near a public spot?”
In the two examples above, integrating the proposed system with other
attribute data, such as accident statistics, insurance claims, and demographic or
population statistics, can verify the query answers. It can thus support
decision-making analysis in other fields.
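As an illustration of the first example query, the sketch below filters
accident records by their distance from an intersection. It is a minimal
Python sketch under stated assumptions: the Accident record, the function
accidents_within, and all sample records are hypothetical, and coordinates are
assumed to be planar positions in meters. No accident data was collected in
this study.

```python
import math
from dataclasses import dataclass


@dataclass
class Accident:
    """Hypothetical accident record with a planar position in meters."""
    case_id: str
    kind: str
    x_m: float
    y_m: float


def accidents_within(accidents, centre_xy, radius_m):
    """Return the accidents whose straight-line distance to centre_xy is <= radius_m."""
    cx, cy = centre_xy
    return [
        a for a in accidents
        if math.hypot(a.x_m - cx, a.y_m - cy) <= radius_m
    ]


# Made-up sample records, for illustration only.
SAMPLE = [
    Accident("c1", "rear-end collision", 300.0, 400.0),   # exactly 500 m away
    Accident("c2", "side collision", 600.0, 0.0),         # 600 m away
    Accident("c3", "motorcycle skid", 100.0, -200.0),     # about 224 m away
]

nearby = accidents_within(SAMPLE, centre_xy=(0.0, 0.0), radius_m=500.0)
for accident in nearby:
    print(accident.case_id, accident.kind)
```

In a spatial database, the same filter would normally be expressed as a
distance predicate in the query itself rather than in application code.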
REFERENCES
Ahuja, R. K., Magnanti, T.L., and Orlin, J.B. (1993). Network Flows: Theory,
Algorithms and Applications. Englewood Cliffs, NJ:Prentice Hall.
Celko, J. (2004). Trees and Hierarchies in SQL for Smarties. San Francisco:
Morgan Kaufmann.
Cromley, E.K. (1997). Digital Map Librarianship: Maps and Digital Spatial
Data. Connecticut: IFLA Section of Geography and Map Libraries.
Dolman, J., Hodgson, B., Dowsey, J., Heffernan, J., Seymour, J., Simons, B.,
and Woods, B. (1996). Further Mathematics. VCE units 3 & 4. (2nd Edition).
Queensland: The Jacaranda Press.
Erwig, M., and Guting, R. H. (1994). Explicit Graphs in a Functional Model for
Spatial Databases. Hagen, Germany
Garofalakis, J., Polyxeni, N., and Athanasios, P. (2006). Vehicle Routing and
Road Traffic Simulation:A Smart Navigation System. Patras. Greece
Hamill, R., and Martin, N. (2003). Database Support for Path Query Functions.
London
Longley, P. A., Goodchild, M. F., Maguire, D. J., and Rhind, D. W., (2005).
Creating and Maintaining Geographic Databases. Second Edition. John
Wiley and Sons
Okyere, M. K. (2000). Virtual City: A Heterogenous System Model of an
Intelligent Road Navigation System Incorporating Data Mining
Concepts. Indiana, USA
Pearl, J. (1984). Heuristics: Intelligent Search Strategies for Computer Problem
Solving. Addison-Wesley.
Ravada, S., Sharma, J., Herring, J., (2002). Oracle Spatial: An Oracle
Technical White Paper. California. Oracle Corporation.
Larson, R. R. (1998). Geographic Information Retrieval and Spatial
Browsing. Berkeley, California
Roozbeh, S., Hamid, E., and Mohsen, G., (2003). Evaluation of Route Finding
Methods in GIS Applications. Tehran, Iran
Saltenis, S. (2001). Algorithms and Data Structure, Lecture XIII. Aalborg
Scott, K., and Bernstein, D. (2000). Finding Alternatives to the Best Path. New
Jersey, USA
Stephens, M. S., Rung, J., and Lopez, X. (2004). Graph Data Representation
in Oracle Database 10g:Case Studies in Life Sciences. Bulletin of the
IEEE Computer Society Technical Committee in Data Engineering.
Oracle Corporation, USA. Montreal, Canada
Wong, A. (2000). GIS-Based Freight Density And Capacity Modelling. Alberta
Wu, Q. (2006). Incremental Routing Algorithms For Dynamic Transportation
Networks. M.Sc. Thesis. University of Calgary, Alberta.
Zhou, X., Zhang, Y., Lu, S., and Chen, G. (2001). On Spatial
Information Retrieval and Database Generalization.
Appendix A
List of nodes (places)
Table: location

locationID | name | details | lat_x | long_y
A | kemunting | Persimpangan Lampu Isyarat Tanah Putih, Jalan Wong Ah Jang, kawasan Kemunting | 10 | 4.5
B | lampu isyarat jln wong ah jang, sek ren abdullah | Persimpangan Lampu Isyarat Jln Wong Ah Jang, Jalan Penjara, sek.ren abdullah | 16 | 13.5
C | esplanade | Persimpangan Lampu Isyarat Jalan Besar, Jln Penjara, berhampiran hospital, padang MPK, benteng | 14.2 | 4
D | kompleks teruntum | Persimpangan Lampu Isyarat bersebelahan Teruntum, Jln Penjara, Jalan Mahkota | 14 | 8
E | simpang bukit ubi, hotel pacific | Persimpangan Lampu Isyarat Jalan Wong Ah Jang, Jalan Tun Ismail, Bukit Ubi | 17 | 19
F | hotel Shahzan, simpang jln gambut | Persimpangan Lampu Isyarat Jalan Bukit Ubi dan Jalan Gambut | 17 | 13
G | taman kerang, padang mpk1 | Persimpangan Jalan Mahkota, Jalan Bukit Ubi | 17 | 4.5
H | masjid, mahkota square | Persimpangan Jalan Mahkota, Jalan Pasar, berhampiran Masjid | 18 | 5
I | simpang jalan besar, ke tabung haji | Persimpangan Jalan Besar, Jalan Pasar | 19 | 3.5
J | merdeka station, simpang st thomas | Persimpangan Lampu Isyarat Jalan Gambut, Jalan Merdeka, s.k st thomas | 22.5 | 15.5
K | pejabat pos, bank | Persimpangan Jalan Merdeka, Jalan Mahkota, berhampiran Pejabat Pos | 18.5 | 8.5
L | simpang bank RHB, simpang 3 jln teluk sisek, jalan besar, jln merdeka | Persimpangan Jalan Teluk Sisek, Jalan Besar, Jalan Merdeka | 21 | 4.4
M | Megamall, MS Garden | Persimpangan Lampu Isyarat Jalan Tun Ismail, Jalan Beserah, berhampiran Megamall | 24 | 24
N | Shell, simpang jln gambut, jalan beserah | Persimpangan Jalan Gambut, Jalan Beserah, berhampiran Shell | 25 | 18.5
O | Lampu Isyarat ke Tanjung Lumpur, JAlan Beserah, Ke Teluk Cempedak, Jalan Teluk Sisek | Persimpangan Lampu Isyarat Jalan Teluk Sisek, Jalan Beserah, Jalan Dato’ Abu Bakar | 25.5 | 9
P | Shell Jalan Beserah, simpang ke kubang buaya | Persimpangan Lampu Isyarat Jalan Beserah, Jalan Kubang Buaya | 30 | 30
Q | Ikip Link, jln dato bahaman | Bulatan Jalan Kubang Buaya, Jalan Dato’ Bahaman | 54.5 | 21.5
R | Pantai Selamat, jalan teluk sisek | Persimpangan Lampu Isyarat Jalan Teluk Sisek, Jalan Kubang Buaya, Jalan Selamat | 40.5 | 12
S | Lampu Isyarat Semambu, Jabatan Pemetaan | Persimpangan Lampu Isyarat Jalan Beserah, Jalan Tengku Muhammad, Jalan Semambu | 64.5 | 35.5
T | MRSM | Persimpangan Jalan Tok Sira, Jalan Dato’ Bahaman, berhampiran MRSM | 62 | 23
U | simpang perkampungan tok sira | Persimpangan Jalan Tok Sira, Jalan Teluk Chempedak | 44 | 11
V | JPJ | Persimpangan Jalan Dato Bahaman, Jalan Tengku Muhammad, berhampiran JPJ | 66 | 25.5
W | Rumah Persinggahan Diraja, berhadapan Taman Teruntum | Persimpangan Jalan Teluk Chempedak, Jalan Tengku Muhammad | 67 | 12
X | Teluk Cempedak | Teluk Cempedak | 80 | 18
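The rows of the location table can be held as simple records keyed by
locationID. The Python sketch below is an illustration only: the Location
type and the grid_distance function are hypothetical names, only three rows
(C, D and G) are copied from the table, and lat_x and long_y are assumed to be
planar grid coordinates. The appendix does not state their unit, so the result
is in the same unspecified grid units.

```python
import math
from typing import NamedTuple


class Location(NamedTuple):
    """One row of the location table (the details column is left out for brevity)."""
    location_id: str
    name: str
    lat_x: float
    long_y: float


# Three rows copied from the table; the remaining rows follow the same shape.
LOCATIONS = {
    "C": Location("C", "esplanade", 14.2, 4.0),
    "D": Location("D", "kompleks teruntum", 14.0, 8.0),
    "G": Location("G", "taman kerang, padang mpk1", 17.0, 4.5),
}


def grid_distance(a_id, b_id, locations=LOCATIONS):
    """Straight-line distance between two locations, in grid units."""
    a, b = locations[a_id], locations[b_id]
    return math.hypot(a.lat_x - b.lat_x, a.long_y - b.long_y)


print(round(grid_distance("C", "D"), 3))
```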
Appendix B
List of edges (source and destination)
Table: graph

source | destination | length
A | A | 0
A | B | 1.66
A | C | 0.42
B | A | 1.66
B | B | 0
B | D | 1.22
B | E | 0.14
C | A | 0.42
C | C | 0
C | D | 0.4
D | B | 1.22
D | C | 0.4
D | D | 0
D | G | 0.46
E | B | 0.14
E | E | 0
E | F | 0.6
E | M | 0.9
F | E | 0.6
F | F | 0
F | G | 0.85
G | F | 0.85
G | G | 0
G | H | 0.11
H | H | 0
H | K | 0.35
I | C | 0.48
I | H | 0.18
I | I | 0
J | J | 0
J | K | 0.81
J | N | 0.7
K | J | 0.81
K | K | 0
K | L | 0.48
L | I | 0.22
L | L | 0
L | O | 1.15
M | E | 0.9
M | M | 0
M | N | 0.21
M | P | 0.81
N | J | 0.7
N | M | 0.21
N | N | 0
N | O | 0.7
O | L | 1.15
O | N | 0.7
O | O | 0
O | R | 1.53
P | M | 0.81
P | P | 0
P | Q | 2.59
P | S | 3.49
Q | P | 2.59
Q | Q | 0
Q | R | 1.69
Q | T | 0.76
R | O | 1.53
R | Q | 1.69
R | R | 0
R | U | 0.36
S | P | 3.49
S | S | 0
S | V | 1.01
T | Q | 0.76
T | T | 0
T | U | 2.16
T | V | 0.47
U | R | 0.36
U | T | 2.16
U | U | 0
U | W | 2.3
V | S | 1.01
V | T | 0.47
V | V | 0
V | W | 1.35
W | U | 2.3
W | V | 1.35
W | W | 0
W | X | 1.43
X | W | 1.43
X | X | 0
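The edge rows of the graph table can be loaded into an adjacency list so that
the total length of a retrieved path can be computed. The Python sketch below
is an illustration only and is not the implementation used in this project.
The function names build_adjacency and path_length are hypothetical, and only
the rows whose source is A, B, C or D are copied from the table. The unit of
length is not stated in the table, so totals are in the same unit as the
length column.

```python
# Rows (source, destination, length) copied from the graph table for
# sources A-D; the remaining rows would be loaded in the same way.
EDGE_ROWS = [
    ("A", "A", 0), ("A", "B", 1.66), ("A", "C", 0.42),
    ("B", "A", 1.66), ("B", "B", 0), ("B", "D", 1.22), ("B", "E", 0.14),
    ("C", "A", 0.42), ("C", "C", 0), ("C", "D", 0.4),
    ("D", "B", 1.22), ("D", "C", 0.4), ("D", "D", 0), ("D", "G", 0.46),
]


def build_adjacency(rows):
    """Map each source to {destination: length}, skipping zero-length self-loops."""
    adjacency = {}
    for source, destination, length in rows:
        if source == destination:
            continue  # a node's row to itself carries no routing information
        adjacency.setdefault(source, {})[destination] = length
    return adjacency


def path_length(adjacency, path):
    """Sum the edge lengths along path; raises KeyError if an edge is missing."""
    return sum(adjacency[a][b] for a, b in zip(path, path[1:]))


ADJACENCY = build_adjacency(EDGE_ROWS)

# Compare two alternative paths from A to G.
for candidate in (["A", "C", "D", "G"], ["A", "B", "D", "G"]):
    print(" -> ".join(candidate), round(path_length(ADJACENCY, candidate), 2))
```

Keeping the rows directed, as they are stored in the table, preserves roads
that are listed in one direction only, such as H to K.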