kuantan polisas
MODELING AND QUERYING ALTERNATIVE PATHS IN KUANTAN

MAZLINA BT MOHAMAD SALLEH

A project report submitted in partial fulfillment of the requirements for the award of the degree of Master of Science (Computer Science)

Faculty of Computer Science and Information System
Universiti Teknologi Malaysia

NOVEMBER 2007

"I declare that I have read this project and in my opinion this project has satisfied the scope and quality for the award of the degree of Master of Science (Computer Science)"

Signature : ______________________
Name of Supervisor I : Assoc. Prof. Dr. Naomie Salim
Date : 26 NOVEMBER 2007

I declare that this project report is the result of my own research except as cited in the references. The project report has not been accepted for any degree and is not concurrently submitted in candidature of any other degree.

Signature : ……..………………………………….
Name of candidate : MAZLINA BT. MOHAMAD SALLEH
Date : 26 NOVEMBER 2007

To my beloved mom and dad, brothers, sister and friends, thanks for the endless love and care; to the most beloved kids, Daneal, Lissa and Azfar, you are the light of my life; and a very special thanks to my supervisor, Assoc. Prof. Dr. Naomie Salim, for the encouragement and support.

ACKNOWLEDGEMENTS

First and foremost, I would like to thank ALLAH s.w.t. for all the achievements that I have gained today. I would like to express my gratitude to my supervisor, Associate Professor Dr. Naomie Salim, for her attention, guidance, encouragement and patience throughout this study. I also thank the other examiners, Dr. Siti Zaiton Mohd Hashim and Dr. Ismail Mat Amin, for their comments and guidance on this project, and En Shafry for the ideas. To all the lecturers, please accept my heartfelt thanks and sincere gratitude for all your valuable kindness, and may Allah reciprocate all your good deeds in the best way.
I would also like to extend my deepest appreciation to the staff at Jabatan Ukur dan Pemetaan Pahang, the staff at Kuantan Municipal Council (GIS Department) and the staff at Pahang Tourism Centre, as well as En. Fikri Ismail and En. Firdaus Ali from Politeknik Sultan Haji Ahmad Shah (Polisas), for sharing information, knowledge, ideas and comments. To families and friends, the endless love will remain forever and only Allah can bless your kindness.

ABSTRACT

Route finding based on geodetic data addresses the growing data management and analysis needs of spatial applications such as Geographic Information Systems (GIS). Spatial databases are prominently used in GIS applications such as digital maps. This study discusses the process of modeling road network data from a map, consisting of points such as the starting and ending points of roads and their intersections with other road segments. The model enables the storage of a spatial dataset for calculating the distance between points. The route finding solution takes the distance information in the form of a directed graph, based on the starting (source) and ending (destination) nodes of the desired paths. The graph algorithm used in this study is adapted from Floyd's approach, combined with breadth-first search (BFS) and depth-first search (DFS) strategies. Structured Query Language (SQL) is used for querying the database structures. A browser interface for the system makes information dissemination easier.

ABSTRAK

Route finding based on geographic data has revealed many analysis needs related to spatial applications such as Geographic Information Systems (GIS). Spatial databases have been widely used in GIS applications such as digital map applications. This study discusses the process of modeling road network data from map information containing locations or points, such as the starting and ending points of roads and the junctions from one road to another, so that the information can be stored as a spatial dataset and distances can be calculated between the desired points. The route finding solution uses the distance of each road segment from one point (node) to another in the form of a directed graph. The graph theory algorithm used is based on Floyd's approach and a combination of breadth-first and depth-first search strategies. This study also chooses Structured Query Language (SQL) for the database structure. The use of a browser as the interface eases the delivery of information and feedback to users.

TABLE OF CONTENTS

DECLARATION
DEDICATION
ACKNOWLEDGEMENTS
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER 1 OVERVIEW
1.1 Introduction
1.2 Problem Background
1.3 Problem Statement
1.4 Objectives
1.5 Project Scope
1.6 Significance of Study
1.7 Summary

CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
2.2 Graph Theory
2.2.1 Graph Characteristics and Models
2.2.2 Road Network Modelling
2.2.3 Graphs in Database
2.3 Path Algorithm
2.4 Searching Technique Analysis
2.4.1 Breadth-First Search (BFS)
2.4.2 Depth-First Search (DFS)
2.4.3 Best First Search
2.5 Structured Query Language
2.6 Spatial Database
2.7 Digital Map
2.8 Kuantan Digital Map
2.9 Discussion

CHAPTER 3 METHODOLOGY
3.1 Introduction
3.2 Research Framework
3.2.1 Extracting map information
3.2.2 Modeling selected data
3.2.3 Selecting Suitable Query and Algorithm
3.2.4 Result Analysis
3.3 Instrumentation
3.3 Conclusion

CHAPTER 4 RESULTS
4.1 Introduction
4.2 Initial Findings
4.3 Experiments
4.4 Findings
4.5 Conclusion

CHAPTER 5 CONCLUSION

REFERENCES

APPENDICES
Appendix A - List of nodes (places)
Appendix B - List of edges (source and destination)

LIST OF TABLES

2.1 Terms in Graph
3.1 Sample data from table location
3.2 Sample data from road_graph
3.3 List of hardware used
4.1 Sample of retrieval query in details
4.2 Sample of retrieval query in details
4.3 Summary of 8 query processing

LIST OF FIGURES

2.1 Single and double direction road representation
2.2 Breadth-first Search (BFS)
2.3 Depth-first Search (DFS)
2.4 Example of topological data
2.5 Example of data storage for arc
2.6 Google facilities in assisting direction findings
3.1 Stages in database design
3.2 Research Framework
3.3 Kuantan Map
3.4 Map of Kuantan town
3.5 Kuantan town network road
3.6 Nodes and arc representation
3.7 Sample query from user through browser
4.1 Sample of data derived

LIST OF ABBREVIATIONS

GIS    Geographical Information System
SQL    Structured Query Language
RDBMS  Relational Database Management System
JUPEM  Jabatan Ukur dan Pemetaan Pahang
BFS    Breadth-First Search
DFS    Depth-First Search

CHAPTER 1

OVERVIEW

1.1 Introduction

Routes become an interesting topic whenever travelling is involved. Routes are generally associated with maps, road network models and geographic data. A great number of studies have developed techniques for finding alternative routes, the 'best' routes or the optimal routes within a road network, depending on the weights given. For example, people tend to use the same route to travel from home to the office because of the road length or other conveniences, but sometimes they have to use alternative routes because of events such as traffic congestion, construction work or road damage.
In pre-trip planning, for example, drivers tend to take one path for the outbound trip and another path for the return, for reasons such as different sightseeing or attending other functions at a location that intersects with the selected return path. Then come the questions: Which way should be taken? How can I get there along that path? Which route is the shortest from A to B? Route finding can be implemented by applying a set of algorithms based on mathematical graph theory.

This study presents a model of a road network based on a graph structure, using the MySQL database. The basic road data is represented using nodes as points of location and edges or arcs as paths. Each path consists of a source node as the starting point and an end node as the destination. This study also embeds spatial data such as the position (x and y) of certain locations. With the development of geographic information system (GIS) technology, network transportation analysis within a GIS environment has become a common practice in many application areas (Roozbeh et al., 2003). For this kind of query, the user has a definite destination in mind and desires to acquire the optimal route leading to it. The queries can vary and can answer questions such as "Which is the shortest route from Kuantan Airport to Cherating?" or "How many routes can be taken from Teruntum Complex to Teluk Chempedak?".

The motivation for this study came from the desire to manipulate spatial GIS data, which can be represented using advances in database technology based on applications of graph theory. This study describes one such extension, where database technology is used to implement path queries over a graph view of relational data. Partial-path information is pre-computed and stored using Structured Query Language (SQL).
Path querying is implemented using SQL functions, so the retrieved path tables can be manipulated within SQL queries in the same way as standard relational tables.

1.2 Problem Background

Finding paths or routes has become an area of interest as part of GIS and spatial database applications. Many applications can be built on route finding techniques, for example in transportation networks and travel guidance. Most path finding applications, such as map applications, focus on finding the best or the shortest path. Well-known web services such as Google or Yahoo retrieve paths using a shortest path finding algorithm, which returns the best path according to minimal distance.

In a normal route finding situation, travelling from one location to another, for example from location A to location B, could produce more than one path, say five paths. Of the five retrieved paths, one could be the shortest in terms of minimal distance, another could be the fastest in terms of minimal driving time, another could have free-flowing traffic but cover a longer distance than the others, and the rest might be paths with good scenery. This gives users a choice of paths that suit their needs and requirements throughout the trip, and assists road users in making cost-effective decisions about time and distance.

As mentioned above, the algorithm used in most web services for path finding is based on the minimal distance, which is considered the 'best' distance, whereas consideration should also be given to users who would like to know all the possible paths between a set of locations.
In recent decades, road network systems have become complex and congested, which affects people's convenience. This situation has driven this study to find an alternative solution for users in path finding through geodetic data.

According to JUPEM (Pahang branch), the GIS application for path finding on the Kuantan digital map is directed towards giving one solution, based on main roads and minimal distance. For example, to find a path from Teruntum Complex to Teluk Cempedak, the system computes the 'best' path as the one with high priority, such as a main road, and minimal distance, whereas there could be several paths for the user to select from. This situation has also driven this study to provide flexibility of path choices between a set of selected locations.

1.3 Problem Statement

This research studies the feasibility and effectiveness of using the query support in a relational database management system (RDBMS) for information retrieval, in order to find the routes or alternative paths between two entities by adapting path algorithms described in the literature. In this research, a framework for data modeling, algorithm selection and query retrieval technique is developed.

1.4 Objectives

1. To build a data model for storing Kuantan map information on the road network.
2. To apply a graph algorithm for finding alternative paths and their lengths between two locations, using the Floyd-Warshall approach on a spatial dataset.

1.5 Project Scope

1. The technique of storage and retrieval will be supported by a relational database management system (RDBMS) and Structured Query Language (SQL).
2. The data model is based on a mathematical graph theory approach.
3. The graph theory algorithm is adapted from Floyd's approach and embedded in the SQL schema.
4. The data set and area of study are based on part of the Kuantan town map, consisting of 24 location points and 61 directed edges.

1.6 Significance of Study
This study is necessary to support the use of mathematical graph theory and algorithms in path finding over a database. It is hoped that this study will be another contribution to the area of data storage applications and information retrieval for local geographic data sets. The use of an algorithm for finding alternative paths can be an additional function for existing path finding applications, especially for the Kuantan dataset.

1.7 Summary

This report consists of five chapters. This first chapter presents the overview of the project, comprising the general introduction, the problem background and the scope of study. The problem background describes the current situation of path retrieval for general and Kuantan map applications, and thus explains the drive for this study. Chapter 2 reviews the literature related to the study, Chapter 3 presents the project methodology and workflow process, Chapter 4 analyzes the findings and Chapter 5 comprises the conclusion and suggestions for future work.

CHAPTER 2

LITERATURE REVIEW

2.1 Introduction

This chapter discusses the literature that sets the background of this study. The review covers graph theory and its relation to road network modeling, spatial database modeling, the foundations of path algorithms, SQL usage and map study.

2.2 Graph Theory

The construction of the data is based on mathematical graph theory. A network in graph theory is defined as a directed or undirected graph and can be written as G = (N, A), consisting of:

• N: A set of nodes or vertices consisting of discrete points.
• A: A set of edges or arcs consisting of connections between the vertices, which can be either directed or not and are always associated with numerical values.

Together with this structural definition, algorithms generally also need to know about the properties of these elements. For example, the length, travel time or general cost of every edge needs to be known.
Mathematically, this is denoted as a function l which maps edges to real numbers: l: A -> R. For example, the length of an arc connecting nodes i and j can be denoted as l(i, j). This is indirectly used to decide how far one vertex is from another (Ahuja et al., 1993).

According to Celko (2004), graph theory is a branch of topology and the study of geometric relations, and can be ideal for modeling hierarchies such as family trees. Examples of applications that implement graph principles are organizational charts, language rules and route maps. Figure 1 shows an example of a simple graph consisting of nodes and the edges that relate them.

Figure 1: A Simple Graph

Let N be the set of nodes in Figure 1, L the set of edges and G the graph. Then the ordered pair {N, L} can be defined as:

N = {A, B, C, D, E, F}, L = {AC, CD, CF, BE}, G = {N, L}

2.2.1 Graph Characteristics and Models

The terms in Table 2.1 are useful in referring to the graph application in this study.

Table 2.1 Terms in Graph

Nodes and edges: Two nodes are adjacent if there is an edge between them. In a directed graph, the number of edges entering a node is its in degree and the number leaving is its out degree.

Path and cycle: A connected sequence of edges is a path; its length is the number of edges traversed. Two nodes are connected if there is a path between them. If there is a path connecting every pair of nodes, the graph is a connected graph. A path in which no node repeats is a simple path, and a path which returns to its own origin without crossing itself is a cycle or circuit. A graph with multiple paths between at least one pair of nodes is reconvergent. A reconvergent graph may be cyclic or acyclic.

Traversing graphs: There are two main approaches, breadth-first and depth-first. Breadth-first traversal visits all of a node's siblings before moving on to the next level, and typically uses a queue. Depth-first traversal follows edges down to leaves and back before proceeding to siblings, and typically uses a stack.

Sparsity: A graph where the number of edges approaches the maximum, N^2 for N nodes, is dense. When the number of edges is only a small multiple of N, the graph is considered sparse.

Trees: A tree is a connected graph with no cycles. It is also a graph where the in degree of the root node is 0 and the in degree of every other node is 1. A tree where every node has out degree <= 2 is a binary tree. A forest is a graph in which every connected component is a tree.

Euler paths: A path which traverses every edge in a graph exactly once is an Euler path. An Euler path which is a circuit is an Euler circuit.

2.2.2 Road Network Modeling

The modeling of road networks is strongly connected with graph theory (Hofnman et al., 2003), and a road network can be represented as a directed graph. According to Garofalakis et al. (2006), it is possible to design a graph that corresponds to a road network by using the GIS data that represents the road network of a certain city map. According to the fundamental definitions of graph theory from Gibbons (1985), a graph, or specifically a directed graph, can represent a map that depicts the road network of places such as a city or a country.

A directed graph or digraph G can be defined as an ordered pair G := (N, E), with a set of nodes N where |N| = n and a set E of ordered pairs of nodes where |E| = m, called directed edges. Each node of the graph represents an intersection or a terminal point of the roads. Each edge from one node to another represents a directed link between two adjacent intersections, or between an intersection and a terminal point. A single direction (one-way) road is represented with one directed edge in that direction, while a double direction (two-way) road is represented with two edges, one in each direction, as shown in Figure 2.1.
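The one-way and two-way representation described above can be sketched as a small adjacency structure. This is a minimal illustration only, not the data model used in this study: the function names `build_road_graph` and `path_length`, the node names and the road lengths are all hypothetical.

```python
def build_road_graph(roads):
    """Build a directed graph as {node: {neighbour: length}}.

    roads: iterable of (source, destination, length, two_way) tuples.
    A one-way road adds one directed edge; a two-way road adds two.
    """
    graph = {}
    for source, destination, length, two_way in roads:
        graph.setdefault(source, {})[destination] = length
        graph.setdefault(destination, {})  # make sure terminal points exist as nodes
        if two_way:
            graph[destination][source] = length
    return graph


def path_length(graph, path):
    """Sum the edge lengths along a sequence of adjacent nodes."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))


# Hypothetical sample data, invented for illustration only.
roads = [
    ("A", "B", 2.0, True),   # two-way road: edges A->B and B->A
    ("B", "C", 1.5, False),  # one-way road: edge B->C only
    ("A", "C", 4.0, False),  # one-way road: edge A->C only
]
graph = build_road_graph(roads)
edge_count = sum(len(neighbours) for neighbours in graph.values())
```

With this sample, the three roads produce four directed edges, because only the two-way road contributes an edge in each direction.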
Figure 2.1 Single and double direction road representation

A graph also carries a weight or cost for each edge. In this study, we assume that the cost is the length from one node to another along the edge. A path from a starting node to a destination node is a sequence of adjacent nodes, and adding up the costs of all edges on the path gives its total cost or length.

2.2.3 Graphs in Databases

According to Erwig and Gutting (1994), each data model has its own facilities to represent relationships among objects. The reason for considering graphs in a data model is that real-life problems can be directly expressed in terms of graph concepts (paths, spanning trees), and most of these problems can be solved by adapting suitable algorithms. With reference to graph theory, data can be represented as a series of nodes and edges, where a node represents an object of interest and an edge indicates a relationship between two nodes. The nodes and edges then become the attributes or objects that can be modeled using a database schema.

Examples of using such a data set in route planning are offered by MapQuest and other GIS applications; for example, the works of Wong et al. (2004) and Okyere (2000) contributed to studies on using graph theory for datasets. Beyond that, graphs over datasets are used in varied areas such as life sciences and biological data representation. According to Stephens et al. (2004), graphs enable complex networks to be visualized in a straightforward manner that captures the structure of the system, and they support hierarchies of information that are well suited for modeling the different levels of biological data.
The increasing size and complexity of data sources has made it more important to manage them as a graph representation in a relational database management system (RDBMS), which offers users the ability to store data in a secure, highly available and scalable environment.

2.3 Path Algorithm

Path algorithms have been widely used in route finding applications, for example to find the shortest routes, the optimal routes, the 'best' routes and alternative routes during a traversal search. Sometimes people try to find the best route in terms of a certain weight such as time, cost, distance or any combination of these. Many studies have discussed the path algorithms used in various applications. For example, Scott and Bernstein (2000) discussed a constrained shortest path problem that can be used in generating alternative paths, and Roozbeh et al. (2003) presented an evaluation of route finding methods covering three algorithms: Dijkstra's Algorithm, Heuristic Methods and the Genetic Algorithm.

According to Saltenis (2001), path problems can be categorized as single source, single pair and all pairs. Single source means finding the paths from one given source vertex to each of the other (destination) vertices. Single pair means finding a path given two vertices, a source and a destination. All pairs means finding a path for every pair of vertices, normally by applying a dynamic programming algorithm. This study concentrates on variants of the single pair problem, given an origin node and a destination node. Given an input graph, a source node and a destination node, the single pair algorithm returns the paths between the two nodes by traversing the graph.

Hamill and Martin (2003) modified the path finding algorithm based on the Hierarchical Encoded Path View (HEPV) by Jing, Huang and Rundensteiner and Zhang's disk-based Dijkstra's (diskSP).
This algorithm takes three important inputs: the graph, the source node, and the destination node together with a bound n on the weights between related nodes. The paths returned are those reachable from the source node via a path no longer than n. Note that in each case the input graph can be generated from ordinary relational data: nodes correspond to entities in the database and paths correspond to the connections between entities. This study follows this algorithm as a guideline, but with an approach adapted from the Floyd-Warshall algorithm.

2.3.1 Floyd-Warshall Fundamental

The Floyd-Warshall algorithm is a graph analysis algorithm for finding shortest paths in a weighted, directed graph. It is also known as the Roy-Floyd algorithm, since Bernard Roy described this algorithm in 1959 (Floyd and Warshall, 1962). The Floyd-Warshall algorithm is an example of dynamic programming. It compares all possible paths through the graph between each pair of vertices. To find the shortest path between a pair of vertices, it incrementally improves an estimate of the shortest path between the two vertices until the estimate is known to be optimal.

The pseudo code below is the heart of Floyd's approach and serves as the basis for finding alternative paths. It is based on a given directed graph G = (V, E), weighted with edge costs, and considers all pairs of vertices (u, v). All weights are assumed to be non-negative numbers, and the cost of a path is the sum of the costs of all edges in the path. The cost c(u, v) is assigned to each possible pair of vertices u and v in the graph.
Let:

c(u, v) = the given edge cost, if edge (u, v) exists
c(u, v) = infinity, if there is no edge (u, v) in the graph

Assume that the vertices are labeled or indexed using integers ranging from 1 to n, and let cost[i,j,k] hold the cost of the least cost path between vertices i and j with intermediate nodes chosen from vertices 1, 2, ..., k. The traversal checks each vertex for path finding and prints the nodes found starting from the root node, avoids printing the second node if it is the same as the first, moves to the next node, repeats the same procedure and so on. The pseudo code below was implemented in the C language.

Floyd-Warshall pseudo code:

for i := 1 to n do
    for j := 1 to n do
        cost[i,j] := c[i,j];    // let c[u,u] := 0
        next[i,j] := j;
for k := 1 to n do
    for i := 1 to n do
        for j := 1 to n do
            sum := cost[i,k] + cost[k,j];
            if (sum < cost[i,j]) then
                cost[i,j] := sum;
                next[i,j] := next[i,k];

// To write out the path from u to v:
w := u;
write w;
while w != v do
    w := next[w,v];
    write w;

2.4 Searching Technique Analysis

Graph traversal relies on searching strategies to provide the paths required by a query. The choice of searching strategy depends on the goal of the traversal. The discussion below covers the techniques used in this study.

2.4.1 Breadth-First Search (BFS)

Breadth-First Search (BFS) is a method that traverses a graph by touching all the nodes reachable from a given source node. It should not be confused with best-first search, a search algorithm which optimizes breadth-first search by expanding the most promising node chosen according to some rule. Pearl (1984) described best-first search as estimating the promise of a node n by a heuristic evaluation function f(n), which depends on the general description of n, the description of the goal and the information gathered by the search up to that point.

The BFS traversal starts from the source node, which is assigned to level 0.
At the first stage, all nodes at level 1 are visited, followed by the nodes at level 2 at the second stage, and so on for each following level. BFS searches the entire graph, visiting every node until it finds its goal and terminates. BFS normally labels each node with its distance, the number of links from the start node, and uses a First In First Out (FIFO) queue to hold the nodes waiting to be examined.

A search sequence is described in Figure 2.2. Assuming that A is the starting node, the traversal goes to nodes B, C and D, as these three nodes are at the same level. It then goes to node E, a child of B, and next to node F, a child of C; E and F are both at level 2. The traversal goes on to G, a child of E, which is at level 3 together with H. Lastly, the traversal ends at node I, the child of H, at the lowest level.

Figure 2.2 Breadth-first Search (BFS)

The general algorithm of BFS can be written as below:

1. Define the source and destination nodes, and put the root (source) node in the queue.
2. Pull a node from the beginning of the queue and examine it.
   a. If the searched goal is found in this node, terminate the search and return the result.
   b. Otherwise, push all the unexamined direct child nodes, if any, to the end of the queue.
3. If the queue is empty and every node has been examined, quit the search and return no result.
4. Repeat from step 2.

2.4.2 Depth-First Search (DFS)

Depth-First Search (DFS) is an algorithm for traversing or searching a tree, tree structure or graph. In contrast to BFS, DFS starts at a start node or at the root and explores as far as possible along each branch before backtracking. Formally, DFS is an uninformed search that progresses by expanding the first child node of the search tree and going deeper until a goal node is found or until it reaches a node that has no child.
For example, the traversal starts at the start node S in G, which becomes the current node. The algorithm then traverses the graph along any link (u, v) incident to the current node u. If the link (u, v) leads to an already visited node v, the search backtracks to the current node u. If, on the other hand, the link (u, v) leads to an unvisited node v, the algorithm moves to v, and v becomes the current node. That is, it picks the next adjacent unvisited node until it reaches a node that has no unvisited adjacent nodes. The search proceeds in this manner until it reaches a dead-end. At this point, the search starts backtracking, and the process terminates when backtracking leads back to the start node. Figure 2.3 shows a DFS applied to an undirected graph, with the nodes labeled in the order they were explored.

Figure 2.3 Depth-first Search (DFS)

2.4.3 Best First Search

Breadth-first search is able to find a solution without getting trapped in dead-ends, while depth-first search finds a solution without computing all of the nodes. Best-first search allows us to switch between paths, thus gaining the benefit of both approaches. It is a combination of DFS and BFS which optimizes the search at each step by ordering all current adjacent nodes according to their priority, as determined by a heuristic evaluation function. The search then expands the most promising node, the one with the highest priority. If the current node generates adjacent nodes that are less promising, it is possible to choose another node at the same level. In effect, the search changes from depth to breadth.

The heuristic evaluation function predicts how close the end of the current path is to a solution. Those paths that the function determines to be close to a solution are given priority and are extended first. A priority queue is typically used to order the paths for efficient selection of the best candidate for extension.
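The priority-queue behaviour described above can be illustrated with a greedy best-first search. This is a minimal sketch under stated assumptions rather than the implementation used in this study: the function name `best_first_search`, the sample graph and the heuristic values are hypothetical, and a lower heuristic value is assumed to mean a more promising node.

```python
import heapq


def best_first_search(graph, heuristic, start, goal):
    """Greedy best-first search.

    graph: {node: [adjacent nodes]}
    heuristic: {node: estimated closeness to the goal, lower is better}
    Returns the first path found from start to goal, or None.
    """
    # Priority queue of (heuristic value, node, path taken to reach the node).
    frontier = [(heuristic[start], start, [start])]
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)  # most promising candidate
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(
                    frontier,
                    (heuristic[neighbour], neighbour, path + [neighbour]),
                )
    return None


# Hypothetical sample data, invented for illustration only.
sample_graph = {"S": ["A", "B"], "A": ["C"], "B": ["G"], "C": [], "G": []}
sample_heuristic = {"S": 3, "A": 1, "B": 2, "C": 4, "G": 0}
found = best_first_search(sample_graph, sample_heuristic, "S", "G")
```

In this sample the search first expands A, the most promising neighbour of S. Its child C looks less promising than B, so the search switches to B at the same level as A and reaches the goal through it, which is the change from depth to breadth mentioned in the text.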
In summary, since DFS and BFS exhaustively traverse the entire graph until they find the goal, they are categorized as uninformed searches. In contrast, best-first search utilizes a heuristic to reduce the search space, is able to find the goal more efficiently and is categorized as an informed search.

2.5 Structured Query Language

A query language provides the means to access and manipulate data in a database. Structured (Standard) Query Language (pronounced SEQUEL) was developed by IBM in the 1970s and is now the de facto and de jure standard for accessing relational databases. Its three types of usage are standalone queries, high level programming and embedding in other applications.

Structured Query Language (SQL) is one of the popular query languages for expressing typical spatial queries within GIS capabilities. For spatial queries, the SQL/OGIS standard has been adopted by many vendors such as Oracle, MySQL and PostgreSQL; the implementations differ only in syntax, while the choice of spatial data types and operations is similar (R. Larson, 2007). This study uses the MySQL query language for the database structure and the implementation of graph searching. The spatial queries in this study, which require the presence of a network structure in the geographic spatial data, focus on finding paths or routes between an origin and a destination location, such as "What is the shortest route from IKIP college to Kuantan airport?" or "List the paths that can be taken from Teruntum Complex to Teluk Chempedak".

2.6 Spatial Databases

Spatial data is defined as location-related data about an object. A spatial database stores spatial objects and the spatial relationships between these objects. A road map is a common example of spatial data; it contains points, lines and polygons to represent cities, roads and political boundaries such as provinces.
This spatial data is used to project the locations of objects onto a two-dimensional space, with support from applications such as GIS for data storage, retrieval, updating and querying. In general, a Geographic Information System may be defined as a computer-based information system which attempts to capture, store, manipulate, analyze and display spatially referenced and associated tabular attribute data for solving complex research, planning and management problems (Fischer and Nijkamp, 1993). Other types of spatial data include computer-aided design (CAD) and computer-aided manufacturing (CAM) data (Ravada, 2007). The emergence of spatial technology has seen modern Database Management Systems (DBMS) used to support multiple users and data sharing (Zhou et al., 2001).

In spatial databases, graph concepts are mostly applied to model or visualize connections in the data, for example the network of roads, highways or other trails that is already stored as spatial data such as geometries, points or polygons. According to Erwig and Guting (1994), spatial networks can be modeled in terms of graphs. Nodes and edges can carry geometric information; for example, a point may be associated with a node and a polygonal line with an edge. Explicit paths are available as entities in a graph, which is important because objects can correspond to paths in a network.

This study looks at relationships in the spatial dataset from the topological point of view. According to Foote and Huebner, topology is one of the most useful relationships maintained in many spatial databases. It is defined as the mathematics of connectivity or adjacency of points or lines that determines spatial relationships in a GIS. The topological data structure logically determines exactly how and where points and lines connect on a map by means of nodes (topological junctions). The order of connectivity defines the shape of an arc or polygon.
The computer stores this information in various tables of the database structure, and the GIS manipulates, analyzes and uses topological data to determine data relationships. Network analysis uses topological modeling to determine shortest paths and alternate routes. For example, a GIS for emergency service dispatch may use topological models to quickly ascertain optional routes for emergency vehicles. Automobile commuters perform a similar mental task when they alter their route to avoid accidents and traffic congestion. Likewise, an electrical utility GIS could rapidly determine different circuit paths to route electricity when service is interrupted by equipment damage. Similarly, political redistricting planners could use certain algorithms to determine logical relationships between population groups and areas for district boundaries.

The figures below show an example of how topology is represented or modeled and how connections between nodes are coded into a database. The first step is to record the location of all "nodes", that is, the endpoints and intersections of lines and boundaries. Figure 2.4 shows an example of topological data consisting of 5 nodes with their latitude and longitude attributes.

Figure 2.4 Example of topological data

Based upon these nodes, "arcs" are defined. Figure 2.5 shows the relationship between the arcs and the nodes. These arcs have endpoints, but they are also assigned a direction, indicated by the arrowheads. The starting point of the vector is referred to as the "from node" and the destination as the "to node". The orientation of a given vector can be assigned in either direction, as long as this direction is recorded and stored in the database.

Figure 2.5 Example of data storage for arc

By keeping track of the orientation of arcs, it is possible to use this information to establish routes from node to node or place to place.
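The arc storage described above can be sketched in Python as follows. The arcs listed here are hypothetical stand-ins, since the contents of Figure 2.5 are not reproduced in the text, and the depth-first lookup is only one possible way of locating the connections.

```python
# Hypothetical arc table: each row is (from_node, to_node), as in a from/to arc listing.
ARCS = [(1, 2), (2, 3), (3, 4), (4, 1), (4, 5)]

def find_route(arcs, start, goal):
    """Depth-first lookup of one route that follows the recorded arc directions."""
    adjacency = {}
    for from_node, to_node in arcs:
        adjacency.setdefault(from_node, []).append(to_node)
    stack = [[start]]          # each stack entry is a partial route
    visited = set()
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for next_node in adjacency.get(node, []):
            if next_node not in visited:
                stack.append(path + [next_node])
    return None                # no route respects the arc directions

print(find_route(ARCS, 3, 1))
print(find_route(ARCS, 5, 1))
```

Because each arc is stored with its orientation, a route exists from node 3 to node 1 in this example table, while node 5 has no outgoing arc and therefore no route to node 1.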
Thus, if one wants to move from node 3 to node 1, the necessary connections can be located in the database. The implementation strategy behind this is to offer special data structures for the representation of graphs that allow traversal between nodes. Graph operations are to be implemented on the basis of efficient graph algorithms for spatially embedded networks.

2.7 Digital Map

A map can be described as a rule-based abstraction of reality that is intended to convey information. It is the result of applying rules to objects on the earth's surface and translating them into a graphical and informational representation (Browne and Jackson, 2004). Digital map data can also be defined as map detail held in the form of national grid coordinate values and codes that can be stored and manipulated on a computer (Ordnance Survey of UK, 2007). As noted in the previous section, a digital map is always associated with, and forms part of, a GIS application. The information derived from the map for this study consists of the road network and the positions of locations related to that road network.

2.8 Kuantan Digital Map

A few applications provide retrieval of information from the Kuantan digital map through the internet, such as those provided by Jabatan Ukur dan Pemetaan Malaysia (JUPEM), MapQuest (http://www.mapquest.com), Yahoo maps guide (http://www.mapsguide.org), Kuantan Online (http://www.kuantanonline.com) and the Google map application (Figure 2.6). There is also a GIS application for retrieving information from the Kuantan map, which was developed on MapInfo features. The path finding solution in this application concentrates on the shortest path, which gives the minimal distance and driving time. The application covers the whole area of Pahang and every type of road: highways, federal roads, main roads, country roads, town roads and streets.
The disadvantage is that not all locations are labelled and stored, which makes retrieval less accurate for certain location searches. This motivates the present study, which is intended as an alternative path finding solution for a local map.

Figure 2.6 Google facilities in assisting direction findings

2.9 Discussion

This chapter reviewed the studies that contribute to this study. It covered the mathematical graph theory that can be implemented within a database structure, suitable path algorithms for solving the single-pair problem, the analysis of suitable searching strategies, the use of SQL for the database structure, and an overview of digital maps. It also discussed the disadvantages of the available path finding systems for map applications. As discussed above, existing path finding concentrates on the shortest path with minimal distance and does not consider other possible paths. This motivates the study to offer flexibility in retrieving alternative paths for a given set of locations.

CHAPTER 3

METHODOLOGY

3.1 Introduction

This chapter discusses the methodology, including the steps taken in modeling the spatial data, data pre-processing, graph representation within the SQL database schema, and the selection of the query language used to run the searching algorithm and to retrieve results by parsing queries. The steps start from modeling the raw spatial source, which comes from a digital map of the location. The study focuses on spatial data for locations or points of interest in selected sample road networks. The data used are based on a certain area of Kuantan town.

3.2 Research Framework

The development of the database is based on the basic stages shown in Figure 3.1 (Longley, 2005).
The conceptual model captures the user's view and the application requirements, and defines the objects and relationships according to the geographic representation. At the logical model stage, the model is matched to geographic database types and the geographic database structure is organized using normalization. Lastly, it is designed into the database schema of a specific physical model.

Figure 3.1 Stages in database design

This study is conducted according to the procedures in Figure 3.2. As mentioned in the previous section, the development of the data model starts with the data from the map, followed by the calculation of distances between selected points, the modeling of the dataset as a graph using the database structure, and the application of searching techniques to find the alternative paths.

Figure 3.2 Research Framework (stages: Extracting map information, Modeling selected data, Selecting Suitable Query and Algorithm, Result Analysis)

3.2.1 Extracting map information

The information was based on maps from Jabatan Ukur dan Pemetaan Malaysia, Pahang, Kuantan Municipal Council and Kuantan Tourism Information Center. For this sample application, the information extracted from the Kuantan map relates to the road network and to the points of interest associated with the selected roads. Figure 3.3 shows the original map, which covers the Kuantan area at a scale of 1 : 90 000.

Figure 3.3 Kuantan map

Only a small part of the spatial information in the original map was selected. Since this application concentrates on finding alternative routes from one destination to another, the selected data concern the points of interest and the roads between them. Figure 3.4 shows part of Kuantan town. As in any other place, the road network connects one place to another, and normally more than one path or route connects two places, that is, the start and the end destination.
Figure 3.4 Map of Kuantan town

For the specific purpose of this study, the road network is based on the town area and the data were modeled from a tourism point of view. Figure 3.5 shows the related and important road network of the Kuantan area from the tourism perspective. Points of interest such as hotels, road junctions, road intersections, traffic lights and public spots are denoted as nodes, and the roads between points of interest are denoted as edges or arcs.

Figure 3.5 Kuantan town network road

For this study, 24 points of interest were chosen and denoted as 24 nodes. They consist of important spots in Kuantan that are closely related to the main roads in the town area, and they are connected by 34 edges or arcs with 61 road directions, made up of the road segments between the nodes.

3.2.2 Modeling selected data

Points of interest from the map application consist of buildings, properties, tourist attractions such as beaches, and public spots. The road network consists of the types of road and the things associated with the roads. For this study, only the spatial road attributes are considered, such as the position of each node along the roadside and other important geographic items of the respective edges and nodes. From Figure 3.5, the related points of interest and road network can be represented as in Figure 3.6. Points of interest or locations are defined as nodes, and the road paths between nodes are defined as edges or arcs connecting the nodes. For this study, 24 important locations were marked as nodes, linked by 34 edges with 61 directions, since a road segment may be one-way or two-way. The nodes are labelled A to X and represent the locations of road junctions, traffic lights, shopping complexes, mosques, beaches and hotels. The adjacency between nodes is defined by the incoming or outgoing path for each direction, representing a one-way or a two-way road.
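The relation between the 24 nodes, 34 edges and 61 directions can be checked with a short Python sketch. The segment lists below are transcribed from the edge set L given in this section, reading the segment between B and E as two-way; the variable and function names are illustrative only.

```python
# Two-way road segments, written once per segment (transcribed from the edge set L).
TWO_WAY = ("AB AC BE BD CD EF EM FG FJ MN JN JK LO NO MP "
           "PS PQ QT TV SV TU QR OR RU UW VW WX").split()
# One-way road segments, written in their direction of travel.
ONE_WAY = "DG GH HK KL LI IH IC".split()

def directed_edges(two_way, one_way):
    """Expand road segments into directed (source, destination) pairs."""
    edges = set()
    for source, destination in two_way:
        edges.add((source, destination))
        edges.add((destination, source))   # a two-way road gives both directions
    for source, destination in one_way:
        edges.add((source, destination))   # a one-way road gives a single direction
    return edges

edges = directed_edges(TWO_WAY, ONE_WAY)
nodes = {node for edge in edges for node in edge}

print(len(nodes), len(TWO_WAY) + len(ONE_WAY), len(edges))
```

With 27 two-way and 7 one-way segments, the expansion gives 27 x 2 + 7 = 61 directed edges over 34 segments, in agreement with the counts stated above.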
For this study, the edges with a single direction, which represent one-way roads, are DG, GH, HK, KL, LI, IH and IC. Following graph theory fundamentals, the nodes and edges of the graph can be written in the notation below.

Figure 3.6 Nodes and arc representation

Let N be the set of nodes in Figure 3.6, L the set of edges with single and double direction, and G the graph. Then the ordered pair (N, L) is defined as:

N = { A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X },

L = { AB, BA, AC, CA, BE, EB, BD, DB, DC, CD, DG, EF, FE, EM, ME, FG, GF, FJ, JF, GH, IH, IC, HK, KL, LI, MN, NM, JN, NJ, JK, KJ, LO, OL, NO, ON, MP, PM, PS, SP, PQ, QP, QT, TQ, TV, VT, SV, VS, TU, UT, QR, RQ, OR, RO, RU, UR, UW, WU, VW, WV, WX, XW },

Thus, G = (N, L).

The database model for the related data is represented in the form of tables, with sample records as shown below. The table location stores data about each location: the name of the location, a detailed description of the location, and its spatial data, comprising the grid position (x and y). The table road_graph stores the connectivity of the nodes as an adjacency list of related nodes, following the edge set given above. It comprises the source and destination nodes and a length that represents the distance between the nodes. The distance is calculated with the Euclidean formula:

$$\text{distance} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

| locateID | name | lat_x | long_y |
|---|---|---|---|
| A | Kemunting | 10 | 4.5 |
| B | Traffic Light, Junction Jln Wong Ah Jang of Jln Penjara | 16 | 13.5 |
| C | Kuantan Esplanade | 14.2 | 4 |
| D | Teruntum Complex | 14 | 8 |

Table 3.1 Sample data from table location

Standard SQL was used to create the tables (the CREATE TABLE statements follow), to insert data and to retrieve results for selected nodes. For this study, MySQL version 5 is proposed for managing the database structure, together with PHP scripts and a browser to present the queries and results.
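As a worked illustration of the Euclidean distance formula, the Python sketch below applies it to the sample grid positions of Table 3.1. The function name and the dictionary layout are assumptions made for this example; the result is in grid units, and no map scale is applied to convert it to kilometres.

```python
import math

# Sample grid positions from Table 3.1: locateID -> (lat_x, long_y).
LOCATIONS = {
    "A": (10.0, 4.5),   # Kemunting
    "B": (16.0, 13.5),  # Traffic light junction
    "C": (14.2, 4.0),   # Kuantan Esplanade
    "D": (14.0, 8.0),   # Teruntum Complex
}

def grid_distance(source, destination, locations=LOCATIONS):
    """Euclidean distance between two locations, in grid units."""
    x1, y1 = locations[source]
    x2, y2 = locations[destination]
    return math.hypot(x1 - x2, y1 - y2)

print(round(grid_distance("A", "C"), 2))
```

The same function could be used to fill the length column of road_graph for each pair of adjacent nodes, once the grid units have been converted to the unit of distance required.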
    CREATE TABLE location (
        locateID varchar(10) NOT NULL,
        name varchar(255) NOT NULL,
        lat_x float(10,2) NOT NULL,
        long_y float(10,2) NOT NULL,
        PRIMARY KEY (locateID)
    );

    CREATE TABLE road_graph (
        source varchar(10) NOT NULL,
        destination varchar(10) NOT NULL,
        length float(10,2) NOT NULL,
        PRIMARY KEY (source, destination)
    );

The length from a node to itself is set to 0 to avoid miscalculation between identical nodes. The source and destination fields of road_graph refer to the locateID of the table location and are denoted by the labels A to X for the purpose of this study. The length field contains the distance between adjacent nodes and is calculated with the distance formula mentioned above. Other related tables include the table road, which holds road details such as the name and the edges. The details are provided in the Appendix.

| source | destination | length |
|---|---|---|
| A | B | 7 |
| A | C | 5 |
| A | A | 0 |
| B | D | 0.5 |
| B | E | 1.5 |
| B | B | 0 |
| C | D | 0.2 |

Table 3.2 Sample data from road_graph

3.2.3 Selecting Suitable Query and Algorithm

Based on the table road_graph, the results retrieved from searching and traversing the nodes are stored in temporary tables that contain the path and the calculated length. The algorithm is based on the Floyd-Warshall procedure, adapted from its use in solving shortest path problems. This study refers to the pseudocode discussed in the previous chapter, adjusted for finding alternative paths. The transitive closure of the Warshall algorithm is used together with the breadth-first search (BFS) technique. The selected query focuses on finding all possible paths between the source and destination nodes required by the user, so the algorithm deals only with the single-pair path problem. The table for the input source and destination nodes is assumed to be #graph.
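To make the idea of single-pair alternative paths concrete, the Python sketch below enumerates breadth-first every path that visits no node twice. It is an illustrative stand-in and not the SQL procedure of this study; the edge list is the part of the edge set L that lies within nodes A to I, and the function names are hypothetical. The case where source and destination coincide is handled only as the trivial path.

```python
from collections import deque

# Directed edges among nodes A to I, taken from the edge set L
# (edges that lead outside this nine-node subset are omitted).
EDGES = "AB BA AC CA BE EB BD DB DC CD DG EF FE FG GF GH IH IC".split()

def build_adjacency(edges):
    """Adjacency list: source node -> list of destination nodes."""
    adjacency = {}
    for source, destination in edges:
        adjacency.setdefault(source, []).append(destination)
    return adjacency

def alternative_paths(adjacency, source, destination):
    """Breadth-first enumeration of all paths that visit no node twice."""
    found = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            found.append(path)      # a complete path; do not extend it further
            continue
        for next_node in adjacency.get(node, []):
            if next_node not in path:
                queue.append(path + [next_node])
    return found

for path in alternative_paths(build_adjacency(EDGES), "A", "G"):
    print(" -> ".join(path), "| nodes visited:", len(path))
```

For the pair A to G this sketch yields four routes, the same number of alternative paths as reported in Table 4.1. The total length of each route could be obtained by summing the length values of the corresponding road_graph rows.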
The procedure below assumes that the graph may not be a tree, so it checks all possible nodes starting from the given source node and follows the connections in the adjacency list until it finds the destination node. The search starts at the root node (source) and takes the reachable nodes from the adjacency list. It checks the child nodes, that is, paths of length 1 (edges). If the destination node is not found, it goes down to the sub-child nodes until the destination is found, then returns to the root and turns to the other child or sibling nodes at the same level, again going down until the destination node is found, and so on until the iteration is finished. The simple SQL statements are as follows.

    create table #graph ( id int primary key )
    insert into #graph values ( @root_id )
    while ( @@rowcount > 0 )
    begin
        -- add every child of an already reached node that is not yet in #graph
        insert into #graph ( id )
        select distinct e.child_id
        from edges e
        join #graph p on p.id = e.parent_id
        where e.child_id not in ( select id from #graph )
    end

For this study, equivalent SQL syntax for MySQL was adapted from the procedure above, as shown below. The nodes involved in the graph are assumed to be 1 to N. The temporary table that stores the retrieved paths is denoted paths, and the table that stores the adjacency list is road_graph, which consists of the fields source, destination and length for each edge. In the schematic statement, road_graph1 to road_graphN-1 denote aliases of the table road_graph, one for each edge of the path.

    INSERT INTO paths
    SELECT road_graph1.source, road_graph2.source, …, road_graphN-1.destination,
           (road_graph1.length + road_graph2.length + … + road_graphN-1.length),
           ( CASE WHEN road_graph1.source NOT IN
                  (road_graph2.source, road_graph3.source, …, road_graphN-1.source)
                  THEN 1 ELSE 0 END
             + …
             + CASE WHEN road_graphN-1.source NOT IN
                  (road_graph1.source, road_graph2.source, …, road_graphN-2.source)
                  THEN 1 ELSE 0 END )
    FROM road_graph AS road_graph1, road_graph AS road_graph2, …, road_graph AS road_graphN-1
    WHERE road_graph1.source = 'source_node'
      AND road_graph1.destination = road_graph2.source
      …
      AND road_graphN-1.destination = 'destination_node';

With this approach, the maximum number of nodes is placed in the table paths as the final path taken for each iteration of node traversal. The table paths holds the retrieved results and the total length for each traversal or search. The results are then refined and filtered to obtain the final alternative paths by eliminating redundant rows and paths that traverse a node more than once. A browser application is used to receive the query from the user, as shown in Figure 3.7.

Figure 3.7 Sample query from user through browser

3.2.4 Result Analysis

The results of query processing are stored in a table and need to be refined for the end user. Selected queries are used to filter the desired result using SQL functions. The result is grouped and ranked by the calculated length, by the total path, or by the number of nodes traversed. The result analysis is explained in detail in Chapter 4.

3.3 Instrumentation

The instrumentation involved in this study is divided into two types, hardware and software.

i) Hardware used in this project:

| Component | Specification |
|---|---|
| Processor | AMD Turion 64X2 (1.8 GHz) |
| Memory | 1.0 GB |
| Hard Disk | 120 GB |
| Monitor | 13” |

Table 3.5 List of hardware used

ii) Software used in this project:
a. MySql Version 5.1
b. PHP Admin
c. Webserver (Apache)
d. Microsoft Word (Office XP)
e. Microsoft Access (Office XP)
f. Microsoft Visual Studio

3.4 Summary

This chapter has discussed the methodology used in this study, based on the research framework provided. The map information extraction stage involved the tedious task of selecting the points of interest and identifying the positions of road edges and intersections.
The roads were selected according to their priority and importance for the transport network in the Kuantan town area, and SQL was selected for the flexibility with which it can be embedded in a programming language. Each stage of the research framework plays an important role in accomplishing this study.

CHAPTER 4

RESULTS

4.1 Introduction

This chapter discusses the process followed in accomplishing this study, based on the stages of the framework design discussed in the previous chapter. The results are based on sample test nodes and on the iterations over selected nodes that connect the source and destination locations. As mentioned in the previous chapter, this study focuses on finding alternative paths between a set of locations. Therefore, the tests of the algorithm described in the previous chapter focus only on a single pair of source and destination nodes. The discussion of the experiments concerns the suitability of the algorithm for finding alternative paths and the pattern of searching that is reflected in the search results.

4.2 Initial findings

4.2.1 Collecting data

From the interview session with En. Hashim Ismail, Deputy Director of Jabatan Ukur dan Pemetaan (JUPEM), Kuantan branch, it was found that the map application for public viewing is developed entirely on a GIS implementation. The road network or path information is basically static, taken either from an online digital map viewer or from a hard copy of the map. The map of Kuantan tourism was obtained from Kuantan Tourism Information Center with the permission of Kuantan Municipal Council. Alternative path finding is based on the important locations along the main roads in part of the Kuantan town area. Ideas on the development of the research content were also contributed by En. Fikri Ismail and En. Firdaus Muhamad, staff of the GIS department at POLISAS.

4.2.2 Technical study

The technical study covered a review of the foundations of mathematical graph theory, the database structure in the SQL (MySQL) implementation, and the related algorithms. The calculation of distances was supervised and verified by staff from JUPEM and POLISAS. For this study, the Floyd-Warshall traversal technique was used and implemented in SQL.

4.3 Experiments

For testing purposes, the dataset from the graph was modeled as a directed graph. A few queries were tested, and the results were assessed on accuracy and on the number of traversals from the source to the destination node. The SQL syntax from the methodology discussed in the previous chapter was implemented. The experiments were grouped by the maximum number of nodes tested, that is 9, 12 and 24 nodes. They used a temporary table to keep the nodes traversed in each iteration from node 1 to n through the adjacent nodes. The following sections explain the findings of two experiments, followed by a summary of several processing analyses and an overall discussion of the findings.

4.3.1 Experiment 1

Sample tests were run by applying the queries above. The test was between Kemunting as the source and Taman Kerang as the target destination. Kemunting was denoted as A and Taman Kerang as G. The experiment involved 9 nodes, denoted A to I. The processing took 0.39 seconds with 1474 traversal iterations. After filtering and refining, the results display the alternative paths that can be taken from Kemunting to Taman Kerang. Table 4.1 shows the filtered result of the query processing, presented with the actual location names in place of the node labels.
| Num | Alternative path | Path length (km) | Node/s visited |
|---|---|---|---|
| 1 | Kemunting → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Penjara → Kompleks Teruntum → Taman Kerang | 2.18 | 4 |
| 2 | Kemunting → Esplanade → Kompleks Teruntum → Taman Kerang | 1.33 | 4 |
| 3 | Kemunting → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Penjara → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Bukit Ubi → Hotel Shahzan off Jalan Gambut → Taman Kerang | 3.07 | 5 |
| 4 | Kemunting → Esplanade → Kompleks Teruntum → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Penjara → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Bukit Ubi → Hotel Shahzan off Jalan Gambut → Taman Kerang | 3.31 | 7 |

Table 4.1 Sample of retrieval query in details

From the result above, the findings can be analyzed as follows:

i. The number of alternative paths available for a single pair, that is, from the source to the destination node or location. In the result above, 4 alternative paths were retrieved (without considering traversals that pass through a node more than once).

ii. The nodes visited from source to destination. The number of visited nodes can be read for each alternative path: path 1 visits 4 nodes, path 2 also visits 4 nodes, path 3 visits 5 nodes, and path 4 visits the highest number of nodes, 7.

iii. The length of each retrieved path. For this test, distances are given in kilometers (km). From the table above, path 1 is 2.18 km, path 2 is 1.33 km, path 3 is 3.07 km and path 4 is 3.31 km. The result therefore also shows the minimum (shortest) and the maximum (longest) distance from the source to the destination location.

Figure 4.1 shows a sample of the interface, which uses the browser together with PHP and MySQL dataset retrieval to present the output.
The output shows the set of retrieved paths based on the input from the user, who selected Kemunting (A) as the source node and Taman Kerang (G) as the destination node.

Figure 4.1 Sample of data retrieved

4.3.2 Experiment 2

This experiment tests another set of locations in which the source and the destination are the same node, so that the path returns to the node itself. The traversal therefore involves cycles from and to the same node. Kemunting (A) was taken as the sample, and the aim was to find the paths that can be traversed from Kemunting as the source node back to Kemunting as the target destination. The experiment involved 9 nodes, denoted A to I. The processing took 0.38 seconds with 2078 traversal iterations. After filtering and refining, the results display the alternative paths that can be taken from Kemunting to itself. Table 4.2 shows the filtered result of the query processing, presented with the actual location names in place of the node labels.

| Num | Alternative path | Path length (km) | Node/s visited |
|---|---|---|---|
| 1 | Kemunting → Kemunting | 0 | 1 |
| 2 | Kemunting → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Penjara → Kemunting | 2.16 | 2 |
| 3 | Kemunting → Esplanade → Kemunting | 0.84 | 2 |
| 4 | Kemunting → Junction/Traffic Light at Jalan Wong Ah Jang off Jalan Penjara → Kompleks Teruntum → Esplanade → Kemunting | 2.49 | 4 |

Table 4.2 Sample of retrieval query in details

The analysis described for Experiment 1 is clearly shown by the result displayed. For this experiment, an exception was made to the rule against traversing a node more than once: the first and last steps may visit the same node, because the source and destination point to the same location. The count of nodes visited is based on the distinct nodes visited. For example, in path 1 the source and destination refer to the same node, so it is counted as one node.
The same applies to path 2 and path 3, which each have a node count of 2, because a node that is reached again is not counted twice.

4.3.3 Summary of Experiment 1 and Experiment 2

From Experiment 1, from location A to G (Kemunting to Taman Kerang), 4 alternative paths were found; the shortest was path 2 and the longest was path 4. The number of visited nodes does not determine the shortest path, because the weight of each road segment is different. A path that visits more nodes may still be shorter than a path that visits fewer nodes. Experiment 2, which involved the same source and destination node, clearly showed that paths with the same number of visited nodes may have different lengths, depending on the weights. The results can therefore help in deciding which path to take for any purpose of road travel or planning.

4.4 Findings

Several test queries were run to compare the response time of the iterations and the pattern of the retrieved results. The experiments used MySQL version 5.0 for storing the datasets and for query processing. Tests were run on different total numbers of nodes and on different locations. The reason for testing several sets of nodes is to compare the response time against the number of nodes involved and the number of search iterations for each pair of source and destination nodes. The summary of the tests for 8 single pairs of nodes is shown in Table 4.3. The locations are denoted by the labels A to X. "Location involved" means the source location and the target location, written as source node → destination node.
"Number of nodes involved" is the maximum number of nodes represented in the graph table. "Response time" is the query processing time. "Number of iterations" is the number of traversal paths visited by the search from and to each node. "Number of alternative paths refined" is the number of paths that do not traverse any node more than once, with the exception of identical source and target nodes, as mentioned in the analysis of the experiments above.

| Location involved | Number of nodes involved | Response time (seconds) | Number of iterations / retrieved paths | Number of alternative paths refined (final result) |
|---|---|---|---|---|
| A → G | 9 | 0.39 | 1474 | 4 |
| A → C | 9 | 0.42 | 1916 | 2 |
| D → H | 9 | 0.33 | 782 | 2 |
| A → A | 9 | 0.38 | 2078 | 4 |
| M → X | 12 | 3.98 | 23 771 | 9 |
| O → X | 12 | 4.84 | 31 400 | 10 |
| O → X | 24 | 15.72 | 100 000 | 14 |
| M → X | 24 | 10.02 | 70 721 | 12 |

Table 4.3 Summary of 8 query processing

From the results, the following can be observed:

i. An increase in the number of nodes checked or visited increases the time taken for node traversal. For example, query processing for the set of 9 nodes took between 0.33 and 0.42 seconds, the set of 12 nodes took between 3 and 5 seconds, and the set of 24 nodes took more than 10 seconds to complete the iterations. This shows that response time increases as the number of nodes increases.

ii. The number of iterations affects response time when the number of retrieved paths is large. The query with 100 000 iterations needed more than 15 seconds of processing time, compared with only 0.39 seconds for the query with 1474 iterations. However, the comparison between A → A and A → C behaves differently.
The A → A query needs 0.38 seconds to produce 2078 rows, while A → C needs 0.42 seconds to process 1916 rows, so small differences in the number of iterations have little effect on response time.

iii. The pattern of searching from the root node (source node) shows that priority is given to the right node or leaf of the parent node. The search then checks the child nodes until it finds the destination, goes back to the root, takes another child and so on, as discussed in the previous chapter. Taking Experiment 1 (A → G) as an example, the traversal starts from the root node A and goes to B, which is the right leaf of the node. After finding the destination node, it goes back to the root and continues with the other child, C, until it finds the destination. It then goes back to the root again and searches from child B until it finds the destination, returns to the root, searches from child C, and so on. The same pattern appears in Experiment 2, where the search goes from A to A, then A to B, then A to C until it finds the destination.

4.5 Conclusion

From the overall findings, it is concluded that the algorithm can be applied to larger datasets, but the response time increases as nodes are added to the graph list. The searching technique used in this study checks each node visited and gives equal traversal to each level of parent and child nodes involved. It therefore returns all possible paths for the requested set of locations. A browser is used for easier interaction between the user and the proposed system.

CHAPTER 5

CONCLUSION

5.1 Introduction

This chapter concludes the work that has been done to meet the project goal.
The main objective of this project is to show the effectiveness of the path algorithm used in finding the possible paths for a set of nodes or locations.

5.2 Contribution of Study

This study has achieved its goal of finding a solution for retrieving alternative paths for a set of locations. The findings in the previous chapter show that each traversal process retrieved more than one possible path. The algorithm used can be a contribution to existing systems.

5.3 Suggestion for Future Work

There is a driving force behind the growth of database development, specifically in spatial database applications, to support decision making and other business processes. The path finding technique can be extended to incorporate input data from fields other than pre-trip planning or route finding. The proposed system can serve as a backbone for enhanced business analysis, as in these examples:

1. Insurance risk assessment: to answer queries such as "What types of accident happened within 500 meters of this intersection?" or "List the accident cases that happened between the junctions of Kemunting and Tanjung Lumpur in the past year."

2. Retail site selection based on the number of site locations along the selected road: to answer queries such as "Where should we open our new store branch in Kuantan town, between Jalan Teluk Sisek and Jalan Besar, near the public spot area?"

In the two examples above, integrating the proposed system with other attribute data such as accident statistics, insurance claims, and demographic or population statistics can answer the query. It can thus support decision making in other kinds of analysis.

REFERENCES

Ahuja, R. K., Magnanti, T. L., and Orlin, J. B. (1993). Network Flows: Theory, Algorithms and Applications. Englewood Cliffs, NJ: Prentice Hall.

Celko, J.
(2004). Trees and Hierarchies in SQL for Smarties. San Francisco: Morgan Kaufmann.

Cromley, E. K. (1997). Digital Map Librarianship: Maps and Digital Spatial Data. Connecticut: IFLA Section of Geography and Map Libraries.

Dolman, J., Hodgson, B., Dowsey, J., Heffernan, J., Seymour, J., Simons, B., and Woods, B. (1996). Further Mathematics: VCE Units 3 & 4 (2nd Edition). Queensland: The Jacaranda Press.

Erwig, M., and Guting, R. H. (1994). Explicit Graphs in a Functional Model for Spatial Databases. Hagen, Germany.

Garofalakis, J., Polyxeni, N., and Athanasios, P. (2006). Vehicle Routing and Road Traffic Simulation: A Smart Navigation System. Patras, Greece.

Hamill, R., and Martin, N. (2003). Database Support for Path Query Functions. London.

Longley, P. A., Goodchild, M. F., Maguire, D. J., and Rhind, D. W. (2005). Creating and Maintaining Geographic Databases. Second Edition. John Wiley and Sons.

Okyere, M. K. (2000). Virtual City: A Heterogenous System Model of an Intelligent Road Navigation System Incorporating Data Mining Concepts. Indiana, USA.

Pearl, J. (1984). Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley.

Ravada, S., Sharma, J., and Herring, J. (2002). Oracle Spatial: An Oracle Technical White Paper. California: Oracle Corporation.

Larson, R. R. (1998). Geographic Information Retrieval and Spatial Browsing. Berkeley, California.

Roozbeh, S., Hamid, E., and Mohsen, G. (2003). Evaluation of Route Finding Methods in GIS Applications. Tehran, Iran.

Saltenis, S. (2001). Algorithms and Data Structure, Lecture XIII. Aalborg.

Scott, K., and Bernstein, D. (2000). Finding Alternatives to the Best Path. New Jersey, USA.

Stephens, M. S., Rung, J., and Lopez, X. (2004). Graph Data Representation in Oracle Database 10g: Case Studies in Life Sciences. Bulletin of the IEEE Computer Society Technical Committee in Data Engineering.
Oracle Corporation, USA. Montreal, Canada.

Wong, A. (2000). GIS-Based Freight Density and Capacity Modelling. Alberta.

Wu, Q. (2006). Incremental Routing Algorithms for Dynamic Transportation Networks. M.Sc. Thesis. University of Calgary, Alberta.

Zhou, X., Zhang, Y., Lu, S., and Chen, G. (2001). On Spatial Information Retrieval and Database Generalization.

Appendix A

List of nodes (places)

Table: location

locationID | name | details | lat_x | long_y
A | kemunting | Persimpangan Lampu Isyarat Tanah Putih, Jalan Wong Ah Jang, kawasan Kemunting | 10 | 4.5
B | lampu isyarat jln wong ah jang, sek ren abdullah | Persimpangan Lampu Isyarat Jln Wong Ah Jang, Jalan Penjara, sek.ren abdullah | 16 | 13.5
C | esplanade | Persimpangan Lampu Isyarat Jalan Besar, Jln Penjara, berhampiran hospital, padang MPK, benteng | 14.2 | 4
D | kompleks teruntum | Persimpangan Lampu Isyarat bersebelahan Teruntum, Jln Penjara, Jalan Mahkota | 14 | 8
E | simpang bukit ubi, hotel pacific | Persimpangan Lampu Isyarat Jalan Wong Ah Jang, Jalan Tun Ismail, Bukit Ubi | 17 | 19
F | hotel Shahzan, simpang jln gambut | Persimpangan Lampu Isyarat Jalan Bukit Ubi dan Jalan Gambut | 17 | 13
G | taman kerang, padang mpk1 | Persimpangan Jalan Mahkota, Jalan Bukit Ubi | 17 | 4.5
H | masjid, mahkota square | Persimpangan Jalan Mahkota, Jalan Pasar, berhampiran Masjid | 18 | 5
I | simpang jalan besar, ke tabung haji | Persimpangan Jalan Besar, Jalan Pasar | 19 | 3.5
J | merdeka station, simpang st thomas | Persimpangan Lampu Isyarat Jalan Gambut, Jalan Merdeka, s.k st thomas | 22.5 | 15.5
K | pejabat pos, bank | Persimpangan Jalan Merdeka, Jalan Mahkota, berhampiran Pejabat Pos | 18.5 | 8.5
L | simpang bank RHB, simpang 3 jln teluk sisek, jalan besar, jln merdeka | Persimpangan Jalan Teluk Sisek, Jalan Besar, Jalan Merdeka | 21 | 4.4
M | Megamall, MS Garden | Persimpangan Lampu Isyarat Jalan Tun Ismail, Jalan Beserah, berhampiran Megamall | 24 | 24
N | Shell, simpang jln gambut, jalan beserah | Persimpangan Jalan Gambut, Jalan Beserah, berhampiran Shell | 25 | 18.5
O | Lampu Isyarat ke Tanjung Lumpur, Jalan Beserah, Ke Teluk Cempedak, Jalan Teluk Sisek | Persimpangan Lampu Isyarat Jalan Teluk Sisek, Jalan Beserah, Jalan Dato’ Abu Bakar | 25.5 | 9
P | Shell Jalan Beserah, simpang ke kubang buaya | Persimpangan Lampu Isyarat Jalan Beserah, Jalan Kubang Buaya | 30 | 30
Q | Ikip Link, jln dato bahaman | Bulatan Jalan Kubang Buaya, Jalan Dato’ Bahaman | 54.5 | 21.5
R | Pantai Selamat, jalan teluk sisek | Persimpangan Lampu Isyarat Jalan Teluk Sisek, Jalan Kubang Buaya, Jalan Selamat | 40.5 | 12
S | Lampu Isyarat Semambu, Jabatan Pemetaan | Persimpangan Lampu Isyarat Jalan Beserah, Jalan Tengku Muhammad, Jalan Semambu | 64.5 | 35.5
T | MRSM | Persimpangan Jalan Tok Sira, Jalan Dato’ Bahaman, berhampiran MRSM | 62 | 23
U | simpang perkampungan tok sira | Persimpangan Jalan Tok Sira, Jalan Teluk Chempedak | 44 | 11
V | JPJ | Persimpangan Jalan Dato Bahaman, Jalan Tengku Muhammad, berhampiran JPJ | 66 | 25.5
W | Rumah Persinggahan Diraja, berhadapan Taman Teruntum | Persimpangan Jalan Teluk Chempedak, Jalan Tengku Muhammad | 67 | 12
X | Teluk Cempedak | Teluk Cempedak | 80 | 18

Appendix B

List of edges (source and destination)

Table: graph

source | destination | length
A | A | 0
A | B | 1.66
A | C | 0.42
B | A | 1.66
B | B | 0
B | D | 1.22
B | E | 0.14
C | A | 0.42
C | C | 0
C | D | 0.4
D | B | 1.22
D | C | 0.4
D | D | 0
D | G | 0.46
E | B | 0.14
E | E | 0
E | F | 0.6
E | M | 0.9
F | E | 0.6
F | F | 0
F | G | 0.85
G | F | 0.85
G | G | 0
G | H | 0.11
H | H | 0
H | K | 0.35
I | C | 0.48
I | H | 0.18
I | I | 0
J | J | 0
J | K | 0.81
J | N | 0.7
K | J | 0.81
K | K | 0
K | L | 0.48
L | I | 0.22
L | L | 0
L | O | 1.15
M | E | 0.9
M | M | 0
M | N | 0.21
M | P | 0.81
N | J | 0.7
N | M | 0.21
N | N | 0
N | O | 0.7
O | L | 1.15
O | N | 0.7
O | O | 0
O | R | 1.53
P | M | 0.81
P | P | 0
P | Q | 2.59
P | S | 3.49
Q | P | 2.59
Q | Q | 0
Q | R | 1.69
Q | T | 0.76
R | O | 1.53
R | Q | 1.69
R | R | 0
R | U | 0.36
S | P | 3.49
S | S | 0
S | V | 1.01
T | Q | 0.76
T | T | 0
T | U | 2.16
T | V | 0.47
U | R | 0.36
U | T | 2.16
U | U | 0
U | W | 2.3
V | S | 1.01
V | T | 0.47
V | V | 0
V | W | 1.35
W | U | 2.3
W | V | 1.35
W | W | 0
W | X | 1.43
X | W | 1.43
X | X | 0
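The edge list in the graph table can be explored outside the database as well. The following is a minimal Python sketch, not the SQL implementation used in this study, that enumerates alternative paths by a depth-first traversal in which no node is visited more than once. It assumes that the 9-node experiments used nodes A to I, so only the rows whose source and destination both fall in that range are copied, and the zero-length self-loop rows are left out. The names EDGES, build_adjacency and alternative_paths are hypothetical and chosen only for illustration. The case where source and destination are the same node (such as A→A) is not handled.

```python
from typing import Dict, List, Tuple

# Directed edges copied from the graph table, restricted to nodes A to I
# (assumption: these are the nodes of the 9-node experiments).
# Zero-length self-loops such as A -> A are omitted.
EDGES: List[Tuple[str, str, float]] = [
    ("A", "B", 1.66), ("A", "C", 0.42),
    ("B", "A", 1.66), ("B", "D", 1.22), ("B", "E", 0.14),
    ("C", "A", 0.42), ("C", "D", 0.40),
    ("D", "B", 1.22), ("D", "C", 0.40), ("D", "G", 0.46),
    ("E", "B", 0.14), ("E", "F", 0.60),
    ("F", "E", 0.60), ("F", "G", 0.85),
    ("G", "F", 0.85), ("G", "H", 0.11),
    ("I", "C", 0.48), ("I", "H", 0.18),
]


def build_adjacency(
    edges: List[Tuple[str, str, float]]
) -> Dict[str, List[Tuple[str, float]]]:
    """Group the edge rows by source node."""
    adjacency: Dict[str, List[Tuple[str, float]]] = {}
    for source, destination, length in edges:
        adjacency.setdefault(source, []).append((destination, length))
    return adjacency


def alternative_paths(
    adjacency: Dict[str, List[Tuple[str, float]]],
    source: str,
    destination: str,
) -> List[Tuple[float, List[str]]]:
    """Return every path that visits no node twice, shortest first."""
    results: List[Tuple[float, List[str]]] = []

    def visit(node: str, path: List[str], total: float) -> None:
        if node == destination:
            results.append((round(total, 2), path))
            return
        for neighbour, length in adjacency.get(node, []):
            if neighbour not in path:  # skip nodes already on this path
                visit(neighbour, path + [neighbour], total + length)

    visit(source, [source], 0.0)
    return sorted(results)


if __name__ == "__main__":
    for total, path in alternative_paths(build_adjacency(EDGES), "A", "G"):
        print(" -> ".join(path), total)
```

Under these assumptions the sketch finds four paths for A→G, the same count that Table 4.3 reports for that query, with A, C, D, G (total length 1.28) as the shortest.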