Check out the latest version of Routino: svn co http://routino.org/svn/trunk routino
Contents of /trunk/doc/ALGORITHM.txt
Parent Directory
|
Revision Log
Revision 381 -
(show annotations)
(download)
Tue Apr 27 16:26:58 2010 UTC (14 years, 10 months ago) by amb
File MIME type: text/plain
File size: 7032 byte(s)
Tue Apr 27 16:26:58 2010 UTC (14 years, 10 months ago) by amb
File MIME type: text/plain
File size: 7032 byte(s)
Interim checkin of updated documentation.
1 | Routino : Algorithm |
2 | =================== |
3 | |
4 | |
5 | This page describes the development of the algorithm that is used in |
6 | the software demonstrated here. |
7 | |
8 | |
9 | Simplest Algorithm |
10 | ------------------ |
11 | |
12 | The algorithm to find a route is fundamentally simple: Start at the |
13 | beginning, follow all possible routes and keep going until you reach |
14 | the end. |
15 | |
16 | While this method does work, it isn't fast. To be able to find a route |
17 | quickly needs a different algorithm, one that can find the correct |
18 | answer without wasting time on routes that lead nowhere. |
19 | |
20 | |
21 | Improved Algorithm |
22 | ------------------ |
23 | |
24 | The simplest way to do this is to follow all possible segments from the |
25 | starting node to the next nearest node (an intermediate node in the |
26 | complete journey). For each node that is reached store the shortest |
27 | route from the starting node and the length of that route. The list of |
28 | intermediate nodes needs to be maintained in order of shortest overall |
29 | route on the assumption that there is a straight line route from here |
30 | to the end node. |
31 | At each point the intermediate node that has the shortest potential |
32 | overall journey time is processed before any other node. From the first |
33 | node in the list follow all possible segments and place the newly |
34 | discovered nodes into the same list ordered in the same way. This will |
35 | tend to constrain the list of nodes examined to be the ones that are |
36 | between the start and end nodes. If at any point you reach a node that |
37 | has already been reached by a longer route then you can discard that |
38 | route since the newly discovered route is shorter. Conversely if the |
39 | previously discovered route is shorter then discard the new route. |
40 | At some point the end node will be reached and then any routes with |
41 | potential lengths longer than this actual route can be immediately |
42 | discarded. The few remaining potential routes must be continued until |
43 | they are found to be shorter or have no possibility of being shorter. |
44 | The shortest possible route is then found. |
45 | |
46 | At all times when looking at a node only those segments that are |
47 | possible by the chosen means of transport are followed. This allows the |
48 | type of transport to be handled easily. When finding the quickest route |
49 | the same rules apply except that criterion for sorting is the shortest |
50 | potential route (assuming that from each node to the end is the fastest |
51 | possible type of highway). |
52 | |
53 | This method also works, but again it isn't very fast. The problem is |
54 | that the complexity is proportional to the number of nodes or segments |
55 | in all routes examined between the start and end nodes. Maintaining the |
56 | list of intermediate nodes in order is the most complex part. |
57 | |
58 | |
59 | Final Algorithm |
60 | --------------- |
61 | |
62 | The final algorithm that is implemented in the router is basically the |
63 | one above but with an important difference. Instead of finding a long |
64 | route among a data set of 5,000,000 nodes (number of highway nodes in |
65 | UK at beginning of 2009) it finds one long route in a data set of |
66 | 500,000 nodes and a few hundred very short routes in the full data set. |
67 | Since the time taken to find a route is proportional to the number of |
68 | nodes the main route takes 1/10th of the time and the very short routes |
69 | take almost no time at all. |
70 | |
71 | The solution to making the algorithm fast is therefore to discard most |
72 | of the nodes and only keep the interesting ones. In this case a node is |
73 | deemed to be interesting if it is the junction of three or more |
74 | segments or the junction of two segments with different properties. In |
75 | the algorithm these are classed as super-nodes. Between each super-node |
76 | and a neighbouring super-node a super-segment is generated that |
77 | contains the shortest path along segments with identical properties. |
78 | This decision making process can be repeated until the only the most |
79 | important and interesting nodes remain. |
80 | |
81 | To find a route now comprises finding a route along super-segments |
82 | between the start node and the end node followed by finding a route |
83 | between each adjacent pair of super-nodes in that route. (The routes |
84 | between the start and end nodes and all adjacent super-nodes need to be |
85 | found before the route using only super-nodes can be found.) This is |
86 | considerably quicker than using all nodes but gives a result that still |
87 | contains the full list of nodes that are visited. |
88 | |
89 | |
90 | Implementation |
91 | -------------- |
92 | |
93 | The hardest part of implementing this router is the data organisation. |
94 | The arrangement of the data to minimise the number of operations |
95 | required to follow a route from one node to another is much harder than |
96 | designing the algorithm itself. |
97 | |
98 | The final implementation uses a separate table for nodes, segments and |
99 | ways. Each table individually is implemented as a C-language data |
100 | structure that is written to disk by a program which parses the |
101 | OpenStreetMap XML data file. In the router these data structures are |
102 | memory mapped so that the operating system handles the problems of |
103 | loading the needed data blocks from disk. |
104 | |
105 | Each node contains a latitude and longitude and they are sorted |
106 | geographically so that converting a latitude and longitude coordinate |
107 | to a node is fast as well as looking up the coordinate of a node. The |
108 | node also contains the location in the array of segments for the first |
109 | segment that uses that node. |
110 | Each segment contains the location of the two nodes as well as the way |
111 | that the segment came from. The location of the next segment that uses |
112 | one of the two nodes is also stored; the next segment for the other |
113 | node is the following one in the array. The length of the segment is |
114 | also pre-computed and stored. |
115 | Each way has a name, a highway type, a list of allowed types of |
116 | traffic, a speed limit and any weight, height, width or length |
117 | restrictions. |
118 | |
119 | The super-nodes are mixed in with the nodes and the super-segments are |
120 | mixed in with the segments. For the nodes they are the same as the |
121 | normal nodes, so just a flag is needed to indicate that they are super. |
122 | The super-segments are in addition to the normal segments so they |
123 | increase the database size (by about 10%) and are also marked with a |
124 | flag. |
125 | |
126 | |
127 | Practicalities |
128 | -------------- |
129 | |
130 | At the time of writing (April 2010) the OpenStreetMap data for Great |
131 | Britain contains: |
132 | * 14,675,098 nodes |
133 | + 8,767,521 are highway nodes |
134 | + 1,120,297 are super-nodes |
135 | * 1,876,822 ways |
136 | + 1,417,033 are highways |
137 | o 18,741,434 highway segments |
138 | o 1,641,009 are super-segments |
139 | * 60,572 relations |
140 | |
141 | The database files when generated are 41.5 MB for nodes, 121.6 MB for |
142 | segments and 12.6 MB for ways and are stored uncompressed. By having at |
143 | least 200 MB or RAM available the routing can be performed with no disk |
144 | accesses (once the data has been read once). |
145 | |
146 | |
147 | -------- |
148 | |
149 | Copyright 2008-2010 Andrew M. Bishop. |
Properties
Name | Value |
---|---|
cvs:description | Description of the algorithm. |