Check out the latest version of Routino: svn co http://routino.org/svn/trunk routino
Contents of /trunk/doc/html/algorithm.html
Parent Directory
|
Revision Log
Revision 736 -
(show annotations)
(download)
(as text)
Mon May 30 12:36:12 2011 UTC (13 years, 9 months ago) by amb
File MIME type: text/html
File size: 17576 byte(s)
Mon May 30 12:36:12 2011 UTC (13 years, 9 months ago) by amb
File MIME type: text/html
File size: 17576 byte(s)
Describe new philosophy of alloing U-turn at waypoints to avoid dead-ends.
1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
2 | <HTML> |
3 | |
4 | <!-- |
5 | Routino documentation - algorithm |
6 | |
7 | Part of the Routino routing software. |
8 | |
9 | This file Copyright 2008-2011 Andrew M. Bishop |
10 | |
11 | This program is free software: you can redistribute it and/or modify |
12 | it under the terms of the GNU Affero General Public License as published by |
13 | the Free Software Foundation, either version 3 of the License, or |
14 | (at your option) any later version. |
15 | |
16 | This program is distributed in the hope that it will be useful, |
17 | but WITHOUT ANY WARRANTY; without even the implied warranty of |
18 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
19 | GNU Affero General Public License for more details. |
20 | |
21 | You should have received a copy of the GNU Affero General Public License |
22 | along with this program. If not, see http://www.gnu.org/licenses/. |
23 | --> |
24 | |
25 | <HEAD> |
26 | <TITLE>Routino : Algorithm</TITLE> |
27 | <META http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
28 | <LINK href="style.css" type="text/css" rel="stylesheet"> |
29 | </HEAD> |
30 | |
31 | <BODY> |
32 | |
33 | <!-- Header Start --> |
34 | |
35 | <div class="header" align="center"> |
36 | |
37 | <h1>Routino : Algorithm</h1> |
38 | |
39 | <hr> |
40 | </div> |
41 | |
42 | <!-- Header End --> |
43 | |
44 | <!-- Content Start --> |
45 | |
46 | <div class="content"> |
47 | |
48 | <h2><a name="H_1_1"></a>Algorithms</h2> |
49 | |
50 | This page describes the development of the algorithm that is used in Routino for |
51 | finding routes. |
52 | |
53 | <h3><a name="H_1_1_1"></a>Simplest Algorithm</h3> |
54 | |
55 | The algorithm to find a route is fundamentally simple: Start at the beginning, |
56 | follow all possible routes and keep going until you reach the end. |
57 | <p> |
58 | While this method does work, it isn't fast. To be able to find a route quickly |
59 | needs a different algorithm, one that can find the correct answer without |
60 | wasting time on routes that lead nowhere. |
61 | |
62 | <h3><a name="H_1_1_2"></a>Improved Algorithm</h3> |
63 | |
64 | The simplest way to do this is to follow all possible segments from the starting |
65 | node to the next nearest node (an intermediate node in the complete journey). |
66 | For each node that is reached store the shortest route from the starting node |
67 | and the length of that route. The list of intermediate nodes needs to be |
68 | maintained in order of shortest overall route on the assumption that there is a |
69 | straight line route from here to the end node. |
70 | <br> |
71 | At each point the intermediate node that has the shortest potential overall |
72 | journey time is processed before any other node. From the first node in the |
73 | list follow all possible segments and place the newly discovered nodes into the |
74 | same list ordered in the same way. This will tend to constrain the list of |
75 | nodes examined to be the ones that are between the start and end nodes. If at |
76 | any point you reach a node that has already been reached by a longer route then |
77 | you can discard that route since the newly discovered route is shorter. |
78 | Conversely if the previously discovered route is shorter then discard the new |
79 | route. |
80 | <br> |
81 | At some point the end node will be reached and then any routes with potential |
82 | lengths longer than this actual route can be immediately discarded. The few |
83 | remaining potential routes must be continued until they are found to be shorter |
84 | or have no possibility of being shorter. The shortest possible route is then |
85 | found. |
86 | <p> |
87 | At all times when looking at a node only those segments that are possible by the |
88 | chosen means of transport are followed. This allows the type of transport to be |
89 | handled easily. When finding the quickest route the same rules apply except |
90 | that the criterion for sorting is the shortest potential route (assuming that |
91 | from each node to the end is the fastest possible type of highway). |
92 | <p> |
93 | This method also works, but again it isn't very fast. The problem is that the |
94 | complexity is proportional to the number of nodes or segments in all routes |
95 | examined between the start and end nodes. Maintaining the list of intermediate |
96 | nodes in order is the most complex part. |
97 | |
98 | <h3><a name="H_1_1_3"></a>Final Algorithm</h3> |
99 | |
100 | The final algorithm that is implemented in the router is basically the one above |
101 | but with an important difference. Instead of finding a long route among a data |
102 | set of 8,000,000 nodes (number of highway nodes in UK at beginning of 2010) it |
103 | finds one long route in a data set of 1,000,000 nodes and a few hundred very |
104 | short routes in the full data set. Since the time taken to find a route is |
105 | proportional to the number of nodes that need to be considered the main route |
106 | takes 1/10th of the time and the very short routes take almost no time at all. |
107 | <p> |
108 | The solution to making the algorithm fast is therefore to discard most of the |
109 | nodes and only keep the interesting ones. In this case a node is deemed to be |
110 | interesting if it is the junction of three or more segments or the junction of |
111 | two segments with different properties or has a routing restriction different |
112 | from the connecting segments. In the algorithm and following description these |
113 | are classed as <em>super-nodes</em>. Starting at each super-node a |
114 | <em>super-segment</em> is generated that finishes on another super-node and |
115 | contains the <em>shortest</em> path along segments with identical properties |
116 | (and these properties are inherited by the super-segment). The point of |
117 | choosing the shortest route is that since all segments considered have identical |
118 | properties they will be treated identically when properties are taken into |
119 | account. This decision making process can be repeated until the only the most |
120 | important and interesting nodes remain. |
121 | <p align="center"> |
122 | <img alt="Original data" src="example0.png"> |
123 | <br> |
124 | Original Highways |
125 | <p align="center"> |
126 | <img alt="Iteration 1" src="example1.png"> |
127 | <br> |
128 | First Iteration |
129 | <p align="center"> |
130 | <img alt="Iteration 2" src="example2.png"> |
131 | <br> |
132 | Second Iteration |
133 | <p align="center"> |
134 | <img alt="Iteration 3" src="example3.png"> |
135 | <br> |
136 | Third Iteration |
137 | <p align="center"> |
138 | <img alt="Iteration 4" src="example4.png"> |
139 | <br> |
140 | Fourth Iteration |
141 | <p> |
142 | To find a route between a start and finish point now comprises the following |
143 | steps (assuming a shortest route is required): |
144 | <ol> |
145 | <li>Find all shortest routes from the start point along normal segments and |
146 | stopping when super-nodes are reached. |
147 | <li>Find all shortest routes from the end point backwards along normal |
148 | segments and stopping when super-nodes are reached. |
149 | <li>Find the shortest route along super-segments from the set of super-nodes |
150 | in step 1 to the set of super-nodes in step 2 (taking into account the lengths |
151 | found in steps 1 and 2 between the start/finish super-nodes and the ultimate |
152 | start/finish point). |
153 | <li>For each super-segment in step 3 find the shortest route between the two |
154 | end-point super-nodes. |
155 | </ol> |
156 | This multi-step process is considerably quicker than using all nodes but gives a |
157 | result that still contains the full list of nodes that are visited. There are |
158 | some special cases though, for example very short routes that do not pass |
159 | through any super-nodes, or routes that start or finish on a super-node. In |
160 | these cases one or more of the steps listed can be removed or simplified. |
161 | <p> |
162 | When the first route reaches the final node the length of that route is retained |
163 | as a benchmark. Any shorter complete route that is calculated later would |
164 | replace this benchmark. As routes are tested any partial routes that are longer |
165 | than the benchmark can be immediately discarded. Other partial routes have the |
166 | length of a perfect straight highway to the final node added to them and if the |
167 | total exceeds the benchmark they can also be discarded. Very quickly the number |
168 | of possible routes is reduced until the absolute shortest is found. |
169 | <p> |
170 | For routes that do not start or finish on a node in the original data set a fake |
171 | node is added to an existing segment. This requires special handling in the |
172 | algorithm but it gives mode flexibility for the start, finish and intermediate |
173 | points in a route. |
174 | |
175 | <h4><a name="H_1_1_3_1"></a>Algorithm Evolution</h4> |
176 | |
177 | In Routino versions 1.0 to 1.4 the algorithm used to select a super-node was the |
178 | same as above except that node properties were not included. Routino versions |
179 | 1.4.1 to 1.5.1 used a slightly different algorithm which only chose nodes that |
180 | were junctions between segments with different properties (or has a routing |
181 | restriction that is different from connecting segments in versions 1.5 and |
182 | 1.5.1). The addition of turn restrictions (described in more detail below) |
183 | requires the original algorithm since the super-segments more accurately reflect |
184 | the underlying topology. |
185 | |
186 | <h3><a name="H_1_1_4"></a>Routing Preferences</h3> |
187 | |
188 | One of the important features of Routino is the ability to select a route that |
189 | is optimum for a set of criteria such as preferences for each type of highway, |
190 | speed limits and other restrictions and highway properties. |
191 | <p> |
192 | All of these features are handled by assigning a score to each segment while |
193 | calculating the route and trying to minimise the score rather than simply |
194 | minimising the length. |
195 | <dl> |
196 | <dt>Segment length |
197 | <dd>When calculating the shortest route the length of the segment is the |
198 | starting point for the score. |
199 | <dt>Speed preference |
200 | <dd>When calculating the quickest route the time taken calculated from the |
201 | length of the segment and the lower of the highway's own speed limit and the |
202 | user's speed preference for the type of highway is the starting point for the |
203 | score. |
204 | <dt>Oneway restriction |
205 | <dd>If a highway has the oneway property in the opposite direction to the |
206 | desired travel and the user's preference is to obey oneway restrictions then |
207 | the segment is ignored. |
208 | <dt>Weight, height, width & length limits |
209 | <dd>If a highway has one of these limits and its value is less than the user's |
210 | specified requirement then the segment is ignored. |
211 | <dt>Highway preference |
212 | <dd>The highway preference specified by the user is a percentage, these are |
213 | scaled so that the most preferred highway type has a weighted preference of |
214 | 1.0 (0% always has a weighted preference of 0.0). The calculated score for a |
215 | segment is divided by this weighted preference. |
216 | <dt>Highway properties |
217 | <dd>The other highway properties are specified by the user as a percentage and |
218 | each highway either has that property or not. The user's property preference |
219 | is scaled into the range 0.0 (for 0%) to 1.0 (for 100%) to give a weighted |
220 | preference, a second "non-property" weighted preference is calcuated in the |
221 | same way after subtracting the user's preference from 100%. If a segment has |
222 | a particular property then the calculated score is divided by the weighted |
223 | preference for that property, if not then it is divided by the non-property |
224 | weighted preference. A non-linear transformation is applied so that changing |
225 | property preferences close to 50% do not cause large variations in routes. |
226 | </dl> |
227 | |
228 | <h3><a name="H_1_1_5"></a>Turn Restrictions</h3> |
229 | |
230 | The addition of turn restrictions adds a set of further complications because it |
231 | introduces a set of constraints that are far more complex than one-way streets. |
232 | <p> |
233 | A turn restriction in the simplest case is a combination of a segment, node and |
234 | segment such that routes are not allowed to go from the first segment to the |
235 | second one through the specified node. Exceptions for certain types of traffic |
236 | can also be specified. Currently only this simplest type of turn restriction is |
237 | handled by the algorithm. |
238 | <p> |
239 | The first complication of turn restrictions is that the algorithm above requires |
240 | that super-segments are composed of segments with identical properties. A turn |
241 | restriction is not the same in both directions so a super-segment cannot include |
242 | any route through that turn restriction. The node at the centre of the turn |
243 | restriction must therefore be a super-node to avoid this. In addition to this |
244 | all nodes connected to the turn restriction node by a single segment must also |
245 | be super-nodes to avoid any long-distance super-segments starting at the |
246 | restricted node. |
247 | <p> |
248 | The second complication of a turn restriction is that the optimum route may |
249 | require passing through the same node more than once. This can happen where the |
250 | route needs to work around a turn restriction by driving past it, turning round |
251 | (on a roundabout perhaps) and coming back along the same highway. Without turn |
252 | restrictions a route could be defined purely by the set of nodes that were |
253 | passed; no node would exist more than once along a route between two points. |
254 | With turn restrictions the route is defined by a node and the segment used to |
255 | get there; no route between two points will ever need to follow the same segment |
256 | in the same direction more than once. |
257 | <p> |
258 | A side-effect of this is that a route that works around a turn restriction must |
259 | be calculable using the super-segments that are stored in the database. This |
260 | puts a limit on the amount of database optimisation that can be performed |
261 | because if too many super-segments are removed the optimum work-around may also |
262 | be removed. The solution to this is to ensure that the database preserves all |
263 | loops that can be used to turn around and reverse direction, previously |
264 | super-segments that started and finished on the same super-node were disallowed. |
265 | <p> |
266 | Another side-effect of having the route composed of a set of locations (nodes) |
267 | as well as the direction of travel (segments used to reach them) is that via |
268 | points in the route can be forced to continue in the original direction. If the |
269 | chosen method of transport obeys turn restrictions then it will not reverse |
270 | direction at a via point but will find an optimum route continuing in the same |
271 | direction. The only exception to this is when the route ahead at a waypoint is |
272 | into a dead-end and an immediate U-turn is allowed. |
273 | <p> |
274 | A side-effect of having the starting direction at a via point defined by the |
275 | previous part of the route is that overall non-optimal routes may be found even |
276 | though each section between via points is optimal. For a route with a start, |
277 | middle and end point defined it can be the case that the shortest route from the |
278 | start to the middle arrives in the opposite direction to that required for the |
279 | optimal route from the middle to the end. The calculation of the route in |
280 | separate sections therefore may give a non-optimum result even though each |
281 | section is itself optimum based on the start conditions. |
282 | <p> |
283 | Overall the presence of turn restrictions in the database makes the routing |
284 | slower even for regions of the map that have no turn restrictions. |
285 | |
286 | <h3><a name="H_1_1_4"></a>Data Implementation</h3> |
287 | |
288 | The hardest part of implementing this router is the data organisation. The |
289 | arrangement of the data to minimise the number of operations required to follow |
290 | a route from one node to another is much harder than designing the algorithm |
291 | itself. |
292 | <p> |
293 | The final implementation uses a separate table for nodes, segments and ways. |
294 | Each table individually is implemented as a C-language data structure that is |
295 | written to disk by a program which parses the OpenStreetMap XML data file. In |
296 | the router these data structures are memory mapped so that the operating system |
297 | handles the problems of loading the needed data blocks from disk. |
298 | <p> |
299 | Each node contains a latitude and longitude and they are sorted geographically |
300 | so that converting a latitude and longitude coordinate to a node is fast as well |
301 | as looking up the coordinate of a node. The node also contains the location in |
302 | the array of segments for the first segment that uses that node. |
303 | <br> |
304 | Each segment contains the location of the two nodes as well as the way that the |
305 | segment came from. The location of the next segment that uses one of the two |
306 | nodes is also stored; the next segment for the other node is the following one |
307 | in the array. The length of the segment is also pre-computed and stored. |
308 | <br> |
309 | Each way has a name, a highway type, a list of allowed types of traffic, a speed |
310 | limit, any weight, height, width or length restrictions and the highway |
311 | properties. |
312 | <p> |
313 | The super-nodes are mixed in with the nodes and the super-segments are mixed in |
314 | with the segments. For the nodes they are the same as the normal nodes, so just |
315 | a flag is needed to indicate that they are super. The super-segments are in |
316 | addition to the normal segments so they increase the database size (by about |
317 | 10%) and are also marked with a flag. |
318 | <p> |
319 | The relations are stored separately from the nodes, segments and ways. For the |
320 | turn restriction relations the initial and final segments are stored along with |
321 | the restricted node itself. Each node that has a turn restriction is marked in |
322 | the main node storage with a flag to indicate this information. |
323 | |
324 | |
325 | <h3><a name="H_1_1_6"></a>Practicalities</h3> |
326 | |
327 | At the time of writing (April 2010) the OpenStreetMap data for Great Britain |
328 | (taken from |
329 | <a class="ext" href="http://download.geofabrik.de/osm/europe/" title="GeoFabrik Mirror of OpenStreetMap Data">GeoFabrik</a> |
330 | ) contains: |
331 | <ul> |
332 | <li>14,675,098 nodes |
333 | <ul> |
334 | <li>8,767,521 are highway nodes |
335 | <li>1,120,297 are super-nodes |
336 | </ul> |
337 | <li>1,876,822 ways |
338 | <ul> |
339 | <li>1,412,898 are highways |
340 | <ul> |
341 | <li>9,316,328 highway segments |
342 | <li>1,641,009 are super-segments |
343 | </ul> |
344 | </ul> |
345 | <li>60,572 relations |
346 | </ul> |
347 | |
348 | The database files when generated are 41.5 MB for nodes, 121.6 MB for segments |
349 | and 12.6 MB for ways and are stored uncompressed. By having at least 200 MB or |
350 | RAM available the routing can be performed with no disk accesses (once the data |
351 | has been read once). |
352 | |
353 | |
354 | </div> |
355 | |
356 | <!-- Content End --> |
357 | |
358 | <!-- Footer Start --> |
359 | |
360 | <div class="footer" align="center"> |
361 | <hr> |
362 | |
363 | <address> |
364 | © Andrew M. Bishop = <amb "at" gedanken.demon.co.uk> |
365 | </address> |
366 | |
367 | </div> |
368 | |
369 | <!-- Footer End --> |
370 | |
371 | </BODY> |
372 | |
373 | </HTML> |