Check out the latest version of Routino: svn co http://routino.org/svn/trunk routino
Annotation of /trunk/doc/html/algorithm.html
Parent Directory
|
Revision Log
Revision 1646 -
(hide annotations)
(download)
(as text)
Fri May 1 17:51:13 2015 UTC (9 years, 10 months ago) by amb
File MIME type: text/html
File size: 20681 byte(s)
Fri May 1 17:51:13 2015 UTC (9 years, 10 months ago) by amb
File MIME type: text/html
File size: 20681 byte(s)
Add "meta" header to HTML to help mobile devices and tidy up some CSS.
1 | amb | 1469 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> |
2 | <html> | ||
3 | amb | 569 | |
4 | amb | 1469 | <head> |
5 | amb | 1257 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
6 | amb | 1646 | <meta name="viewport" content="width=device-width, initial-scale=1"> |
7 | amb | 1257 | |
8 | <title>Routino : Algorithm</title> | ||
9 | |||
10 | amb | 569 | <!-- |
11 | amb | 584 | Routino documentation - algorithm |
12 | amb | 569 | |
13 | Part of the Routino routing software. | ||
14 | |||
15 | amb | 1646 | This file Copyright 2008-2015 Andrew M. Bishop |
16 | amb | 569 | |
17 | This program is free software: you can redistribute it and/or modify | ||
18 | it under the terms of the GNU Affero General Public License as published by | ||
19 | the Free Software Foundation, either version 3 of the License, or | ||
20 | (at your option) any later version. | ||
21 | |||
22 | This program is distributed in the hope that it will be useful, | ||
23 | but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
24 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
25 | GNU Affero General Public License for more details. | ||
26 | |||
27 | You should have received a copy of the GNU Affero General Public License | ||
28 | along with this program. If not, see http://www.gnu.org/licenses/. | ||
29 | --> | ||
30 | |||
31 | amb | 1257 | <link href="style.css" type="text/css" rel="stylesheet"> |
32 | amb | 1469 | </head> |
33 | amb | 569 | |
34 | amb | 1469 | <body> |
35 | amb | 569 | |
36 | <!-- Header Start --> | ||
37 | |||
38 | amb | 1469 | <div class="header"> |
39 | amb | 569 | |
40 | amb | 577 | <h1>Routino : Algorithm</h1> |
41 | amb | 569 | |
42 | </div> | ||
43 | |||
44 | <!-- Header End --> | ||
45 | |||
46 | <!-- Content Start --> | ||
47 | |||
48 | <div class="content"> | ||
49 | |||
50 | amb | 1474 | <h2 id="H_1_1">Algorithms</h2> |
51 | amb | 569 | |
52 | amb | 584 | This page describes the development of the algorithm that is used in Routino for |
53 | finding routes. | ||
54 | amb | 569 | |
55 | amb | 1474 | <h3 id="H_1_1_1">Simplest Algorithm</h3> |
56 | amb | 569 | |
57 | The algorithm to find a route is fundamentally simple: Start at the beginning, | ||
58 | follow all possible routes and keep going until you reach the end. | ||
59 | <p> | ||
60 | While this method does work, it isn't fast. To be able to find a route quickly | ||
61 | needs a different algorithm, one that can find the correct answer without | ||
62 | wasting time on routes that lead nowhere. | ||
63 | |||
64 | amb | 1474 | <h3 id="H_1_1_2">Improved Algorithm</h3> |
65 | amb | 569 | |
66 | The simplest way to do this is to follow all possible segments from the starting | ||
67 | node to the next nearest node (an intermediate node in the complete journey). | ||
68 | For each node that is reached store the shortest route from the starting node | ||
69 | and the length of that route. The list of intermediate nodes needs to be | ||
70 | maintained in order of shortest overall route on the assumption that there is a | ||
71 | straight line route from here to the end node. | ||
72 | <br> | ||
73 | At each point the intermediate node that has the shortest potential overall | ||
74 | journey time is processed before any other node. From the first node in the | ||
75 | list follow all possible segments and place the newly discovered nodes into the | ||
76 | same list ordered in the same way. This will tend to constrain the list of | ||
77 | nodes examined to be the ones that are between the start and end nodes. If at | ||
78 | any point you reach a node that has already been reached by a longer route then | ||
79 | you can discard that route since the newly discovered route is shorter. | ||
80 | Conversely if the previously discovered route is shorter then discard the new | ||
81 | route. | ||
82 | <br> | ||
83 | At some point the end node will be reached and then any routes with potential | ||
84 | lengths longer than this actual route can be immediately discarded. The few | ||
85 | remaining potential routes must be continued until they are found to be shorter | ||
86 | or have no possibility of being shorter. The shortest possible route is then | ||
87 | found. | ||
88 | <p> | ||
89 | At all times when looking at a node only those segments that are possible by the | ||
90 | chosen means of transport are followed. This allows the type of transport to be | ||
91 | handled easily. When finding the quickest route the same rules apply except | ||
92 | amb | 621 | that the criterion for sorting is the shortest potential route (assuming that |
93 | from each node to the end is the fastest possible type of highway). | ||
94 | amb | 569 | <p> |
95 | This method also works, but again it isn't very fast. The problem is that the | ||
96 | complexity is proportional to the number of nodes or segments in all routes | ||
97 | examined between the start and end nodes. Maintaining the list of intermediate | ||
98 | nodes in order is the most complex part. | ||
99 | |||
100 | amb | 1474 | <h3 id="H_1_1_3">Final Algorithm</h3> |
101 | amb | 569 | |
102 | The final algorithm that is implemented in the router is basically the one above | ||
103 | but with an important difference. Instead of finding a long route among a data | ||
104 | amb | 584 | set of 8,000,000 nodes (number of highway nodes in UK at beginning of 2010) it |
105 | finds one long route in a data set of 1,000,000 nodes and a few hundred very | ||
106 | short routes in the full data set. Since the time taken to find a route is | ||
107 | amb | 659 | proportional to the number of nodes that need to be considered the main route |
108 | takes 1/10th of the time and the very short routes take almost no time at all. | ||
109 | amb | 569 | <p> |
110 | The solution to making the algorithm fast is therefore to discard most of the | ||
111 | nodes and only keep the interesting ones. In this case a node is deemed to be | ||
112 | amb | 659 | interesting if it is the junction of three or more segments or the junction of |
113 | two segments with different properties or has a routing restriction different | ||
114 | from the connecting segments. In the algorithm and following description these | ||
115 | are classed as <em>super-nodes</em>. Starting at each super-node a | ||
116 | <em>super-segment</em> is generated that finishes on another super-node and | ||
117 | contains the <em>shortest</em> path along segments with identical properties | ||
118 | (and these properties are inherited by the super-segment). The point of | ||
119 | choosing the shortest route is that since all segments considered have identical | ||
120 | properties they will be treated identically when properties are taken into | ||
121 | account. This decision making process can be repeated until the only the most | ||
122 | important and interesting nodes remain. | ||
123 | amb | 1469 | <p class="center"> |
124 | amb | 659 | <img alt="Original data" src="example0.png"> |
125 | <br> | ||
126 | Original Highways | ||
127 | amb | 1469 | <p class="center"> |
128 | amb | 659 | <img alt="Iteration 1" src="example1.png"> |
129 | <br> | ||
130 | First Iteration | ||
131 | amb | 1469 | <p class="center"> |
132 | amb | 659 | <img alt="Iteration 2" src="example2.png"> |
133 | <br> | ||
134 | Second Iteration | ||
135 | amb | 1469 | <p class="center"> |
136 | amb | 659 | <img alt="Iteration 3" src="example3.png"> |
137 | <br> | ||
138 | Third Iteration | ||
139 | amb | 1469 | <p class="center"> |
140 | amb | 659 | <img alt="Iteration 4" src="example4.png"> |
141 | <br> | ||
142 | Fourth Iteration | ||
143 | amb | 569 | <p> |
144 | amb | 584 | To find a route between a start and finish point now comprises the following |
145 | steps (assuming a shortest route is required): | ||
146 | <ol> | ||
147 | <li>Find all shortest routes from the start point along normal segments and | ||
148 | stopping when super-nodes are reached. | ||
149 | <li>Find all shortest routes from the end point backwards along normal | ||
150 | segments and stopping when super-nodes are reached. | ||
151 | <li>Find the shortest route along super-segments from the set of super-nodes | ||
152 | in step 1 to the set of super-nodes in step 2 (taking into account the lengths | ||
153 | found in steps 1 and 2 between the start/finish super-nodes and the ultimate | ||
154 | start/finish point). | ||
155 | <li>For each super-segment in step 3 find the shortest route between the two | ||
156 | end-point super-nodes. | ||
157 | </ol> | ||
158 | This multi-step process is considerably quicker than using all nodes but gives a | ||
159 | result that still contains the full list of nodes that are visited. There are | ||
160 | some special cases though, for example very short routes that do not pass | ||
161 | through any super-nodes, or routes that start or finish on a super-node. In | ||
162 | these cases one or more of the steps listed can be removed or simplified. | ||
163 | amb | 621 | <p> |
164 | When the first route reaches the final node the length of that route is retained | ||
165 | as a benchmark. Any shorter complete route that is calculated later would | ||
166 | replace this benchmark. As routes are tested any partial routes that are longer | ||
167 | than the benchmark can be immediately discarded. Other partial routes have the | ||
168 | length of a perfect straight highway to the final node added to them and if the | ||
169 | total exceeds the benchmark they can also be discarded. Very quickly the number | ||
170 | of possible routes is reduced until the absolute shortest is found. | ||
171 | <p> | ||
172 | For routes that do not start or finish on a node in the original data set a fake | ||
173 | node is added to an existing segment. This requires special handling in the | ||
174 | algorithm but it gives mode flexibility for the start, finish and intermediate | ||
175 | points in a route. | ||
176 | amb | 569 | |
177 | amb | 1474 | <h4 id="H_1_1_3_1">Algorithm Evolution</h4> |
178 | amb | 659 | |
179 | In Routino versions 1.0 to 1.4 the algorithm used to select a super-node was the | ||
180 | same as above except that node properties were not included. Routino versions | ||
181 | 1.4.1 to 1.5.1 used a slightly different algorithm which only chose nodes that | ||
182 | were junctions between segments with different properties (or has a routing | ||
183 | restriction that is different from connecting segments in versions 1.5 and | ||
184 | 1.5.1). The addition of turn restrictions (described in more detail below) | ||
185 | requires the original algorithm since the super-segments more accurately reflect | ||
186 | the underlying topology. | ||
187 | |||
188 | amb | 1474 | <h4 id="H_1_1_3_2">Algorithm Implementation</h4> |
189 | amb | 1461 | |
190 | The algorithm that is used for finding the route between the super-nodes using | ||
191 | super-segments is the A* algorithm (or a slight variation of it). This was not | ||
192 | a deliberate design decision, but evolved into it during development. This | ||
193 | algorithm relies on calculating the lowest score (shortest distance or quickest | ||
194 | time) to each node from the starting node. The remaining score for the path to | ||
195 | the destination node is estimated (based on a straight line using the fastest | ||
196 | type of highway) and added to the current score and the result recorded. At | ||
197 | each step the unvisited node that has the lowest current score is examined and | ||
198 | all nodes connected to it have their scores calculated. When the destination | ||
199 | node has been reached all remaining unvisited nodes with scores higher than the | ||
200 | destination node's score can be discarded and the few remaining nodes examined. | ||
201 | <p> | ||
202 | The algorithm used to find the route between super-nodes using normal segments | ||
203 | is Dijkstra's algorithm (although it is implemented as the same algorithm as | ||
204 | above but with no estimated cost). Since these routes tend to be short and the | ||
205 | CPU time for calculating the heuristic cost function is relatively large this | ||
206 | tends to give a quicker solution. | ||
207 | |||
208 | |||
209 | amb | 1474 | <h3 id="H_1_1_4">Routing Preferences</h3> |
210 | amb | 569 | |
211 | amb | 584 | One of the important features of Routino is the ability to select a route that |
212 | is optimum for a set of criteria such as preferences for each type of highway, | ||
213 | speed limits and other restrictions and highway properties. | ||
214 | <p> | ||
215 | All of these features are handled by assigning a score to each segment while | ||
216 | calculating the route and trying to minimise the score rather than simply | ||
217 | minimising the length. | ||
218 | <dl> | ||
219 | <dt>Segment length | ||
220 | <dd>When calculating the shortest route the length of the segment is the | ||
221 | starting point for the score. | ||
222 | <dt>Speed preference | ||
223 | <dd>When calculating the quickest route the time taken calculated from the | ||
224 | length of the segment and the lower of the highway's own speed limit and the | ||
225 | user's speed preference for the type of highway is the starting point for the | ||
226 | score. | ||
227 | amb | 737 | <dt>One-way restriction |
228 | <dd>If a highway has the one-way property in the opposite direction to the | ||
229 | desired travel and the user's preference is to obey one-way restrictions then | ||
230 | amb | 584 | the segment is ignored. |
231 | <dt>Weight, height, width & length limits | ||
232 | <dd>If a highway has one of these limits and its value is less than the user's | ||
233 | specified requirement then the segment is ignored. | ||
234 | <dt>Highway preference | ||
235 | <dd>The highway preference specified by the user is a percentage, these are | ||
236 | scaled so that the most preferred highway type has a weighted preference of | ||
237 | 1.0 (0% always has a weighted preference of 0.0). The calculated score for a | ||
238 | segment is divided by this weighted preference. | ||
239 | <dt>Highway properties | ||
240 | <dd>The other highway properties are specified by the user as a percentage and | ||
241 | each highway either has that property or not. The user's property preference | ||
242 | amb | 585 | is scaled into the range 0.0 (for 0%) to 1.0 (for 100%) to give a weighted |
243 | amb | 737 | preference, a second "non-property" weighted preference is calculated in the |
244 | amb | 584 | same way after subtracting the user's preference from 100%. If a segment has |
245 | amb | 585 | a particular property then the calculated score is divided by the weighted |
246 | amb | 671 | preference for that property, if not then it is divided by the non-property |
247 | weighted preference. A non-linear transformation is applied so that changing | ||
248 | property preferences close to 50% do not cause large variations in routes. | ||
249 | amb | 584 | </dl> |
250 | |||
251 | amb | 1474 | <h3 id="H_1_1_5">Data Pruning</h3> |
252 | amb | 584 | |
253 | amb | 969 | From version 2.2 there are options to "prune" nodes and segments from the input |
254 | data which means to remove nodes and/or segments without significantly changing | ||
255 | the routing results. | ||
256 | amb | 621 | <p> |
257 | amb | 969 | The pruning options must meet a number of conditions to be useful: |
258 | <ul> | ||
259 | <li>The topology relevant to routing must remain unchanged. The instructions | ||
260 | that are produced from the reduced set of nodes and segments must be | ||
261 | sufficiently accurate for anybody trying to follow them on the ground. | ||
262 | <li>Any restrictions belonging to nodes or segments that stop certain types of | ||
263 | traffic from following a particular highway must be preserved. | ||
264 | <li>The total length must be calculated using the original data and not the | ||
265 | simplified data which by its nature will typically be shorter. | ||
266 | <li>The location of the remaining nodes and segments must be a good | ||
267 | representation of the original nodes and segments. Since the calculated | ||
268 | route may be displayed on a map the remaining nodes and segments must | ||
269 | clearly indicate the route to take. | ||
270 | </ul> | ||
271 | <p> | ||
272 | The prune options all have user-controllable parameters which allow the | ||
273 | amb | 1117 | geographical accuracy to be controlled. This means that although the topology |
274 | is the same the geographical accuracy can be sacrificed slightly to minimise the | ||
275 | number of nodes and segments. | ||
276 | <p> | ||
277 | The pruning options that are available are: | ||
278 | <ul> | ||
279 | <li>Removing the access permissions for a transport type from segments if it | ||
280 | is not possible to route that transport type from those segments to a | ||
281 | significant number of other places. The limit on the pruning is set by the | ||
282 | total length of the isolated group of segments. This significantly increases | ||
283 | the chance that a route will be found by not putting waypoints in inaccessible | ||
284 | places. | ||
285 | <li>Removing short segments, the limit is set by the length of the segment. | ||
286 | This removes a number of redundant segments (and associated nodes) but rules | ||
287 | are applied to ensure that removing the segments does not alter junction | ||
288 | topology or remove node access permissions or changes in way properties. | ||
289 | <li>Removing nodes from almost straight highways, the limit is set by the | ||
290 | distance between the remaining segments and the original nodes. This removes | ||
291 | a large number of redundant nodes (and therefore segments) but again care is | ||
292 | taken not to remove node access permissions or changes in way properties. | ||
293 | </ul> | ||
294 | amb | 969 | |
295 | amb | 1474 | <h3 id="H_1_1_6">Turn Restrictions</h3> |
296 | amb | 969 | |
297 | The addition of turn restrictions in version 2.0 adds a set of further | ||
298 | complications because it introduces a set of constraints that are far more | ||
299 | complex than one-way streets. | ||
300 | <p> | ||
301 | amb | 621 | A turn restriction in the simplest case is a combination of a segment, node and |
302 | segment such that routes are not allowed to go from the first segment to the | ||
303 | second one through the specified node. Exceptions for certain types of traffic | ||
304 | can also be specified. Currently only this simplest type of turn restriction is | ||
305 | handled by the algorithm. | ||
306 | <p> | ||
307 | amb | 659 | The first complication of turn restrictions is that the algorithm above requires |
308 | that super-segments are composed of segments with identical properties. A turn | ||
309 | restriction is not the same in both directions so a super-segment cannot include | ||
310 | any route through that turn restriction. The node at the centre of the turn | ||
311 | restriction must therefore be a super-node to avoid this. In addition to this | ||
312 | all nodes connected to the turn restriction node by a single segment must also | ||
313 | be super-nodes to avoid any long-distance super-segments starting at the | ||
314 | restricted node. | ||
315 | <p> | ||
316 | The second complication of a turn restriction is that the optimum route may | ||
317 | amb | 621 | require passing through the same node more than once. This can happen where the |
318 | route needs to work around a turn restriction by driving past it, turning round | ||
319 | (on a roundabout perhaps) and coming back along the same highway. Without turn | ||
320 | restrictions a route could be defined purely by the set of nodes that were | ||
321 | amb | 659 | passed; no node would exist more than once along a route between two points. |
322 | amb | 621 | With turn restrictions the route is defined by a node and the segment used to |
323 | get there; no route between two points will ever need to follow the same segment | ||
324 | amb | 1461 | in the same direction more than once. This means that the optimisation |
325 | algorithm calculates scores for directed segments (indexed by segment and end | ||
326 | node) rather than for nodes. | ||
327 | amb | 621 | <p> |
328 | amb | 659 | A side-effect of this is that a route that works around a turn restriction must |
329 | be calculable using the super-segments that are stored in the database. This | ||
330 | puts a limit on the amount of database optimisation that can be performed | ||
331 | because if too many super-segments are removed the optimum work-around may also | ||
332 | be removed. The solution to this is to ensure that the database preserves all | ||
333 | loops that can be used to turn around and reverse direction, previously | ||
334 | super-segments that started and finished on the same super-node were disallowed. | ||
335 | <p> | ||
336 | Another side-effect of having the route composed of a set of locations (nodes) | ||
337 | as well as the direction of travel (segments used to reach them) is that via | ||
338 | points in the route can be forced to continue in the original direction. If the | ||
339 | chosen method of transport obeys turn restrictions then it will not reverse | ||
340 | direction at a via point but will find an optimum route continuing in the same | ||
341 | amb | 736 | direction. The only exception to this is when the route ahead at a waypoint is |
342 | into a dead-end and an immediate U-turn is allowed. | ||
343 | amb | 659 | <p> |
344 | A side-effect of having the starting direction at a via point defined by the | ||
345 | previous part of the route is that overall non-optimal routes may be found even | ||
346 | though each section between via points is optimal. For a route with a start, | ||
347 | middle and end point defined it can be the case that the shortest route from the | ||
348 | start to the middle arrives in the opposite direction to that required for the | ||
349 | optimal route from the middle to the end. The calculation of the route in | ||
350 | separate sections therefore may give a non-optimum result even though each | ||
351 | section is itself optimum based on the start conditions. | ||
352 | <p> | ||
353 | Overall the presence of turn restrictions in the database makes the routing | ||
354 | slower even for regions of the map that have no turn restrictions. | ||
355 | amb | 621 | |
356 | amb | 1474 | <h3 id="H_1_1_7">Data Implementation</h3> |
357 | amb | 621 | |
358 | amb | 569 | The hardest part of implementing this router is the data organisation. The |
359 | arrangement of the data to minimise the number of operations required to follow | ||
360 | a route from one node to another is much harder than designing the algorithm | ||
361 | itself. | ||
362 | <p> | ||
363 | The final implementation uses a separate table for nodes, segments and ways. | ||
364 | Each table individually is implemented as a C-language data structure that is | ||
365 | written to disk by a program which parses the OpenStreetMap XML data file. In | ||
366 | the router these data structures are memory mapped so that the operating system | ||
367 | handles the problems of loading the needed data blocks from disk. | ||
368 | <p> | ||
369 | Each node contains a latitude and longitude and they are sorted geographically | ||
370 | so that converting a latitude and longitude coordinate to a node is fast as well | ||
371 | as looking up the coordinate of a node. The node also contains the location in | ||
372 | the array of segments for the first segment that uses that node. | ||
373 | <br> | ||
374 | Each segment contains the location of the two nodes as well as the way that the | ||
375 | segment came from. The location of the next segment that uses one of the two | ||
376 | nodes is also stored; the next segment for the other node is the following one | ||
377 | in the array. The length of the segment is also pre-computed and stored. | ||
378 | <br> | ||
379 | Each way has a name, a highway type, a list of allowed types of traffic, a speed | ||
380 | amb | 584 | limit, any weight, height, width or length restrictions and the highway |
381 | properties. | ||
382 | amb | 569 | <p> |
383 | The super-nodes are mixed in with the nodes and the super-segments are mixed in | ||
384 | with the segments. For the nodes they are the same as the normal nodes, so just | ||
385 | a flag is needed to indicate that they are super. The super-segments are in | ||
386 | addition to the normal segments so they increase the database size (by about | ||
387 | amb | 969 | 10%) and are also marked with a flag. Some segments are therefore flagged as |
388 | both normal segments and super-segments if they both have the same end nodes. | ||
389 | amb | 621 | <p> |
390 | The relations are stored separately from the nodes, segments and ways. For the | ||
391 | turn restriction relations the initial and final segments are stored along with | ||
392 | the restricted node itself. Each node that has a turn restriction is marked in | ||
393 | the main node storage with a flag to indicate this information. | ||
394 | amb | 569 | |
395 | </div> | ||
396 | |||
397 | <!-- Content End --> | ||
398 | |||
399 | <!-- Footer Start --> | ||
400 | |||
401 | amb | 1469 | <div class="footer"> |
402 | amb | 569 | |
403 | <address> | ||
404 | amb | 1267 | © Andrew M. Bishop - <a href="http://www.routino.org/">http://www.routino.org/</a> |
405 | amb | 569 | </address> |
406 | |||
407 | </div> | ||
408 | |||
409 | <!-- Footer End --> | ||
410 | |||
411 | amb | 1469 | </body> |
412 | amb | 569 | |
413 | amb | 1469 | </html> |