Check out the latest version of Routino: svn co http://routino.org/svn/trunk routino
Contents of /trunk/doc/html/algorithm.html
Parent Directory
|
Revision Log
Revision 1646 -
(show annotations)
(download)
(as text)
Fri May 1 17:51:13 2015 UTC (9 years, 11 months ago) by amb
File MIME type: text/html
File size: 20681 byte(s)
Fri May 1 17:51:13 2015 UTC (9 years, 11 months ago) by amb
File MIME type: text/html
File size: 20681 byte(s)
Add "meta" header to HTML to help mobile devices and tidy up some CSS.
1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> |
2 | <html> |
3 | |
4 | <head> |
5 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
6 | <meta name="viewport" content="width=device-width, initial-scale=1"> |
7 | |
8 | <title>Routino : Algorithm</title> |
9 | |
10 | <!-- |
11 | Routino documentation - algorithm |
12 | |
13 | Part of the Routino routing software. |
14 | |
15 | This file Copyright 2008-2015 Andrew M. Bishop |
16 | |
17 | This program is free software: you can redistribute it and/or modify |
18 | it under the terms of the GNU Affero General Public License as published by |
19 | the Free Software Foundation, either version 3 of the License, or |
20 | (at your option) any later version. |
21 | |
22 | This program is distributed in the hope that it will be useful, |
23 | but WITHOUT ANY WARRANTY; without even the implied warranty of |
24 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
25 | GNU Affero General Public License for more details. |
26 | |
27 | You should have received a copy of the GNU Affero General Public License |
28 | along with this program. If not, see http://www.gnu.org/licenses/. |
29 | --> |
30 | |
31 | <link href="style.css" type="text/css" rel="stylesheet"> |
32 | </head> |
33 | |
34 | <body> |
35 | |
36 | <!-- Header Start --> |
37 | |
38 | <div class="header"> |
39 | |
40 | <h1>Routino : Algorithm</h1> |
41 | |
42 | </div> |
43 | |
44 | <!-- Header End --> |
45 | |
46 | <!-- Content Start --> |
47 | |
48 | <div class="content"> |
49 | |
50 | <h2 id="H_1_1">Algorithms</h2> |
51 | |
52 | This page describes the development of the algorithm that is used in Routino for |
53 | finding routes. |
54 | |
55 | <h3 id="H_1_1_1">Simplest Algorithm</h3> |
56 | |
57 | The algorithm to find a route is fundamentally simple: Start at the beginning, |
58 | follow all possible routes and keep going until you reach the end. |
59 | <p> |
60 | While this method does work, it isn't fast. To be able to find a route quickly |
61 | needs a different algorithm, one that can find the correct answer without |
62 | wasting time on routes that lead nowhere. |
63 | |
64 | <h3 id="H_1_1_2">Improved Algorithm</h3> |
65 | |
66 | The simplest way to do this is to follow all possible segments from the starting |
67 | node to the next nearest node (an intermediate node in the complete journey). |
68 | For each node that is reached store the shortest route from the starting node |
69 | and the length of that route. The list of intermediate nodes needs to be |
70 | maintained in order of shortest overall route on the assumption that there is a |
71 | straight line route from here to the end node. |
72 | <br> |
73 | At each point the intermediate node that has the shortest potential overall |
74 | journey time is processed before any other node. From the first node in the |
75 | list follow all possible segments and place the newly discovered nodes into the |
76 | same list ordered in the same way. This will tend to constrain the list of |
77 | nodes examined to be the ones that are between the start and end nodes. If at |
78 | any point you reach a node that has already been reached by a longer route then |
79 | you can discard that route since the newly discovered route is shorter. |
80 | Conversely if the previously discovered route is shorter then discard the new |
81 | route. |
82 | <br> |
83 | At some point the end node will be reached and then any routes with potential |
84 | lengths longer than this actual route can be immediately discarded. The few |
85 | remaining potential routes must be continued until they are found to be shorter |
86 | or have no possibility of being shorter. The shortest possible route is then |
87 | found. |
88 | <p> |
89 | At all times when looking at a node only those segments that are possible by the |
90 | chosen means of transport are followed. This allows the type of transport to be |
91 | handled easily. When finding the quickest route the same rules apply except |
92 | that the criterion for sorting is the shortest potential route (assuming that |
93 | from each node to the end is the fastest possible type of highway). |
94 | <p> |
95 | This method also works, but again it isn't very fast. The problem is that the |
96 | complexity is proportional to the number of nodes or segments in all routes |
97 | examined between the start and end nodes. Maintaining the list of intermediate |
98 | nodes in order is the most complex part. |
99 | |
100 | <h3 id="H_1_1_3">Final Algorithm</h3> |
101 | |
102 | The final algorithm that is implemented in the router is basically the one above |
103 | but with an important difference. Instead of finding a long route among a data |
104 | set of 8,000,000 nodes (number of highway nodes in UK at beginning of 2010) it |
105 | finds one long route in a data set of 1,000,000 nodes and a few hundred very |
106 | short routes in the full data set. Since the time taken to find a route is |
107 | proportional to the number of nodes that need to be considered the main route |
108 | takes 1/10th of the time and the very short routes take almost no time at all. |
109 | <p> |
110 | The solution to making the algorithm fast is therefore to discard most of the |
111 | nodes and only keep the interesting ones. In this case a node is deemed to be |
112 | interesting if it is the junction of three or more segments or the junction of |
113 | two segments with different properties or has a routing restriction different |
114 | from the connecting segments. In the algorithm and following description these |
115 | are classed as <em>super-nodes</em>. Starting at each super-node a |
116 | <em>super-segment</em> is generated that finishes on another super-node and |
117 | contains the <em>shortest</em> path along segments with identical properties |
118 | (and these properties are inherited by the super-segment). The point of |
119 | choosing the shortest route is that since all segments considered have identical |
120 | properties they will be treated identically when properties are taken into |
121 | account. This decision making process can be repeated until the only the most |
122 | important and interesting nodes remain. |
123 | <p class="center"> |
124 | <img alt="Original data" src="example0.png"> |
125 | <br> |
126 | Original Highways |
127 | <p class="center"> |
128 | <img alt="Iteration 1" src="example1.png"> |
129 | <br> |
130 | First Iteration |
131 | <p class="center"> |
132 | <img alt="Iteration 2" src="example2.png"> |
133 | <br> |
134 | Second Iteration |
135 | <p class="center"> |
136 | <img alt="Iteration 3" src="example3.png"> |
137 | <br> |
138 | Third Iteration |
139 | <p class="center"> |
140 | <img alt="Iteration 4" src="example4.png"> |
141 | <br> |
142 | Fourth Iteration |
143 | <p> |
144 | To find a route between a start and finish point now comprises the following |
145 | steps (assuming a shortest route is required): |
146 | <ol> |
147 | <li>Find all shortest routes from the start point along normal segments and |
148 | stopping when super-nodes are reached. |
149 | <li>Find all shortest routes from the end point backwards along normal |
150 | segments and stopping when super-nodes are reached. |
151 | <li>Find the shortest route along super-segments from the set of super-nodes |
152 | in step 1 to the set of super-nodes in step 2 (taking into account the lengths |
153 | found in steps 1 and 2 between the start/finish super-nodes and the ultimate |
154 | start/finish point). |
155 | <li>For each super-segment in step 3 find the shortest route between the two |
156 | end-point super-nodes. |
157 | </ol> |
158 | This multi-step process is considerably quicker than using all nodes but gives a |
159 | result that still contains the full list of nodes that are visited. There are |
160 | some special cases though, for example very short routes that do not pass |
161 | through any super-nodes, or routes that start or finish on a super-node. In |
162 | these cases one or more of the steps listed can be removed or simplified. |
163 | <p> |
164 | When the first route reaches the final node the length of that route is retained |
165 | as a benchmark. Any shorter complete route that is calculated later would |
166 | replace this benchmark. As routes are tested any partial routes that are longer |
167 | than the benchmark can be immediately discarded. Other partial routes have the |
168 | length of a perfect straight highway to the final node added to them and if the |
169 | total exceeds the benchmark they can also be discarded. Very quickly the number |
170 | of possible routes is reduced until the absolute shortest is found. |
171 | <p> |
172 | For routes that do not start or finish on a node in the original data set a fake |
173 | node is added to an existing segment. This requires special handling in the |
174 | algorithm but it gives mode flexibility for the start, finish and intermediate |
175 | points in a route. |
176 | |
177 | <h4 id="H_1_1_3_1">Algorithm Evolution</h4> |
178 | |
179 | In Routino versions 1.0 to 1.4 the algorithm used to select a super-node was the |
180 | same as above except that node properties were not included. Routino versions |
181 | 1.4.1 to 1.5.1 used a slightly different algorithm which only chose nodes that |
182 | were junctions between segments with different properties (or has a routing |
183 | restriction that is different from connecting segments in versions 1.5 and |
184 | 1.5.1). The addition of turn restrictions (described in more detail below) |
185 | requires the original algorithm since the super-segments more accurately reflect |
186 | the underlying topology. |
187 | |
188 | <h4 id="H_1_1_3_2">Algorithm Implementation</h4> |
189 | |
190 | The algorithm that is used for finding the route between the super-nodes using |
191 | super-segments is the A* algorithm (or a slight variation of it). This was not |
192 | a deliberate design decision, but evolved into it during development. This |
193 | algorithm relies on calculating the lowest score (shortest distance or quickest |
194 | time) to each node from the starting node. The remaining score for the path to |
195 | the destination node is estimated (based on a straight line using the fastest |
196 | type of highway) and added to the current score and the result recorded. At |
197 | each step the unvisited node that has the lowest current score is examined and |
198 | all nodes connected to it have their scores calculated. When the destination |
199 | node has been reached all remaining unvisited nodes with scores higher than the |
200 | destination node's score can be discarded and the few remaining nodes examined. |
201 | <p> |
202 | The algorithm used to find the route between super-nodes using normal segments |
203 | is Dijkstra's algorithm (although it is implemented as the same algorithm as |
204 | above but with no estimated cost). Since these routes tend to be short and the |
205 | CPU time for calculating the heuristic cost function is relatively large this |
206 | tends to give a quicker solution. |
207 | |
208 | |
209 | <h3 id="H_1_1_4">Routing Preferences</h3> |
210 | |
211 | One of the important features of Routino is the ability to select a route that |
212 | is optimum for a set of criteria such as preferences for each type of highway, |
213 | speed limits and other restrictions and highway properties. |
214 | <p> |
215 | All of these features are handled by assigning a score to each segment while |
216 | calculating the route and trying to minimise the score rather than simply |
217 | minimising the length. |
218 | <dl> |
219 | <dt>Segment length |
220 | <dd>When calculating the shortest route the length of the segment is the |
221 | starting point for the score. |
222 | <dt>Speed preference |
223 | <dd>When calculating the quickest route the time taken calculated from the |
224 | length of the segment and the lower of the highway's own speed limit and the |
225 | user's speed preference for the type of highway is the starting point for the |
226 | score. |
227 | <dt>One-way restriction |
228 | <dd>If a highway has the one-way property in the opposite direction to the |
229 | desired travel and the user's preference is to obey one-way restrictions then |
230 | the segment is ignored. |
231 | <dt>Weight, height, width & length limits |
232 | <dd>If a highway has one of these limits and its value is less than the user's |
233 | specified requirement then the segment is ignored. |
234 | <dt>Highway preference |
235 | <dd>The highway preference specified by the user is a percentage, these are |
236 | scaled so that the most preferred highway type has a weighted preference of |
237 | 1.0 (0% always has a weighted preference of 0.0). The calculated score for a |
238 | segment is divided by this weighted preference. |
239 | <dt>Highway properties |
240 | <dd>The other highway properties are specified by the user as a percentage and |
241 | each highway either has that property or not. The user's property preference |
242 | is scaled into the range 0.0 (for 0%) to 1.0 (for 100%) to give a weighted |
243 | preference, a second "non-property" weighted preference is calculated in the |
244 | same way after subtracting the user's preference from 100%. If a segment has |
245 | a particular property then the calculated score is divided by the weighted |
246 | preference for that property, if not then it is divided by the non-property |
247 | weighted preference. A non-linear transformation is applied so that changing |
248 | property preferences close to 50% do not cause large variations in routes. |
249 | </dl> |
250 | |
251 | <h3 id="H_1_1_5">Data Pruning</h3> |
252 | |
253 | From version 2.2 there are options to "prune" nodes and segments from the input |
254 | data which means to remove nodes and/or segments without significantly changing |
255 | the routing results. |
256 | <p> |
257 | The pruning options must meet a number of conditions to be useful: |
258 | <ul> |
259 | <li>The topology relevant to routing must remain unchanged. The instructions |
260 | that are produced from the reduced set of nodes and segments must be |
261 | sufficiently accurate for anybody trying to follow them on the ground. |
262 | <li>Any restrictions belonging to nodes or segments that stop certain types of |
263 | traffic from following a particular highway must be preserved. |
264 | <li>The total length must be calculated using the original data and not the |
265 | simplified data which by its nature will typically be shorter. |
266 | <li>The location of the remaining nodes and segments must be a good |
267 | representation of the original nodes and segments. Since the calculated |
268 | route may be displayed on a map the remaining nodes and segments must |
269 | clearly indicate the route to take. |
270 | </ul> |
271 | <p> |
272 | The prune options all have user-controllable parameters which allow the |
273 | geographical accuracy to be controlled. This means that although the topology |
274 | is the same the geographical accuracy can be sacrificed slightly to minimise the |
275 | number of nodes and segments. |
276 | <p> |
277 | The pruning options that are available are: |
278 | <ul> |
279 | <li>Removing the access permissions for a transport type from segments if it |
280 | is not possible to route that transport type from those segments to a |
281 | significant number of other places. The limit on the pruning is set by the |
282 | total length of the isolated group of segments. This significantly increases |
283 | the chance that a route will be found by not putting waypoints in inaccessible |
284 | places. |
285 | <li>Removing short segments, the limit is set by the length of the segment. |
286 | This removes a number of redundant segments (and associated nodes) but rules |
287 | are applied to ensure that removing the segments does not alter junction |
288 | topology or remove node access permissions or changes in way properties. |
289 | <li>Removing nodes from almost straight highways, the limit is set by the |
290 | distance between the remaining segments and the original nodes. This removes |
291 | a large number of redundant nodes (and therefore segments) but again care is |
292 | taken not to remove node access permissions or changes in way properties. |
293 | </ul> |
294 | |
295 | <h3 id="H_1_1_6">Turn Restrictions</h3> |
296 | |
297 | The addition of turn restrictions in version 2.0 adds a set of further |
298 | complications because it introduces a set of constraints that are far more |
299 | complex than one-way streets. |
300 | <p> |
301 | A turn restriction in the simplest case is a combination of a segment, node and |
302 | segment such that routes are not allowed to go from the first segment to the |
303 | second one through the specified node. Exceptions for certain types of traffic |
304 | can also be specified. Currently only this simplest type of turn restriction is |
305 | handled by the algorithm. |
306 | <p> |
307 | The first complication of turn restrictions is that the algorithm above requires |
308 | that super-segments are composed of segments with identical properties. A turn |
309 | restriction is not the same in both directions so a super-segment cannot include |
310 | any route through that turn restriction. The node at the centre of the turn |
311 | restriction must therefore be a super-node to avoid this. In addition to this |
312 | all nodes connected to the turn restriction node by a single segment must also |
313 | be super-nodes to avoid any long-distance super-segments starting at the |
314 | restricted node. |
315 | <p> |
316 | The second complication of a turn restriction is that the optimum route may |
317 | require passing through the same node more than once. This can happen where the |
318 | route needs to work around a turn restriction by driving past it, turning round |
319 | (on a roundabout perhaps) and coming back along the same highway. Without turn |
320 | restrictions a route could be defined purely by the set of nodes that were |
321 | passed; no node would exist more than once along a route between two points. |
322 | With turn restrictions the route is defined by a node and the segment used to |
323 | get there; no route between two points will ever need to follow the same segment |
324 | in the same direction more than once. This means that the optimisation |
325 | algorithm calculates scores for directed segments (indexed by segment and end |
326 | node) rather than for nodes. |
327 | <p> |
328 | A side-effect of this is that a route that works around a turn restriction must |
329 | be calculable using the super-segments that are stored in the database. This |
330 | puts a limit on the amount of database optimisation that can be performed |
331 | because if too many super-segments are removed the optimum work-around may also |
332 | be removed. The solution to this is to ensure that the database preserves all |
333 | loops that can be used to turn around and reverse direction, previously |
334 | super-segments that started and finished on the same super-node were disallowed. |
335 | <p> |
336 | Another side-effect of having the route composed of a set of locations (nodes) |
337 | as well as the direction of travel (segments used to reach them) is that via |
338 | points in the route can be forced to continue in the original direction. If the |
339 | chosen method of transport obeys turn restrictions then it will not reverse |
340 | direction at a via point but will find an optimum route continuing in the same |
341 | direction. The only exception to this is when the route ahead at a waypoint is |
342 | into a dead-end and an immediate U-turn is allowed. |
343 | <p> |
344 | A side-effect of having the starting direction at a via point defined by the |
345 | previous part of the route is that overall non-optimal routes may be found even |
346 | though each section between via points is optimal. For a route with a start, |
347 | middle and end point defined it can be the case that the shortest route from the |
348 | start to the middle arrives in the opposite direction to that required for the |
349 | optimal route from the middle to the end. The calculation of the route in |
350 | separate sections therefore may give a non-optimum result even though each |
351 | section is itself optimum based on the start conditions. |
352 | <p> |
353 | Overall the presence of turn restrictions in the database makes the routing |
354 | slower even for regions of the map that have no turn restrictions. |
355 | |
356 | <h3 id="H_1_1_7">Data Implementation</h3> |
357 | |
358 | The hardest part of implementing this router is the data organisation. The |
359 | arrangement of the data to minimise the number of operations required to follow |
360 | a route from one node to another is much harder than designing the algorithm |
361 | itself. |
362 | <p> |
363 | The final implementation uses a separate table for nodes, segments and ways. |
364 | Each table individually is implemented as a C-language data structure that is |
365 | written to disk by a program which parses the OpenStreetMap XML data file. In |
366 | the router these data structures are memory mapped so that the operating system |
367 | handles the problems of loading the needed data blocks from disk. |
368 | <p> |
369 | Each node contains a latitude and longitude and they are sorted geographically |
370 | so that converting a latitude and longitude coordinate to a node is fast as well |
371 | as looking up the coordinate of a node. The node also contains the location in |
372 | the array of segments for the first segment that uses that node. |
373 | <br> |
374 | Each segment contains the location of the two nodes as well as the way that the |
375 | segment came from. The location of the next segment that uses one of the two |
376 | nodes is also stored; the next segment for the other node is the following one |
377 | in the array. The length of the segment is also pre-computed and stored. |
378 | <br> |
379 | Each way has a name, a highway type, a list of allowed types of traffic, a speed |
380 | limit, any weight, height, width or length restrictions and the highway |
381 | properties. |
382 | <p> |
383 | The super-nodes are mixed in with the nodes and the super-segments are mixed in |
384 | with the segments. For the nodes they are the same as the normal nodes, so just |
385 | a flag is needed to indicate that they are super. The super-segments are in |
386 | addition to the normal segments so they increase the database size (by about |
387 | 10%) and are also marked with a flag. Some segments are therefore flagged as |
388 | both normal segments and super-segments if they both have the same end nodes. |
389 | <p> |
390 | The relations are stored separately from the nodes, segments and ways. For the |
391 | turn restriction relations the initial and final segments are stored along with |
392 | the restricted node itself. Each node that has a turn restriction is marked in |
393 | the main node storage with a flag to indicate this information. |
394 | |
395 | </div> |
396 | |
397 | <!-- Content End --> |
398 | |
399 | <!-- Footer Start --> |
400 | |
401 | <div class="footer"> |
402 | |
403 | <address> |
404 | © Andrew M. Bishop - <a href="http://www.routino.org/">http://www.routino.org/</a> |
405 | </address> |
406 | |
407 | </div> |
408 | |
409 | <!-- Footer End --> |
410 | |
411 | </body> |
412 | |
413 | </html> |