Routino SVN Repository Browser

Check out the latest version of Routino: svn co http://routino.org/svn/trunk routino

ViewVC logotype

Annotation of /trunk/doc/html/algorithm.html

Parent Directory Parent Directory | Revision Log Revision Log


Revision 1646 - (hide annotations) (download) (as text)
Fri May 1 17:51:13 2015 UTC (9 years, 10 months ago) by amb
File MIME type: text/html
File size: 20681 byte(s)
Add "meta" header to HTML to help mobile devices and tidy up some CSS.

1 amb 1469 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
2     <html>
3 amb 569
4 amb 1469 <head>
5 amb 1257 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
6 amb 1646 <meta name="viewport" content="width=device-width, initial-scale=1">
7 amb 1257
8     <title>Routino : Algorithm</title>
9    
10 amb 569 <!--
11 amb 584 Routino documentation - algorithm
12 amb 569
13     Part of the Routino routing software.
14    
15 amb 1646 This file Copyright 2008-2015 Andrew M. Bishop
16 amb 569
17     This program is free software: you can redistribute it and/or modify
18     it under the terms of the GNU Affero General Public License as published by
19     the Free Software Foundation, either version 3 of the License, or
20     (at your option) any later version.
21    
22     This program is distributed in the hope that it will be useful,
23     but WITHOUT ANY WARRANTY; without even the implied warranty of
24     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
25     GNU Affero General Public License for more details.
26    
27     You should have received a copy of the GNU Affero General Public License
28     along with this program. If not, see http://www.gnu.org/licenses/.
29     -->
30    
31 amb 1257 <link href="style.css" type="text/css" rel="stylesheet">
32 amb 1469 </head>
33 amb 569
34 amb 1469 <body>
35 amb 569
36     <!-- Header Start -->
37    
38 amb 1469 <div class="header">
39 amb 569
40 amb 577 <h1>Routino : Algorithm</h1>
41 amb 569
42     </div>
43    
44     <!-- Header End -->
45    
46     <!-- Content Start -->
47    
48     <div class="content">
49    
50 amb 1474 <h2 id="H_1_1">Algorithms</h2>
51 amb 569
52 amb 584 This page describes the development of the algorithm that is used in Routino for
53     finding routes.
54 amb 569
55 amb 1474 <h3 id="H_1_1_1">Simplest Algorithm</h3>
56 amb 569
57     The algorithm to find a route is fundamentally simple: Start at the beginning,
58     follow all possible routes and keep going until you reach the end.
59     <p>
60     While this method does work, it isn't fast. To be able to find a route quickly
61     needs a different algorithm, one that can find the correct answer without
62     wasting time on routes that lead nowhere.
63    
64 amb 1474 <h3 id="H_1_1_2">Improved Algorithm</h3>
65 amb 569
66     The simplest way to do this is to follow all possible segments from the starting
67     node to the next nearest node (an intermediate node in the complete journey).
68     For each node that is reached store the shortest route from the starting node
69     and the length of that route. The list of intermediate nodes needs to be
70     maintained in order of shortest overall route on the assumption that there is a
71     straight line route from here to the end node.
72     <br>
73     At each point the intermediate node that has the shortest potential overall
74     journey time is processed before any other node. From the first node in the
75     list follow all possible segments and place the newly discovered nodes into the
76     same list ordered in the same way. This will tend to constrain the list of
77     nodes examined to be the ones that are between the start and end nodes. If at
78     any point you reach a node that has already been reached by a longer route then
79     you can discard that route since the newly discovered route is shorter.
80     Conversely if the previously discovered route is shorter then discard the new
81     route.
82     <br>
83     At some point the end node will be reached and then any routes with potential
84     lengths longer than this actual route can be immediately discarded. The few
85     remaining potential routes must be continued until they are found to be shorter
86     or have no possibility of being shorter. The shortest possible route is then
87     found.
88     <p>
89     At all times when looking at a node only those segments that are possible by the
90     chosen means of transport are followed. This allows the type of transport to be
91     handled easily. When finding the quickest route the same rules apply except
92 amb 621 that the criterion for sorting is the shortest potential route (assuming that
93     from each node to the end is the fastest possible type of highway).
94 amb 569 <p>
95     This method also works, but again it isn't very fast. The problem is that the
96     complexity is proportional to the number of nodes or segments in all routes
97     examined between the start and end nodes. Maintaining the list of intermediate
98     nodes in order is the most complex part.
99    
100 amb 1474 <h3 id="H_1_1_3">Final Algorithm</h3>
101 amb 569
102     The final algorithm that is implemented in the router is basically the one above
103     but with an important difference. Instead of finding a long route among a data
104 amb 584 set of 8,000,000 nodes (number of highway nodes in UK at beginning of 2010) it
105     finds one long route in a data set of 1,000,000 nodes and a few hundred very
106     short routes in the full data set. Since the time taken to find a route is
107 amb 659 proportional to the number of nodes that need to be considered the main route
108     takes 1/10th of the time and the very short routes take almost no time at all.
109 amb 569 <p>
110     The solution to making the algorithm fast is therefore to discard most of the
111     nodes and only keep the interesting ones. In this case a node is deemed to be
112 amb 659 interesting if it is the junction of three or more segments or the junction of
113     two segments with different properties or has a routing restriction different
114     from the connecting segments. In the algorithm and following description these
115     are classed as <em>super-nodes</em>. Starting at each super-node a
116     <em>super-segment</em> is generated that finishes on another super-node and
117     contains the <em>shortest</em> path along segments with identical properties
118     (and these properties are inherited by the super-segment). The point of
119     choosing the shortest route is that since all segments considered have identical
120     properties they will be treated identically when properties are taken into
121     account. This decision making process can be repeated until the only the most
122     important and interesting nodes remain.
123 amb 1469 <p class="center">
124 amb 659 <img alt="Original data" src="example0.png">
125     <br>
126     Original Highways
127 amb 1469 <p class="center">
128 amb 659 <img alt="Iteration 1" src="example1.png">
129     <br>
130     First Iteration
131 amb 1469 <p class="center">
132 amb 659 <img alt="Iteration 2" src="example2.png">
133     <br>
134     Second Iteration
135 amb 1469 <p class="center">
136 amb 659 <img alt="Iteration 3" src="example3.png">
137     <br>
138     Third Iteration
139 amb 1469 <p class="center">
140 amb 659 <img alt="Iteration 4" src="example4.png">
141     <br>
142     Fourth Iteration
143 amb 569 <p>
144 amb 584 To find a route between a start and finish point now comprises the following
145     steps (assuming a shortest route is required):
146     <ol>
147     <li>Find all shortest routes from the start point along normal segments and
148     stopping when super-nodes are reached.
149     <li>Find all shortest routes from the end point backwards along normal
150     segments and stopping when super-nodes are reached.
151     <li>Find the shortest route along super-segments from the set of super-nodes
152     in step 1 to the set of super-nodes in step 2 (taking into account the lengths
153     found in steps 1 and 2 between the start/finish super-nodes and the ultimate
154     start/finish point).
155     <li>For each super-segment in step 3 find the shortest route between the two
156     end-point super-nodes.
157     </ol>
158     This multi-step process is considerably quicker than using all nodes but gives a
159     result that still contains the full list of nodes that are visited. There are
160     some special cases though, for example very short routes that do not pass
161     through any super-nodes, or routes that start or finish on a super-node. In
162     these cases one or more of the steps listed can be removed or simplified.
163 amb 621 <p>
164     When the first route reaches the final node the length of that route is retained
165     as a benchmark. Any shorter complete route that is calculated later would
166     replace this benchmark. As routes are tested any partial routes that are longer
167     than the benchmark can be immediately discarded. Other partial routes have the
168     length of a perfect straight highway to the final node added to them and if the
169     total exceeds the benchmark they can also be discarded. Very quickly the number
170     of possible routes is reduced until the absolute shortest is found.
171     <p>
172     For routes that do not start or finish on a node in the original data set a fake
173     node is added to an existing segment. This requires special handling in the
174     algorithm but it gives mode flexibility for the start, finish and intermediate
175     points in a route.
176 amb 569
177 amb 1474 <h4 id="H_1_1_3_1">Algorithm Evolution</h4>
178 amb 659
179     In Routino versions 1.0 to 1.4 the algorithm used to select a super-node was the
180     same as above except that node properties were not included. Routino versions
181     1.4.1 to 1.5.1 used a slightly different algorithm which only chose nodes that
182     were junctions between segments with different properties (or has a routing
183     restriction that is different from connecting segments in versions 1.5 and
184     1.5.1). The addition of turn restrictions (described in more detail below)
185     requires the original algorithm since the super-segments more accurately reflect
186     the underlying topology.
187    
188 amb 1474 <h4 id="H_1_1_3_2">Algorithm Implementation</h4>
189 amb 1461
190     The algorithm that is used for finding the route between the super-nodes using
191     super-segments is the A* algorithm (or a slight variation of it). This was not
192     a deliberate design decision, but evolved into it during development. This
193     algorithm relies on calculating the lowest score (shortest distance or quickest
194     time) to each node from the starting node. The remaining score for the path to
195     the destination node is estimated (based on a straight line using the fastest
196     type of highway) and added to the current score and the result recorded. At
197     each step the unvisited node that has the lowest current score is examined and
198     all nodes connected to it have their scores calculated. When the destination
199     node has been reached all remaining unvisited nodes with scores higher than the
200     destination node's score can be discarded and the few remaining nodes examined.
201     <p>
202     The algorithm used to find the route between super-nodes using normal segments
203     is Dijkstra's algorithm (although it is implemented as the same algorithm as
204     above but with no estimated cost). Since these routes tend to be short and the
205     CPU time for calculating the heuristic cost function is relatively large this
206     tends to give a quicker solution.
207    
208    
209 amb 1474 <h3 id="H_1_1_4">Routing Preferences</h3>
210 amb 569
211 amb 584 One of the important features of Routino is the ability to select a route that
212     is optimum for a set of criteria such as preferences for each type of highway,
213     speed limits and other restrictions and highway properties.
214     <p>
215     All of these features are handled by assigning a score to each segment while
216     calculating the route and trying to minimise the score rather than simply
217     minimising the length.
218     <dl>
219     <dt>Segment length
220     <dd>When calculating the shortest route the length of the segment is the
221     starting point for the score.
222     <dt>Speed preference
223     <dd>When calculating the quickest route the time taken calculated from the
224     length of the segment and the lower of the highway's own speed limit and the
225     user's speed preference for the type of highway is the starting point for the
226     score.
227 amb 737 <dt>One-way restriction
228     <dd>If a highway has the one-way property in the opposite direction to the
229     desired travel and the user's preference is to obey one-way restrictions then
230 amb 584 the segment is ignored.
231     <dt>Weight, height, width &amp; length limits
232     <dd>If a highway has one of these limits and its value is less than the user's
233     specified requirement then the segment is ignored.
234     <dt>Highway preference
235     <dd>The highway preference specified by the user is a percentage, these are
236     scaled so that the most preferred highway type has a weighted preference of
237     1.0 (0% always has a weighted preference of 0.0). The calculated score for a
238     segment is divided by this weighted preference.
239     <dt>Highway properties
240     <dd>The other highway properties are specified by the user as a percentage and
241     each highway either has that property or not. The user's property preference
242 amb 585 is scaled into the range 0.0 (for 0%) to 1.0 (for 100%) to give a weighted
243 amb 737 preference, a second "non-property" weighted preference is calculated in the
244 amb 584 same way after subtracting the user's preference from 100%. If a segment has
245 amb 585 a particular property then the calculated score is divided by the weighted
246 amb 671 preference for that property, if not then it is divided by the non-property
247     weighted preference. A non-linear transformation is applied so that changing
248     property preferences close to 50% do not cause large variations in routes.
249 amb 584 </dl>
250    
251 amb 1474 <h3 id="H_1_1_5">Data Pruning</h3>
252 amb 584
253 amb 969 From version 2.2 there are options to "prune" nodes and segments from the input
254     data which means to remove nodes and/or segments without significantly changing
255     the routing results.
256 amb 621 <p>
257 amb 969 The pruning options must meet a number of conditions to be useful:
258     <ul>
259     <li>The topology relevant to routing must remain unchanged. The instructions
260     that are produced from the reduced set of nodes and segments must be
261     sufficiently accurate for anybody trying to follow them on the ground.
262     <li>Any restrictions belonging to nodes or segments that stop certain types of
263     traffic from following a particular highway must be preserved.
264     <li>The total length must be calculated using the original data and not the
265     simplified data which by its nature will typically be shorter.
266     <li>The location of the remaining nodes and segments must be a good
267     representation of the original nodes and segments. Since the calculated
268     route may be displayed on a map the remaining nodes and segments must
269     clearly indicate the route to take.
270     </ul>
271     <p>
272     The prune options all have user-controllable parameters which allow the
273 amb 1117 geographical accuracy to be controlled. This means that although the topology
274     is the same the geographical accuracy can be sacrificed slightly to minimise the
275     number of nodes and segments.
276     <p>
277     The pruning options that are available are:
278     <ul>
279     <li>Removing the access permissions for a transport type from segments if it
280     is not possible to route that transport type from those segments to a
281     significant number of other places. The limit on the pruning is set by the
282     total length of the isolated group of segments. This significantly increases
283     the chance that a route will be found by not putting waypoints in inaccessible
284     places.
285     <li>Removing short segments, the limit is set by the length of the segment.
286     This removes a number of redundant segments (and associated nodes) but rules
287     are applied to ensure that removing the segments does not alter junction
288     topology or remove node access permissions or changes in way properties.
289     <li>Removing nodes from almost straight highways, the limit is set by the
290     distance between the remaining segments and the original nodes. This removes
291     a large number of redundant nodes (and therefore segments) but again care is
292     taken not to remove node access permissions or changes in way properties.
293     </ul>
294 amb 969
295 amb 1474 <h3 id="H_1_1_6">Turn Restrictions</h3>
296 amb 969
297     The addition of turn restrictions in version 2.0 adds a set of further
298     complications because it introduces a set of constraints that are far more
299     complex than one-way streets.
300     <p>
301 amb 621 A turn restriction in the simplest case is a combination of a segment, node and
302     segment such that routes are not allowed to go from the first segment to the
303     second one through the specified node. Exceptions for certain types of traffic
304     can also be specified. Currently only this simplest type of turn restriction is
305     handled by the algorithm.
306     <p>
307 amb 659 The first complication of turn restrictions is that the algorithm above requires
308     that super-segments are composed of segments with identical properties. A turn
309     restriction is not the same in both directions so a super-segment cannot include
310     any route through that turn restriction. The node at the centre of the turn
311     restriction must therefore be a super-node to avoid this. In addition to this
312     all nodes connected to the turn restriction node by a single segment must also
313     be super-nodes to avoid any long-distance super-segments starting at the
314     restricted node.
315     <p>
316     The second complication of a turn restriction is that the optimum route may
317 amb 621 require passing through the same node more than once. This can happen where the
318     route needs to work around a turn restriction by driving past it, turning round
319     (on a roundabout perhaps) and coming back along the same highway. Without turn
320     restrictions a route could be defined purely by the set of nodes that were
321 amb 659 passed; no node would exist more than once along a route between two points.
322 amb 621 With turn restrictions the route is defined by a node and the segment used to
323     get there; no route between two points will ever need to follow the same segment
324 amb 1461 in the same direction more than once. This means that the optimisation
325     algorithm calculates scores for directed segments (indexed by segment and end
326     node) rather than for nodes.
327 amb 621 <p>
328 amb 659 A side-effect of this is that a route that works around a turn restriction must
329     be calculable using the super-segments that are stored in the database. This
330     puts a limit on the amount of database optimisation that can be performed
331     because if too many super-segments are removed the optimum work-around may also
332     be removed. The solution to this is to ensure that the database preserves all
333     loops that can be used to turn around and reverse direction, previously
334     super-segments that started and finished on the same super-node were disallowed.
335     <p>
336     Another side-effect of having the route composed of a set of locations (nodes)
337     as well as the direction of travel (segments used to reach them) is that via
338     points in the route can be forced to continue in the original direction. If the
339     chosen method of transport obeys turn restrictions then it will not reverse
340     direction at a via point but will find an optimum route continuing in the same
341 amb 736 direction. The only exception to this is when the route ahead at a waypoint is
342     into a dead-end and an immediate U-turn is allowed.
343 amb 659 <p>
344     A side-effect of having the starting direction at a via point defined by the
345     previous part of the route is that overall non-optimal routes may be found even
346     though each section between via points is optimal. For a route with a start,
347     middle and end point defined it can be the case that the shortest route from the
348     start to the middle arrives in the opposite direction to that required for the
349     optimal route from the middle to the end. The calculation of the route in
350     separate sections therefore may give a non-optimum result even though each
351     section is itself optimum based on the start conditions.
352     <p>
353     Overall the presence of turn restrictions in the database makes the routing
354     slower even for regions of the map that have no turn restrictions.
355 amb 621
356 amb 1474 <h3 id="H_1_1_7">Data Implementation</h3>
357 amb 621
358 amb 569 The hardest part of implementing this router is the data organisation. The
359     arrangement of the data to minimise the number of operations required to follow
360     a route from one node to another is much harder than designing the algorithm
361     itself.
362     <p>
363     The final implementation uses a separate table for nodes, segments and ways.
364     Each table individually is implemented as a C-language data structure that is
365     written to disk by a program which parses the OpenStreetMap XML data file. In
366     the router these data structures are memory mapped so that the operating system
367     handles the problems of loading the needed data blocks from disk.
368     <p>
369     Each node contains a latitude and longitude and they are sorted geographically
370     so that converting a latitude and longitude coordinate to a node is fast as well
371     as looking up the coordinate of a node. The node also contains the location in
372     the array of segments for the first segment that uses that node.
373     <br>
374     Each segment contains the location of the two nodes as well as the way that the
375     segment came from. The location of the next segment that uses one of the two
376     nodes is also stored; the next segment for the other node is the following one
377     in the array. The length of the segment is also pre-computed and stored.
378     <br>
379     Each way has a name, a highway type, a list of allowed types of traffic, a speed
380 amb 584 limit, any weight, height, width or length restrictions and the highway
381     properties.
382 amb 569 <p>
383     The super-nodes are mixed in with the nodes and the super-segments are mixed in
384     with the segments. For the nodes they are the same as the normal nodes, so just
385     a flag is needed to indicate that they are super. The super-segments are in
386     addition to the normal segments so they increase the database size (by about
387 amb 969 10%) and are also marked with a flag. Some segments are therefore flagged as
388     both normal segments and super-segments if they both have the same end nodes.
389 amb 621 <p>
390     The relations are stored separately from the nodes, segments and ways. For the
391     turn restriction relations the initial and final segments are stored along with
392     the restricted node itself. Each node that has a turn restriction is marked in
393     the main node storage with a flag to indicate this information.
394 amb 569
395     </div>
396    
397     <!-- Content End -->
398    
399     <!-- Footer Start -->
400    
401 amb 1469 <div class="footer">
402 amb 569
403     <address>
404 amb 1267 &copy; Andrew M. Bishop - <a href="http://www.routino.org/">http://www.routino.org/</a>
405 amb 569 </address>
406    
407     </div>
408    
409     <!-- Footer End -->
410    
411 amb 1469 </body>
412 amb 569
413 amb 1469 </html>