Routino SVN Repository Browser

Check out the latest version of Routino: svn co http://routino.org/svn/trunk routino

ViewVC logotype

Contents of /trunk/doc/html/algorithm.html

Parent Directory Parent Directory | Revision Log Revision Log


Revision 736 - (show annotations) (download) (as text)
Mon May 30 12:36:12 2011 UTC (13 years, 9 months ago) by amb
File MIME type: text/html
File size: 17576 byte(s)
Describe new philosophy of alloing U-turn at waypoints to avoid dead-ends.

1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
2 <HTML>
3
4 <!--
5 Routino documentation - algorithm
6
7 Part of the Routino routing software.
8
9 This file Copyright 2008-2011 Andrew M. Bishop
10
11 This program is free software: you can redistribute it and/or modify
12 it under the terms of the GNU Affero General Public License as published by
13 the Free Software Foundation, either version 3 of the License, or
14 (at your option) any later version.
15
16 This program is distributed in the hope that it will be useful,
17 but WITHOUT ANY WARRANTY; without even the implied warranty of
18 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
19 GNU Affero General Public License for more details.
20
21 You should have received a copy of the GNU Affero General Public License
22 along with this program. If not, see http://www.gnu.org/licenses/.
23 -->
24
25 <HEAD>
26 <TITLE>Routino : Algorithm</TITLE>
27 <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
28 <LINK href="style.css" type="text/css" rel="stylesheet">
29 </HEAD>
30
31 <BODY>
32
33 <!-- Header Start -->
34
35 <div class="header" align="center">
36
37 <h1>Routino : Algorithm</h1>
38
39 <hr>
40 </div>
41
42 <!-- Header End -->
43
44 <!-- Content Start -->
45
46 <div class="content">
47
48 <h2><a name="H_1_1"></a>Algorithms</h2>
49
50 This page describes the development of the algorithm that is used in Routino for
51 finding routes.
52
53 <h3><a name="H_1_1_1"></a>Simplest Algorithm</h3>
54
55 The algorithm to find a route is fundamentally simple: Start at the beginning,
56 follow all possible routes and keep going until you reach the end.
57 <p>
58 While this method does work, it isn't fast. To be able to find a route quickly
59 needs a different algorithm, one that can find the correct answer without
60 wasting time on routes that lead nowhere.
61
62 <h3><a name="H_1_1_2"></a>Improved Algorithm</h3>
63
64 The simplest way to do this is to follow all possible segments from the starting
65 node to the next nearest node (an intermediate node in the complete journey).
66 For each node that is reached store the shortest route from the starting node
67 and the length of that route. The list of intermediate nodes needs to be
68 maintained in order of shortest overall route on the assumption that there is a
69 straight line route from here to the end node.
70 <br>
71 At each point the intermediate node that has the shortest potential overall
72 journey time is processed before any other node. From the first node in the
73 list follow all possible segments and place the newly discovered nodes into the
74 same list ordered in the same way. This will tend to constrain the list of
75 nodes examined to be the ones that are between the start and end nodes. If at
76 any point you reach a node that has already been reached by a longer route then
77 you can discard that route since the newly discovered route is shorter.
78 Conversely if the previously discovered route is shorter then discard the new
79 route.
80 <br>
81 At some point the end node will be reached and then any routes with potential
82 lengths longer than this actual route can be immediately discarded. The few
83 remaining potential routes must be continued until they are found to be shorter
84 or have no possibility of being shorter. The shortest possible route is then
85 found.
86 <p>
87 At all times when looking at a node only those segments that are possible by the
88 chosen means of transport are followed. This allows the type of transport to be
89 handled easily. When finding the quickest route the same rules apply except
90 that the criterion for sorting is the shortest potential route (assuming that
91 from each node to the end is the fastest possible type of highway).
92 <p>
93 This method also works, but again it isn't very fast. The problem is that the
94 complexity is proportional to the number of nodes or segments in all routes
95 examined between the start and end nodes. Maintaining the list of intermediate
96 nodes in order is the most complex part.
97
98 <h3><a name="H_1_1_3"></a>Final Algorithm</h3>
99
100 The final algorithm that is implemented in the router is basically the one above
101 but with an important difference. Instead of finding a long route among a data
102 set of 8,000,000 nodes (number of highway nodes in UK at beginning of 2010) it
103 finds one long route in a data set of 1,000,000 nodes and a few hundred very
104 short routes in the full data set. Since the time taken to find a route is
105 proportional to the number of nodes that need to be considered the main route
106 takes 1/10th of the time and the very short routes take almost no time at all.
107 <p>
108 The solution to making the algorithm fast is therefore to discard most of the
109 nodes and only keep the interesting ones. In this case a node is deemed to be
110 interesting if it is the junction of three or more segments or the junction of
111 two segments with different properties or has a routing restriction different
112 from the connecting segments. In the algorithm and following description these
113 are classed as <em>super-nodes</em>. Starting at each super-node a
114 <em>super-segment</em> is generated that finishes on another super-node and
115 contains the <em>shortest</em> path along segments with identical properties
116 (and these properties are inherited by the super-segment). The point of
117 choosing the shortest route is that since all segments considered have identical
118 properties they will be treated identically when properties are taken into
119 account. This decision making process can be repeated until the only the most
120 important and interesting nodes remain.
121 <p align="center">
122 <img alt="Original data" src="example0.png">
123 <br>
124 Original Highways
125 <p align="center">
126 <img alt="Iteration 1" src="example1.png">
127 <br>
128 First Iteration
129 <p align="center">
130 <img alt="Iteration 2" src="example2.png">
131 <br>
132 Second Iteration
133 <p align="center">
134 <img alt="Iteration 3" src="example3.png">
135 <br>
136 Third Iteration
137 <p align="center">
138 <img alt="Iteration 4" src="example4.png">
139 <br>
140 Fourth Iteration
141 <p>
142 To find a route between a start and finish point now comprises the following
143 steps (assuming a shortest route is required):
144 <ol>
145 <li>Find all shortest routes from the start point along normal segments and
146 stopping when super-nodes are reached.
147 <li>Find all shortest routes from the end point backwards along normal
148 segments and stopping when super-nodes are reached.
149 <li>Find the shortest route along super-segments from the set of super-nodes
150 in step 1 to the set of super-nodes in step 2 (taking into account the lengths
151 found in steps 1 and 2 between the start/finish super-nodes and the ultimate
152 start/finish point).
153 <li>For each super-segment in step 3 find the shortest route between the two
154 end-point super-nodes.
155 </ol>
156 This multi-step process is considerably quicker than using all nodes but gives a
157 result that still contains the full list of nodes that are visited. There are
158 some special cases though, for example very short routes that do not pass
159 through any super-nodes, or routes that start or finish on a super-node. In
160 these cases one or more of the steps listed can be removed or simplified.
161 <p>
162 When the first route reaches the final node the length of that route is retained
163 as a benchmark. Any shorter complete route that is calculated later would
164 replace this benchmark. As routes are tested any partial routes that are longer
165 than the benchmark can be immediately discarded. Other partial routes have the
166 length of a perfect straight highway to the final node added to them and if the
167 total exceeds the benchmark they can also be discarded. Very quickly the number
168 of possible routes is reduced until the absolute shortest is found.
169 <p>
170 For routes that do not start or finish on a node in the original data set a fake
171 node is added to an existing segment. This requires special handling in the
172 algorithm but it gives mode flexibility for the start, finish and intermediate
173 points in a route.
174
175 <h4><a name="H_1_1_3_1"></a>Algorithm Evolution</h4>
176
177 In Routino versions 1.0 to 1.4 the algorithm used to select a super-node was the
178 same as above except that node properties were not included. Routino versions
179 1.4.1 to 1.5.1 used a slightly different algorithm which only chose nodes that
180 were junctions between segments with different properties (or has a routing
181 restriction that is different from connecting segments in versions 1.5 and
182 1.5.1). The addition of turn restrictions (described in more detail below)
183 requires the original algorithm since the super-segments more accurately reflect
184 the underlying topology.
185
186 <h3><a name="H_1_1_4"></a>Routing Preferences</h3>
187
188 One of the important features of Routino is the ability to select a route that
189 is optimum for a set of criteria such as preferences for each type of highway,
190 speed limits and other restrictions and highway properties.
191 <p>
192 All of these features are handled by assigning a score to each segment while
193 calculating the route and trying to minimise the score rather than simply
194 minimising the length.
195 <dl>
196 <dt>Segment length
197 <dd>When calculating the shortest route the length of the segment is the
198 starting point for the score.
199 <dt>Speed preference
200 <dd>When calculating the quickest route the time taken calculated from the
201 length of the segment and the lower of the highway's own speed limit and the
202 user's speed preference for the type of highway is the starting point for the
203 score.
204 <dt>Oneway restriction
205 <dd>If a highway has the oneway property in the opposite direction to the
206 desired travel and the user's preference is to obey oneway restrictions then
207 the segment is ignored.
208 <dt>Weight, height, width &amp; length limits
209 <dd>If a highway has one of these limits and its value is less than the user's
210 specified requirement then the segment is ignored.
211 <dt>Highway preference
212 <dd>The highway preference specified by the user is a percentage, these are
213 scaled so that the most preferred highway type has a weighted preference of
214 1.0 (0% always has a weighted preference of 0.0). The calculated score for a
215 segment is divided by this weighted preference.
216 <dt>Highway properties
217 <dd>The other highway properties are specified by the user as a percentage and
218 each highway either has that property or not. The user's property preference
219 is scaled into the range 0.0 (for 0%) to 1.0 (for 100%) to give a weighted
220 preference, a second "non-property" weighted preference is calcuated in the
221 same way after subtracting the user's preference from 100%. If a segment has
222 a particular property then the calculated score is divided by the weighted
223 preference for that property, if not then it is divided by the non-property
224 weighted preference. A non-linear transformation is applied so that changing
225 property preferences close to 50% do not cause large variations in routes.
226 </dl>
227
228 <h3><a name="H_1_1_5"></a>Turn Restrictions</h3>
229
230 The addition of turn restrictions adds a set of further complications because it
231 introduces a set of constraints that are far more complex than one-way streets.
232 <p>
233 A turn restriction in the simplest case is a combination of a segment, node and
234 segment such that routes are not allowed to go from the first segment to the
235 second one through the specified node. Exceptions for certain types of traffic
236 can also be specified. Currently only this simplest type of turn restriction is
237 handled by the algorithm.
238 <p>
239 The first complication of turn restrictions is that the algorithm above requires
240 that super-segments are composed of segments with identical properties. A turn
241 restriction is not the same in both directions so a super-segment cannot include
242 any route through that turn restriction. The node at the centre of the turn
243 restriction must therefore be a super-node to avoid this. In addition to this
244 all nodes connected to the turn restriction node by a single segment must also
245 be super-nodes to avoid any long-distance super-segments starting at the
246 restricted node.
247 <p>
248 The second complication of a turn restriction is that the optimum route may
249 require passing through the same node more than once. This can happen where the
250 route needs to work around a turn restriction by driving past it, turning round
251 (on a roundabout perhaps) and coming back along the same highway. Without turn
252 restrictions a route could be defined purely by the set of nodes that were
253 passed; no node would exist more than once along a route between two points.
254 With turn restrictions the route is defined by a node and the segment used to
255 get there; no route between two points will ever need to follow the same segment
256 in the same direction more than once.
257 <p>
258 A side-effect of this is that a route that works around a turn restriction must
259 be calculable using the super-segments that are stored in the database. This
260 puts a limit on the amount of database optimisation that can be performed
261 because if too many super-segments are removed the optimum work-around may also
262 be removed. The solution to this is to ensure that the database preserves all
263 loops that can be used to turn around and reverse direction, previously
264 super-segments that started and finished on the same super-node were disallowed.
265 <p>
266 Another side-effect of having the route composed of a set of locations (nodes)
267 as well as the direction of travel (segments used to reach them) is that via
268 points in the route can be forced to continue in the original direction. If the
269 chosen method of transport obeys turn restrictions then it will not reverse
270 direction at a via point but will find an optimum route continuing in the same
271 direction. The only exception to this is when the route ahead at a waypoint is
272 into a dead-end and an immediate U-turn is allowed.
273 <p>
274 A side-effect of having the starting direction at a via point defined by the
275 previous part of the route is that overall non-optimal routes may be found even
276 though each section between via points is optimal. For a route with a start,
277 middle and end point defined it can be the case that the shortest route from the
278 start to the middle arrives in the opposite direction to that required for the
279 optimal route from the middle to the end. The calculation of the route in
280 separate sections therefore may give a non-optimum result even though each
281 section is itself optimum based on the start conditions.
282 <p>
283 Overall the presence of turn restrictions in the database makes the routing
284 slower even for regions of the map that have no turn restrictions.
285
286 <h3><a name="H_1_1_4"></a>Data Implementation</h3>
287
288 The hardest part of implementing this router is the data organisation. The
289 arrangement of the data to minimise the number of operations required to follow
290 a route from one node to another is much harder than designing the algorithm
291 itself.
292 <p>
293 The final implementation uses a separate table for nodes, segments and ways.
294 Each table individually is implemented as a C-language data structure that is
295 written to disk by a program which parses the OpenStreetMap XML data file. In
296 the router these data structures are memory mapped so that the operating system
297 handles the problems of loading the needed data blocks from disk.
298 <p>
299 Each node contains a latitude and longitude and they are sorted geographically
300 so that converting a latitude and longitude coordinate to a node is fast as well
301 as looking up the coordinate of a node. The node also contains the location in
302 the array of segments for the first segment that uses that node.
303 <br>
304 Each segment contains the location of the two nodes as well as the way that the
305 segment came from. The location of the next segment that uses one of the two
306 nodes is also stored; the next segment for the other node is the following one
307 in the array. The length of the segment is also pre-computed and stored.
308 <br>
309 Each way has a name, a highway type, a list of allowed types of traffic, a speed
310 limit, any weight, height, width or length restrictions and the highway
311 properties.
312 <p>
313 The super-nodes are mixed in with the nodes and the super-segments are mixed in
314 with the segments. For the nodes they are the same as the normal nodes, so just
315 a flag is needed to indicate that they are super. The super-segments are in
316 addition to the normal segments so they increase the database size (by about
317 10%) and are also marked with a flag.
318 <p>
319 The relations are stored separately from the nodes, segments and ways. For the
320 turn restriction relations the initial and final segments are stored along with
321 the restricted node itself. Each node that has a turn restriction is marked in
322 the main node storage with a flag to indicate this information.
323
324
325 <h3><a name="H_1_1_6"></a>Practicalities</h3>
326
327 At the time of writing (April 2010) the OpenStreetMap data for Great Britain
328 (taken from
329 <a class="ext" href="http://download.geofabrik.de/osm/europe/" title="GeoFabrik Mirror of OpenStreetMap Data">GeoFabrik</a>
330 ) contains:
331 <ul>
332 <li>14,675,098 nodes
333 <ul>
334 <li>8,767,521 are highway nodes
335 <li>1,120,297 are super-nodes
336 </ul>
337 <li>1,876,822 ways
338 <ul>
339 <li>1,412,898 are highways
340 <ul>
341 <li>9,316,328 highway segments
342 <li>1,641,009 are super-segments
343 </ul>
344 </ul>
345 <li>60,572 relations
346 </ul>
347
348 The database files when generated are 41.5 MB for nodes, 121.6 MB for segments
349 and 12.6 MB for ways and are stored uncompressed. By having at least 200 MB or
350 RAM available the routing can be performed with no disk accesses (once the data
351 has been read once).
352
353
354 </div>
355
356 <!-- Content End -->
357
358 <!-- Footer Start -->
359
360 <div class="footer" align="center">
361 <hr>
362
363 <address>
364 &copy; Andrew M. Bishop = &lt;amb "at" gedanken.demon.co.uk&gt;
365 </address>
366
367 </div>
368
369 <!-- Footer End -->
370
371 </BODY>
372
373 </HTML>