Routino SVN Repository Browser

Check out the latest version of Routino: svn co http://routino.org/svn/trunk routino

ViewVC logotype

Contents of /trunk/doc/ALGORITHM.txt

Parent Directory Parent Directory | Revision Log Revision Log


Revision 679 - (show annotations) (download)
Sat Apr 23 16:05:56 2011 UTC (13 years, 11 months ago) by amb
File MIME type: text/plain
File size: 16161 byte(s)
Add description of U-turns at dead-ends.

1 Routino : Algorithm
2 ===================
3
4
5 This page describes the development of the algorithm that is used in
6 Routino for finding routes.
7
8
9 Simplest Algorithm
10 ------------------
11
12 The algorithm to find a route is fundamentally simple: Start at the
13 beginning, follow all possible routes and keep going until you reach
14 the end.
15
16 While this method does work, it isn't fast. To be able to find a route
17 quickly needs a different algorithm, one that can find the correct
18 answer without wasting time on routes that lead nowhere.
19
20
21 Improved Algorithm
22 ------------------
23
24 The simplest way to do this is to follow all possible segments from the
25 starting node to the next nearest node (an intermediate node in the
26 complete journey). For each node that is reached store the shortest
27 route from the starting node and the length of that route. The list of
28 intermediate nodes needs to be maintained in order of shortest overall
29 route on the assumption that there is a straight line route from here
30 to the end node.
31 At each point the intermediate node that has the shortest potential
32 overall journey time is processed before any other node. From the first
33 node in the list follow all possible segments and place the newly
34 discovered nodes into the same list ordered in the same way. This will
35 tend to constrain the list of nodes examined to be the ones that are
36 between the start and end nodes. If at any point you reach a node that
37 has already been reached by a longer route then you can discard that
38 route since the newly discovered route is shorter. Conversely if the
39 previously discovered route is shorter then discard the new route.
40 At some point the end node will be reached and then any routes with
41 potential lengths longer than this actual route can be immediately
42 discarded. The few remaining potential routes must be continued until
43 they are found to be shorter or have no possibility of being shorter.
44 The shortest possible route is then found.
45
46 At all times when looking at a node only those segments that are
47 possible by the chosen means of transport are followed. This allows the
48 type of transport to be handled easily. When finding the quickest route
49 the same rules apply except that the criterion for sorting is the
50 shortest potential route (assuming that from each node to the end is
51 the fastest possible type of highway).
52
53 This method also works, but again it isn't very fast. The problem is
54 that the complexity is proportional to the number of nodes or segments
55 in all routes examined between the start and end nodes. Maintaining the
56 list of intermediate nodes in order is the most complex part.
57
58
59 Final Algorithm
60 ---------------
61
62 The final algorithm that is implemented in the router is basically the
63 one above but with an important difference. Instead of finding a long
64 route among a data set of 8,000,000 nodes (number of highway nodes in
65 UK at beginning of 2010) it finds one long route in a data set of
66 1,000,000 nodes and a few hundred very short routes in the full data
67 set. Since the time taken to find a route is proportional to the number
68 of nodes that need to be considered the main route takes 1/10th of the
69 time and the very short routes take almost no time at all.
70
71 The solution to making the algorithm fast is therefore to discard most
72 of the nodes and only keep the interesting ones. In this case a node is
73 deemed to be interesting if it is the junction of three or more
74 segments or the junction of two segments with different properties or
75 has a routing restriction different from the connecting segments. In
76 the algorithm and following description these are classed as
77 super-nodes. Starting at each super-node a super-segment is generated
78 that finishes on another super-node and contains the shortest path
79 along segments with identical properties (and these properties are
80 inherited by the super-segment). The point of choosing the shortest
81 route is that since all segments considered have identical properties
82 they will be treated identically when properties are taken into
83 account. This decision making process can be repeated until the only
84 the most important and interesting nodes remain.
85
86 To find a route between a start and finish point now comprises the
87 following steps (assuming a shortest route is required):
88
89 1. Find all shortest routes from the start point along normal segments
90 and stopping when super-nodes are reached.
91 2. Find all shortest routes from the end point backwards along normal
92 segments and stopping when super-nodes are reached.
93 3. Find the shortest route along super-segments from the set of
94 super-nodes in step 1 to the set of super-nodes in step 2 (taking
95 into account the lengths found in steps 1 and 2 between the
96 start/finish super-nodes and the ultimate start/finish point).
97 4. For each super-segment in step 3 find the shortest route between
98 the two end-point super-nodes.
99
100 This multi-step process is considerably quicker than using all nodes
101 but gives a result that still contains the full list of nodes that are
102 visited. There are some special cases though, for example very short
103 routes that do not pass through any super-nodes, or routes that start
104 or finish on a super-node. In these cases one or more of the steps
105 listed can be removed or simplified.
106
107 When the first route reaches the final node the length of that route is
108 retained as a benchmark. Any shorter complete route that is calculated
109 later would replace this benchmark. As routes are tested any partial
110 routes that are longer than the benchmark can be immediately discarded.
111 Other partial routes have the length of a perfect straight highway to
112 the final node added to them and if the total exceeds the benchmark
113 they can also be discarded. Very quickly the number of possible routes
114 is reduced until the absolute shortest is found.
115
116 For routes that do not start or finish on a node in the original data
117 set a fake node is added to an existing segment. This requires special
118 handling in the algorithm but it gives mode flexibility for the start,
119 finish and intermediate points in a route.
120
121
122 Algorithm Evolution
123 -------------------
124
125 In Routino versions 1.0 to 1.4 the algorithm used to select a
126 super-node was the same as above except that node properties were not
127 included. Routino versions 1.4.1 to 1.5.1 used a slightly different
128 algorithm which only chose nodes that were junctions between segments
129 with different properties (or has a routing restriction that is
130 different from connecting segments in versions 1.5 and 1.5.1). The
131 addition of turn restrictions (described in more detail below) requires
132 the original algorithm since the super-segments more accurately reflect
133 the underlying topology.
134
135
136 Routing Preferences
137 -------------------
138
139 One of the important features of Routino is the ability to select a
140 route that is optimum for a set of criteria such as preferences for
141 each type of highway, speed limits and other restrictions and highway
142 properties.
143
144 All of these features are handled by assigning a score to each segment
145 while calculating the route and trying to minimise the score rather
146 than simply minimising the length.
147
148 Segment length
149 When calculating the shortest route the length of the segment is
150 the starting point for the score.
151
152 Speed preference
153 When calculating the quickest route the time taken calculated
154 from the length of the segment and the lower of the highway's
155 own speed limit and the user's speed preference for the type of
156 highway is the starting point for the score.
157
158 Oneway restriction
159 If a highway has the oneway property in the opposite direction
160 to the desired travel and the user's preference is to obey
161 oneway restrictions then the segment is ignored.
162
163 Weight, height, width & length limits
164 If a highway has one of these limits and its value is less than
165 the user's specified requirement then the segment is ignored.
166
167 Highway preference
168 The highway preference specified by the user is a percentage,
169 these are scaled so that the most preferred highway type has a
170 weighted preference of 1.0 (0% always has a weighted preference
171 of 0.0). The calculated score for a segment is divided by this
172 weighted preference.
173
174 Highway properties
175 The other highway properties are specified by the user as a
176 percentage and each highway either has that property or not. The
177 user's property preference is scaled into the range 0.0 (for 0%)
178 to 1.0 (for 100%) to give a weighted preference, a second
179 "non-property" weighted preference is calcuated in the same way
180 after subtracting the user's preference from 100%. If a segment
181 has a particular property then the calculated score is divided
182 by the weighted preference for that property, if not then it is
183 divided by the non-property weighted preference. A non-linear
184 transformation is applied so that changing property preferences
185 close to 50% do not cause large variations in routes.
186
187 Turn Restrictions
188 -----------------
189
190 The addition of turn restrictions adds a set of further complications
191 because it introduces a set of constraints that are far more complex
192 than one-way streets.
193
194 A turn restriction in the simplest case is a combination of a segment,
195 node and segment such that routes are not allowed to go from the first
196 segment to the second one through the specified node. Exceptions for
197 certain types of traffic can also be specified. Currently only this
198 simplest type of turn restriction is handled by the algorithm.
199
200 The first complication of turn restrictions is that the algorithm above
201 requires that super-segments are composed of segments with identical
202 properties. A turn restriction is not the same in both directions so a
203 super-segment cannot include any route through that turn restriction.
204 The node at the centre of the turn restriction must therefore be a
205 super-node to avoid this. In addition to this all nodes connected to
206 the turn restriction node by a single segment must also be super-nodes
207 to avoid any long-distance super-segments starting at the restricted
208 node.
209
210 The second complication of a turn restriction is that the optimum route
211 may require passing through the same node more than once. This can
212 happen where the route needs to work around a turn restriction by
213 driving past it, turning round (on a roundabout perhaps) and coming
214 back along the same highway. Without turn restrictions a route could be
215 defined purely by the set of nodes that were passed; no node would
216 exist more than once along a route between two points. With turn
217 restrictions the route is defined by a node and the segment used to get
218 there; no route between two points will ever need to follow the same
219 segment in the same direction more than once.
220
221 A side-effect of this is that a route that works around a turn
222 restriction must be calculable using the super-segments that are stored
223 in the database. This puts a limit on the amount of database
224 optimisation that can be performed because if too many super-segments
225 are removed the optimum work-around may also be removed. The solution
226 to this is to ensure that the database preserves all loops that can be
227 used to turn around and reverse direction, previously super-segments
228 that started and finished on the same super-node were disallowed.
229
230 Another side-effect of having the route composed of a set of locations
231 (nodes) as well as the direction of travel (segments used to reach
232 them) is that via points in the route can be forced to continue in the
233 original direction. If the chosen method of transport obeys turn
234 restrictions then it will not reverse direction at a via point but will
235 find an optimum route continuing in the same direction. This route may
236 require turning around to get out of a dead-end and it is assumed that
237 all dead-ends will allow this.
238
239 A side-effect of having the starting direction at a via point defined
240 by the previous part of the route is that overall non-optimal routes
241 may be found even though each section between via points is optimal.
242 For a route with a start, middle and end point defined it can be the
243 case that the shortest route from the start to the middle arrives in
244 the opposite direction to that required for the optimal route from the
245 middle to the end. The calculation of the route in separate sections
246 therefore may give a non-optimum result even though each section is
247 itself optimum based on the start conditions.
248
249 Overall the presence of turn restrictions in the database makes the
250 routing slower even for regions of the map that have no turn
251 restrictions.
252
253
254 Data Implementation
255 -------------------
256
257 The hardest part of implementing this router is the data organisation.
258 The arrangement of the data to minimise the number of operations
259 required to follow a route from one node to another is much harder than
260 designing the algorithm itself.
261
262 The final implementation uses a separate table for nodes, segments and
263 ways. Each table individually is implemented as a C-language data
264 structure that is written to disk by a program which parses the
265 OpenStreetMap XML data file. In the router these data structures are
266 memory mapped so that the operating system handles the problems of
267 loading the needed data blocks from disk.
268
269 Each node contains a latitude and longitude and they are sorted
270 geographically so that converting a latitude and longitude coordinate
271 to a node is fast as well as looking up the coordinate of a node. The
272 node also contains the location in the array of segments for the first
273 segment that uses that node.
274 Each segment contains the location of the two nodes as well as the way
275 that the segment came from. The location of the next segment that uses
276 one of the two nodes is also stored; the next segment for the other
277 node is the following one in the array. The length of the segment is
278 also pre-computed and stored.
279 Each way has a name, a highway type, a list of allowed types of
280 traffic, a speed limit, any weight, height, width or length
281 restrictions and the highway properties.
282
283 The super-nodes are mixed in with the nodes and the super-segments are
284 mixed in with the segments. For the nodes they are the same as the
285 normal nodes, so just a flag is needed to indicate that they are super.
286 The super-segments are in addition to the normal segments so they
287 increase the database size (by about 10%) and are also marked with a
288 flag.
289
290 The relations are stored separately from the nodes, segments and ways.
291 For the turn restriction relations the initial and final segments are
292 stored along with the restricted node itself. Each node that has a turn
293 restriction is marked in the main node storage with a flag to indicate
294 this information.
295
296
297 Practicalities
298 --------------
299
300 At the time of writing (April 2010) the OpenStreetMap data for Great
301 Britain (taken from GeoFabrik) contains:
302 * 14,675,098 nodes
303 + 8,767,521 are highway nodes
304 + 1,120,297 are super-nodes
305 * 1,876,822 ways
306 + 1,412,898 are highways
307 o 9,316,328 highway segments
308 o 1,641,009 are super-segments
309 * 60,572 relations
310
311 The database files when generated are 41.5 MB for nodes, 121.6 MB for
312 segments and 12.6 MB for ways and are stored uncompressed. By having at
313 least 200 MB or RAM available the routing can be performed with no disk
314 accesses (once the data has been read once).
315
316
317 --------
318
319 Copyright 2008-2011 Andrew M. Bishop.

Properties

Name Value
cvs:description Description of the algorithm.