Routino SVN Repository Browser

Check out the latest version of Routino: svn co http://routino.org/svn/trunk routino

ViewVC logotype

Annotation of /trunk/doc/ALGORITHM.txt

Parent Directory Parent Directory | Revision Log Revision Log


Revision 901 - (hide annotations) (download)
Sat Nov 12 11:26:25 2011 UTC (13 years, 4 months ago) by amb
File MIME type: text/plain
File size: 16169 byte(s)
Small formatting changes.

1 amb 429 Routino : Algorithm
2     ===================
3 amb 152
4    
5     This page describes the development of the algorithm that is used in
6 amb 429 Routino for finding routes.
7 amb 152
8 amb 381
9 amb 152 Simplest Algorithm
10 amb 381 ------------------
11 amb 152
12     The algorithm to find a route is fundamentally simple: Start at the
13     beginning, follow all possible routes and keep going until you reach
14     the end.
15    
16     While this method does work, it isn't fast. To be able to find a route
17     quickly needs a different algorithm, one that can find the correct
18     answer without wasting time on routes that lead nowhere.
19    
20 amb 381
21 amb 152 Improved Algorithm
22 amb 381 ------------------
23 amb 152
24     The simplest way to do this is to follow all possible segments from the
25     starting node to the next nearest node (an intermediate node in the
26     complete journey). For each node that is reached store the shortest
27     route from the starting node and the length of that route. The list of
28     intermediate nodes needs to be maintained in order of shortest overall
29     route on the assumption that there is a straight line route from here
30     to the end node.
31     At each point the intermediate node that has the shortest potential
32     overall journey time is processed before any other node. From the first
33     node in the list follow all possible segments and place the newly
34     discovered nodes into the same list ordered in the same way. This will
35     tend to constrain the list of nodes examined to be the ones that are
36     between the start and end nodes. If at any point you reach a node that
37     has already been reached by a longer route then you can discard that
38     route since the newly discovered route is shorter. Conversely if the
39     previously discovered route is shorter then discard the new route.
40     At some point the end node will be reached and then any routes with
41     potential lengths longer than this actual route can be immediately
42     discarded. The few remaining potential routes must be continued until
43     they are found to be shorter or have no possibility of being shorter.
44     The shortest possible route is then found.
45    
46     At all times when looking at a node only those segments that are
47     possible by the chosen means of transport are followed. This allows the
48     type of transport to be handled easily. When finding the quickest route
49 amb 621 the same rules apply except that the criterion for sorting is the
50     shortest potential route (assuming that from each node to the end is
51     the fastest possible type of highway).
52 amb 152
53     This method also works, but again it isn't very fast. The problem is
54     that the complexity is proportional to the number of nodes or segments
55     in all routes examined between the start and end nodes. Maintaining the
56     list of intermediate nodes in order is the most complex part.
57    
58 amb 381
59 amb 152 Final Algorithm
60 amb 381 ---------------
61 amb 152
62     The final algorithm that is implemented in the router is basically the
63     one above but with an important difference. Instead of finding a long
64 amb 445 route among a data set of 8,000,000 nodes (number of highway nodes in
65     UK at beginning of 2010) it finds one long route in a data set of
66     1,000,000 nodes and a few hundred very short routes in the full data
67     set. Since the time taken to find a route is proportional to the number
68 amb 659 of nodes that need to be considered the main route takes 1/10th of the
69     time and the very short routes take almost no time at all.
70 amb 152
71     The solution to making the algorithm fast is therefore to discard most
72     of the nodes and only keep the interesting ones. In this case a node is
73 amb 659 deemed to be interesting if it is the junction of three or more
74     segments or the junction of two segments with different properties or
75     has a routing restriction different from the connecting segments. In
76     the algorithm and following description these are classed as
77 amb 445 super-nodes. Starting at each super-node a super-segment is generated
78     that finishes on another super-node and contains the shortest path
79     along segments with identical properties (and these properties are
80     inherited by the super-segment). The point of choosing the shortest
81     route is that since all segments considered have identical properties
82     they will be treated identically when properties are taken into
83     account. This decision making process can be repeated until the only
84     the most important and interesting nodes remain.
85 amb 152
86 amb 445 To find a route between a start and finish point now comprises the
87     following steps (assuming a shortest route is required):
88 amb 152
89 amb 445 1. Find all shortest routes from the start point along normal segments
90     and stopping when super-nodes are reached.
91     2. Find all shortest routes from the end point backwards along normal
92     segments and stopping when super-nodes are reached.
93     3. Find the shortest route along super-segments from the set of
94     super-nodes in step 1 to the set of super-nodes in step 2 (taking
95     into account the lengths found in steps 1 and 2 between the
96     start/finish super-nodes and the ultimate start/finish point).
97     4. For each super-segment in step 3 find the shortest route between
98     the two end-point super-nodes.
99 amb 381
100 amb 445 This multi-step process is considerably quicker than using all nodes
101     but gives a result that still contains the full list of nodes that are
102     visited. There are some special cases though, for example very short
103     routes that do not pass through any super-nodes, or routes that start
104     or finish on a super-node. In these cases one or more of the steps
105     listed can be removed or simplified.
106    
107 amb 621 When the first route reaches the final node the length of that route is
108     retained as a benchmark. Any shorter complete route that is calculated
109     later would replace this benchmark. As routes are tested any partial
110     routes that are longer than the benchmark can be immediately discarded.
111     Other partial routes have the length of a perfect straight highway to
112     the final node added to them and if the total exceeds the benchmark
113     they can also be discarded. Very quickly the number of possible routes
114     is reduced until the absolute shortest is found.
115    
116     For routes that do not start or finish on a node in the original data
117     set a fake node is added to an existing segment. This requires special
118     handling in the algorithm but it gives mode flexibility for the start,
119     finish and intermediate points in a route.
120    
121    
122 amb 659 Algorithm Evolution
123     -------------------
124    
125     In Routino versions 1.0 to 1.4 the algorithm used to select a
126     super-node was the same as above except that node properties were not
127     included. Routino versions 1.4.1 to 1.5.1 used a slightly different
128     algorithm which only chose nodes that were junctions between segments
129     with different properties (or has a routing restriction that is
130     different from connecting segments in versions 1.5 and 1.5.1). The
131     addition of turn restrictions (described in more detail below) requires
132     the original algorithm since the super-segments more accurately reflect
133     the underlying topology.
134    
135    
136 amb 445 Routing Preferences
137 amb 621 -------------------
138 amb 445
139     One of the important features of Routino is the ability to select a
140     route that is optimum for a set of criteria such as preferences for
141     each type of highway, speed limits and other restrictions and highway
142     properties.
143    
144     All of these features are handled by assigning a score to each segment
145     while calculating the route and trying to minimise the score rather
146     than simply minimising the length.
147    
148     Segment length
149     When calculating the shortest route the length of the segment is
150     the starting point for the score.
151    
152     Speed preference
153     When calculating the quickest route the time taken calculated
154     from the length of the segment and the lower of the highway's
155     own speed limit and the user's speed preference for the type of
156     highway is the starting point for the score.
157    
158 amb 737 One-way restriction
159     If a highway has the one-way property in the opposite direction
160 amb 445 to the desired travel and the user's preference is to obey
161 amb 737 one-way restrictions then the segment is ignored.
162 amb 445
163     Weight, height, width & length limits
164     If a highway has one of these limits and its value is less than
165     the user's specified requirement then the segment is ignored.
166    
167     Highway preference
168     The highway preference specified by the user is a percentage,
169     these are scaled so that the most preferred highway type has a
170     weighted preference of 1.0 (0% always has a weighted preference
171     of 0.0). The calculated score for a segment is divided by this
172     weighted preference.
173    
174     Highway properties
175     The other highway properties are specified by the user as a
176     percentage and each highway either has that property or not. The
177     user's property preference is scaled into the range 0.0 (for 0%)
178 amb 515 to 1.0 (for 100%) to give a weighted preference, a second
179 amb 737 "non-property" weighted preference is calculated in the same way
180 amb 445 after subtracting the user's preference from 100%. If a segment
181 amb 515 has a particular property then the calculated score is divided
182 amb 671 by the weighted preference for that property, if not then it is
183     divided by the non-property weighted preference. A non-linear
184     transformation is applied so that changing property preferences
185     close to 50% do not cause large variations in routes.
186 amb 445
187 amb 901
188 amb 621 Turn Restrictions
189     -----------------
190    
191 amb 659 The addition of turn restrictions adds a set of further complications
192     because it introduces a set of constraints that are far more complex
193     than one-way streets.
194 amb 621
195     A turn restriction in the simplest case is a combination of a segment,
196     node and segment such that routes are not allowed to go from the first
197     segment to the second one through the specified node. Exceptions for
198     certain types of traffic can also be specified. Currently only this
199     simplest type of turn restriction is handled by the algorithm.
200    
201 amb 659 The first complication of turn restrictions is that the algorithm above
202     requires that super-segments are composed of segments with identical
203     properties. A turn restriction is not the same in both directions so a
204     super-segment cannot include any route through that turn restriction.
205     The node at the centre of the turn restriction must therefore be a
206     super-node to avoid this. In addition to this all nodes connected to
207     the turn restriction node by a single segment must also be super-nodes
208     to avoid any long-distance super-segments starting at the restricted
209     node.
210    
211     The second complication of a turn restriction is that the optimum route
212 amb 621 may require passing through the same node more than once. This can
213     happen where the route needs to work around a turn restriction by
214     driving past it, turning round (on a roundabout perhaps) and coming
215     back along the same highway. Without turn restrictions a route could be
216 amb 659 defined purely by the set of nodes that were passed; no node would
217     exist more than once along a route between two points. With turn
218 amb 621 restrictions the route is defined by a node and the segment used to get
219     there; no route between two points will ever need to follow the same
220     segment in the same direction more than once.
221    
222 amb 659 A side-effect of this is that a route that works around a turn
223     restriction must be calculable using the super-segments that are stored
224     in the database. This puts a limit on the amount of database
225     optimisation that can be performed because if too many super-segments
226     are removed the optimum work-around may also be removed. The solution
227     to this is to ensure that the database preserves all loops that can be
228     used to turn around and reverse direction, previously super-segments
229     that started and finished on the same super-node were disallowed.
230 amb 621
231 amb 659 Another side-effect of having the route composed of a set of locations
232     (nodes) as well as the direction of travel (segments used to reach
233     them) is that via points in the route can be forced to continue in the
234     original direction. If the chosen method of transport obeys turn
235     restrictions then it will not reverse direction at a via point but will
236 amb 737 find an optimum route continuing in the same direction. The only
237     exception to this is when the route ahead at a waypoint is into a
238     dead-end and an immediate U-turn is allowed.
239 amb 621
240 amb 659 A side-effect of having the starting direction at a via point defined
241     by the previous part of the route is that overall non-optimal routes
242     may be found even though each section between via points is optimal.
243     For a route with a start, middle and end point defined it can be the
244     case that the shortest route from the start to the middle arrives in
245     the opposite direction to that required for the optimal route from the
246     middle to the end. The calculation of the route in separate sections
247     therefore may give a non-optimum result even though each section is
248     itself optimum based on the start conditions.
249    
250     Overall the presence of turn restrictions in the database makes the
251     routing slower even for regions of the map that have no turn
252     restrictions.
253    
254    
255 amb 621 Data Implementation
256     -------------------
257    
258 amb 152 The hardest part of implementing this router is the data organisation.
259     The arrangement of the data to minimise the number of operations
260     required to follow a route from one node to another is much harder than
261     designing the algorithm itself.
262    
263     The final implementation uses a separate table for nodes, segments and
264     ways. Each table individually is implemented as a C-language data
265     structure that is written to disk by a program which parses the
266     OpenStreetMap XML data file. In the router these data structures are
267     memory mapped so that the operating system handles the problems of
268     loading the needed data blocks from disk.
269    
270     Each node contains a latitude and longitude and they are sorted
271     geographically so that converting a latitude and longitude coordinate
272     to a node is fast as well as looking up the coordinate of a node. The
273     node also contains the location in the array of segments for the first
274     segment that uses that node.
275     Each segment contains the location of the two nodes as well as the way
276     that the segment came from. The location of the next segment that uses
277     one of the two nodes is also stored; the next segment for the other
278     node is the following one in the array. The length of the segment is
279     also pre-computed and stored.
280     Each way has a name, a highway type, a list of allowed types of
281 amb 445 traffic, a speed limit, any weight, height, width or length
282     restrictions and the highway properties.
283 amb 152
284     The super-nodes are mixed in with the nodes and the super-segments are
285     mixed in with the segments. For the nodes they are the same as the
286     normal nodes, so just a flag is needed to indicate that they are super.
287     The super-segments are in addition to the normal segments so they
288     increase the database size (by about 10%) and are also marked with a
289     flag.
290    
291 amb 621 The relations are stored separately from the nodes, segments and ways.
292     For the turn restriction relations the initial and final segments are
293     stored along with the restricted node itself. Each node that has a turn
294     restriction is marked in the main node storage with a flag to indicate
295     this information.
296 amb 381
297 amb 621
298 amb 152 Practicalities
299 amb 381 --------------
300 amb 152
301 amb 381 At the time of writing (April 2010) the OpenStreetMap data for Great
302 amb 429 Britain (taken from GeoFabrik) contains:
303 amb 381 * 14,675,098 nodes
304     + 8,767,521 are highway nodes
305     + 1,120,297 are super-nodes
306     * 1,876,822 ways
307 amb 416 + 1,412,898 are highways
308     o 9,316,328 highway segments
309 amb 381 o 1,641,009 are super-segments
310     * 60,572 relations
311 amb 152
312     The database files when generated are 41.5 MB for nodes, 121.6 MB for
313     segments and 12.6 MB for ways and are stored uncompressed. By having at
314     least 200 MB or RAM available the routing can be performed with no disk
315     accesses (once the data has been read once).
316    
317    
318     --------
319    
320 amb 621 Copyright 2008-2011 Andrew M. Bishop.

Properties

Name Value
cvs:description Description of the algorithm.