Routino SVN Repository Browser

Check out the latest version of Routino: svn co http://routino.org/svn/trunk routino

ViewVC logotype

Annotation of /trunk/doc/ALGORITHM.txt

Parent Directory Parent Directory | Revision Log Revision Log


Revision 416 - (hide annotations) (download)
Sun May 30 18:22:44 2010 UTC (14 years, 9 months ago) by amb
File MIME type: text/plain
File size: 7053 byte(s)
An update to the current size of the UK database.

1 amb 381 Routino : Algorithm
2     ===================
3 amb 152
4    
5     This page describes the development of the algorithm that is used in
6     the software demonstrated here.
7    
8 amb 381
9 amb 152 Simplest Algorithm
10 amb 381 ------------------
11 amb 152
12     The algorithm to find a route is fundamentally simple: Start at the
13     beginning, follow all possible routes and keep going until you reach
14     the end.
15    
16     While this method does work, it isn't fast. To be able to find a route
17     quickly needs a different algorithm, one that can find the correct
18     answer without wasting time on routes that lead nowhere.
19    
20 amb 381
21 amb 152 Improved Algorithm
22 amb 381 ------------------
23 amb 152
24     The simplest way to do this is to follow all possible segments from the
25     starting node to the next nearest node (an intermediate node in the
26     complete journey). For each node that is reached store the shortest
27     route from the starting node and the length of that route. The list of
28     intermediate nodes needs to be maintained in order of shortest overall
29     route on the assumption that there is a straight line route from here
30     to the end node.
31     At each point the intermediate node that has the shortest potential
32     overall journey time is processed before any other node. From the first
33     node in the list follow all possible segments and place the newly
34     discovered nodes into the same list ordered in the same way. This will
35     tend to constrain the list of nodes examined to be the ones that are
36     between the start and end nodes. If at any point you reach a node that
37     has already been reached by a longer route then you can discard that
38     route since the newly discovered route is shorter. Conversely if the
39     previously discovered route is shorter then discard the new route.
40     At some point the end node will be reached and then any routes with
41     potential lengths longer than this actual route can be immediately
42     discarded. The few remaining potential routes must be continued until
43     they are found to be shorter or have no possibility of being shorter.
44     The shortest possible route is then found.
45    
46     At all times when looking at a node only those segments that are
47     possible by the chosen means of transport are followed. This allows the
48     type of transport to be handled easily. When finding the quickest route
49     the same rules apply except that criterion for sorting is the shortest
50     potential route (assuming that from each node to the end is the fastest
51     possible type of highway).
52    
53     This method also works, but again it isn't very fast. The problem is
54     that the complexity is proportional to the number of nodes or segments
55     in all routes examined between the start and end nodes. Maintaining the
56     list of intermediate nodes in order is the most complex part.
57    
58 amb 381
59 amb 152 Final Algorithm
60 amb 381 ---------------
61 amb 152
62     The final algorithm that is implemented in the router is basically the
63     one above but with an important difference. Instead of finding a long
64     route among a data set of 5,000,000 nodes (number of highway nodes in
65     UK at beginning of 2009) it finds one long route in a data set of
66     500,000 nodes and a few hundred very short routes in the full data set.
67     Since the time taken to find a route is proportional to the number of
68     nodes the main route takes 1/10th of the time and the very short routes
69     take almost no time at all.
70    
71     The solution to making the algorithm fast is therefore to discard most
72     of the nodes and only keep the interesting ones. In this case a node is
73     deemed to be interesting if it is the junction of three or more
74     segments or the junction of two segments with different properties. In
75     the algorithm these are classed as super-nodes. Between each super-node
76     and a neighbouring super-node a super-segment is generated that
77     contains the shortest path along segments with identical properties.
78     This decision making process can be repeated until the only the most
79     important and interesting nodes remain.
80    
81     To find a route now comprises finding a route along super-segments
82     between the start node and the end node followed by finding a route
83     between each adjacent pair of super-nodes in that route. (The routes
84     between the start and end nodes and all adjacent super-nodes need to be
85     found before the route using only super-nodes can be found.) This is
86     considerably quicker than using all nodes but gives a result that still
87     contains the full list of nodes that are visited.
88    
89 amb 381
90 amb 152 Implementation
91 amb 381 --------------
92 amb 152
93     The hardest part of implementing this router is the data organisation.
94     The arrangement of the data to minimise the number of operations
95     required to follow a route from one node to another is much harder than
96     designing the algorithm itself.
97    
98     The final implementation uses a separate table for nodes, segments and
99     ways. Each table individually is implemented as a C-language data
100     structure that is written to disk by a program which parses the
101     OpenStreetMap XML data file. In the router these data structures are
102     memory mapped so that the operating system handles the problems of
103     loading the needed data blocks from disk.
104    
105     Each node contains a latitude and longitude and they are sorted
106     geographically so that converting a latitude and longitude coordinate
107     to a node is fast as well as looking up the coordinate of a node. The
108     node also contains the location in the array of segments for the first
109     segment that uses that node.
110     Each segment contains the location of the two nodes as well as the way
111     that the segment came from. The location of the next segment that uses
112     one of the two nodes is also stored; the next segment for the other
113     node is the following one in the array. The length of the segment is
114     also pre-computed and stored.
115     Each way has a name, a highway type, a list of allowed types of
116     traffic, a speed limit and any weight, height, width or length
117     restrictions.
118    
119     The super-nodes are mixed in with the nodes and the super-segments are
120     mixed in with the segments. For the nodes they are the same as the
121     normal nodes, so just a flag is needed to indicate that they are super.
122     The super-segments are in addition to the normal segments so they
123     increase the database size (by about 10%) and are also marked with a
124     flag.
125    
126 amb 381
127 amb 152 Practicalities
128 amb 381 --------------
129 amb 152
130 amb 381 At the time of writing (April 2010) the OpenStreetMap data for Great
131 amb 416 Britain (taken fromGeoFabrik) contains:
132 amb 381 * 14,675,098 nodes
133     + 8,767,521 are highway nodes
134     + 1,120,297 are super-nodes
135     * 1,876,822 ways
136 amb 416 + 1,412,898 are highways
137     o 9,316,328 highway segments
138 amb 381 o 1,641,009 are super-segments
139     * 60,572 relations
140 amb 152
141     The database files when generated are 41.5 MB for nodes, 121.6 MB for
142     segments and 12.6 MB for ways and are stored uncompressed. By having at
143     least 200 MB or RAM available the routing can be performed with no disk
144     accesses (once the data has been read once).
145    
146    
147     --------
148    
149 amb 381 Copyright 2008-2010 Andrew M. Bishop.

Properties

Name Value
cvs:description Description of the algorithm.