Check out the latest version of Routino: svn co http://routino.org/svn/trunk routino
Annotation of /trunk/doc/ALGORITHM.txt
Parent Directory
|
Revision Log
Revision 416 -
(hide annotations)
(download)
Sun May 30 18:22:44 2010 UTC (14 years, 9 months ago) by amb
File MIME type: text/plain
File size: 7053 byte(s)
Sun May 30 18:22:44 2010 UTC (14 years, 9 months ago) by amb
File MIME type: text/plain
File size: 7053 byte(s)
An update to the current size of the UK database.
1 | amb | 381 | Routino : Algorithm |
2 | =================== | ||
3 | amb | 152 | |
4 | |||
5 | This page describes the development of the algorithm that is used in | ||
6 | the software demonstrated here. | ||
7 | |||
8 | amb | 381 | |
9 | amb | 152 | Simplest Algorithm |
10 | amb | 381 | ------------------ |
11 | amb | 152 | |
12 | The algorithm to find a route is fundamentally simple: Start at the | ||
13 | beginning, follow all possible routes and keep going until you reach | ||
14 | the end. | ||
15 | |||
16 | While this method does work, it isn't fast. To be able to find a route | ||
17 | quickly needs a different algorithm, one that can find the correct | ||
18 | answer without wasting time on routes that lead nowhere. | ||
19 | |||
20 | amb | 381 | |
21 | amb | 152 | Improved Algorithm |
22 | amb | 381 | ------------------ |
23 | amb | 152 | |
24 | The simplest way to do this is to follow all possible segments from the | ||
25 | starting node to the next nearest node (an intermediate node in the | ||
26 | complete journey). For each node that is reached store the shortest | ||
27 | route from the starting node and the length of that route. The list of | ||
28 | intermediate nodes needs to be maintained in order of shortest overall | ||
29 | route on the assumption that there is a straight line route from here | ||
30 | to the end node. | ||
31 | At each point the intermediate node that has the shortest potential | ||
32 | overall journey time is processed before any other node. From the first | ||
33 | node in the list follow all possible segments and place the newly | ||
34 | discovered nodes into the same list ordered in the same way. This will | ||
35 | tend to constrain the list of nodes examined to be the ones that are | ||
36 | between the start and end nodes. If at any point you reach a node that | ||
37 | has already been reached by a longer route then you can discard that | ||
38 | route since the newly discovered route is shorter. Conversely if the | ||
39 | previously discovered route is shorter then discard the new route. | ||
40 | At some point the end node will be reached and then any routes with | ||
41 | potential lengths longer than this actual route can be immediately | ||
42 | discarded. The few remaining potential routes must be continued until | ||
43 | they are found to be shorter or have no possibility of being shorter. | ||
44 | The shortest possible route is then found. | ||
45 | |||
46 | At all times when looking at a node only those segments that are | ||
47 | possible by the chosen means of transport are followed. This allows the | ||
48 | type of transport to be handled easily. When finding the quickest route | ||
49 | the same rules apply except that criterion for sorting is the shortest | ||
50 | potential route (assuming that from each node to the end is the fastest | ||
51 | possible type of highway). | ||
52 | |||
53 | This method also works, but again it isn't very fast. The problem is | ||
54 | that the complexity is proportional to the number of nodes or segments | ||
55 | in all routes examined between the start and end nodes. Maintaining the | ||
56 | list of intermediate nodes in order is the most complex part. | ||
57 | |||
58 | amb | 381 | |
59 | amb | 152 | Final Algorithm |
60 | amb | 381 | --------------- |
61 | amb | 152 | |
62 | The final algorithm that is implemented in the router is basically the | ||
63 | one above but with an important difference. Instead of finding a long | ||
64 | route among a data set of 5,000,000 nodes (number of highway nodes in | ||
65 | UK at beginning of 2009) it finds one long route in a data set of | ||
66 | 500,000 nodes and a few hundred very short routes in the full data set. | ||
67 | Since the time taken to find a route is proportional to the number of | ||
68 | nodes the main route takes 1/10th of the time and the very short routes | ||
69 | take almost no time at all. | ||
70 | |||
71 | The solution to making the algorithm fast is therefore to discard most | ||
72 | of the nodes and only keep the interesting ones. In this case a node is | ||
73 | deemed to be interesting if it is the junction of three or more | ||
74 | segments or the junction of two segments with different properties. In | ||
75 | the algorithm these are classed as super-nodes. Between each super-node | ||
76 | and a neighbouring super-node a super-segment is generated that | ||
77 | contains the shortest path along segments with identical properties. | ||
78 | This decision making process can be repeated until the only the most | ||
79 | important and interesting nodes remain. | ||
80 | |||
81 | To find a route now comprises finding a route along super-segments | ||
82 | between the start node and the end node followed by finding a route | ||
83 | between each adjacent pair of super-nodes in that route. (The routes | ||
84 | between the start and end nodes and all adjacent super-nodes need to be | ||
85 | found before the route using only super-nodes can be found.) This is | ||
86 | considerably quicker than using all nodes but gives a result that still | ||
87 | contains the full list of nodes that are visited. | ||
88 | |||
89 | amb | 381 | |
90 | amb | 152 | Implementation |
91 | amb | 381 | -------------- |
92 | amb | 152 | |
93 | The hardest part of implementing this router is the data organisation. | ||
94 | The arrangement of the data to minimise the number of operations | ||
95 | required to follow a route from one node to another is much harder than | ||
96 | designing the algorithm itself. | ||
97 | |||
98 | The final implementation uses a separate table for nodes, segments and | ||
99 | ways. Each table individually is implemented as a C-language data | ||
100 | structure that is written to disk by a program which parses the | ||
101 | OpenStreetMap XML data file. In the router these data structures are | ||
102 | memory mapped so that the operating system handles the problems of | ||
103 | loading the needed data blocks from disk. | ||
104 | |||
105 | Each node contains a latitude and longitude and they are sorted | ||
106 | geographically so that converting a latitude and longitude coordinate | ||
107 | to a node is fast as well as looking up the coordinate of a node. The | ||
108 | node also contains the location in the array of segments for the first | ||
109 | segment that uses that node. | ||
110 | Each segment contains the location of the two nodes as well as the way | ||
111 | that the segment came from. The location of the next segment that uses | ||
112 | one of the two nodes is also stored; the next segment for the other | ||
113 | node is the following one in the array. The length of the segment is | ||
114 | also pre-computed and stored. | ||
115 | Each way has a name, a highway type, a list of allowed types of | ||
116 | traffic, a speed limit and any weight, height, width or length | ||
117 | restrictions. | ||
118 | |||
119 | The super-nodes are mixed in with the nodes and the super-segments are | ||
120 | mixed in with the segments. For the nodes they are the same as the | ||
121 | normal nodes, so just a flag is needed to indicate that they are super. | ||
122 | The super-segments are in addition to the normal segments so they | ||
123 | increase the database size (by about 10%) and are also marked with a | ||
124 | flag. | ||
125 | |||
126 | amb | 381 | |
127 | amb | 152 | Practicalities |
128 | amb | 381 | -------------- |
129 | amb | 152 | |
130 | amb | 381 | At the time of writing (April 2010) the OpenStreetMap data for Great |
131 | amb | 416 | Britain (taken fromGeoFabrik) contains: |
132 | amb | 381 | * 14,675,098 nodes |
133 | + 8,767,521 are highway nodes | ||
134 | + 1,120,297 are super-nodes | ||
135 | * 1,876,822 ways | ||
136 | amb | 416 | + 1,412,898 are highways |
137 | o 9,316,328 highway segments | ||
138 | amb | 381 | o 1,641,009 are super-segments |
139 | * 60,572 relations | ||
140 | amb | 152 | |
141 | The database files when generated are 41.5 MB for nodes, 121.6 MB for | ||
142 | segments and 12.6 MB for ways and are stored uncompressed. By having at | ||
143 | least 200 MB or RAM available the routing can be performed with no disk | ||
144 | accesses (once the data has been read once). | ||
145 | |||
146 | |||
147 | -------- | ||
148 | |||
149 | amb | 381 | Copyright 2008-2010 Andrew M. Bishop. |
Properties
Name | Value |
---|---|
cvs:description | Description of the algorithm. |