doc/html/algorithm.html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML>

<!--
 Routino documentation - algorithm

 Part of the Routino routing software.

 This file Copyright 2008-2011 Andrew M. Bishop

 This program is free software: you can redistribute it and/or modify
 it under the terms of the GNU Affero General Public License as published by
 the Free Software Foundation, either version 3 of the License, or
 (at your option) any later version.

 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU Affero General Public License for more details.

 You should have received a copy of the GNU Affero General Public License
 along with this program.  If not, see http://www.gnu.org/licenses/.
-->

<HEAD>
<TITLE>Routino : Algorithm</TITLE>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<LINK href="style.css" type="text/css" rel="stylesheet">
</HEAD>

<BODY>

<!-- Header Start -->

<div class="header" align="center">

<h1>Routino : Algorithm</h1>

<hr>
</div>

<!-- Header End -->

<!-- Content Start -->

<div class="content">

<h2><a name="H_1_1"></a>Algorithms</h2>

This page describes the development of the algorithm that is used in Routino for
finding routes.

<h3><a name="H_1_1_1"></a>Simplest Algorithm</h3>

The algorithm to find a route is fundamentally simple: Start at the beginning,
follow all possible routes and keep going until you reach the end.
<p>
While this method does work, it isn't fast.  To be able to find a route quickly
needs a different algorithm, one that can find the correct answer without
wasting time on routes that lead nowhere.

<h3><a name="H_1_1_2"></a>Improved Algorithm</h3>

The simplest way to do this is to follow all possible segments from the starting
node to the next nearest node (an intermediate node in the complete journey).
For each node that is reached store the shortest route from the starting node
and the length of that route.  The list of intermediate nodes needs to be
maintained in order of shortest overall route on the assumption that there is a
straight line route from here to the end node.
<br>
At each point the intermediate node that has the shortest potential overall
journey time is processed before any other node.  From the first node in the
list follow all possible segments and place the newly discovered nodes into the
same list ordered in the same way.  This will tend to constrain the list of
nodes examined to be the ones that are between the start and end nodes.  If at
any point you reach a node that has already been reached by a longer route then
you can discard that route since the newly discovered route is shorter.
Conversely if the previously discovered route is shorter then discard the new
route.
<br>
At some point the end node will be reached and then any routes with potential
lengths longer than this actual route can be immediately discarded.  The few
remaining potential routes must be continued until they are found to be shorter
or have no possibility of being shorter.  The shortest possible route is then
found.
<p>
At all times when looking at a node only those segments that are possible by the
chosen means of transport are followed.  This allows the type of transport to be
handled easily.  When finding the quickest route the same rules apply except
that the criterion for sorting is the shortest potential route (assuming that
from each node to the end is the fastest possible type of highway).
<p>
This method also works, but again it isn't very fast.  The problem is that the
complexity is proportional to the number of nodes or segments in all routes
examined between the start and end nodes.  Maintaining the list of intermediate
nodes in order is the most complex part.

<h3><a name="H_1_1_3"></a>Final Algorithm</h3>

The final algorithm that is implemented in the router is basically the one above
but with an important difference.  Instead of finding a long route among a data
set of 8,000,000 nodes (number of highway nodes in UK at beginning of 2010) it
finds one long route in a data set of 1,000,000 nodes and a few hundred very
short routes in the full data set.  Since the time taken to find a route is
proportional to the number of nodes that need to be considered the main route
takes 1/10th of the time and the very short routes take almost no time at all.
<p>
The solution to making the algorithm fast is therefore to discard most of the
nodes and only keep the interesting ones.  In this case a node is deemed to be
interesting if it is the junction of three or more segments or the junction of
two segments with different properties or has a routing restriction different
from the connecting segments.  In the algorithm and following description these
are classed as <em>super-nodes</em>.  Starting at each super-node a
<em>super-segment</em> is generated that finishes on another super-node and
contains the <em>shortest</em> path along segments with identical properties
(and these properties are inherited by the super-segment).  The point of
choosing the shortest route is that since all segments considered have identical
properties they will be treated identically when properties are taken into
account.  This decision making process can be repeated until the only the most
important and interesting nodes remain.
<p align="center">
<img alt="Original data" src="example0.png">
<br>
Original Highways
<p align="center">
<img alt="Iteration 1" src="example1.png">
<br>
First Iteration
<p align="center">
<img alt="Iteration 2" src="example2.png">
<br>
Second Iteration
<p align="center">
<img alt="Iteration 3" src="example3.png">
<br>
Third Iteration
<p align="center">
<img alt="Iteration 4" src="example4.png">
<br>
Fourth Iteration
<p>
To find a route between a start and finish point now comprises the following
steps (assuming a shortest route is required):
<ol>
  <li>Find all shortest routes from the start point along normal segments and
  stopping when super-nodes are reached.
  <li>Find all shortest routes from the end point backwards along normal
  segments and stopping when super-nodes are reached.
  <li>Find the shortest route along super-segments from the set of super-nodes
  in step 1 to the set of super-nodes in step 2 (taking into account the lengths
  found in steps 1 and 2 between the start/finish super-nodes and the ultimate
  start/finish point).
  <li>For each super-segment in step 3 find the shortest route between the two
  end-point super-nodes.
</ol>
This multi-step process is considerably quicker than using all nodes but gives a
result that still contains the full list of nodes that are visited.  There are
some special cases though, for example very short routes that do not pass
through any super-nodes, or routes that start or finish on a super-node.  In
these cases one or more of the steps listed can be removed or simplified.
<p>
When the first route reaches the final node the length of that route is retained
as a benchmark.  Any shorter complete route that is calculated later would
replace this benchmark.  As routes are tested any partial routes that are longer
than the benchmark can be immediately discarded.  Other partial routes have the
length of a perfect straight highway to the final node added to them and if the
total exceeds the benchmark they can also be discarded.  Very quickly the number
of possible routes is reduced until the absolute shortest is found.
<p>
For routes that do not start or finish on a node in the original data set a fake
node is added to an existing segment.  This requires special handling in the
algorithm but it gives mode flexibility for the start, finish and intermediate
points in a route.

<h4><a name="H_1_1_3_1"></a>Algorithm Evolution</h4>

In Routino versions 1.0 to 1.4 the algorithm used to select a super-node was the
same as above except that node properties were not included.  Routino versions
1.4.1 to 1.5.1 used a slightly different algorithm which only chose nodes that
were junctions between segments with different properties (or has a routing
restriction that is different from connecting segments in versions 1.5 and
1.5.1).  The addition of turn restrictions (described in more detail below)
requires the original algorithm since the super-segments more accurately reflect
the underlying topology.

<h3><a name="H_1_1_4"></a>Routing Preferences</h3>

One of the important features of Routino is the ability to select a route that
is optimum for a set of criteria such as preferences for each type of highway,
speed limits and other restrictions and highway properties.
<p>
All of these features are handled by assigning a score to each segment while
calculating the route and trying to minimise the score rather than simply
minimising the length.
<dl>
  <dt>Segment length
  <dd>When calculating the shortest route the length of the segment is the
  starting point for the score.
  <dt>Speed preference
  <dd>When calculating the quickest route the time taken calculated from the
  length of the segment and the lower of the highway's own speed limit and the
  user's speed preference for the type of highway is the starting point for the
  score.
  <dt>Oneway restriction
  <dd>If a highway has the oneway property in the opposite direction to the
  desired travel and the user's preference is to obey oneway restrictions then
  the segment is ignored.
  <dt>Weight, height, width &amp; length limits
  <dd>If a highway has one of these limits and its value is less than the user's
  specified requirement then the segment is ignored.
  <dt>Highway preference
  <dd>The highway preference specified by the user is a percentage, these are
  scaled so that the most preferred highway type has a weighted preference of
  1.0 (0% always has a weighted preference of 0.0).  The calculated score for a
  segment is divided by this weighted preference.
  <dt>Highway properties
  <dd>The other highway properties are specified by the user as a percentage and
  each highway either has that property or not.  The user's property preference
  is scaled into the range 0.0 (for 0%) to 1.0 (for 100%) to give a weighted
  preference, a second "non-property" weighted preference is calcuated in the
  same way after subtracting the user's preference from 100%.  If a segment has
  a particular property then the calculated score is divided by the weighted
  preference for that property, if not then it is divided by the non-property
  weighted preference.  A non-linear transformation is applied so that changing
  property preferences close to 50% do not cause large variations in routes.
</dl>

<h3><a name="H_1_1_5"></a>Turn Restrictions</h3>

The addition of turn restrictions adds a set of further complications because it
introduces a set of constraints that are far more complex than one-way streets.
<p>
A turn restriction in the simplest case is a combination of a segment, node and
segment such that routes are not allowed to go from the first segment to the
second one through the specified node.  Exceptions for certain types of traffic
can also be specified.  Currently only this simplest type of turn restriction is
handled by the algorithm.
<p>
The first complication of turn restrictions is that the algorithm above requires
that super-segments are composed of segments with identical properties.  A turn
restriction is not the same in both directions so a super-segment cannot include
any route through that turn restriction.  The node at the centre of the turn
restriction must therefore be a super-node to avoid this.  In addition to this
all nodes connected to the turn restriction node by a single segment must also
be super-nodes to avoid any long-distance super-segments starting at the
restricted node.
<p>
The second complication of a turn restriction is that the optimum route may
require passing through the same node more than once.  This can happen where the
route needs to work around a turn restriction by driving past it, turning round
(on a roundabout perhaps) and coming back along the same highway.  Without turn
restrictions a route could be defined purely by the set of nodes that were
passed; no node would exist more than once along a route between two points.
With turn restrictions the route is defined by a node and the segment used to
get there; no route between two points will ever need to follow the same segment
in the same direction more than once.
<p>
A side-effect of this is that a route that works around a turn restriction must
be calculable using the super-segments that are stored in the database.  This
puts a limit on the amount of database optimisation that can be performed
because if too many super-segments are removed the optimum work-around may also
be removed.  The solution to this is to ensure that the database preserves all
loops that can be used to turn around and reverse direction, previously
super-segments that started and finished on the same super-node were disallowed.
<p>
Another side-effect of having the route composed of a set of locations (nodes)
as well as the direction of travel (segments used to reach them) is that via
points in the route can be forced to continue in the original direction.  If the
chosen method of transport obeys turn restrictions then it will not reverse
direction at a via point but will find an optimum route continuing in the same
direction.  The only exception to this is when the route ahead at a waypoint is
into a dead-end and an immediate U-turn is allowed.
<p>
A side-effect of having the starting direction at a via point defined by the
previous part of the route is that overall non-optimal routes may be found even
though each section between via points is optimal.  For a route with a start,
middle and end point defined it can be the case that the shortest route from the
start to the middle arrives in the opposite direction to that required for the
optimal route from the middle to the end.  The calculation of the route in
separate sections therefore may give a non-optimum result even though each
section is itself optimum based on the start conditions.
<p>
Overall the presence of turn restrictions in the database makes the routing
slower even for regions of the map that have no turn restrictions.

<h3><a name="H_1_1_4"></a>Data Implementation</h3>

The hardest part of implementing this router is the data organisation.  The
arrangement of the data to minimise the number of operations required to follow
a route from one node to another is much harder than designing the algorithm
itself.
<p>
The final implementation uses a separate table for nodes, segments and ways.
Each table individually is implemented as a C-language data structure that is
written to disk by a program which parses the OpenStreetMap XML data file.  In
the router these data structures are memory mapped so that the operating system
handles the problems of loading the needed data blocks from disk.
<p>
Each node contains a latitude and longitude and they are sorted geographically
so that converting a latitude and longitude coordinate to a node is fast as well
as looking up the coordinate of a node.  The node also contains the location in
the array of segments for the first segment that uses that node.
<br>
Each segment contains the location of the two nodes as well as the way that the
segment came from.  The location of the next segment that uses one of the two
nodes is also stored; the next segment for the other node is the following one
in the array.  The length of the segment is also pre-computed and stored.
<br>
Each way has a name, a highway type, a list of allowed types of traffic, a speed
limit, any weight, height, width or length restrictions and the highway
properties.
<p>
The super-nodes are mixed in with the nodes and the super-segments are mixed in
with the segments.  For the nodes they are the same as the normal nodes, so just
a flag is needed to indicate that they are super.  The super-segments are in
addition to the normal segments so they increase the database size (by about
10%) and are also marked with a flag.
<p>
The relations are stored separately from the nodes, segments and ways.  For the
turn restriction relations the initial and final segments are stored along with
the restricted node itself.  Each node that has a turn restriction is marked in
the main node storage with a flag to indicate this information.


<h3><a name="H_1_1_6"></a>Practicalities</h3>

At the time of writing (April 2010) the OpenStreetMap data for Great Britain
(taken from
<a class="ext" href="http://download.geofabrik.de/osm/europe/" title="GeoFabrik Mirror of OpenStreetMap Data">GeoFabrik</a>
) contains:
<ul>
  <li>14,675,098 nodes
  <ul>
    <li>8,767,521 are highway nodes
    <li>1,120,297 are super-nodes
  </ul>
  <li>1,876,822 ways
  <ul>
    <li>1,412,898 are highways
    <ul>
      <li>9,316,328 highway segments
      <li>1,641,009 are super-segments
    </ul>
  </ul>
  <li>60,572 relations
</ul>

The database files when generated are 41.5 MB for nodes, 121.6 MB for segments
and 12.6 MB for ways and are stored uncompressed.  By having at least 200 MB or
RAM available the routing can be performed with no disk accesses (once the data
has been read once).


</div>

<!-- Content End -->

<!-- Footer Start -->

<div class="footer" align="center">
<hr>

<address>
&copy; Andrew M. Bishop = &lt;amb "at" gedanken.demon.co.uk&gt;
</address>

</div>

<!-- Footer End -->

</BODY>

</HTML>
1	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
2	<HTML>
3
4	<!--
5	Routino documentation - algorithm
6
7	Part of the Routino routing software.
8
9	This file Copyright 2008-2011 Andrew M. Bishop
10
11	This program is free software: you can redistribute it and/or modify
12	it under the terms of the GNU Affero General Public License as published by
13	the Free Software Foundation, either version 3 of the License, or
14	(at your option) any later version.
15
16	This program is distributed in the hope that it will be useful,
17	but WITHOUT ANY WARRANTY; without even the implied warranty of
18	MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
19	GNU Affero General Public License for more details.
20
21	You should have received a copy of the GNU Affero General Public License
22	along with this program. If not, see http://www.gnu.org/licenses/.
23	-->
24
25	<HEAD>
26	<TITLE>Routino : Algorithm</TITLE>
27	<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
28	<LINK href="style.css" type="text/css" rel="stylesheet">
29	</HEAD>
30
31	<BODY>
32
33	<!-- Header Start -->
34
35	<div class="header" align="center">
36
37	<h1>Routino : Algorithm</h1>
38
39	<hr>
40	</div>
41
42	<!-- Header End -->
43
44	<!-- Content Start -->
45
46	<div class="content">
47
48	<h2><a name="H_1_1"></a>Algorithms</h2>
49
50	This page describes the development of the algorithm that is used in Routino for
51	finding routes.
52
53	<h3><a name="H_1_1_1"></a>Simplest Algorithm</h3>
54
55	The algorithm to find a route is fundamentally simple: Start at the beginning,
56	follow all possible routes and keep going until you reach the end.
57	<p>
58	While this method does work, it isn't fast. To be able to find a route quickly
59	needs a different algorithm, one that can find the correct answer without
60	wasting time on routes that lead nowhere.
61
62	<h3><a name="H_1_1_2"></a>Improved Algorithm</h3>
63
64	The simplest way to do this is to follow all possible segments from the starting
65	node to the next nearest node (an intermediate node in the complete journey).
66	For each node that is reached store the shortest route from the starting node
67	and the length of that route. The list of intermediate nodes needs to be
68	maintained in order of shortest overall route on the assumption that there is a
69	straight line route from here to the end node.
70	<br>
71	At each point the intermediate node that has the shortest potential overall
72	journey time is processed before any other node. From the first node in the
73	list follow all possible segments and place the newly discovered nodes into the
74	same list ordered in the same way. This will tend to constrain the list of
75	nodes examined to be the ones that are between the start and end nodes. If at
76	any point you reach a node that has already been reached by a longer route then
77	you can discard that route since the newly discovered route is shorter.
78	Conversely if the previously discovered route is shorter then discard the new
79	route.
80	<br>
81	At some point the end node will be reached and then any routes with potential
82	lengths longer than this actual route can be immediately discarded. The few
83	remaining potential routes must be continued until they are found to be shorter
84	or have no possibility of being shorter. The shortest possible route is then
85	found.
86	<p>
87	At all times when looking at a node only those segments that are possible by the
88	chosen means of transport are followed. This allows the type of transport to be
89	handled easily. When finding the quickest route the same rules apply except
90	that the criterion for sorting is the shortest potential route (assuming that
91	from each node to the end is the fastest possible type of highway).
92	<p>
93	This method also works, but again it isn't very fast. The problem is that the
94	complexity is proportional to the number of nodes or segments in all routes
95	examined between the start and end nodes. Maintaining the list of intermediate
96	nodes in order is the most complex part.
97
98	<h3><a name="H_1_1_3"></a>Final Algorithm</h3>
99
100	The final algorithm that is implemented in the router is basically the one above
101	but with an important difference. Instead of finding a long route among a data
102	set of 8,000,000 nodes (number of highway nodes in UK at beginning of 2010) it
103	finds one long route in a data set of 1,000,000 nodes and a few hundred very
104	short routes in the full data set. Since the time taken to find a route is
105	proportional to the number of nodes that need to be considered the main route
106	takes 1/10th of the time and the very short routes take almost no time at all.
107	<p>
108	The solution to making the algorithm fast is therefore to discard most of the
109	nodes and only keep the interesting ones. In this case a node is deemed to be
110	interesting if it is the junction of three or more segments or the junction of
111	two segments with different properties or has a routing restriction different
112	from the connecting segments. In the algorithm and following description these
113	are classed as <em>super-nodes</em>. Starting at each super-node a
114	<em>super-segment</em> is generated that finishes on another super-node and
115	contains the <em>shortest</em> path along segments with identical properties
116	(and these properties are inherited by the super-segment). The point of
117	choosing the shortest route is that since all segments considered have identical
118	properties they will be treated identically when properties are taken into
119	account. This decision making process can be repeated until the only the most
120	important and interesting nodes remain.
121	<p align="center">
122	<img alt="Original data" src="example0.png">
123	<br>
124	Original Highways
125	<p align="center">
126	<img alt="Iteration 1" src="example1.png">
127	<br>
128	First Iteration
129	<p align="center">
130	<img alt="Iteration 2" src="example2.png">
131	<br>
132	Second Iteration
133	<p align="center">
134	<img alt="Iteration 3" src="example3.png">
135	<br>
136	Third Iteration
137	<p align="center">
138	<img alt="Iteration 4" src="example4.png">
139	<br>
140	Fourth Iteration
141	<p>
142	To find a route between a start and finish point now comprises the following
143	steps (assuming a shortest route is required):
144	<ol>
145	<li>Find all shortest routes from the start point along normal segments and
146	stopping when super-nodes are reached.
147	<li>Find all shortest routes from the end point backwards along normal
148	segments and stopping when super-nodes are reached.
149	<li>Find the shortest route along super-segments from the set of super-nodes
150	in step 1 to the set of super-nodes in step 2 (taking into account the lengths
151	found in steps 1 and 2 between the start/finish super-nodes and the ultimate
152	start/finish point).
153	<li>For each super-segment in step 3 find the shortest route between the two
154	end-point super-nodes.
155	</ol>
156	This multi-step process is considerably quicker than using all nodes but gives a
157	result that still contains the full list of nodes that are visited. There are
158	some special cases though, for example very short routes that do not pass
159	through any super-nodes, or routes that start or finish on a super-node. In
160	these cases one or more of the steps listed can be removed or simplified.
161	<p>
162	When the first route reaches the final node the length of that route is retained
163	as a benchmark. Any shorter complete route that is calculated later would
164	replace this benchmark. As routes are tested any partial routes that are longer
165	than the benchmark can be immediately discarded. Other partial routes have the
166	length of a perfect straight highway to the final node added to them and if the
167	total exceeds the benchmark they can also be discarded. Very quickly the number
168	of possible routes is reduced until the absolute shortest is found.
169	<p>
170	For routes that do not start or finish on a node in the original data set a fake
171	node is added to an existing segment. This requires special handling in the
172	algorithm but it gives mode flexibility for the start, finish and intermediate
173	points in a route.
174
175	<h4><a name="H_1_1_3_1"></a>Algorithm Evolution</h4>
176
177	In Routino versions 1.0 to 1.4 the algorithm used to select a super-node was the
178	same as above except that node properties were not included. Routino versions
179	1.4.1 to 1.5.1 used a slightly different algorithm which only chose nodes that
180	were junctions between segments with different properties (or has a routing
181	restriction that is different from connecting segments in versions 1.5 and
182	1.5.1). The addition of turn restrictions (described in more detail below)
183	requires the original algorithm since the super-segments more accurately reflect
184	the underlying topology.
185
186	<h3><a name="H_1_1_4"></a>Routing Preferences</h3>
187
188	One of the important features of Routino is the ability to select a route that
189	is optimum for a set of criteria such as preferences for each type of highway,
190	speed limits and other restrictions and highway properties.
191	<p>
192	All of these features are handled by assigning a score to each segment while
193	calculating the route and trying to minimise the score rather than simply
194	minimising the length.
195	<dl>
196	<dt>Segment length
197	<dd>When calculating the shortest route the length of the segment is the
198	starting point for the score.
199	<dt>Speed preference
200	<dd>When calculating the quickest route the time taken calculated from the
201	length of the segment and the lower of the highway's own speed limit and the
202	user's speed preference for the type of highway is the starting point for the
203	score.
204	<dt>Oneway restriction
205	<dd>If a highway has the oneway property in the opposite direction to the
206	desired travel and the user's preference is to obey oneway restrictions then
207	the segment is ignored.
208	<dt>Weight, height, width & length limits
209	<dd>If a highway has one of these limits and its value is less than the user's
210	specified requirement then the segment is ignored.
211	<dt>Highway preference
212	<dd>The highway preference specified by the user is a percentage, these are
213	scaled so that the most preferred highway type has a weighted preference of
214	1.0 (0% always has a weighted preference of 0.0). The calculated score for a
215	segment is divided by this weighted preference.
216	<dt>Highway properties
217	<dd>The other highway properties are specified by the user as a percentage and
218	each highway either has that property or not. The user's property preference
219	is scaled into the range 0.0 (for 0%) to 1.0 (for 100%) to give a weighted
220	preference, a second "non-property" weighted preference is calcuated in the
221	same way after subtracting the user's preference from 100%. If a segment has
222	a particular property then the calculated score is divided by the weighted
223	preference for that property, if not then it is divided by the non-property
224	weighted preference. A non-linear transformation is applied so that changing
225	property preferences close to 50% do not cause large variations in routes.
226	</dl>
227
228	<h3><a name="H_1_1_5"></a>Turn Restrictions</h3>
229
230	The addition of turn restrictions adds a set of further complications because it
231	introduces a set of constraints that are far more complex than one-way streets.
232	<p>
233	A turn restriction in the simplest case is a combination of a segment, node and
234	segment such that routes are not allowed to go from the first segment to the
235	second one through the specified node. Exceptions for certain types of traffic
236	can also be specified. Currently only this simplest type of turn restriction is
237	handled by the algorithm.
238	<p>
239	The first complication of turn restrictions is that the algorithm above requires
240	that super-segments are composed of segments with identical properties. A turn
241	restriction is not the same in both directions so a super-segment cannot include
242	any route through that turn restriction. The node at the centre of the turn
243	restriction must therefore be a super-node to avoid this. In addition to this
244	all nodes connected to the turn restriction node by a single segment must also
245	be super-nodes to avoid any long-distance super-segments starting at the
246	restricted node.
247	<p>
248	The second complication of a turn restriction is that the optimum route may
249	require passing through the same node more than once. This can happen where the
250	route needs to work around a turn restriction by driving past it, turning round
251	(on a roundabout perhaps) and coming back along the same highway. Without turn
252	restrictions a route could be defined purely by the set of nodes that were
253	passed; no node would exist more than once along a route between two points.
254	With turn restrictions the route is defined by a node and the segment used to
255	get there; no route between two points will ever need to follow the same segment
256	in the same direction more than once.
257	<p>
258	A side-effect of this is that a route that works around a turn restriction must
259	be calculable using the super-segments that are stored in the database. This
260	puts a limit on the amount of database optimisation that can be performed
261	because if too many super-segments are removed the optimum work-around may also
262	be removed. The solution to this is to ensure that the database preserves all
263	loops that can be used to turn around and reverse direction, previously
264	super-segments that started and finished on the same super-node were disallowed.
265	<p>
266	Another side-effect of having the route composed of a set of locations (nodes)
267	as well as the direction of travel (segments used to reach them) is that via
268	points in the route can be forced to continue in the original direction. If the
269	chosen method of transport obeys turn restrictions then it will not reverse
270	direction at a via point but will find an optimum route continuing in the same
271	direction. The only exception to this is when the route ahead at a waypoint is
272	into a dead-end and an immediate U-turn is allowed.
273	<p>
274	A side-effect of having the starting direction at a via point defined by the
275	previous part of the route is that overall non-optimal routes may be found even
276	though each section between via points is optimal. For a route with a start,
277	middle and end point defined it can be the case that the shortest route from the
278	start to the middle arrives in the opposite direction to that required for the
279	optimal route from the middle to the end. The calculation of the route in
280	separate sections therefore may give a non-optimum result even though each
281	section is itself optimum based on the start conditions.
282	<p>
283	Overall the presence of turn restrictions in the database makes the routing
284	slower even for regions of the map that have no turn restrictions.
285
286	<h3><a name="H_1_1_4"></a>Data Implementation</h3>
287
288	The hardest part of implementing this router is the data organisation. The
289	arrangement of the data to minimise the number of operations required to follow
290	a route from one node to another is much harder than designing the algorithm
291	itself.
292	<p>
293	The final implementation uses a separate table for nodes, segments and ways.
294	Each table individually is implemented as a C-language data structure that is
295	written to disk by a program which parses the OpenStreetMap XML data file. In
296	the router these data structures are memory mapped so that the operating system
297	handles the problems of loading the needed data blocks from disk.
298	<p>
299	Each node contains a latitude and longitude and they are sorted geographically
300	so that converting a latitude and longitude coordinate to a node is fast as well
301	as looking up the coordinate of a node. The node also contains the location in
302	the array of segments for the first segment that uses that node.
303	<br>
304	Each segment contains the location of the two nodes as well as the way that the
305	segment came from. The location of the next segment that uses one of the two
306	nodes is also stored; the next segment for the other node is the following one
307	in the array. The length of the segment is also pre-computed and stored.
308	<br>
309	Each way has a name, a highway type, a list of allowed types of traffic, a speed
310	limit, any weight, height, width or length restrictions and the highway
311	properties.
312	<p>
313	The super-nodes are mixed in with the nodes and the super-segments are mixed in
314	with the segments. For the nodes they are the same as the normal nodes, so just
315	a flag is needed to indicate that they are super. The super-segments are in
316	addition to the normal segments so they increase the database size (by about
317	10%) and are also marked with a flag.
318	<p>
319	The relations are stored separately from the nodes, segments and ways. For the
320	turn restriction relations the initial and final segments are stored along with
321	the restricted node itself. Each node that has a turn restriction is marked in
322	the main node storage with a flag to indicate this information.
323
324
325	<h3><a name="H_1_1_6"></a>Practicalities</h3>
326
327	At the time of writing (April 2010) the OpenStreetMap data for Great Britain
328	(taken from
329	<a class="ext" href="http://download.geofabrik.de/osm/europe/" title="GeoFabrik Mirror of OpenStreetMap Data">GeoFabrik</a>
330	) contains:
331	<ul>
332	<li>14,675,098 nodes
333	<ul>
334	<li>8,767,521 are highway nodes
335	<li>1,120,297 are super-nodes
336	</ul>
337	<li>1,876,822 ways
338	<ul>
339	<li>1,412,898 are highways
340	<ul>
341	<li>9,316,328 highway segments
342	<li>1,641,009 are super-segments
343	</ul>
344	</ul>
345	<li>60,572 relations
346	</ul>
347
348	The database files when generated are 41.5 MB for nodes, 121.6 MB for segments
349	and 12.6 MB for ways and are stored uncompressed. By having at least 200 MB or
350	RAM available the routing can be performed with no disk accesses (once the data
351	has been read once).
352
353
354	</div>
355
356	<!-- Content End -->
357
358	<!-- Footer Start -->
359
360	<div class="footer" align="center">
361	<hr>
362
363	<address>
364	© Andrew M. Bishop = <amb "at" gedanken.demon.co.uk>
365	</address>
366
367	</div>
368
369	<!-- Footer End -->
370
371	</BODY>
372
373	</HTML>