trunk/doc/ALGORITHM.txt

                             Routino : Algorithm
                             ===================


   This page describes the development of the algorithm that is used in
   Routino for finding routes.


Simplest Algorithm
------------------

   The algorithm to find a route is fundamentally simple: Start at the
   beginning, follow all possible routes and keep going until you reach
   the end.

   While this method does work, it isn't fast. To be able to find a route
   quickly needs a different algorithm, one that can find the correct
   answer without wasting time on routes that lead nowhere.


Improved Algorithm
------------------

   The simplest way to do this is to follow all possible segments from the
   starting node to the next nearest node (an intermediate node in the
   complete journey). For each node that is reached store the shortest
   route from the starting node and the length of that route. The list of
   intermediate nodes needs to be maintained in order of shortest overall
   route on the assumption that there is a straight line route from here
   to the end node.
   At each point the intermediate node that has the shortest potential
   overall journey time is processed before any other node. From the first
   node in the list follow all possible segments and place the newly
   discovered nodes into the same list ordered in the same way. This will
   tend to constrain the list of nodes examined to be the ones that are
   between the start and end nodes. If at any point you reach a node that
   has already been reached by a longer route then you can discard that
   route since the newly discovered route is shorter. Conversely if the
   previously discovered route is shorter then discard the new route.
   At some point the end node will be reached and then any routes with
   potential lengths longer than this actual route can be immediately
   discarded. The few remaining potential routes must be continued until
   they are found to be shorter or have no possibility of being shorter.
   The shortest possible route is then found.

   At all times when looking at a node only those segments that are
   possible by the chosen means of transport are followed. This allows the
   type of transport to be handled easily. When finding the quickest route
   the same rules apply except that the criterion for sorting is the
   shortest potential route (assuming that from each node to the end is
   the fastest possible type of highway).

   This method also works, but again it isn't very fast. The problem is
   that the complexity is proportional to the number of nodes or segments
   in all routes examined between the start and end nodes. Maintaining the
   list of intermediate nodes in order is the most complex part.


Final Algorithm
---------------

   The final algorithm that is implemented in the router is basically the
   one above but with an important difference. Instead of finding a long
   route among a data set of 8,000,000 nodes (number of highway nodes in
   UK at beginning of 2010) it finds one long route in a data set of
   1,000,000 nodes and a few hundred very short routes in the full data
   set. Since the time taken to find a route is proportional to the number
   of nodes that need to be considered the main route takes 1/10th of the
   time and the very short routes take almost no time at all.

   The solution to making the algorithm fast is therefore to discard most
   of the nodes and only keep the interesting ones. In this case a node is
   deemed to be interesting if it is the junction of three or more
   segments or the junction of two segments with different properties or
   has a routing restriction different from the connecting segments. In
   the algorithm and following description these are classed as
   super-nodes. Starting at each super-node a super-segment is generated
   that finishes on another super-node and contains the shortest path
   along segments with identical properties (and these properties are
   inherited by the super-segment). The point of choosing the shortest
   route is that since all segments considered have identical properties
   they will be treated identically when properties are taken into
   account. This decision making process can be repeated until the only
   the most important and interesting nodes remain.

   To find a route between a start and finish point now comprises the
   following steps (assuming a shortest route is required):

    1. Find all shortest routes from the start point along normal segments
       and stopping when super-nodes are reached.
    2. Find all shortest routes from the end point backwards along normal
       segments and stopping when super-nodes are reached.
    3. Find the shortest route along super-segments from the set of
       super-nodes in step 1 to the set of super-nodes in step 2 (taking
       into account the lengths found in steps 1 and 2 between the
       start/finish super-nodes and the ultimate start/finish point).
    4. For each super-segment in step 3 find the shortest route between
       the two end-point super-nodes.

   This multi-step process is considerably quicker than using all nodes
   but gives a result that still contains the full list of nodes that are
   visited. There are some special cases though, for example very short
   routes that do not pass through any super-nodes, or routes that start
   or finish on a super-node. In these cases one or more of the steps
   listed can be removed or simplified.

   When the first route reaches the final node the length of that route is
   retained as a benchmark. Any shorter complete route that is calculated
   later would replace this benchmark. As routes are tested any partial
   routes that are longer than the benchmark can be immediately discarded.
   Other partial routes have the length of a perfect straight highway to
   the final node added to them and if the total exceeds the benchmark
   they can also be discarded. Very quickly the number of possible routes
   is reduced until the absolute shortest is found.

   For routes that do not start or finish on a node in the original data
   set a fake node is added to an existing segment. This requires special
   handling in the algorithm but it gives mode flexibility for the start,
   finish and intermediate points in a route.

Algorithm Evolution
- - - - - - - - - -

   In Routino versions 1.0 to 1.4 the algorithm used to select a
   super-node was the same as above except that node properties were not
   included. Routino versions 1.4.1 to 1.5.1 used a slightly different
   algorithm which only chose nodes that were junctions between segments
   with different properties (or has a routing restriction that is
   different from connecting segments in versions 1.5 and 1.5.1). The
   addition of turn restrictions (described in more detail below) requires
   the original algorithm since the super-segments more accurately reflect
   the underlying topology.

Algorithm Implementation
- - - - - - - - - - - -

   The algorithm that is used for finding the route between the
   super-nodes using super-segments is the A* algorithm (or a slight
   variation of it). This was not a deliberate design decision, but
   evolved into it during development. This algorithm relies on
   calculating the lowest score (shortest distance or quickest time) to
   each node from the starting node. The remaining score for the path to
   the destination node is estimated (based on a straight line using the
   fastest type of highway) and added to the current score and the result
   recorded. At each step the unvisited node that has the lowest current
   score is examined and all nodes connected to it have their scores
   calculated. When the destination node has been reached all remaining
   unvisited nodes with scores higher than the destination node's score
   can be discarded and the few remaining nodes examined.

   The algorithm used to find the route between super-nodes using normal
   segments is Dijkstra's algorithm (although it is implemented as the
   same algorithm as above but with no estimated cost). Since these routes
   tend to be short and the CPU time for calculating the heuristic cost
   function is relatively large this tends to give a quicker solution.


Routing Preferences
-------------------

   One of the important features of Routino is the ability to select a
   route that is optimum for a set of criteria such as preferences for
   each type of highway, speed limits and other restrictions and highway
   properties.

   All of these features are handled by assigning a score to each segment
   while calculating the route and trying to minimise the score rather
   than simply minimising the length.

   Segment length
          When calculating the shortest route the length of the segment is
          the starting point for the score.

   Speed preference
          When calculating the quickest route the time taken calculated
          from the length of the segment and the lower of the highway's
          own speed limit and the user's speed preference for the type of
          highway is the starting point for the score.

   One-way restriction
          If a highway has the one-way property in the opposite direction
          to the desired travel and the user's preference is to obey
          one-way restrictions then the segment is ignored.

   Weight, height, width & length limits
          If a highway has one of these limits and its value is less than
          the user's specified requirement then the segment is ignored.

   Highway preference
          The highway preference specified by the user is a percentage,
          these are scaled so that the most preferred highway type has a
          weighted preference of 1.0 (0% always has a weighted preference
          of 0.0). The calculated score for a segment is divided by this
          weighted preference.

   Highway properties
          The other highway properties are specified by the user as a
          percentage and each highway either has that property or not. The
          user's property preference is scaled into the range 0.0 (for 0%)
          to 1.0 (for 100%) to give a weighted preference, a second
          "non-property" weighted preference is calculated in the same way
          after subtracting the user's preference from 100%. If a segment
          has a particular property then the calculated score is divided
          by the weighted preference for that property, if not then it is
          divided by the non-property weighted preference. A non-linear
          transformation is applied so that changing property preferences
          close to 50% do not cause large variations in routes.


Data Pruning
------------

   From version 2.2 there are options to "prune" nodes and segments from
   the input data which means to remove nodes and/or segments without
   significantly changing the routing results.

   The pruning options must meet a number of conditions to be useful:
     * The topology relevant to routing must remain unchanged. The
       instructions that are produced from the reduced set of nodes and
       segments must be sufficiently accurate for anybody trying to follow
       them on the ground.
     * Any restrictions belonging to nodes or segments that stop certain
       types of traffic from following a particular highway must be
       preserved.
     * The total length must be calculated using the original data and not
       the simplified data which by its nature will typically be shorter.
     * The location of the remaining nodes and segments must be a good
       representation of the original nodes and segments. Since the
       calculated route may be displayed on a map the remaining nodes and
       segments must clearly indicate the route to take.

   The prune options all have user-controllable parameters which allow the
   geographical accuracy to be controlled. This means that although the
   topology is the same the geographical accuracy can be sacrificed
   slightly to minimise the number of nodes and segments.

   The pruning options that are available are:
     * Removing the access permissions for a transport type from segments
       if it is not possible to route that transport type from those
       segments to a significant number of other places. The limit on the
       pruning is set by the total length of the isolated group of
       segments. This significantly increases the chance that a route will
       be found by not putting waypoints in inaccessible places.
     * Removing short segments, the limit is set by the length of the
       segment. This removes a number of redundant segments (and
       associated nodes) but rules are applied to ensure that removing the
       segments does not alter junction topology or remove node access
       permissions or changes in way properties.
     * Removing nodes from almost straight highways, the limit is set by
       the distance between the remaining segments and the original nodes.
       This removes a large number of redundant nodes (and therefore
       segments) but again care is taken not to remove node access
       permissions or changes in way properties.


Turn Restrictions
-----------------

   The addition of turn restrictions in version 2.0 adds a set of further
   complications because it introduces a set of constraints that are far
   more complex than one-way streets.

   A turn restriction in the simplest case is a combination of a segment,
   node and segment such that routes are not allowed to go from the first
   segment to the second one through the specified node. Exceptions for
   certain types of traffic can also be specified. Currently only this
   simplest type of turn restriction is handled by the algorithm.

   The first complication of turn restrictions is that the algorithm above
   requires that super-segments are composed of segments with identical
   properties. A turn restriction is not the same in both directions so a
   super-segment cannot include any route through that turn restriction.
   The node at the centre of the turn restriction must therefore be a
   super-node to avoid this. In addition to this all nodes connected to
   the turn restriction node by a single segment must also be super-nodes
   to avoid any long-distance super-segments starting at the restricted
   node.

   The second complication of a turn restriction is that the optimum route
   may require passing through the same node more than once. This can
   happen where the route needs to work around a turn restriction by
   driving past it, turning round (on a roundabout perhaps) and coming
   back along the same highway. Without turn restrictions a route could be
   defined purely by the set of nodes that were passed; no node would
   exist more than once along a route between two points. With turn
   restrictions the route is defined by a node and the segment used to get
   there; no route between two points will ever need to follow the same
   segment in the same direction more than once. This means that the
   optimisation algorithm calculates scores for directed segments (indexed
   by segment and end node) rather than for nodes.

   A side-effect of this is that a route that works around a turn
   restriction must be calculable using the super-segments that are stored
   in the database. This puts a limit on the amount of database
   optimisation that can be performed because if too many super-segments
   are removed the optimum work-around may also be removed. The solution
   to this is to ensure that the database preserves all loops that can be
   used to turn around and reverse direction, previously super-segments
   that started and finished on the same super-node were disallowed.

   Another side-effect of having the route composed of a set of locations
   (nodes) as well as the direction of travel (segments used to reach
   them) is that via points in the route can be forced to continue in the
   original direction. If the chosen method of transport obeys turn
   restrictions then it will not reverse direction at a via point but will
   find an optimum route continuing in the same direction. The only
   exception to this is when the route ahead at a waypoint is into a
   dead-end and an immediate U-turn is allowed.

   A side-effect of having the starting direction at a via point defined
   by the previous part of the route is that overall non-optimal routes
   may be found even though each section between via points is optimal.
   For a route with a start, middle and end point defined it can be the
   case that the shortest route from the start to the middle arrives in
   the opposite direction to that required for the optimal route from the
   middle to the end. The calculation of the route in separate sections
   therefore may give a non-optimum result even though each section is
   itself optimum based on the start conditions.

   Overall the presence of turn restrictions in the database makes the
   routing slower even for regions of the map that have no turn
   restrictions.


Data Implementation
-------------------

   The hardest part of implementing this router is the data organisation.
   The arrangement of the data to minimise the number of operations
   required to follow a route from one node to another is much harder than
   designing the algorithm itself.

   The final implementation uses a separate table for nodes, segments and
   ways. Each table individually is implemented as a C-language data
   structure that is written to disk by a program which parses the
   OpenStreetMap XML data file. In the router these data structures are
   memory mapped so that the operating system handles the problems of
   loading the needed data blocks from disk.

   Each node contains a latitude and longitude and they are sorted
   geographically so that converting a latitude and longitude coordinate
   to a node is fast as well as looking up the coordinate of a node. The
   node also contains the location in the array of segments for the first
   segment that uses that node.
   Each segment contains the location of the two nodes as well as the way
   that the segment came from. The location of the next segment that uses
   one of the two nodes is also stored; the next segment for the other
   node is the following one in the array. The length of the segment is
   also pre-computed and stored.
   Each way has a name, a highway type, a list of allowed types of
   traffic, a speed limit, any weight, height, width or length
   restrictions and the highway properties.

   The super-nodes are mixed in with the nodes and the super-segments are
   mixed in with the segments. For the nodes they are the same as the
   normal nodes, so just a flag is needed to indicate that they are super.
   The super-segments are in addition to the normal segments so they
   increase the database size (by about 10%) and are also marked with a
   flag. Some segments are therefore flagged as both normal segments and
   super-segments if they both have the same end nodes.

   The relations are stored separately from the nodes, segments and ways.
   For the turn restriction relations the initial and final segments are
   stored along with the restricted node itself. Each node that has a turn
   restriction is marked in the main node storage with a flag to indicate
   this information.


--------

Copyright 2008-2013 Andrew M. Bishop.
1	Routino : Algorithm
2	===================
3
4
5	This page describes the development of the algorithm that is used in
6	Routino for finding routes.
7
8
9	Simplest Algorithm
10	------------------
11
12	The algorithm to find a route is fundamentally simple: Start at the
13	beginning, follow all possible routes and keep going until you reach
14	the end.
15
16	While this method does work, it isn't fast. To be able to find a route
17	quickly needs a different algorithm, one that can find the correct
18	answer without wasting time on routes that lead nowhere.
19
20
21	Improved Algorithm
22	------------------
23
24	The simplest way to do this is to follow all possible segments from the
25	starting node to the next nearest node (an intermediate node in the
26	complete journey). For each node that is reached store the shortest
27	route from the starting node and the length of that route. The list of
28	intermediate nodes needs to be maintained in order of shortest overall
29	route on the assumption that there is a straight line route from here
30	to the end node.
31	At each point the intermediate node that has the shortest potential
32	overall journey time is processed before any other node. From the first
33	node in the list follow all possible segments and place the newly
34	discovered nodes into the same list ordered in the same way. This will
35	tend to constrain the list of nodes examined to be the ones that are
36	between the start and end nodes. If at any point you reach a node that
37	has already been reached by a longer route then you can discard that
38	route since the newly discovered route is shorter. Conversely if the
39	previously discovered route is shorter then discard the new route.
40	At some point the end node will be reached and then any routes with
41	potential lengths longer than this actual route can be immediately
42	discarded. The few remaining potential routes must be continued until
43	they are found to be shorter or have no possibility of being shorter.
44	The shortest possible route is then found.
45
46	At all times when looking at a node only those segments that are
47	possible by the chosen means of transport are followed. This allows the
48	type of transport to be handled easily. When finding the quickest route
49	the same rules apply except that the criterion for sorting is the
50	shortest potential route (assuming that from each node to the end is
51	the fastest possible type of highway).
52
53	This method also works, but again it isn't very fast. The problem is
54	that the complexity is proportional to the number of nodes or segments
55	in all routes examined between the start and end nodes. Maintaining the
56	list of intermediate nodes in order is the most complex part.
57
58
59	Final Algorithm
60	---------------
61
62	The final algorithm that is implemented in the router is basically the
63	one above but with an important difference. Instead of finding a long
64	route among a data set of 8,000,000 nodes (number of highway nodes in
65	UK at beginning of 2010) it finds one long route in a data set of
66	1,000,000 nodes and a few hundred very short routes in the full data
67	set. Since the time taken to find a route is proportional to the number
68	of nodes that need to be considered the main route takes 1/10th of the
69	time and the very short routes take almost no time at all.
70
71	The solution to making the algorithm fast is therefore to discard most
72	of the nodes and only keep the interesting ones. In this case a node is
73	deemed to be interesting if it is the junction of three or more
74	segments or the junction of two segments with different properties or
75	has a routing restriction different from the connecting segments. In
76	the algorithm and following description these are classed as
77	super-nodes. Starting at each super-node a super-segment is generated
78	that finishes on another super-node and contains the shortest path
79	along segments with identical properties (and these properties are
80	inherited by the super-segment). The point of choosing the shortest
81	route is that since all segments considered have identical properties
82	they will be treated identically when properties are taken into
83	account. This decision making process can be repeated until the only
84	the most important and interesting nodes remain.
85
86	To find a route between a start and finish point now comprises the
87	following steps (assuming a shortest route is required):
88
89	1. Find all shortest routes from the start point along normal segments
90	and stopping when super-nodes are reached.
91	2. Find all shortest routes from the end point backwards along normal
92	segments and stopping when super-nodes are reached.
93	3. Find the shortest route along super-segments from the set of
94	super-nodes in step 1 to the set of super-nodes in step 2 (taking
95	into account the lengths found in steps 1 and 2 between the
96	start/finish super-nodes and the ultimate start/finish point).
97	4. For each super-segment in step 3 find the shortest route between
98	the two end-point super-nodes.
99
100	This multi-step process is considerably quicker than using all nodes
101	but gives a result that still contains the full list of nodes that are
102	visited. There are some special cases though, for example very short
103	routes that do not pass through any super-nodes, or routes that start
104	or finish on a super-node. In these cases one or more of the steps
105	listed can be removed or simplified.
106
107	When the first route reaches the final node the length of that route is
108	retained as a benchmark. Any shorter complete route that is calculated
109	later would replace this benchmark. As routes are tested any partial
110	routes that are longer than the benchmark can be immediately discarded.
111	Other partial routes have the length of a perfect straight highway to
112	the final node added to them and if the total exceeds the benchmark
113	they can also be discarded. Very quickly the number of possible routes
114	is reduced until the absolute shortest is found.
115
116	For routes that do not start or finish on a node in the original data
117	set a fake node is added to an existing segment. This requires special
118	handling in the algorithm but it gives mode flexibility for the start,
119	finish and intermediate points in a route.
120
121	Algorithm Evolution
122	- - - - - - - - - -
123
124	In Routino versions 1.0 to 1.4 the algorithm used to select a
125	super-node was the same as above except that node properties were not
126	included. Routino versions 1.4.1 to 1.5.1 used a slightly different
127	algorithm which only chose nodes that were junctions between segments
128	with different properties (or has a routing restriction that is
129	different from connecting segments in versions 1.5 and 1.5.1). The
130	addition of turn restrictions (described in more detail below) requires
131	the original algorithm since the super-segments more accurately reflect
132	the underlying topology.
133
134	Algorithm Implementation
135	- - - - - - - - - - - -
136
137	The algorithm that is used for finding the route between the
138	super-nodes using super-segments is the A* algorithm (or a slight
139	variation of it). This was not a deliberate design decision, but
140	evolved into it during development. This algorithm relies on
141	calculating the lowest score (shortest distance or quickest time) to
142	each node from the starting node. The remaining score for the path to
143	the destination node is estimated (based on a straight line using the
144	fastest type of highway) and added to the current score and the result
145	recorded. At each step the unvisited node that has the lowest current
146	score is examined and all nodes connected to it have their scores
147	calculated. When the destination node has been reached all remaining
148	unvisited nodes with scores higher than the destination node's score
149	can be discarded and the few remaining nodes examined.
150
151	The algorithm used to find the route between super-nodes using normal
152	segments is Dijkstra's algorithm (although it is implemented as the
153	same algorithm as above but with no estimated cost). Since these routes
154	tend to be short and the CPU time for calculating the heuristic cost
155	function is relatively large this tends to give a quicker solution.
156
157
158	Routing Preferences
159	-------------------
160
161	One of the important features of Routino is the ability to select a
162	route that is optimum for a set of criteria such as preferences for
163	each type of highway, speed limits and other restrictions and highway
164	properties.
165
166	All of these features are handled by assigning a score to each segment
167	while calculating the route and trying to minimise the score rather
168	than simply minimising the length.
169
170	Segment length
171	When calculating the shortest route the length of the segment is
172	the starting point for the score.
173
174	Speed preference
175	When calculating the quickest route the time taken calculated
176	from the length of the segment and the lower of the highway's
177	own speed limit and the user's speed preference for the type of
178	highway is the starting point for the score.
179
180	One-way restriction
181	If a highway has the one-way property in the opposite direction
182	to the desired travel and the user's preference is to obey
183	one-way restrictions then the segment is ignored.
184
185	Weight, height, width & length limits
186	If a highway has one of these limits and its value is less than
187	the user's specified requirement then the segment is ignored.
188
189	Highway preference
190	The highway preference specified by the user is a percentage,
191	these are scaled so that the most preferred highway type has a
192	weighted preference of 1.0 (0% always has a weighted preference
193	of 0.0). The calculated score for a segment is divided by this
194	weighted preference.
195
196	Highway properties
197	The other highway properties are specified by the user as a
198	percentage and each highway either has that property or not. The
199	user's property preference is scaled into the range 0.0 (for 0%)
200	to 1.0 (for 100%) to give a weighted preference, a second
201	"non-property" weighted preference is calculated in the same way
202	after subtracting the user's preference from 100%. If a segment
203	has a particular property then the calculated score is divided
204	by the weighted preference for that property, if not then it is
205	divided by the non-property weighted preference. A non-linear
206	transformation is applied so that changing property preferences
207	close to 50% do not cause large variations in routes.
208
209
210	Data Pruning
211	------------
212
213	From version 2.2 there are options to "prune" nodes and segments from
214	the input data which means to remove nodes and/or segments without
215	significantly changing the routing results.
216
217	The pruning options must meet a number of conditions to be useful:
218	* The topology relevant to routing must remain unchanged. The
219	instructions that are produced from the reduced set of nodes and
220	segments must be sufficiently accurate for anybody trying to follow
221	them on the ground.
222	* Any restrictions belonging to nodes or segments that stop certain
223	types of traffic from following a particular highway must be
224	preserved.
225	* The total length must be calculated using the original data and not
226	the simplified data which by its nature will typically be shorter.
227	* The location of the remaining nodes and segments must be a good
228	representation of the original nodes and segments. Since the
229	calculated route may be displayed on a map the remaining nodes and
230	segments must clearly indicate the route to take.
231
232	The prune options all have user-controllable parameters which allow the
233	geographical accuracy to be controlled. This means that although the
234	topology is the same the geographical accuracy can be sacrificed
235	slightly to minimise the number of nodes and segments.
236
237	The pruning options that are available are:
238	* Removing the access permissions for a transport type from segments
239	if it is not possible to route that transport type from those
240	segments to a significant number of other places. The limit on the
241	pruning is set by the total length of the isolated group of
242	segments. This significantly increases the chance that a route will
243	be found by not putting waypoints in inaccessible places.
244	* Removing short segments, the limit is set by the length of the
245	segment. This removes a number of redundant segments (and
246	associated nodes) but rules are applied to ensure that removing the
247	segments does not alter junction topology or remove node access
248	permissions or changes in way properties.
249	* Removing nodes from almost straight highways, the limit is set by
250	the distance between the remaining segments and the original nodes.
251	This removes a large number of redundant nodes (and therefore
252	segments) but again care is taken not to remove node access
253	permissions or changes in way properties.
254
255
256	Turn Restrictions
257	-----------------
258
259	The addition of turn restrictions in version 2.0 adds a set of further
260	complications because it introduces a set of constraints that are far
261	more complex than one-way streets.
262
263	A turn restriction in the simplest case is a combination of a segment,
264	node and segment such that routes are not allowed to go from the first
265	segment to the second one through the specified node. Exceptions for
266	certain types of traffic can also be specified. Currently only this
267	simplest type of turn restriction is handled by the algorithm.
268
269	The first complication of turn restrictions is that the algorithm above
270	requires that super-segments are composed of segments with identical
271	properties. A turn restriction is not the same in both directions so a
272	super-segment cannot include any route through that turn restriction.
273	The node at the centre of the turn restriction must therefore be a
274	super-node to avoid this. In addition to this all nodes connected to
275	the turn restriction node by a single segment must also be super-nodes
276	to avoid any long-distance super-segments starting at the restricted
277	node.
278
279	The second complication of a turn restriction is that the optimum route
280	may require passing through the same node more than once. This can
281	happen where the route needs to work around a turn restriction by
282	driving past it, turning round (on a roundabout perhaps) and coming
283	back along the same highway. Without turn restrictions a route could be
284	defined purely by the set of nodes that were passed; no node would
285	exist more than once along a route between two points. With turn
286	restrictions the route is defined by a node and the segment used to get
287	there; no route between two points will ever need to follow the same
288	segment in the same direction more than once. This means that the
289	optimisation algorithm calculates scores for directed segments (indexed
290	by segment and end node) rather than for nodes.
291
292	A side-effect of this is that a route that works around a turn
293	restriction must be calculable using the super-segments that are stored
294	in the database. This puts a limit on the amount of database
295	optimisation that can be performed because if too many super-segments
296	are removed the optimum work-around may also be removed. The solution
297	to this is to ensure that the database preserves all loops that can be
298	used to turn around and reverse direction, previously super-segments
299	that started and finished on the same super-node were disallowed.
300
301	Another side-effect of having the route composed of a set of locations
302	(nodes) as well as the direction of travel (segments used to reach
303	them) is that via points in the route can be forced to continue in the
304	original direction. If the chosen method of transport obeys turn
305	restrictions then it will not reverse direction at a via point but will
306	find an optimum route continuing in the same direction. The only
307	exception to this is when the route ahead at a waypoint is into a
308	dead-end and an immediate U-turn is allowed.
309
310	A side-effect of having the starting direction at a via point defined
311	by the previous part of the route is that overall non-optimal routes
312	may be found even though each section between via points is optimal.
313	For a route with a start, middle and end point defined it can be the
314	case that the shortest route from the start to the middle arrives in
315	the opposite direction to that required for the optimal route from the
316	middle to the end. The calculation of the route in separate sections
317	therefore may give a non-optimum result even though each section is
318	itself optimum based on the start conditions.
319
320	Overall the presence of turn restrictions in the database makes the
321	routing slower even for regions of the map that have no turn
322	restrictions.
323
324
325	Data Implementation
326	-------------------
327
328	The hardest part of implementing this router is the data organisation.
329	The arrangement of the data to minimise the number of operations
330	required to follow a route from one node to another is much harder than
331	designing the algorithm itself.
332
333	The final implementation uses a separate table for nodes, segments and
334	ways. Each table individually is implemented as a C-language data
335	structure that is written to disk by a program which parses the
336	OpenStreetMap XML data file. In the router these data structures are
337	memory mapped so that the operating system handles the problems of
338	loading the needed data blocks from disk.
339
340	Each node contains a latitude and longitude and they are sorted
341	geographically so that converting a latitude and longitude coordinate
342	to a node is fast as well as looking up the coordinate of a node. The
343	node also contains the location in the array of segments for the first
344	segment that uses that node.
345	Each segment contains the location of the two nodes as well as the way
346	that the segment came from. The location of the next segment that uses
347	one of the two nodes is also stored; the next segment for the other
348	node is the following one in the array. The length of the segment is
349	also pre-computed and stored.
350	Each way has a name, a highway type, a list of allowed types of
351	traffic, a speed limit, any weight, height, width or length
352	restrictions and the highway properties.
353
354	The super-nodes are mixed in with the nodes and the super-segments are
355	mixed in with the segments. For the nodes they are the same as the
356	normal nodes, so just a flag is needed to indicate that they are super.
357	The super-segments are in addition to the normal segments so they
358	increase the database size (by about 10%) and are also marked with a
359	flag. Some segments are therefore flagged as both normal segments and
360	super-segments if they both have the same end nodes.
361
362	The relations are stored separately from the nodes, segments and ways.
363	For the turn restriction relations the initial and final segments are
364	stored along with the restricted node itself. Each node that has a turn
365	restriction is marked in the main node storage with a flag to indicate
366	this information.
367
368
369	--------
370
371	Copyright 2008-2013 Andrew M. Bishop.