Routino : Numerical Limits


32/64-bit Data IDs

The OpenStreetMap data uses a numerical identifier for each node, way and relation. These identifiers start at 1 and increase for every new item of each type that is added. When an object is deleted the identifier is not re-used, so the highest identifier will always be higher than the number of objects.

The identifier needs to be handled carefully to ensure that it does not overflow the data type allocated for it. Depending on the data type used to store the identifier there are a number of numerical limits, as described below:

  1. If a signed 32-bit integer is used to store the identifier then the maximum value that can be handled is 2147483647 (2^31-1) before overflow.
  2. If an unsigned 32-bit integer is used to store the identifier then the maximum value that can be handled is 4294967295 (2^32-1) before overflow.
  3. If a signed 64-bit integer is used to store the identifier then the maximum value that can be handled is 9223372036854775807 (2^63-1) before overflow.
For the purposes of this document the possibility of overflow of a 64-bit integer is ignored.
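
The three limits listed above all have the form 2^n-1, where n is the number of value bits in the data type. A minimal sketch of that arithmetic in C follows; the function max_identifier() is a hypothetical helper written for illustration and is not part of Routino.

```c
#include <stdint.h>

/* Hypothetical helper (not part of Routino): the largest identifier that
   fits in an integer with the given number of value bits.
   Use 31 for signed 32-bit, 32 for unsigned 32-bit, 63 for signed 64-bit.
   The shift is only defined for 1 <= value_bits <= 63. */
uint64_t max_identifier(unsigned value_bits)
{
    return (UINT64_C(1) << value_bits) - 1;
}
```

Calling max_identifier(31), max_identifier(32) and max_identifier(63) gives the same values as the standard constants INT32_MAX, UINT32_MAX and INT64_MAX from stdint.h.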

The part of Routino that handles the node, way and relation identifiers is the planetsplitter program.

ID Above 31-bits

The first identifier exceeding 31-bits (for a node) is predicted to be created in the OpenStreetMap database in February 2013.

All versions of Routino use unsigned 32-bit integers to store the identifier. Therefore all versions of Routino will continue working correctly when node number 2147483648 (2^31) or higher is present.

ID Above 32-bits

The ability of Routino to handle identifiers larger than 32-bits does not depend on having a 64-bit operating system.

Before version 2.0.1 of Routino there was no check that the identifier read from the input data would fit within an unsigned 32-bit integer. Earlier versions of Routino will therefore fail to report an error and will process data incorrectly when node number 4294967296 (2^32) or higher is present.
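
The kind of range check described above can be sketched as follows. This is an illustration only, not the code used by planetsplitter: the function name parse_id32() and its return convention are assumptions. The identifier is read into a wider type first so that a value of 2^32 or more can be detected rather than silently truncated.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not Routino's code): parse a decimal identifier and
   store it in *id only if it fits in an unsigned 32-bit integer.
   Returns 0 on success and -1 if the text is invalid or out of range. */
int parse_id32(const char *text, uint32_t *id)
{
    char *end;
    unsigned long long value;

    /* Reject empty strings, signs and leading whitespace. */
    if (*text < '0' || *text > '9')
        return -1;

    errno = 0;
    value = strtoull(text, &end, 10);

    /* Trailing characters, or a value too large even for 64 bits. */
    if (*end != '\0' || errno == ERANGE)
        return -1;

    /* The check itself: does the value fit in 32 bits? */
    if (value > UINT32_MAX)
        return -1;

    *id = (uint32_t)value;
    return 0;
}
```

With this sketch the text "4294967295" is accepted, while "4294967296" is rejected instead of wrapping around to 0.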

From version 2.0.2 the code is written to allow the node, way and relation identifier data type to be changed to 64-bits. This means that a consistent data type is used for handling identifiers and the format used for printing them is consistent with the variable type.

From version 2.0.2 onwards it is possible to make a simple change to the code to process data with node identifiers of 4294967296 (2^32) or higher without error. The binary format of the database will be unchanged by the use of 64-bit identifiers (since the identifiers are not stored in the database).

To recompile with 64-bit node identifiers the file src/typesx.h should be edited and the two lines below changed from:

typedef uint32_t node_t;

#define Pnode_t PRIu32
to:
typedef uint64_t node_t;

#define Pnode_t PRIu64
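
The typedef and the format macro need to be changed together because the macro supplies the printf conversion that matches the type. A minimal sketch of how such a pair is used follows; the two definitions are the 64-bit ones shown above, while format_node_id() is a hypothetical function written for illustration and is not part of Routino.

```c
#include <inttypes.h>
#include <stdio.h>

typedef uint64_t node_t;

#define Pnode_t PRIu64

/* Hypothetical helper (not part of Routino): write a node identifier into
   a buffer.  The format string is built by string literal concatenation,
   so it always matches the width of node_t. */
int format_node_id(char *buffer, size_t size, node_t id)
{
    return snprintf(buffer, size, "node %" Pnode_t, id);
}
```

If only the typedef were changed, the format would still expect a 32-bit argument and identifiers would be printed incorrectly.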

A similar change can also be made for way or relation identifiers although since there are currently fewer of these the limit is not as close to being reached.

Between version 2.0.2 and version 2.4 a bug means that route relations will ignore the way or relation identifier if it is equal to 4294967295 (2^32-1).

From version 2.4 onwards when a numerical limit is reached the planetsplitter program will exit with an error message that describes which limit was reached and which data type needs to be changed.

Database Format

The other limitation in Routino is the number of objects stored in the database that is generated by the planetsplitter data processing. This number may be significantly different from the highest identifier in the input data set for two reasons. Firstly any nodes, ways or relations that have been deleted will not be present in the data. Secondly when a partial planet database (continent, country or smaller) is processed there will be only a fraction of the total number of nodes, ways and relations.

The limiting factor is the largest of the following.

  1. The number of nodes in the input data files.
  2. The number of segments in the input data files.
  3. The number of highways in the input data files.
  4. The number of relations in the input data files.
Normally the number of nodes will be the limiting factor.

32-bit Indexes

Before version 1.2 the database could hold up to 4294967295 (2^32-1) items of each type (node, segment, way) since an unsigned 32-bit integer is used.

Versions 1.3 to 1.4.1 have a limit of 2147483647 (2^31-1) items since half of the 32-bit integer range is reserved for fake nodes and segments that are inserted if a waypoint is not close to a node.

From version 1.5 the limit is 4294901760 (2^32-2^16) for the number of items of each type that can be stored. The small remaining part of the 32-bit unsigned integer range is reserved for fake nodes and segments.
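
The limit for version 1.5 onwards can be checked with a short calculation. In the sketch below the macro and function names are hypothetical and are not taken from the Routino source, and it is an assumption that the reserved part is the top of the index range, so that index values from 2^32-2^16 upwards are the reserved ones.

```c
#include <stdint.h>

/* Hypothetical name (not from the Routino source): the number of real
   items of each type that fit below the reserved part of the range. */
#define MAX_REAL_ITEMS ((UINT64_C(1) << 32) - (UINT64_C(1) << 16))

/* Hypothetical helper: returns 1 if a 32-bit index lies in the part of
   the range assumed here to be reserved for fake nodes and segments. */
int is_reserved_index(uint32_t index)
{
    return index >= MAX_REAL_ITEMS;
}
```

MAX_REAL_ITEMS evaluates to 4294901760, which leaves 65536 (2^16) values in the reserved part of the 32-bit range.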

64-bit Indexes

When using a 32-bit operating system it is not possible to create a database that exceeds about 2GB in total. Such a database will contain fewer than 2^32 objects in total. The use of 64-bit indexes will require a 64-bit operating system.

From version 2.0.2 onwards it is possible to make a simple change to the code to index the database objects with 64-bit integers instead of 32-bit integers.

To recompile with 64-bit index integers the file src/types.h should be edited and the two lines below changed from:

typedef uint32_t index_t;

#define Pindex_t PRIu32
to:
typedef uint64_t index_t;

#define Pindex_t PRIu64
This change will affect nodes, segments, ways and relations together. The database that is generated will no longer be compatible with Routino that has been compiled with 32-bit indexes. The size of the database will also grow by about 50% when this change is made.