Data Extracts - Technical Details

The extract process

Once a day, around 21:00 CET, we update a locally held version of the planet file from the latest OpenStreetMap data and split it into a number of pre-defined regions. This is done using the Osmosis and Osmium programs (Osmosis to download updates, Osmium to apply them and to do the splitting).

Every couple of months we re-initialise our update process with a new planet file just to make sure we're not carrying over potential replication errors forever.

The splitting is done in a cascading fashion – first we split the world in two halves, then we cut out the continents from each half, then countries, and so on.
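The cascading order can be pictured as a walk over a tree of regions, where every extract is cut from its parent extract rather than from the full planet. The following is only a minimal sketch of that idea; the region names and the PARENTS table are made up for illustration and do not reflect the actual region hierarchy or tooling.

```python
from collections import defaultdict, deque

# Hypothetical region tree (child -> parent); "planet" is the root.
PARENTS = {
    "europe": "planet",
    "asia": "planet",
    "germany": "europe",
    "france": "europe",
    "bavaria": "germany",
}


def split_order(parents, root="planet"):
    """Return (source, target) pairs so that each source extract
    exists before anything is cut out of it."""
    children = defaultdict(list)
    for child, parent in parents.items():
        children[parent].append(child)

    order = []
    queue = deque([root])
    while queue:
        source = queue.popleft()
        for target in sorted(children[source]):
            order.append((source, target))
            queue.append(target)  # the new extract may be split further
    return order


if __name__ == "__main__":
    for source, target in split_order(PARENTS):
        print(f"cut {target} out of {source}")
```

The benefit of cascading is that each step reads a much smaller input file than the planet, so the smaller regions are cheap to produce.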

We use polygonal boundaries for the splitting – boundaries that are sometimes derived and simplified from OSM data, sometimes just hand-drawn. The boundaries usually follow country borders, but occasionally we take liberties and include a little more of a neighbouring country if this greatly simplifies the polygon. The Osmium extract function that we use keeps ways and multipolygon relations that cross an extract border complete, so when a very large multipolygon crosses the border, an extract can occasionally contain a lot more than expected.

Polygon files

The .poly files that you can download reflect the exact clipping boundary that we use in generating the extract, and can be used with programs like Osmosis, Osmium, or Osmconvert to generate the extract from a larger file. The KML files contain the same data, just in a different format. Please note that these files are not country boundaries but a buffer around countries; go to naturalearthdata.com if you want a simple set of country boundaries.
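For readers who want to inspect a clipping boundary themselves, here is a small sketch of a reader for the polygon filter format as it is commonly documented: a name line, then one or more rings, each with a ring name, lines of "longitude latitude" pairs and a closing END, followed by a final END. A ring name starting with "!" marks a hole. The sample region and its coordinates are invented.

```python
def parse_poly(text):
    """Parse polygon filter text into (name, rings).

    Each ring is a dict with 'name', 'hole' (bool) and 'coords',
    a list of (lon, lat) float tuples.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    name, rings, current = lines[0], [], None
    for line in lines[1:]:
        if line == "END":
            if current is None:
                break  # the final END closes the file
            rings.append(current)
            current = None
        elif current is None:
            # First line of a ring section is the ring name.
            current = {
                "name": line.lstrip("!"),
                "hole": line.startswith("!"),
                "coords": [],
            }
        else:
            lon, lat = line.split()[:2]
            current["coords"].append((float(lon), float(lat)))
    return name, rings


# Invented sample: one outer ring with one hole.
SAMPLE = """example_region
1
   9.0 47.0
   10.0 47.0
   10.0 48.0
   9.0 48.0
   9.0 47.0
END
!2
   9.4 47.4
   9.6 47.4
   9.6 47.6
   9.4 47.4
END
END
"""

if __name__ == "__main__":
    region, rings = parse_poly(SAMPLE)
    print(region, [(r["name"], r["hole"], len(r["coords"])) for r in rings])
```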

pbf files

The .osm.pbf data format is the common format for the exchange of raw OpenStreetMap data. It is fast to read and write and can be directly processed by most programs dealing with OSM data. Our .osm.pbf files are 100% pure, un-filtered OSM and contain all data and metadata available in OSM for the region; the only thing they don't contain is history, i.e. information about past edits.

We do, however, keep a couple of older files around. They are not usually shown, but you can access them through the directory index; they are timestamped in the file name. We delete these older files after a while. If you are on a very slow and/or flaky internet connection, do not download the file named "...-latest"; download the timestamped file instead, so that you can resume the download even if the connection fails.
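Resuming a broken download usually relies on HTTP range requests. The sketch below assumes that the server honours the Range header and that the caller passes in the URL of a timestamped file; the function names are illustrative, and a download manager with resume support does the same job.

```python
import os
import urllib.request


def range_header(path):
    """Return a Range header asking only for the bytes still missing."""
    have = os.path.getsize(path) if os.path.exists(path) else 0
    return {"Range": f"bytes={have}-"} if have else {}


def resume_download(url, path, chunk=1 << 20):
    """Download url to path, appending to a partial file if one exists."""
    request = urllib.request.Request(url, headers=range_header(path))
    with urllib.request.urlopen(request) as response:
        # 206 means the range was honoured; 200 means the whole file is sent.
        mode = "ab" if response.status == 206 else "wb"
        with open(path, mode) as out:
            while True:
                block = response.read(chunk)
                if not block:
                    break
                out.write(block)
```

Appending is only safe if the remote file has not changed between attempts, which is the reason to prefer a timestamped file name. If the local file is already complete, the server will typically answer with an error status, which urllib raises as an HTTPError.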

The .osh.pbf format is for history files. We keep one history file for each region that is on offer, and that file is only updated weekly, but it contains the full history of an area and can be used to synthesize a data file for the region for any timestamp in the past.
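Synthesizing a data file for a past timestamp boils down to choosing, for every object, the newest version that already existed at that time and dropping objects that were deleted by then. The sketch below shows that selection for a single object; the record layout (version, timestamp, visible) is a simplified stand-in and not the layout of the .osh.pbf file itself. In practice a dedicated tool such as the time-filter command of the Osmium command line tool is the more likely choice.

```python
from datetime import datetime, timezone


def state_at(versions, when):
    """Pick the version of one object that was current at `when`.

    `versions` is a list of dicts with 'version', 'timestamp'
    (timezone-aware datetime) and 'visible' (False marks a deletion).
    Returns None if the object did not exist yet or was deleted.
    """
    candidates = [v for v in versions if v["timestamp"] <= when]
    if not candidates:
        return None
    latest = max(candidates, key=lambda v: v["version"])
    return latest if latest["visible"] else None


# Invented history of one object: created, edited, then deleted.
HISTORY = [
    {"version": 1, "timestamp": datetime(2015, 3, 1, tzinfo=timezone.utc), "visible": True},
    {"version": 2, "timestamp": datetime(2018, 6, 1, tzinfo=timezone.utc), "visible": True},
    {"version": 3, "timestamp": datetime(2021, 1, 1, tzinfo=timezone.utc), "visible": False},
]

if __name__ == "__main__":
    print(state_at(HISTORY, datetime(2019, 1, 1, tzinfo=timezone.utc)))
```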

bz2 files

The .osm.bz2 files are bzip2-compressed versions of OSM XML data for the region. We generate these files from the .osm.pbf files with a low-priority background process, which means that they will often be older than their .osm.pbf counterparts. They are also larger and slower to process.

Shape files

The .shp.zip files contain a number of shape layers (.shp/.shx/.dbf combos). In contrast to the pbf/bz2 files, the shape files are not "complete" - we have made a selection of features and attributes. The shape files have the same structure as shape files we make to order, but the free files contain fewer layers, and are only available for smaller areas. A PDF describing the shape files is available online.

.osc.gz files, or "diff updates"

Whenever we produce a new extract for a region, we also compute the difference between the new extract and the previous one, and make that available for download so that users can continuously update their own regional extract instead of having to download the full file. The file names for update files follow the convention used by Osmosis for the "read-replication-interval" task, so automatic updates are possible with Osmosis, osmupdate, and pyosmium-up-to-date.
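As far as the Osmosis replication convention is generally understood, each diff has a sequence number that is zero-padded to nine digits and split into three directory levels, with a matching state file next to it. The helper below sketches that mapping only; the function name is illustrative, and the base URL of a region's update directory is not given here, so it is left to the caller.

```python
def replication_paths(sequence):
    """Map a replication sequence number to the relative paths of
    its diff file and its state file."""
    if not 0 <= sequence <= 999_999_999:
        raise ValueError("sequence number must fit into nine digits")
    padded = f"{sequence:09d}"
    stem = "/".join((padded[0:3], padded[3:6], padded[6:9]))
    return f"{stem}.osc.gz", f"{stem}.state.txt"


if __name__ == "__main__":
    diff_path, state_path = replication_paths(4321)
    print(diff_path)   # 000/004/321.osc.gz
    print(state_path)  # 000/004/321.state.txt
```

The tools named above handle this bookkeeping themselves, so a helper like this is mostly useful for debugging or for checking by hand which diff a client should fetch next.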

Please be aware that these diffs really only represent the changes between the previous and current versions of the Geofabrik extract. These changes will mainly be recent changes in OSM but not exclusively; it is for example possible that we modify the clip bounds for one country a little and therefore the next extract contains more or less data than the previous one, a change that would also be reflected in the diffs.

Metadata and data protection

Files which are accessible on our public download server without any login do not contain sensitive data about the OpenStreetMap contributors. The user, uid and changeset fields have been omitted from these files since May 3, 2018. You can download files with full metadata from a different download server, which requires you to log in with your OpenStreetMap account. Files from this non-public download server contain data which is subject to EU data protection regulations. These regulations apply world-wide.

JSON index of all downloads

If you want a machine-readable list of all files the server offers, check out our GeoJSON index at https://download.geofabrik.de/index-v1.json (or https://download.geofabrik.de/index-v1-nogeom.json for a smaller file without the boundary geometries).

This file is a FeatureCollection where each feature has the following properties:

The structure is guaranteed to remain stable; if we make changes to the structure we will use a different version number for the name of the file.
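A quick way to get familiar with the index is to load it and list which property keys the features actually carry. The sketch below relies only on the generic GeoJSON FeatureCollection layout ("features", each with "properties"); the keys in SAMPLE are made-up placeholders, not the real property names.

```python
import json
import urllib.request
from collections import Counter


def load_index(url):
    """Fetch and parse a GeoJSON index from the given URL."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def property_usage(index):
    """Count how often each property key appears across all features."""
    if index.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    counts = Counter()
    for feature in index.get("features", []):
        counts.update((feature.get("properties") or {}).keys())
    return counts


# Tiny stand-in document with placeholder property names.
SAMPLE = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"example_key": "a", "other_key": 1}, "geometry": None},
        {"type": "Feature", "properties": {"example_key": "b"}, "geometry": None},
    ],
}

if __name__ == "__main__":
    print(property_usage(SAMPLE))
```

To run it against the live data, pass the result of load_index("https://download.geofabrik.de/index-v1-nogeom.json") to property_usage; the variant without geometries is the smaller download.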