How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets

Mordechai Haklay
Department of Civil, Environmental, and Geomatic Engineering, University College London, Gower Street, London WC1E 6BT
Received 8 August 2008; in revised form 13 September 2009
Environment and Planning B: Planning and Design 2010, volume 37, pages 682-703
doi:10.1068/b35097

Abstract. Within the framework of Web 2.0 mapping applications, the most striking example of a geographical application is the OpenStreetMap (OSM) project. OSM aims to create a free digital map of the world and is implemented through the engagement of participants in a mode similar to software development in Open Source projects. The information is collected by many participants, collated in a central database, and distributed in multiple digital formats through the World Wide Web. This type of information was termed `Volunteered Geographical Information' (VGI) by Goodchild (2007). However, to date there has been no systematic analysis of the quality of VGI. This study aims to fill this gap by analysing OSM information. The examination focuses on analysis of its quality through a comparison with Ordnance Survey (OS) datasets. The analysis focuses on London and England, since OSM started in London in August 2004 and therefore the study of these geographies provides the best understanding of the achievements and difficulties of VGI. The analysis shows that OSM information can be fairly accurate: on average within about 6 m of the position recorded by the OS, and with approximately 80% overlap of motorway objects between the two datasets. In the space of four years, OSM has captured about 29% of the area of England, of which approximately 24% are digitised lines without a complete set of attributes. The paper concludes with a discussion of the implications of the findings for the study of VGI, as well as suggesting future research directions.

1 Introduction

While the use of the Internet and the World Wide Web (Web) for mapping applications is well into its second decade, the picture has changed dramatically since 2005 (Haklay et al, 2008). One expression of this change is the emerging neologism that follows the rapid technological developments. While terms such as neogeography, mapping mashups, geotagging, and geostack may seem alien to veterans in the area of geographical information systems (GIS), they can be mapped to existing terms that have been in use for decades: a mashup is a form of interoperability between geographical databases, geotagging is a specific form of georeferencing or geocoding, the geostack is a GIS, and neogeography is the sum of these terms in an attempt to divorce the past and conquer new (cyber)space.

Therefore, the neologism does not represent new ideas but rather a zeitgeist, indicative of the change that has happened.

Yet, it is hard not to notice the whole range of new websites and communities that have emerged, from the commercial Google Maps to the grassroots OpenStreetMap (OSM) and applications such as Platial. The sheer scale of new mapping applications is evidence of a step change in the geographical web (or the GeoWeb for short). Mapping has gained prominence within the range of applications known as Web 2.0, and the attention that is given to this type of application in the higher echelons of hi-tech circles is exemplified by a series of conferences, Where 2.0, started in 2006 by O'Reilly Media, probably the leading promoter of hi-tech knowhow: ``GIS has been around for decades, but is no longer only the realm of specialists. The Web is now flush with geographic data, being harnessed in a usable and searchable format'' (Where 2.0, 2008).

While Haklay et al (2008) and Elwood (2009) provide an overview of Web 2.0 mapping, one of the most interesting aspects is the emergence of crowdsourced information. Crowdsourcing is one of the most significant and potentially controversial developments in Web 2.0. The term developed from the concept of outsourcing, in which business operations are transferred to remote, cheaper locations (Friedman, 2006). Similarly, crowdsourcing describes how large groups of users can perform functions that are either difficult to automate or expensive to implement (Howe, 2006).

The reason for the controversial potential of crowdsourcing is that it can be a highly exploitative activity, in which participants are encouraged to contribute to an alleged greater good when, in reality, the whole activity is contributing to an exclusive enterprise that profits from it. In such situations, crowdsourcing is the ultimate cost reduction for the enterprise, in which labour is used without any compensation or obligation between the employer and the employee.
On the other hand, crowdsourcing can be used in large-scale community activities
that were difficult to implement and maintain before Web 2.0. Such community activities
can be focused on the development of software, or more recently on the collection
and sharing of information. This `commons-based peer production' has attracted
significant attention (Benkler and Nissenbaum, 2006). Tapscott and Williams (2006)
note, in relation to this form of activity, that in many peer-production communities,
productive activities are voluntary and nonmonetary. In these cases, the participants
contribute to achieve a useful goal that will serve their community, and very frequently
the wider public. This is well exemplified by the creation of Open Source software such
as the Firefox web browser or the Apache web server: both are used by millions, while
being developed by a few hundred people. Even in cases where the technological
barrier for participation is not as high as in software development, the number of
participants is much smaller than the number of users. For example, in Wikipedia,
well over 99.8% of visitors to the site do not contribute anything (Nielsen, 2006), yet
this does not deter the contributors; on the contrary, they gain gratification from the
usefulness of their contributions.
The use of large-scale crowdsourcing activities to create reliable sources of information
or high-quality software is not without difficulties. These activities, especially the
commons-based ones, are carried out by a large group of volunteers, who
work independently and without much coordination, each concentrating on their own
interests. In successful commons-based peer-production networks, there are lengthy
deliberations within the communities about the directions that the project should
take or how to implement a specific issue. Even after such deliberation, these projects
have a limited ability to force participants into a specific course of action other than to
banish them from the project at the cost of losing a contributor (and usually a
significant one). Furthermore, and especially in information-based activities, the participants
are not professionals but amateurs (Keen, 2007) and therefore do not follow
common standards in terms of data collection, verification, and use. This is a core
issue which is very frequently discussed and debated in the context of crowdsourcing
activities (Friedman, 2006; Tapscott and Williams, 2006).
The potential of crowdsourced geographical information has captured the attention
of researchers in GIS (including Elwood, 2009; Goodchild, 2007a; 2007b; 2008;
Sui, 2008). Goodchild coined the term `Volunteered Geographical Information' (VGI)
to describe this activity. In the area of geographical information, the question of
information quality has been at the centre of the research agenda since the first
definition of GIScience (Goodchild, 1992). Therefore, in light of the data collection
by amateurs, the distributed nature of the data collection and the loose coordination in
terms of standards, one of the significant core questions about VGI is `how good is the
quality of the information that is collected through such activities?' This is a crucial
question about the efficacy of VGI activities and the value of the outputs for a range
of applications, from basic navigation to more sophisticated applications such as
site-location planning.
To answer this question, the OSM project provides a suitable case study. OSM aims
to create map data that are free to use, editable, and licensed under new copyright
schemes, such as Creative Commons, which protect the project from unwarranted
use by either participants or a third party (Benkler and Nissenbaum, 2006). A key
motivation for this project is to enable free access to current digital geographical
information across the world, as such information has not been available until now.
In many Western countries this information is available from commercial providers
and national mapping agencies, but it is considered to be expensive and is out of reach
of individuals and grass-roots organisations. Even in the US, where basic road information
is available through the US Census Bureau TIGER/Line programme, the
details that are provided are limited (streets and roads only) and do not include green
spaces, landmarks, and the like. Also, due to budget limitations, the update cycle is
slow and does not take into account rapid changes. Thus, even in the US, there is a
need for free detailed geographical information.
OSM information can be edited online through a wiki-like interface where, once a
user has created an account, the underlying map data can be viewed and edited. In
addition to this lightweight editing software package working within the browser, there
is a stand-alone editing tool, more akin to a GIS package. A number of sources have
been used to create these maps including uploaded Global Positioning System (GPS)
tracks, out-of-copyright maps, and Yahoo! aerial imagery, which was made available
through collaboration with this search engine. Unlike Wikipedia, where the majority of
content is created at disparate locations, the OSM community also organises a series
of local workshops (called `mapping parties'), which aim to create and annotate content
for localised geographical areas (see Perkins and Dodge, 2008). These events are
designed to introduce new contributors to the community with hands-on experience
of collecting data, while contributing positively to the project overall by generating new
information and street labelling as part of the exercise. The OSM data are stored on
servers at University College London and at Bytemark, which contributes the bandwidth
for this project. Whilst over 50 000 people have contributed to the map as of August
2008, it is a core group of about forty volunteers who dedicate their time to creating
the technical infrastructure for a viable data collection and dissemination service.
This includes the maintenance of the servers, writing the core software that handles
the transactions with the server when adding and editing geographical information,
and creating cartographical outputs. For a detailed discussion of the technical side of
the project, see Haklay and Weber (2008).
With OSM, it is possible to answer the question of VGI quality by comparing the
dataset against Ordnance Survey (OS) datasets in the UK. As OSM started in London,
and thus the city represents the place that has received the longest ongoing attention
from OSM participants, it stands to reason that an examination of the data for the city
and for England will provide an early indication about the quality of VGI.
An analysis of the quality of the OSM dataset is discussed here, evaluating its
positional and attribute accuracy, completeness, and consistency. In light of this analysis,
the fitness for purpose of OSM information and some possible directions for
future developments are suggested. However, before turning to the analysis, a short
discussion of evaluations of geographical information quality will help to set the scene.
2 How to evaluate the quality of geographical information
The problem of understanding the quality of geographical databases was identified
many years ago, and received attention from surveyors, cartographers, and geographers
(van Oort, 2006). Van Oort identified work on the quality of geographical information
dating back to the late 1960s and early 1970s.
With the emergence of GIS in the 1980s, this area of research experienced rapid
growth, receiving attention from leading figures in the area of geographical information
science including Burrough and Frank (1996), Goodchild (1995), Fisher (1999),
Chrisman (1984), and many others [see van Oort (2006) for a comprehensive review].
By 2002, quality aspects of geographical information had been enshrined in
International Organization for Standardization (ISO) standards 19113 (quality principles)
and 19114 (quality evaluation procedures), under the aegis of ISO Technical Committee 211.
In their review of these standards, Kresse and Fadaie (2004) identified the following
aspects of quality: completeness, logical consistency, positional accuracy, temporal
accuracy, thematic accuracy, purpose, usage, and lineage.
Van Oort's (2006) synthesis of various quality standards and definitions is more
comprehensive and identifies the following aspects.
- Lineage: this aspect of quality is about the history of the dataset, how it was collected, and how it evolved.
- Positional accuracy: this is probably the most obvious aspect of quality and evaluates how well the coordinate value of an object in the database relates to the reality on the ground.
- Attribute accuracy: as objects in a geographical database are represented not only by their geometrical shape but also by additional attributes, this measure evaluates how correct these values are.
- Logical consistency: this is an aspect of the internal consistency of the dataset, in terms of topological correctness and the relationships that are encoded in the database.
- Completeness: this is a measure of the lack of data; that is, an assessment of how many objects are expected to be found in the database but are missing, as well as an assessment of excess data that should not be included. In other words, how comprehensive is the coverage of real-world objects?
- Semantic accuracy: this measure links the way in which the object is captured and represented in the database to its meaning and the way in which it should be interpreted.
- Usage, purpose, and constraints: this is a fitness-for-purpose declaration that should help potential users in deciding how the data should be used.
- Temporal quality: this is a measure of the validity of changes in the database in relation to real-world changes, and also of the rate of updates.
Naturally, the definitions above are shorthand versions and aim to explain the principles
of geographical-information quality. The burgeoning literature on geographical-information
quality provides more detailed definitions and discussion of these aspects
(see Kresse and Fadaie, 2004; van Oort, 2006).
To understand the amount of work that is required to achieve a high-quality geographical
database, the OS provides a good example of monitoring completeness and
temporal quality (based on Cross et al, 2005). To achieve this goal, the OS has an
internal quality-assurance process known as `the agency performance monitor'. This
is set by the UK government and requires that `some 99.6% of significant real-world
features are represented in the database within six months of completion'. Internally
to OS, the operational instruction based on this criterion is the maintenance of the
OS large-scale database currency at an average of no more than 0.7 house units of
unsurveyed major change, over six months old, per digital map unit (DMU). DMUs
are, in essence, map tiles, while house units are a measure of data capture, with the
physical capture of one building as the basic unit. To verify that they comply with
the criterion, every six months the OS analyses the result of auditing over 4000 DMUs,
selected through stratified sampling, for missing detail by sending semitrained
surveyors out on the ground with printed maps. This is a significant and costly undertaking
but it is an unavoidable part of creating a reliable and authoritative geographical
database. Noteworthy is that this work focuses on completeness and temporal quality,
while positional accuracy is evaluated through a separate process.
As this type of evaluation is not feasible for OSM, a desk-based approach was
taken using two geographical datasets: the OS dataset and the OSM dataset. The assumption
is that, at this stage of OSM development, the OS dataset represents higher
accuracy and higher overall quality (at least with regard to position and attribute).
Considering the lineage and investment in the OS dataset, this should not be a
contested statement. This type of comparison is common in spatial-data-quality
research (see Goodchild et al, 1992; Hunter, 1999).
3 Datasets used and comparison framework
A basic problem that is inherent in a desk-based quality assessment of any spatial
dataset is the selection of the comparison dataset. The explicit assumption in any
selection is that the comparison dataset is of higher quality and represents a version
of reality that is consistent in terms of quality, and is therefore usable to expose
shortcomings in the dataset that is the subject of the investigation.
Therefore, a meaningful comparison of OSM information should take into account
the characteristics of this dataset. Due to the dataset collection method, the OSM
dataset cannot be more accurate than the quality of the GPS receiver (which usually
captures a location to within 6-10 m) and the Yahoo! aerial imagery, which outside
London provides about 15 m resolution. This means that we can expect the OSM
dataset to be accurate to within a region about 20 m from the true location under
ideal conditions. Therefore, we should treat it as a generalised dataset. Furthermore,
for the purpose of the comparison, only streets and roads will be used, as they are the
core feature that is being collected by OSM volunteers.
On the basis of these characteristics, Navteq or TeleAtlas datasets, where comprehensive
street-level information without generalisation is available, should be the most
suitable. They are collected under standard processes and quality-assurance procedures,
with a global coverage. Yet, these two datasets are outside the reach of researchers
without incurring the high costs of purchasing the data, and a request to access such a
dataset for the purpose of comparing it with OSM was turned down.
As an alternative, the OS datasets were considered. Significantly, the OS willingly
provided their datasets for this comparison. Of the range of OS vector datasets,
Meridian 2 (for the sake of simplicity, `Meridian') and MasterMap were used. Meridian
is a generalised dataset and, due to reasons that are explained below, it holds some
characteristics that make it similar to OSM and suitable for comparison. The OS
MasterMap Integrated Transport Layer (ITN) dataset is a large-scale dataset with
high accuracy level but, due to data volumes, it can only be used in several small areas
for a comprehensive comparison.
As Meridian is central to the comparison, it is worth paying attention to its
characteristics. Meridian is a vector dataset that provides coverage of Great Britain
with complete details of the national road network: ``Motorways, major and minor
roads are represented in the dataset. Complex junctions are collapsed to single nodes
and multi-carriageways to single links. To avoid congestion, some minor roads and
cul-de-sacs less than 200 m are not represented ... . Private roads and tracks are
not included'' (Ordnance Survey, 2007, page 24). The source of the road network is
high-resolution mapping (1:1250 in urban areas, 1:2500 in rural areas, and 1:10 000 in
moorland).
Meridian is constructed so that the node points are kept in their original position
while, through a process of generalisation, the road centreline is filtered to within a
20 m region of the original location. The generalisation process decreases the number
of nodes to reduce clutter and complexity. Thus, Meridian's positional accuracy is 5 m or
better for the nodes, and within 20 m of the real-world position for the links between
the nodes. The OS describes Meridian as a dataset suitable for applications from environmental
analysis to design and management of distribution networks for warehouses
to health planning.
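Although the OS does not publish the exact filtering algorithm here, the effect described above can be illustrated with a standard Douglas-Peucker simplification. The following Python sketch uses the shapely library; the coordinates are invented, and the 20 m tolerance mirrors the figure quoted above (an illustration only, not the OS procedure).

    from shapely.geometry import LineString

    # A digitised road centreline; coordinates are in metres (for example,
    # British National Grid) and are invented for this illustration.
    centreline = LineString([(0, 0), (40, 5), (80, 18), (120, 22), (160, 20), (200, 40)])

    # Douglas-Peucker filtering: drop vertices while keeping the original
    # line within 20 m of the simplified one (the tolerance quoted above).
    generalised = centreline.simplify(20)

    print(len(centreline.coords), "vertices reduced to", len(generalised.coords))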
Two other sources were used to complete the comparison. The first is the 1:10 000
raster files from the OS. This is the largest-scale raster product that is available from
the OS. The raster files are based on detailed mapping, and went through a process of
generalisation that leaves most of the features intact. The map is highly detailed, and
thus it is suitable for locating attribute information and details of streets and other
features that are expected to be found in OSM.
The second source is the lower level of Super Output Areas (SOAs), which is
provided by the OS and the Office for National Statistics and is based on the census
data. SOAs are about the size of a neighbourhood and are created through a computational
process by merging the basic census units. This dataset was combined with the
Index of Deprivation 2007 (ID 2007), created by the Department of Communities and
Local Government (DCLG), which indicates the socioeconomic status of each
SOA. This dataset is used in section 4.5 for the analysis of the socioeconomic bias of
VGI.
The OSM dataset that was used in this comparison was from the end of March
2008, and was based on roads information created by Frederik Ramm and available on
his website Geofabrik. The dataset is provided as a set of thematic layers (buildings,
natural, points, railways, roads, and waterways), which are classified according to their
OSM tags.
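The tag-based classification can be pictured as a simple filter over tagged ways. In this Python sketch the sample ways are invented, although the `highway' key is the tag that OSM actually uses for roads.

    # Invented sample of OSM-style ways: each has an id and a tag dictionary.
    ways = [
        {"id": 1, "tags": {"highway": "motorway", "name": "M1"}},
        {"id": 2, "tags": {"highway": "residential", "name": "Gower Street"}},
        {"id": 3, "tags": {"waterway": "river", "name": "Thames"}},
    ]

    # Thematic 'roads' layer: keep only the ways carrying a highway tag.
    roads = [w for w in ways if "highway" in w["tags"]]
    print([w["tags"]["name"] for w in roads])  # ['M1', 'Gower Street']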
The process of comparison started from an evaluation of positional accuracy, first
by analysing motorways, A-roads, and B-roads in the London area, and then by closely
inspecting five randomly selected OS tiles at 1:10 000 resolution, covering 113 km².
After this comparison, an analysis of completeness was carried out: first through a
statistical analysis across England, followed by a detailed visual inspection of the
1:10 000 tiles. Finally, statistical analysis of SOAs and ID 2007 was carried out.
4 Detailed methodology and results
For this preliminary study, two elements of the possible range of quality measures were
reviewed: positional accuracy and completeness. Firstly, positional accuracy is considered
by Chrisman (1991) to be the best-established issue of accuracy in mapping
science, and therefore must be tested. Positional accuracy is significant in the evaluation
of the fitness for use of data that were not created by professionals under
stringent data-collection standards. Secondly, completeness is significant in the case of
VGI, as data collection is carried out by volunteers who collect information of their
own accord without top-down coordination that would ensure systematic coverage. At
this stage of the development of VGI, the main question is the ability of these loosely
organised peer-production collectives to cover significant areas in a way that renders
their dataset useful.
4.1 Positional accuracy: motorways, A-roads and B-roads comparison(2)
The evaluation of the positional accuracy of OSM can be carried out against Meridian,
since the nodes of Meridian are derived from the high-resolution topographical dataset
and thus are highly accurate. However, the fact that the number of nodes has been
diluted by the application of a filter, together with the differing digitising methods,
means that the two datasets contain different numbers of nodes. Furthermore, in the case of motorways,
OSM represents these as a line object for each direction, whereas Meridian represents
them as a single line. This means that matching on a point-by-point basis would be
inappropriate in this case.
Motorways were selected as the objects for comparison as they are significant
objects in the landscape, so the comparison will evaluate the data capture along
lengthy objects, which should be captured in a consistent way. In addition, at the
time of the comparison, motorways were completely covered by the OSM dataset, so
the evaluation does not encounter completeness problems. The methodology used to
evaluate the positional accuracy of motorway objects across the two datasets was based
on Goodchild and Hunter (1997) and Hunter (1999). The comparison was carried out
by using buffers to determine the percentage of a line from one dataset that is within a
certain distance of the same feature in another dataset of higher accuracy (figure 1).
The preparation of the datasets for comparison included some manipulation. The
comparison was carried out for the motorways in the London area on both datasets to
ensure that they represented roughly the same area and length. Complex slip-road
configurations were edited in the OSM dataset to ensure that the representation was
similar to that in Meridian, since the latter represents a slip road as a simple straight
line. Significantly, this editing affected less than 0.1% of the total length of the motorway
layer. The rest of the analysis was carried out by creating a buffer around each
dataset, and then evaluating the overlap. As the OS represents the two directions as a
single line, it was decided that the buffer used for Meridian should be set at 20 m
(as this is the filter that the OS applies in the creation of the line), and, following
Goodchild and Hunter's (1997) method, the OSM dataset was buffered with a 1 m
buffer to calculate the overlap. The results are displayed in table 1.
(2) This section is based on the M Eng reports of Naureen Zulfiqar and Aamer Ather.
Figure 1. Example of the buffer-zone method (Goodchild and Hunter, 1997). A buffer of width x
is created around a high-quality object (the `true' coastline), and the percentage of the tested
object (the coastline to be tested) that falls within the buffer is evaluated.
On the basis of this analysis, we can conclude that, with an average overlap of nearly 80% and variability
from 60% up to 89%, the OSM dataset provides a good representation of motorways.
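As a minimal sketch, the buffer-zone measure can be expressed in a few lines of Python with the shapely library. The geometries below are invented; in the study itself the overlap was evaluated between a 20 m buffer around Meridian and a 1 m buffer around the OSM line, but the principle is the same.

    from shapely.geometry import LineString

    def buffer_overlap(reference, tested, width):
        """Percentage of the tested line that falls within a buffer of the
        given width around the higher-accuracy reference line (figure 1)."""
        zone = reference.buffer(width)
        return 100.0 * tested.intersection(zone).length / tested.length

    # Invented example geometries: a Meridian centreline and an OSM capture.
    meridian = LineString([(0, 0), (100, 0), (200, 10)])
    osm = LineString([(0, 4), (95, -3), (201, 12)])

    print(round(buffer_overlap(meridian, osm, 20), 1), "% within 20 m")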
A further analysis was carried out using five tiles (5 km × 5 km) of OS MasterMap
ITN, randomly selected from the London area, to provide an estimation of the accuracy
of capture of A-roads and B-roads, which are the smaller roads in the UK
hierarchy. For this analysis, the buffer that was used for A-roads was 5.6 m, and for
B-roads 3.75 m. Thus, this test was carried out using a higher-accuracy dataset and
more stringent buffer conditions. This comparison included over 246 km
of A-roads and the average overlap between OS MasterMap ITN and OSM was 88%,
with variability from 21% to 100%. In the same areas there were 68 km of B-roads,
which were captured with an overall overlap of 77% and variability from 5% to 100%.
The results of this comparison are presented in figure 2.
Table 1. Percentage of overlap between Ordnance Survey Meridian and OpenStreetMap buffers.

Motorway    Percentage
M1          87.36
M2          59.81
M3          71.40
M4          84.09
M4 Spur     88.77
M10         64.05
M11         84.38
M20         87.18
M23         88.78
M25         88.80
M26         83.37
M40         72.78
A1 (M)      85.70
A308 (M)    78.27
A329 (M)    72.11
A404        76.65

Figure 2. Comparison of A-roads and B-roads between Ordnance Survey MasterMap and
OpenStreetMap for five 5 km × 5 km randomly selected tiles in the London area. [Scatter plot
of percentage overlap (0-100) against road length (0-16 000 m) for A-roads and B-roads.]
4.2 Positional accuracy: urban areas in London
In addition to the statistical comparison, a more detailed, visual comparison was
carried out across 113 km² in London using five OS 1:10 000 raster tiles (TQ37ne:
New Cross; TQ28ne: Highgate; TQ29nw: Barnet; TQ26se: Sutton; and TQ36nw: South
Norwood). The tiles were selected randomly from the London area. Each tile was
inspected visually, and 100 samples were taken to evaluate the differences between
the OS centreline and the location that is recorded in OSM.
The average differences between the OS location and OSM are provided in table 2.

Table 2. Positional accuracy across five areas of London, comparing Ordnance Survey positions
with OpenStreetMap positions.

Area             Average (m)
Barnet           6.77
Highgate         8.33
New Cross        6.04
South Norwood    3.17
Sutton           4.83
Total            5.83
Notice the difference in averages between the areas. In terms of the underlying measurements,
in the best areas many of the locations are within 1-2 m of the OS location,
whereas in Barnet and Highgate distances of up to 20 m from the OS centreline were
recorded. Figure 3 provides examples from New Cross (a), Barnet (b) and Highgate (c),
which show the overlap and mismatch between the two datasets. The visual examination
of the various tiles shows that the accuracy and attention to detail differs between
areas. This can be attributed to digitisation, data collection skills, and the patience of
the person who carried out the work.

Figure 3. Three examples of overlapping OpenStreetMap (OSM) and Ordnance Survey maps
for the New Cross (a), Barnet (b), and Highgate (c) areas of London. The darker lines that
overlay the map are OSM features.
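The per-tile averages in table 2 amount to averaging point-to-centreline distances. A minimal Python sketch with shapely follows; the centreline and sample coordinates are invented stand-ins for the OS centreline and the sampled OSM locations.

    from shapely.geometry import LineString, Point

    # Invented example: an OS road centreline and three sampled OSM locations
    # for the same road (coordinates in metres, British National Grid).
    os_centreline = LineString([(530000, 180000), (530200, 180050)])
    osm_samples = [Point(530050, 180015), Point(530120, 180040), Point(530180, 180052)]

    # Offset of each OSM sample from the OS centreline, in metres.
    offsets = [pt.distance(os_centreline) for pt in osm_samples]
    print(f"average offset: {sum(offsets) / len(offsets):.2f} m")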
4.3 Completeness: length comparison
After gauging the level of positional accuracy of the OSM dataset, the next issue is the
level of completeness. While Steve Coast, the founder of OSM, stated ``it's important to
let go of the concept of completeness'' (GISPro, 2007, page 22), it is important to know
which areas are well covered and which are not; otherwise, the data can be assumed to
be unusable. Furthermore, the analysis of completeness can reveal other characteristics
that are relevant to other VGI projects.
Here, the difference in data capture between OSM and Meridian provided the core
principle for the assessment of completeness:

Because Meridian is generalised, excludes some of the minor roads, and does not
include foot and cycle paths, in every area where OSM has good coverage, the total
length of OSM roads must be longer than the total length of Meridian features.
This aspect can be compared across the whole area of Great Britain, but as OSM
started in England (and more specifically in London) a comparison across England
was more appropriate and manageable.
To prepare the dataset for comparison, a grid at a resolution of 1 km was created
across England. Next, as the comparison aimed to find the difference between
OSM and Meridian objects, and to avoid the inclusion of coastline objects and small
slivers of grid cells, all incomplete cells with an area less than 1 km² were eliminated.
This meant that, out of the total area of England of 132 929 km², the comparison was
carried out on 123 714 km² (about 93% of the total area).
The first step was to project the OSM dataset onto the British National Grid, to
bring it to the same projection as Meridian. The grid was then used to clip all the road
objects from OSM and Meridian in such a way that they were segmented along the
grid lines. This step enabled comparison of the two datasets in each grid cell across
England. For each cell, the following difference was calculated:

    Σ(OSM road lengths) - Σ(Meridian road lengths).
The rest of the analysis was carried out through SQL queries, which added up the
length of lines that were contained in or intersected the grid cells. The clipping process
was carried out in MapInfo, whereas the analysis was in Manifold GIS.
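The same workflow can be approximated in a few lines of Python. This sketch uses the geopandas library; the file names and the use of EPSG:27700 for the British National Grid are assumptions for illustration, not details of the original MapInfo/Manifold process.

    import geopandas as gpd
    from shapely.geometry import box

    # Hypothetical inputs: road centrelines from each source.
    osm = gpd.read_file("osm_roads.shp").to_crs(epsg=27700)  # project to British National Grid
    meridian = gpd.read_file("meridian_roads.shp")           # assumed already in EPSG:27700

    # Build a 1 km grid over the study area (coordinates are in metres).
    xmin, ymin, xmax, ymax = meridian.total_bounds
    cells = [box(x, y, x + 1000, y + 1000)
             for x in range(int(xmin), int(xmax), 1000)
             for y in range(int(ymin), int(ymax), 1000)]
    grid = gpd.GeoDataFrame({"cell_id": range(len(cells))}, geometry=cells, crs="EPSG:27700")

    def length_per_cell(roads):
        # Segment the roads along the grid lines and sum their length per cell.
        clipped = gpd.overlay(roads, grid, how="intersection")
        return clipped.geometry.length.groupby(clipped["cell_id"]).sum()

    # Per-cell difference: sum(OSM road lengths) - sum(Meridian road lengths).
    diff = length_per_cell(osm).subtract(length_per_cell(meridian), fill_value=0)
    print("cells where OSM is more detailed:", int((diff > 1).sum()))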
The results of the analysis show the current state of OSM completeness. At
the macrolevel, the total length of Meridian roads is 302 349 778 m, while for OSM it
is 209 755 703 m. Thus, even at the highest level, the total length of the OSM dataset is 69%
of that of Meridian. It is important to remember that, in this and in the following comparisons,
Meridian is an incomplete and generalised coverage, and thus there is an underestimation
of the total length of roads for England. Yet, considering the fact that OSM has been
around for a mere four years, this is a significant and impressive rate of data collection.
There are 16 300 km² in which neither OSM nor Meridian has any feature. Of the
remainder, in 70.7% of the area Meridian provides a better, more comprehensive coverage
than OSM. In other words, OSM volunteers have provided an adequate coverage for 29.3%
of the area of England in which we should expect to find features (see table 3).
Naturally, the real interest lies in the geography of these differences. The centres
of the big cities of England (such as London, Manchester, Birmingham, Newcastle, and
Liverpool) are well mapped using this measure. However, in suburban areas,
and especially at the boundary between the city and the rural area that surrounds it,
the quality of coverage drops very fast and there are many areas that are not covered
very well. Figure 4 shows the pattern across England. The areas that are marked in black
show where OSM is likely to be complete, while the grey indicates incompleteness. The
white areas are the locations where there is no difference between the two (where
the difference between the two datasets is within ±1 m). Figure 4 highlights the
importance of the Yahoo! imagery: the rectangular areas around London, where
high-resolution imagery is available and data capture is therefore easier, are clearly visible.

Table 3. Length comparison between OpenStreetMap (OSM) and Ordnance Survey Meridian 2.
Percentages of total are shown in parentheses.

Cells                                  Area (km²)
Empty cells                            16 300 (13.2)
Meridian 2 more detailed than OSM      75 977 (61.4)
OSM more detailed than Meridian 2      31 437 (25.4)
Total                                  123 714

Figure 4. Length difference between OpenStreetMap (OSM) and Ordnance Survey Meridian 2
datasets. Black: areas of good OSM coverage; grey: areas of poor OSM coverage. [Map of
England with 0-100 km scale bar; legend: total length difference (m): < -1, -1 to 1, > 1.]
Following this comparison, which inherently compares all the line features that are
captured in OSM, including footpaths and minor roads, a more detailed comparison
was carried out. This time, only OSM features that were comparable with the Meridian
dataset were included (for example, motorway, major road, residential).
Noteworthy is that this comparison moves into the area of attribute quality, as a
road that is included in the database but without any tag will be excluded. Furthermore,
the hypothesis that was noted above still stands: in any location in which the
OSM dataset has been captured completely, the length of OSM objects must be longer
than Meridian objects (see table 4).
Under this comparison, the OSM dataset provides coverage for 24.5% of the
total area that is covered by Meridian. Figure 5 provides an overview of the difference
in the London area.
4.4 Completeness: urban areas in London
Another way to evaluate the completeness of the dataset is by visual inspection
against another dataset. Similar to the method that was described above for
the detailed analysis of urban areas, 113 km² in London were examined visually to
understand the nature of the incompleteness in OSM. The five 1:10 000 raster tiles are
shown in figure 6, and provide a good cross-section of London from the centre to the
edge. Each red circle on the image indicates an omission of a detail or a major mistake
in digitising (such as a road that passes through the centre of a built-up area). Of the
five tiles, two stand out dramatically. The Highgate tile includes many omissions, and,
as noted in the previous section, also exemplifies sloppy digitisation, which impacts on
the positional accuracy of the dataset. As figure 7 shows, open spaces are missing, as
well as minor roads. Notice that some of the OSM lines are at the edge of the roads,
and some errors in digitising can be identified clearly. The Sutton tile contains large areas
where details are completely missing or erroneous; notice the size of the circles in
figure 6. A similar examination of the South Norwood tile, on the other hand, shows
small areas that are missing completely, while the quality of the coverage is high.
Table 4. Length comparison with attributes between OpenStreetMap (OSM) and Ordnance Survey
Meridian 2. Percentages of total are shown in parentheses.

Cells                                  Area (km²)
Empty cells                            17 632 a (14.3)
Meridian 2 more detailed than OSM      80 041 (64.7)
OSM more detailed than Meridian 2      26 041 (21.0)
Total                                  123 714

a The increased number of empty cells is due to the removal of cells that contain OSM
information on footpaths and similar features.
4.5 Social justice and the OSM dataset
Another measure of data collection is the equality of the areas in which the data are
collected. Following the principle of universal service, governmental bodies and organisations
such as the Royal Mail or the OS are committed to providing full coverage of
the country, regardless of the remoteness of the location or the socioeconomic status
of its inhabitants. As OSM relies on the decisions of contributors about the areas
where they would like to collect data, it is interesting to evaluate the level at which
deprivation influences data collection.
Figure 5. Length difference between OpenStreetMap and Ordnance Survey Meridian 2 for the
London area, including attributes indicating categories comparable with Meridian. [Legend:
total length difference (m): < -1, -1 to 1, > 1.]
Figure 6. Overview of the completeness of the OpenStreetMap and Ordnance Survey Meridian 2
comparisons across five areas in London.

Figure 7. Highgate, London. The light-coloured lines are OpenStreetMap features and the circles
and rectangles indicate omissions.
For this purpose, the UK government's ID 2007 was used. ID 2007 is calculated from a
combination of governmental datasets and provides a score for each SOA in England,
and it is possible to calculate the percentile position of each SOA. Each percentile
point includes about 325 SOAs. Areas that are in the bottom percentiles are the most
deprived, while those at the 99th percentile are the most affluent places in the UK.
Following the same methodology that was used for completeness, the road datasets
from OSM and from Meridian were clipped to each of the SOAs for the purpose of
comparison. In addition, OSM nodes were examined against the SOA layer. As
figure 8 shows, a clear difference can be seen between SOAs at the bottom of the scale
(to the left) and those at the top. While the most deprived areas are not neglected, their
level of coverage is far lower, even when taking into account the variability in SOA size.
Of importance is the middle area, where a hump is visible: this is due to the positioning
of most rural SOAs in the middle of the ranking, and therefore the total area is larger. However,
nodes provide only a partial indication of what is actually captured. A more accurate
collected. This can be carried out in two ways. Firstly, all the roads in the OSM dataset
can be compared with all the roads in the Meridian dataset. Secondly, a more detailed
scrutiny would include only lines with attributes that make them similar to Meridian
features, and would check that the name field is also completed, confirming that a
contributor physically visited the area, as otherwise they would not be able to provide
the street name. The reason for this is that only out-of-copyright maps can be used as
an alternative source of street names, and they are not widely used by most
contributors. Contributors are asked not to copy street names
from existing maps, due to copyright issues. Thus, in most cases the recording of a
street name is an indication of a physical visit to the location. These criteria reduce the
number of road features included in the comparison to about 40% of the objects in
the OSM dataset. Furthermore, to increase the readability of the graph, only SOAs
that fall within one standard deviation in area size were included. This removes the
hump effect of rural areas, where the SOA size is larger, and therefore allows for a
clearer representation of the information (see figure 9).

Figure 8. Number of nodes and area by Super Output Area (SOA), by the Department of
Communities and Local Government's Index of Deprivation 2007 percentile. The area of each
percentile, about 325 SOAs, is summed in km² (right-hand axis); the number of nodes (in
thousands) is shown on the left-hand axis.

Figure 9. Percentage of Ordnance Survey Meridian 2 coverage for OpenStreetMap (OSM) (all
roads) and OSM (named roads only), by the Department of Communities and Local Government's
Index of Deprivation 2007 percentile.
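Assuming per-SOA length totals have already been computed (as in the grid comparison above), the summaries behind figures 8 and 9 and table 5 reduce to a grouped average. In this pandas sketch the column names and values are invented.

    import pandas as pd

    # Hypothetical per-SOA totals: Meridian length, OSM length, OSM length for
    # named roads only, and the SOA's ID 2007 percentile (1-100).
    soas = pd.DataFrame({
        "meridian_m":  [12000, 8000, 15000, 9500],
        "osm_m":       [9000, 3000, 14000, 4000],
        "osm_named_m": [4000,  900,  7000, 1200],
        "id2007_pct":  [95, 4, 91, 8],
    })

    # Percentage of the Meridian length matched by OSM, per SOA.
    soas["all_roads_pct"] = 100 * soas["osm_m"] / soas["meridian_m"]
    soas["named_roads_pct"] = 100 * soas["osm_named_m"] / soas["meridian_m"]

    # Average coverage per deprivation percentile (cf. figure 9 and table 5).
    print(soas.groupby("id2007_pct")[["all_roads_pct", "named_roads_pct"]].mean())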
Notice that, while the datasets exhibit the same general pattern of distribution as in
the case of area and nodes, the bias towards more affluent areas is clearer, especially
between places at the top of the scale. As table 5 demonstrates, at the bottom of the
ID 2007 score the coverage is below the average for all SOAs, and in named roads there
is a difference of 8% between wealthy areas and poor areas. This bias is a cause of
concern as it shows that OSM is not an inclusive project, shunning socially marginal
places (and thus people). While OSM contributors are assisting in disaster relief and
humanitarian aid (Maron, 2007), the evidence from the dataset is that the concept of
`charity begins at home' has not been adopted yet. This indeed verifies Steve Coast's
declaration that ``Nobody wants to do council estates. But apart from those socioeconomic
barriers - for places people aren't that interested in visiting anyway - nowhere else gets
missed'' (GISPro, 2007, page 20).
Significantly, it is civic-society bodies such as charities and voluntary organisations
that are currently excluded from the use of commercial datasets due to costs. The
areas at the bottom of the ID 2007 are those that are most in need of assistance from
these bodies, and thus OSM is failing to provide a free alternative to commercial
products where it is needed most.

Table 5. Average percentage coverage by length in comparison with Ordnance Survey Meridian 2,
by the Department of Communities and Local Government's Index of Deprivation 2007
(ID 2007) percentile.

ID 2007 percentile      All roads (%)    Named roads (%)
1-10 (poor)             46.09            22.52
91-100 (wealthy)        76.59            30.21
Overall                 57.00            16.87
5 Spatial data quality and VGI
The analysis carried out has exposed many aspects of the OSM dataset, providing an
insight into the quality of VGI in general. The most impressive aspect is the speed at
which the dataset was collected: within a short period of time, about a third of the
area of England was covered by a team of about 150 participants, with minor help from
over 1000 others. The availability of detailed imagery is crucial, as can be seen from the
impact of high-resolution Yahoo! imagery (figure 4). The measures of positional accuracy
show that OSM information is of a reasonable accuracy of about 6 m, with a good
overlap of up to 100% of OS digitised motorways, A-roads, and B-roads. In places
where the participant was diligent and committed, the information quality can be,
indeed, very good.
This is an interesting aspect of VGI and demonstrates the importance of the
infrastructure, which is funded by the private and public sector and which allows
the volunteers to do their work without significant personal investment. The GPS
system and the receivers allow untrained users to acquire their position automatically
and accurately, and thus simplify the process of gathering geographic information. This
is, in a way, the culmination of the process in which highly trained surveyors were
replaced by technicians through the introduction of high-accuracy GPS receivers in the
construction and mapping industries over the last decade. The imagery also provides
an infrastructure function: the images were processed, rectified, and georeferenced by
experts and, thus, an OSM volunteer who uses this imagery for digitising benefits from
the good positional accuracy which is inherent in the image. So the issue here is not to
compare the work of professionals and amateurs, but to understand that the amateurs
are actually supported by the codified professionalised infrastructure and develop their
skills through engagement with the project.
At the same time, the analysis highlights the inconsistency of VGI in terms of its
quality. Differences in digitisation, from fairly sloppy in the area of Highgate to
consistent and careful in South Norwood, seem to be part of the price that is paid
for having a loosely organised group of participants. Figure 2 shows a range of
performances in the comparison using A-roads and B-roads, and there is no direct
correlation between the length of the road and the quality of its capture.
This brings to the fore the statement that Steve Coast has made, regarding the need
to forgo the concept of completeness. Indeed, it is well known that update
cycles, cartographic limitations such as projections, and a range of other issues
lead to uncertainty in centrally collected datasets such as those created by the OS
or the United States Geological Survey (Goodchild, 2008). Notice, for example, that
the fact that Meridian is generalised and does not include minor roads does not
diminish its usability for many GIS-analysis applications. Moreover, with
OSM, in terms of dealing with incompleteness, if users find that data are missing or
erroneous, they do not need to follow a lengthy process of error reporting to the data
producer and wait until it provides a new dataset; rather, they can fix the dataset
themselves and, within a few hours, use the updated and more complete dataset
(see OSM, 2008).
Yet, there are clear differences between places that are more complete and areas
that are relatively empty. A research question that is emerging here is about the
usability of the information: at which point does the information become useful for
cartographic output and general GIS analysis? Is there a point at which the coverage
becomes good enough? Or is coverage of main roads, in a similar fashion to Meridian,
enough? If so, then OSM is more complete than was stated above: likely at
about 50%. These are questions that were partially explored during the 1980s and 1990s
in the spatial data quality literature, but a more exploratory investigation of these
issues for VGI is necessary.
Another core question that the comparison raises, and which Goodchild's (2008)
discussion is hinting at, is the difference between declared standards of quality, and
ad hoc attempts to achieve quality. In commercial or government-sponsored products,
there are defined standards for positional accuracy, attribute accuracy, completeness,
and other elements that van Oort (2006) listed. Importantly, the standard does not
mean that the quality was achieved for every single object: for example, a declaration
that a national dataset is, on average, within 10 m of the true location means that
some objects might be as far as 20 m from their true location. All
we have is a guarantee that, overall, map objects will be within the given range and this is
based on trust in the provider and its quality-assurance procedures.
In the case of VGI, there is clear awareness of the quality of the information: for
example, the Cherdlu (2007) presentation on quality at the first OSM community
conference; the introduction in June 2008 of OpenStreetBugs, a simple tool that
allows rapid annotation of errors and problems in the OSM dataset; and the development
by Dair Grant, an OSM contributor, of a tool (refNum software) to compare
described in section 4.4. However, in the case of OSM, unlike Wikipedia or Slashdot
(a popular website for sharing technology news), there is no integrated quality-assurance
mechanism that allows participants to rate the quality of the contribution of other
participants. This means that statements about accuracy, such as the one discussed
here, come with a caveat. Looking at figure 2, the following statement can be formulated:
`you can expect OSM data to achieve a positional overlap of over 70%, with
occasional drops down to 20%'. In terms of overall quality, this might lead to results
that are not dissimilar to commercial datasets, apart from a very significant difference:
while our expectation from the commercial dataset is that errors will be randomly
distributed geographically, sections 4.2 and 4.4 highlighted the importance of the
specific contributor to the overall quality of the information captured in a given
area. Therefore, the errors are not randomly distributed. This raises a question about
the ways in which it is possible to associate individual contributors with some indication
of the quality of their outputs. Another interesting avenue for exploration is
emerging from the principle of Open Source software development, which highlights
the importance of `given enough eyeballs, all bugs are shallow' (Raymond, 2001,
page 19). For mapping, this can be translated as the number of contributors that
worked on an area and therefore removed `bugs' from it. Is this indeed the case?
Do areas that are covered by multiple contributors exhibit higher positional and
attribute quality?
The analysis also highlighted the implications of the digital and social divide for
VGI. Notice the lack of coverage in rural areas and poorer areas. Thus, while Goodchild
(2007b, page 220) suggested that ``the most important value of VGI may lie in what it
can tell about local activities in various geographical locations that go unnoticed by the
world's media, and about life at a local level'', the evidence is that places that are
perceived as `nice places', where members of the middle classes have the necessary
educational attainment, disposable income for equipment, and availability of leisure
time, will be covered. Places where population is scarce or deprived are, potentially,
further marginalised by VGI exactly because of the cacophony created by places which
are covered.
There are many open questions that this preliminary investigation could not cover.
First, within the area of spatial data quality there are several other measures of quality
that were mentioned in section 2, and that are worth exploring with VGI, including
logical consistency, attribute accuracy, semantic accuracy, and temporal accuracy. The
temporal issue is of special interest with VGI; due to the leisure-activity aspect of the
involvement in such projects, the longevity of engagement can be an issue as some
participants can get excited about a project, collect information during the period
when the map is empty, and then lose interest. OSM is still going through a period of
rapid growth and the map is relatively empty, so this problem has not arisen yet.
However, the dataset allows this issue to be investigated, and there is a need to explore
the longevity of engagement. It is important to note that many other commons-based
peer-production projects are showing the ability to continue to engage participants
over a long period, as shown by the Apache web server, which has been under development
for almost fifteen years, or Wikipedia, which continues to engage its participants after
eight years.
This preliminary study has shown that VGI can reach very good spatial data
quality. As expected in this new area of research, the analysis opens up many new
questions about quality: from the need to explore the usability of the data, to the
consistency of coverage and the various elements of spatial-data quality. Because the
activity is carried out over the Internet, and in projects like OSM the whole
process is open to scrutiny at the object level, it can offer a very fruitful ground for
further research. The outcome of such an investigation will be relevant beyond VGI, as
it can reveal some principles and challenge some well-established assumptions within
GIScience in general.
Acknowledgements. The author expresses thanks to Patrick Weber, Claire Ellul, and especially
Naureen Zulfiqar and Aamer Ather who carried out part of the analysis of motorways as part of
their M Eng project. Some details on geographical information quality are based on a report on
the Agency Performance Monitor that was prepared for the OS in 2005. Some early support for
OSM research was received from the Royal Geographical Society Small Research Grants programme
in 2005. Apologies are expressed to the OSM system team for overloading the API server
during this research. The author is grateful to Steven Feldman, Steve Coast, Mike Goodchild,
Martin Dodge, and four anonymous reviewers who provided comments on an earlier version
of this paper. Thanks to the OS External Research and University Liaison who provided the
Meridian 2 dataset for this study in January 2008. The 1:10 000 raster and the SOA boundaries
were provided under EDINA/Digimap and EDINA/UKBorders agreements. ID 2007 was downloaded
from DCLG and is Crown copyright. All maps are Ordnance Survey © Crown copyright
(Crown copyright/database right 2008, Ordnance Survey/EDINA supplied service, and Crown
copyright/database right 2008). OSM data were provided under Creative Commons and attributed
to OSM. All rights reserved.