Shaping participatory public data infrastructure in the smart city: open data standards and the turn to transparency
Introduction
One of the earliest expressions of the modern open data movement evolved out of the CitiStat analytical dashboard in Washington DC. In 2007, seeking to allow actors outside the administration to work with city data, the DC chief technology officer launched data.dc.gov, a data portal offering direct access to machine-readable government datasets (Tauberer, 2014, Ch. 1). Over the following decade, the idea that governments should publish their data holdings as machine-readable, freely accessible and openly licensed data has taken hold across the world at national and sub-national levels (Davies et al., 2019). Open data ideas have found particular traction in urban areas, initially connecting with a culture and practice of civic hacking (Landry, 2019), and open data has also acted as a component of a number of smart cities programmes, both rhetorically and substantively. Scholars have been particularly interested in the potential uses of open data to support democratic engagement and collaborative models of governance in the smart city (Bartenberger & Grubmüller-Régent, 2014; Goldsmith & Crawford, 2014). Yet open data communities have also been the source of a number of critical perspectives on the smart city: questioning centralisation and corporate control of urban data infrastructure, and challenging the presentation of a rationalised urban domain filled with consumers and service recipients, rather than a rich urban environment of diverse citizens, political struggles and lives only partially digitised (e.g. Sadoway & Shekhar, 2014).
One source of this dual role of open data, in both enabling and opposing smart city narratives, can be traced to the complex origins of the open data movement, which brought together both public sector information businesses and civil society activists, united by a common cause in gaining access to government data, but ultimately motivated by divergent long-term goals (Gonzalez-Zapata & Heeks, 2015; Gray, 2014). These unusual allies ranged from those seeking transparency, accountability and new forms of civic engagement (Davies, 2010; Huber & Maier-Rabler, 2012; Kassen, 2013; Sieber & Johnson, 2015), to those looking to develop new business models and promote the idea of ‘government as a platform’, outsourcing many more aspects of service provision to the private sector (Gurin, 2014).
In this chapter, my goal is to explore open data-related strategies for re-asserting the role of citizens within the smart city. Such strategies are able to draw in particular on the political narrative of transparency, and on the role of technical standards in delivering transparency. I will outline how these two components can be used not only to secure access to data from government, but also to open up two-way communication channels between citizens, states and private providers. I will argue that a focus on opening up the data infrastructures of the smart city not only offers the opportunity to make processes of governance more visible and open to scrutiny, but also creates a space for debate over the collection, management and use of data within governance. This can give citizens an opportunity to shape the data infrastructures that do so much to shape the operation of smart cities and of the modern data-driven policy environment.
The chapter proceeds in four parts, the first three unpacking different aspects of the title, and the fourth offering a model for thinking about the relationship between transparency, open data and standards in the future development of inclusive and participatory data practice in the smart city.
Participatory public data infrastructure
Data infrastructure
Infrastructures provide the shared set of physical and organisational arrangements upon which everyday life is built. The notion of infrastructure is central to conventional imaginations of the smart city. Fibre-optic cables, wireless access points, cameras, control systems and sensors embedded in just about anything constitute the digital infrastructure that feeds into new, more automated, organisational processes. These, in turn, direct the operation of existing physical infrastructures for transportation, the distribution of water and power, and the provision of city services. However, between the physical and the organisational lies another form of infrastructure: data and information infrastructure.
Although in the literature the term ‘information infrastructure’ is often used to cover both data and information, I use the two terms separately here to draw attention to an important analytical distinction. The General Definition of Information (GDI) describes information as ‘data + meaning’ (Floridi, 2004). Information, as the basis for human decision-making, requires data that is filtered, organised and contextualised. Data, by contrast, is, in its purest form, decontextualised, with each individual aspect of a phenomenon encoded as a distinct data point, open to be re-assembled and represented as information, but also open to a range of different representations and forms of analysis. It is this re-interpretability that gives digital data its particular value. Of course, in practice a digital dataset rarely encodes all the possible variables that describe a phenomenon; instead, certain features of the world are selected for encoding and others discarded. Even with growing data storage and processing capacity, the need for this explicit or implicit selection is not avoided.
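The distinction between data and information can be illustrated with a minimal sketch. Everything in it is a hypothetical assumption made for illustration (the records, the field names and the two summary functions); none of it comes from a real city dataset. It shows the same decontextualised data points being re-assembled into two different pieces of information, and it also shows that only the features selected for encoding are available for any later re-interpretation.

```python
from collections import Counter

# Hypothetical, decontextualised data points: each record encodes a few
# selected features of a reported street problem. Features that were not
# selected for encoding (who reported it, or why) are simply absent.
REPORTS = [
    {"ward": "North", "category": "pothole", "days_open": 12},
    {"ward": "North", "category": "streetlight", "days_open": 3},
    {"ward": "South", "category": "pothole", "days_open": 30},
    {"ward": "South", "category": "pothole", "days_open": 18},
]


def reports_per_ward(records):
    """One representation: how many reports each ward generated."""
    return dict(Counter(r["ward"] for r in records))


def mean_days_open_by_category(records):
    """Another representation: how long each kind of problem stays open."""
    days_by_category = {}
    for r in records:
        days_by_category.setdefault(r["category"], []).append(r["days_open"])
    return {cat: sum(days) / len(days) for cat, days in days_by_category.items()}


if __name__ == "__main__":
    print(reports_per_ward(REPORTS))
    print(mean_days_open_by_category(REPORTS))
```

Neither summary is the ‘correct’ reading of the records. Each adds a different context, and so a different meaning, to the same underlying data.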
It is by being rendered as structured data that signals from the myriad sensors of the smart city, or the submissions by hundreds of citizens through reporting portals, are turned into management information and fed into human- or machine-based decision-making, and back into the actions of actuators (Dunleavy et al., 2006) within the city. Seen as a set of physical or digital artefacts, the data infrastructure of a city involves ETL (Extract, Transform, Load) processes, APIs (Application Programming Interfaces), databases and data warehouses, stored queries and dashboards, schema, codelists and standards. Seen as part of a wider ‘data assemblage’ (Kitchin & Lauriault, 2014) this data infrastructure also involves various processes of data entry and management (Denis & Goeta, 2017; Goeta & Davies, 2016), of design, analysis and use, as well as relationships to other external datasets, systems and standards. Dodds and Wells capture this by defining data infrastructure to incorporate not only data assets, such as datasets, identifiers and registers, but also the organisations and organisational processes used to provide access to those assets (Dodds & Wells, 2019).
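To make some of these artefacts concrete, the following is a minimal sketch of an ETL process with a codelist, a database table and a stored query. It is an assumption-laden illustration, not a description of any real city system: the CSV export, the codelist values, the table layout and all function names are hypothetical. It uses only the Python standard library.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a citizen reporting portal.
RAW_CSV = """id,category,reported
1,Pot hole,2019-03-01
2,street light,2019-03-02
3,graffiti,2019-03-02
"""

# Hypothetical codelist: the closed set of categories the system recognises.
CODELIST = {"pothole", "streetlight"}


def extract(text):
    """Extract: parse the raw CSV export into one dictionary per row."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise categories and check them against the codelist."""
    accepted, rejected = [], []
    for row in rows:
        code = row["category"].lower().replace(" ", "")
        if code in CODELIST:
            accepted.append((int(row["id"]), code, row["reported"]))
        else:
            rejected.append(row)
    return accepted, rejected


def load(records):
    """Load: write the accepted records into an in-memory database table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reports (id INTEGER PRIMARY KEY, category TEXT, reported TEXT)"
    )
    conn.executemany("INSERT INTO reports VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def counts_by_category(conn):
    """A stored query of the kind that might sit behind a dashboard."""
    cursor = conn.execute(
        "SELECT category, COUNT(*) FROM reports GROUP BY category ORDER BY category"
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    accepted, rejected = transform(extract(RAW_CSV))
    print(counts_by_category(load(accepted)))
    print(rejected)
```

In this sketch the report whose category is missing from the codelist never reaches the table, so it cannot appear in any dashboard built on the stored query. A design decision as small as the contents of a codelist therefore determines what becomes visible as management information.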
It is, however, often very hard to ‘see’ data infrastructure. By their very nature, infrastructures move into the background, often only ‘visible upon breakdown’ (Star & Ruhleder, 1996). For example, you may only really pay attention to the shape and structure of the road network when your planned route is blocked. It takes a process of ‘infrastructural inversion’ to bring information infrastructures into view (Bowker & Star, 2000), deliberately foregrounding what has so far been the background. I will argue shortly that ‘transparency’ as a policy performs much the same function as ‘breakdown’ in making the contours of infrastructure more visible. In taking something created with one set of use-cases in mind, and placing it in front of a range of alternative use-cases, transparency allows the affordances and limitations of a data infrastructure to be more fully scrutinised. Such critical scrutiny can then feed into shaping the future development of that infrastructure. But before developing that argument further, I will first outline the extent of ‘public data infrastructure’ and the different ways in which we might understand the idea of a ‘participatory public data infrastructure’.