1 Introduction

The data-driven transformation of policing or smart policing has not received much attention in digital government research. Current approaches have concentrated on policing using social media sources [1, 2] and, to a lesser extent, on the predictive capabilities of new tools to enable police authorities to better allocate resources [3]. New forms of data are becoming the cornerstone of policing in many more transformative ways that remain less explored. These new applications at the intersection of data and policing include crime analytics, predictive crime profiling, network analysis of criminal gangs and intelligence using a variety of new sources of data.

Beyond the emphasis on new forms of data, smart policing is a newly established concept that encompasses innovative strategies, collaborations, interagency sharing and research partnerships to exploit data-driven approaches and digital tools [4, 5]. As with business and public policy applications of big data and data analytics, smart policing presumes the integration of various sources from which raw data can be converted into meaningful and actionable information [6]. Large volumes of data offer great potential for policing, but critical questions arise as to how policing has been transformed through the use of data. No review has yet synthesised the evidence on smart policing to provide an integrated perspective and establish the connections between data applications and the strategies and functions of policing. New data-driven applications have been reshaping the landscape of smart policing rapidly in recent years, making the need for a contemporary approach even more evident.

The systematic review presented in this paper seeks to achieve the above aims while focusing on the public management implications of smart policing. Following a more detailed background, we explain how the review was conducted and present the findings structured around four main themes that correspond to the four main sources of data in smart policing. The contribution is summarised in the form of an analytical framework that outlines the main approaches to smart policing and leads to implications for future research on the topic.

2 Conceptual Background on Smart Policing

Advanced statistical techniques and large data streams from multiple sources have long been used by police authorities. Starting from the digitisation of police records in the 1970s, most US police departments were using computers and databases for operational allocations as well as strategic decisions by the early 1990s [7]. A significant milestone of data-driven police management came in the 1990s with the use of CompStat and Crime Analysis [8, 9]. CompStat was deployed both as an accountability tool and as an initial approach to operational management via problem-oriented policing, which advocates the use of data to identify crime hotspots [8, 9]. While used to evaluate the performance of operational commanders, CompStat raised the first issues of data quality and validity, leading to more rigorous data collection practices by field officers. Following its great success in controlling crime in New York, CompStat was adopted throughout the US, while crime analysis in the form of hotspot policing diffused throughout the western democracies by the early 2000s.

More recently, the full range of technologies associated with big data, analytics and new data sources have been introduced in the context of policing. Two major data-related developments are the automated generation of data from surveillance technologies and crowdsourced data from social networks and information sharing sites [10]. Police authorities have been building their expertise on surveillance and monitoring technologies in the wake of 9/11, for example, CCTVs, smart phones, automated number plate readers, dash-board cameras and body-worn video cameras [11, 12]. These real-time sensors can continuously collect large amounts of data in urban environments on the movement of people, traffic flows and violations and changes in environmental indicators like heat, sound or pollution [6].

Crowdsourced data from open information sources – mainly social media – have resulted in new sources of big data streams of text, photos and videos that can be translated into factual information as well as collective opinions, emotions, perceptions and reactions regarding any place, person, current and future events [13]. For example, over 18 terabytes of data were collected and analysed by the French police in the 2015 Paris attacks [14]. In the 2011 riots in the UK, social media data were used as an ample and cost-effective source of intelligence during extensive public disorder [1, 15].

These and several more critical events and case studies have demonstrated the practical relevance of data in policing. However, many questions remain about how large volumes of new data sources have changed policing. One of the reasons is that the literature on policing is fragmented either across the different data applications or policing approaches and terminologies like hotspot policing, problem-oriented policing, intelligence-led policing and predictive policing [16]. Other reviews related to the topic have focused on the emerging area of predictive policing [3] or the use of social media by police authorities [2]. We aim to further develop the concept of smart policing as the synthesis of how policing is changing with smart use of data and assess the whole range of applications and potential impacts of data in policing. To synthesise the literature and provide new insights, we view data applications across the core functions of policing that have been established in the relevant literature as: (1) order maintenance (suppressing riots, preserving public peace and protecting morals); (2) law enforcement (crime prevention and investigation, traffic enforcement, stop-and-frisk operations, raids, patrols, arrests, searches and seizures etc.); and (3) service provision (police officer working as disaster manager, probation worker, social worker, school worker, security manager etc.) [17].

3 Methodology

The systematic review was conducted to collate the empirical evidence surrounding different approaches to data and policing in a systematic, transparent and auditable manner [18]. Searches were conducted in the three most prominent social science databases (EbscoHost, Scopus and Web of Science) using a combination of domain and concept terms. The concept terms covered data-related keywords while the domain terms ensured the relevance of the articles to policing. The following terms were used to search within title, abstract and article keywords: (police OR policing) AND (smart OR intelligence OR algorith* OR predictive OR analytics OR “data science” OR “data driven” OR “big data” OR AI OR “social media”). The search was limited to peer-reviewed academic articles published in English within 2009–2019.
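The boolean logic of the search string can be illustrated with a minimal Python sketch. It is not how EbscoHost, Scopus or Web of Science evaluate queries; the function names `term_pattern` and `matches_query` are hypothetical, and matching is assumed to be case-insensitive on whole words, with `*` treated as a truncation wildcard.

```python
import re

# Terms copied from the search string above.
DOMAIN_TERMS = ["police", "policing"]
CONCEPT_TERMS = [
    "smart", "intelligence", "algorith*", "predictive", "analytics",
    "data science", "data driven", "big data", "AI", "social media",
]


def term_pattern(term: str) -> re.Pattern:
    """Compile one search term into a case-insensitive whole-word regex."""
    words = term.rstrip("*").split()
    # Assumption: words of a phrase may be joined by whitespace or a hyphen.
    body = r"[\s-]+".join(re.escape(w) for w in words)
    if term.endswith("*"):
        body += r"\w*"  # truncation: algorith* matches algorithm, algorithmic
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


def matches_query(text: str) -> bool:
    """Return True if text satisfies (domain terms) AND (concept terms)."""
    domain_hit = any(term_pattern(t).search(text) for t in DOMAIN_TERMS)
    concept_hit = any(term_pattern(t).search(text) for t in CONCEPT_TERMS)
    return domain_hit and concept_hit


if __name__ == "__main__":
    for title in ["Predictive policing in practice", "Police patrol staffing"]:
        print(title, "->", matches_query(title))
```

In this sketch a record is retained only when its title, abstract or keywords contain at least one domain term and at least one concept term, which mirrors the AND between the two bracketed groups.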

A total of 4,502 articles were initially obtained and reduced to 2,856 after removing duplicates across the databases. A further 2,366 articles were manually removed after the titles and abstracts were screened for relevance to the use of data in policing. The remaining 490 articles were further screened by reading the full text against the following criteria: (1) a clear focus on data uses within police departments or agencies; (2) at least some empirical data or application context provided, so that studies relying only on conceptual and theoretical foundations were excluded; and (3) direct relevance to data aspects, so that studies focusing only on social media communication and digital engagement were excluded. The final selection included 112 articles for the full review.
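The screening flow can be checked with simple arithmetic. The short Python sketch below uses only the counts reported above; the variable names are illustrative, and the numbers of duplicates and full-text exclusions are derived by subtraction rather than reported in the text.

```python
# Counts reported in the text.
identified = 4502              # records retrieved from the three databases
after_dedup = 2856             # records left after removing duplicates
removed_title_abstract = 2366  # removed at title and abstract screening
included = 112                 # articles in the final review

# Derived counts (not stated explicitly in the text).
duplicates_removed = identified - after_dedup
full_text_screened = after_dedup - removed_title_abstract
excluded_full_text = full_text_screened - included

print(f"Duplicates removed:      {duplicates_removed}")
print(f"Full texts screened:     {full_text_screened}")
print(f"Excluded at full text:   {excluded_full_text}")
print(f"Included in the review:  {included}")
```

The derived figure of 490 full texts matches the number reported in the text, so the stages are internally consistent.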

Initially, the articles were analysed structurally on the basis of author, title, year, journal, context, article type (empirical or conceptual), area of research focus, research question, methods and level of analysis. The articles were then openly coded with NVivo 12 for sources of data, analytical techniques (including applications and products), policing activities and outputs. Themes were compared and grouped as follows: (1) sources of data under Kitchin’s [19] typology; (2) analytical techniques, grouped inductively by their roles in policing; (3) policing activities under Braga & Schnell’s [20] intervention dimensions (environmental, enforcement and community outreach); and (4) outputs under core policing functions (order maintenance, law enforcement, service provision) [17]. These four analytical dimensions were integrated to develop the framework shown in Fig. 1 and to identify research gaps, as discussed in the next sections.

4 Findings

Data-driven policing has recently sparked major interest with 40% of the articles appearing in 2018 or 2019. The articles were published in various academic disciplines mostly concentrated in policing and criminology journals (53) followed by sociology and culture (22), information technology and information systems (16), public policy and administration (3) and others (18). The majority of application contexts are concentrated in the US (43 articles) followed by the UK (19), Europe (18), Canada (10), Australia (6) and Asia/Africa (6). The studies include qualitative (71) and quantitative (41) research methods with over half of the latter (23) using data from the US.

The framework of smart policing shown in Fig. 1 maps the different dimensions starting from the three types of data in policing. Directed data are intentionally captured either by police officers (crime and disorder records, offender profiles, service calls, community intelligence, warrants and summons) or by other public and private organizations (weather patterns, socio-demographics, lighting, infrastructure conditions, school schedules, traffic patterns, emergency calls). Automated data are inherently collected by a device or system used by the police or the public, such as smart CCTVs, police body-worn cameras, dash-board cameras, automatic number plate readers (ANPR), smart phones, smart cards and environmental sensors [6]. Crowdsourced data are created through open source devices, platforms or systems, with the greatest volume generated by social networking sites like Facebook, Twitter and Instagram [13].

Fig. 1. Framework of smart policing

Various analytical approaches have been adopted to extract information from these data for crime construction, crime sensing, automated surveillance and automated tension monitoring, as explained below and summarized in Table 1. These emerging “smart roles” of data are transforming the policing functions of law enforcement and order-maintenance and can have positive as well as adverse impacts on public confidence. Surprisingly, the role of data vis-à-vis the third core function of service provision remains unexplored in the literature and is, accordingly, not reflected in the framework.

4.1 Crime Construction

Crime construction employs directed data to proactively visualize crime incidents and enable prioritising targets and reallocating resources for crime prevention. We distinguish two main applications based on exploratory (e.g. spatial analysis, time-series analysis and graph theory) and predictive analytics (machine learning and other forecasting techniques) in accordance with policing literature [16].

Exploratory analytics include crime mapping, criminal profiling and gang analysis. Exploratory crime mapping uses primary directed data (reported crime and disorders events or calls for service) to map out areas into hotspots with high victimization risk or crime density [21]. Exploratory profiling employs the primary offender data to identify the most harmful offenders to create deterrence by conditioning police surveillance and arrest efforts. For example, Philadelphia police estimates the harm scores of violent crime offenders by aggregating the weighted scores of static factors like gravity of offence, severity of sentence and time decay [22]. In the case of gangs, network analysis is a better approach to identify influential targets, for instance, using capital scores and other network metrics [23]. However, network analysis suffers on account of missing links arising from untraced and unreported cases.
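The idea of aggregating weighted static factors into an offender harm score can be sketched as follows. This is not the formula used by the Philadelphia police, which the text does not specify; the weights, the 1 to 10 gravity scale, the use of sentence length as a severity proxy and the exponential half-life decay are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Offence:
    gravity: float         # hypothetical 1-10 rating of offence gravity
    sentence_years: float  # sentence length, used here as a severity proxy
    years_ago: float       # time elapsed since the offence


def harm_score(offences, w_gravity=1.0, w_sentence=0.5, half_life_years=5.0):
    """Sum weighted offence scores, discounting older offences.

    All weights and the half-life are hypothetical parameters.
    """
    total = 0.0
    for o in offences:
        # Time decay: an offence's contribution halves every half_life_years.
        decay = 0.5 ** (o.years_ago / half_life_years)
        total += (w_gravity * o.gravity + w_sentence * o.sentence_years) * decay
    return total


if __name__ == "__main__":
    history = [
        Offence(gravity=8, sentence_years=4, years_ago=0),
        Offence(gravity=8, sentence_years=4, years_ago=5),
    ]
    print(harm_score(history))
```

With these assumed parameters, two identical offences contribute unequally: the one committed five years ago counts for half as much as the recent one, which is the role time decay plays in such scores.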

Evidence on the effectiveness of exploratory analytics is based mainly on aggregate crime statistics. Massarotti’s [41] longitudinal and Rydberg’s [42] quasi-experimental evaluations of crime mapping approaches to patrolling found no significant decrease in property crime (burglary, motor vehicle theft and robbery) and violent crime respectively. A range of other studies have reported decreases in violent crime, property crime, gun violence, service calls and other indicators at varying levels [20, 43, 44]. However, these studies have not yet applied more robust methods like balanced random designs, and their effects are usually reported only in hotspot areas.

Predictive analytics include predictive crime mapping and profiling. Predictive crime mapping applies forecasting models to micro-locations of crime called prospective or dynamic hot spots [24]. The commonly used techniques are near-repeat models (based on spatio-temporal features of crime data) and risk terrain modelling (spatial risk zones based on socio-demographic features) [25]. Mapping and database extraction tools that apply these techniques have usually been developed through police-private collaboration (e.g. PredPol, PRECOBs-Enterprise, HunchLab, BlueCRUSH) [25, 26].
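The near-repeat idea, that risk is temporarily elevated close to recent crimes, can be illustrated with a minimal Python sketch. It does not reproduce the algorithm of PredPol, HunchLab or any other product named above; the function name `near_repeat_risk`, the linear weighting and the 200 metre and 14 day windows are assumptions chosen only to make the example concrete.

```python
import math


def near_repeat_risk(cell, now_day, events, max_dist=200.0, max_days=14.0):
    """Score a location by recent nearby crime events.

    cell:   (x, y) coordinates of the location in metres
    events: iterable of (x, y, day) tuples for past crimes
    Each event within max_dist metres and max_days days contributes a weight
    that falls linearly to zero at the edge of either window (an assumption).
    """
    cx, cy = cell
    risk = 0.0
    for x, y, day in events:
        dist = math.hypot(x - cx, y - cy)
        age = now_day - day
        if dist <= max_dist and 0 <= age <= max_days:
            risk += (1 - dist / max_dist) * (1 - age / max_days)
    return risk


if __name__ == "__main__":
    # Hypothetical burglary records: (x metres, y metres, day of occurrence).
    events = [(0, 0, 10), (100, 0, 3), (1000, 1000, 10)]
    cells = [(0, 0), (1000, 1000), (5000, 5000)]
    ranked = sorted(cells, key=lambda c: near_repeat_risk(c, 10, events),
                    reverse=True)
    for c in ranked:
        print(c, round(near_repeat_risk(c, 10, events), 2))
```

Ranking cells by such a score gives a simple prospective hot spot list; operational tools would calibrate the distance and time windows on historical data rather than fix them by hand.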

Table 1. Summary of new roles of data in smart policing

Predictive profiling incorporates static and dynamic risk factors to predict future offending tendency of a person by calculating their predictive risk scores. Such automated risk-assessment tools include ProKid (Dutch police), HART (Durham Constabulary, UK) and Strategic Subjects List (Chicago police) [27,28,29].

Evaluations of predictive mapping and profiling have shown mixed effects [25]. Mohler [24] found a 7.4% reduction in violent and property crime through directed patrolling in predicted hotspots as compared to exploratory hotspots. Brantingham [45] found no evidence of racial bias in arrest patterns, while Saunders [29] found a 6.8% increased crime risk under the enforcement intervention of the Strategic Subjects List. In addition, the literature posits an increase in police craft due to increased citizen interaction through directed patrolling in previously untraversed areas [46].

4.2 Crime Sensing

Crime sensing involves the proactive gathering of community intelligence from public and social media to develop signals of crime and disorder to design appropriate strategies through resource reallocation and involvement of other stakeholders beyond relying on public complaints.

The signal crimes perspective (SCP) employs community intelligence to develop an area signal profile based upon the crime and disorder events affecting public perceptions [30, 31]. The data are proactively collected with computer-assisted personal interviewing software (like i-NSI) and geocoded onto neighbourhood maps to develop area signal profiles comprising crime, fear-and-avoidance and societal-tension hotspots [30, 31]. Using SCP, the Lancashire police (UK) detected signals of antisocial behaviour, tensions between resident groups and a range of other violations [30]. The Sutton police’s (London) use of SCP-based spatio-temporal reallocation of foot and vehicle patrols resulted in a 28% decrease in signal events in the risk perception hot spots [31].

Social media users also act as sensors of crime while behaving as victims, witnesses or offenders. An analysis of London-tagged Twitter data in 2013–2014 revealed that the tweet frequency and broken windows indicators were significantly associated with reported burglary, criminal damage and violence [47]. The ePOOLICE system is another example of identification of crime patterns of human smuggling and drug trafficking through integration of police and social media data [32].

4.3 Automated Surveillance

Automated surveillance applies data mining tools and visual analytics to automated and crowdsourced data for law enforcement through real-time monitoring and for prosecution through generation of new forms of evidence. The literature has identified four new approaches of automated surveillance:

Sensor alerting uses video analytics, optical character recognition and acoustics correlations for generating alerts from CCTVs, environmental sensors, ANPRs and shotspotters [6]. Using sensor alerting, the Camden Police (New Jersey, USA) managed to reduce response times by half and overall crime by 40% between 2013 and 2014 [33]. The use of smart CCTV for traffic enforcement in Shanghai significantly increased seat belt compliance from 60.8% to 84.9% between 2015 and 2017 [34].

Automated pattern recognition produces time-and-place and routing patterns which help predict the future locations of suspected or under-watch vehicles [6]. This has helped New York police to trace court absconders, prevent kidnappings and arrest chronic robbers red-handed [6, 35].

Radio-cell analysis is used to mine mobile phone connection data to identify suspicious connections involved in rioting and heinous crimes (murder and kidnapping) [36]. Human investigators further probe these suspicious connections for establishing links with crime or rioting. The German police used one such application to sort out and charge 379 suspicious individuals from 153,622 connections during the Dresden riots (2011) [36].

Social mapping involves the extraction and analysis of social media data related to a suspected individual, group or event during criminal investigations. The data are extracted through manual searching, automated searching (through web crawlers like NiceTrack, EU Virtuoso, RIOT), lawful interception (using deep packet inspection) and targeted interception (installing a Trojan on the targeted device) [32, 37]. The data are analysed through: temporal mapping (like Facebook timeline); facial mapping (photos and images in conjunction with facial recognition technology using data mining tools); relational mapping (connection and communication patterns); and sentiment mapping (the language in posts and responses) [37]. For instance, the Vancouver police collaborated with the Insurance Corporation of British Columbia to use facial recognition technology for online identification of suspects involved in riots (2011) [37].

4.4 Automated Tension Monitoring

Automated tension monitoring applies tension monitoring analytics to crowdsourced data, mainly social media, for assessing communal tension level and identification of groups and actors involved in sparking social unrest and hatred.

Tension monitoring analytics combine data mining tools (like natural language processing, NLP) and machine learning techniques (like double deep learning and formal concept analysis) [13, 38, 39, 40]. The Cardiff Online Social Media Laboratory (COSMOS), an online tension monitoring engine that uses the NLP techniques of membership categorization analysis and conversation analysis, measured the community tension following a racial abuse incident between two football players, Suarez and Evra (2011), with an overall accuracy of 0.87 and spike accuracy of 0.97 [12]. Other data mining tools include Naive Bayes and Support Vector Machines, which are not able to consistently detect tension spikes [38]. ATHENA, a European crisis management system, uses formal concept analysis for clustering of data and NLP for measuring the sentiment score as positive, negative or neutral [40]. HaterNet, a Spanish hate speech detection system, uses double deep learning for clustering, NLP for classification of tweets and social network analysis for pattern, actor and community identification through word clouds, ranked graphs and clustering respectively [39]. EMOTIVE, a UK project, uses NLP to assess the mood of the nation from Twitter data [32].
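The three-way sentiment labelling mentioned above can be illustrated with a deliberately simple lexicon-based sketch in Python. This is a swapped-in technique for illustration only and is not how ATHENA, COSMOS or any of the systems above work; the word lists and the function name `sentiment_label` are hypothetical.

```python
# Tiny hypothetical lexicons; real systems use trained NLP models.
NEGATIVE = {"hate", "attack", "riot", "angry"}
POSITIVE = {"calm", "peace", "thanks", "safe"}


def sentiment_label(post: str) -> str:
    """Label a post as positive, negative or neutral by counting lexicon hits."""
    words = [w.strip(".,!?").lower() for w in post.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for post in ["Angry crowds, riot downtown!",
                 "Streets are calm and safe",
                 "Match starts at noon"]:
        print(post, "->", sentiment_label(post))
```

A tension monitor built on such labels would track the share of negative posts over time and flag sudden rises, although word counting of this kind misses sarcasm, context and coded language, which is one reason the systems above rely on richer NLP and machine learning methods.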

5 Discussion and Conclusion

The systematic review substantiates the significance of data as a central force in smart policing, highlights the major analytical approaches and outlines their smart roles. Together with the framework, the review helps to confine the opaque smart policing terminology to more focused processes (data analytics, smart functions and multi-pronged interventions) leading to outputs (law enforcement and order-maintenance) and outcomes (public confidence and collective efficacy). The review further consolidates the literature across existing multidisciplinary studies that have tackled the data aspects of policing in various ways and with various methods.

Data appears to be the real precursor of a major shift from reactive policing (arrest and random patrolling) to proactive policing (directed patrolling, deterrence, target hardening and community outreach) through efficient reallocation and utilization of resources for law enforcement and order-maintenance, thereby improving public confidence, collective efficacy and police craft. In parallel, automated and crowdsourced data are spurring a radical shift from street-level policing to screen-level policing [48]. The emergence of various applications is pushing towards system-level policing, though currently their use is limited to assisting human decision making [6, 48].

The findings further reveal how smart policing is transforming police authorities as public management organisations, although our knowledge of these aspects remains limited. New roles and skillsets within police departments have not been explored because of issues like the deputing of lackadaisical sworn officers as analysts, cultural rejection of civilian analysts, obscurity of analysts’ roles, police managers’ inadequate analytical skills and limited training opportunities [46, 49]. Moreover, computers and smart technologies are expensive and upgrading them drains budgets. For instance, 85% of the London Metropolitan police’s ICT budget in 2012 was consumed by the maintenance of 40-year-old computers and redundant ICT systems, making patrol officers dependent on radio calls for on-field verifications [5]. Also, data incompleteness, accuracy, overload and sharing issues persist despite the emergence of fusion centres in the US [50]. The precision of crime construction depends upon liberal public reporting, which is often thwarted by economic risks associated with hot areas [51]. Human interpretation is critical as crime construction and sensing do not explain underlying social complexities and are being challenged theoretically on ethnic and racial grounds, though the literature does not support these challenges empirically [2, 51].

The review has certain limitations associated with coverage as it focuses solely on the data dimensions of core policing functions to overcome the opacity of smart policing. It does not fully capture the police management practices involved in translating the data-related processes into practice in the real-world scenarios.

Several gaps and limitations in the literature can be identified from the review as areas that can attract more attention from digital government and public management researchers. Current impact evaluations lack rigorous research designs because of unbalanced designs, non-random assignments, a dearth of longitudinal studies, omission of displacement effects and use of generic measures like service calls and property crime. Most evaluation studies have also been conducted in the US or used US-based data, with additional evidence coming almost exclusively from the developed world. This offers a unique opportunity for further investigations as police authorities in many developing countries tend to face extreme budget constraints and low police-public trust.

In terms of connecting smart policing to established police approaches and strategies, the framework provides a point of reference for future research. While automated surveillance and tension monitoring are well-known applications, no formal empirical evaluations have yet examined their effects on law enforcement and order-maintenance, while the literature argues that they raise issues of fundamental rights like civil liberties, privacy and proportionality [32]. The smart use of data vis-à-vis service provision remains less explored despite the fact that earlier research found that about one-third of street-level work is occupied with providing assistance to weak, ill, drunk and vulnerable citizens, and dealing with emergencies (like fires, fallen trees and accidents) and lost persons or property [17]. Finally, there is a dearth of empirical studies exploring the impact of new forms of data on the working styles and discretion of street-level officers and their coping strategies. Smart policing approaches might be legitimising police discretion or even possibly increasing their accountability, while police managers are faced with trade-offs of resource diversion and screen-level versus street-level policing [46].