Making Crime Analysis More Inferential

Dr Michael Townsley
School of Criminology and Criminal Justice, Griffith University

21–27 April 2014 / International Summit On Scientific Criminal Analysis

1 / 35


Outline

Defining What Analysis Is

Five principles of statistical reasoning

Three strategies to avoid errors

2 / 35


What Do We Mean by Analysis?

Analysis is not simply descriptive. It must include some component of reasoning, inference or interpretation. Regurgitating numerical values or summarising the situation is not analysis.

Analysis needs a system for doing this, comprising:
• Appropriate theory
• Methods to generate and test hypotheses

A system will allow you to generate knowledge about the criminal environment.

3 / 35


Theories for Crime Analysis: Environmental Criminology

crime = motivation + opportunity
• Rational choice
• Routine activity
• Crime pattern theory

4 / 35


Problems

Let’s acknowledge the range of factors limiting analysts from doing their work:

Organisational
• Tasking
• Operational imperatives

Individual
• Training
• Highly variable performance
• Cognitive biases

5 / 35


Humans find patterns anywhere

• Apophenia is the experience of seeing patterns or connections in random or meaningless data.
• Pareidolia is a type of apophenia involving the perception of images or sounds in random stimuli (seeing faces in inanimate objects).

6 / 35


Seeing Faces

7 / 35


The hungry helicopter eats delicious soldiers

8 / 35


These boxes are planning something . . .

9 / 35


Cookie Monster spotted in Aisle 4

10 / 35




Principles of Statistics

The field of statistics is decision making under uncertainty. Without being overly simplistic, the entirety of statistics can be distilled into five core principles:

1. rates over counts
2. making comparisons
3. retrospective versus prospective
4. sampling bias
5. Simpson’s paradox

Because decisions in operational law enforcement need to be made with incomplete data, in imperfect conditions and under significant time pressure, a statistical approach should enable better analysis, or at least avoid common pitfalls.

12 / 35


Statistical Principle 1: Frequencies Versus Rates

• A rate is a frequency adjusted for the underlying population at risk
• Usually population based, but could be number of properties
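As a minimal sketch of the distinction, the function below scales a count by a population at risk. The function name `rate_per`, the `per` scaling factor and the burglary figures are all hypothetical, chosen only to show how identical counts can imply very different rates.

```python
def rate_per(count, population_at_risk, per=100_000):
    """Return a rate: `count` scaled to `per` units of the population at risk."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return count * per / population_at_risk


# Hypothetical figures: two areas with the same burglary count
# but very different numbers of properties at risk.
small_area = rate_per(50, 2_000, per=1_000)
large_area = rate_per(50, 20_000, per=1_000)

print(f"small area: {small_area:.1f} burglaries per 1,000 properties")
print(f"large area: {large_area:.1f} burglaries per 1,000 properties")
```

The choice of denominator (residents, properties, vehicles) is an analytical decision in its own right, since it defines what "at risk" means for the crime type in question.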

13 / 35


International comparisons

Using data from UNODC, police recorded assaults in Chile, US, Australia and New Zealand.

[Figure: two panels, count and rate of police recorded assaults by year, 2004 to 2010, for Australia, Chile, NZ and US]

14 / 35


Issues

• significance of counts for practical purposes
• what is the population at risk?
• relevance with respect to time (by day, seasonal patterns)
• most crimes are measured at point level, but rate calculation requires some level of aggregation (to streets or areas, say)

15 / 35


Aggregation effects (Safe as Houses project)

What is the population at risk? How relevant are the boundaries here?

16 / 35


Statistical Principle 2: Making Comparisons

• Crime figures are meaningless without reference to a comparison area or some baseline crime level.
• A confounding issue is regression to the mean.
• Comparison groups can be misleading due to contextual differences.
• Hot spot maps (with hot and cold areas) do not constitute a valid comparison.
• Need to include comparisons of the causal factor as well as crime.
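Regression to the mean can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of real crime data: every area is given the same underlying risk, counts are drawn as binomial noise, and all parameter values (`n_areas`, `n_trials`, `p`, `top_k`, `seed`) are hypothetical.

```python
import random


def simulate_regression_to_mean(n_areas=1000, n_trials=100, p=0.2, top_k=100, seed=0):
    """Simulate two periods of counts for areas with identical underlying risk.

    Returns the mean count of the period-1 "hot" areas in period 1 and in period 2.
    """
    rng = random.Random(seed)

    def draw_count():
        # Binomial(n_trials, p) count built from Bernoulli draws (standard library only).
        return sum(rng.random() < p for _ in range(n_trials))

    period1 = [draw_count() for _ in range(n_areas)]
    period2 = [draw_count() for _ in range(n_areas)]

    # Select the areas that looked worst in period 1, as a tasking process might.
    hot = sorted(range(n_areas), key=lambda i: period1[i], reverse=True)[:top_k]
    mean1 = sum(period1[i] for i in hot) / top_k
    mean2 = sum(period2[i] for i in hot) / top_k
    return mean1, mean2


before, after = simulate_regression_to_mean()
print(f"hot areas, period 1: {before:.1f}; same areas, period 2: {after:.1f}")
```

Because the areas were selected for being extreme, their counts fall back towards the overall average in the second period even though nothing was done to them. A comparison group selected in the same way is one way to separate this effect from a genuine intervention effect.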

17 / 35


Some connection between street network and drug arrests

Source: Eck (1997)

18 / 35


Strong relationship between poor place management and drug arrests

Source: Eck (1997)

19 / 35


Statistical Principle 3: Retrospective Versus Prospective Risks

• Risk factors can be computed through different types of studies: retrospective and prospective.
• Retrospective studies examine a group experiencing an outcome and examine their past.
• Prospective studies follow a population and examine their lifestyle and whether the outcome occurs.
• Many risk factors are computed using a retrospective study, but expressed in prospective terms.

20 / 35


Sexual assaults on public transport

A study claims that 80% of sexual assaults take place on the public transport system. The inference drawn is that there is a high chance of victimisation if you use public transport. These are two different statements:
• if victimised, there is a high chance of using public transport (what the study says); and
• if using public transport, there is a high chance of being victimised (what gets communicated).

Each of these statements is a conditional probability (the proportion of an event within a subsample). But here the subsamples and events have been swapped.

21 / 35


Let’s Look at Data to Make This Concrete

                  Public transport   Not public transport    Total
Victimised                      80                     20      100
Not victimised              10,000                 11,000   21,000
Total                       10,080                 11,020   21,100

• if victimised (N=100), 80% used public transport (retrospective)
• if using public transport (N=10,080), almost 1% were victimised (prospective).
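The two conditional probabilities can be computed directly from the four cell counts. The sketch below is illustrative only: the dictionary layout and the helper `conditional` are hypothetical names, while the counts are those given in the table.

```python
# Counts from the table: (exposure, outcome) -> number of people.
counts = {
    ("public transport", "victimised"): 80,
    ("public transport", "not victimised"): 10_000,
    ("not public transport", "victimised"): 20,
    ("not public transport", "not victimised"): 11_000,
}


def conditional(counts, event, given):
    """P(event | given). Each argument is (position, label):
    position 0 selects on exposure, position 1 selects on outcome."""
    g_pos, g_label = given
    e_pos, e_label = event
    # Restrict to the subsample defined by the condition, then count the event within it.
    subsample = {k: v for k, v in counts.items() if k[g_pos] == g_label}
    hits = sum(v for k, v in subsample.items() if k[e_pos] == e_label)
    return hits / sum(subsample.values())


retrospective = conditional(counts, event=(0, "public transport"), given=(1, "victimised"))
prospective = conditional(counts, event=(1, "victimised"), given=(0, "public transport"))

print(f"P(public transport | victimised) = {retrospective:.1%}")
print(f"P(victimised | public transport) = {prospective:.2%}")
```

Swapping the `event` and `given` arguments is all that separates the 80% figure from the figure of just under 1%, which is why the two are so easily confused when results are communicated.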

22 / 35




To make a retrospective comparison

                  Public transport   Not public transport    Total
Victimised                      80                     20      100
Not victimised              10,000                 11,000   21,000
Total                       10,080                 11,020   21,100

• if victimised (N=100), 80% used public transport
• if not victimised (N=21,000), 48% used public transport.

24 / 35


To make a prospective comparison

                  Public transport   Not public transport    Total
Victimised                      80                     20      100
Not victimised              10,000                 11,000   21,000
Total                       10,080                 11,020   21,100

• if using public transport (N=10,080), almost 0.8% were victimised.
• if not using public transport (N=11,020), 0.2% were victimised.

25 / 35


Retrospective vs Prospective

1. Crime analysts will virtually always have retrospective studies, so this problem will come up.
2. Make sure valid comparisons are made. Compare conditional probabilities appropriately.
3. Retrospective proportions overstate the size of the risk factor.
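One way to see why the retrospective and prospective figures differ so much is Bayes' rule. The slides do not name it, so treat this as an added illustration: it links the two conditional probabilities through the base rates. Writing V for "victimised" and T for "used public transport" (symbols introduced here for convenience), and using the counts from the public transport table:

```latex
P(V \mid T) = \frac{P(T \mid V)\,P(V)}{P(T)}
            = \frac{0.80 \times \dfrac{100}{21{,}100}}{\dfrac{10{,}080}{21{,}100}}
            = \frac{80}{10{,}080} \approx 0.0079
```

The retrospective proportion of 80% shrinks to under 1% because victimisation itself is rare (100 in 21,100) while public transport use is common (10,080 in 21,100).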

26 / 35


Statistical Principle 4: Selection Bias

• occurs naturally whenever secondary data analysis is conducted
• differences between victimisation surveys and official statistics; various filters operate on which offences are reported to police and which get recorded
• survivorship bias

27 / 35


Spatial Selection Bias

• Ratcliffe (2001) catalogues the ways that geocoding can go wrong:
  • Out of date property parcel map
  • Abbreviations and misspellings
  • Local name variations
  • Address duplication
  • Non-existent addresses
  • Non-addresses
• Bichler and Balchak (2007) found distinctive systematic biases in geocoding errors in the major GIS applications.
• Ratcliffe (2001) found between 5 and 7% of records geocoded to incorrect census tracts.

28 / 35


Statistical Principle 5: Simpson’s Paradox

Table: Aggregate Crime Rates for Areas 1 and 2

          Area 1   Area 2
Total       7.75     7.20

Area 2 is safer than Area 1, in aggregate. It would be worth considering what examples of best practice might be transferred to Area 1. To do so, we look at different crime types.

29 / 35


Crime Rates by Crime Types

                  Area 1                     Area 2
Crime Type        Freq   Denom.   Rate      Freq   Denom.   Rate
Assault            256   41,250   6.21       430   54,000   7.96
Comm. Burglary     178    2,800   6.36        30      350   8.57
Car Theft           69   20,850   3.31        66   18,750   3.52
Total              503   64,900   7.75       526   73,100   7.20

Area 2 has a higher crime rate for all crimes.
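The reversal can be checked by recomputing the rates from the frequencies and denominators in the table. This is a sketch: the helper names are hypothetical, and the per-1,000 scaling is an assumption, since the slides do not state the unit. That scaling reproduces the assault, car theft and total rates shown. The commercial burglary rates on the slide (6.36 and 8.57) correspond to a per-100 scaling, so that row prints as 63.57 and 85.71 here, and the ordering between areas is unaffected.

```python
# (frequency, denominator) per crime type, copied from the table.
area1 = {"Assault": (256, 41_250), "Comm. Burglary": (178, 2_800), "Car Theft": (69, 20_850)}
area2 = {"Assault": (430, 54_000), "Comm. Burglary": (30, 350), "Car Theft": (66, 18_750)}


def rate(freq, denom, per=1_000):
    """Rate per `per` units of the denominator (per-1,000 is an assumed unit)."""
    return freq * per / denom


def aggregate_rate(area, per=1_000):
    """Pool all crime types: total frequency over total denominator."""
    total_freq = sum(f for f, _ in area.values())
    total_denom = sum(d for _, d in area.values())
    return rate(total_freq, total_denom, per)


for crime in area1:
    r1, r2 = rate(*area1[crime]), rate(*area2[crime])
    print(f"{crime:15s} area 1: {r1:6.2f}  area 2: {r2:6.2f}")
print(f"{'Total':15s} area 1: {aggregate_rate(area1):6.2f}  area 2: {aggregate_rate(area2):6.2f}")

# Simpson's paradox: Area 2 is higher for every crime type, yet lower in aggregate.
higher_everywhere = all(rate(*area2[c]) > rate(*area1[c]) for c in area1)
lower_overall = aggregate_rate(area2) < aggregate_rate(area1)
print(higher_everywhere, lower_overall)
```

The aggregate reverses because the denominators are mixed differently. Commercial burglary has by far the highest rate in both areas, and it makes up a much larger share of Area 1's pooled denominator (2,800 of 64,900) than of Area 2's (350 of 73,100).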

30 / 35


Explaining Simpson’s Paradox

• Operates when patterns of rates (or proportions) calculated for an entire sample are not consistent with the patterns for subgroups of the data.
• A result of changing denominators in crime rates, and of relying only on proportions or rates as indicators of activity.
• Usually a sign of a lurking variable.
• NOTE! This directly contradicts the First Principle listed here.

31 / 35


Spatial Principles

Spatial data and analyses have a number of unique attributes that need to be controlled for:
• modifiable areal unit problem – when point level information is aggregated to arbitrary administrative boundaries
• spatial autocorrelation – things that are close are more similar than distant things.
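Spatial autocorrelation is commonly summarised with Moran's I. The slides do not name this statistic, so the sketch below is an added illustration under simple assumptions: binary adjacency weights, a symmetric neighbour list, and a hypothetical `chain` layout in which areas sit in a row.

```python
def morans_i(values, neighbours):
    """Moran's I with binary weights.

    values: list of numbers, one per area.
    neighbours: dict mapping area index -> list of adjacent area indices
                (assumed symmetric: if j neighbours i, then i neighbours j).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Cross-products of deviations over every ordered neighbouring pair.
    cross = sum(dev[i] * dev[j] for i, js in neighbours.items() for j in js)
    total_weight = sum(len(js) for js in neighbours.values())
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


def chain(n):
    """Hypothetical layout: n areas in a row, each adjacent to its immediate neighbours."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


clustered = morans_i([1, 2, 3, 4, 5], chain(5))   # similar values sit next to each other
alternating = morans_i([1, 5, 1, 5], chain(4))    # neighbours are dissimilar

print(f"clustered: {clustered:.2f}, alternating: {alternating:.2f}")
```

Positive values indicate that neighbouring areas tend to have similar counts, negative values that they tend to differ. Changing how areas are drawn or which areas count as neighbours changes the result, which is where the modifiable areal unit problem comes in.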

32 / 35




Three strategies to avoid errors

1. Be more scientific. Come to my next talk!
2. Employ more sophisticated methods. Upskill analysts and collaborate with researchers.
3. Be more focused and use crime theories. Read the Eck (1997) chapter, “What Do Those Dots Mean?”

34 / 35


Bibliography I

Bichler, G. and Balchak, S. (2007). Address Matching Bias: Ignorance Is Not Bliss. Policing: An International Journal of Police Strategies & Management, 30(1):32–60.

Eck, J. E. (1997). What Do Those Dots Mean? Mapping Theories with Data. In Weisburd, D. L. and McEwen, T., editors, Crime Mapping and Crime Prevention, volume 8 of Crime Prevention Studies, pages 377–406. Criminal Justice Press, Monsey, NY.

Ratcliffe, J. H. (2001). On the Accuracy of Tiger-Type Geocoded Address Data in Relation to Cadastral and Census Areal Units. International Journal of Geographical Information Science, 15(5):473–485.

35 / 35

