


BIG DATA SAVING THE WORLD
Big Data is making an impact on more than just profits. We look at how it’s saving the world.


Editor’s Letter

Letter From The Editor

Welcome to this issue of Big Data Innovation.

During the past few years, the major developments in Big Data have revolved around the growth in revenue and the increased understanding of customer behaviour. The idea has been that Big Data is predominantly a business activity, designed to improve the performance of a company.

In reality, the scope for Big Data improvements goes well beyond those remits and has a wider societal potential. In this issue we are looking at the ways in which Big Data is being used to help with these aspects. In essence, we are looking at the ways that Big Data is saving the world.

We look at how Big Data is being used to help in animal conservation. With the partnership between HP and Conservation International now beginning to show results, Chris Towers investigates how this is helping animal populations.

With the level of destruction that civil wars and natural disasters can cause, Simon Barton looks at how those aiming to bring aid are utilising data to make their work more effective.

The use of data and analytics in policing has been well reported, but is it all going to be Minority Report or is there more to it than that? We investigate the wider uses of data in international policing.

The NYFD is the second largest fire department in the world, and we take a look at how they are utilising new data in order to improve their performance and keep New York as safe as possible.

In addition to these stories, we also talk to Gregory Shapiro, one of the faces of the data revolution and Editor at KDNuggets, about data privacy. Harriet Connolly also looks at the infamous hack of Target, where over 110 million pieces of customer data were revealed.

If you are interested in contributing to the magazine or if you have any feedback, please get in touch.

George Hill
Managing Editor

Managing Editor: George Hill
Assistant Editor: Simon Barton
President: Josie King
Art Director: Gavin Bailey
Advertising: Hannah Sturgess
Contributors: Heather James, Chris Towers, Harriet Connolly

General Enquiries

For advertising opportunities please contact Hannah at



We talk to Gregory Shapiro, Editor at KDNuggets, about the issues surrounding data privacy today

Harriet Connolly takes a look at the Target hack, where over 110 million pieces of customer information were hacked

How is Big Data being used to help in animal conservation? We take a look at the Earth Insights Project to find out

As we see the effects of natural disasters and civil war, Simon Barton investigates how Big Data is helping with relief

Crime fighting using Big Data is never going to be like Minority Report, but we see how it is being used

As the second largest fire department in the world, we take a look at how data is being used to fight fires in New York


We are always looking for new contributors. If you have an interesting idea or passion for a subject, please contact George at



Data Privacy

Data Security: A Talk With Gregory Shapiro
George Hill, Managing Editor


As a key protagonist in the Big Data movement, Gregory Shapiro has seen his website KDNuggets become one of the most influential places to find new information and developments within data. As a prominent figure in the data revolution, he has seen himself and his website placed on Forbes’ ‘Top Influencers in Big Data’.

Gregory was ahead of the times when he created KDNuggets (which stands for knowledge data nuggets) and has had his finger on the pulse of all things data for the past 10 years. I was interested in his ideas about how this has changed data privacy and the effects that it will have in the future.

When discussing these matters, despite being one of the faces of the Big Data revolution, Gregory manages to adopt a stance that is impartial and informed. He recognises that many people look at Big Data and the aspects of data privacy that surround it as a way forward in humanity, a greater way of understanding society as well as the aspects of technology that will push people forward. However, on the flip side he knows that there is a level that needs to be reached to balance privacy with useable data.

When exploring these concepts with Gregory, one of the main elements that always comes across is his implicit knowledge of current trends, from general media coverage through to in-depth scholarly work. It is this kind of thoroughness that has kept KDNuggets at the centre of Big Data communities for the past decade and this same level of detail is still in evidence today.

No discussion on the ethics of data privacy today would be relevant without the NSA scandal, and Gregory tells me that some recent work by students at Stanford University has proven that through the use of metadata it is possible to identify medical conditions, financial and legal connections, and even if somebody owns a gun. That seemingly ‘safe’ data can be used in this way shows how much information can be gathered on you without needing to type it into Facebook or being aware of entering it on any other website.




As this kind of information can be gleaned from sources where only the metadata is seen, this could have a major impact on another emerging trend according to Gregory: the Internet of Things.

People are relatively aware of some of the information they generate: they know that when they send texts or make phone calls, data is being created regardless of who sees it. When people create data that they are not conscious of, however, it provides information that is potentially more corruptible and shows more about an individual.

Fitness trackers, for instance, are worn frequently to track the general health of an individual as well as their levels of movement and the kind of activities that they perform. Some more advanced models even allow for heart rates to be monitored throughout the day. This is the kind of information that in the wrong hands can be more sensitive than knowing what you like to buy or the kind of places you like to go; it is tracking your life.

According to Gregory, the use and storage of this kind of information needs to be closely monitored, and the information used in ethical ways. If the information was somehow made public through unsecured storage, then things like medical issues and information from an individual’s private life could easily be identified.

I was also interested in hearing Gregory’s opinions on the news that part of Google’s recent acquisition of Deepmind dictated that they start an ethics board that had powers to stop any project that was deemed unethical. One of the interesting points that he made was that one of the primary uses of this board would be a tangent to the actual regulation of Google and Deepmind; instead it would allow for a body to define ‘what is evil?’.

Although this may seem like a relatively ambiguous phrasing, it comes from Google’s overall mantra on the subject, which is ‘don’t be evil’. After all, if there is no definition of what evil is, then how can people avoid it? It is the attempt to make technological morality universal rather than subjective.

Gregory believes that most major companies currently don’t care about ethics, but the minority of these companies can have a major impact. For instance, the US has systems in place that allow legal action to be taken against the majority if only a small minority actually voice a concern. A classic example of this was when AOL released anonymous data that unintentionally allowed one person in a huge data set to be identified. Although this was just a single individual in a silo of thousands, the PR received was a disaster.

This kind of unintentional data leak is of interest in other areas, and I was curious to know if the NSA revelations would have an effect on the ways in which companies collect and store data in the future. Gregory believes that it will have very little effect on commercial companies such as Facebook and Google. The kind of limitations that the Obama administration is putting into effect will hamper only unlawful government collection as opposed to wider data gathering activities. He even believes that the kind of information that these companies are not allowed to ask for (race, sexual preferences, credit score etc.) can be found out through the use of other information held on the individual.

We are seeing companies holding more and more information on their customers and even their potential customers. Gregory believes that to increase trust ‘Big data requires transparency. Companies should allow people to see what companies know about them and give people some benefit from their data - this will reduce potential conflict’.

One of the elements that this requires, and one that Gregory is keen to discuss, is the security of data once it has been collected. We are seeing that despite the majority of data being securely held, there are several high profile examples of where this hasn’t been the case. One such instance is Target’s well publicised data breach, where over 40 million customers had credit card information and passwords hacked and shared.

With this kind of well publicised data loss, I wanted to get Gregory’s opinion on whether this would be a catalyst that would see consumers stop sharing information as freely. He believes that this isn’t the case. The reality is that the majority of the information collected is not shared because people intrinsically want to; it is given to make the task they are undertaking easier. For instance, 1 click buy on Amazon requires that credit card numbers are held, and if there is an option to sign up for a new service by allowing for the information to be fed from Facebook rather than filling out a form, people are likely to just share their Facebook data. Gregory remains adamant that as long as people are ‘rewarded’ for sharing their information and that this creates additional convenience for them, then the amount of data shared will not drop.

One of the key points that ran through the entire conversation with Gregory was that despite the recent furore regarding the use of data and its protection, the levels of sharing will continue to increase. On discussing the NSA revelations, it is clear that changes need to be made in the way that some data and personal information is collected, but in reality this will be at national governmental level as opposed to commercial company levels. Despite being very pro-data for commercial uses, Gregory is also an advocate of increasing the accountability of companies who hold large data sets whilst also improving transparency with their customers.

As a leader in this field with a heavyweight voice in these matters and one of the most effective vehicles for communicating it to the data community, it is good to see that Gregory is taking these issues seriously, whilst actively encouraging others to do the same. He not only discussed this information with me but later sent me an email outlining many of the key points, complete with references to many of the scholarly and popular articles that he discussed. It is an attention to detail that defines not only himself but the wider data community, and it is this level of detail that will need to be looked at when moving these issues forward.





Target’s Hack

The Lessons To Learn From Target
Harriet Connolly, Analytics Leader


You won’t find many companies that have a section on their website dedicated to their biggest catastrophe, but in the face of Target’s data breach, which saw them lose 40 million customer credit card and debit card numbers, and contact information totalling nearly 110 million people, it was deemed a necessary step to maintain customer relations and loyalty. In reality, far more needs to be done to reignite Target’s once solid image as one of America’s premier retail brands.

The consequences of their data breach in 2013 surface and resurface every day, with the most recent repercussions coming from smaller banks looking to recoup their losses. Financially, the loss was large, estimated to be around 61 million dollars, but what has cost them more is their reputation, which has been permanently tarnished. It doesn’t matter whether 1 person’s details are compromised or 110 million; any breach is unacceptable for investors and customers alike.

Talking directly about the issue on Bloomberg TV, Adam Levin, Identity Theft 911 Chairman, stated, “your reputation is priceless, and you may take a hit, but what the public looks at is how urgently you responded”. Target has now shifted to a damage limitation approach. Straight after the news broke of

the breach, CEO Gregg Steinhafel made a concerted effort to ease the worry of their customers by making it clear that no PIN numbers had been compromised and that the company was approaching the breach with the utmost seriousness. On the face of it, the loss of credit card details is the most frightening aspect of this case for the customer, but the personal information that is now in the hands of fraudsters opens up fresh concerns over phishing attacks, which are far more difficult




to negate in the long run. On top of these concerns, there is a vast amount of litigation on the horizon. In addition to fines of anywhere between $5,000 and $10,000 per payment brand, other lawsuits may be triggered when more affected entities come to light.

There is a valuable lesson for companies operating in a similar space: you must invest heavily in data security in order to avoid experiencing the disaster that Target did. Data breaches are costly and unless you’re protected, the chances are you’re at risk. Many IT experts have done little to allay these fears, with the general consensus being that even the most modern security programs don’t stand up to the technical prowess of the fraudsters. Nevertheless, organisations are not powerless, with education being the key component. According to the 2013 Data Breach Investigations Report, carelessness from employees accounts for

67% of all network intrusions, as hackers take advantage of weak usernames and passwords and instances where malicious code is inadvertently downloaded. If Target had data savvy employees then they may well have had an opportunity to guard against the data breach. US financial firms are vulnerable compared with their European and Asian counterparts. Unlike European and Asian cards, American versions do not feature encrypted chips that prevent hackers from reusing data. This is partly down to the complexity of the American Financial System, which has an array of competing banks and business owners. A change would cost them a huge sum of money and is an obvious stumbling block for development. A change is on the horizon, however, and the Target breach is at the centre of it. Visa and MasterCard are aiming to have the encrypted chip in the majority of U.S. cards by the end of 2015, but still lagging behind are the retailers, who have yet to put a process in place to change their card readers. Reports have come out from Computerworld stating, “it [the data breach] may have resulted partly from the retailer’s failure to properly segregate systems handling sensitive payment card data from the rest of its network”. The takeaway from this is


that having a firm grip on sensitive data is imperative. Much of the Target breach came from stolen login credentials, and if they had locked their sensitive data it wouldn’t have been the disaster it has since turned out to be. This becomes all the more important when you’re operating on the cloud, as you may well be sharing network resources with other organisations.

Another important issue is that of malware protection. Malware was the culprit behind Target’s breach and it shows just how important antivirus and malware protection is to your overall data privacy strategy. Malware most likely infected either their overarching network or their POS system, and having protection against this is vital. Systems need to be robust and up-to-date, to provide the most comprehensive conundrum for prying eyes.

Four months after the breach, the dust has still yet to settle on Target. There has been talk of legislation in the U.S., but it is unlikely that this will change the approaches of the perpetrators of the Target data breach. Fran Rosch, Vice President of security software company Symantec, stated, “This is kind of an on-going war, and the types of threats are changing all the time”. For Target, it was all a little too late. All they can do now is endeavour to learn from their mistakes and make sure their organisation is ready for any upcoming data attack.



Saving the World


Big Data is Saving the World


Big Data has always been associated with business improvements, increased profitability and even the sentiment of customers towards a particular company. In reality, this is the absolute minimum of the potential that Big Data has.

Often touted as an industry worth $15 billion by 2015, this does not give it the full breadth of what can be achieved with the increased collection, analysis and actionable insight that data provides.

Why has it been concentrated on business for so long? The reason for this is purely monetary: the early adopters of this kind of technology are always going to be the ones who can invest the most money in it. For this reason it is not surprising that financial institutions, banks and large multinationals are streaking ahead of other companies in terms of Big Data adoption.

The reality is that with this kind of power, true impacts can be made on more than bank balances and customer happiness; it can make genuine change to the planet. In essence, Big Data can save the world.

In the following articles we look at the ways in which companies, charities and humanitarian organisations are leveraging the power of data to make a true difference to the world and how this kind of work is already making a huge impact on the world around us.


Data in Conservation

Big Data in Animal Conservation
Chris Towers, Big Data Leader


When people think of saving rare species, they think of remote jungles, scientists and people chaining themselves to trees. The stereotyped idea is that animals in the wild are very difficult to track and that the only way that people do this is through a rudimentary tracking system with a small sample making assumptions for the wider community.


Big Data and the complexities of data analysis could not be further from this, with the collection of massive data sets combined with complex predictive models and algorithms creating insights. The idea that enough data could even be collected to make a useful analysis is hard to imagine.

However, this has changed recently as HP has teamed up with Conservation International (CI) to create Earth Insights. This programme has been designed to give an early warning system for animal numbers amongst endangered species across the world. Through the use of cameras and climate sensors, the system can collect data from around 1000 of these devices and use it to collate information on population numbers. This information is then fed into the HP Vertica platform, allowing for quick and accurate readings that can help to target specific areas or species that need to have time or money invested in them.

The issue that arose for Conservation International before was not that this information was impossible to get hold of (after all, essentially all that would be needed is a series of well positioned cameras). Rather, the huge number of images collected was almost unmanageable.



The scientists at CI would spend weeks or sometimes months analysing the data and drawing collations and conclusions from it. This kind of work is time sensitive: if a declining population is found too late, it could spell the end for that species.

Through the collaboration with HP, CI have now improved processing speed by 90%. This has given them an effective early warning system as well as freeing up time for their personnel to actually make the differences to the animal population. It has essentially allowed 90% more time to be spent on actively working with the species to increase their numbers.

So has this system worked?

It is in its early stages (the collaboration was only announced in December 2013) but early signs have been positive. Using the new technology and HP platform, the team have recognised that from all species being monitored, 22% have experienced a drop in population numbers. That this has been identified in a timely manner means that efforts to protect these species and increase populations can be more effective.

So far the system has created over 3 terabytes of analyzable data and has 1.4 million photos from the cameras placed to track the populations. In addition to these photos, the climate monitoring equipment also measures the environment in the area.

The importance of these measurements goes beyond simply knowing the temperature or the amount of rain; they have a further reaching purpose, allowing scientists to monitor what is causing decreasing populations. Through the data collected it is possible to see what kind of conditions are having positive or negative effects on species populations. This will then allow scientists to have further insight into what is causing these conditions and take steps to negate or propagate this environment.

The fact that the system allows for issues revolving around this to be identified much earlier means that these conditions, and fluctuations in them, can be analysed in a more sustained and accurate way. Small changes can be picked up quickly and steps can be put into place to prevent negative changes.

This kind of work shows the wider environmental benefits that Big Data can have on the world. The collaboration of companies like HP with charities shows that it is not necessarily just going to be a money making business function, but a true performance enhancer across a multitude of human, societal and environmental issues. If this programme can be as successful as it has the potential to be, then we are likely to see this adopted across a far wider array of charities and environmental issues, and that can only ever be a good thing.





Humanitarian Data

Big Data in Humanitarian Relief Efforts
Simon Barton, Assistant Editor




In early November 2013, Typhoon Haiyan tore through the Philippines leaving in its wake a trail of destruction and tragedy. 13 million people were affected by the Typhoon, including 4.9 million children. 1.5 million of these children were under the age of 5 and are therefore deemed to be at higher risk of Global Acute Malnutrition. The pain and suffering that a disaster such as this precipitates is clearly impossible to quantify in terms of data, but data’s use can be an essential tool for the emergency services when they need to identify when and where people are in danger, and what resources they need to save them.

Big data, when used in conjunction with humanitarian causes, is often harnessed in the form of crowdsourcing, a phenomenon that has been with us since 2007. Around that time, Kenyan not-for-profit Ushahidi began mapping user-generated accounts of brutality after the elections in Kenya, in an effort to spur donations to the region. By plotting the outbreaks, it created a public record of the events and spawned a number of similar sites that have helped humanitarian projects.

The process of crowdsourcing is developed by categorising and verifying reports transmitted by witnesses of events, normally through email, text messages and social media, with Twitter being of particular importance. Crowdsourcing and the incorporation of Big Data have been imperative in assisting the emergency services’ efforts to stem the tide against a number of natural disasters including Typhoon Haiyan and the Haiti earthquake.

Big Data is also being picked up by a number of international relief institutions, including Disaster Relief International (DRI), a major supplier of humanitarian aid. DRI has used Big Data analysis to improve response efforts in the Philippines by tracking assets and personnel in real-time and determining where help is most urgent. Their willingness to adopt Big Data earned them the Peter F. Drucker award for Non-Profit Innovation.

Typhoon Haiyan was not Big Data’s first major endeavour into humanitarian causes. Instead, we first saw its



potential being put to use in the Haiti earthquake. Its success was born out of the partnership between the public and private sector, and allowed data scientists from the Karolinska Institute to use data from Digicel, Haiti’s largest mobile phone operator, to compare people’s movements before and after the earthquake. This meant that they could decipher where the ‘hot-spots’ were so that they could get medical supplies to them as soon as possible.

Big Data’s use is not just confined to disaster relief; it can also play an important role in helping policy makers and researchers. The United Nations Population Fund teamed up with SAP AG in 2011 to create two dashboards with the aim of engaging people in the societal and demographic trends that are shaping the world we live in through to 2100.

By using data, the dashboards give us an in-depth look into the way in which the globe is likely to develop, up to 2100. This project was in line with their 7 Billion Actions campaign, which looked to raise global awareness around the opportunities and challenges associated with a population of over 7 billion. The fact that technology and data play such an important role here shows that analytics has the power to transform hard, un-navigatable data into clear and consummate information.

Far from being just a reactive tool, Big Data also has the capacity to pre-empt crises. Take the Cholera outbreak in Haiti that has been a pressing issue for over four years: a study in 2012 showed that Twitter was yielding data that would have made for quicker detection of the outbreak when compared to more traditional methods.

Data Scientists found that, as the number of incidents increased and decreased, so did the amount of tweets and informal media reports. A remarkable finding, and one that shows the power that data has. If these trends had been visualised



more readily, the disease could have been stifled in its earliest stages, and in turn saved lives. The report stipulates, “This information in the right hands could have saved lives”.

Disease tracking, as seen with the Cholera outbreak in Haiti, is perhaps the most meaningful contribution that Big Data can make. In Kenya, a nation with high mobile phone penetration, data has been combined with regional malaria prevalence information to estimate how population movements influence the spread of disease. The information they garnered allowed them to calculate the

probability of a resident being infected with malaria and the chance that a visitor to a particular area would be infected on any given day. These developments are impressive, and when you consider that mobile phone penetration in Africa has hit 80% and is increasing by 4.2% annually, the opportunity for mobile data aggregation is a significant one, and one that has the potential to foresee disease epidemics throughout the continent of Africa.

Looking at Big Data and Humanitarian causes, it is clear that there is certainly ‘Data Philanthropy’ in much the same way as regular philanthropy. This concept involves companies sharing their proprietary datasets for social good. Big Data is at the centre of aid relief projects nowadays, but issues continue to persist in the form of the processing phase, which can still be time consuming and often requires individuals to monitor content in real time. This has led to the increased involvement of media companies, as their insights allow for cutting edge media monitoring that can be collected quickly and efficiently.

Big Data and Humanitarian causes are a match made in heaven and help the emergency services quell some of the globe’s most pressing and urgent humanitarian crises. Similar to its implementation in the LAPD, it demonstrates how far Big Data can go outside of the business landscape and the extent to which it can assist the individuals working on the ground level.

What is clear is that through Big Data, we have a more accurate view of the earth and where it is likely to be in the next 100 years. Through these projections, organisations such as the UN can make accurate assertions as to what sections of society need the most help to develop as fruitfully as possible. This has been reflected in the UN’s 7 Billion Actions campaign, where they marked 7 key issues that they feel are the most imperative for the globe’s growing population.

The use of Big Data and crowdsourcing is clearly not without its limitations. You only need to look back a few months to the Google Flu Trends failure to see that there is still significant progress to be made, and that relying on data is not always best practice. Google Flu Trends overestimated the size of the influenza pandemic by 50% and miscalculated the severity of last year’s flu, predicting double the amount of flu-related doctor visits. So, we must tread with caution, but Big Data can be an essential tool and this has been proven over the last 5 years.





Data in Crime

Data in Crime Fighting: Beyond Minority Report
George Hill, Managing Editor




When we discuss Big Data in crime fighting, the analogy of Minority Report, the 2002 Tom Cruise film, always comes up. This is the idea that it would be possible to predict who is going to commit a crime and when, meaning that law enforcement can stop these crimes before they are committed.

This is not a particularly accurate way of describing how Big Data is being used in society to help with crime prevention. Although there are on-going experiments around the use of this kind of data to predict who could commit certain crimes (most notably the FAST programme currently being used by Homeland Security in the US), these kinds of initiatives only have a 70% accuracy rate, meaning that three in ten people who would be arrested would have done nothing wrong. The idea behind this can also never be accurate: often crimes take place due to a rush of blood to the head, and crimes of passion have very little premeditation, meaning that unless government departments can read somebody’s thoughts, these would be totally unpredictable.

Big Data has a much bigger role to play than a sci-fi version of policing; it has been making society a safer place (albeit in smaller ways) for the past 3 years. It is well documented that US cities such as San Diego and New York have utilised the kind of data that new systems can create, not necessarily to predict crimes in themselves, but to pinpoint where crimes are likely to take place and prevent them. This could be anything from increasing a police presence in the area to changing certain physical aspects, improving street lighting or increasing the amount of visible CCTV in the area. This has been successful in both of these cities, allowing for police leaders to strategically place their forces in order to have the greatest societal good.

This kind of ‘predictive policing’ has also been adopted elsewhere in the world, with the UK and other European countries utilising similar systems to improve their own targeting of high crime areas. It has been a success and does not have the invasive elements of data collection that many would associate with the use of analytics in policing; it has been successful without people feeling like their privacy has been compromised.

This kind of information is relatively easy to collate, however. Simply put together which crimes have historically happened in these areas and at what times. Therefore the likelihood of a certain crime happening in a

certain area, at a certain time is x% higher than the average. The difficulty with this system is the necessity firstly of crimes being reported and secondly the use of historical data, which for many areas may be lacking. This is why Rutgers have created Risk Terrain Modelling (RTM). RTM allows for crimes to be predicted not purely from the history of crime in that area, but from the surrounding environment and the likelihood of these conditions allowing for an increase in a certain amount of crime. There are always going to be the more obvious places, dark alleyways or areas that attract an increased amount of foot traffic, but this system allows for public officials to identify areas where crime is likely to happen that many would not consider. The system has been made available to public bodies through an app that automates the processes without the need for additional crime analysts, meaning that police forces can make informed decisions about asset deployment. This kind of work by Rutgers and others is incredibly useful for crime prevention and works well for demonstrating the power that data can have in deployment and keeping the pub-



Data in Crime

lic safe, but does not tell the full story of how a collaborative use of data can be invaluable in the actual solving of crimes.
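To make the historical approach concrete, the "x% higher than the average" figure mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the (area, hour) record layout and the sample incidents are all invented here, and nothing in it reflects how any police force or the Rutgers RTM tool actually works.

```python
from collections import Counter

def relative_risk(incidents):
    """Return {(area, hour): percent above or below the average cell count}.

    incidents: iterable of (area, hour) pairs, one per reported crime.
    Assumption: only cells with at least one incident are compared.
    """
    counts = Counter(incidents)
    if not counts:
        return {}
    average = sum(counts.values()) / len(counts)
    return {cell: (n / average - 1) * 100 for cell, n in counts.items()}

# Hypothetical records, invented purely for illustration.
sample = [("A", 22), ("A", 22), ("A", 22), ("B", 14), ("C", 9), ("C", 9)]
risk = relative_risk(sample)
print(risk[("A", 22)])  # area A at 22:00 relative to the average cell
```

Because the sketch counts only reported incidents, it inherits exactly the weakness the article describes: areas with little reporting or little history produce unreliable figures.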

A telling sign of how difficult it can be to track and solve crimes is simply the way that criminals work. They often do not even use money in trading, exchanging hard-to-track goods as a substitute. How is it therefore possible to track the movement of something that isn’t known to exist? This becomes even more complicated when the ‘currency’ used is switched across international borders, making collaboration between police forces difficult.

One of the most common forms of non-money-based currency is firearms. The movement of firearms across borders is difficult to track, as the actual guns traded are often untraceable through things like serial numbers and do not themselves exist on a database, at least not in their physical form. However, as firearms essentially have one use (i.e. shooting somebody or something), it is possible to identify a gun and the links it has between countries through the distinctive residues and markings that it leaves. Through a new database (Odyssey) in the EU, countries can now track a gun through the shootings it has been involved in. The distinctive aspects of a particular gunshot can be noted in Italy, and if those same characteristics appear in Britain, the chances are that this has been a cross-border trade of the same gun.

This not only means that crimes can be easily identified and tracked but also gives law enforcement agencies the opportunity to pinpoint particular illegal trade routes. If there is an abnormal number of shootings with the same guns taking place in both Paris and Lisbon, for example, the chances are that there is a trade link between criminal elements in France and Portugal.
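The cross-border matching idea can be pictured as a simple grouping exercise. The Python sketch below is not a description of the Odyssey database, whose internals the article does not cover. It assumes, for illustration only, that each shooting record has already been reduced to a hypothetical "signature" identifier and a country. Real ballistic comparison is a far fuzzier matching problem.

```python
from collections import defaultdict

def cross_border_links(shootings):
    """Map each ballistic signature to the countries it has appeared in,
    keeping only signatures seen in more than one country.

    shootings: iterable of (signature, country) pairs (hypothetical layout).
    """
    countries_by_signature = defaultdict(set)
    for signature, country in shootings:
        countries_by_signature[signature].add(country)
    return {
        signature: sorted(countries)
        for signature, countries in countries_by_signature.items()
        if len(countries) > 1
    }

# Invented records: one gun seen in two countries, one seen twice in France.
sample = [
    ("sig-001", "Italy"),
    ("sig-001", "Britain"),
    ("sig-002", "France"),
    ("sig-002", "France"),
]
links = cross_border_links(sample)
print(links)
```

Counting how often the same pair of countries turns up across many signatures would be one plausible way to surface the kind of trade route the Paris and Lisbon example describes.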

A similar story comes from the child protection side of law enforcement. When catching a paedophile it is common to find thousands of abusive images on their computers, sometimes amounting to hundreds of gigabytes of data that needs to be stored and analysed. The time it would previously have taken for a human to analyse all of the data, as well as making links between these images and others found in the same country, let alone across borders, made the task beyond the single case almost impossible. Systems are therefore now in place that allow these images to be analysed and identified, not only within the case in hand but also across other cases nationally and internationally.

Due to the nature of paedophiles and the networks that are created, this kind of identification of images and where they have been shared allows law enforcement to identify rings and make multiple arrests rather than just one.

Big Data is not only used online for these kinds of actions; it also has power in the prevention of pirated software. At the Microsoft Digital Crimes Unit, they can identify when a serial number for a piece of software has been stolen and used. They can even pick out when it is being tested prior to its sale by the counterfeiters. This means that this kind of software can be pinpointed and appropriate action taken.

The use of data in finding the appropriate action in this case is also interesting. Through analysis it became apparent that it was far simpler and more effective to follow the money to merchant accounts, which allow online credit card transactions. Disabling these accounts meant that it didn’t matter how many websites were created to sell the counterfeit software (often multiple sites were linked to one merchant account): if the counterfeiter couldn’t take payment, they couldn’t sell fake software.

Similar to this is the use of data analytics in the detection of fraud in both insurance and finance, reducing the cost of premiums and allowing banks to reduce losses and pass on the savings to customers. A prime example comes from the UK, where the Durham police force used analytics to identify a complex fraud that involved multiple claims from the same car crash. Through the use of data they identified who was responsible and how the kind of scheme that had been used had not only subjected the insurance companies to fraudulent claims but also increased the cost of insurance for the local population. Using a similar system, but through an internal team, Nationwide building society in the UK has managed to reduce the amount lost to fraud by 75%.

The use of Big Data in fighting, preventing and identifying crime has a huge knock-on effect on society in general. If criminal elements of society felt that they could lose themselves in the cloud of information or avoid identification through cross-border work, Big Data has proven that this is no longer the case. The fact that this is a system that is being implemented so widely and has gained many fans across the world suggests that this is not simply a fad that police forces are jumping on for the sake of seeming progressive, but a genuine solution to many of the issues that have a detrimental effect on society as a whole. Although we may never hit the levels shown in Minority Report, the reality is that our current level of use is still making a big difference to how we conduct our police work.
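The merchant-account finding from the Microsoft Digital Crimes Unit example lends itself to a small illustration. The sketch below assumes a hypothetical list of (website, merchant account) pairs and simply groups sites by the account that takes their payments. The names and data are invented, and this is not a description of Microsoft's actual tooling.

```python
from collections import defaultdict

def sites_by_merchant(site_accounts):
    """Group storefront sites by the merchant account behind them.

    site_accounts: iterable of (site, account) pairs (hypothetical layout).
    """
    grouped = defaultdict(list)
    for site, account in site_accounts:
        grouped[account].append(site)
    return dict(grouped)

def accounts_by_reach(site_accounts):
    """Return merchant accounts ordered by how many sites depend on them."""
    grouped = sites_by_merchant(site_accounts)
    return sorted(grouped, key=lambda account: len(grouped[account]), reverse=True)

# Invented data: three storefronts share one account, a fourth uses another.
sample = [
    ("shop-a.example", "acct-1"),
    ("shop-b.example", "acct-1"),
    ("shop-c.example", "acct-2"),
    ("shop-d.example", "acct-1"),
]
ranked = accounts_by_reach(sample)
print(ranked)
```

The ordering shows why following the money is efficient: disabling the first account in the list cuts off three sites at once, however many more storefronts are created on top of it.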

Fighting Fire With Data

Fighting Fire with Data
Heather James, Big Data Leader



There are around 1,000,000 homes in New York City and every year about 3,000 of them are affected by fire. As such, the New York Fire Department is the second-largest in the world and recruits only the most capable officers to protect the city and its inhabitants. By only the second month of 2014, there were 232 serious fire incidents, with serious defined as four fire units becoming fully engaged. This is a substantial figure and one that requires more than manpower, water and hoses. Data has been heralded as the way forward for the NYFD and the most reliable source for identifying when and where fires are likely to erupt.

Michael Bloomberg’s 12-year stint as mayor of New York made the city more data driven and, ever since, there has been a willingness to investigate how data can help with more areas of city management. Bloomberg’s data-centric decisions have been widely recognised as a positive step for the city, with individuals such as Paul Romer, Director of New York University's Marron Institute on Cities and the Urban Environment, stating that “any city could adopt the central pillar of Bloomberg's approach – letting data drive policy decisions.” New York’s data-driven initiatives are now being looked at by cities such as Rio de Janeiro, and have been held up as an example for politicians who wish to improve their cities.

A focus for New York is currently the Fire Department and how to reduce the number of fires that hit the city’s homes every month. It is a pressing issue and one that requires real endeavour from the city’s officials. Inspections play an important role in deterring fire hazards in the city and, prior to the adoption of data, barring high-risk areas such as schools and public libraries, it was an almost random procedure.

In New York there are certain factors that make a building more likely to be a fire hazard. The list is long, 60 factors in total, but the Fire Department has successfully catalogued it. Some factors are more obvious, such as the age of the building and electrical issues, whilst others, including the average neighbourhood income, are less obvious but still important. This was backed fully by Jeff Chen, the Fire Department’s Director of Analytics, when he recently stated, “Low-income neighbourhoods are correlated with fires”.


On the face of it this doesn’t really seem that radical, and the aggregation of a few of these factors really isn’t that difficult a process. Problems surface, however, when everything has to be absorbed and visualised for the fire fighters who ultimately have to put the data into practice. The data is run through an algorithm that assigns each of the city’s 333,000 inspectable buildings a score. This score indicates the building’s risk of fire, allowing fire fighters to determine which buildings require more urgent inspections and preventing potential hazards before they become more serious.

An issue that Big Data advocates often face is proving that their systems are working. Proving a negative is far more difficult than showing tangible evidence of a positive development. The New York Fire Department may well see the number and severity of fires dropping, but it will be difficult to show that these developments are intrinsically linked to data implementation. Having said this, the data programme is due to expand to 2,400 categories, creating a system with a deeper understanding and more accurate targeting.

It is not as if the use of Big Data in city management is new to the US or New York. Former mayor Bloomberg set the foundations for a data-run city, and other cities like Boston have adopted similar work with their ‘Problem Properties Program’. With the backing of one of the globe’s most prominent fire departments, Big Data is likely to continue to be an important cog in fighting fires in New York City. If it is a success, its implementation is only going to become more widespread throughout the US and the world.
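For readers curious what "assigning each building a score" might look like in practice, here is a minimal Python sketch. The article does not say how the Fire Department's algorithm combines its factors, so the weighted sum used here is an assumption, and the factor names, weights and buildings are all invented for illustration.

```python
def risk_score(building, weights):
    """Weighted sum of a building's risk factors (higher means inspect sooner).

    Assumption: a simple linear score; missing factors count as zero.
    """
    return sum(weights[name] * building.get(name, 0.0) for name in weights)

def inspection_order(buildings, weights):
    """Return building ids sorted from highest to lowest risk score."""
    return sorted(
        buildings,
        key=lambda building_id: risk_score(buildings[building_id], weights),
        reverse=True,
    )

# Hypothetical weights and buildings, invented for illustration only.
weights = {"age_years": 0.02, "electrical_issues": 1.5, "low_income_area": 1.0}
buildings = {
    "b1": {"age_years": 100, "electrical_issues": 1, "low_income_area": 0},
    "b2": {"age_years": 10, "electrical_issues": 0, "low_income_area": 1},
    "b3": {"age_years": 80, "electrical_issues": 1, "low_income_area": 1},
}
order = inspection_order(buildings, weights)
print(order)
```

With 60 factors, or the planned 2,400 categories, the same ranking step would apply. The harder work, as the article notes, lies in choosing the inputs and presenting the result to the people carrying out inspections.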





Big Data Innovation, Issue 9  
