



We talk to CIO, James Rule about his work in the area


THE CONTINUOUS HEALTHCARE LEARNING ECOSYSTEM How does it look and how do we get there?

BIG DATA IN 2014 What are we expecting to see in the next 12 months?







Marc Folch discusses how companies should look to unlock themselves from Big Data privacy concerns

We make our predictions about how Big Data will play out in 2014

Vipul Kashyap talks us through the data ideas and improvements that come from a continuous healthcare ecosystem

We talk to James Rule, CIO, People & Organization at Thomson Reuters

William Tubbs looks at how sports and the way people engage with them is making a more data literate society

We investigate the use of Big Data and Analytics in policing, a policy being adopted frequently across the globe


Big Data & Analytics Innovation Summit

February 27 & 28, 2014, Singapore

For more information contact Ryan Yuan +852 8199 0121

Editor’s Letter

Letter From The Editor


Welcome to the first issue of Big Data Innovation in 2014. After the success of the first year, going from an idea in January to a global magazine with a readership of thousands, we are hoping that 2014 can bring even more success.

Managing Editor George Hill

This year we are going to bring you the most innovative features, latest news and insightful interviews from the top minds working in data.

Assistant Editors Simon Barton Chris Towers

This starts off this month with an interview with James Rule, who was a key player in building the HR Analytics system at Thomson Reuters. Given the diverse nature of the business and the variety of components in this well-established and famous company, this was no small achievement and we wanted to bring you his unique views.

Vipul Kashyap, Senior Director at NYU Langone Medical Center, gives us his perspective on the controversial use of data in healthcare. He argues for a healthcare learning ecosystem that would push the healthcare industry forward and promote increased collaboration between the different parts of the healthcare system.

Marc Folch discusses why unlocking Big Data from privacy concerns should be of vital importance to companies in 2014 and how this can be a huge benefit to those who choose to do it.

We hope you enjoy this edition of the magazine and we hope to continue bringing you the same high quality and insightful content throughout 2014.

If you have any feedback or are interested in contributing to the magazine please contact me at

George Hill Managing Editor

President Josie King

Art Director Gavin Bailey

Advertising Hannah Sturgess

Contributors Chris Towers, Vipul Kashyap, Marc Folch, William Tubbs, Simon Barton

General Enquiries






Unlocking Data

Unlocking Big Data From Privacy Concerns

Marc Folch
Director, Global Pricing
Roamly




With the recent privacy-related backlash, many companies are prematurely discarding their otherwise great "Big Data" ideas out of fear of this backlash being directed at them. In many cases, however, this is a mistake. The main reason firms get into trouble in this area is that they do not sufficiently design their initiatives as win-win partnerships. The market has repeatedly shown that it will allow the use of its private data under two conditions: first, that the consumer receives what they feel is fair value in return, and second, that the company acts as a trustworthy steward of the information.

To illustrate, let us look at some well-known successes:

Gmail

Started in 2004 and with over 425 million users as of June 2012, Gmail is a runaway success. Hundreds of millions of people accept Google’s ad targeting systems reading their most personal and intimate emails. This is much more private information than most Big Data initiatives would use, yet despite the persistent barking of privacy watchdogs, people keep using Gmail in their droves.

Why It Works: Users get tremendous value, which in their eyes outweighs the cost of the privacy compromises. Gmail is arguably one of the best email services in the world, paid or not. Although recently tarnished, Google has also historically had one of the world’s strongest trust brands in tech. Its poignant “Don’t Be Evil” mantra resonates with consumers and employees alike, and its use of private data has mostly been unobtrusive, respecting the user’s interests. Google has worked hard to balance the (sometimes conflicting) interests of users and advertisers.

When It Has Stopped Working: When Google has either stepped well outside the bounds that users had authorized (e.g. giving the NSA access) or failed to provide sufficient transparency and control, users have gotten scared, then angry.

Air Miles

Formed in 1988 and now popular in Canada, the Middle East, and the UK/Spain (as Avios), AirMiles built a billion dollar business around consumer data and loyalty. For over 20 years consumers have earned free rewards in exchange for giving marketers visibility into their shopping habits across many vendors. It also serves as a loyalty program to influence consumer preference towards certain vendors and products.


Why It Works: Consumers love the idea of being rewarded for activities they already do. Some consumers complain that the program’s rewards are not always rich, but part of the value is the “gamification” of their shopping, which allows them to enjoy working towards a goal. This combined value again outweighs the perceived cost in members’ eyes. AirMiles has also proven a trustworthy steward and steers away from using private data in noticeably intrusive ways.

Key Takeaway: Your firm does not need to offer a “best in the world” service like Gmail for free, simply something that consumers value more than the privacy compromise they are asked for. In the end it is their perception of the value and cost that determines whether the exchange is fair and sustainable. So involve them extensively in the design process.

Why Concerns Arise

At the heart of all privacy concerns lies a perceived loss of control over personal information. As consumers, we expect companies not to do anything “sneaky” against our wishes when we are not looking. Even if they have the legal right to do so, we feel betrayed when this happens, and that makes us push back.


Although the company is holding the information about us, we still feel it is ours, much like the money our bank holds for us. We do not like threats to our control of our information any more than threats to our control of our savings account. However, this does not prevent banks from re-lending our deposits to earn a return for themselves. They have simply learned to be very careful that their clients never feel a loss of control or trust. Almost every backlash has arisen out of the company’s failure to see the data as belonging to the customer and to realize that it must earn the right to use it just as it earns the client’s other business.

The Power to Say No

Giving customers the right to opt out of your data mining initiative (potentially forgoing the benefits as well) can be a very powerful tool. Interestingly, consumers tend to treat privacy rights much like voting rights. They get very concerned when they feel these are threatened, yet when the rights are secure many don’t even bother to use them. It is very comforting to know that if a trusted partner steps too far out of line, one can “yank the leash” and revoke their privileges. So give your customers this power and you gain both their trust and a clear metric for gauging how well you are doing.



If many opt out, you need to re-examine your approach.

The 6 Steps To Prevent Privacy Backlash

1- Give consumers value that outweighs the perceived “cost” or risk. Remember that perceived cost is affected by how much trust your firm has earned and by how much freedom you are requesting.

2- Be upfront about what information you are keeping, how it will be used and, most importantly, how it will not be used. Make this as clear, simple and specific as possible. It should not be “hidden in the fine print”.

3- Give participants the ability to opt out and wipe their personal information. They would forgo the value-adding service(s) you offer in exchange, but this lets them feel in control, which is critical for trust.

4- Do not step outside the boundaries that consumers have knowingly authorized. If you need expanded authorization, clearly state what it is for and why they will benefit from giving it.

5- Open a channel for consumers to ask questions and raise concerns about their data, to proactively resolve the inevitable misunderstandings that will occur.

6- Be extra sensitive regarding intimate insights such as pregnancy (e.g. the Target Baby Fiasco), adult activities, and medical conditions, since these can trigger concern and backlash far more easily.

It will take years (if not decades) for the data privacy story to play out in full and turn into reliable laws. In the meantime, firms that hold back from Big Data will find themselves leapfrogged by competitors that don’t, so it is important to start building customer trust and Big Data expertise across your organization. Even as regulation changes the landscape, building the resources and skills in-house will remain a critical driver of future success. If necessary, start small, but start today.
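For readers who want to track the opt-out metric described under "The Power to Say No", the following is a minimal sketch, not a prescribed method. The function names and the 10% review threshold are hypothetical choices made for illustration; the article does not state what share of opt-outs counts as "many".

```python
# Hypothetical threshold: the article gives no figure for "many" opt-outs.
REVIEW_THRESHOLD = 0.10


def opt_out_rate(opted_out: int, enrolled: int) -> float:
    """Return the share of enrolled customers who have opted out (0.0 to 1.0)."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    if not 0 <= opted_out <= enrolled:
        raise ValueError("opted_out must be between 0 and enrolled")
    return opted_out / enrolled


def needs_review(opted_out: int, enrolled: int,
                 threshold: float = REVIEW_THRESHOLD) -> bool:
    """Flag the initiative for re-examination when the opt-out rate exceeds the threshold."""
    return opt_out_rate(opted_out, enrolled) > threshold
```

Watching how this rate moves over time, for example before and after a change to how data is used, is likely to say more than any single reading.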





Big Data In 2014

Chris Towers
Big Data Leader


Many commentators have referred to 2013 as the year of Big Data. It was seen as the year in which it really entered business consciousness, where companies wanted to start using it and, more importantly, the year in which it made the biggest headlines around the world with the NSA and GCHQ scandals.


So with 2013 being the year of Big Data, what will 2014 bring to a function that has already made huge strides forwards? I personally believe that if 2013 was the year of Big Data, 2014 will be the year that it becomes real data. We are going to see companies who have made tentative steps into data initiatives jump in, and we will see the importance of data quality come under increased scrutiny, as well as multiple other issues.

As this is the first issue of 2014, I wanted to give a breakdown of the main issues that we at Big Data Innovation believe will affect the industry in the next 12 months.

Big Data As A Service

We believe that the biggest transformation that we are likely to see this year is going to be the shift towards more cloud based services in Big Data. With the Gartner Big Data Report 2013 showing that around 60% of companies want to introduce new Big Data initiatives, there is likely to be a considerable amount of experimentation before it is truly invested in.

Therefore, although there can be broad predictions about what is needed to complete the operations that they require, in reality the actual needs of a system are always going to be variable. If a company invests in an on-premise system, the margin of error will directly impact on the ROI of the technology that needs to be bought. With cloud based systems this will not be the case, as one of the key selling points of cloud based systems is the ease of scalability. With the need for flexibility and experimentation, we believe that more and more companies will turn to cloud based services, at least for the experimentation phases of their Big Data implementation.

Solution providers have also been catering to this need, and startups such as Qubole have seen considerable success from offering this service. It is an easier and more stable business model to create: without the need to create systems and produce physical products, less initial investment is needed and companies can grow quicker.

Big Data = Best Data

Sean Patrick Murphy, at a recent Innovation Enterprise summit, surmised that Big Data was ‘big’ not because of size, but instead because of its importance. We believe that 2014 will see more companies thinking along these lines. There will be a concerted effort to see Big Data as an effort in collecting the correct data at the correct time, as opposed to collecting as much data as possible all the time.

The ability to manage larger data sets will always increase, in the same way that we are always going to see progress in the speed and reliability of all technology as developments are made. However, in the data realm, people will increasingly realise that analyzing as much data as they can will be the analytics equivalent of leaving the lights on. Sure, it’s good when you walk into a room and you can see everything, but it is a huge waste for the 99% of the time when you aren’t using it.

We are likely to see the companies who have the best results in their data programmes leveraging smaller amounts of the correct data rather than large amounts of all the data they can find. Therefore, the idea of big as a cumulative factor will instead become an importance factor.

Data Security Overhauls

2013 saw the ways in which companies collect and store data come under increased scrutiny after the revelations that some of the largest and most trusted online companies had been either undermined or were collaborating with government organisations for mass surveillance. This shocked the world’s media and has seen a backlash against this kind of practice.

Companies have already seen success from offering more secure services, and recently Facebook, Google, Microsoft, Apple, AOL, LinkedIn, Twitter and Yahoo formed an alliance and said “it is time for the world’s governments to address the practices and laws regulating government surveillance of individuals and access to their information.” That they felt the need to come out with a joint message about the NSA and the importance of individual information security is testament to the importance that 2014 will have on the rights of an individual’s data.

This could manifest itself in several ways, such as tighter regulation on storage or on what can and cannot be collected. We believe that tighter regulation on storage will be the most popular, alongside clear communication of security to individuals. Transparency will also become a key factor, with members of the general population having more access to the information that is held about them by companies.

Increased Demand For Data Scientists

One of the themes that has been repeatedly mentioned alongside the increasing popularity of Big Data has been the lack of data scientists needed to really make the most of Big Data programmes. We have seen mention that the need for data scientists will decrease in 2014 with the power and simplicity of tools increasing, but in reality, this is unlikely to be the case yet.

The reason for this was well explained to us by a prominent figure in the Big Data world. It is essentially like having a great stove, great ingredients and great utensils in a kitchen: with these most people could make a meal, but it takes a proper chef to make something gourmet. Companies who believe that they can invest in technology and expect non-data driven employees to create the best insights are going to be wasting money. At the moment we as a society are not data driven enough to truly utilise these new technologies to their potential.

2014 is likely to see an increase in understanding of data and we are likely to see far more companies become data driven and data reactive. However, one thing that would be dangerous to think at this point is that business focussed employees have the ability to do the same as a data scientist.

More Graduates

As data science has become more prevalent in our society and the awareness of the demand for jobs has increased, we are seeing more higher education establishments creating data driven courses. This, combined with the many companies who have sent sections of their workforce to become trained in elements of data science, will mean that 2014 will see an increasing number of qualified data literate workforces.

Despite this, as we have mentioned in the previous point, the demand will still outstrip the supply. As more companies look to implement new data programmes, more data literate people will be necessary to implement these new ideas.

2014 is likely to be another year of transition, with the majority of companies yet to implement Big Data programmes or, if they have, not to their full potential. Due to this we are likely to see an increase in the numbers and varieties of technology, both for basic and in depth analysis.

The likely increase in the number of clients available could create more opportunities for companies who have the expertise and skills to consult or implement across multiple enterprises. Creating additional business opportunities is therefore likely to have a knock on effect on the amount of money invested and therefore the speed of development within many organisations.

It will not be until we have a workforce that is sufficiently data literate that we will truly be able to implement Big Data across every industry that we know could benefit from it. 2014 will represent a year where we move closer to this, but with companies still learning and, in many instances, still yet to implement, it will be a step towards this goal as opposed to the achievement of it.


Transform Big Data into insight, and insight into action

Experts say big data will double every two years. Today, companies are scrambling to figure out the secret to transforming Big Data into big insight, and insight into action. Faced with data that is ever expanding in volume and speed, and questionable in accuracy, businesses must quickly identify how to sift through this changing data to arrive at actionable insight. To unlock the value of Big Data, they’re turning to predictive analytics. At D&B we recognize the value of transforming volumes of data into actionable information. That’s why we now offer an Informed Perspective: indispensable guidance that helps businesses anticipate new opportunities, curb risk, and discover more efficient ways of making decisions.

The future belongs to the informed. For more information, visit

© Dun & Bradstreet, Inc., 2014. All rights reserved. (DB-3706 2/14)




The Continuous Healthcare Learning Ecosystem - How Does It Look and How Do We Get There?

Vipul Kashyap
Senior Director
NYU Langone Medical Center


Despite spending approximately 17% of its GDP on healthcare, the United States lags behind other countries in key healthcare performance metrics. One of the factors behind the poor delivery of healthcare in America is its fragmented healthcare system. There is a lack of collaboration and coordination between healthcare providers, pharmaceutical companies and insurance providers, which prevents effective information and knowledge sharing. In this article I argue that by creating a collaborative healthcare ecosystem America could radically improve patient health.


The Current State of America’s Healthcare System

The three major stakeholders in America today are “Payors”, “Providers” and “Pharmaceutical” manufacturers. Healthcare Payors include healthcare insurance companies, healthcare maintenance organizations and organizations that manage claims for healthcare assistance. The Payor bankrolls the healthcare transaction and views the individual healthcare consumer from a population management perspective. Healthcare metrics are designed to estimate the aggregate risk or probability of a claim for healthcare services, and health services are implemented to reduce that probability or risk.

Providers deliver healthcare services and include hospitals, clinics, primary care centers and other service delivery points. Disease management and other health/advocacy programs are often implemented outside of providers. The result is that the patient receives inconsistent and contradictory information, leading to a reduction in trust and engagement. Pharmaceutical companies, during their clinical trials and drug development programs, also adopt population segmentation and management strategies.

There is concern that the three healthcare silos are not sharing their experiences and data. In remaining separate they are squandering an opportunity to address the discrepancy between healthcare spending and delivery. Experiences of segmenting population groups for clinical trials could be utilized in disease prevention strategies, and the insights into why certain healthcare consumers remain “high risk” could aid their individual treatment in the Provider setting.

The Future State – Continuous Learning Ecosystem

An analysis of the current state strongly suggests a lack of coordination and sharing of information and insights across various stakeholders. Information and knowledge sharing across stakeholders is the fundamental premise of the continuous learning ecosystem. The two key interactions which enable the ecosystem are:

1. The principal stakeholders (providers, payors and pharmaceuticals) continuously share data with each other (assuming privacy concerns have been met). This will allow them to derive new insights into their particular business needs and share these insights with various stakeholders. New insights can then be further derived from pre-existing insights and data … setting in motion a continuous learning cycle.

2. Knowledge and information sharing should ideally decrease the information asymmetry and create value across the ecosystem. However, this requires an investment in information infrastructure and economic incentives to encourage stakeholders to share data and insights with each other.

We now present an example scenario where such collaboration could create value and potentially bootstrap the continuous learning ecosystem.

Example Scenario: Outcomes Contracting and Medication Adherence

The 2010 CIGNA Merck Outcomes Based Contract was an innovative partnership that can be seen as a wider framework for rewarding pharmaceutical companies for improved outcomes. In the context of diabetes patients, Hemoglobin values are a very important outcome. Contracts that create incentives across the various stakeholders can be designed, and the results of the CIGNA Merck Outcomes Based Contract (average cost reduction of $8,000 per patient and 5% improvement in blood sugar levels) demonstrate they can improve patient health and reduce cost.

One example of collaboration in this space is discounts on drug prices from Pharma companies if Payors achieve glycemic level targets across a given population. This would require educational efforts from the provider and strategies to change the behavior of the patient. In this case, the provider needs to have an incentive to achieve these outcomes, a goal similar to the pay for performance vision being implemented by Accountable Care Organizations (ACOs). In exchange for the discounts received, the provider will share the clinical observations with the Payor and the Pharma, which will give them useful information for their population health management strategies and tools for data capture, reporting and analysis.

[Figure: the Employer/Payor Perspective, Pharma Perspective and Provider Perspective, each relating Individuals (Populations?) to Health Advocacy, Health Plans/Payors and Pharmaceutical Companies]

Conclusions

Data, information and knowledge sharing is a necessary component of cost reduction and patient care improvement in America. The example scenario illustrates how collaborations between Payor, Provider and Pharma can help improve outcomes and reduce costs. However, in order to develop a healthcare ecosystem, incentives need to be created for the stakeholders to capture and share data. Investment in tools and infrastructure for data and knowledge management is required, along with frameworks to ensure privacy is respected, and incentives need to be introduced to encourage stakeholders to join the ecosystem.

These issues are currently under discussion at a blog entirely dedicated to the topic at http://continuouslearninge

HR & Workforce Analytics Innovation

19th & 20th March 2014, London

+44 (207) 193 0827




HR Analytics

HR Analytics at Thomson Reuters

An Interview with James Rule, CIO

George Hill Managing Editor


Ahead of his presentation at the HR Analytics Innovation Summit, London on March 19 & 20, we caught up with James Rule, CIO, People & Organization at Thomson Reuters, for a Q&A.

James is a senior HRIS leader with over 15 years’ experience in helping organizations make best use of their investment in HR technology. He believes that to be effective, technologists need to understand as much about people management practices as they do technology: what has happened, is happening and is likely to happen. As a believer in strong talent management, his ethos is to get the right people doing the right thing at the right time, whilst staying engaged, motivated, committed and hopefully enjoying the experience and having fun. James has worked for organizations in the private and public sector and is currently in the Director, CIO People & Organization role at Thomson Reuters after successfully delivering all talent-centric work streams for their Workday implementation.

Having worked at Thomson Reuters in the HR department throughout the data revolution, how has this increase in data impacted on the department and company as a whole?

The shift to HR self service really started this and more recently this has evolved to being able to initiate and interact with HR processes. So the short answer is that there are fewer transactional people in HR now. This in turn is driving the organization to ask more questions around people data in general, and from there more requirements are being made of our HCM platforms in terms of analytics and analysis. From an organizational perspective we are seeing more roles that are either truly partnering with the business, acting as a centre of excellence or providing more support to our HR technology.

How do you think analytics have changed the HR function?

I think they are changing from what is traditionally a “how many have we got of this” or “we have to report on that” to a much more predictive and future looking state. There are still reports that have to be produced (SOX, SafeHarbor, EU country specific laws etc.) but reporting and analytics has moved up the value chain now in terms of answering questions like “how can the organization be more effective?” and “how can we react to organizational change more effectively?” and even “I have this role, who could do it and how do I ensure transparency within the process?”. The latter is a really important point, as presenting data can help lower any perception of closed and unfair selection processes, which is one way of addressing poor engagement and morale.

Having been an integral part of building many of Thomson Reuters’ HR analytics systems, how do you see these systems developing in the next 5 years?

I see a much bigger shift from transactional to transformational and a much bigger component for Big Data, both within the organization and across the industry/segment. Who doesn’t want to know how their benefits packages compare to others? I also see a bigger consumerization of data in terms of configurable dashboards being available to managers and leaders. One of the things I regularly talk about is “Data driven talent management” and I believe it’s becoming more probable. I won’t go into the whole mobile arena but again that is huge!

Do you think that HR analytics are exclusively useful to large companies or can small and medium sized companies benefit?

The investment to be able to analyze and crunch the data is something that traditionally is outside the budgets of smaller companies. However, with the rise of SaaS HCM platforms this investment is carried by the vendor and the companies are paying for what they consume. So if the intention is to use SaaS/Cloud based HCM platforms then any user can benefit. There is a relationship between the complexity of the organization and the amount of data it generates, but all organizations should be thinking about where they are going and how best to design from a people perspective, through to the talent succession example I mention above. At the end of the day skilled and productive resources are rare and no one wants to lose their best talent.

What advice would you give to a company who is looking to implement an HR analytics programme in 2014?

As always when it comes to designing a program, ask “what business problem are you looking to solve?” and focus on the outcomes rather than any technology. There may be an immediate need to start providing reports because they have grown into a certain locale, or it may be something more ethereal like trying to address poor attrition in a certain demographic or field. I’d always recommend talking to people both inside and outside of their organization. I do believe that most companies are trying to solve the same problems when it comes to HR in general and there’s no harm in sharing. It might sound odd, but all companies want to treat their people well if they hope to succeed, and it should be more about the company, its ethics and its products and services and less about how it treats its employees (as it knows from all of its data sources that it treats them well…)

Sports in Data

How are Sports Creating a Data Literate Society?

William Tubbs
Sports Analytics Leader




The use of analytics in Sport affects every stakeholder involved in the industry. From professionals to brands and sponsors, the collection and analysis of data has increased their capacity to make informed decisions. The use of analytics is widespread and incorporated by some of the sporting world’s most prestigious organisations.

One company at the forefront of sports analytics is Opta, an organisation whose mission is to use data to enhance and illuminate every aspect of sport. Instead of being a service for the few, they have brought their data to the masses through their Twitter feeds and increasingly made sports data understandable and noteworthy to the average user. This trend is an important one, and mirrors the societal shift from ‘data fearing’ to ‘data tolerant’.

This month, the Australian Open 2014 will be contested. The tournament’s sponsor IBM will be presenting a new web series, “Game Changers: Real. Sports. Data”. Set up as an initiative to foster social media interaction, it shows how analytics and Sport are converging, being of equal importance to both professionals and fans. This is by no means IBM’s first sports analytics endeavour. The Slamtracker, a free online analytics tool, offers the general public the opportunity to track a vast array of ongoing stats throughout all four of the grand slams, even including the number of positive tweets that have been posted about any selected player at any one time.

During the Wimbledon final in 2013, Andy Murray was the subject of one hundred and twenty thousand tweets per minute. But how many of those would have benefited from detailed sports analytics? The answer is unclear. What is clear is that accessible analytics tools will only increase a fan’s desire to be aware of how their favourite athletes are developing.

For the dedicated fan, the excess of knowledge that sports analytics brings is incredibly exciting. Take the US Open 2013 final between Novak Djokovic and Rafael Nadal. Rafael Nadal won forty-five per cent of his service returns, a number that he has never reached before against Djokovic on hard courts. Was this the reason he won? Armed with this type of stat, the committed fan will approach forthcoming debates with more knowledge and know-how.

It is short sighted to only view Sports analytics from the standpoint of the professional athlete. Cheaper analytics tools are commonplace. New York startup Krossover Intelligence has created a sports analytics platform for amateur coaches and athletes. The web application allows for detailed player and team analysis and even helped high school basketball team Fort Bend Travis Tigers win the Class 5A Texas state championship. With other services such as Hudl and Gamebreaker available, it is of real importance that even amateur sides incorporate sports analytics to maintain their competitiveness.

This trend will mean that high-school students are introduced to data early on in their lives and will allow them to feel more comfortable with data. This could see data being viewed with less pessimism by students who deem it boring and irrelevant for their future careers. Sport analytics could foster the next generation of Big Data experts and even go some way to bridging the Digital Divide if it were implemented in areas that are in danger of being left behind in a technological sense.


Big Data analytics have also driven fantasy sports. Fantasy sport is seen throughout a number of disciplines and allows participants to build their own team. Success is based on the statistical performance of the individual athletes selected. Fantasy football alone will generate around one billion dollars in revenue this year. Baseball was the original haven for budding statisticians, with paper-based analysis published in newspapers the following day, but with data now updated in real time and on demand, users have a number of platforms on which to build their own fantasy teams, often for free or for a nominal fee. With twenty-five million users, fantasy sport could be the catalyst for larger sections of society becoming data tolerant. In this area, data is also intrinsically linked to entertainment and ease of access, both imperative to promoting a digital-friendly society. Additionally, if sports fans witness the effect analytics has had on their favourite pastime, it will increase the likelihood of them using data to shape other aspects of their lives. The possibilities really are endless for the data savvy.

Sports analytics is also becoming a key facet of sports betting. For example, the NBA, one of the globe’s most important sporting organisations, is no stranger to the impacts of the data tolerant society, with Haralabos Voulgaris being perhaps the most notorious sports gambler of all time. At his disposal he has a database of statistics and predictive models that surpass anything at the NBA. He has been known to bring in eighty thousand dollars a game from his bets, all based on the statistical knowledge he garners from the predictive models he has in place. Voulgaris is neither a mathematician nor a data scientist, and yet he has managed to create an analytical system of considerable power, capable of beating the bookies.

What are the implications of this? Sports analytics, as alluded to earlier, will lead to a more informed spectator, one who can make predictions based on hard evidence. This will make the gambler a more worthy opponent to the bookies, although still an underdog. Nevertheless, a data friendly society is not good news for bookies and will inevitably lead to innovation from the odds makers as we as a society become more data literate.

Data was once deemed the antithesis of sport. Sport was entertainment, whilst analytics was more appropriate for making decisions about company mergers or political campaigns. This is now a defunct hypothesis. The modern audience takes great pleasure in discussing the intricacies of sport. Soccer fans have rehearsed the beloved Messi vs Ronaldo argument so often that a website has been spawned detailing in-depth statistics on which player is better. Coupled with social media, sports analytics adds a new depth to sport, allowing us to become enthralled in issues that act as a welcome sub-plot for the viewer, and as more echelons of society become ‘data tolerant’ it will coincide with a wider, more informed debate. Sport has the capability to bring analytics to the masses and will see more and more people become data tolerant.

Data in Policing

The New Crime Fighter: Big Data
Simon Barton, Assistant Editor


Policing is turning to Big Data in order to stem the flow of crime. In the United Kingdom and the USA, police forces are using predictive analytics models to stop crime before it occurs. The hope is that by analysing data trails left by criminals in the past, the police will be able to predict where and how criminals will break the law in the future. Big Data will not only help to bring the common criminal to justice; companies will be able to use it to root out external espionage and fraud, whilst governments will be better able to understand potential terrorist threats, a tactic that has already been implemented by the NSA and GCHQ.

Incorporating data into policing is not a new development. In response to the misconduct of a clutch of police officers across a number of US police departments, an “Early Intervention Accountability” system was put in place to stem corruption. Its implementation, however, was not without its difficulties, with opposition coming from police officer associations who deemed it untrustworthy. Despite its unpopularity, nobody can dispute its importance. The Rampart scandal in Los Angeles highlighted the unfortunate truth that small clusters of the police force are willing to turn to crime and abuse their position of power. Twenty-four officers were found to have committed a crime, making it one of the most widespread cases of police corruption in US history. By using data, police departments are able to identify officers who may be more inclined to commit an offence. Armed with this information, officials are able to intervene early so that scandals such as Rampart, and more recent events such as the Danziger Bridge shootings, do not come to their abhorrent fruition. Corruption is not only a problem in the US. In the UK, the police force has been subject to increased scrutiny since high-profile cases such as the death of Sean Rigg and ‘plebgate’.

Big Data is really starting to have an impact on street crime. Professor Jeff Brantingham and his team at UCLA delved into thirteen million crimes, committed over the last eighty years, in order to help the LAPD better predict future crimes. By using a mathematical model for measuring earthquake aftershocks, Brantingham, with the help of Professor George Mohler, developed an algorithm that allowed similar conclusions to be drawn about the aftershocks that follow a crime. The patterns they have found show that after a crime is committed, there is a higher risk of crime emerging in neighbouring regions. A recent episode of the BBC’s Panorama showed that LAPD officers were given maps that display boxes of five hundred square feet in which crime was most likely to happen. The maps are updated in real time, and the initiative has led to a twenty-six per cent decrease in crime within their precinct.

In Kent in the UK, a similar trial is in progress. By examining historical data and using predictive analysis, officers are able to identify criminal hot spots in their area. Crime has been cut by six per cent in areas in which the system has been used. Despite positive results, the system has only been implemented on a trial basis and its permanent inclusion is still undecided. Questions have also been raised as to whether the hot spots are specific enough, and whether there is enough manpower to patrol all of the highlighted areas. These fears aside, the positive effect predictive analysis has had on crime rates means that its adoption seems imminent in a number of nations and states across the globe.

Another interesting development was seen in New York, where the NYPD and Microsoft joined forces to create ‘The Domain Awareness System’. The system will allow the NYPD to access three thousand CCTV cameras around the city, giving them the option of cross-checking their findings against criminal and terrorist databases. The system works in real time and alerts officers to suspicious events. Since 9/11, vigilance has been prioritised and the NYPD will be grateful for another tool in their quest to stop any further atrocities in the city. Microsoft want to implement the system across a number of different US cities, with the NYPD receiving thirty per cent of all profits made from new installations.

The US military has also used Big Data to learn about the dynamics of complicated terrorist networks. The patterns Big Data reveals mean that terrorists, however sophisticated, find it difficult to go unnoticed. Initiatives such as ‘The Domain Awareness System’ have always been met with disdain from certain sections of society, who claim their civil liberties are being eroded by the perpetual collection of data. But if the use of data can offset terrorism and predict the whereabouts of serious crimes, it is hard to ignore its advantages. The success of predictive analysis in the LAPD was not an aberration, and if seen as a tool to complement the resources already at the disposal of the police, it can certainly be a successful one.





Do You Have The Spark? If you have a new idea that you want to tell the world, contact us to contribute an article or idea

Big Data Innovation, Issue 6  