

Scalable privacy preserving intelligence analysis for resolving identities

Project Objectives

SPIRIT delivers a set of tools to empower LEAs to create semantically rich pictures over all available evidence to be presented at court. SPIRITtools is a platform delivering social graphs of heterogeneous named-entity relationships that performs social and criminal network analyses. It addresses surface and dark Web data acquisition, analysis, modelling and visualisation.

Spirit project shines a light on the dark Web


EU Research

The dark Web provides a high degree of anonymity to users, and it is difficult to identify the people behind criminal activities conducted there. We spoke to Felice Ferrara and Costas Davarakis about the work of the Spirit project in developing tools to acquire and analyse data from different parts of the World Wide Web, which will help the police identify the perpetrators of crimes.

The surface Web is the part of the WWW most commonly used by the general public, to shop online, to communicate over social networks, and to publish information as well as to access it. The dark Web offers similar services with a higher degree of anonymity. This protects user privacy, which is very beneficial, for example, to politically persecuted people. On the other hand, anonymity makes it challenging to identify the perpetrators of criminal activity conducted on the Web, an issue central to the work of the Spirit project. “The overall goal of the project is to address the issue of criminal activity in cyberspace. We are tackling it by developing tools to resolve the identities of potential perpetrators that are using the Web, be it the surface Web or the dark Web,” says Costas Davarakis, the project’s technical coordinator.

The project consortium includes a number of police organisations, who have seen over the last few years that an increasing amount of criminal activity is conducted via the dark Web. “This is why we initially committed a large part of our resources in the project to developing and evaluating solutions for the dark Web,” explains Davarakis.

Spirit project

This does not mean that the surface Web is neglected, however, and researchers in the project are working to develop solutions which are applicable to both parts of the Web, as a cyber-criminal may leave some traces of their real identity there as well.

The ultimate aim here is to develop tools that will support law enforcement agencies (LEAs) in resolving identities, which first requires the gathering of information. “We have to collect information about possible identities, then extract relevant information from the collected resources and analyse it. The basic idea is to collect all the possible information, from all possible data sources, to identify where two or more seemingly distinct subjects are in fact the same individual,” outlines Felice Ferrara, a Senior Software Engineer at Lutech, one of the project partners. This is no easy task, however, as many of the criminals that use the dark Web try to hide their identities when moving around social networks. “We collect information and try to identify where someone is trying to conceal their identity,” says Ferrara.

From a technical point of view, gathering content from the dark Web is fairly similar to gathering it from the surface Web, as it’s primarily a matter of using the right protocols. A large part of the project’s work is focused on developing tools capable of gathering content from the surface Web as well as other networks, such as the Tor network. “The tool is able to recognise the specific characteristics of each of these networks, in order to get the required information,” continues Ferrara.

This is an area in which Ferrara and his colleagues at Lutech hold deep expertise. “In our work at Lutech we develop tools to crawl information from distinct data sources, in order to provide sufficient information to data recognition tools,” he explains. “The key is to provide enough information to use intelligent tools, which can then be used to identify people in pictures, for example by comparing images published on social networks with information from a police database.”

A tool capable of crawling the significant amount of information available on the Web is the entry point for an investigation system where many intelligent services are combined to analyse heterogeneous data sources such as text, video and images. All the content collected by the Spirit crawler is examined by natural language processing services or video analysis mechanisms in order to link entities and ultimately discover identities. “We also have some other models which work on images,” says Ferrara.

There are several intelligent services within Spirit, since different approaches are required to gather information from different sources. “Social networks are very different from dark Web content for example. So we need different services to properly collect data from different data sources,” stresses Ferrara. “Then the huge mass of downloaded data needs to be analysed by a family of intelligent services in order to properly extract entities and identities by taking into account the specific characteristics of each collected resource.”

[Figure: SPIRIT Identity Resolution jobs generating perpetrator clusters and matching hypotheses.]

Evaluating the tools

This is an important issue for the police and LEAs that are the intended end-users of these tools. There are six police organisations within the project consortium, and Davarakis says their priorities in terms of the functions of these tools are clear. “The key aim for the police is to identify a person of interest, for example on the basis of a name or a specific physical attribute,” he explains. These police organisations are playing an important role in providing data, evaluating the system and offering feedback. “Our partners in the project have provided us with a set of anonymised cases. We can then use this to draw inferences regarding identity matching,” continues Davarakis. “Our end-users also have the experience to guide us in terms of the reasoning in these tools, and the use of the machine-learning algorithms that we are employing, in order to ensure that solutions do not infringe data protection laws. We want to make sure that the system does not generate either false positives or false negatives.”

The system itself is essentially an open-source intelligence tool, which is designed to help the police and LEAs identify people of interest. This promises to bring significant benefits to the police, providing another investigative tool to help identify perpetrators of crimes, yet this must not come at the cost of breaking data security laws and infringing civil liberties, an issue of which Davarakis is well aware. “Privacy preservation is a major consideration in all aspects of our activities. In the project we have an independent ethics board, which is comprised of experts coming from legal backgrounds and with ethical expertise,” he stresses.

The next step could be to apply these tools in policing and law enforcement, and Davarakis says the project partners are looking to exploit the research results. “There are several industrial partners in Spirit, and there is interest in promoting and exploiting these tools beyond the project life, together with conventional actors in the market,” he continues. This reflects the rapidly-evolving nature of the field and the technical sophistication of cyber-criminals. Many cyber-criminals are very advanced and continuously develop new methods to evade detection, so it’s important for LEAs to keep pace. “This is why we are keeping the environment open,” stresses Davarakis.

Close collaboration between academia, industry and LEAs is also very beneficial in terms of developing new tools to deal with emerging challenges, says Ferrara. “This project has given us the opportunity to work with LEAs and to understand their needs,” he outlines. “My team is working on cyber-security, and we’re developing anti-crime, anti-fraud, anti-terrorism solutions. We want to use our experience and knowledge gained through the project to extend our offering to our customers and provide stronger analytics tools.”
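For readers who want a concrete picture of what "identifying where two or more seemingly distinct subjects are in fact the same individual" can involve, the sketch below groups online profiles that share an identifying attribute. It is only an illustration and not the Spirit project's actual method: the function name `resolve_identities`, the attribute types and all of the example data are assumptions made up for this sketch, and it links profiles only on exact shared values using a simple union-find structure.

```python
from collections import defaultdict


def resolve_identities(profiles):
    """Group profiles that share at least one identifying attribute value.

    profiles: dict mapping a profile id to a set of (attribute, value) pairs.
    Returns a list of sets of profile ids, one set per candidate cluster.
    """
    # Each profile starts as its own cluster representative.
    parent = {pid: pid for pid in profiles}

    def find(pid):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[pid] != pid:
            parent[pid] = parent[parent[pid]]
            pid = parent[pid]
        return pid

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first profile seen for each attribute value and merge
    # any later profile carrying the same value into that profile's cluster.
    first_seen = {}
    for pid, attributes in profiles.items():
        for attribute in attributes:
            if attribute in first_seen:
                union(first_seen[attribute], pid)
            else:
                first_seen[attribute] = pid

    clusters = defaultdict(set)
    for pid in profiles:
        clusters[find(pid)].add(pid)
    return sorted(clusters.values(), key=lambda cluster: sorted(cluster))


# Hypothetical example data, invented purely for illustration.
profiles = {
    "forum:shadow42": {("email", "s42@example.org"), ("pgp", "ABCD1234")},
    "market:vendor_x": {("pgp", "ABCD1234"), ("wallet", "wallet-001")},
    "social:john_doe": {("email", "jd@example.org")},
    "paste:anon7": {("wallet", "wallet-001")},
}
clusters = resolve_identities(profiles)
for cluster in clusters:
    print(sorted(cluster))
```

In this toy data the forum, marketplace and paste-site profiles end up in one cluster because they are chained together by a shared PGP key and a shared wallet, while the unrelated social profile stays on its own. Exact matching of this kind can only produce hypotheses for an analyst to review, which is in keeping with the article's concern about avoiding false positives and false negatives.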

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 786993.

Project Partners

• LUTECH
• NST - Nydor System Technologies AE
• A E Solutions (BI) Ltd
• SyNTHEMA Artificial Intelligence
• SingularLogic SA
• London Metropolitan University
• West Midlands Police Authority
• STAD Antwerp
• POLICE AND CRIME COMMISSIONER FOR THAMES VALLEY
• Innova Integra Limited
• Ministarstvo Unutrasnjih Poslova Republike Srbije
• Universitat Autonoma de Barcelona
• Hellenic Police
• Linköping University
• European Center of Psychology Investigation Criminology
• Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
• Wyzsza Szkola Policji W Szczytnie

Contact Details

Dr. Costas Davarakis
Senior Consultant - Research Coordinator
SPIRIT Technical Coordinator

Felice Ferrara

Paolo Fabbri, SPIRIT Coordinator

Dr. Felice Ferrara received his PhD in Computer Science in 2012. His research interests are mainly focused on User Modeling and Artificial Intelligence. He is currently working at Lutech as a Security Consultant and Project Manager on Log Management and Anti-Fraud & Crime projects.

Dr. Costas Davarakis, contracted by SingularLogic, is the Managing Director of NST-AE. In the past he has worked as an R&D coordinator in both the public and private sectors in Europe. Having been active during all eight EU research frameworks, he has coordinated many European consortia. Costas is the Technical Coordinator of H2020-SPIRIT.

