DR INVESTIGATIVE DATA TEAM Contribution from DR DATA TEAM to the Data Journalism Award 2014 / Corporate tax
We live by two mottos when we select data for stories.
1. We only do the story if we find news. No news, we reject the data.
2. We want data to have relevance to our readers. If we spend a lot of resources on interactive graphics, we want our readers to find something useful when they click on the graphic.
In this portfolio you will find a selection of the stories and illustrations we have published since we first went “on air” on November 1st 2013. For a full overview of our production, please visit this webpage: http://www.pinterest.com/katrinefrich/drs-unders%C3%B8gende-databaseredaktion/
All articles are in Danish.
Katrine Birkedal Frich
Mads Rafte Hein
Kresten Morten Munksgaard
Bo Elkjær (Skipper)
Jens Lykke Brandt
The DR Investigative Data Team was launched October 1st 2013. We are one editor, two journalists, one graphics designer and one programmer. We do all parts of data journalism ourselves: from scraping data off the web and using freedom of information requests to dig out data and documents from public administration, through selecting, sorting, refining, filtering and analyzing data, to the final visual and editorial presentation.
Published December 2013
Time spent on this story: 20 days
Readers since publishing: 279,878

The project was to map and illustrate the corporate tax paid by Danish companies in 2012. The project was based on data released by the Danish tax authority SKAT. It is only the second time since 2012 that SKAT has released data in full on corporate taxes. The Data Team was met with several challenges in collecting, analyzing and presenting the data.

Some 250,000 companies are listed as taxable. 57,000 companies actually paid tax. One percent of these companies paid more than two thirds of the total corporate tax paid to the Danish exchequer. Out of this one percent, only seven companies paid one third of the total corporate tax. Finally, the corporate tax paid formed 5.6 percent of the total tax paid to the Danish exchequer by all taxpayers.

What did we do?
We decided to create a slideshow describing the corporate tax paid in 2012. This main story was to be accompanied by other stories, describing the largest taxpayers, the companies that lost money in 2012, and the distribution of corporate tax to the Danish municipalities. Data was released from SKAT the 5th of December 2012 and the stories were to be published on December 27 and 28.

Collection of data
The primary obstacle was the collection of data. Even though SKAT released data on all Danish companies’ tax payment in 2012, due to political reasons the release was severely amputated. Because of the way the data was released you could only get access to information on the companies one at a time. SKAT opened access to a database, where you could search the companies by name or by registration number. Doing this you could get access to a page showing you the company name, the registration number, type of corporation, applicable tax code, and corporate tax for the company, taxable income and deductible deficit. Furthermore, if applicable, the page would contain information on taxed income of oil extraction for the companies operating in the North Sea and also, when applicable, information on companies under joint taxation.

Since the data was released in the way it was, we needed to set up an automatic scraper that would search SKAT’s database by company number and copy off the data one company at a time.

To do this we downloaded a full list of all company registration numbers from the Danish company registry cvr.dk. This list was fed into a program that was coded by the Data Team’s programmer. The code was set up to collect between 20 and 30 individual company records per second from SKAT’s database. It took a few days to collect tax records on 243,000 taxable companies.

Since data on each company was very sparse we needed to combine the information collected from SKAT with data from the Danish company registry cvr.dk in order to get information on addresses and accompanying municipalities on each company.

Both the spreadsheet with the data from SKAT and the full list of company records from cvr.dk were per se too large to handle in Microsoft Excel. So in order to combine the data we imported the data from the scraper into OpenRefine and combined it with the full list of company records from cvr.dk.

After combining and cleaning up the data we were able to export it in files that could be imported into Microsoft Excel and analyzed there.

Analyzing the data
According to the Danish tax code the municipalities each get 13.41 percent in proceeds of the paid corporate tax. But parts of the proceeds are divided between the municipalities after a set of distribution keys dependent on distribution of employees, daughter companies, etc. Also the proceeds are distributed with a three year delay, so that the proceeds each municipality receives in 2012 were actually paid in tax in 2009. In total this means that even though we had data on locations in municipalities of companies we were unable to directly compare the municipalities by proceeds based on the 2012 information from SKAT.

In order to get accurate data on municipality tax revenue we collected data on the actual reported revenue from Statistics Denmark. These figures were divided by public records on municipality population.

Presenting the data
The first batch of stories, published December the 27th, was centered on the municipality revenues from corporate tax. The main story was carried by a map showing the revenues per citizen, nationwide. Thus, we could show the rich and the poor municipalities based on corporate tax revenue, showing the very large differences nationwide in revenue. http://www.dr.dk/Nyheder/Penge/2013/12/23/195455.htm

We also published stories with lists over the wealthiest and poorest municipalities and accompanying interviews, also describing how the wealthiest municipalities lost revenue in the national redistribution which takes place each year according to the distribution keys briefly mentioned above.

The second batch of stories, published December the 28th, was centered on the interactive graphics we developed to show the distribution of corporate tax payments.

Again, the main story was accompanied by other articles catching up on different aspects of the issue. Company directors were interviewed and experts were interviewed who nuanced the information presented and put it in a national financial context.
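The collection and analysis steps described in this section (one lookup per registration number at a capped rate, a join against the company registry, and revenue per citizen) could be sketched roughly as follows. This is only an illustration in Python, not the team's actual program: the text says the join was done in OpenRefine, and the `fetch` callable, the field names `reg_no` and `municipality`, and the default rate of 25 lookups per second are all assumptions made for the example.

```python
import time
from typing import Callable, Iterable, Optional


def collect_records(
    reg_numbers: Iterable[str],
    fetch: Callable[[str], Optional[dict]],
    max_per_second: float = 25.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Look up one company at a time, never faster than max_per_second."""
    min_interval = 1.0 / max_per_second
    records = []
    for reg_no in reg_numbers:
        started = time.monotonic()
        record = fetch(reg_no)  # hypothetical lookup; None means "not found"
        if record is not None:
            records.append(record)
        remaining = min_interval - (time.monotonic() - started)
        if remaining > 0:
            sleep(remaining)  # wait out the rest of this request's time slot
    return records


def join_with_registry(tax_records: list, registry: dict) -> list:
    """Attach a municipality to each tax record, keyed by registration number."""
    joined = []
    for rec in tax_records:
        entry = registry.get(rec["reg_no"])
        if entry is None:
            continue  # no registry match; a real pipeline would log these
        joined.append({**rec, "municipality": entry["municipality"]})
    return joined


def revenue_per_citizen(revenue: dict, population: dict) -> dict:
    """Divide reported municipal revenue by population, skipping unknowns."""
    return {
        name: revenue[name] / population[name]
        for name in revenue
        if population.get(name)
    }
```

Passing `fetch` and `sleep` in as parameters keeps the sketch independent of any particular website and makes it possible to try the logic against a small in-memory dictionary before pointing it at a real source.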
A motion story about tax
The presentation utilizes the D3 framework for illustration and animation. The large data set was compiled into a dense format for transport to the client, where the information would be extracted again in order to fit into the animation model developed for this presentation.

The animation itself consists of roughly 1000 tiny boxes that are animated using randomized values for delay and duration, and thus each session sports a unique animation. http://www.dr.dk/Nyheder/
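The idea of giving each box its own random delay and duration can be sketched as below. The original presentation was built with D3 in the browser; this is a language-neutral illustration in Python, and the function name `box_timings`, the millisecond ranges and the optional `seed` parameter are assumptions for the example rather than values from the project.

```python
import random
from typing import Optional


def box_timings(
    n_boxes: int = 1000,
    max_delay_ms: float = 2000.0,
    min_duration_ms: float = 400.0,
    max_duration_ms: float = 1500.0,
    seed: Optional[int] = None,
) -> list:
    """Return one random (delay, duration) pair per box."""
    rng = random.Random(seed)  # seed=None gives a different result each session
    return [
        {
            "delay": rng.uniform(0.0, max_delay_ms),
            "duration": rng.uniform(min_duration_ms, max_duration_ms),
        }
        for _ in range(n_boxes)
    ]
```

Leaving the seed unset gives every session a different set of timings, which matches the effect described above; fixing the seed makes a run reproducible for debugging.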