
Data Management Culminating

Submitted By: Carly Van Egdom, Jake Jardine
Submitted to: Mr. D
Date Submitted: 17/01/2013
Course Code: MDM 4U

Purpose of Research

Our topic of study compares the quantity of guns in a country with the number of homicides in that country. For this report, we decided to study the effect that the quantity of guns has on a population. We did this by comparing two data sets: the average number of guns per 100 citizens and the annual number of gun-related deaths. With these data sets we ask the question, "Does having more firearms per 100 citizens affect the number of homicides caused by a firearm?"

Hypothesis

A study like this needs a hypothesis. The saying we want to prove or disprove is that "Guns kill people." We personally believe that this statement will be proven correct. We expect that the more guns there are in a country, the more deaths there will be, because any violence that occurs is more likely to involve a gun when guns are readily available. We believe there will be a strong positive correlation between the number of guns and the number of deaths. If there are more guns, the people in that country will be more likely to use those guns to kill people, so the death rate due to guns will be higher. If people don’t have guns, then they can’t kill with guns.

Data Collection

The information in this report is taken from the years 2010-2013. We did this to ensure that the data is relevant and accurately represents today’s society. We chose to focus on first world countries, where everyday citizens have better access to guns. We thought it would be more appropriate to use data from countries that are similar in terms of GDP and development. This reduces external factors that could affect our statistics, such as political unrest.
The trend would be more difficult to see if we included countries from all stages of development, because they will have different levels of gun use based on their development.

The data would most likely have been collected through each country's census. In some countries responding to a census is voluntary, and in others it is required. In the countries where it is required, we would treat the data as a simple random sample, because data points can be randomly selected from the entire required data set. In the countries where citizens respond only if they want to, the data would be a voluntary-response sample. Each country would have a different method of collecting the data, which could cause a bias in the data, because the sampling methods are not consistent.

Background

The NRA has said for many years that the problem in society is not that there are too many guns. Leading members strongly believe that more guns make a country safer, giving citizens the opportunity to defend themselves. The statement that the association defends is that “Guns Don’t Kill People, People Kill People.” We plan to explore the opposing trend: that the more guns there are in a society, the more gun-related deaths there are.

Bias Analysis

Our report studies the relationship between the number of guns in a country and the number of homicides in that same country. We thought that there would be a relationship between the two, because it makes sense that if there are more guns in a country, there will also be more homicides. There will be some bias in the data, however. This bias will be a non-response bias, because some countries, such as China, Russia, and Bangladesh, didn’t have data for this report. Also, some countries have data on other things, such as “Homicide by firearm rate per 100,000 pop”, but not specifically on the number of homicides versus the number of guns in a country.
Because these countries did not provide data and are not included, the report has a non-response bias. This is a bias because it could change the outcome of our report. As China is a huge country and represents a good portion of the world’s population, not having it as a data point could change our conclusion.

Measurement biases could also affect our data. This could happen through an under- or overestimation of population when gathering data from any of the countries. As one data set is the average number of guns per 100 people, a population estimate was involved in obtaining the number.

Sources of Error

One source of error is that the way the data was collected may not accurately represent the population. If a country surveyed only 100,000 people and then took that statistic to represent a country of 1 billion people, its data may not accurately represent the country. The data was collected directly by the countries themselves and then displayed on a third-party website, which is where we found it. The countries themselves could have misrepresented the data, or the truth in their country.

Another source of error is a country's current political situation. In times of political or societal unrest, gun-related violence can rise. This could cause a country’s gun-related homicides to rise much higher than they would be in a stable time, which changes what the “normal” amount of gun-related violence in that country appears to be.

The way that the data was collected could also differ from country to country, and this could alter the conclusions. If one country collects its data using a voluntary-response sample and another uses a simple random sampling technique, this could alter the accuracy of the data.
Because the sampling technique isn’t consistent, the correlation could be incorrect, since the data of one country was gathered in a different way than another.

Outliers Explained

We excluded the United States because of the excessively large number of guns in its possession. The States have so many guns that they alter the line of best fit enough that it no longer accurately represents the data.

In our data, the three obvious outliers were Sweden, Norway and Austria. These three countries varied widely from the rest of the set. All are developed countries like the others, but they don't follow the trend. Finding justification for this was difficult. There is no obvious reason why these countries have many guns but very few homicides. Their laws are similar to those of Canada and other countries, but the deaths are substantially lower. Why is this?

To answer this question, we researched the countries and their everyday behaviours and views. We believe that the primary reason for the difference is the countries' outlook on guns. There are substantial numbers of firearms in Sweden, Norway and Austria, but they do not exist for the same reasons as in Canada and the other countries. In these countries it is not uncommon to own a gun, but it is very uncommon to have a gun in your house. This is because they do not use guns for self-defence, but specifically for hunting. In conclusion, we believe the deaths are lower because of society's views on guns.

Although these outliers did not have a major effect on the graph, we decided to remove them so that we could compare countries that we believe have similar cultures and views on the subject. We believe that many of the other countries included hold similar views, yet their difference from the mean isn't as steep. If research were done on each individual country, and the countries were grouped with like countries, we believe that the trend would be more evident.
We decided to choose countries that are deemed “first world” countries. Because they have a similar GDP, we believed that they would most likely have a similar government structure and culture. We took a group of these countries and graphed them to see if there was a trend. We also removed the United States from our graph because it has such a high crime rate and such a high number of guns that it would not fit on the graph.

Our Data and Graphs

Calculations Based on Data and Graphs

Measures of Central Tendency

Mean

Number of Homicides with Outliers
$$M = \frac{\sum x}{N} = \frac{733}{16} = 45.8$$

Number of Homicides without Outliers
$$M = \frac{\sum x}{N} = \frac{676}{13} = 52$$

Amount of Guns with Outliers
$$M = \frac{\sum x}{N} = \frac{281}{16} = 17.6$$

Amount of Guns without Outliers
$$M = \frac{\sum x}{N} = \frac{187.7}{13} = 14.4$$

Correlation Coefficient Outliers Included

Country             Number of Homicides (X)   Guns per 100 People (Y)   xy        x²      y²
Australia           30                        15                        450       900     225
Canada              173                       30.8                      5328.4    29929   948.64
Czech Republic      20                        16.3                      326       400     265.69
Denmark             15                        12                        180       225     144
Singapore           1                         0.5                       0.5       1       0.25
Netherlands         55                        3.9                       214.5     3025    15.21
Japan               11                        0.6                       6.6       121     0.36
England and Wales   41                        6.2                       254.2     1681    38.44
Spain               90                        10.4                      936       8100    108.16
Germany             158                       30.3                      4787.4    24964   918.09
Northern Ireland    5                         21.9                      109.5     25      479.61
Belgium             70                        17.2                      1204      4900    295.84
New Zealand         7                         22.6                      158.2     49      510.76
Sweden              37                        31.6                      1169.2    1369    998.56
Austria             18                        30.4                      547.2     324     924.16
Norway              2                         31.3                      62.6      4       979.69
SUM                 733                       281                       15734.3   76017   6852.46
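As a cross-check on the hand-calculated means, the short sketch below recomputes them from the table values. The list name `DATA`, the set `OUTLIERS`, and the helper `column_means` are illustrative names introduced here, not part of the original report. The Sweden row (37 homicides, 31.6 guns per 100 people) is inferred from the column sums and squared values, so treat it as an assumption.

```python
from statistics import mean

# (country, homicides X, guns per 100 people Y), transcribed from the report's table.
# Assumption: Sweden's row (37, 31.6) is inferred from the column sums and squares.
DATA = [
    ("Australia", 30, 15.0), ("Canada", 173, 30.8), ("Czech Republic", 20, 16.3),
    ("Denmark", 15, 12.0), ("Singapore", 1, 0.5), ("Netherlands", 55, 3.9),
    ("Japan", 11, 0.6), ("England and Wales", 41, 6.2), ("Spain", 90, 10.4),
    ("Germany", 158, 30.3), ("Northern Ireland", 5, 21.9), ("Belgium", 70, 17.2),
    ("New Zealand", 7, 22.6), ("Sweden", 37, 31.6), ("Austria", 18, 30.4),
    ("Norway", 2, 31.3),
]
OUTLIERS = {"Sweden", "Austria", "Norway"}


def column_means(rows):
    """Return (mean homicides, mean guns per 100 people) for the given rows."""
    homicides = [x for _, x, _ in rows]
    guns = [y for _, _, y in rows]
    return mean(homicides), mean(guns)


with_outliers = column_means(DATA)
without_outliers = column_means([r for r in DATA if r[0] not in OUTLIERS])

print("With outliers:    homicides %.1f, guns %.1f" % with_outliers)
print("Without outliers: homicides %.1f, guns %.1f" % without_outliers)
```

Filtering by country name keeps a single copy of the data, so the with-outliers and without-outliers figures cannot drift apart through a transcription slip.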

$$R = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left(n\sum x^2 - (\sum x)^2\right)\left(n\sum y^2 - (\sum y)^2\right)}}$$

$$R = \frac{(16)(15734.3) - (733)(281)}{\sqrt{\left((16)(76017) - (733)^2\right)\left((16)(6852.46) - (281)^2\right)}}$$

$$R = \frac{45775.8}{\sqrt{(678983)(30678.36)}} = \frac{45775.8}{144326.31} = 0.32$$

Correlation Coefficient with Outliers Removed

Country             Number of Homicides (X)   Guns per 100 People (Y)   xy        x²      y²
Australia           30                        15                        450       900     225
Canada              173                       30.8                      5328.4    29929   948.64
Czech Republic      20                        16.3                      326       400     265.69
Denmark             15                        12                        180       225     144
Singapore           1                         0.5                       0.5       1       0.25
Netherlands         55                        3.9                       214.5     3025    15.21
Japan               11                        0.6                       6.6       121     0.36
England and Wales   41                        6.2                       254.2     1681    38.44
Spain               90                        10.4                      936       8100    108.16
Germany             158                       30.3                      4787.4    24964   918.09
Northern Ireland    5                         21.9                      109.5     25      479.61
Belgium             70                        17.2                      1204      4900    295.84
New Zealand         7                         22.6                      158.2     49      510.76
SUM                 676                       187.7                     13955.3   74320   3950.05
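The correlation coefficient formula used in this section can also be evaluated with a short script, which is a convenient way to check the column sums and the arithmetic. This is only a sketch: `DATA`, `OUTLIERS`, and `pearson_r` are illustrative names, and the Sweden row (37, 31.6) is inferred from the column sums and squared values rather than read directly from the table.

```python
from math import sqrt

# (country, homicides X, guns per 100 people Y), transcribed from the report's tables.
# Assumption: Sweden's row (37, 31.6) is inferred from the column sums and squares.
DATA = [
    ("Australia", 30, 15.0), ("Canada", 173, 30.8), ("Czech Republic", 20, 16.3),
    ("Denmark", 15, 12.0), ("Singapore", 1, 0.5), ("Netherlands", 55, 3.9),
    ("Japan", 11, 0.6), ("England and Wales", 41, 6.2), ("Spain", 90, 10.4),
    ("Germany", 158, 30.3), ("Northern Ireland", 5, 21.9), ("Belgium", 70, 17.2),
    ("New Zealand", 7, 22.6), ("Sweden", 37, 31.6), ("Austria", 18, 30.4),
    ("Norway", 2, 31.3),
]
OUTLIERS = {"Sweden", "Austria", "Norway"}


def pearson_r(rows):
    """Correlation coefficient from the raw sums, matching the report's formula."""
    n = len(rows)
    sum_x = sum(x for _, x, _ in rows)
    sum_y = sum(y for _, _, y in rows)
    sum_xy = sum(x * y for _, x, y in rows)
    sum_x2 = sum(x * x for _, x, _ in rows)
    sum_y2 = sum(y * y for _, _, y in rows)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r_with = pearson_r(DATA)
r_without = pearson_r([r for r in DATA if r[0] not in OUTLIERS])

print(f"R with outliers:    {r_with:.2f}")
print(f"R without outliers: {r_without:.2f}")
```

Because the function works from the same five sums as the hand calculation, each intermediate value can be compared against the SUM row of the tables.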

$$R = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left(n\sum x^2 - (\sum x)^2\right)\left(n\sum y^2 - (\sum y)^2\right)}}$$

$$R = \frac{(13)(13955.3) - (676)(187.7)}{\sqrt{\left((13)(74320) - (676)^2\right)\left((13)(3950.05) - (187.7)^2\right)}}$$

$$R = \frac{54533.7}{\sqrt{(509184)(16119.36)}} = \frac{54533.7}{90596.47} = 0.60$$

Linear Regression Outliers Included

Country             Number of Homicides (X)   Guns per 100 People (Y)   xy        x²
Australia           30                        15                        450       900
Canada              173                       30.8                      5328.4    29929
Czech Republic      20                        16.3                      326       400
Denmark             15                        12                        180       225
Singapore           1                         0.5                       0.5       1
Netherlands         55                        3.9                       214.5     3025
Japan               11                        0.6                       6.6       121
England and Wales   41                        6.2                       254.2     1681
Spain               90                        10.4                      936       8100
Germany             158                       30.3                      4787.4    24964
Northern Ireland    5                         21.9                      109.5     25
Belgium             70                        17.2                      1204      4900
New Zealand         7                         22.6                      158.2     49
Sweden              37                        31.6                      1169.2    1369
Austria             18                        30.4                      547.2     324
Norway              2                         31.3                      62.6      4
SUM                 733                       281                       15734.3   76017


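$$A = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = \frac{16(15734.3) - (733)(281)}{16(76017) - (733)^2} = \frac{45775.8}{678983} = 0.0674$$

$$\bar{X} = \frac{\sum X}{N} = \frac{733}{16} = 45.8$$

$$\bar{Y} = \frac{\sum Y}{N} = \frac{281}{16} = 17.6$$

$$B = \bar{Y} - A\bar{X} = 17.6 - (0.0674)(45.8) = 14.5$$

$$Y = 0.0674X + 14.5$$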
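The slope and intercept calculation can be sketched in code using the same least-squares formulas for A and B. The names `DATA`, `OUTLIERS`, and `fit_line` are illustrative, and the Sweden row (37, 31.6) is an inferred value rather than one read directly from the table. The script keeps full precision throughout, so its intercept can differ slightly in the last digits from a hand calculation that rounds the slope and the means first.

```python
# (country, homicides X, guns per 100 people Y), transcribed from the report's tables.
# Assumption: Sweden's row (37, 31.6) is inferred from the column sums and squares.
DATA = [
    ("Australia", 30, 15.0), ("Canada", 173, 30.8), ("Czech Republic", 20, 16.3),
    ("Denmark", 15, 12.0), ("Singapore", 1, 0.5), ("Netherlands", 55, 3.9),
    ("Japan", 11, 0.6), ("England and Wales", 41, 6.2), ("Spain", 90, 10.4),
    ("Germany", 158, 30.3), ("Northern Ireland", 5, 21.9), ("Belgium", 70, 17.2),
    ("New Zealand", 7, 22.6), ("Sweden", 37, 31.6), ("Austria", 18, 30.4),
    ("Norway", 2, 31.3),
]
OUTLIERS = {"Sweden", "Austria", "Norway"}


def fit_line(rows):
    """Least-squares line Y = A*X + B, using the report's formulas for A and B."""
    n = len(rows)
    sum_x = sum(x for _, x, _ in rows)
    sum_y = sum(y for _, _, y in rows)
    sum_xy = sum(x * y for _, x, y in rows)
    sum_x2 = sum(x * x for _, x, _ in rows)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - a * (sum_x / n)  # B = mean(Y) - A * mean(X)
    return a, b


a_with, b_with = fit_line(DATA)
a_without, b_without = fit_line([r for r in DATA if r[0] not in OUTLIERS])

print(f"Outliers included: Y = {a_with:.4f}X + {b_with:.2f}")
print(f"Outliers removed:  Y = {a_without:.4f}X + {b_without:.2f}")
```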

Linear Regression Outliers Removed

Country             Number of Homicides (X)   Guns per 100 People (Y)   xy        x²
Australia           30                        15                        450       900
Canada              173                       30.8                      5328.4    29929
Czech Republic      20                        16.3                      326       400
Denmark             15                        12                        180       225
Singapore           1                         0.5                       0.5       1
Netherlands         55                        3.9                       214.5     3025
Japan               11                        0.6                       6.6       121
England and Wales   41                        6.2                       254.2     1681
Spain               90                        10.4                      936       8100
Germany             158                       30.3                      4787.4    24964
Northern Ireland    5                         21.9                      109.5     25
Belgium             70                        17.2                      1204      4900
New Zealand         7                         22.6                      158.2     49
SUM                 676                       187.7                     13955.3   74320

$$A = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = \frac{13(13955.3) - (676)(187.7)}{13(74320) - (676)^2} = \frac{54533.7}{509184} = 0.1071$$

$$\bar{X} = \frac{\sum X}{N} = \frac{676}{13} = 52$$

$$\bar{Y} = \frac{\sum Y}{N} = \frac{187.7}{13} = 14.4$$

$$B = \bar{Y} - A\bar{X} = 14.4 - (0.1071)(52) = 8.8308$$

$$Y = 0.1071X + 8.8308$$

Standard Deviation

For all of the tables in this section, the data was inserted into the following equation.

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$

With Outliers

Without Outliers
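The standard deviation equation can be applied to the homicide and gun columns with a few lines of code. This is a minimal sketch of the population formula (dividing by N), not a reproduction of the report's own tables. `DATA`, `OUTLIERS`, and `population_std` are illustrative names, and the Sweden row (37, 31.6) is inferred from the column sums and squared values.

```python
from math import sqrt

# (country, homicides X, guns per 100 people Y), transcribed from the report's tables.
# Assumption: Sweden's row (37, 31.6) is inferred from the column sums and squares.
DATA = [
    ("Australia", 30, 15.0), ("Canada", 173, 30.8), ("Czech Republic", 20, 16.3),
    ("Denmark", 15, 12.0), ("Singapore", 1, 0.5), ("Netherlands", 55, 3.9),
    ("Japan", 11, 0.6), ("England and Wales", 41, 6.2), ("Spain", 90, 10.4),
    ("Germany", 158, 30.3), ("Northern Ireland", 5, 21.9), ("Belgium", 70, 17.2),
    ("New Zealand", 7, 22.6), ("Sweden", 37, 31.6), ("Austria", 18, 30.4),
    ("Norway", 2, 31.3),
]
OUTLIERS = {"Sweden", "Austria", "Norway"}


def population_std(values):
    """Population standard deviation: sqrt(sum((x - mu)^2) / N)."""
    n = len(values)
    mu = sum(values) / n
    return sqrt(sum((v - mu) ** 2 for v in values) / n)


kept = [r for r in DATA if r[0] not in OUTLIERS]

std_homicides_with = population_std([x for _, x, _ in DATA])
std_guns_with = population_std([y for _, _, y in DATA])
std_homicides_without = population_std([x for _, x, _ in kept])
std_guns_without = population_std([y for _, _, y in kept])

print(f"With outliers:    homicides {std_homicides_with:.2f}, guns {std_guns_with:.2f}")
print(f"Without outliers: homicides {std_homicides_without:.2f}, guns {std_guns_without:.2f}")
```

Dividing by N rather than N - 1 matches the equation as written; a sample standard deviation would give slightly larger values.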

Analysis of the Lines of Best Fit

The line of best fit changes when the outliers are removed. Both the slope (the m value) and the y-intercept change. When the outliers are included, the slope is lower (0.0674), so the line rises less over the same horizontal distance. When the outliers are removed, the slope is higher (0.1071), so the line rises more over the same distance. On our graphs, the line without the outliers sits at roughly a 30 degree angle, versus roughly a 15 degree angle for the line with the outliers included. Both lines of best fit show a positive relationship between the two variables. A steeper line means that Y changes more for each unit of X; the strength of the correlation itself is measured by the correlation coefficient, which is also higher when the outliers are removed.

Assumptions

In this report, we assumed that the data we used was gathered accurately. We assumed that all the data was obtained without bias, and that proper sampling techniques were used. We also assumed that the data accurately represented the society of each country. The data could easily have been representative of only one part of a country, while we assumed that it represented the whole country. We also assumed that all firearms within a country are registered, which is very unlikely to be true. It is extremely unlikely that every firearm within a country will be recorded in that country's gun registry. This is an assumption we make in order to use the data.

Limitations of the Data

As we chose not to represent all countries in our report, the accuracy of our conclusion is limited. The data set we chose is also very difficult to collect accurately. Millions of guns are unregistered. The large gap between the recorded data and the actual numbers makes our findings less accurate. In conclusion, our data set is very broad, and the gaps in it create a lot of limitations.

Relationship Between X and Y

We are trying to show a cause and effect relationship in this study: that the number of guns in a country causes the number of homicides in that country to increase. There is also the potential for a reverse cause and effect relationship, however. The number of homicides in a country could cause an increase in the number of guns, because citizens will be scared and will want to protect themselves with guns. This relationship could exist especially in countries like the US, where owning a gun for self-protection is common.

Possible Extensions of the Research

There are a few potential extensions of this research. We could research further into the relationship between the number of guns and all gun-related deaths, rather than homicides alone. In addition, we could gather more information on the effects guns have on society. We believe that there could be clear trends between the number of guns and other factors in society. Some of these factors could include the crime rate (in general, not specifically gun crime), the living standards of a country and the overall safety of a country.

Conclusions Based on the Findings

Based on the findings of the graphs, we concluded that there is a weak positive linear correlation between the number of guns in a country and the number of homicides in that same country when the outliers are included. When the outliers are removed, there is a moderate positive linear correlation. This was shown by the graphs, by the lines of best fit, and by the correlation coefficient. The correlation coefficient when the outliers are included is 0.32. The correlation coefficient when the outliers aren’t included is 0.60. Based on the correlation coefficient, there is a correlation between the number of guns and the number of homicides.

Therefore, our data suggests that as the number of guns in a country increases, the number of homicides also tends to increase. This speaks to what millions of North Americans are wondering: do guns cause more deaths? Our findings suggest that they do. A correlation on its own does not prove causation, but we believe this result should weigh heavily in the debate about gun control, as the number of guns appears to play a large role.

Data Analysis  