Casual Connect Summer 2011

Page 20

By the Numbers

Moving Download Customers with Analytics

[Figure 6: Taylor. Histogram of AGE against MALE and FEMALE.]

[Figures 7 to 15: histograms of AGE (10-year bands, 0-9 through 100-109) against MALE and FEMALE for the names Mary, John, Florence, Harry, Ethel (Figure 10), Clara, Clarence, Jessie and Sarah.]

Figures 7 through 15 show the gender/age breakdowns of surviving people for a selection of names, with the granularity of age reduced to 10 years. These histograms show the probability that a person you meet of that name is of that age and gender. Figure 10 shows that if you meet an Ethel, she is most likely to be between 60 and 69 years of age. (Ethel was a more popular name earlier than that but, sadly, not many Ethels born in 1920 or before are still alive.) Using the data in Figure 6, we can predict that, if you meet a Taylor, there is a 44 percent chance that she is a girl between the ages of 10 and 19. If you meet a Mary, she is most likely to be between the ages of 50 and 59; John would most likely be between 40 and 59, Florence between 70 and 79, Clarence between 50 and 59, Sarah between 20 and 29, and so on.
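As a rough illustration of how such a histogram can be built and read, the sketch below normalises counts of living people per age band and gender into probabilities and then picks the most likely band. The function names and the counts are invented for the example (they are not taken from the figures), so treat it as a sketch under those assumptions rather than the method used to produce the charts.

```python
def name_distribution(counts):
    """Turn raw counts per (age_band, gender) into probabilities that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no living people recorded for this name")
    return {key: n / total for key, n in counts.items()}


def most_likely(distribution):
    """Return the ((age_band, gender), probability) pair with the highest probability."""
    return max(distribution.items(), key=lambda item: item[1])


def by_age_band(distribution):
    """Collapse gender to get the probability of each age band on its own."""
    totals = {}
    for (age_band, _gender), p in distribution.items():
        totals[age_band] = totals.get(age_band, 0.0) + p
    return totals


# Hypothetical counts for one first name (illustrative only, not real data).
example_counts = {
    ("0-9", "FEMALE"): 25,
    ("10-19", "FEMALE"): 40,
    ("10-19", "MALE"): 15,
    ("20-29", "FEMALE"): 15,
    ("20-29", "MALE"): 5,
}

dist = name_distribution(example_counts)
(best_band, best_gender), best_p = most_likely(dist)
print(f"Most likely: {best_gender}, {best_band} ({best_p:.0%})")
```

Reading a chart such as Figure 10 amounts to the `most_likely` step: find the tallest bar and report its age band and gender.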


Practical Demonstration: Poker Cards

So how would we use this sort of analysis on a customer database? I'll show you how I applied this technique at my company, GreatPokerHands. The first step was to process the sales data to extract just the first name from each sales record. We only need the first name for this analysis, and keeping nothing else removes the potential privacy implications of storing and using personally identifiable information, since a first name alone is not sufficient to uniquely identify a person. In addition, using just the first name makes for smaller files, which can be important if the database is large.

Figure 16 shows a breakdown of the most popular names of my customers (the bars show the relative volume of sales for each name). At first glance, the names appear entirely male-biased. However, we need to be careful not to draw the wrong conclusions from this. This table only represents the most popular (modal) names. There is a very long

