Visualizing movie data 1: Mapping data

Henk Lamers

The aim of Visualizing Movie Data is to give more insight into the choices we make when evaluating a movie. Ultimately, this could tell us more about ourselves than about the movies. If that’s true, it would be a positive spin-off. In Visualizing Movie Data I try different ways of visualizing data that we collected ourselves, with some support from Ben Fry’s book ‘Visualizing Data’, published by O’Reilly Media Inc. Unlike the previous project, Generative Design Variations, Visualizing Movie Data is not about making as many variations as possible. The intention now is to use the movie data in as functional and basic a way as possible. I will try to skip any form of decoration, although you never know. In this publication I use a very simple and basic way of reading, displaying and interacting with a number of small datasets. These datasets consist partly of the data we found in our movie reviews. In this publication I position film locations on a map. Another publication could, for instance, be about the height (or length) of actors.

143 Movie reviews of the 200 movies that we watched in 2015.

Corita Kent in her apartment, 1970. Courtesy of the Corita Art Center.

Corita Kent teaching with LIFE Magazine, photograph by Mary Anne Karia in 1965. Courtesy Mary Anne Karia.

Corita Kent, song about the greatness, 1964.

Corita Kent, (a little) more careful, 1967.

Corita Kent, untitled, 1986.

Alfred Hitchcock, Anglo-American director and producer, 1899–1980 (aged 80), Leytonstone, London, England.

John Cage, American composer, music theorist, writer, philosopher, and artist, 1912–1992.

Saul Bass in 1985, American graphic designer and filmmaker, 1920–1996. Photo: Tony Barnard/Los Angeles Times.

Richard Buckminster Fuller, American architect, systems theorist, author, designer, and inventor, 1895–1983.

Ray and Charles Eames, American designers, architects and film makers, 1912–1988 | 1907–1978.

Merce Cunningham, American dancer and choreographer, 1919–2009.

Let’s start with a historical mistake. Corita Kent (1918–1986) was an American Roman Catholic nun, artist, and educator who lived and worked between 1938 and 1968 at the Immaculate Heart Community of Los Angeles, USA. She taught there and became the chair of its art department in 1964. Her classes were an avant-garde mecca for artists and inventors such as Alfred Hitchcock, John Cage, Saul Bass, Buckminster Fuller and Charles & Ray Eames. She was also the writer of ‘Ten rules for Students and Teachers. The principles for learning how to live creatively and embracing uncertainty’. These ten rules were adopted by Immaculate Heart College as its official art department rules. The historical mistake is that these rules were commonly attributed to John Cage. The misattribution happened because rule ten quotes John Cage directly. Merce Cunningham, John Cage’s life partner, kept a copy of the list in the studio where he rehearsed. After Corita Kent’s death, the ten rules were featured in a number of books and then spread to the internet. But what is even more important in the context of this Visualizing Movie Data project is the fact that besides these ten rules there are a few hints for students: ‘Always be around. Come or go to everything. Always go to classes. Read anything you can get your hands on. Save everything, it might come in handy later.’ And one of the last hints: ‘Look at movies carefully, often.’

Film review Essential Killing.

That is what we have done since 2008. In 2013 we began reviewing movies. In the beginning we rated each movie with a single score between zero and ten. At a later stage we used a more narrative way: we wrote detailed reviews of movies that we posted on Facebook. But over time that was too much work. You have to remember all the facts and figures, you can’t concentrate on the film itself, and it costs too much time. We decided to introduce a more accurate (and less emotional) way of reviewing. On January 5, 2015 we started using 13 categories for each movie: Storyline, Originality, Cinematography, Involvement, Sound, Editing, Educational, Title design, Acting, Interesting, Unusual, Exciting and Superior. Every category earns a score between zero and ten. A simple Processing program adds all the points together and divides the total by the number of categories, which gives an average score for the movie. On the right, a review of the film ‘Essential Killing’. The program also produces a text file with the results, which we send as a rating and review to Apple iTunes, Netflix or MUBI.
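The scoring step can be sketched in a few lines. This is a Python illustration, not the original Processing program; the function names, the layout of the review text and the example scores are assumptions made up for this sketch.

```python
# The 13 categories named in the text.
CATEGORIES = [
    "Storyline", "Originality", "Cinematography", "Involvement", "Sound",
    "Editing", "Educational", "Title design", "Acting", "Interesting",
    "Unusual", "Exciting", "Superior",
]


def average_score(scores):
    """Add all category points and divide by the number of categories."""
    missing = [name for name in CATEGORIES if name not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    for name in CATEGORIES:
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    return sum(scores[name] for name in CATEGORIES) / len(CATEGORIES)


def review_text(title, scores):
    """Build a plain-text review: one line per category, then the average."""
    lines = [title]
    lines += [f"{name}: {scores[name]}" for name in CATEGORIES]
    lines.append(f"Average: {average_score(scores):.1f}")
    return "\n".join(lines)


# Hypothetical scores, not the real review of any movie.
example = {name: 7 for name in CATEGORIES}
example["Sound"] = 9
example["Acting"] = 8
print(review_text("Example movie", example))
```

The returned text could be written to a file with an ordinary file write; the exact format that is sent to the streaming services is not described here, so the layout above is only a placeholder.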

VMD_01_01

In this ‘Mapping Movie Data’ publication I will use the datasets to draw points on a map. I started by looking for a world map. Eventually I found one that uses the Mercator projection, a conformal cylindrical projection with large surface deformations at higher latitudes. In this projection Europe is slightly larger, which is an advantage because a lot is going to happen in this ‘small’ part of the world. Perhaps it will ultimately be necessary to use a separate map of Europe as an insert, but I did not know that at that moment. I decided to omit all color from the map because I wanted to use color for the markers that would be positioned on the countries. At this stage I only display a world map.

Mercator projection
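To place a location given as longitude and latitude on a Mercator map, both values have to be converted to pixel positions. Below is a minimal Python sketch of that conversion (not the original Processing code). The function name mercator_xy, the crop latitude MAX_LAT and the image size are assumptions for illustration; a real map image may be cropped differently and would need its own bounds.

```python
import math

MAX_LAT = 85.0511  # assumed crop latitude; it makes the world map square


def mercator_xy(lon_deg, lat_deg, width, height):
    """Map longitude/latitude in degrees to pixel coordinates on a Mercator map.

    Assumes the image spans longitudes -180..180 and latitudes
    -MAX_LAT..MAX_LAT, with (0, 0) in the top-left corner.
    """
    lat_deg = max(-MAX_LAT, min(MAX_LAT, lat_deg))
    x = (lon_deg + 180.0) / 360.0 * width
    # Mercator stretches latitude with ln(tan(pi/4 + lat/2)).
    merc = math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    merc_max = math.log(math.tan(math.pi / 4 + math.radians(MAX_LAT) / 2))
    y = (1.0 - merc / merc_max) / 2.0 * height
    return x, y


# Roughly Paris, on a hypothetical 1000 x 1000 pixel map.
x, y = mercator_xy(2.35, 48.86, 1000, 1000)
print(round(x), round(y))

# Ten degrees of latitude take up more pixels far from the equator.
near_equator = mercator_xy(0, 0, 1000, 1000)[1] - mercator_xy(0, 10, 1000, 1000)[1]
far_north = mercator_xy(0, 60, 1000, 1000)[1] - mercator_xy(0, 70, 1000, 1000)[1]
print(far_north > near_equator)
```

The comparison at the end illustrates the surface deformation at higher latitudes: the same ten degrees occupy more of the image near 60°N than near the equator.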

VMD_01_02

This is a first version in which the program reads coordinates from a text file. The coordinates are drawn as red dots. The size and location of the dots are completely arbitrary. I just wanted to know if it would work with random locations. If that works, then it should also work with exact coordinates and better-sized dots.
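Reading coordinates from a text file could look like the Python sketch below. The file layout (one tab-separated x and y value per line) is an assumption, since the format of the text file is not specified; the in-memory sample stands in for a file on disk.

```python
import io


def read_points(source):
    """Read one "x<TAB>y" pair per line from a file-like object; skip blank lines."""
    points = []
    for line_number, line in enumerate(source, start=1):
        line = line.strip()
        if not line:
            continue
        parts = line.split("\t")
        if len(parts) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 columns, got {len(parts)}"
            )
        points.append((float(parts[0]), float(parts[1])))
    return points


# Stand-in for a text file on disk; the values are arbitrary test positions.
sample = io.StringIO("120\t80\n305.5\t142\n\n410\t96\n")
points = read_points(sample)
print(points)
```

With a real file, the same function can be called as read_points(open(path)), where path is whatever file holds the coordinates. Each resulting pair can then be drawn as a dot.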

VMD_01_03

In order to use a map I need to find movie data. For instance, I could gather a list of all countries that have produced movies. Googling away, I found a list of two-letter country codes from the International Organization for Standardization (ISO 3166 Country Codes). I can use these abbreviations to display names of cities or countries at a later stage. The overall list is much too extensive, so I have selected the 46 countries that produced movies we have actually seen since the beginning of 2015. When you look at this simple visualization, a few things stand out. It is very busy in Europe. Some countries are completely covered by a red dot, and some dots overlap others. Actually, it’s a mess! On all other continents there are only a few dots, so they look very empty. Further, the overall color of the image looks too dark.

ISO 3166 Country Codes by the International Organization for Standardization (ISO). The first edition was published in 1974.

VMD_01_03

I have made an outline version of the dots. But this does not solve the problem. In Europe the small countries are slightly more visible but it is a minimal improvement and not a good solution.

VMD_01_04

I think I’m going to make an insert for Europe. At least, that is what I thought at the time. After I had made a rough sketch, it turned out to create more problems than it solved. In the first place, Europe is out of proportion when you compare it with the rest of the world map. Mind you, that is the case with every world map. An exception is Goode’s projection, which is an equal-area map projection. In the second place, an insert always covers a portion of the world map, so that portion has to be moved as well. In effect you create five continents that are all out of proportion. The question is whether that’s good. It seemed best to leave everything as it is for now and to solve problems when they become relevant. The dots are slightly reduced in size and everything becomes pretty clear. On the next page I have also added a title plus additional information, which makes it even more complete.

Goode homolosine projection. The projection was developed in 1923 by John Paul Goode to provide an alternative to the Mercator projection for portraying global areal relationships.

VMD_01_04

VMD_01_05

A few years ago I worked with patterns and photography in Processing. I could apply a simple pattern to the world map. Since a realistic world map is not possible anyway, it is just as good to make some kind of abstraction of it. The question now is: is abstraction decoration? The world map is now completed with dots with a diameter of 4 pixels. Each location is displayed somewhere in the middle of the country, although in some countries the middle is hard to find. Where, for example, is the middle of the USA with its territories and various possessions?

Processing photo manipulation

VMD_01_06

Those dots are of course meaningless. They only give the central locations of countries where movies are made. It would, for instance, be much more helpful if you could see where most movies are made. That becomes possible when you vary the size of the dot: a large dot means a lot of movies, a small dot means a few. This is a version that uses randomly generated dot sizes.
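One way to turn a movie count into a dot size is sketched below in Python. It scales the area of the dot with the count, which is a design choice made for this sketch and not necessarily how the dots in this publication were sized; the function name, the maximum diameter of 60 pixels and the minimum of 4 pixels are assumptions.

```python
import math


def dot_diameter(count, max_count, max_d=60.0, min_d=4.0):
    """Return a diameter whose circle area is proportional to the count.

    Very small dots are raised to min_d so that they stay visible.
    """
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    fraction = max(0.0, min(1.0, count / max_count))
    return max(min_d, max_d * math.sqrt(fraction))


# Hypothetical counts per country code, purely for illustration.
counts = {"AA": 40, "AB": 10, "AC": 1}
largest = max(counts.values())
for code, count in counts.items():
    print(code, round(dot_diameter(count, largest), 1))
```

Scaling the diameter directly would make a country with four times as many movies look sixteen times as heavy, because the eye reads the area of a circle; taking the square root compensates for that.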

VMD_01_07

This version uses our movie data. The size of the dot represents the number of movies from that country. It is immediately clear that most movies come from France, the USA, the UK and Germany. This, of course, says nothing about the quality of the movies. It only shows information about the quantity. And it is only part of the truth, as we will see later. What will this visualization look like at the end of 2015 or 2016, when we have seen many more movies? It was quite a surprise that France plays such a leading role in the field of movie production. But it also immediately raises questions. Maybe we’re being manipulated? Or is there another variable that makes France appear so high in movie production?

VMD_01_08

It is also possible to interpolate between two colors and make all dots the same size. I use red for the low numbers and green for the high numbers. But I find that this version does not show the small differences very well. In fact, there are only two green dots and two interpolations between green and red that tend towards brown. The rest are nuances of red. And what does greenish or reddish brown mean then?

Color interpolation between two colors.
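The interpolation between two colors can be sketched as a linear blend per RGB channel. The Python functions below are an illustration with assumed names, not the code used for the image; the counts in the example are hypothetical.

```python
RED = (255, 0, 0)
GREEN = (0, 255, 0)


def lerp_color(c1, c2, t):
    """Linearly interpolate between two RGB colors; t=0 gives c1, t=1 gives c2."""
    t = max(0.0, min(1.0, t))
    return tuple(round(a + (b - a) * t) for a, b in zip(c1, c2))


def count_color(count, min_count, max_count):
    """Color a dot from red (lowest count) to green (highest count)."""
    if max_count == min_count:
        return GREEN
    t = (count - min_count) / (max_count - min_count)
    return lerp_color(RED, GREEN, t)


# Hypothetical counts between 1 and 41.
for count in (1, 2, 3, 21, 41):
    print(count, count_color(count, 1, 41))
```

Running this shows why the small differences disappear: with one very high count stretching the scale, the low counts all land within a few steps of pure red, and the midpoint is a muddy mix of both colors.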

VMD_01_09

All countries from which we have seen five or more movies are green. The largest green dot on the world map indicates that we have seen many more than five movies. A small green dot indicates that we have seen five movies. All countries from which we have seen fewer than five movies are red. A larger red dot indicates that we have seen almost five movies. A small red dot indicates that we may have seen just one movie. In short, this is very complicated and not very clear. You really need some additional textual information here.

VMD_01_10

Of course I do want to be more accurate in displaying this data. So I’ve changed the program in such a way that the abbreviated name of the country is displayed when you move the cursor close to the dot of that country. This does not work flawlessly, though. Sometimes the name of the country disappears under a dot. And where countries are small and close together, both country names are activated at the same time. I made two versions with Futura, one with Futura Medium and one with Futura Bold. I think that Futura Bold is better because it is more readable.

VMD_01_11

I have optimized the mouse interaction. In this version there are never two countries selected at the same time. The name of a country is always drawn last. So it can never be overwritten by a dot.
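One way to guarantee that only a single country is selected is to pick the dot nearest to the cursor instead of every dot within range. The Python sketch below illustrates that idea; it is an assumption about how such a selection could work, not the actual change made to the program, and the pixel positions are hypothetical.

```python
import math


def nearest_dot(dots, mouse_x, mouse_y, max_distance=10.0):
    """Return the code of the single dot closest to the cursor, or None.

    dots maps a country code to its (x, y) pixel position. Dots farther
    away than max_distance are ignored.
    """
    best_code = None
    best_distance = max_distance
    for code, (x, y) in dots.items():
        distance = math.hypot(x - mouse_x, y - mouse_y)
        if distance <= best_distance:
            best_code = code
            best_distance = distance
    return best_code


# Hypothetical pixel positions of three neighbouring countries.
dots = {"NL": (100, 50), "BE": (98, 56), "DE": (110, 52)}
selected = nearest_dot(dots, 99, 54)
print(selected)
```

In a drawing loop, all dots would be drawn first and the label of the selected country afterwards, so that the name always ends up on top.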

VMD_01_12

I have replaced the floating-point numbers with integers, and I have made all the dots green. I think this is a reasonable version. You can easily see which country most movies come from, and for the exact quantities you get feedback from the cursor. In this case we have seen two Australian movies in 2015.

VMD_01_13

I’m going to dig a little deeper into the movie productions made in France, the United Kingdom, Germany and the USA. I begin with France. That data does not look very spectacular. Apparently, most of the movies were filmed in Paris. In addition there are a few movie locations in the south and center of France. Furthermore, it appears that of the 47 French movies we have seen, only 20 were filmed on location in France. The other 27 were made in partnership with other countries, and their film locations are all outside France.

VMD_01_14

There is a problem with the data. When I run the program I get an ‘ArrayIndexOutOfBoundsException: 8’ error. I tried to find out what causes it. It appears when you read data from text files of different lengths. That seems logical, because then the arrays cannot all be filled in the same way. For instance, when one array has ten lines and another has eight, the program asks the shorter array for an element it does not have. That was the case when I swapped the French data files for the United Kingdom data files. The eight in the error message is the index that was requested: an array with eight lines only has the indices 0 to 7.
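A check before drawing can catch this kind of mismatch with a clearer message. The sketch below is in Python, where the equivalent error is an IndexError; the function name and the three parallel lists are assumptions for illustration and do not mirror the actual data files.

```python
def load_columns(names, xs, ys):
    """Combine parallel lists into records, refusing lists of unequal length."""
    lengths = {"names": len(names), "xs": len(xs), "ys": len(ys)}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"data files differ in length: {lengths}")
    return list(zip(names, xs, ys))


# Hypothetical data: ten lines in two files, only eight in the third.
names = [f"place {i}" for i in range(10)]
xs = [float(i) for i in range(8)]
ys = [float(i) for i in range(10)]

try:
    load_columns(names, xs, ys)
except ValueError as error:
    print(error)
```

Without such a check, a loop over the ten names would ask xs for index 8, which does not exist in a list of eight items, and the program would stop with an index error at that point.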

VMD_01_15

It looks very empty in Germany. But I think it will all be fine in the long term. Also, I think I’ve done something wrong. Some movies, of course, have several movie locations. If I can trace those, the image is guaranteed to get more interesting.

VMD_01_16

This US version is not really interesting either. But something else is happening. If the first two characters of a line in my text file are not unique (and thus are duplicated in the list), the positioning of the dots and text goes wrong. Because the first two characters have no further function (but apparently do have influence), it might be better to use this form of abbreviation: AA, AB, AC, AD, AE, etc., continuing after AZ with BA, BB, BC, BD.
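Unique two-letter codes in the order AA, AB, ..., AZ, BA, BB can be generated rather than typed by hand. The Python sketch below shows one way to do that, together with a small check for duplicates in an existing list; both function names are assumptions for illustration.

```python
from collections import Counter
from itertools import product
from string import ascii_uppercase


def sequential_codes(n):
    """Return the first n codes in the order AA, AB, ..., AZ, BA, BB, ..."""
    if n > 26 * 26:
        raise ValueError("only 676 two-letter codes exist")
    pairs = product(ascii_uppercase, repeat=2)
    return ["".join(next(pairs)) for _ in range(n)]


def duplicate_codes(codes):
    """Return the codes that occur more than once, in first-seen order."""
    return [code for code, count in Counter(codes).items() if count > 1]


codes = sequential_codes(28)
print(codes[:3], codes[25:28])
print(duplicate_codes(["LA", "NY", "LA", "DE"]))
```

Because 26 × 26 = 676 combinations are available, this scheme has enough room for a list of a few hundred locations.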

VMD_01_17

I have now found filming locations for all American movies we have seen since January 2015. At first I did a much too superficial search, so I found only 18 locations. Now I have 218 locations available. A number of them will not be used because the filming locations are outside the USA. There are also several movies that were filmed at the same location. So the final list will be shorter than 218 but much longer than 18. Now I need to make a list and avoid duplications. All the movies filmed in Los Angeles have to be added up to one total. The same goes for all movies in New York, all movies in Detroit, and all the other cities and towns in the USA. I ended up with a final list of 87 cities and villages. Los Angeles is at the top with 61 productions, which is actually not very surprising.
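Adding up the locations per city is a counting job. The Python sketch below shows the idea on a handful of made-up records; the record layout (a city and a country per filming location) and the function name are assumptions, and the data is not taken from the real list of 218 locations.

```python
from collections import Counter


def count_by_city(locations, country="USA"):
    """Count filming locations per city, keeping only the given country.

    Returns (city, total) pairs, largest total first.
    """
    counts = Counter(
        city for city, loc_country in locations if loc_country == country
    )
    return counts.most_common()


# Hypothetical records: one (city, country) pair per movie location.
locations = [
    ("Los Angeles", "USA"),
    ("New York", "USA"),
    ("Los Angeles", "USA"),
    ("Toronto", "Canada"),
    ("Detroit", "USA"),
]
totals = count_by_city(locations)
print(totals)
```

Locations outside the USA drop out in the filter, and duplicate cities collapse into a single entry with a total, which is why the final list is shorter than the raw list.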

VMD_01_18

I now go one level deeper. I started on a global scale. Then I did some mapping on a country scale. And I’m now going to work on an urban scale. After some research it seemed suitable to use the movie locations of the series ‘Breaking Bad’. The series was filmed in Albuquerque. I display an abstract version of the map of Albuquerque in the background, and I used a reduced version of the Breaking Bad wordmark, which actually does not work. I also find the number of movie locations insufficient.

VMD_01_19

After some more research I could trace a lot more ‘Breaking Bad’ film locations. I changed the title and background colors. But I think it looks really terrible. Something that is designed in a simple and basic way doesn’t have to look this bad. So I have to work on that in the next version.

VMD_01_20

This is the state in which I want to finish this exercise for the time being. All locations can now be read, and the number of scenes is shown after the location name. There is a title and a subtitle. You can immediately see that Walter White’s House is the movie location used most for filming (80 scenes). Then Jesse Pinkman’s House follows with 42 scenes, Hank and Marie’s House with 29 scenes, the DEA offices with 27 scenes, the car wash with 22 scenes, Jesse Pinkman & Jane’s House with 21, Gus’s Laundry Service with 20, and Los Pollos Hermanos with 17. Too bad I cannot show this in an interactive version.

Walter White’s House.

Jesse Pinkman’s House.

Hank and Marie’s House.

DEA offices.

Car wash.

Jesse Pinkman & Jane’s House.

Gus’s Laundry Service.

Los Pollos Hermanos.

Breaking Bad intro screen.

Breaking Bad wordmark.

A very rough and general conclusion: data visualization is much more difficult than creating images with nice effects, as I did in the Generative Design Variations project. In data visualization you have to limit yourself in order not to end up with lots of non-functional decoration. A considerable amount of time must be invested in research before, during and sometimes after the design phase. And it is an iterative process. There are always improvements to make after you have made the best possible improvement. It also takes more time than a general design job. You must be very precise and constantly look for better data, and for better interpretation and visualization of that data. Finally, data visualization is detective work.
