Insights, analysis, and research about emerging technologies
State of the Computer Book Market
Copyright 2011, O’Reilly Media, Inc.
Post 1
In the previous two years, since the last State of the Computer Book Market posts, the Tech Book market has been going through some major changes. Hopefully you will see some of the trends that cause change, through the faint signals that the book market provides. You can get a quick refresher on how we see Computer Book Sales as a Technology Trend Indicator and our other posts on the State of the Computer Book Market. The data is from Bookscan's weekly top 3,000 titles sold. Bookscan measures actual cash register sales in bookstores. Simply put, whenever you buy a technology-oriented book in the United States, there's a high probability it will get recorded in this data. Retailers such as Borders, Barnes & Noble, and Amazon make up the lion's share of these sales. There will be five posts in total, which will be delivered every other day in the next week:
1. Post 1, Overall Market
2. Post 2, Category Performance
3. Post 3, Publisher/Imprint Performance
4. Post 4, Programming Language Performance
5. Post 5, Summary and Digital Sales
Overall Book Market Performance
Before we get to the specifics of the computer book market, let's get some context by looking at the whole book market for the week ending January 2, 2011. Everything that is printed, bound, and sold as a book, from The Girl with the Dragon Tattoo and Eat, Pray, Love to Decision Points and The Ugly Truth is represented in the table below.
Overall Book Market - EVERYTHING - Week Ending: 2011-01-02
Table rows: All Books, All Subjects; Juvenile Non-Fiction; Computers and Internet; Other; Total Market
As you can see, the computer market is down about 4% from last year. It should be noted that the computer book market makes up only about 1% of total unit sales in bookstores and online retailers. If you would like to see the performance of the major categories, this table shows percentage growth. I find it interesting that the Humor category is one of the largest-growing in an otherwise depressed market. The other growth area is Children's Non-Fiction Education/Reference -- and I certainly wonder why this category in particular is experiencing such strong growth.
Now on to the tech book market. The chart below gives some perspective into how each year stacks up against prior years. As you can see, there has been some serious erosion since 2007, our most recent high point. The sales in 2007 had many believing that the market was finally recovering from the post-2001 decline, but then 2009 showed the biggest drop from a prior year.
Immediately below is the weekly trend for the entire computer book market since 2004, when we first obtained reliable data from Bookscan. Please remember that the data represents all publishers, and not just O'Reilly. The slightly thicker red line represents the 2010 data.
As you can see, the clear seasonal pattern we've pointed out before still exists. That is, we have a strong start that declines through the summer, spikes for the fall "Back to School" season, and finishes strong. The trend line for each year closely mirrors the year before, with remarkably consistent weekly ups and downs. One trend that provides a bit of a silver lining in a fairly poor 24 months is that 2009 averaged about 10-15% off of the prior year on a weekly basis, whereas in 2010 the market was about 3-6% off the prior year on a weekly basis. Could this indicate that the market has seen bottom, or are we just in a holding pattern as purchasers figure out how they want to acquire tech content? There are more selling options available now, as you will see in a later post, but the long and short of it is that many publishers are achieving more revenue and units through sales of the same content in a variety of digital formats. For some, the decline of print distribution is being offset by digital distribution and sales.
What you won't see on this chart is that the computer book market cratered in 2001, shrinking 20 percent a year for 3 years, until it stabilized in 2004 at about half the size it was in 2000. (We only have reliable data going back to 2004.) You can now see a second cratering in the market that started in the second half of 2008 and has continued through 2010. The overall market growth rates for the previous six years are: 2005 = 1.48%; 2006 = 3.17%; 2007 = -2.00%; 2008 = -4.27%; 2009 = -15.31%; 2010 = -4.29%.
So what about that market was news in 2010? In 2010, there were 11 weeks that were ahead of the prior year unit sales. In 2009, there were only two weeks that were ahead of the prior year. So from that perspective, we have seen some signs of a recovery. 2010's overall growth finished at the market's 2008 level and declined much more slowly than 2009.
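The yearly growth rates quoted above compound rather than add. As a rough illustration (a sketch, not part of the original analysis), the following Python snippet chains them to estimate 2010 unit volume relative to 2004, and checks the remark that three consecutive 20 percent declines leave a market at about half its starting size. The function name `cumulative_index` is hypothetical; the rates are the ones listed in this post.

```python
# Yearly unit growth rates for the computer book market, as quoted in this post.
GROWTH_RATES = {
    2005: 0.0148,
    2006: 0.0317,
    2007: -0.0200,
    2008: -0.0427,
    2009: -0.1531,
    2010: -0.0429,
}


def cumulative_index(rates):
    """Chain yearly growth rates into an index where the base year equals 1.0."""
    index = 1.0
    for rate in rates:
        index *= 1.0 + rate
    return index


# 2010 volume relative to 2004 (the first year of reliable data).
index_2010 = cumulative_index(GROWTH_RATES.values())
print(f"2010 units relative to 2004: {index_2010:.3f}")

# Three consecutive 20% declines, as described for the post-2001 slump.
post_2001 = cumulative_index([-0.20, -0.20, -0.20])
print(f"Size after three 20% declines: {post_2001:.3f}")
```

With these figures the index lands a little below 0.80, that is, roughly a fifth fewer units in 2010 than in 2004, and three 20% drops leave about 0.51 of the starting size, which is consistent with "about half."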
To the optimists in the crowd, it appears as though we have seen bottom -- but pessimists will believe that they've seen this before: the market looks as though it hit bottom, but then takes another big hit downward. So it is really too unstable to predict whether it will move down again or continue to recover. Another way to look at the market is with the Treemap visualization tool. This tool helps us pick up on trends quickly, even when looking at thousands of books. It works like this: The size of a square shows the market share and relative size of a category, while the color shows the rate of change in sales. Red is down, and green is up, with the intensity of the color representing the magnitude of the change. The following screenshot of our treemap shows gains and losses by category, comparing the fourth quarter of 2010 with the fourth quarter of 2009.
So what are all the boxes and colors telling us? First remember that this compares the last quarter of 2010 with the last quarter of 2009. This snapshot of the treemap looks less like the blood-bath of prior years (when there was red everywhere). There were quite a few bright spots (bright green) during the last quarter of 2010. Take a look at Android (in the upper-left box), with a 2,413% growth from the fourth quarter of 2009. You will also see Android in bright green (bottom-left corner) -- the difference is that the upper-left is for consumer books and the bottom-left is for Android programming books. Both had impressive growth in 2010 compared to 2009. In the upper-left corner is iPad, which is black because there were no iPad books the prior year. However, it is impressive how big the box is at this early point in its evolution. In 2010, Windows 7 was the number one growth area for units, followed by iPad, then Android (for consumers), and Android programming. This is unit growth, and a bit of the success for these technologies is that they are fairly new and do not have large market shares as a base to be measured against. Looking at longer-established technologies, Security and Network Security and Digital Photography had strong unit growth.
I find it useful to organize the trends into classifications that are High Growth Categories (bright green), Moderate Growth Categories (dark green to black), Categories to Watch (all colors), and Down Categories (red to bright red). Most of these descriptions are self-explanatory, except perhaps Categories to Watch. This group contains titles that we've found are not typically susceptible to seasonal swings, as well as areas on our editorial radar. If there are categories you want to get on our watch list, please let me know. The table below highlights and explains some of the data from the chart above, although the data is for all of 2010.
The Share column shows the total market share of that category, and the ROC column shows the Rate of Change (RoC = (current_period - prev_period) / prev_period). So, for example, you can see that Mac OS books represent 2.95% of the entire computer book market, and were shrinking by 32.12% (RoC).
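The RoC formula above translates directly into code. The snippet below is a minimal sketch of that calculation; the function name and the unit counts are hypothetical and chosen only to illustrate the arithmetic. Returning `None` when there were no prior-period sales is an assumption made here, mirroring how a category with no prior-year books has no meaningful rate of change.

```python
def rate_of_change(current_period, prev_period):
    """Return RoC = (current_period - prev_period) / prev_period as a fraction."""
    if prev_period == 0:
        # A category with no prior-period sales (for example, a brand-new topic)
        # has no defined rate of change.
        return None
    return (current_period - prev_period) / prev_period


# Hypothetical unit counts, chosen only to illustrate the arithmetic.
shrinking = rate_of_change(6_788, 10_000)    # fewer units than the prior period
growing = rate_of_change(25_130, 1_000)      # many more units than the prior period
brand_new = rate_of_change(5_000, 0)         # no prior-period sales

print(f"{shrinking:.2%}")
print(f"{growing:.2%}")
print(brand_new)
```

Multiplying the returned fraction by 100 gives the percentage shown in the RoC column.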
This category has grown steadily since the end of 2008, and is now the 42nd largest book category, and has the 3rd fastest RoC.
This is not a huge area now, but its high RoC is accelerating its growth into a sizeable category.
This is the consumer area for Android -- user guides, best apps, etc. Jumped from nothing in 2008, to solid growth in 2009, to top of the charts in 2010.
A good-sized category, where title output decreased from 9 titles in 2009 to 6 in 2010, yet unit sales grew steadily.
A growing category that saw 8 new titles make the 2010 results. Titles are mostly introductory at this point in time.
A medium size category with 19 new titles in 2010, compared with 16 new titles in 2009 (the 2009 titles add to the units in 2010).
The titles in this category have doubled in the past two years, and 2010 had the biggest growth in number of titles, at 61%.
Security topics in 2010 have done well, with consistent output in the number of new titles.
Categories to Watch
Notes A large category that has finally taken over where XP and Vista titles left off. Windows 7 is a solid leader for operating systems. This category had no 2009 presence, but is now the 7th largest category in market share.
A very large category and usually consistent. Even though this category is down, it is not as down as the whole market.
A very large category (2nd after Windows), with 4 titles selling more than 10,000 units; 10 additional new titles in 2010 produced 64,581 fewer units.
Software Project Management
Got a bruising from Apple, and this category went into the tank, losing 59,340 units in 2010 compared to 2009; in 2009, it lost 55,187 units compared to 2008.
A large category with 59,668 fewer units sold in 2010. 67,642 fewer units were sold in 2009 than in 2008. Apparently Snow Leopard was a book-sales bust.
Web Design Tools
This category took a beating, mostly because Dreamweaver CS5 titles did not pick up at the same pace as the CS4 titles slowed down. 45,709 fewer units were sold in 2010.
A good-size category with 16 fewer titles contributing to the category in 2010. In 2009, this category saw 56 titles selling more than 1,000 units, compared to 37 titles in 2010.
Web Page Creation
A large category with only 4 titles producing more than 10,000 units in 2010 whereas 7 titles in 2009 hit that mark. There were 70,492 fewer units sold in 2010.
The third largest category, with 4 titles selling more than 10,000 units; 11 additional titles making the list in 2010 produced 8,754 fewer units than 2009.
A good-size and consistent category, though it is down in units sold. PMP and Agile PM seem to be the most popular titles here.
Post 2 in this series will provide a closer look at the technologies within the categories. Post 3 will be about the publishers, both winners and losers. Post 4 will contain more analysis of programming languages, and Post 5 will look at digital sales.
Post 2
In this second installment (the first post can be found here), we look at computer book sales in specific technology categories. Remember that we've organized the data into six "Category Families": Systems and Programming, Web Design and Development, Business Applications, Digital Media Applications, Consumer Operating Systems and Devices, and Computer Topics. Within each of these Families are category group, super-category, category, and atomic category, in a five-level hierarchy. For example, Systems and Programming includes the category groups programming languages, databases, software engineering, general programming, security, and so on. In the rest of this post, we will contrast the final quarter of 2010 with 2009 as well as the whole year of 2009 with 2010. As a refresher, here is a new treemap of the Category Families, with their sub-areas for the final quarters of 2010 compared to 2009.
This treemap shows a mix of red, green, and black, which basically reflects the fluctuating market. There is very little bright green (which represents fast growth). But again, remember that this is comparing the last quarter of 2010 with the last quarter of 2009. Two of the biggest and brightest green areas are Android Programming and Android Consumer, both of which grew from tiny specks of boxes in 2008 to fairly sizeable areas in 2010.
In the next two images, you can see how our Category Families stack up. The image on the left shows the number of titles that made the top 3000 in a given year. Contrast that with the image on the right, which shows the number of units sold in each year. What you will notice is that the number of titles in Business Applications/Topics and Systems and Programming went up in 2010, yet the units sold for both Categories went down. Consumer Operating Systems and Devices was the only area that went slightly up in both the number of titles and units sold in 2010. Systems and Programming is the largest category, but its performance is more volatile, and is experiencing the largest overall decline. This category is the chief indicator for the health of the computer book market, and it's in consistent decline for print books. You'll see some more positive indicators in my upcoming post on digital distribution.
The table below shows each Category Family's compared growth between 2009 and 2010 (YoY Growth), 2009 and 2010 ranking (09Rank/10Rank) and 2009 and 2010 percent of market share (09Share/10Share).
Columns: Category Families, YoY Growth, 09Rank, 10Rank, 09Share, 10Share
Business Applications: YoY Growth -5.10%
Computer Topics / Other
Consumer Operating Systems
Systems and Programming
Web Design and Development
Before we look into categories further, let's first take a look at the words that make up all the computer titles for 2010. It's an interesting view of the words that the publishing industry puts on the front of books, online searches, and anywhere there is metadata about content. A note about this data: I threw away the stop-words like "the," "and," "it," "with," etc. I also disregarded "Microsoft," since it is a descriptor used for various products and is redundant. Here is the 'title' view of the market.
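A word count like the one described above can be sketched in a few lines of Python. This is only an illustration of the approach, not the tool used for the chart: the stop-word list is an assumption (the post names just a few examples), the function name is hypothetical, and the sample input is a handful of titles mentioned elsewhere in this report.

```python
import re
from collections import Counter

# Assumed stop-word list; the post names only "the", "and", "it", and "with" as examples.
STOP_WORDS = {"the", "and", "it", "with", "a", "an", "of", "for", "to", "in", "on"}
# The post also drops "Microsoft" because it appears across many product names.
IGNORED_WORDS = {"microsoft"}


def title_word_counts(titles):
    """Count words across book titles, skipping stop-words and ignored words."""
    counts = Counter()
    for title in titles:
        # Lowercase the title and split it into alphanumeric tokens.
        for word in re.findall(r"[a-z0-9#+]+", title.lower()):
            if word not in STOP_WORDS and word not in IGNORED_WORDS:
                counts[word] += 1
    return counts


# A few titles mentioned in this report, used only as sample input.
sample_titles = [
    "Windows 7 For Dummies",
    "Mac OS X Snow Leopard: The Missing Manual",
    "Head First Design Patterns",
    "The Mythical Man-Month: Essays on Software Engineering",
]
counts = title_word_counts(sample_titles)
print(counts.most_common(5))
```

Run over every title in the data set, the resulting counts could feed a word-cloud style view in which larger words are the ones publishers use most often.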
When we drill into the category families a bit, we see that seven of our ten top categories (known as super-categories) sold fewer units in 2010 than in 2009, for a net loss of 244,936 units for just the top ten areas. In other words, our bigger and typically more stable areas were selling significantly fewer units in 2010. In the first half of 2010, there were 49 super-category areas that were ahead in sales of the first half of 2009, yet six of the 49 categories slowed down and ended up losing enough ground to show a year-over-year decrease in units. We ended up with 43 super-categories producing more units in 2010 than they did in 2009. The biggest winners in growth order are: Tablet, Mobile Programming, Windows Consumer, Security Topics, Hardware Topics, Social Web, Computers and Society, Cloud Computing, Information Technology, and Data Topics. The Tablet super-category went from roughly 15,000 units in the first half of 2010 to an additional 100,000 units in the second half of the year. An increase in titles fueled this growth: output tripled from 7 titles in the first half of 2010 to 22 titles by the year's end.
The areas with the largest drop in units were, in descending order: Web Page Creation, Digital Photography, Mac OS, Flash, Web Programming, Web Design Tools, Personal Computers, Linux, Software Project Management, and Personal Database. The category that surprises me the most is Web Programming. Sixteen fewer titles in the Web Programming area made the list in 2010, and only 7% of the titles sold more than 1,000 units, as compared to 11% in 2009.
As the market keeps declining, the response of many publishers is to increase the number of titles published, in an attempt to gain market share. Immediately below are two bar graphs showing the trend for how many titles made it into the Bookscan dataset in a given year, and the average units sold across all titles.
So this is the non-obvious point here: There are not necessarily more titles being published, but more titles making it into the data set. This could be attributed to a lower threshold to get in. In other words, some weeks the threshold to make the top 3000 list can be as low as 6 units sold. It is a relative measure. The last couple of years have had lower thresholds, and thus more titles made the list but with worse average units. When the market is healthy, the threshold moves up and only the solid-performing titles make it into the top 3000. The lower threshold barrier is resulting in a significant decrease in the average units per title for all publishers. Out of the 22 largest imprints, 18 increased the number of titles that made the list in 2010. Yet only 6 of these 18 imprints with title increases also saw increases in their average units per title. The point, again, is that the market can see more titles making the top 3000 list, but if the threshold is lower, the average units and overall units will be too.
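The threshold effect described above can be illustrated with a toy ranking. In this sketch a top-3 list stands in for the top 3000, and every number is hypothetical; the function name `top_n_summary` is likewise an assumption. It only shows the mechanism: in a weaker week, the last title to make the list sells very few units, which drags down the average.

```python
def top_n_summary(units_by_title, n):
    """Summarize a weekly top-n list: entry threshold, total units, average units per title."""
    ranked = sorted(units_by_title.values(), reverse=True)[:n]
    if not ranked:
        raise ValueError("no titles to rank")
    total = sum(ranked)
    return {
        "threshold": ranked[-1],        # units sold by the last title that made the list
        "total": total,
        "average": total / len(ranked),
    }


# Hypothetical weekly sales; a top-3 list stands in for the top 3000.
healthy_week = {"A": 900, "B": 400, "C": 120, "D": 60, "E": 6}
weak_week = {"A": 500, "B": 150, "C": 6, "D": 5, "E": 2}

healthy = top_n_summary(healthy_week, 3)
weak = top_n_summary(weak_week, 3)
print(healthy)
print(weak)
```

Because the list length is fixed, the threshold is purely relative: when overall demand falls, marginal titles qualify with only a handful of units, and both the average and the total for the list fall with it.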
Number of Titles
The table below provides a view of the market's erosion. The Average Min value represents the "low threshold" weekly average during a given year. The Average Max is the high-range weekly average for a given year. Number of Titles is self-explanatory. You will notice that the years with the highest min had fewer overall titles represented in the data. The bottom line is that as the market erodes, it appears as though we are seeing a watering-down: more titles producing poor results.
Year Average Min Average Max Number of Titles 2004
So it could be said that we've been in a bit of a tech innovation slump. Will any technology, platform, method, theory, or new-fangled invention stave off this market slump? Or will we continue the treadmill effect of more publishers chasing lost revenue with more titles, which merely replace existing units with marginal decreases? I think it is the latter. Something big needs to come along to drive a large increase in the market. I'm not convinced it is cloud computing, mobile, or social platforms even though those areas seem poised for future growth. What do you think will be the big growth areas in the next five years? Is there anything poised to make a big impact on the tech world? Now let's look at the categories that comprise each category family. Below are some individual trend charts from our dashboard showing the 24-month period from January 2009 to December 31, 2010 for the major categories. By looking at a 24-month pattern, you get more insight into whether or not a particular area seems to be hit by seasonal factors, and if there is a steady decline/increase for the category. It is important to look at scale on these charts because it visually shows you the relative market size. Another way to think about it is if the trend line is high in the individual box, the category is big, and if it is low, it is a smaller category. What is interesting to note is that Consumer Operating Systems, Digital Media, and Business Applications and Devices all have a January spike, which is likely due to individuals buying "how to" books for their new computers, devices, and operating systems. This is a consistent seasonal pattern.
Web Development and Design
Consumer Operating Systems/Devices
Systems and Programming
The Categories (24-month rolling, January 2009 to December 2010)
When viewing the charts below, keep the reference charts above in mind. Viewing these jointly provides more context on the size of market and seasonal patterns.
Category_Family: Consumer Operating Systems and Devices
Here are the trend lines for the four main categories (cat_family) that make up Consumer Operating Systems and Devices.
This category is a medium-sized area and was one of two Category Families to show growth year-over-year. This category's growth is driven by Windows 7 and Port Dev (Portable devices). Port Dev was dominated by Android in 2010 and iPhone in 2009. And remember from earlier that the Tablet area moved quickly up the charts in the second half of 2010. We see a new title and topic leading the way this year with Windows 7 For Dummies by Andy Rathbone. Two other Windows 7 titles are in second and third place. The perennial leader, Mac OS X Snow Leopard: The Missing Manual, fell to fourth, as Snow Leopard was not a huge OS release by Apple and the topic did not drive this category as it had in previous years. This market has shown growth because of the explosive growth of Mac OS X, but if you compare with Windows books, the Windows books are the steady sellers and have the growth in the last two years. The chart below shows how these two are stacked up against each other. Are you a PC or a Mac? The chart below says more of you are PCs!
Category_Family: Business/Office Applications
When comparing the Business Apps area for 2009 and 2010, there were 8 super_cats (one level below cat_family) that performed ahead of the prior year and 23 that underperformed compared to the prior year. Unfortunately the 23 underperforming super_cats lost 67,000 more units than the 8 positive areas had gained, for an overall -5.10% growth rate. The two healthiest super categories were Spreadsheets (Excel) at +2.42% growth, and Social Network (Facebook) at +11.49% growth, while Graphics Applications at -16.80% and Ecommerce at -47.00% were the two biggest laggards. What surprised me the most was that the Content Management Systems category did not grow, as I had thought it would. So I dug a bit, and discovered that most of the growth in CMS as a category occurred between 2006 and 2009. During the past two years, the category has held its own and performed better than the overall market decline. For a view of CMS growth, see the chart below.
Here are the trend lines for the three main categories that make up Business/Office Applications.
Obviously the big sub category here is "web site". It is dominated by titles that talk about performance, scalability, reliability, and tuning like what you can find at our Velocity Conference or in this bundle of references. Rich Web Interface moved to second among these categories, but is experiencing declines. In the RWI space, both Flash and Silverlight had fairly significant declines. Flash declined 84.43% while Silverlight declined 8.29%. But the Flash subcategory is currently about four times as large as Silverlight. Could it be that HTML5 makes these two technologies seem kind of moot?
Category_Family: Systems and Programming
This is the largest of our top-level category families. It is the place where most of the programming language, database, and software development titles reside. The normal trend here is for the category to get off to a good start early in the year, and then have another peak around September (when college students go back to school). There are now 67 super_cat subcategories in this area. In 2010, 44 of the areas were negative year-over-year and only 23 areas had growth; when you add the negative and the positive areas, there were 72,024 fewer units sold in these areas during 2010. This is only a 3.32% decline, so this large family of titles actually did better than the overall market. The top five performing categories, in order, were Mobile Programming, Security Topics, Cloud Computing, Information Technology, and Data Topics. The categories with the worst performance, in order, were Linux, Software Project Management, Personal Database, Visual Basic, and SQL Server. In the top performing area of Mobile Programming, iPhone Programming led the way for growth in 2009, while Android led in 2010. Remember this is not the consumer market of books about how to use an iPhone or Droid, but the programming market. iOS was nine times as large as Android in 2009, and roughly 2.5 times as large of a category in 2010.
Here are the trend lines for the first set of three, of the nine main categories that make up Systems and Programming.
Note the scale of the overall category. Programming languages have consistently been the largest category group; the category "prog" has come from a distant third to the number two super_cat in this area. Databases have been consistently declining for about three years now. As mentioned earlier, Software Project Management was one of the biggest losers of 2010, yet it was also the third-largest super_cat in Systems and Programming, preceded by Mobile Programming and Security Topics. However, these latter two showed positive growth, compared to SPM's decline. Another area that came from nowhere and is now a healthy super-category is Data Topics. Many of these titles are similar to the talks, sessions, and writings found at our Strata Conference and Data Science resources. The second set of three trend line charts are healthy and show less volatility when compared to category groups from other Category Families. Their trend is flat, yet consistent with the seasonal swings of the market.
When comparing the whole year of 2009 to 2010, the Software Engineering group is the largest of the second set of three. It is led by a classic title in The Mythical Man-Month: Essays on Software Engineering, Anniversary Edition and a new classic in Coders at Work: Reflections on the Craft of Programming. The Network category is dominated by CompTIA titles, which hold five out of the top ten spots in the category, including the top two spots. The third set of trend lines were driven by CISSP, Intrusion topics, and CompTIA Security.
Next up, Post 3 will be about the publishers, winners and losers. Post 4 will contain more analysis of programming languages. And Post 5 will look at digital sales.
Post 3
In this third installment (see Post 1 and Post 2; Post 4 & 5 to come soon), we will look at how publishers fared in 2010, as compared to 2009. The chart below shows our dashboard view of the large publishers' results for 2010. The most notable piece of information is that Wiley continues to hold the leading spot as the largest publisher (with 32% market share of units sold), while Pearson and O'Reilly both lost 1%, which was picked up by Cengage and McGraw Hill. (We'll look at revenue share later in the analysis.)
2009 Pub Share
2010 Pub Share
You may not recognize the names of all the top publishers, because they are actually conglomerates of many smaller publishing imprints that they've acquired, created or distributed over the years. The imprints are the familiar consumer-facing brands. For instance, when you purchase a book from Peachpit or Sams, you typically see Peachpit or Sams on the spine, not Pearson, even though Pearson owns both companies. In O'Reilly's case, all the imprints that are not branded "O'Reilly" are part of a distribution partnership and are not owned by O'Reilly. The various imprints that make up each major publisher's share are shown in detailed pie charts later in this post. Let's look at the top publishers and how they performed year-over-year. The following table provides some interesting comparative data.
2010 Units 2009 Units 2010 Title Count 2009 Title Count 2010 Efficiency 2009 Efficiency
Lightning Source Sum/Avg
So what is notable in this data? First, the big three publishers (more than 1 million units per year) are all down. Second, four of the top eight publishers are up: McGraw Hill, Cengage, Reed Elsevier, and Lightning Source all had modest gains in 2010. Overall, these top eight publishers collectively saw 266,232 fewer units sold in 2010, with 222 additional titles making the list.
A note on Market Share versus Title Efficiency
A typical indicator of publisher performance is market share of units sold, which is what we've been looking at so far. Perhaps a better measure is how many published titles it takes to get a comparable share of unit sales. This is the ratio of unit market share to title share. Think about it this way: if a publisher has 15% of the titles appearing in the Bookscan Top 3000, and gets a 15% share of units sold, they will have a ratio of 1:1, expressed as a title efficiency of 1.0. A publisher with 20% of the title share, and 10% of the unit share would have a .5 efficiency. An efficiency of 1 is the market average: 100% of the title count delivering 100% of the unit sales. A publisher that achieves its share with fewer titles will have a higher ratio. Only two publishers continue to have an efficiency of more than 1: Wiley and O'Reilly. Publishers under the 1.0 threshold typically have many titles in the Bookscan data, but they are not selling many units. A note of caution though: some publishers have many evergreen titles, which can skew this data. Typically, older titles sell fewer units each subsequent year. But this is not always true, as some titles continue to sell like they are newly released. Head First Design Patterns is one example, still selling more than the majority of brand-new titles. So efficiency could be thought of as a frequency ratio rather than a true efficiency measure, because it is very efficient to publish a title and have it sell for years.
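To make the title efficiency arithmetic concrete, here is a minimal Python sketch of the ratio as the worked examples describe it: unit share divided by title share. The function name and the market totals (1,000,000 units and 3,000 titles) are hypothetical and serve only to reproduce the two examples in the text.

```python
def title_efficiency(publisher_units, market_units, publisher_titles, market_titles):
    """Title efficiency = unit share divided by title share (1.0 is the market average)."""
    unit_share = publisher_units / market_units
    title_share = publisher_titles / market_titles
    return unit_share / title_share


# The two worked examples from the text, on a hypothetical market of
# 1,000,000 units and 3,000 titles.
even = title_efficiency(150_000, 1_000_000, 450, 3_000)    # 15% of units, 15% of titles
below = title_efficiency(100_000, 1_000_000, 600, 3_000)   # 10% of units, 20% of titles

print(f"{even:.2f}")
print(f"{below:.2f}")
```

A value above 1.0 means the publisher earns more than its proportional share of units from the titles it has on the list.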
A true efficiency metric would take into account all titles published by all publishers and how many make it into the top 3000. Some publishers have titles that never even make the top 3000, so we will not be able to count them (for or against an efficiency metric) because they are missing from the dataset.
A Note on Evergreen Status
The table below shows imprints that have a percentage of evergreen titles. And how did we come up with an evergreen status? We assigned points to titles that had copyright dates 16 or more years old (most points), 10-16 years, 5-10 years, and less than 5 years (no points because they have not proved to be long-lasting yet). After assigning points for each title, we were able to see what percentage of evergreen titles each imprint had in the top 3000 between the years 2004 and 2010. It's interesting to note, and somewhat expected, that the top three evergreen imprints have a strong academic heritage (Wiley, Addison-Wesley, and Prentice Hall).
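The evergreen scoring can be sketched as a simple age-bucket function. The post gives the age bands but not the point values or exactly how points become a percentage, so both are assumptions here: the 3/2/1/0 scale, the handling of band boundaries, and the rule that any title earning points counts as evergreen are hypothetical, as are the function names and the sample copyright years.

```python
def evergreen_points(copyright_year, as_of_year=2010):
    """Assign age-based points to a title. The 3/2/1/0 point values are assumed."""
    age = as_of_year - copyright_year
    if age >= 16:
        return 3  # oldest band: most points
    if age >= 10:
        return 2
    if age >= 5:
        return 1
    return 0  # under 5 years: not yet proven long-lasting


def evergreen_share(copyright_years, as_of_year=2010):
    """Share of an imprint's titles that earn any points (an assumed definition)."""
    if not copyright_years:
        return 0.0
    scored = [evergreen_points(year, as_of_year) for year in copyright_years]
    return sum(1 for points in scored if points > 0) / len(scored)


# Hypothetical copyright years for one imprint's titles in the top 3000.
imprint_years = [1994, 1999, 2004, 2008, 2009, 2010]
points = [evergreen_points(year) for year in imprint_years]
share = evergreen_share(imprint_years)
print(points)
print(f"{share:.0%}")
```

A weighted variant could instead sum the points and divide by the maximum possible score, which would favor imprints whose long-lived titles are also the oldest.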
Now that we have a basic understanding of title efficiency and evergreen status, let's look further into the 2010 results for the imprints and drill in on the top three publishers: Wiley, Pearson, and O'Reilly. This is important because you typically see the imprint name on a book when you purchase it, but may not be aware of who the publisher is. (You'll likely see the publisher inside the book on the copyright page, except in the case of O'Reilly because our other imprints are distribution partners. That is, O'Reilly provides some sales and distribution services to these partners, but they are not owned as is the case for Pearson and Wiley imprints.)
#2 O'Reilly Media
In 2010, O'Reilly Media became the second largest publisher with all imprints aggregated under the O'Reilly partner and distribution umbrella. In this data, I have included all partners and roughly the year they were added to their respective conglomerate. For the most part, our agreement with Microsoft Press is what catapulted O'Reilly Media to become the second largest tech publisher, though only by a slight margin ahead of Pearson (less than 15,000 units). Wiley continues to dominate as the largest publisher and seems to be more resilient to the market declines, chiefly due to the Dummies brand and its wide-ranging scope. The trend chart below shows the three main tech book publishers and their respective growth by year.
Now that you have an idea of the imprints that make up the largest three publishers, let's tease out all the imprints and look at their respective market share. The following chart shows the top 20 imprints and how they stack up against each other. These ten imprints saw 422,814 fewer units sold in 2010, and remember that 2009 was not a strong sales year for tech books. The market was held together by the medium-to-small size publishers that you do not see on this list. From this imprint view, you'll notice that O'Reilly has the second largest market share behind Dummies.
So what do the graphs tell us? The first notable thing is that there was very little movement in the top ten imprints. In other words, the imprints that were occupying the top slots still do. In fact, of the top 20 imprints, there were only 7 that showed a slight increase in units compared with 2009 and 8 that showed an increase in dollars (see the graph below). The only movement out of the top ten was Apress, which dropped from #10 to #11. Sybex, Wrox, and then Course Technology (in that order) showed the biggest increase in units from 2009 to 2010, and Wrox, Course Technology, and then Sybex (in that order) showed the highest growth percentage. Basically, Sybex had a larger base to grow from, so the overall share growth was not as significant for them. The other two were half the size, so their unit growth had a bigger impact on their growth percentage.
Before analyzing imprints by category, let's revisit the data with dollars rather than units. We have a fairly easy way of calculating this: units sold * list price = dollars. Granted, there are discounts, promotions, and other things that affect the precision of this, but it is pretty close. If nothing else, you can think of this as retail value. So here are the top imprints from a revenue perspective. Again, this is at the imprint level and from a dollar perspective. As you can see, compared to the units chart above, the leaderboard quickly changes. Microsoft Press becomes the number one revenue-producing imprint, followed by O'Reilly, and then Dummies. The biggest move in the top 20 is that Addison-Wesley jumps from #8 in units to #4 in dollars, and conversely, Wiley's Visual imprint goes from #10 in units to #17 in dollars.
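The units-to-dollars conversion described above can be sketched in a few lines. This is a minimal illustration, not the actual Bookscan pipeline: the imprint names, titles, unit counts, and list prices are all made up, and the only thing taken from the text is the rule that retail value is units sold times list price.

```python
from collections import defaultdict

# (imprint, title, units_sold, list_price_usd)
# Made-up rows for illustration only; these are not Bookscan figures.
titles = [
    ("Imprint A", "Beginner Guide", 60_000, 24.99),
    ("Imprint A", "Quick Start", 40_000, 19.99),
    ("Imprint B", "Pro Reference", 50_000, 49.99),
    ("Imprint B", "Exam Kit", 30_000, 59.99),
    ("Imprint C", "Handbook", 90_000, 29.99),
]

units = defaultdict(int)
dollars = defaultdict(float)
for imprint, _title, sold, list_price in titles:
    units[imprint] += sold
    # units sold * list price = dollars (discounts and promotions ignored)
    dollars[imprint] += sold * list_price

# Rank the same imprints two ways: by units and by estimated retail value.
rank_by_units = sorted(units, key=units.get, reverse=True)
rank_by_dollars = sorted(dollars, key=dollars.get, reverse=True)

print("By units:  ", rank_by_units)
print("By dollars:", rank_by_dollars)
```

With these hypothetical numbers, the imprint with the fewest units but the highest list prices moves to the top of the dollar ranking, which is the same kind of reshuffling seen when the leaderboard is viewed by revenue instead of units.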
Imprint Analysis by Category
Now that we have seen a high-level picture of what imprints did in 2010, let's take a look at which categories each of them publishes in and where their strengths lie. Dummies and O'Reilly appear to have the most diverse publishing programs, as they are not at the bottom in any category. Dummies is clearly the leader in Business Apps and Consumer Operating Systems, while O'Reilly has climbed further ahead of Microsoft Press in the Systems and Programming category. This chart also seems to indicate that Addison-Wesley is really only publishing in the Systems and Programming space. That is what some publishers do: they have specific imprints publish in one or two categories only. The jury may still be out on whether that is a good or bad strategy, but Addison-Wesley's success in revenue growth in 2010 could be because of this focus. The corollary is that when there is not much new tech driving one area, a publisher/imprint may become more susceptible to market declines because of the lack of diversification.
Imprints' Category Strength
Categories and the Publishers who Dominate Them
The following category images are for 2010, and the tables have each publisher's count of titles and sum of units. The top titles are also listed for each area in 2010.
Category: Systems and Programming
In this category, you can see that O'Reilly now has the largest market share among the publishers, with Pearson a close second. If we drill into the imprint level, the picture of who is driving this gets clearer. The top six imprints are O'Reilly at 13.66%, Microsoft Press at 10.26%, Addison-Wesley at 9.65%, For Dummies at 7.04%, Apress at 6.67%, and Prentice Hall at 5.48%. What is not obvious from this data is that the top publishers all have come down a couple of percentage points, which means that the market is getting its growth in the middle of the pack.
As you can see in the table below, O'Reilly has the most units and best title efficiency rating. It is a relatively healthy mix. That is, we have quite a few titles, but our efficiency is also significantly above the market average.
Sys & Prog - Publisher Market Share (01/01/2010 - 12/31/2010)
Title Count Units/Title Efficiency 637
Note: This category family contains "programming languages" and "programming", where more units were sold in 2010 than in 2009. The leading titles and publishers for Systems and Programming in 2010 were:
1. PMP Exam Prep: Rita's Course in a Book for Passing the PMP Exam (this is the perennial leader) (RMC)
2. MCTS Self-Paced Training Kit: Configuring Windows 7 (Microsoft Press)
3. CISSP Certification All-in-One Exam Guide, 5th Ed. (McGraw Hill)
4. CCNA Official Exam Certification Library, 3rd Ed. (Cisco Press)
5. Head First Java, 2nd Edition (O'Reilly)
Category: Web Design and Development
In this category, you can see that Wiley has the largest market share among the publishers, with Pearson second. If we drill into the imprint level, the picture changes a bit. The top six imprints are O'Reilly at 21.11%, Dummies at 13.30%, Sams at 6.25%, Wiley at 5.92%, New Riders at 5.82%, and Peachpit at 5.40%.
As you can see in the table below, Pearson has the most titles and their performance is strong in this category. In Web Design and Development, it used to be that most of the top publishers were above the title efficiency average of 1.0, but now there are only three top publishers over the 1.0 efficiency threshold. (This suggests that there are a lot of second-tier publishers with lower efficiency who don't show up in the table.) The category declined by about 25,000 units, yet saw 21 additional titles make the list in 2010. This contributes to the decrease in category/publisher efficiency.
Web Des & Dev - Publisher Market Share (01/01/2010 - 12/31/2010)
Title Count Units/Title Efficiency 219
The leading titles and publishers for Web Design and Development are:
1. Don't Make Me Think: A Common Sense Approach to Web Usability, 2nd Ed. (New Riders)
2. Head First HTML with CSS & XHTML (O'Reilly)
3. CSS: The Missing Manual (O'Reilly)
4. HTML, XHTML, and CSS: Visual Quickstart, 6th Ed. (Peachpit)
5. WordPress For Dummies: 2nd Ed. (Wiley)
Category: Business Applications
In this category, you can see that Wiley has the largest market share among the publishers, and O'Reilly has moved ahead of Pearson for second. (Apologies: the O'Reilly 21% is obscured in the graph.) If we drill into the imprint level, the picture changes a bit. The top six imprints are Dummies at 28.34%, Microsoft Press at 14.72%, McGraw Hill/Osborne at 7.06%, O'Reilly at 6.29%, John Wiley at 5.74%, and Que at 4.18%.
Bus Apps - Publisher Market Share (01/01/2010 - 12/31/2010)
Title Count Units/Title Efficiency 386
McGraw Hill Cengage
The leading titles and publishers for Business Applications are:
1. Facebook For Dummies (Wiley)
2. Office 2007 All-in-One Desk Reference For Dummies (Wiley)
3. Excel 2007 for Dummies (Wiley)
4. Microsoft Office Excel 2007 Step by Step (Microsoft)
5. QuickBooks 2010 The Official Guide (McGraw Hill)
6. Excel 2007 All-In-One Desk Reference For Dummies (Wiley)
Category: Consumer Operating Systems
In this category, you can see that Wiley has the largest market share among the publishers (at a whopping 46%), with O'Reilly again in second at 22%, and Pearson comfortably in the third spot at 17%. (Apologies: the O'Reilly % is partially cut off in the graph.) If we drill into the imprint level, the picture changes a bit. The top five imprints are Dummies at 31.00%, O'Reilly at 13.15%, Que at 9.39%, Microsoft Press at 8.52%, and Wiley's Visual at 7.10%.
As you can see in the table below, Wiley has the most titles and a relatively good efficiency rating, and O'Reilly also has a very healthy title efficiency rate. What is impressive about this category is that it has four of the top six imprints averaging more than 1,000 units per title. To me, that means it's a category that sustains numerous big-seller titles, not just an occasional retail success. As you can see from the best-selling titles below, it is mostly Windows 7 that is driving this category, though iPad: The Missing Manual came from virtually nowhere to be among the bestsellers in 2010.
Cons Opsys & Dev - Publisher Market Share (01/01/2010 - 12/31/2010)
Title Count Units/Title Efficiency
The leading titles and publishers for Consumer Operating Systems are:
1. Windows 7 For Dummies (Wiley)
2. Windows 7 For Dummies Book + DVD Bundle (Wiley)
3. Windows 7 Plain & Simple (Microsoft Press)
4. Mac OS X Leopard: The Missing Manual (O'Reilly)
5. Windows 7 Step by Step (Microsoft Press)
6. iPad: The Missing Manual (O'Reilly)
Category: Digital Media
In this category, you can see that Pearson has regained the top spot, with Wiley falling to second. As you can see in the table below, Pearson has the most titles and a relatively good efficiency rating. O'Reilly also has an extremely healthy efficiency rate and average units per title. This relates to my earlier comment about publishing fewer titles, while getting more out of the ones that you do publish. For instance, in the table below, Reed Elsevier has moved into third for publishers, yet this is largely because it has twice as many titles as O'Reilly, and hence a much lower efficiency rating. If you factor in that Reed Elsevier also made 1/4 of its units on one title, its efficiency is a little deceiving.
If we drill into the imprint level, the picture changes a bit. The top six imprints are Peachpit Press at 18.19%, For Dummies at 15.97%, Focal Press at 12.91%, O'Reilly at 11.18%, Adobe Press at 9.91%, and New Riders at 7.73%.
Digital Media - Publisher Market Share (01/01/2010 - 12/31/2010)
Title Count Units/Title Efficiency
The leading titles and publishers for Digital Media are:
1. Adobe Photoshop CS5 for Photographers (Focal Press)
2. The Adobe Photoshop CS5 Book for Digital Photographers (New Riders)
3. Adobe Photoshop CS4 Classroom in a Book (Adobe Press)
4. Adobe Photoshop CS5 Classroom in a Book (Adobe Press)
5. Photoshop Elements 8 for Windows: The Missing Manual (O'Reilly)
Next up, Post 4 will contain more analysis of programming languages. And Post 5 will look at digital sales.
Post 4
In this fourth post (posts one, two and three are found here) on the State of the Computer Book Market, we will look at programming languages and drill in a little on each language area. Overall, the market for programming languages was down 6.27% in 2010 when compared with 2009. There were 6,303,125 units sold in 2009 versus 5,931,452 units sold in 2010, which is a decrease of 371,673 units. Java experienced the biggest gain in units, at 28,633 more units in 2010 than 2009, while PHP occupied the opposite end with the biggest decrease at 38,614 fewer units year-over-year.
Before we begin to drill in on the languages, we thought it would be best to explain our "language dimension." When we group books by their language dimension, we categorize them by the language used in their code examples. So Flash Programming with Java would be in our Flash atomic category, but the language dimension would be Java. Similarly, our Head First Design Patterns book contains examples written in Java, so it too carries the "java" tag on the language dimension.
To provide some perspective, 2009 and 2010 have been the worst two years for book sales in the category of programming languages. The chart directly below does not include books that are method-oriented, about project management, about Consumer Operating Systems, or books without language-oriented material. So this is a different view of the market than the overall view found in Post 1 of this series. In the chart below you can see all languages on a week-by-week basis, showing that 2009 and 2010 are consistently below prior years.
In 2008, we reported that C# surpassed Java as the number one language. But hold on: Java proved to be resilient in 2009, experienced a resurgence in 2010, and is now the number one language from a book sales perspective. As you can see in the 2010 Top 20 languages chart below, Java has a significant lead in the language race, with Objective-C moving into third place closely behind C#.
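The year-over-year decline quoted at the start of this post can be checked with a few lines of arithmetic. The sketch below uses only the two unit totals given in the text; the variable names are illustrative. Dividing the change by the 2009 total, which is the usual convention for a year-over-year change, gives roughly 5.90%, while the 6.27% figure appears to correspond to dividing by the 2010 total instead.

```python
# Unit totals for programming-language books, as quoted in the text.
units_2009 = 6_303_125
units_2010 = 5_931_452

# Absolute change in units (negative means a decline).
change = units_2010 - units_2009

# The same change expressed against each possible base year.
pct_vs_2009 = 100 * change / units_2009  # conventional year-over-year base
pct_vs_2010 = 100 * change / units_2010  # base that reproduces the quoted figure

print(f"Change in units: {change:,}")
print(f"Relative to 2009: {pct_vs_2009:.2f}%")
print(f"Relative to 2010: {pct_vs_2010:.2f}%")
```

Either way the direction and rough size of the decline are the same; the choice of base only matters when comparing percentages across categories.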
2010 Market Share
A Treemap View of the Programming Languages
In the treemap view above, which compares the last quarter of 2010 with the last quarter of 2009, you'll notice a lot of bright green areas, several solid green areas, and a fair share of black and red areas. The main reason Objective-C is down 12% is that it had a tremendous 2009, which was hard to sustain. The language went from a small speck on this treemap view to occupying a fairly sizable square.
Before we dive in, let's look at the high-level picture for the grouping of languages. I have grouped these languages by total number of units sold between 2004-2010. As you can see in the table below, only the Mid-Major group experienced growth in 2010, while the rest showed declines. The language driving the most growth in the Mid-Major area was R. An interesting observation is that the statistical languages, much like those you would be exposed to at our Strata Conference, are experiencing substantial growth. Namely, R, SAS, MATLAB, LabVIEW, Mathematica, and SPSS have collectively seen an increase of 49,504 units, or a whopping 102.87% growth. Maybe Hal Varian's quip about Statistics being the "sexy job of the future" is motivating developers to learn these languages.
Y2010 Units Y2009 Units Y2010 # Y2009 # 10MketShar 9MketShar 1,051,945
Mid-Major 3,000 - 9,999
Mid-Minor 1,682 - 2,999
For the sake of grouping and presenting this information in a more readable format, we have classified the categories for the languages in this way with the following headers:
1. Language: Name or short name of the language
2. 2010 Units: Units sold in 2010
3. 2009 Units: Units sold in 2009
4. 2010 Titles: Number of Titles making Bookscan 3000 in 2010
5. 2009 Titles: Number of Titles making Bookscan 3000 in 2009
6. 10Mkt Share: 2010 Market Share
7. 09Mkt Share: 2009 Market Share
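The groups in the sections that follow are defined by bands of unit sales, so assigning a language to a group is a simple threshold lookup. The sketch below is one way to express it, under stated assumptions: the function and constant names are hypothetical, only the lower bound of each band is used (so a value that falls in a gap between two headings, such as 1,681, lands in the lower group), and the Linelist floor is taken as 400 because the final group is described as fewer than 400 units.

```python
# Lower bound of each group in units, taken from the group headings.
# Assumption: Linelist starts at 400, since TheRest is "fewer than 400 units".
TIERS = [
    (50_000, "Large"),
    (10_000, "Major"),
    (3_000, "Mid-Major"),
    (1_682, "Mid-Minor"),
    (1_000, "Minor"),
    (400, "Linelist"),
]


def tier_for(units: int) -> str:
    """Return the group name for a language with the given unit sales."""
    for lower_bound, name in TIERS:
        if units >= lower_bound:
            return name
    return "TheRest"


# Example: a language group that sold 7,648 units falls in the Mid-Major band.
print(tier_for(7_648))
```

Checking from the largest bound downward keeps the function short and avoids having to spell out both ends of every band.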
Large Programming Languages 50,000 - 195,000 units in 2010
*Large* Language 2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share
Here are the top titles for the Large languages. Incidentally, the titles and order are the same whether you look at units sold or dollars generated, except that the WordPress title falls out of the top five and Addison-Wesley's PHP and MySQL Web Development moves to #5:
Head First Java, Second Edition
Professional Android 2 Application Development
Programming in Objective-C 2.0 (Addison-Wesley)
WordPress for Dummies (covers PHP) (Dummies)
You'll notice in the Major languages that C, PowerShell, ShellScript, and VBScript all had growth. Overall, however, the Major languages sold roughly 27,000 fewer units in 2010 than in 2009, which equates to a 12% decrease.
Major Programming Languages 10,000 - 49,999 units in 2010
*Major* Language 2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share
Here are the top titles for the Major languages.
C Programming Language
Practical Guide to Linux Commands, Editors, and Shell Programming
Learning Perl, 5th Edition
Programming Massively Parallel Processors: A Hands-on Approach (C language) (Morgan Kaufmann)
Agile Web Development with Rails, Third Edition (Pragmatic)
Mid-Major Programming Languages 3,000 - 9,999 units in 2010
The news in this category is that the statistical languages are doing really well. As noted above, these languages have grown by 102.87% from 2009 to 2010. The most impressive growth is for the eight titles for the R language: the overall category is led by R in a Nutshell.
2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share
Here are the top titles for the Mid-Major languages.
R in a Nutshell: A Desktop Quick Reference
Using SPSS for Windows and Macintosh: Analyzing and Understanding Data
The Little SAS Book: A Primer, Fourth Edition
SPSS Survival Manual: A Step by Step Guide to Data Analysis Using SPSS for Windows (Open University Press)
Mastering Unreal Technology, Volume I: Introduction to Level Design with Unreal Engine 3 (Sams)
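Mid-Minor 1,682 - 2,999 units in 2010
The news in this category is the growth of functional languages, like F#, Scala, and Lisp. These languages showed nice year-over-year growth, generating 7,648 units in 2010 compared to 3,718 units in 2009. (The 51.38% growth figure expresses the 3,930-unit gain as a share of the 2010 total; measured against the 2009 base, units roughly doubled.)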
2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share
Here are the top titles for the Mid-Minor languages.
Learning To Program with Alice
Programming in Scala: A Comprehensive Step-by-step Guide
Land of Lisp: Learn to Program in Lisp, One Game at a Time! (No Starch Press)
LabVIEW 2009 Student Edition (Prentice-Hall)
Real World Functional Programming: With Examples in F# and C#
Minor Languages 1,000 - 1,680 units in 2010
This category of languages saw 6 of the 10 languages sell fewer units in 2010. There was roughly a 20% decrease in units sold year-over-year. The bright spot was the performance of Mathematica, mostly fueled by the Mathematica Cookbook. Like the previous category, this area is dominated by functional languages; however, these languages are not experiencing the same substantial growth.
2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share
Here are the top titles for the Minor languages.
Real World Haskell
Programming Clojure (Pragmatic)
sed & awk (O'Reilly)
Linelist 399 - 999 units in 2010
This category of languages saw 6 of the 10 languages sell more units in 2010, although the sales volume is fairly insignificant. There was roughly a 0.81% decrease in units sold year-over-year. I am not going to list the bestsellers, because they are not exactly bestsellers in this sort of category.
2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share
TheRest Programming Languages < 400 units in 2010
Lastly, the following languages sold fewer than 400 units in 2010. Here is the list in descending order: autolisp, unity, x++, cfml, inform, mysql spl, blitz3d, q, nxt, gml, pure data, javafx, rpg, cobol, nxc, minitab, ml, boo, ada, fortran, octave, jcl, racket, jsl, idl, cfscript, abap, verilog, m, smalltalk, mumps, go, windows script, egl, c/al, realbasic, bondi, cl, cs2, eiffel, ocaml, and xquery.
Next up, Post 5 will look at digital sales.
In 2010, Paul McFedries had his name on 58 different books (ranging from 2001 through 2010) that made our list, for a total of 112,152 units sold. His books sold the most units in 2010, and he had 12 new books published in 2010 that made the list. His total was about 20,000 more units than that of David Pogue, who saw 13 of his titles make the list with 92,000 units sold. Pogue, however, averaged about 7,000 units per title, while McFedries averaged about 1,900 per title. The remaining authors that make up the top five for unit sales are Andy Rathbone, Greg Harvey, and Dan Gookin. And if you look at the top five authors from a dollars perspective, the list, in descending order, looks like this: Paul McFedries, David Pogue, Andy Rathbone, Rita Mulcahy, and Scott Kelby.
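The per-title averages for the two leading authors follow directly from the totals given above. This is a small worked check using only those figures (the Pogue total is the rounded 92,000 from the text); the dictionary layout and names are just for illustration.

```python
# Units sold and number of titles on the list, as quoted in the text.
authors = {
    "Paul McFedries": {"units": 112_152, "titles": 58},
    "David Pogue": {"units": 92_000, "titles": 13},  # rounded total from the text
}

# Average units per title for each author.
averages = {name: a["units"] / a["titles"] for name, a in authors.items()}

# Gap between the two authors' total units.
gap = authors["Paul McFedries"]["units"] - authors["David Pogue"]["units"]

for name, avg in averages.items():
    print(f"{name}: {avg:,.0f} units per title")
print(f"Gap in total units: {gap:,}")
```

The totals favor the author with the larger catalog, while the per-title average favors the author with fewer, stronger-selling titles.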
Electronic Distribution and Sales
Now let's move past print sales in 2010, or at least partially away from traditional channels of distribution, to discuss e-distribution. The chart immediately below shows eBook sales in 2008. The data from The Association of American Publishers (AAP) is found here. What amazes me is the growth in scale. The top of the chart scale has grown nearly 7-fold, from $18 million in 2008 to $120 million in 2010. And the timing is just now at a tipping point because of all the tablet devices being released. If the market continues on its current growth pattern, it will be a billion-dollar business in 2013. Click on each image to view a larger version.
Market in 2008
Market in 2010
The AAP has these caveats to explain the data:
● The data above represent United States revenues only.
● The data above represent only trade eBook sales via wholesale channels. Retail numbers may be as much as double the above figures due to industry wholesale discounts.
● The data above represent only data submitted from approx. 12 to 15 trade publishers.
● The data does not include library, educational or professional electronic sales.
● The numbers reflect the wholesale revenues of publishers.
● The definition used for reporting electronic book sales is "All books delivered electronically over the Internet OR to hand-held reading devices".
● The IDPF and AAP began collecting data together starting in Q1 2006.
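To see what "continuing on its current growth pattern" could imply, here is a rough back-of-the-envelope projection. It rests on two loud assumptions that are mine, not the AAP's: the $18 million and $120 million chart-scale maxima are used as stand-ins for market size, and growth is modeled as a constant compound annual rate between 2008 and 2010. The function name `project` is hypothetical.

```python
# Chart-scale maxima quoted in the text, in millions of US dollars.
# Assumption: these stand in for market size; they are not reported revenue.
scale_2008 = 18.0
scale_2010 = 120.0

# Constant compound annual growth factor implied by the two data points.
years_elapsed = 2010 - 2008
annual_factor = (scale_2010 / scale_2008) ** (1 / years_elapsed)


def project(year: int) -> float:
    """Projected scale (USD millions) if the same annual factor persists."""
    return scale_2010 * annual_factor ** (year - 2010)


print(f"Implied annual growth factor: {annual_factor:.2f}x")
for year in (2011, 2012, 2013):
    print(f"{year}: ${project(year):,.0f} million")
```

Under these assumptions, the projection stays below $1 billion in 2012 and passes it in 2013, and the inputs are wholesale figures.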
Based on these caveats, it is an understatement, in my opinion, to say that the market will be a billion dollar market in 2013.
The following two charts show Safari revenue growth and the oreilly.com direct sales mix. I am showing these because the same content that goes into our print books is available in various digital forms. Safari is a subscription service with more than half a million users. Its main focus is its B2B service, which gives developers from many of the largest companies in the world access to Safari Books Online. One notable difference is that the categories with consumer-oriented titles, and many of our Digital Media titles, do not perform as well in Safari. Developer titles rule in Safari. As you can see from the chart to the left, our content in Safari is growing at a nice steady rate.
The chart on the right is potentially more interesting and indicative of what is happening in the market. This shows the percentage of sales on oreilly.com during 2010 for Books, Ebooks, and Video. These are the big three content types, with much of the same content. The ebooks are just digital versions of our print products. So the big, and more likely HUGE, news is that ebooks represent about 88% of our unit sales and 79% of our dollar sales on oreilly.com. What is really impressive is that the growth of our digital products is moving faster than the decline of our print products. This suggests to me that we are not seeing one product type cannibalizing another; rather, they are supplementing each other.
The third chart shows the changing nature of our publishing program. The percentages represent each particular format and how many units we sold that year. To me, this shows the trend of what is happening. Print is slowly declining, and EPUB, Mobi, and Ebooks are skyrocketing off the chart at a rate faster than print is declining. PDF is declining, and when you think about it, this makes sense. O'Reilly used to offer two types of book product: print and a PDF. Now we offer our content in virtually any form a reader would like it. So with Mobi, EPUB, and Ebooks, we are seeing the less useful PDF decline significantly.
Safari Growth for O'Reilly
oreilly.com Sales Mix
O'Reilly Product Sales Mix
Again, this data is taken from direct sales for O'Reilly and oreilly.com, and may not represent the whole computer book market. I have heard from other publishers, specifically Dave Thomas at the Prags, that this split is consistent with, or a little behind, their publishing program. Smaller publishers who are growing are seeing digital products catch on quicker than print, and some of us who have sizeable legacy-print-programs are seeing a faster ramp-up of digital products than the decline of our print programs. It is an interesting notion to talk about a print program as 'legacy', but that is truly the best description. How's that for the changing nature of the computer book market? Thank you for reading these posts. If there is something that you are itching to see [understand more clearly], please let me know and I will try to help. I plan to excerpt updated pieces of these posts on Twitter throughout the year. They'll come from @mikehatora and will likely get retweeted by @oreillymedia.
Published on Feb 23, 2011
O'Reilly's analysis of the American market for computer books. A market that also points the way for the technologies that matter right now ...