

Beyond Web Analytics: Analytics and Reporting for Real Business Results

By Joseph Rozenfeld VP Strategy & Solutions Skytide, Inc.

January, 2008

Skytide 1 Waters Park Drive, #160 San Mateo, CA 94403 650.292.1900 www.skytide.com


Table of Contents

Overview
New Data Environment
Data Value Chain
The Skytide Process
The Old Way vs. The Skytide Way: A User Scenario
Sample Reports
Skytide Application Sets for Online Content Delivery
Excerpt from 451 Group Impact Report
Summary
Table 1

About the Author

Joseph Rozenfeld, Vice President of Strategy and Solutions (Co-Founder), Skytide

Joseph is responsible for shaping the company’s strategic vision and technologies to meet the fast-growing needs of the business intelligence marketplace. Joseph has more than 20 years of software development and management experience and has founded or co-founded four companies including Skytide and ChainCast Networks. As executive vice president and CTO of ChainCast Networks, the first provider of commercial peer-to-peer software for broadcast streaming, he grew the company into the largest streaming provider for terrestrial radio broadcasters in the U.S. with a client list including ClearChannel, NTT, Cox, and ABC. Joseph was also a founding engineer, development manager and architect of Essbase and IBM DB2 OLAP servers at Hyperion Solutions (formerly Arbor Software). Joseph holds an M.S. in Computer Science and a B.S. in Applied Mathematics from Moscow Polytechnique University.


Beyond Web Analytics: Analytics & Reporting for Real Business Results

Overview

Online content is exploding. In parallel, user expectations of and reliance on this content have set the stage for accelerated growth in this market. Delivery of online content has become a legitimate and increasing portion of bottom-line revenue for both “new-economy” and traditional business-model companies. One has only to look at Microsoft, which generated over $150M in 2007 from movie downloads alone [1], and imagine the scores of unforeseen revenue streams yet to be realized by new media companies such as YouTube and Facebook.

Driven largely by the arrival of Web 2.0 and the availability of enterprise-level bandwidth, the online content phenomenon has spawned a host of enabling technologies and business models to help companies deliver new, diverse and compelling types of content ever faster to an increasingly hungry global audience. This phenomenon is responsible for revitalizing the content delivery market, projected to reach $4B in just a few years.

From iTunes to Netflix movie downloads, maximizing the return on investment of the ongoing stream of online content is mission-critical. Companies have implemented new ways to produce, transport, store, and make available all sorts of exciting new content and applications, from streaming video and live Flash presentations to interactive chat sessions. What has lagged behind, however, are solutions that give content owners effective ways to understand the “who, what, when, where, why, and how” of content usage, an understanding that is critical to monetizing content. Solutions offered by web analytics and Content Delivery Networks (CDNs) fall short of delivering comprehensive analytics and reporting on content usage, and remain unable to answer critical questions, such as:

• How long is any one video clip watched?
• Which users downloaded my least trafficked content?
• What region is responsible for the most content downloads?

Answering these questions may seem simple, but it requires accessing and merging multiple sources of extremely high volumes of data: sources as diverse as content server log files, network traffic control data, and content application log files, all notorious for their complexity, lack of standards, and huge volumes of raw data (think terabytes per day or hundreds of thousands of records generated per second). Web analytics solutions are built around analyzing a single data source, the clickstream, and are thus unable to manage multiple streams of high-volume data. Enterprise content owners and content service providers are left to cobble together a mix of traditional business intelligence/analytics tools, web analytics, and even enterprise search tools. But this mix of “old” and “new” technologies doesn’t solve the significant analytical challenges:

Traditional Business Intelligence: Business Intelligence (BI) typically relies on database technologies, which break down when dealing with very large and very diverse data sets. These high-volume data sets can cause extreme latency, with analysis taking days, weeks or even months, rendering the intelligence meaningless to the business user. BI is also inflexible in handling dynamic, changing data sources or complex data formats.

Web Analytics: A host of web analytics solutions can offer insights into web traffic logs, but they typically involve special coding placed on corporate web pages, and often require a predictive path for further drill-down. What these solutions don’t offer is an unbiased view across all web traffic, or the ability to combine and relate the web log data to other data sources, such as streaming content logs, customer data, and marketing campaign or lead/sales tracking databases. These solutions have implemented complex, heavy infrastructure to handle a single stream of data (HTTP click-stream data), and that infrastructure can’t handle the multitude of content application data streams.

Advanced Enterprise Search: Technology advances now offer faster, easier ways to search across enterprise databases and networks to locate data entities. However, enterprise search today is far from offering analysis and insights about these data streams.



• Scale: Handle massive amounts of data over time
• Flexibility: Seamlessly integrate diverse, “weird” data formats
• Speed: Offer fingertip access to meaningful results based on this data

These challenges must be solved before organizations can turn content into sustainable revenue streams, because understanding the details of content usage over time drives the ability to deliver differentiated service offerings and to maximize the return on investment in online content delivery.

The New Data Environment

Monetizing content requires an understanding of how your customers interact with all of your content, not just easy-to-reach segments. The more you know about your users’ behavior, the better you can deliver what they want. As customer and prospect interactions increasingly happen over the network and the Internet, companies need to track behavior in this new cyber environment, which is far more complex, faster, and covers greater territory than ever before. And online user interactions are no longer limited to HTTP transactions. Instead, users are accessing a wide range of online applications and services, each of which generates a unique data stream. Users move seamlessly between online interactions, such as:

• Viewing a Flash demo to review a new product
• Watching a TV episode
• Clicking on a special banner ad offer
• Placing an eCommerce order
• Downloading a music clip
• Viewing a single web page

Each interaction touches different network components (web servers, live media servers, ad servers, etc.) and produces complex, diverse, and extremely large streams of data describing customer interactions with content (Figure 1). The sheer volume and complexity of the data generated from these multiple user interactions make it impossible to efficiently put the data into a database, a requirement of traditional BI solutions. Too often, the different pieces of data required to paint a complete picture of user behavior are discarded, accessed only for “point in time” analysis, stored and ignored, or sampled for a limited historic view. Thus, businesses make decisions about online content offerings, the real basis of monetizing online content investments, based on a limited understanding of how users and customers interact with the content.

Analytics and Reporting for Online Content .

Figure 1: Dynamic websites offer a wide range of possible user interactions, each of which connects with different network components, illustrated here. Obtaining a comprehensive view of user behavior requires that these interactions be combined and analyzed together, over time.

Data Value Chain: Answering the Important Business Questions

Data that is transformed into rich, timely information fuels successful businesses. The most valuable information, however, reflects the joining together of multiple sources of data, over time. A picture of how customers and prospects access and use online content can only be obtained by combining the various data sources generated by both online and offline processes. Each data source adds a dimension to understanding the patterns of behavior, with the highest business value derived from a complete multi-dimensional view of customer behavior as shown in Figure 2.

Key Question: A user goes to your website and clicks on a special promotional 2-minute video clip. Will your web analytics solution show you how long they spent viewing the clip?

Answer: No. Web analytics solutions track HTTP data only for pages into which you have inserted special code. Even with the code in place, you would see only the number of times the clip was accessed, not any metrics on viewing time, such as average viewing time. This data is available only if you have access to the log files of the media server responsible for streaming the content. Web analytics solutions cannot track this data.



Figure 2: Data Value Chain shows the increasing benefits obtained from combining multiple data sources.

Table 1 (at the end of this document) further demonstrates the types of meaningful results that are obtained when a single stream of operational data is merged with different customer data sources, permitting business users at the highest levels to better understand key factors driving both short- and long-term profits and revenues.
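The idea of adding one dimension per data source can be illustrated with a small sketch. This is not Skytide code: the record layouts, the `ip_geo` lookup, the `crm` extract, and the `enrich` helper are all hypothetical, and the data is invented purely to show how each merged source widens the view from 2-D to 4-D.

```python
from collections import Counter

# All records below are invented for illustration only.
content_log = [  # 2-D source: what was accessed, and by which address/account
    {"ip": "10.0.0.1", "user_id": "u1", "title": "promo_clip"},
    {"ip": "10.0.0.2", "user_id": "u2", "title": "promo_clip"},
    {"ip": "10.0.0.1", "user_id": "u1", "title": "episode_1"},
]
ip_geo = {"10.0.0.1": "CA", "10.0.0.2": "NY"}  # hypothetical IP-to-region map
crm = {  # hypothetical CRM extract keyed by user account
    "u1": {"age_band": "31+"},
    "u2": {"age_band": "18-30"},
}


def enrich(record):
    """Add geography (3-D) and user attributes (4-D) to a content-log record."""
    merged = dict(record)
    merged["region"] = ip_geo.get(record["ip"], "unknown")
    merged.update(crm.get(record["user_id"], {"age_band": "unknown"}))
    return merged


enriched = [enrich(r) for r in content_log]

# Each extra data source allows one more dimension in the grouping key.
views_by_title = Counter(r["title"] for r in enriched)
views_by_region = Counter((r["title"], r["region"]) for r in enriched)
views_by_segment = Counter(
    (r["title"], r["region"], r["age_band"]) for r in enriched
)

print(views_by_title)
print(views_by_region)
print(views_by_segment)
```

A dictionary lookup stands in here for what would be a join across very large streams in practice; the point is only that the content log alone cannot answer the "where" or "who" questions until the other sources are merged in.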

While the benefits of combining multiple data sources to optimize content monetization are self-explanatory, the practical implementation is more difficult due to the nature of network and content traffic data: its complexity, volume, and lack of standards.

According to leading market analyst Dennis Drogseth, Vice President of Enterprise Management Associates (EMA), “Traditional Business Intelligence solutions lack the ability to deliver dynamic, real-time information on very large heterogeneous data sets. A new generation of business analytics, such as the Skytide Analytical Platform, will allow companies to bridge these high volumes of structured and unstructured data sets with speed and power for on-the-fly information without the overhead of a data warehouse.”

The Skytide Process: The Skytide Analytical Platform™ presents a new way of processing data that makes possible this merging of data sources, even when volumes of data reach terabytes and billions of records daily. Because Skytide analyzes data directly, without the need to store data in a database, it drastically increases the ability to perform meaningful analytics and reporting on data that traditional Business Intelligence (BI) solutions cannot deal with. The Skytide breakthrough enables business intelligence in areas that were previously prohibitive for BI because of that technology’s dependency on database-centric technologies and storage.

The content delivery market is a key area where Skytide has quickly gained market share because of its ability to answer this sector’s business imperative to monetize content delivery.



At a high level, the Skytide process follows these steps:

1. Skytide connects to each data source where it resides, and data is pulled into the Skytide engine for processing.
2. Standard or user-defined parsers are applied to the data, transposing it “on-the-fly” into an XML structure. A Skytide library of standard parsers covers tens of formats and can be extended for new or proprietary formats in a matter of hours. With parsers in place, this is an automatic, near-real-time process.
3. Data flows into pre-defined multi-dimensional “cubed” models, reducing data volume without loss of fidelity.
4. Models are instantly available for analysis and reporting.
5. Queries are built using Skytide’s unique point-and-click modeling Designer environment, which can be used by non-technical business users without IT involvement.
6. Reports are automatically generated through a wide range of presentation layers: Corda interactive dashboards, Crystal Reports, JasperSoft, Excel, etc.
7. Incremental updates are automatically performed, providing users near-real-time reports.
8. Ad hoc queries and reports can be quickly created to allow further drill-down or slicing through the aggregated data in analytical models.
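The flow of steps 2, 3, 7 and 8 can be sketched in a few lines. This is a toy illustration under stated assumptions, not the Skytide engine: the space-separated log format, the field names, and the `parse_line`, `update_cube` and `slice_cube` helpers are all hypothetical, and the "cube" is simply a dictionary keyed by dimension values that keeps counts and sums.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical log format (an assumption, not a real media-server format):
# timestamp client_ip title media_type seconds_viewed
SAMPLE_LOG = """\
2008-01-10T09:15:00 10.0.0.1 promo_clip flash 95
2008-01-10T09:20:00 10.0.0.2 promo_clip flash 120
2008-01-10T10:05:00 10.0.0.1 episode_1 wmv 1300
2008-01-11T11:00:00 10.0.0.3 promo_clip flash 40
"""

FIELDS = ("timestamp", "client_ip", "title", "media_type", "seconds_viewed")
DIMENSIONS = ("date", "media_type", "title")


def parse_line(line):
    """Step 2: turn one raw log line into an XML element."""
    record = ET.Element("record")
    for name, value in zip(FIELDS, line.split()):
        ET.SubElement(record, name).text = value
    return record


def update_cube(cube, record):
    """Steps 3 and 7: fold one parsed record into the cube incrementally."""
    values = {child.tag: child.text for child in record}
    values["date"] = values["timestamp"][:10]  # derive the date dimension
    cell = cube[tuple(values[d] for d in DIMENSIONS)]
    cell["views"] += 1
    cell["seconds"] += int(values["seconds_viewed"])


def slice_cube(cube, **fixed):
    """Step 8: ad hoc slice that sums every cell matching the fixed dimensions."""
    total = {"views": 0, "seconds": 0}
    for key, cell in cube.items():
        coords = dict(zip(DIMENSIONS, key))
        if all(coords[d] == v for d, v in fixed.items()):
            total["views"] += cell["views"]
            total["seconds"] += cell["seconds"]
    return total


cube = defaultdict(lambda: {"views": 0, "seconds": 0})
for line in SAMPLE_LOG.splitlines():
    update_cube(cube, parse_line(line))

promo = slice_cube(cube, title="promo_clip")
print(len(cube), "cells;", promo)
```

Four input records collapse into three cells here, which hints at how aggregation reduces volume. Unlike the platform described above, this toy keeps only counts and sums, so record-level detail is not preserved.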

The Old Way vs. The Skytide Way: A User Scenario

While every industry and company has unique data and information requirements, the following scenario demonstrates the value of multiple data sources in better informing business decisions and actions for any enterprise or service provider generating online content.

The Business Model: Online Content Delivery & Sales

This sample media company generates revenue by selling online content to end users, which could include music, video downloads and streams, and live Flash presentations. Customers access content through a password-protected user account, with automated billing per content item viewed. In addition, the company delivers a range of online ad properties for its partners, ranging from graphic banners to Flash videos. The data landscape is shown in Table 2 below. Neither traditional BI nor web analytics tools are capable of transforming this type of online content traffic data into meaningful information. Here’s why:

• BI Tools: All of the data from each continuous stream would need to be transformed and placed in a structured relational database/data warehouse. The volumes of the network data alone make this impossible to achieve if timely results are expected. Using traditional BI tools for this analysis process would also require extensive storage, licensing costs, and time. This method is also inherently inflexible when dealing with ad hoc queries or changing data sources. BI in this new data environment isn’t scalable, flexible, timely, or affordable.
• Web Analytics: Web analytics tools are limited to analysis of HTTP files and cannot deal with the complexity of multiple large-volume data streams. They provide insight into online interactions, but the view remains single-dimensional. And because they also require changes to the actual HTML web pages, they require additional labor. The approach remains inflexible and only marginally scalable.

Table 2

Data Source: Network/Server Data
Type of Data: Content logs, CDN traffic logs, click-stream, IP, HTTP, syslogs, VM, Flash, Real, QuickTime
Data Challenges/Characteristics:
• Extremely high volumes
• Very diverse, “weird” formats
• Semi-structured

Data Source: Customer Data
Type of Data: CRM, Finance & Billing Records
Data Challenges/Characteristics:
• Highly structured
• Security an issue

Data Source: Advertising Content
Type of Data: HTML, Flash, JPG/GIF, Streaming Video
Data Challenges/Characteristics:
• Diverse data formats
• Placement issues relate to content & user profiles
• Billing issues are critical

Data Source: Ad Traffic Metrics
Type of Data: HTTP, Click-Stream
Data Challenges/Characteristics:
• Extremely high volumes
• Very diverse, “weird” formats
• Semi-structured



With Skytide, the organization can maintain a constant tap on all data streams, at the highest dimensional view. Here’s how:

Skytide Results: Skytide is deployed as described above, allowing seamless merging of the following data sources: Windows Media Server log files, customer information records in the CRM and finance systems, the ad traffic server, and HTTP web analytics files. This results in the automated generation of reports that answer key business and IT-related questions, including:

• Identify the top 10 viewed content titles and media types
• Calculate average view time across different media types and content titles
• Show sales and billing revenue across customers and aggregated by region
• Identify top 10 error messages related to media type and player
• Demonstrate ad traffic views segmented by user demographics linked to associated content
• Calculate and trend ad revenue by publisher over time, segmented by top producers
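Two of the report questions above, the top viewed titles and the average view time per media type, reduce to simple aggregations once media-server records are available. The sketch below is illustrative only: the `views` rows are invented, and `top_titles` and `average_view_time` are hypothetical helper names, not part of any Skytide interface.

```python
from collections import defaultdict

# Invented sample rows: (title, media_type, seconds_viewed)
views = [
    ("promo_clip", "flash", 95),
    ("promo_clip", "flash", 120),
    ("episode_1", "wmv", 1300),
    ("episode_2", "wmv", 1100),
    ("promo_clip", "flash", 40),
]


def top_titles(rows, n=10):
    """Return the n most viewed titles as (title, view_count) pairs."""
    counts = defaultdict(int)
    for title, _media_type, _seconds in rows:
        counts[title] += 1
    # Sort by count descending, then by title so that ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def average_view_time(rows):
    """Return the mean seconds viewed for each media type."""
    totals = defaultdict(lambda: [0, 0])  # media_type -> [seconds, views]
    for _title, media_type, seconds in rows:
        totals[media_type][0] += seconds
        totals[media_type][1] += 1
    return {media: secs / count for media, (secs, count) in totals.items()}


print(top_titles(views, n=2))
print(average_view_time(views))
```

Note that the view-time figures exist only because the rows come from a media-server log; a click-stream record of the same visits would support `top_titles` but not `average_view_time`.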

A Key Question Revisited: A user goes to your website and clicks on a special promotional 2-minute video clip. With Skytide, can you track viewers of your promotional video even if this video represents the “long tail” (a video with very few users watching it)?

Answer: Yes. Skytide can provide viewing statistics on any content and any URL, not just the most popular ones. With Skytide you no longer need to worry about the millions of individual users that you might want to track. Skytide can not only provide all of the details of the interaction times, it can also show you the top geographic regions where the clip was viewed, specific times of the day, what players were used to view the content, and the top errors that may have occurred.

Sample Reports

Report 1: Traffic KPI Summary
Overview report reveals traffic patterns by content type, time intervals, unique users, and number of requests. Clickthrough drill-down shows exact traffic details.

Traffic for this time period shows a dramatic spike on Thursday, which is made up mainly of Windows Live Media access.



Report 2: Geographic Segmentation Summary
Overview report identifies user requests across geographies, with further segmentation by demographics available in clickthrough drill-down reports.

The report easily identifies California as the most active state in viewing content. Further details pinpoint the San Francisco zip code 94120 as generating the most traffic.

Report 3: Basket Analysis Summary Report provides comprehensive overview of most popular pages or groups of content viewed.

This report quantifies traffic across “buckets” or groups of content pages on an iTunes site. For this time period, the Spanish language group of content was the most popular.



Report 4: Content-Uptake Drill Down (Video Traffic) Drill-down report combines data from CRM records and video log files to show content usage by type, gender, age, and geography.

Action & Adventure, Comedy and Sci-Fi & Fantasy were the top three types of video viewed.

Report 5: Demographic Segmentation Summary Traffic across content types is segmented by gender and age, revealing content usage patterns.

Thursday traffic is predominantly from viewers who are over 31 years old, and the majority are male.



Skytide Applications for Online Content Delivery The Skytide Analytical Platform includes application sets designed to help organizations delivering content over the Internet to better utilize data to uncover untapped revenue opportunities.

These applications include internal and external facing analytical applications that process extremely high volumes of diverse data sets, with a specific focus on data generated in the transfer of content over the Internet. These applications include:

• Traffic Segmentation Analysis
  o Sample Reports: Network Provisioning, Network Troubleshooting, Network Usage and Billing
  o Application Areas: IT, Finance
• Customer Segmentation Analysis
  o Sample Reports: Customer Demographic Content Uptake, Customer Network Usage Statistics, Customer Behavior Trend Analysis
  o Application Areas: Sales, Marketing, Executive Office
• Content Uptake Analysis
  o Sample Reports: Content Utilization Statistics Analysis, Content Basket Analysis, Customer Content Behavior
  o Application Areas: Content Owners, Marketing, IT
• Content Segmentation Analysis
  o Sample Reports: Content Provisioning, Content Troubleshooting, Customer Usage & Billing
  o Application Areas: Content Owners, Marketing, Finance, IT
• Application Segmentation Analysis
  o Sample Reports: Application Provisioning, Application Troubleshooting, Application Usage & Billing
  o Application Areas: Business Units, IT, Finance
• Storage Segmentation Analysis
  o Sample Reports: Storage Provisioning, Storage Troubleshooting, Storage Usage & Billing
  o Application Areas: IT, Finance

Each Skytide application set is configured for streamlined deployment that delivers immediate benefits in days, not months or years. Included is a series of pre-configured standard analytical models (cubes) that provide automated connections to each required data source. Connections are made via standard parsers, which are easily configured to each unique deployment. A variety of standard reports are then available to users, from scheduled email PDF formats to interactive web portal views.

Additionally, each standard analytical model can be accessed for fingertip ad hoc queries and reports. This makes even “out-of-the-box” deployments highly flexible in meeting changing business needs without re-tooling or added professional services. Custom models and reports can also be incorporated before, during or after deployment.

Figure 3: Skytide is a next generation analytical platform that performs analysis directly across multiple data types without requiring a relational database or warehouse. [Diagram labels: Analytical Apps; Reporting Tools; Tabular & Pivot Views; Portals; Excel; Skytide Designer; Multidimensional Navigation; Presentation Formats; Skytide SDK; Virtual Documents; Data Connectors; Queries; Cubes; XML Modeling Engine; XML Rendering; Skytide Server; data sources: RDBMS, HTML, XML, IM/Chat, Email, Log Files]



Excerpt From 451 Group Impact Report

The way in which Skytide deals with data makes it well suited to handling extremely large volumes of diverse data formats. By linking directly to the data sources, Skytide eliminates the need for a data warehouse. Data is then automatically aggregated, summarized and correlated across all data sources, making possible multidimensional, historical views of the data. In-memory processing speeds queries based on even the largest data sets. These functions allow Skytide customers to build highly segmented trend analysis across data sets that provide deep insight into customer behavior, quality of service, network performance, online content access, etc.

In this manner, a content-delivery network (CDN) can demonstrate for its customers the actual performance details of all content objects transmitted by customer or segment; CDN customers can track performance across multiple CDN providers; and individual organizations delivering content across their own networks could segment content delivery by region, content type or even individual user.
- Krishna Roy, Analyst, 451 Group

Summary

Companies across every industry are struggling to deliver and justify the investments they have made in online content delivery. Customers, end-users, and prospects expect fingertip access to content via the Internet for entertainment, to inform purchase decisions, for educational value, and to speed service & purchase requests. Up to this point, the focus of Web 2.0 applications has been on delivering the actual content or enabling the application. Now, organizations are focused on how to monetize this investment, a process that demands a detailed understanding of how and why content is actually being used. Skytide delivers a way for organizations to maximize the value of the highly diverse volumes of data generated as a result of online content usage. For enterprises delivering their own content, or Content Delivery Network providers, Skytide offers a seamless way to transform these high-volume streams of traffic control data into valuable information that helps monetize online content and drive revenue.

Put Skytide to Work for You

Discover how the Skytide Analytical Platform can provide valuable insights about your online content usage. Contact us today at info@skytide.com or 650.292.1900.

About Skytide

Skytide delivers business analytical solutions that provide timely and unprecedented insight into the constantly changing environment in which today’s businesses operate. The XML-based Skytide Analytical Platform is the first and only solution available today that can understand complex data from virtually any source, including semi-structured data such as network traffic data, content server logs, and application transactions, and unstructured data such as emails and text, delivering the visibility necessary to make critical business decisions. Skytide customers include Fortune 1000 companies across a wide range of market segments, including networking, financial services, healthcare, utilities, manufacturing, and retail.

Skytide, Inc.
1 Waters Park Drive, Suite 160
San Mateo, CA 94403
Phone: 1.650.292.1900
Fax: 1.650.312.1400
info@skytide.com
www.skytide.com

© 2007 Skytide, Inc. All rights reserved. Skytide and the Skytide logo are registered trademarks of Skytide, Inc. All other trademarks are the property of their respective owners.

[1] 2007 Morgan Stanley, Content Delivery Market Report


Table 1

1-D View
Key Question Answered: How is my network performing?
Data Types: Netflow Data (network traffic logs)
Information/Reports Delivered:
• Reports that map IP to URL for traffic associated with websites & networks
• Network Performance: focuses on provisioning & resource allocation, segmented by network device
• Error Reports: most common error reports speed troubleshooting and improve service delivery
Primary Users:
• IT: Network Operations

2-D View
Key Question Answered: What online content and/or applications are being accessed, and when?
Data Types: Netflow Data + Content Logs (content server logs, playlists, classification logs, etc.)
Information/Reports Delivered:
• Segmented Content Stats Reports: show details across content type accessed, most often viewed, time viewed
• Content Utilization Reports: allow for better content provisioning
• Trend & Point-in-time Reports across all sectors
Primary Users:
• Content Managers: pertinent to various business units including sales, marketing, eCommerce

3-D View
Key Question Answered: Where is my content/application being used?
Data Types: Netflow Data + Content Logs + IP Geo Data (IP/GEO mapping data)
Information/Reports Delivered:
• Content Uptake/Traffic Segmented by Geography: shows distribution of content usage across geographic regions by all factors included in content stat reports, including time viewed, most often accessed, etc.
• Trend & Point-in-time Reports across all sectors
Primary Users:
• Content Managers: business units including sales, marketing, eCommerce to better enable regional targeting of advertising & content

4-D View
Key Question Answered: Who and/or what content drives the most profitable online transactions?
Data Types: Netflow Data + Content Logs + IP Geo Data + User Data (CRM records, Finance & Accounting files, user demographics)
Information/Reports Delivered:
• User Behavior Segmentation Reports: track content usage by individual customer, segmented across demographics (age, gender, income, etc.)
• Billing Reports: granular billing details used to generate more accurate customer billing for online content and advertising revenue streams
• Sales Reports: show revenue generated by key drivers, such as sales regions, reps, content types, advertisers, top customers, etc.
• Trend & Point-in-time Reports across all sectors
Primary Users:
• Content Owners: ad revenue owners, marketing & sales executives to drive improved ad insertion revenue, improve the recommendation engine, drive promotions and improve search capabilities
