
Overcoming the Challenges of XML Data Analysis
XML in the Enterprise: Benefits and Challenges

By Chris Davis VP Business Development Skytide, Inc.

June, 2007

Skytide 1820 Gateway Drive San Mateo, CA 94404 650.292.1900

Table of Contents

Introduction


XML Benefits


XML Challenges


Lost Opportunities


Roadblocks to XML Data Analysis Using Relational-Based Analytical Engines


The Capabilities and Limitations of XML Query Languages for Analytics


New Approaches to XML Analytics


Business Benefits of XML Analytics


Case Study: Global Energy Producer




About the Author

Chris Davis is currently Vice President of Business Development at Skytide, Inc., responsible for developing relationships with customers, prospects, and strategic partners. He has designed and implemented enterprise software solutions for over 20 years, and has served as interim CIO for multiple companies. Prior to Skytide, Chris was Vice President of Strategy and Solutions at Nextance, a leading provider of Enterprise Contract Management (ECM) software, where he guided Fortune 100 companies in best practice solutions for ECM. In 1998, Chris co-founded Corio, where he established consulting practices for PeopleSoft, Siebel, and Epiphany and reduced the industry standard for implementation cycle times of Tier 1 enterprise applications from 12-18 months to 12-16 weeks.

Introduction

Since its inception in 1996, XML has become a favorite of developers as a flexible and adaptable means to identify information. It is now a widely adopted standard for representing business information in a format that allows content to be processed with relatively little human intervention and exchanged across diverse hardware, operating systems, and applications, as well as used with a wide range of development tools and utilities. At its most basic, XML is a standardized “meta-format” that can represent any kind of data, and for which pre-defined schema definitions are optional.

XML Benefits

Companies adopt XML to take advantage of its extensible and structured nature. XML makes communication between different computer systems, both inside and outside of the company, a seamless, easy process, and it can be used to integrate enterprise, supply chain, and web-based applications. The benefits are many: optimizing business processes by streamlining the sharing of data, improving response time to changing processes, and improving access to information. XML becomes even more valuable when enterprises use Web 2.0 based applications, have internet-enabled sales/distribution channels, and/or receive data through email, wireless devices, and even print. XML makes communication between these diverse systems possible. These benefits have driven industries such as insurance, finance, healthcare and pharmaceuticals to standardize on XML.1

XML Challenges

While the benefits are many, there is a price to pay when it comes to analyzing the data. Businesses are eager to extract useful information from their XML data, but they have found that traditional business intelligence (BI) technologies cannot deliver fast or thorough analysis of this data. The primary issue is that legacy BI technology was architected for, and has evolved within, a relational data environment. XML data, however, is structured in a hierarchical format, not a relational one. For XML data to be accessible to a traditional BI platform, this hierarchical structure must be transformed into a tabular/relational format. This is a time-consuming process that involves “shredding” the XML data, and it also requires building custom data models and loading the data into an additional dedicated repository, either a datamart or a data warehouse.

Any time the structure of the XML data changes based on new processes, different data collected, changes made by a business partner, or perhaps an entirely new data source required by the business user, the company must again engage in the time-consuming manual process of working with IT to incorporate these changes into the data repository.

Given today’s dynamic business environment where change is a constant, this would be an endless process, absorbing valuable human and financial resources if data analysis is to be kept current. Due to the heavy cost to manipulate the data warehouse into accepting this new and changing data, users often settle for analyzing a small subset of all the available data. In many cases, even this limited transformation isn’t a feasible option. When analysis is needed “on the fly,” introducing a relational data store into the XML analysis process is simply too cumbersome, slow and expensive to be practical.

Lost Opportunities

Companies using XML are ignoring a potentially valuable resource in untapped data by continuing to rely on outdated BI technologies. Unlocking this resource to harvest the value hidden in this complex data requires a new paradigm in which the analytical engine itself is based on XML. This would deliver a common layer that dramatically reduces system complexity, while offering functionality that cannot be achieved by traditional BI technology. These open architectures can combine data from non-relational sources with traditional transactional systems or data warehouses to provide an unprecedented view into what is driving business performance.

This white paper looks first at the current relational-based methods for analyzing XML data and the pitfalls of this model. It then introduces the new XML-based analytical platforms that allow companies to mine this rich data repository. Finally, a case study discusses how a global energy enterprise migrated from a relational database structure to an XML-based structure, and is using an advanced XML-based analytical platform to gain unprecedented insight into its business processes.

Roadblocks to XML Data Analysis Using Relational-Based Analytical Engines

Most companies that capture and store information in XML also use a relational database system for business intelligence. As discussed above, transforming the XML data into the relational system is such a resource- and time-intensive process that only a small portion of the data moves into the relational-based system. The complexities of the processes and the data volumes are just too great to make the process feasible from a cost or time standpoint. This sets up two related issues in analyzing the data.

• First, most firms focus on small samples of the XML data rather than the entire data set, so data that could improve business processes and better serve customers is ignored. For example: An insurance company using ACORD XML data wants to identify possible links between policies and types of customers to inform the pre-sales process, and then to understand why deals don’t close by analyzing all outbound policy quotes. However, due to the cost and time required to shred, transform, and port all of the data, it captures only data on closed business and brings into the database only the minimum set of data necessary for the policy administration application. The data available for analysis thus misses the “lost business” as well as the details of the questions and responses included in the application. The analysis uncovers just a small subset of possible links, missing completely the reasons why sales attempts were unsuccessful. This is a major missed opportunity to gain insights that would improve sales conversions and drive more effective marketing campaigns.

• Secondly, the XML data that does move into a relational system is hard to analyze using traditional BI products. XML data is by nature hierarchical and composed of “nested repeated elements,” versus the “flat” data of a relational system. Most BI solutions are designed to deal with relational data, not hierarchical data. For example: A company is using XML to track and store sales data for its web-enabled transactions with channel partners. Using XML’s hierarchical format, the company tracks sales by product line, region, sales person, shipping information and discounts applied. The company transforms and ports this XML data to a relational format to analyze it with a traditional BI solution; however, in the process of flattening out the nested data, much of the detail is lost. The result is an incomplete understanding of the sales picture. The company can identify product sales by time and sales representative, but is unable to differentiate which products were sold at a discount or to identify final shipping destinations. The single-dimensional questions can be answered; what is missing is a multi-dimensional view, which could provide the critical understanding and valuable insights.
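The loss of detail described in the second point can be illustrated with a small sketch. It uses Python's standard `xml.etree.ElementTree` module; the order document, its element and attribute names (`order`, `item`, `discount`, `shipment`), and the three-column flat row are all hypothetical, invented here for illustration rather than taken from any real schema or BI product.

```python
import xml.etree.ElementTree as ET

# Hypothetical channel-partner order; every name below is an assumption.
ORDER_XML = """
<order id="A-100" rep="Lee" region="West">
  <item product="Router" qty="2" price="500.00">
    <discount code="PARTNER10" amount="100.00"/>
    <shipment destination="Denver"/>
    <shipment destination="Austin"/>
  </item>
  <item product="Switch" qty="1" price="300.00">
    <shipment destination="Denver"/>
  </item>
</order>
"""


def flatten_order_items(order):
    """Shred an order into one fixed-width row per item: (rep, product, revenue)."""
    rows = []
    for item in order.findall("item"):
        revenue = int(item.get("qty")) * float(item.get("price"))
        rows.append((order.get("rep"), item.get("product"), revenue))
    return rows


def nested_detail(order):
    """Collect the repeated child elements that the flat rows have no column for."""
    detail = {}
    for item in order.findall("item"):
        detail[item.get("product")] = {
            "discounts": [d.get("code") for d in item.findall("discount")],
            "destinations": [s.get("destination") for s in item.findall("shipment")],
        }
    return detail


order = ET.fromstring(ORDER_XML)
flat_rows = flatten_order_items(order)
detail = nested_detail(order)
print(flat_rows)
print(detail)
```

The flat rows can answer "sales by representative and product", but the discount codes and the two shipping destinations of the first item survive only in the nested view. Keeping them in a relational model would require additional child tables and mappings, which is the modeling effort the text describes.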

Business Intelligence in Rapidly Changing Market Conditions © 2007 Skytide, Inc. All rights reserved.


The Capabilities and Limitations of XML Query Languages for Analytics

In a traditional relational database model, companies use SQL to query their databases. Until recently, XML data had to be transformed into a relational format before it could be queried with SQL. When the new XML-based query language XQuery appeared on the scene, there was hope that it would make extracting XML data easier, faster and more accurate. XQuery enables users to extract and manipulate data from XML documents or from any data source that can be viewed as XML, including relational databases.

There is still a difference, however, between searching and extracting results, which XQuery does well, and performing multi-dimensional analysis, which is necessary to provide business users with the detailed information that drives better business performance. A clear example of this difference is that the XQuery language does not yet include a “Group By” capability, as in “group by sales territory” or “group by product family” or, more likely, “group by sales territory and product family.” This means that while it is possible to create the analysis by using XQuery to extract data, transforming the results into a tabular format, and analyzing them using traditional BI tools, doing so is not economically realistic.

Note: This differentiation between query tools and multidimensional analysis also holds true in the relational world. Even though “Group By” has been a concept in SQL for over a decade, there is still a set of queries that is better addressed by an analytical engine than in the core database engine, which is why BI tools were invented and continue to be used extensively.

A New Approach to XML Analytics

Using a relational-based BI solution to do XML analysis is like trying to fit a square peg into a round hole. But there is another way. New XML analytics companies, such as Skytide, offer a new paradigm of business analytics that leverages the unique format of XML. Pure XML-based analytical solutions are designed from the ground up to concurrently analyze relational data and XML data where they reside, without having to transform, shred or port data into a data warehouse. XML-based analytical solutions can also identify and align common elements across multiple data hierarchies, extracting and analyzing complex data structures that are otherwise lost through data transformation and flattening, to provide sharper insight into the subtle relationships among data. And because these solutions completely bypass the extraction, transformation and loading steps, they easily adapt to changes in XML data schemas, enabling business users to quickly create new reports without needing deep technical skills such as designing an Entity Relationship Diagram, defining a STAR Schema or editing ETL mapping definitions. All a user has to do to analyze XML data using Skytide is point the analytical tool to the data source and click; no extracting or mapping is required.

[Figure 1 diagram labels: Analytical Apps, Reporting Tools, Tabular & Pivot Views, Multidimensional Navigation, Portals, Excel, Skytide Designer, Presentation Formats, Skytide SDK, Virtual Documents, Data Connectors, Queries, Cubes, XML Modeling Engine, XML Rendering, Skytide Server, Log Files]

Figure 1: Skytide is a next-generation XML-based analytical platform that performs analysis directly on XML data, without requiring time-consuming transformation of XML data into a data warehouse.
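To make the “group by sales territory and product family” analysis mentioned earlier more concrete, the following is a minimal sketch of that aggregation computed directly over hierarchical XML, with no intermediate relational tables. It is written in Python with the standard `xml.etree.ElementTree` module; it is not XQuery and does not represent how the Skytide engine works internally. The document layout and the names `sales`, `territory`, `family` and `sale` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sales hierarchy: territory > product family > individual sales.
SALES_XML = """
<sales>
  <territory name="West">
    <family name="Routers">
      <sale amount="1200.00"/>
      <sale amount="800.00"/>
    </family>
    <family name="Switches">
      <sale amount="300.00"/>
    </family>
  </territory>
  <territory name="East">
    <family name="Routers">
      <sale amount="500.00"/>
    </family>
  </territory>
</sales>
"""


def totals_by_territory_and_family(root):
    """Sum sale amounts for every (territory, product family) pair."""
    totals = defaultdict(float)
    for territory in root.findall("territory"):
        for family in territory.findall("family"):
            key = (territory.get("name"), family.get("name"))
            for sale in family.findall("sale"):
                totals[key] += float(sale.get("amount"))
    return dict(totals)


def roll_up(totals, position):
    """Collapse the two-dimensional totals onto one dimension (0 or 1)."""
    rolled = defaultdict(float)
    for key, amount in totals.items():
        rolled[key[position]] += amount
    return dict(rolled)


root = ET.fromstring(SALES_XML)
totals = totals_by_territory_and_family(root)
by_territory = roll_up(totals, 0)
by_family = roll_up(totals, 1)
print(totals)
print(by_territory)
print(by_family)
```

The same two-dimensional totals serve both roll-ups, which is the kind of multi-dimensional view the text contrasts with simple search and extraction. A production analytical engine would add indexing, caching and schema discovery that this sketch leaves out.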


The Business Benefits of XML-Based Analytics

XML is increasingly prevalent in a wide range of industries, and these businesses need an accurate way to harvest the information stored in their XML repositories. Imagine what business would be like if companies could analyze all the data flowing through their systems, instead of just a tiny fraction of it. By adopting XML-based analytics, the following scenarios could become reality.

• Insurance (ACORD XML): Companies could understand all of their underwriting business activities, not just those that translate into policy sales. They could thus improve responsiveness to customers and streamline their sales process. They could also gain deep understanding of the performance of new and existing products, and quickly see which products produce the most sales.

• Financial Services (FIXML and FpML): These firms spend massive amounts of money to put XML data into a relational database, a time-consuming, complex process that slows analysis to next-day availability. An XML-based analytical platform eliminates the transform-and-port process because analysis is performed directly on the source data. This can speed analysis to sub-hourly availability. It also enables analysis of all key XML data, not just a subset (most financial services firms analyze less than 40% of their XML data), providing fast insights that speed reaction to changes in the trading environment. Companies can also decrease the cycle time to deploy new derivative products as well as continually monitor the results as they are introduced.

• Pharmaceuticals (CDISC, SPL, AERS): Companies can better manage regulatory issues. CDISC is the standard for clinical trial data, an extremely complex data structure. SPL is product labeling; legal risk makes continuous monitoring of labeling information critical. AERS is a standard for reporting adverse events or reactions to medication, carrying enormous legal risk and requiring detailed management. An XML analysis solution allows companies to analyze the market risks associated with changing labeling, the demands of international labeling, and the litigation risks of adverse effects. These insights would help pharma companies to release only correctly-labeled drugs and avoid potential litigation risks.

Case Study: Global Energy Producer

Background

This global energy company, with over 2,200 intellectual properties (patents, trademarks, etc.) managed across 100 countries, wanted a more efficient and secure way to control the multiple processes associated with its strategic IP program. The program combined various types of data (budgets, invoices, payment receipts, property descriptions, etc.) over six continents, and included protecting these assets by assuring successful “on-time” filing for all properties in all associated countries. The company needed a flexible, user-friendly and scalable way to manage the entire process, providing detailed analysis, protection, and enforcement of all associated processes.

The XML & Skytide Solution

The company re-architected its IP protection program to allow for much more efficient control of IP management by migrating from a relational database system to a centralized XML-based structure that is accessed by users on six continents. The Skytide Analytical Platform was selected as the analytical engine to perform the following tasks on a continuous basis:

• Execute complex analytics on the high volumes of data associated with the IP program, in a timely manner;

• Generate mission-critical reports that were not possible with its previous relational database structure;

• Easily produce ‘ad-hoc’ reports pertinent to each country and/or IP entity;

• Perform data migration Quality Assurance by providing immediate exception analysis during data migration. For example, the source system had 38,750 persons identified, but information on 5 persons was not successfully migrated to the destination system. Normally, it would take several days and thousands of dollars to identify the “lost” records. Skytide identified the 5 persons within 15 minutes.

• Scale beyond the previous limit of approximately 110,000 information objects to allow a virtually unlimited number of entities to be added over time.

“Skytide has become an essential part of the company’s IP program, handling the complex analytics of the XML data. After reviewing numerous analytical packages, the Skytide Analytical Platform was the only one that could directly analyze the XML data and produce the complicated reports they required. Skytide is a next-generation analytical engine, which can easily handle the complex hierarchical XML data involved in the company’s rapidly-growing IP program.” Jim O’Hare, Partner, Arnav Technology Solutions, Inc., a Systems Integrator for the company.
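The migration exception analysis described in the case study can be pictured, in a much reduced form, as a comparison of record identifiers between the source and destination data. The sketch below assumes that both systems can export persons as XML with an `id` attribute. That export format, the element names and the sample identifiers are hypothetical, and the code is not a description of how Skytide performed the check.

```python
import xml.etree.ElementTree as ET

# Hypothetical exports; the real systems and their formats are not described
# in the case study.
SOURCE_XML = """<persons>
  <person id="P001"/><person id="P002"/><person id="P003"/><person id="P004"/>
</persons>"""

DESTINATION_XML = """<persons>
  <person id="P001"/><person id="P003"/>
</persons>"""


def person_ids(xml_text):
    """Return the set of person identifiers found anywhere in the document."""
    root = ET.fromstring(xml_text)
    return {person.get("id") for person in root.iter("person")}


def migration_exceptions(source_xml, destination_xml):
    """List identifiers that were lost, or that appeared unexpectedly, in migration."""
    source_ids = person_ids(source_xml)
    destination_ids = person_ids(destination_xml)
    return {
        "missing": sorted(source_ids - destination_ids),
        "unexpected": sorted(destination_ids - source_ids),
    }


report = migration_exceptions(SOURCE_XML, DESTINATION_XML)
print(report)
```

A set difference of this kind only finds records that are missing outright. Detecting records that migrated with altered or incomplete content would require comparing the nested elements as well.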


Conclusion

Companies today are struggling with an overwhelming volume of data across a host of sources. Many companies have adopted XML as a standard because of the many benefits its hierarchical structure delivers, from enhancing communication between disparate systems to integrating processes across the enterprise and streamlining the sharing of data. These improvements have come at a price, however: XML has proven difficult to analyze with traditional analytical or business intelligence tools. A new paradigm in analytics is offered by companies such as Skytide, whose solutions leverage the unique format of XML. By performing analytics directly on source XML data, these solutions provide companies with a complete, current view of business processes.

About Skytide

Skytide delivers business analytical solutions that provide timely and unprecedented insight into the constantly changing environment in which today’s businesses operate. The XML-based Skytide Analytical Platform is the first and only solution available today that can understand complex data from virtually any source, including unstructured data such as XML and HTML, delivering the visibility necessary to make critical business decisions. Skytide customers include Fortune 1000 companies across a wide range of market segments, including networking, financial services, healthcare, utilities, manufacturing, and retail.

Find out More about Skytide

Discover how the Skytide Analytical Platform™ can uncover the power of your data. Contact us today at 650.292.1900.

Skytide, Inc.
1820 Gateway Drive, Suite 300, San Mateo, CA 94404
Phone: 1.650.292.1900 Fax: 1.650.312.1400

© 2007 Skytide, Inc. All rights reserved. Skytide and the Skytide logo are registered trademarks of Skytide, Inc. All other trademarks are the property of their respective owners.

1 The XML Shockwave: What Every CEO Needs to Know about the Key Technology for the New Economy (