
Radiant Advisors Publication










March 2013, Issue 6


[P4] Shifts in the Architectural Landscape Different technology vendors offer different perspectives on what big data means, but all of them tip their hat to the fact that the volumes of data that companies are gathering and analyzing can be big. [By Dr. Robin Bloor]

FEATURES

Time for an Architectural Reckoning Today's BI and DW platforms are highly adapted to their environments; however, they are less suited outside of these environments. [By Stephen Swoyer]

Selecting the Right BI Solution There are specific questions to be asked in order to select the right software and hardware to optimize BI use and align with broader business goals. [By Lyndsay Wise]

Why Data Models Are Dead The real debate should be about how semantics should be analyzed or discovered and where that definition should be maintained for data going forward. [By John O'Brien]

The Signal and The Noise The Signal and the Noise is as interested in the predictive power of statistics as it is in the human ability to comprehend probabilistically. [By Lindy Ryan]

Vendor Free for All? (SIDEBAR) Vendors seem to be betting on fit-for-purposive platforms as the best response to selection pressure -- not a single platform, but the right platform for the right purpose. [By Stephen Swoyer]

2 • rediscoveringBI Magazine • #rediscoveringBI

FROM THE EDITOR Welcome to Rediscovering BI, Radiant Advisors' monthly eMagazine featuring articles from leading names in today's BI industry and other new voices we've discovered innovating in the BI community. Today's BI environment is all about rethinking how we do BI and imagining new, innovative ways to approach BI at large. The goal of Rediscovering BI is to continue growing as a leading industry publication that challenges readers to rethink, reexamine, and rediscover the way they approach business intelligence. We publish pieces that provide thought-leadership, foster innovation, challenge the status quo, and inspire you to rediscover BI. This month we are excited to "shift gears" and debut our new PDF edition of the eMagazine. In fact, "shifting gears" is the focus of this month's issue of Rediscovering BI, and with contributions from Dr. Robin Bloor, Stephen Swoyer, Lyndsay Wise, and John O'Brien, this month's issue explores the shift in today's architectural landscape and what that means for data models and the unfolding architectural reckoning.

Editor In Chief Lindy Ryan

Art Director Brendan Ferguson


Lindy Ryan

Distinguished Writer Stephen Swoyer

Contributor Dr. Robin Bloor

Contributor Lyndsay Wise

Contributor John O’Brien

Lindy Ryan Editor in Chief Radiant Advisors





EVERYONE KNOWS THAT BIG DATA is a real and growing phenomenon. Different technology vendors offer different perspectives on what big data means, but all of them tip their hat to the fact that the volumes of data that many companies are gathering and analyzing can be big.

When we talk of big data, we also often hear of Hadoop being associated with it in some way. It’s not that everyone is using Hadoop in earnest yet. Most companies I’ve talked to are experimenting (or doing something fairly limited) at the moment, but nearly everyone sees Hadoop as a component of their future software stack – they are just not entirely sure of the role it will play. But whatever evolves, the game is up for the previously dominant business intelligence (BI) architecture, which can be summarized as: operational systems -> data ingest -> data warehouse -> data marts -> desktop tools and databases. In my opinion, the big data trend represents the early stages of the emergence of Event-Driven Architecture.
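The pipeline Bloor summarizes (operational systems -> data ingest -> data warehouse -> data marts -> desktop tools) can be sketched in miniature. This is an illustrative toy only; all names and records below are invented for the example, not drawn from any real system.

```python
# Toy sketch of the classic DW-driven BI pipeline:
# operational systems -> data ingest -> data warehouse -> data marts -> desktop tools.

# Operational system: transactional rows produced by day-to-day business events.
operational_rows = [
    {"order_id": 1, "region": "west", "amount": 120.0},
    {"order_id": 2, "region": "east", "amount": 80.0},
    {"order_id": 3, "region": "west", "amount": 50.0},
]

# Data ingest: copy transactional rows into the warehouse (in practice, batch ETL).
warehouse = list(operational_rows)

# Data mart: a subject-oriented aggregate carved out of the warehouse.
sales_mart = {}
for row in warehouse:
    sales_mart[row["region"]] = sales_mart.get(row["region"], 0.0) + row["amount"]

# Desktop tool: a report built from the mart.
print(sales_mart)   # {'west': 170.0, 'east': 80.0}
```

The point of the sketch is the one Bloor makes: every stage downstream of ingest assumes the shape of the data is known in advance, which is exactly the assumption event data strains.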

Event-Driven Systems For decades, we built operational systems that were characterized by the idea of transactions. Transactions corresponded to the events that changed the business: receiving deliveries, paying invoices, placing orders, and so on. We built most of the systems that do such things a long time ago. Since that time, we have expanded BI software from its early days as a general reporting capability to conduct trend analysis, create dashboards, and monitor key performance capabilities of the business. Nowadays, monitoring the business usually involves gathering both transactional data and event data that is not transactional. With the advent of big data, we have seen the expansion of data used by businesses to include machine-generated and log data, social media data, web-based data, mobile data, and external data streams. If we examine this data, we discover that very little of it is, in fact, transactional – almost all of it is simply event data. We are gradually moving toward viewing events as the fundamental atoms of business activity. Of course, event-based systems are not entirely new. The High Frequency Trading (HFT) systems used by investment banks are fundamentally event-based. Internet companies provide many examples of event processing, and they have led this trend. Web retail sites interact with the customer entirely through the web browser,


"The game is up for the previously dominant business intelligence (BI) architecture"

whether the customer is merely viewing products or actually buying. This began with web sites tracking the behavior of users from web logs, but it has gradually evolved into capturing and analyzing everything any customer does, or did: how they arrived on the site, what links they clicked on, how long they stayed on any given page, what they searched for, what advertisements were presented to them, and so on.

Internet businesses were the first organizations that could, with little effort, capture and analyze most of the event data of a customer interaction. They may have had to invest in appropriate computer technology to do the specific analytics, but they could capture the data they wanted to examine as a natural part of their business process. Other businesses (transportation, brick and mortar retail, and health care) may have to deploy embedded chips to capture some of the event data that interests them. And, in time, they will – that is one of the next steps toward the emergence of event-driven businesses.

Event-Driven Architecture While we do not intend to define what an Event-Driven Architecture might turn out to be, we can discuss some of the features that it will inevitably involve. First of all, a fully event-driven environment will involve event flows that are captured from various sources and analyzed in flight in order to respond as swiftly as possible to whatever the data reveals. Consider this as a new layer of BI software, built to either inform people (or programs) of trends it sees or triggers that demand action. When the latency between receiving data and taking action needs to be very low, think of this as real-time operational intelligence (OI). Where it does not require prompt action, think of event-driven architecture as closer to – but not necessarily the same as – traditional BI.

Some of this event data might come from within the business: RFID data, for example, that is monitoring the movement of goods between warehouses. Some of the event data might come from suppliers, customers, or potential customers. Some data might be marketing statistics relating to advertising campaigns, or it might be social media data. Nowadays, "event data" is any data that might affect the business, including meteorological data, transport information, stock market and commodity price data, and so on. The point is that we do not necessarily know what data the business may suddenly become interested (and maintain interest) in.

As such, we cannot define database structures to accommodate it ahead of time. This is one of the reasons that the loose structure of Hadoop is quite attractive: it is a data reservoir that doesn't require you to define the metadata until you want to use it. Naturally, data analysis will be carried out on such captured data; otherwise there would be no point in capturing it and storing it. If we imagine that a business has built all the transactional applications it needs to build, then none of the events it captures give rise to transactions. The only thing to do with such data is either report it – via a dashboard, perhaps – or analyze it. You may require a fast-performing scale-out database to help with the data analysis, but, if you do, it becomes a data analysis mart.

"For all intents and purposes, the data reservoir is the data warehouse for event data"

For all intents and purposes, the data reservoir is the data warehouse for event data. It doesn't have to be Hadoop, of course. It could be one of the new scale-out NoSQL databases, many of which don't impose structure on the data in the way that traditional databases and data warehouses do. Does this mean that the old data warehouse can be retired? In theory, yes, but in practice it is unlikely. There will be many legacy applications that depend on it, and it may not be worth replacing them.

In Summary I've just brushed the surface of this topic in this article. Nevertheless, I hope you can see the way that the IT industry is drifting. The issue with big data is less likely to be its volume than the simple fact that it is new data, possibly inconveniently structured data, and, most importantly of all, it is event data. Share your comments >

Dr. Robin Bloor is co-founder and principal analyst with The Bloor Group. He has more than 25 years of experience in the world of data and information management.
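The "data reservoir" idea – that metadata need not be defined until the data is actually used – is what is usually called schema-on-read. A minimal sketch, with entirely invented event records and field names standing in for raw files in a reservoir:

```python
import io
import json

# A toy "data reservoir": raw event records are appended as-is, with no
# schema declared up front (schema-on-read). StringIO stands in for
# files in HDFS or similar storage; the records are illustrative only.
reservoir = io.StringIO()
for raw in (
    '{"type": "page_view", "user": "u1", "page": "/pricing"}',
    '{"type": "click", "user": "u1", "link": "signup"}',
    '{"type": "rfid_scan", "pallet": "P-42", "warehouse": "east"}',
):
    reservoir.write(raw + "\n")

# The "schema" is imposed only at analysis time, per question asked:
reservoir.seek(0)
events = [json.loads(line) for line in reservoir]

clicks_by_user = {}
for e in events:
    if e["type"] == "click":  # project out only the fields this question needs
        clicks_by_user[e["user"]] = clicks_by_user.get(e["user"], 0) + 1

print(clicks_by_user)   # {'u1': 1}
```

Note that the RFID record coexists with the web events even though it shares no fields with them – which is precisely what a schema-on-write warehouse table would not allow without up-front modeling.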


Check out our eBookshelf




"The Signal and the Noise is as interested in the predictive or disclosive power of statistics as it is in the human ability to comprehend – to think – probabilistically."

EVEN IF YOU HAVEN'T HEARD of Nate Silver's The Signal and the Noise, you've probably heard of Nate Silver. His hybrid predictive model didn't exactly predict the outcome of the 2012 Presidential race, but it was shockingly close; it was much closer, in fact, than the predictions of many television- and print-media pundits. By Election Night, the divergence of Silver's predictive model – along with those of Sam Wang at the Princeton Election Consortium and Drew Linzer of Emory University – had set up a kind of showdown between new-fangled statistics- and model-driven predictive methods, on the one hand, and old-school, horserace-style prognosticating, on the other.

Silver's book is in a sense a meditation on the promise of statistics and the limits of human understanding. The two aren't necessarily a neat fit. The Signal and the Noise is as interested in the predictive or disclosive power of statistics as it is in the human ability to comprehend – to think – probabilistically. Silver doesn't set up an infallible statistical strawman, either; he's as alert to the misuse of statistical or predictive models; to flawed assumptions; to insufficient data (or to the impossibility of ever being able to have "sufficient" data) as he is to what might be called the human tone-deafness to probabilism. Indeed, many of the "problems" Silver describes are actually products of the human misapplication or misunderstanding of statistical concepts and methods. He discusses this idea in the construct of the "prediction paradox," in which he says the more humility we have about our ability to make predictions – as well as learn from our mistakes – the more we can turn information into knowledge and data into foresight.

To be succinct, as Silver says in his "Introduction" to The Signal and the Noise: "We face danger whenever information growth outpaces our understanding of how to process it."

Much of what is happening in information management right now focuses on making vast (and growing) volumes of information intelligible or comprehensible. But the human brain is still the primary site of analysis and synthesis. As the Election of 2012 demonstrated, our brains are lagging behind the statistical concepts and methods that might help us achieve greater clarity and insight – better understanding – of our world. Silver's book is a wake-up call for us to get cracking. Share your comments >

Visit the Radiant Advisors eBookshelf at to see all Editor's Pick titles

Lindy Ryan is Editor in Chief of Radiant Advisors.

TIME FOR AN ARCHITECTURAL RECKONING STEPHEN SWOYER The DW and big data: Two platforms, two very different purposes…seemingly on a collision course.


TODAY'S BUSINESS INTELLIGENCE (BI) and data warehouse (DW) platforms are highly adapted to their environments; however, they are less suited to use outside of these environments. The same might be said for big data platforms, too.

The DW and big data: Two platforms, two very different purposes, two agendas – seemingly on a collision course. It isn’t so much a question of which platform vision will triumph, but of how two such different visions can be reconciled.



"...the DW-driven status quo is not sufficient to address the reporting and analytic needs of the enterprise."

New Synthesis In 1975, entomologist E.O. Wilson published his landmark Sociobiology, subtitling it: The New Synthesis. Wilson's book argued for a multidisciplinary reconceptualization of behavior and social interaction. His claim — provocative, controversial, and to this day tendentious — was that human and animal behaviors must be understood as products of natural selection; that ethology, sociology, and psychology, considered by themselves, are not sufficient to account for the diversity of behaviors and social adaptations among both humans and animals. What was needed, Wilson argued, was a kind of reconciliation or synthesis; sociobiology, on Wilson's terms, is this reconciliation: it describes a synthetic approach to the understanding of behavior – one that's informed by evolutionary theory, population ecology, and other disciplines. Experts who work with BI platforms say we've arrived at a similar moment, at least with respect to the architectural hub of BI and decision support: the data warehouse. By itself, the DW-driven status quo is not sufficient to address the reporting and analytic needs of the enterprise. Nor, for that matter, is its big data counterpart, which has emerged as a platform for highly scalable real-time processing.

The data warehouse is a query platform par excellence; it excels at aggregating business "facts" out of queries. Big data platforms – the most heavily hyped of which is Hadoop – excel at data processing. Big data systems can process data of all kinds; they are schema-optional platforms. They likewise have the ability to perform complex analytics at close to real-time intervals. They can scale with staggering ease. In fact, as one prominent BI industry technologist argues, scaling out on Hadoop is – for all intents and purposes – "free." The DW-driven BI and the emerging big data paradigms aren't the only platforms that need to be reconciled. There's also the OLTP database and application stack, which is both conceptually and operationally distinct from the DW and BI. Nor is that all: over the years, BI has gradually usurped analytic functionality unto itself; it's undeniable, however, that most enterprises also play host to dedicated high-end analytics platforms – e.g., entrenched SAS, SPSS, or (more recently) R statistical and data mining practices. What's more, several other nominally discrete platforms – for example, enterprise data archiving and information lifecycle management (ILM) – must likewise be reconciled or synthesized in any Architectural Reckoning.

So, too, must specialty systems, including dedicated graphing databases and legacy, non-relational data sources – such as the vast volumes of information stored in flat file, VSAM, IMS, Adabas, and other mainframe-bound systems. The problem isn't that we lack for potential solutions; some technological futurists would have us throw everything into Hadoop, after all. It's rather that we don't yet have a Sociobiology-like vision of what a post-DW-driven architecture might look like. There's no single, synthetic, vendor-neutral vision that reconciles the still-viable DW-driven BI platform with the emerging big data platform with the dedicated statistics and data mining platform with the data archiving and ILM platforms with the specialty niche platforms. Thanks to Wilson's project, however, we have a criterion for undertaking such an Architectural Reckoning: viz., selection pressure, the engine of natural selection.

Call it fit-for-purpose; call it using the best tool for the job; call it supporting and maintaining each platform in the "habitat" for which it's best suited and adapted. The criterion for Architectural Reckoning is selection pressure – i.e., what works best and why. The product of Architectural Reckoning will be the New Synthesis.

The (Un)Making of the Platform Status Quo A quarter of a century ago, Barry Devlin and Paul Murphy published "An architecture for a business and information system," the seminal paper in which they outlined the conceptual underpinnings of the classic DW-driven BI architecture. Within a couple of years, Ralph Kimball and Bill Inmon had begun implementing physical systems based on the Devlin/Murphy [model]. Since then, BI has evolved as a more or less straightforward expression of Devlin/Murphy's foundational DW-driven architecture. One upshot of this is that today's BI and DW platforms are highly adapted to their environments. They make sense — and they deliver significant value — in the context of these environments.

But an emerging consensus says that the traditional DW-driven BI architecture simply cannot be all things to all information consumers; that DW-driven BI cannot adapt (or cannot be adapted) to the selection pressures of the information enterprise. For two decades, the data warehouse and its enabling ecosystem of BI tools functioned as the organizing focus of information management and decision-support in the enterprise. In other words, the DW was able to effectively dictate the conditions of its own environment. It was likewise able to adapt on its own terms. That is no longer the case – and it arguably hasn't been for the last half-decade.

With the emergence of BI discovery, big data, and the real-time, mega-scale data processing capacity of Hadoop, the data warehouse finds itself inhabiting a kind of micro-climate: a habitat or environment in which, yes, it still delivers compelling value, but outside of which its lack of adaptability – its limitations – can no longer be ignored. Its habitat is shrinking – and there's plenty of disruption all around it.

"It took me a long time, but I like to joke that after several therapy sessions I'm now able to say that not all analytics belong inside a hub-and-spoke architecture," says industry luminary Claudia Imhoff, CEO of information management consultancy Intelligent Solutions Inc. "There are many valid and good [kinds of] analytics that belong outside that architecture."

Imhoff, of course, is almost as much a part of the history of the DW space as are seminal technologists Devlin, Murphy, Inmon, and Kimball. If she didn't conceive of or build the first DW, she got in at the ground floor. But Imhoff thinks the case for an Architectural Reckoning – for a New Synthesis of some kind – is irrefragable: "I think part of it is that we are right now in a very, very disruptive period of a lot of new technologies flooding in, absolutely flooding in, to business intelligence. What's proved to be most disruptive [to the DW status quo] is this issue of real-time analytics."

The data warehouse simply cannot "do" [real-time analytics], Imhoff concedes. At its core, its design philosophy embodies assumptions about a static world – i.e., a world in which data and requirements can be known and modeled in advance; a world in which (more or less) data do not significantly change. This assumption is at odds with – is exploded by – the vicissitudes of business and human events. "If all analytics used static data, then we could pull them in[to the DW] … and analyze the heck out of them. What's changed is that we now have the capability to analyze data on-the-fly," she says. "That's caused significant disruption. Now we have to accept that not all analytics belong inside the BI architecture — or else the BI architecture has to embrace and extend to address those analytics as well."

Not so Fast? Michael Whitehead, CEO of data warehousing software vendor WhereScape Inc., isn't quite convinced. Perhaps we do need to do some architectural tweaking, Whitehead concedes; but shouldn't we first do what we can to fix our bread-and-butter data warehouse-driven BI programs? It isn't as if most BI programs are perfect, let alone optimal; in painful point of fact, Whitehead maintains, most BI programs under-perform. In this context, Whitehead sees the issue of Architectural Reckoning as a distraction. "We haven't solved the basic problems yet. What's happening at the edge is all very exciting, but at the core it still takes far

"Architectural Reckoning is about which platform works best for which purpose"

too long to build a basic data warehouse that is changeable and just works," he says.

Of course, WhereScape is a provider of data warehouse software tools; it has an undeniable stake in the DW status quo. Whitehead concedes as much, but argues that his company's role in any New Synthesis would look a lot like it does now. "Even if you start with the big data platforms, at some point, you're going to need to bring data together in a repeatable way and [you're going to need to] do it consistently. You're going to need to be able to materialize [it] and report on it. You're going to want to persist it: it's just a natural way to answer a set of questions," he argues. "You just always end up at the same point: it's natural to have a repository of data that's materialized that you maintain for [addressing] a certain set of problems." As a case in point, consider Facebook's come-to-DW about-face. Yes, it's true: Facebook is building itself a data warehouse, chiefly because it needs repeatability and consistency.

Industry veteran Scott Davis, founder and CEO of information discovery and collaboration specialist Lyzasoft Inc., stakes out a position at the opposite end of the spectrum from Whitehead's essentialism. As Davis sees it, Hadoop is a hugely transformative technology. It solves the distributed compute problem; it solves the distributed storage problem; it solves the problem of inexpensively scaling workloads of all kinds across massive compute and storage resources. More to the point, Davis argues, Hadoop is already being used to host traditional data warehouse workloads – including data integration (DI) and ETL jobs. "I think that staging ETL and high-end analytic queries in Hadoop — I think it's just very, very difficult for any other technology to compete with Hadoop" in this respect, says Davis.

Davis comes close to suggesting that Hadoop has the potential to become an all-in-one platform for enterprise information management; he stops just short of saying it, however. "I'm not convinced that it's the best tool for querying a known set of data for a known set of reports. It isn't going to uniformly smoke the business intelligence and data warehousing world on all counts after just half a decade of people tinkering around with it. But people need to realize that even if a lot of the [Hadoop-related] BS is wrong, this is still a transformative [technology]." Nevertheless, he argues, to the degree that an Architectural Reckoning does take place, Hadoop will likely be a big beneficiary.

"The beautiful thing about Hadoop is that it wasn't built for a broad array of purposes. Once you understand that, it starts to look beautiful. If you judge it by the ability to do everything elegantly, you're going to find it wanting," he concedes.

Conclusion: Who's up for it? A vendor-neutral New Synthesis needs to be conceived, however; it's awaiting both conceptual demonstration and physical implementation. We've seen some possible candidates: Radiant Advisors, for example, has its Modern Data Platforms Framework. Gartner Inc. analyst Mark Beyer has an architectural vision of his own, as do respected BI industry thought-leaders Wayne Eckerson (a principal with BI Leader Consulting) and Shawn Rogers (a principal with Enterprise Management Associates). Most of these visions are focused by – or have for a frame of reference – the DW, either as a starting point or as a point of departure: they're inescapable products of data management-centric thinking. Architectural Reckoning is about which platform works best for which purpose. The synthetic architecture (the New Synthesis) that is its product won't be organized or managed – won't be understood – primarily on DM-like terms. Architectural Reckoning is a multi-disciplinary project, involving stakeholders from different factions across IT and the line of business. That this entire article has assessed this project from a DM-centric perspective is testament to just how hard a task it is. Who's up for it? Share your comments >

Stephen Swoyer is a technology writer with more than 15 years of experience. His writing has focused on business intelligence and data warehousing for almost a decade.





CONDUCTING SOFTWARE evaluations and aligning business goals to solution capabilities are not easy tasks. Organizations constantly struggle to identify how to optimize business intelligence (BI) investments

by expanding current use, or by looking at new solutions to address business pains. Irrespective of desired BI use and areas being looked at (such as call center management, operations, financial reporting, etc.), there are specific questions that need to be asked in order to select the right software and hardware to optimize BI use and align these choices with broader strategic business goals. These questions identify the types of solutions businesses should consider. For instance, if looking to expand a BI investment and infrastructure it might be possible to use current hardware and data warehouse expenditures as a base. Instead of adding a new server, new data sources may be added to the current environment, with considerations for new solutions being relegated to front-end dashboarding and analytics. Identifying technical considerations and infrastructure requirements are essential because some businesses require real-time access or high-powered analytics, while others only need weekly or monthly reporting and multi-dimensional analysis. Such technical- and business-oriented questions are first steps to making sure BI platform decisions support the overall goals being achieved through BI adoption or expansion.


Understanding Goals Different organizations look to business intelligence to meet a variety of business-oriented goals. Some businesses want to align technology use with overall strategy, while others are looking for greater insights into customers, or a more efficient supply chain. The end goal will affect any software selection focus. And, even if multiple goals are identified, solution capabilities — and the type of applications selected — need to properly support the goals identified as the bottom line. When identifying the BI mission, ask questions like:


What is the end goal? This might seem simple, but it requires identifying what the organization needs to achieve to make this iteration of BI a success. Is this better visibility into marketing initiatives, an increase in lead generation by 15%, better data quality management, etc.?


Which stakeholders are involved? Aside from the development of business rules and new algorithms, the people using the tools will determine the level of interactivity and self-service capabilities, and this in turn may limit the types of solutions considered.


How do these goals align with required metrics? This aspect requires moving beyond simple business rules and developing metrics that will support overall goals. Although industry metrics exist, most businesses will still need to develop customized metrics to tailor analytics to individualized needs.

What are the strategic goals the company hopes to achieve, and what business processes and information points are required to support these efforts? Developing the link between process management and required data sources helps identify how data can support business functions.

"Different organizations look to business intelligence to meet a variety of business-oriented goals"

Infrastructure Considerations Developing the right infrastructure is important when looking at long-term viability and scalability. In the past, many IT departments were in charge of designing and maintaining a BI platform on their own. Now, due to the diversity in the market and the different ways in which business users need to access and interact with their data, collaboration between business units and IT departments is essential. Latency, processing power, storage, and APIs are just some of the considerations that require an understanding of the business and what it hopes to achieve. To understand the breadth of requirements, a broader evaluation of business needs is required — including identifying how the right BI infrastructure can support BI-related goals.

Tying Goals Into Business Requirements In general, three types of information are required for business intelligence:

- BI purpose, or the overall goal alignment between business entities and BI.

- Technical infrastructure that supports the business needs.

- Business requirements that address challenges and gaps within information visibility and analytics access.

All of these areas relate to (or support) business requirements and are the basics required when evaluating BI offerings. For instance, understanding customers better and identifying patterns and opportunities can include looking at several different customer access points. Looking at account information, demographics, accounts receivables, support, etc. that may exist across multiple data sources can provide insight into the lifetime value of the customer, identify customers that are influencers and recommend multiple friends and family members to apply for services or buy products, and so on. Understanding how information connects can provide broader access points to data that may have been inaccessible or overlooked in the past.

Connecting the Dots Asking the right questions and identifying goals when evaluating software solutions are common sense; however, making sure that the right BI solution is selected is another story. Because so many solutions with similar, overlapping capabilities exist, understanding key market differentiations might not be intuitive. Consequently, companies need to make sure they identify business and technical requirements first and have a strong understanding of the link between these requirements and the underlying goals. Without this link, decision makers may select solutions based on features and not on how those capabilities support business needs. The following questions can help break down the barriers that lead to some of the confusion that exists within the marketplace: 1) What are the gaps that exist within our current BI tool? 2) What do we need to bring BI to the next level – i.e., technical, new features, new algorithms and business rules, etc.? 3) How have the essential BI requirements shifted? For instance, is the organization moving from historical trends identification toward operational intelligence, or is there a new focus on unstructured data content? 4) What is required to meet the new needs of the organization – business, technical, and cultural? 5) What is the justification for this expansion and potential evaluation of a new software offering?

These are initial questions to get BI decision makers on the right track, and that organizations can use as a starting point to justify expenditures and to support the transition from traditional BI to strategic analytics. Although company requirements will differ depending on industry and purpose, overarching requirements in relation to infrastructure and goal alignment will be similar. These can then be used to support broader software choices. Share your comments >

Lyndsay Wise is the president and founder of WiseAnalytics. She provides consulting services for SMBs and conducts research into BI mid-market needs.





Hadoop might prove to be late last month, when EMC Corp. unveiled Pivotal HD, its new proprietary distribution of Hadoop. Pivotal HD includes a technology called "Hawq," which EMC describes as an ACID-compliant MPP RDBMS running across Hadoop. Unlike Hive, Hawq isn't an overlay or translation layer: it's an MPP RDBMS running on top of Hadoop. In EMC's calculus, Hadoop by itself is a New Synthesis; in other words, EMC thinks Hadoop is flexible and adaptable enough to withstand any and all kinds of selection pressure. It's as if EMC asked itself "What works best and why?" and came up with a kind of compound answer: Hadoop – with a big boost from Hawq. Pivotal HD is arguably the most audacious such strike from the big data side of the aisle.

Across the aisle, Teradata Corp. with its Unified Data Architecture (UDA) touts a similar one-stop platform – albeit one that's based on a very different calculus: a Teradata-centric fit-for-purpose-ness. UDA comprises the traditional Teradata enterprise data warehouse – the long-standing linchpin of BI and decision support; Teradata's Aster Discovery platform, which aims to address information discovery and business analyst-y use cases; and Teradata's Connector for Hadoop, which addresses the big data use case. (Teradata has also been an early and active proponent of HCatalog, a metadata catalog for Hadoop.) UDA itself includes logical and management amenities (such as Teradata Viewpoint and Teradata Vital Infrastructure) designed to knit together its fit-for-purpose pieces into a single synthetic architecture.

Teradata isn't exactly alone. IBM Corp. markets a vast middleware portfolio of fit-for-purpose database systems; DI assets, including ETL, data quality (DQ), master data management (MDM), data replication, and data mirroring tools; several data virtualization (DV) technology offerings; and connectivity into Hadoop and big data. Big Blue hasn't unveiled as coherent an architectural vision as Teradata, but it does have all of the pieces to do so. Likewise for SAP AG, which markets a single engine for traditional DW, NoSQL, text analytic, and graph database operations (its HANA in-memory platform), along with a full DI, DQ, and DV stack. These vendors and others (e.g., Dell Inc., Hewlett-Packard Co., Oracle Corp., and Microsoft Corp.) seem to be betting on fit-for-purposive platforms as the best response to selection pressure. In other words, not a single platform to rule them all – a la EMC and Pivotal HD – but the right platform for the right purpose.

While the ambitions of DI-only players such as Informatica Corp. aren't quite as far-reaching, they're no less fit-for-purpose-y: Informatica markets a line of data archiving- and ILM-related products, and is working to improve integration and interoperability between these and its bread-and-butter DI technologies. Informatica could credibly position its DI technology as a tissuing substrate for a would-be synthetic architecture. So, too, could Composite Software Inc., the theme of whose 2012 Data Virtualization Day event in New York, NY was "The Logical Next Steps." Composite champions the equivalent of a logical data architecture: a virtual abstraction layer, enabled by its Composite Information Server DV platform. Composite competitor Denodo Technologies Inc. markets similar DV software, as does Red Hat.




THESE DAYS, the phrase "data models are dead" seems to find its way into high-debate conversations with IT application development, business intelligence (BI) teams, and data management vendors, and is brought about by the confluence of several recent major trends in IT, BI, and technology that are challenging the classic data modeling paradigm. The real debate, however, should be about how semantics should be analyzed or discovered and where that definition should be maintained for data going forward.

One major driver in this debate is the current technology adoption shift: the rise of data technologies such as NoSQL data stores – flexible, schema-less key-value stores – along with the mainstream acceptance of analytic databases that leverage similar columnar and in-memory architectural benefits for BI. These technologies allow data elements to be arranged into "tuples" (records based on a programmer's definition) outside of the physical data model, and simultaneously enable the ever-increasing drive by the business for applications to be built more quickly and more flexibly for competitive advantage (i.e., first to market and the ability to adapt faster than the competitor). There is also more acceptance – realization – of the fact that the business and analysts don't always know what is needed, but want to discover what is desired through user interactions.

Living More and More Without Data Models

From the programmer-centric view, accessing data in key-value formats matches the objects they are loading data into for application execution; yet object databases never really became mainstream as many hoped, and the "object-to-relational" layer gained traction with incumbent relational databases. Primarily, it's been flexibility, adaptability, and speed that have driven many application developers to use key-value stores: they move the semantic definition away from the rigid, structured physical data model to the application layer, where a developer can control simple changes or additional data elements in code, then simply compile and redeploy. Depending on the application at hand, many developers also are embracing document stores as their data repository.

BI developers, on the other hand, have been finding value in what key-value stores (like Hadoop) have to offer from both an information discovery and an analysis perspective. Once again, when you remove the semantic definition – or, perspective bias – from data, analysts are able to discover, test, and witness new relationships among data elements: analysts can work with semantic definitions in very quick and iterative fashions through the use of abstracted or data virtualization layers above the data.
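The shift described above can be made concrete with a small sketch: in a schema-less key-value store, the record layout lives in application code rather than in a database schema, so adding a data element is a code change and a redeploy, with no schema migration. Everything below – the store, the field names, the helper functions – is an illustrative assumption, not any particular product's API; a plain dict stands in for the key-value engine.

```python
import json

# A plain dict stands in for a schema-less key-value store:
# it maps a key to an opaque value and imposes no schema.
store = {}

def save_order(order_id, order):
    """Persist an order; the store does not validate its shape."""
    store[f"order:{order_id}"] = json.dumps(order)

def load_order(order_id):
    """The *application* decides what fields mean -- the semantic
    definition lives here, not in a physical data model."""
    raw = json.loads(store[f"order:{order_id}"])
    # Version 2 of the app added 'channel'; old records simply
    # get a default in code -- no schema migration required.
    raw.setdefault("channel", "unknown")
    return raw

# A record written before 'channel' existed...
save_order(1, {"customer": "C042", "amount": 99.50})
# ...and one written after the code change.
save_order(2, {"customer": "C007", "amount": 12.00, "channel": "web"})
```

The trade-off the article describes is visible here: the application gains speed and flexibility, but any other consumer of `store` must rediscover these semantics on its own.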

Testing semantic definitions early on in BI projects is proving invaluable in attaining a more complete understanding of data quality and in avoiding business rules issues that could be disruptive and cause significant impact later. Finally, the interactive process involving business users alongside analysts and modelers is proving to create more accurate and faster BI products, similar to the agile BI process.

Can't Live Without Data Models

As we see more applications move the semantics of data into their application layer and away from physical data models, we must also recognize that those applications are the source systems for many data warehouses (DW). If the business use of operational data sits in the application itself and not the physical database, then BI analysts and integration specialists are flying blind – or, worse yet, may misrepresent operational data in the DW.

When working with application and BI development teams, we have seen two approaches (or a hybrid of the two) that work well. First, we argued that "an order is an order" for well-understood entities used in operational data models, basically encouraging application teams to make parts of their data models "dynamic" only where they needed them – as in sub-typing. This approach would have a super-type for product, with respective sub-types, and one specialized sub-type to allow for the dynamic creation of new sub-product types that could be migrated later to a formal sub-type. It satisfies the need for flexibility and speed. Second, the use of metadata models encourages the desire for meta-driven applications, while providing the BI team with a "key" to unlock data semantics and receive warning beforehand of dynamic data changes.

"There will always be a strong need for a reference data warehouse"

However, most important is the distinction that data models exist not only for an application's use, but to persist data within context for many information consumers throughout the enterprise. BI and information applications not only deliver reports and information, but also support (and should encourage) ad-hoc requests and analytics within the proper context. Data models, especially in BI, are becoming part of the data governance umbrella that governs whether data is made available to the right people, at the right time, and used properly. There will always be a strong need for a reference data warehouse. With good data governance, this data platform will enable business users to have self-service capabilities and prevent the misuse of information that could cripple an organization.
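The super-type/sub-type arrangement described above can be sketched in a few lines of DDL. The table and column names are hypothetical, chosen only for illustration: well-understood entities stay formally modeled, while one deliberately "dynamic" sub-type, captured as generic name/value attributes, gives application teams speed until a new variant is understood well enough to migrate to a formal sub-type. Python's built-in sqlite3 is used here just to exercise the DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
CREATE TABLE product (            -- super-type: all products
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    subtype    TEXT NOT NULL      -- discriminator
);
CREATE TABLE product_book (       -- a formal, well-understood sub-type
    product_id INTEGER PRIMARY KEY REFERENCES product(product_id),
    isbn       TEXT NOT NULL
);
-- The one deliberately dynamic sub-type: new product variants land
-- here as name/value pairs now, and are migrated to a formal
-- sub-type table later, once they are well understood.
CREATE TABLE product_dynamic_attr (
    product_id INTEGER REFERENCES product(product_id),
    attr_name  TEXT NOT NULL,
    attr_value TEXT
);
""")

# A well-understood product uses the formal sub-type...
cur.execute("INSERT INTO product VALUES (1, 'SQL Primer', 'book')")
cur.execute("INSERT INTO product_book VALUES (1, '978-0000000000')")
# ...while a brand-new variant goes into the dynamic sub-type.
cur.execute("INSERT INTO product VALUES (2, 'Mystery Box', 'dynamic')")
cur.execute("INSERT INTO product_dynamic_attr VALUES (2, 'theme', 'retro')")
conn.commit()
```

The BI team still sees every product through the stable super-type, and the `subtype` discriminator acts as the "key" that warns them which rows carry dynamic semantics.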

Where Data Models Are Born

What is being discussed today is not really whether the data model itself is dead, but rather how analysis is being conducted – increasingly discovery-oriented – and where the results of analysis – context – should be persisted. (Sometimes context should reside in application code that can deal with change faster, and sometimes instead in physical data models that ensure as many business users as possible can leverage a commonly agreed-upon and proper context for decision-making in the business.)

"Modern data platforms balance and integrate the use of both flexible and structured data stores"

Modern data platforms balance and integrate the use of both flexible and structured data stores through Hadoop and RDBMSs, but it's the analytics lifecycle methodologies that will enable information discovery, and the governance that will decide whether to migrate and manage analytics throughout the enterprise. Modeling is about performing thorough analysis and developing an understanding of the business; the resulting data models should represent the data persisted by the business in databases or virtual data layers. Key-value stores may be where a discovery process – as a form of analysis – leads to the "birth of data models," which can then be properly persisted for business information consumers to share and leverage.
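One way to picture the "birth of data models" from a key-value discovery process is simple schema inference: scan the raw records, tally which fields appear and with what types, and let the result seed a candidate model for formal persistence. This is a minimal sketch over made-up records, not any vendor's discovery tooling; the record fields and the `infer_candidate_model` helper are assumptions for illustration.

```python
from collections import defaultdict

def infer_candidate_model(records):
    """Tally field presence and observed types across raw key-value
    records -- a starting point for a formally governed data model."""
    fields = defaultdict(lambda: {"count": 0, "types": set()})
    for rec in records:
        for name, value in rec.items():
            fields[name]["count"] += 1
            fields[name]["types"].add(type(value).__name__)
    total = len(records)
    # A field present in every record is a candidate required column;
    # the rest are candidates for optional columns or sub-types.
    return {
        name: {
            "required": info["count"] == total,
            "types": sorted(info["types"]),
        }
        for name, info in fields.items()
    }

raw = [
    {"customer": "C042", "amount": 99.5},
    {"customer": "C007", "amount": 12.0, "channel": "web"},
]
model = infer_candidate_model(raw)
# 'customer' and 'amount' look required; 'channel' looks optional.
```

The output is only a candidate: as the article argues, it still takes analysis and governance to decide which of these discovered fields deserve a place in the persisted, shared data model.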

John O’Brien is the Principal and CEO of Radiant Advisors, a strategic advisory and research firm that delivers innovative thought-leadership, publications, and industry news.


ABOUT RADIANT ADVISORS

Research... Advise... Develop...

Radiant Advisors is a strategic advisory and research firm that networks with industry experts to deliver innovative, cutting-edge educational materials, publications, and in-depth industry research.

Visit www.radiantadvisors.com. Follow us on Twitter! @radiantadvisors

RediscoveringBI | March 2013  

Shifting Gears with Modern BI Architectures
