Computerworld Singapore July/August 2013

Page 17

computerworld Singapore www.computerworld.com.sg

information infrastructure

operational intelligence through provision of business insights, operational visibility, proactive monitoring, and search and investigation.

Analytics and Big Data

SNIA South Asia’s PK Gupta kicked off the SNIA Education Sessions by giving the audience an overview of the data deluge confronting us today: the world now has more than 1.2 zettabytes of digital data, a figure expected to grow to 35 zettabytes in 10 years. Some 90 percent of this so-called “digital universe” is unstructured. “Just in 2011, the digital universe had 300 quadrillion files,” said Gupta. Making sense of huge amounts of data will become highly important, he said, because the information derived will enable businesses to adapt quickly to changing market conditions and to thrive.

Big data analytics has been touted as the new arena of data distillation, but Gupta warned that it requires a very different approach from traditional business intelligence, which “is repetitive, is structured, works with operational sources of data,” and typically works on datasets ranging from gigabytes to tens of terabytes in size. Big data analytics is more experimental and likely to be conducted on an ad hoc basis. It also works mostly with semi-structured data, may require external as well as operational sources of data, and handles data volumes ranging from tens of terabytes to hundreds of petabytes.

According to Gupta, big data consists of large data sets characterised by volume (for example, terabytes in size, and millions of transactions, tables, records and files), velocity (in processing, whether in batch, near-time, real-time, or streaming), and variety (structured, unstructured or semi-structured).

From another perspective, BI and analytics can be understood by the kinds of questions businesses are asking. Past data enables the business to understand “what happened” through reporting and dashboards, and “why did it happen” through forensics and data mining. “Real-time analytics answers ‘what is happening’, and real-time data mining seeks to answer ‘why it is happening’,” said Gupta. As the term suggests, predictive analytics answers “what is likely to happen” and prescriptive analytics answers “what should be done about it”. Gupta highlighted 10 use cases for big data analytics (see sidebar).
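To put the growth figures Gupta quoted into perspective, the short Python sketch below works out the yearly growth rate they imply. It assumes steady compound growth over the whole period, which is a simplification made here and not something stated in the talk.

```python
# Figures as quoted in the talk.
start_zb = 1.2   # zettabytes of digital data today
end_zb = 35.0    # zettabytes projected
years = 10       # projection horizon

# Assumption (not from the talk): growth compounds at a constant yearly rate.
growth_factor = end_zb / start_zb
annual_rate = growth_factor ** (1 / years) - 1

print(f"Overall growth: about {growth_factor:.0f}x")
print(f"Implied annual growth: about {annual_rate:.0%}")
```

Under that assumption, the digital universe would have to grow by roughly two-fifths every year to reach the projected figure.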

Public, Private and Hybrid

Thomas Chua, education instructor with SNIA South Asia, talked about deploying public, private and hybrid storage clouds. He pointed out that a cloud has five major characteristics: an on-demand self-service portal, resource pooling, rapid elasticity, measured service, and broad network access. Cloud services fall into three categories: software-as-a-service (SaaS), platform-as-a-service (PaaS) and infrastructure-as-a-service (IaaS). However, what is deployed, whether a public, private, community or hybrid cloud, will usually depend on cost.

As to what cloud storage is used for, Chua gave several main reasons: elastic demand for Web-based media (such as video, ebooks and audio); backup to the cloud (restore, recovery, and “seeding” the backup with a hard drive); synchronisation of files to the cloud and across multiple devices (an Internet “drive” as secondary storage); and archiving and preservation in the cloud (including compliance, retention and e-discovery).

He also talked about the evolution of the data storage interface, a reference to the huge amount of data generated by an ever more diverse set of devices such as mobile phones and tablets. With trends like BYOD and the consumerisation of IT, corporate data are being stored in public clouds in unmanaged and unprotected states, leaving them prone to compromise. All these changes to storage requirements are making cloud storage increasingly important. The technical session spent some time on the adoption rate of CDMI (the Cloud Data Management Interface), and ended with a detailed treatment of how cloud storage could be deployed and which technical issues should be addressed before deployment.

Solid State Storage

Heoh Chin-Fah, who is also an education instructor with SNIA South Asia, took the stage to talk about how best to take advantage of flash memory in enterprise storage environments, in particular in recognising the relative advantages of flash tiering, caching and all-flash approaches, with reference to performance, cost, reliability and predictability. After giving a simplified description of flash memory, Heoh went on to explain its inner workings, and why flash “wear” is a real concern in how it should be used within a storage infrastructure. “In the end, capacity, cost improvement brought about by flash, and the attractiveness of its form factor must be weighed against true concerns such as flash performance, reliability and error-handling,” said Heoh.

10 Use Cases for Big Data Analytics

1. Modelling true risk
2. Customer churn analysis
3. Recommendation engine
4. Ad targeting
5. Point-of-sale transaction analysis
6. Analysing network data to predict failure
7. Threat analysis
8. Trade surveillance
9. Search quality
10. Data “sandbox”


Heoh went on to talk about the technical differences between flash and disk drives, fleshing out the key properties of each technology. While flash wins on access times, Heoh warned of flash “read wear”, an inherent shortcoming of solid-state architecture that does not suit high-speed random read/write input-output processes. Disks, on the other hand, spend a lot of time seeking and rotating compared with the time spent actually delivering data. However, demand for higher memory performance, brought about by multiple-core CPUs, data growth and the consumerisation of IT, is contributing to what Heoh called the storage I/O crisis. This is further exacerbated by the randomisation of access in virtualised or consolidated architectures, a consequence of virtualisation, data consolidation and different cloud architectures. In addition, demand for ever-decreasing seek times and higher rotating disk speeds will add further pressure on data storage.

There is also the need to understand the application’s I/O “fingerprint”. By that, Heoh meant the app’s requirements, such as I/O load, memory block size, locality of access, access pattern, and latency sensitivity.
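Heoh’s point that disks spend most of their time seeking and rotating, not delivering data, can be illustrated with some back-of-envelope arithmetic. The Python sketch below is only an illustration: every drive figure in it is an assumption chosen for the example, not a number from the talk.

```python
# All drive figures below are illustrative assumptions, not from the talk.
rpm = 7200                  # assumed spindle speed
avg_seek_ms = 8.0           # assumed average seek time
transfer_mb_per_s = 150.0   # assumed sustained transfer rate
block_kb = 4.0              # assumed size of one small random read

# On average the platter turns half a revolution before the data arrives.
avg_rotation_ms = 0.5 * 60_000 / rpm

# Time spent actually moving the block off the platter.
transfer_ms = block_kb / 1024 / transfer_mb_per_s * 1000

total_ms = avg_seek_ms + avg_rotation_ms + transfer_ms
delivery_share = transfer_ms / total_ms   # fraction of time delivering data
random_iops = 1000 / total_ms             # small random reads per second

print(f"Rotational latency: {avg_rotation_ms:.2f} ms")
print(f"Share of time delivering data: {delivery_share:.2%}")
print(f"Random reads per second: {random_iops:.0f}")
```

With these assumed figures, well under one percent of each small random read is spent transferring data, which is why the block size and access pattern in an application’s I/O “fingerprint” matter so much on disk.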

Object Storage Systems

SNIA South Asia’s education instructor Wong Tran gave a talk on the underpinnings of cloud and big data initiatives. After describing what constitutes a cloud (according to the U.S. National Institute of Standards and Technology), he described the five essential cloud characteristics: on-demand self-service; broad network access; resource pooling; rapid elasticity; and measured service. “Things like Google, Gmail, YouTube are examples of cloud services,” said Tran. “Being able to access these systems is the characteristic of cloud computing.” But what kind of storage can meet these criteria? “If you look at the evolution of storage, traditional spinning drives with block storage led to storage arrays and separate file systems that rely on tree structures to handle files instead of raw bits like in a storage array, which makes storing large amounts of data a challenge.”

With block storage, data is organised as an array of unrelated blocks and access to the blocks is directly controlled by a host. With file and network-attached storage (NAS), data is similarly organised as unrelated blocks but the onboard file system places data on the disks. External systems provide access to the files within the onboard file system. In object storage, the approach is to provide application-centric data storage, access and management. Virtual containers are used to encapsulate data, their attributes, metadata, and object IDs or keys. “The easiest way to store a large amount of data is using a ‘bucket’ to keep different objects within. Object storage doesn’t care about file structures.”
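The “bucket” idea can be sketched in a few lines of Python. This is a toy, in-memory illustration only: the `Bucket` and `StoredObject` names and their methods are hypothetical, invented for this example, and do not correspond to any product or to the API of any real object store.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: the data plus the metadata that travels with it."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class Bucket:
    """Hypothetical flat container: objects are found by ID, not by path."""

    def __init__(self, name):
        self.name = name
        self._objects = {}  # object ID -> StoredObject

    def put(self, data, **metadata):
        object_id = uuid.uuid4().hex  # system-generated object ID
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id):
        return self._objects[object_id]

    def find(self, **criteria):
        """Return the IDs of objects whose metadata matches every criterion."""
        return [
            oid for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


bucket = Bucket("media")
video_id = bucket.put(b"...video bytes...", content_type="video/mp4", owner="marketing")
report_id = bucket.put(b"...pdf bytes...", content_type="application/pdf", owner="finance")

print(bucket.get(video_id).metadata)
print(bucket.find(owner="finance") == [report_id])
```

There is no directory tree anywhere in the sketch: an application keeps the object ID, or searches on metadata, and the container decides where the bytes live.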
