
Storage Clarity archival, compliance & cloud experts

Big Data – Life Cycle Management & Challenges
Graham Irving
April 24, 2012

Storage Clarity, 2012 All Rights Reserved

http://storageclarity.com Phone: +1.403.764.1320


Contents

Introductions
Big Data?
Background
Life Cycle Management
Life Cycle Challenges
Recommendations
3-2-1 Archive

5/6/2012
(c) 2012, Storage Clarity


Introductions

Graham Irving
  President of Storage Clarity
  Canadian Cloud Council, VP Bus. Development, Prairies
  Archive storage expert
  30 yrs storage experience
  BSc. Computer Science
  Past chairman of OSTA’s COSA committee, ANSI X3B11 & X3B11.1 standards committee

Focused on:
  Mission critical data
  Long term preservation/archival
  Cloud data protection


Big Data?

“Big Data is a term applied to data sets whose size is beyond the ability of commonly used SW & HW to capture, manage, and process the data within a tolerable elapsed time.” (Wikipedia)

“Big Data sizes are a constantly moving target currently ranging from a few dozen TBs to many PBs in a single data set.” (Wikipedia)


Background

Big Data is bigger (it’s all relative)
  10s of TBs -> up to PBs -> up to EBs

Data is rapidly growing
  Digital Universe: 1.8 ZB in 2011 to 7.9 ZB in 2015 (SNIA, Rethinking Archiving)
  Amount of data created is doubling every 2 years
  Keeping data for longer periods
  New sources of data

IT Spending is Falling
  HW & SW falling
  Tech staff expected to do more
  SW & HW as a service

Trade-offs are everywhere!
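As a rough check on the growth figures above, the two Digital Universe data points (1.8 ZB in 2011, 7.9 ZB in 2015) imply a compound annual growth rate and a doubling time. The sketch below assumes smooth exponential growth between the two points, which is a simplification; the function names are illustrative only.

```python
import math

def annual_growth_rate(start_zb: float, end_zb: float, years: float) -> float:
    """Compound annual growth rate implied by two data points."""
    return (end_zb / start_zb) ** (1.0 / years) - 1.0

def doubling_time_years(rate: float) -> float:
    """Years needed to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + rate)

# Figures from the slide: 1.8 ZB in 2011, 7.9 ZB in 2015.
rate = annual_growth_rate(1.8, 7.9, 2015 - 2011)
print(f"Implied annual growth: {rate:.1%}")
print(f"Implied doubling time: {doubling_time_years(rate):.2f} years")
```

Under that assumption the figures work out to growth of roughly 45% per year and a doubling time a little under two years, which is in line with the "doubling every 2 years" point.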


Life Cycle Management

Managing data from Cradle to Grave
  Avg. life span estimated at 30 years

Involves:
  System & storage architectures
  Storage technology & media
  Data protection
  Security & encryption
  Migration
  Retention
  Archiving
  Destruction


Life Cycle Challenges

Scalability

Performance

Preventing data loss
  Detecting & recovering from corruption
    Undetected “silent” storage/system errors
  Detecting & recovering from storage failures
    Single points of failure
    Parity based RAID works poorly with large volumes
    Annual HD failure rates of 2% to 13% (source: “Disk failures in the real world: What does an MTTF of 1M hours mean to you?”)
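To put the failure-rate figures above in context, a datasheet MTTF can be converted to a nominal annualized failure rate (AFR). The sketch below uses the common approximation AFR ≈ hours per year / MTTF, which assumes a constant failure rate and only holds when the MTTF is much longer than a year. The fleet size of 1,000 drives is a made-up illustration, not a figure from the slides.

```python
HOURS_PER_YEAR = 24 * 365  # 8760

def nominal_afr(mttf_hours: float) -> float:
    """Approximate annualized failure rate implied by a datasheet MTTF."""
    return HOURS_PER_YEAR / mttf_hours

def expected_failures_per_year(drive_count: int, afr: float) -> float:
    """Expected number of drive failures per year in a fleet."""
    return drive_count * afr

datasheet_afr = nominal_afr(1_000_000)  # MTTF of 1M hours, as in the cited title
drives = 1000                           # hypothetical fleet size

for label, afr in [
    ("datasheet (MTTF 1M h)", datasheet_afr),
    ("observed, low end", 0.02),    # 2% from the slide
    ("observed, high end", 0.13),   # 13% from the slide
]:
    failures = expected_failures_per_year(drives, afr)
    print(f"{label}: AFR {afr:.2%}, about {failures:.0f} failures/year per {drives} drives")
```

A 1M-hour MTTF corresponds to a nominal AFR just under 1%, so the 2% to 13% range quoted above is several times higher than the datasheet figure would suggest.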


Life Cycle Challenges

Migrating data many times
  Avg. life: HD 3 yrs, Tape 7 yrs
  How long will it take to migrate a big data set?
  Technology life cycles are short and getting shorter

Clouds
  Virtualization is not kind to data I/O
  Current LAN/WAN speeds are SLOW
  Danger of being locked in

Device & Media Failures
  Manufacturers are very optimistic!
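The migration question above ("How long will it take to migrate a big data set?") can be estimated with simple arithmetic. The sketch below assumes a single link running at a fixed sustained efficiency. The 1 PB data set, the link speeds, and the 70% efficiency factor are illustrative assumptions, not figures from the slides.

```python
def transfer_days(data_bytes: float, link_bits_per_sec: float,
                  efficiency: float = 0.7) -> float:
    """Days needed to move data_bytes over one link at a sustained efficiency."""
    effective_bytes_per_sec = link_bits_per_sec * efficiency / 8
    return data_bytes / effective_bytes_per_sec / 86_400  # seconds per day

PB = 10 ** 15  # decimal petabyte

# Hypothetical link speeds in bits per second.
links = {"100 Mb/s WAN": 100e6, "1 Gb/s LAN": 1e9, "10 Gb/s LAN": 10e9}

for name, bps in links.items():
    print(f"1 PB over {name}: {transfer_days(PB, bps):,.0f} days")
```

Under these assumptions, 1 PB takes about 132 days at 1 Gb/s and about 13 days at 10 Gb/s. With a 3-year average drive life, a migration of that length recurs often enough to matter in planning.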


Life Cycle Challenges

Encryption key management
  Keeping keys over long time periods

Legacy OSes and file systems
  Traditional hierarchical file systems have many limitations
    Example: File lookups with “Find First” & “Find Next” ops.
  Most OSes are architected for only hierarchical file systems
  Object/grid based architectures require gateways

Compliance
  Chain of Custody
  Destroying expired records
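The "destroying expired records" point above depends on knowing when each record's retention period ends. The sketch below is one minimal way to flag expired records, assuming each record carries a creation date and a retention period in whole years. The record names and retention periods are hypothetical, and a real compliance system would also need to log each destruction to preserve the chain of custody.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Record:
    record_id: str
    created: date
    retention_years: int

def expiry_date(record: Record) -> date:
    """Date on which the record's retention period ends."""
    target_year = record.created.year + record.retention_years
    try:
        return record.created.replace(year=target_year)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year: use Feb 28.
        return record.created.replace(year=target_year, day=28)

def expired_records(records, today: date):
    """Records whose retention period has ended on or before today."""
    return [r for r in records if expiry_date(r) <= today]

# Hypothetical records for illustration.
records = [
    Record("invoice-001", date(2004, 3, 15), 7),
    Record("well-log-042", date(2010, 6, 1), 30),
]

for r in expired_records(records, today=date(2012, 4, 24)):
    print(f"{r.record_id} expired on {expiry_date(r)}; eligible for destruction")
```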


Recommendations

Data Protection
  Batch/CDP backup
  Replication
  Granular policy driven data selection
  Deltas & deduplication
  D2D or D2D2T (disk to disk to tape)

Backups are NOT Archives

Retention & destruction policies

Tiered storage
  HSM – transparent data migration between tiers
  D2D or D2T

Avoid single points of failure
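The deduplication item above can be illustrated with a small sketch. It assumes fixed-size chunks identified by their SHA-256 digest and an in-memory dictionary as the chunk store. These are simplifying assumptions for illustration; real backup products may chunk and index data quite differently.

```python
import hashlib

def dedup_store(data: bytes, store: dict, chunk_size: int = 4096) -> list:
    """Split data into fixed-size chunks, keep each unique chunk once,
    and return the list of chunk digests needed to rebuild the data."""
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # stored only if not already present
        recipe.append(digest)
    return recipe

def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its chunk digests."""
    return b"".join(store[digest] for digest in recipe)

store = {}
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096  # only the last 4 KiB changed

monday_recipe = dedup_store(monday, store)
tuesday_recipe = dedup_store(tuesday, store)

logical = len(monday) + len(tuesday)
physical = sum(len(chunk) for chunk in store.values())
print(f"logical {logical} bytes, stored {physical} bytes")
assert restore(tuesday_recipe, store) == tuesday
```

In this toy case two 12 KiB backups occupy 12 KiB of chunk storage in total, because the repeated "A" chunk is stored once and only the changed chunk is added on the second day.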


3-2-1 Archive

Archiving & Data Protection Best Practice
  3-2-1 best practice is important
    3 copies of all data
    2 different types of storage media/technology
    1 copy off site, removable or fixed media
  Green storage = removable media
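The 3-2-1 rule above is simple enough to check mechanically. The sketch below assumes each copy of a data set is described by a location, a media type, and an off-site flag. The `Copy` type, the `check_321` function, and the sample locations are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Copy:
    location: str
    media_type: str  # e.g. "disk", "tape", "optical"
    off_site: bool

def check_321(copies) -> list:
    """Return a list of 3-2-1 violations; an empty list means compliant."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need 3")
    if len({c.media_type for c in copies}) < 2:
        problems.append("fewer than 2 media types")
    if not any(c.off_site for c in copies):
        problems.append("no off-site copy")
    return problems

copies = [
    Copy("primary array", "disk", False),
    Copy("backup array", "disk", False),
    Copy("vault", "tape", True),
]

print(check_321(copies) or "3-2-1 compliant")
print(check_321(copies[:2]))  # drop the tape copy to see all three violations
```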


Resources

Unified Data Protection Software
  Cofio AIMstor free version: http://www.cofio.com/free-backup-software/
  Cofio AIMstor commercial version: http://www.cofio.com/AIMstor-Download/

HSM Storage software
  Grau Data open source HSM: http://www.openarchive.net
  Grau Data commercial version: http://www.graudata.com/english/ArchiveManager

Magnetic disk WORM software
  Grau Data FileLock software: http://www.graudata.com/english/filelock


Thank you

For more information:
Graham Irving
Phone: +1.403.764.1320
graham@storageclarity.com
http://www.storageclarity.com

