Storage Clarity archival, compliance & cloud experts
Big Data – Life Cycle Management & Challenges
Graham Irving
April 24, 2012
Storage Clarity, 2012 All Rights Reserved
http://storageclarity.com Phone: +1.403.764.1320
Contents
5/6/2012
Introductions
Big Data?
Background
Life Cycle Management
Life Cycle Challenges
Recommendations
3-2-1 Archive
(c) 2012, Storage Clarity
Introductions
Graham Irving
President of Storage Clarity
Canadian Cloud Council, VP Bus. Development, Prairies
Archive storage expert
Past chairman of OSTA’s COSA committee, ANSI X3B11 & X3B11.1 standards committee
30 yrs storage experience
BSc. Computer Science
Focused on:
Mission critical data
Long term preservation/archival
Cloud data protection
Big Data?
“Big Data is a term applied to data sets whose size is beyond the ability of commonly used SW & HW to capture, manage, and process the data within a tolerable elapsed time.” (Wikipedia)
“Big Data sizes are a constantly moving target currently ranging from a few dozen TB’s to many PB’s in a single data set” (Wikipedia)
Background
Big Data is bigger (it’s all relative)
10s of TBs, up to PBs, up to EBs
Data is rapidly growing
Digital Universe: 1.8 ZB in 2011 to 7.9 ZB in 2015 (source: SNIA, “Rethinking Archiving”)
Amount of data created is doubling every 2 years
Keeping data for longer periods
New sources of data
IT Spending is Falling
HW & SW falling
Tech staff expected to do more
SW & HW as a service
Trade-offs are everywhere!
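As a rough check on the growth figures above, the short sketch below derives the annual growth rate and doubling time implied by the Digital Universe numbers (1.8 ZB in 2011, 7.9 ZB in 2015). It assumes smooth compound growth between the two data points, which is a simplification, and the function names are illustrative only.

```python
import math

def implied_annual_growth(start_zb, end_zb, years):
    """Compound annual growth rate implied by two data points."""
    return (end_zb / start_zb) ** (1.0 / years) - 1.0

def doubling_time_years(annual_growth):
    """Years needed to double at a constant compound growth rate."""
    return math.log(2) / math.log(1.0 + annual_growth)

# Figures from the slide: 1.8 ZB (2011) growing to 7.9 ZB (2015).
rate = implied_annual_growth(1.8, 7.9, 2015 - 2011)
print(f"Implied growth: {rate:.1%} per year")
print(f"Implied doubling time: {doubling_time_years(rate):.2f} years")
```

Under these assumptions the implied doubling time comes out a little under two years, which is consistent with the "doubling every 2 years" bullet.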
Life Cycle Management
Managing data from Cradle to Grave
Avg. life span estimated at 30 years
Involves:
System & storage architectures
Storage technology & media
Data protection
Security & encryption
Migration
Retention
Archiving
Destruction
Life Cycle Challenges
Scalability
Performance
Preventing data loss
Detecting & recovering from corruption
Undetected “silent” storage/system errors
Detecting & recovering from storage failures
Single points of failure
Parity based RAID works poorly with large volumes
Annual HD failure rates of 2% to 13% (source: “Disk failures in the real world: What does an MTTF of 1M hours mean to you?”)
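To put the failure-rate bullet in context, the sketch below converts a datasheet MTTF into a nominal annual failure rate and compares the expected number of drive failures per year against the 2% to 13% observed range quoted above. It assumes a constant failure rate and drives powered on 24x7; the 1,000-drive fleet size is a hypothetical figure chosen for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 power-on hours, assuming 24x7 operation

def nominal_afr(mttf_hours):
    """Annual failure rate implied by an MTTF, assuming a constant failure rate."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mttf_hours)

def expected_failures_per_year(drive_count, afr):
    """Expected number of drive failures per year in a fleet."""
    return drive_count * afr

fleet = 1000  # hypothetical fleet size
datasheet = nominal_afr(1_000_000)  # MTTF of 1M hours
for label, afr in [("datasheet", datasheet),
                   ("observed low", 0.02),
                   ("observed high", 0.13)]:
    failures = expected_failures_per_year(fleet, afr)
    print(f"{label:>13}: AFR {afr:.2%}, ~{failures:.0f} failures/year")
```

With a 1M-hour MTTF the nominal rate is below 1% per year, so the observed 2% to 13% range implies at least twice as many drive replacements as the datasheet figure suggests.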
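Silent corruption, listed among the challenges above, is commonly detected by recording a checksum for each file at ingest and re-verifying it periodically. The following is a minimal sketch of that idea using SHA-256 from the Python standard library. The manifest layout (a dict of relative path to hex digest) and the function names are assumptions made for illustration, not part of any product named in this deck.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Record a checksum for every file under root (relative path -> digest)."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*")) if p.is_file()
    }

def verify_manifest(root, manifest):
    """Return the relative paths that are missing or whose content changed."""
    root = Path(root)
    damaged = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(rel_path)
    return damaged
```

A check like this only detects damage. Recovery still depends on having another intact copy, which is the point of the 3-2-1 practice later in the deck.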
Life Cycle Challenges
Migrating data many times
Avg. life for HD – 3yrs, Tape – 7yrs
How long will it take to migrate a big data set?
Technology life cycles are short and getting shorter
Clouds
Virtualization is not kind to data I/O
Current LAN/WAN speeds are SLOW
Danger of being locked in
Device & Media Failures
Manufacturers are very optimistic!
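The question "How long will it take to migrate a big data set?" can be estimated with simple arithmetic. The sketch below computes transfer time from data size and link speed. The 70% sustained-efficiency default and the example sizes and link speeds are assumptions for illustration, not measurements.

```python
def migration_days(size_bytes, link_bits_per_sec, efficiency=0.7):
    """Days needed to move size_bytes over a link at a sustained efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = (size_bytes * 8) / (link_bits_per_sec * efficiency)
    return seconds / 86_400  # seconds per day

TB = 10**12
PB = 10**15
GBIT = 10**9

for label, size in [("100 TB", 100 * TB), ("1 PB", PB)]:
    for link_name, speed in [("1 Gb/s", GBIT), ("10 Gb/s", 10 * GBIT)]:
        days = migration_days(size, speed)
        print(f"{label} over {link_name}: {days:.1f} days")
```

Even at full line rate (`efficiency=1.0`), 1 PB over a 1 Gb/s link takes about 93 days (8 x 10^15 bits at 10^9 bits per second is 8 x 10^6 seconds), a noticeable share of the 3-year average disk life quoted above.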
Life Cycle Challenges
Encryption key management
Keeping keys over long time periods
Legacy OS’s and file systems
Traditional hierarchical file systems have many limitations
Example: File lookups with “Find First” & “Find Next” ops.
Most OS’s are architected for only hierarchical file systems
Object/grid based architectures require gateways
Compliance
Chain of Custody
Destroying expired records
Recommendations
Data Protection
Batch/CDP backup
Replication
Granular policy driven data selection
Deltas & deduplication
D2D or D2D2T (disk to disk to tape)
Backups are NOT Archives
Retention & destruction policies
Tiered storage
HSM: transparent data migration between tiers
D2D or D2T
Avoid single points of failure
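A retention and destruction policy can be expressed as data and evaluated mechanically. The sketch below flags records whose retention period has expired and that are not under legal hold. The record classes, retention periods, and the `legal_hold` field are hypothetical examples, not regulatory guidance.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention periods in years, keyed by record class.
RETENTION_YEARS = {"financial": 7, "email": 3, "engineering": 30}

@dataclass
class Record:
    name: str
    record_class: str
    created: date
    legal_hold: bool = False

def expiry_date(record):
    """Date on which the record's retention period ends."""
    target_year = record.created.year + RETENTION_YEARS[record.record_class]
    try:
        return record.created.replace(year=target_year)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        return record.created.replace(year=target_year, day=28)

def due_for_destruction(records, today):
    """Records past their retention period and not on legal hold."""
    return [r for r in records
            if not r.legal_hold and expiry_date(r) <= today]
```

Keeping the policy table separate from the evaluation logic means a changed retention period is a data edit rather than a code change, and the list returned by `due_for_destruction` can be logged before anything is deleted.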
3-2-1 Archive
Archiving & Data Protection Best Practice
3-2-1 best practice is important
3 copies of all data
2 different types of storage media/technology
1 copy off site, removable or fixed media
Green storage = removable media
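The 3-2-1 rule above is easy to check mechanically against an inventory of copies. The following sketch assumes a hypothetical inventory format in which each copy records its media type and whether it is held off site; the field and function names are illustrative.

```python
from typing import NamedTuple

class Copy(NamedTuple):
    location: str
    media_type: str   # e.g. "disk", "tape", "optical"
    offsite: bool

def check_3_2_1(copies):
    """Return a list of rule violations; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"need 3 copies, found {len(copies)}")
    media = {c.media_type for c in copies}
    if len(media) < 2:
        problems.append(f"need 2 media types, found {len(media)}")
    if not any(c.offsite for c in copies):
        problems.append("need 1 copy off site, found none")
    return problems

inventory = [
    Copy("primary array", "disk", offsite=False),
    Copy("local library", "tape", offsite=False),
    Copy("vault", "tape", offsite=True),
]
print(check_3_2_1(inventory) or "3-2-1 satisfied")
```

Returning the list of violations, rather than a single pass/fail flag, makes it clear which part of the rule a given data set fails.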
Resources
Unified Data Protection Software
Cofio AIMstor free version: http://www.cofio.com/free-backup-software/
Cofio AIMstor commercial version: http://www.cofio.com/AIMstor-Download/
HSM Storage software
Grau Data open source HSM: http://www.openarchive.net
Grau Data commercial version: http://www.graudata.com/english/ArchiveManager
Magnetic disk WORM software
Grau Data FileLock software: http://www.graudata.com/english/filelock
Thank you
For more information:
Graham Irving
Phone: +1.403.764.1320
graham@storageclarity.com
http://www.storageclarity.com