Hadoop Training In Hyderabad


GYANVRIKSH INTERACTIVE PVT LTD, III Floor, QZ Plaza, Opp. Haveli Restaurant, Kothaguda, Kondapur

We are AUTHORISED APACHE HADOOP, HORTONWORKS, SPARK & SCALA CERTIFICATION PARTNERS. Training is delivered by Cloudera- and Hortonworks-certified professionals with hands-on, real-time industry experience. MONEY BACK GUARANTEE if NOT SATISFIED.

Big Data/Hadoop Training

Pre-requisites: Knowledge of Core Java/Oracle; basics of Unix

1. Introduction to Big Data & Hadoop
• Importance of Data & Data Analysis
• What is Big Data?
• Big Data & its hype
• Big Data Users & Scenarios
• Structured vs Unstructured Data
• Challenges of Big Data
• How to overcome the challenges?
• Divide & Conquer philosophy
• Overview of Hadoop

2. Hadoop and its file system - HDFS
• History of Hadoop
• Hadoop Ecosystem
• Hadoop Animal Planet
• What is Hadoop?
• Key Distinctions of Hadoop
• Hadoop Components
  o HDFS
  o MapReduce
• Why Distributed File System?
• The Design of HDFS
• Hadoop Distributed File System
• What is an HDFS block?
• Why is an HDFS block so large?
• NameNode
• DataNode

Reach us @ 040 – 6553 2121 / 903 007 2121 …Empowering with Education & Expertise


• Secondary NameNode
• A file in HDFS
• Hadoop Components/Architecture
• NameNode, JobTracker, DataNode, TaskTracker & Secondary NameNode
• Understanding Storage components (NameNode, DataNode & Secondary NameNode)
• Understanding Processing components (JobTracker & TaskTracker)
• How Secondary NameNode overcomes the failure of the primary NameNode
• Anatomy of a File Read
• Anatomy of a File Write

3. Understanding Hadoop Cluster
• Walkthrough of CDH VM setup
• Hadoop Cluster modes
  o Standalone Mode
  o Pseudo-Distributed Mode
  o Distributed Mode
• Hadoop Configuration files
  o core-site.xml
  o mapred-site.xml
  o hdfs-site.xml
  o yarn-site.xml
• Understanding Cluster configuration

4. MapReduce
• Meet MapReduce
• WordCount algorithm: Traditional approach
• Traditional approach on a Distributed system & its drawbacks
• MapReduce approach
• Input & Output Forms of a MR program
• Hadoop Data types
• Map, Shuffle & Sort, Reduce Phases
• Workflow & Transformation of Data
• Word Count Code walkthrough
• Input Split & HDFS Block
• Relation between Split & Block
• MR Flow with Single Reduce Task
• MR Flow with multiple Reducers

• Data locality Optimization
• Speculative Execution
• Combiner
• Partitioner

5. Advanced MapReduce
• Counters
• InputFormat & its hierarchy
• OutputFormat & its hierarchy
• Using Compression techniques
• Side Data Distribution: Distributed Cache
• Joins
  o Map side join using Distributed Cache
  o Reduce side Join
• Secondary Sorting
• MRUnit: a unit testing framework

6. Pig
• What is Pig?
• Why Pig?
• Pig vs. SQL
• Execution Types or Modes
• Running Pig
• Pig Data types
• Pig Latin relational Operators
• Multi Query execution
• Pig Latin Diagnostic Operators
• Pig Latin Macro & UDF statements
• Pig Latin Commands
• Pig Latin Expressions
• Schemas
• Pig Functions
• Pig Latin File Loaders
• Pig UDF & executing a Pig UDF
• Pig Use cases


7. Hive
• Introduction to Hive
• Pig vs. Hive
• Hive Limitations & Possibilities
• Hive Architecture
• Metastore
• Hive Data Organization
• HiveQL
• SQL vs. HiveQL
• Hive Data types
• Data Storage
• Managed & External Tables
• Partitions & Buckets
• Static Partitioning & Dynamic Partitioning
• Storage Formats
• File Formats: Sequence File & RC File
• Using Compression in Hive
• Built-in SerDes
• Importing Data (Using Load Data & Insert Into)
• Alter & Drop Commands
• Data Querying
• Using MR Scripts
• Hive Joins
• Sub Queries
• Views

8. HBase
• Introduction to NoSQL & HBase
• HBase vs. RDBMS
• HBase Use cases
• Row & Column oriented storage
• Characteristics of a huge DB
• What is HBase?
• HBase Data-Model
• HBase logical model & physical storage
• HBase architecture
• HBase in operation (put, get, scan & delete)
• Loading Data into HBase
• HBase shell commands
• HBase operations through Java

• HBase operations through MR

9. ZooKeeper & Oozie
• Introduction to ZooKeeper
• Distributed Coordination
• ZooKeeper Data Model
• ZooKeeper Service

10. Sqoop
• Introduction to Sqoop
• Sqoop design
• Sqoop basic Commands
• Sqoop Table Import flow of execution
• Sqoop Import Commands: to HDFS, Hive & HBase tables
• Sqoop Incremental Import
  o Incremental Append
  o Incremental Last Modified
• Sqoop export flow of execution
• Sqoop Export Command

11. Flume
• Flume Architecture
• Flume Components
• Streaming live Twitter data with Flume

12. Hadoop 2.0 & YARN
• Hadoop 1 Limitations
• HDFS Federation
• NameNode High Availability
• Introduction to YARN
• YARN Applications
• YARN Architecture
• Anatomy of a YARN application

13. MongoDB - Overview


14. Spark Overview
• What is Spark?
• Why Spark?
• Spark & Big Data
• Spark Components
• Resilient Distributed Datasets (RDDs)
• Data Operations on RDD
• Spark Libraries

JAVA (15 HRS): to the extent required for MapReduce (complimentary)

Highlights of the Course:
• Teaching is oriented towards:
  o Practical, hands-on learning
  o A clear understanding of the basics
  o What to expect as interview questions, covered while each topic is discussed
• Exclusive access to a variety of the latest interview questions and answers
• Work on real-time projects (in all tools, such as Pig, Hive, MapReduce & HBase)
• Certification guidance & material
• Hand-outs will be given, which serve as a knowledge check
• Assistance in resume preparation
• Interview guidance
• Corporate-level training
• Finally, this training gives you everything needed to secure the job you want and to keep you going in it!


Why Gyanvriksh:
• Authorized HADOOP Certification Partners
• Money back guarantee with 15% interest if not satisfied: Quality Assured
• Rated 88% excellent by students. Refer reviews (around 700-plus ratings) and Facebook page
• Complete practical-oriented, hands-on training
• Situated in the IT hub Kondapur/Madhapur, on the Main Road
• Experienced, certified & real-time working professionals
• Class recordings of every session will be provided (only for Online Training)
• Maximum batch size of 30, to give more focus at the individual level
• Register for one course and attend the same course with the same faculty any number of times in future for free. Same course with a different faculty: only 50% charged
• Weekend, Weekday, Online & Corporate Trainings
• Nice ambience & AC classrooms
• Crash course / Fast Track / 1-ON-1 training available (charged extra)
• Stay facility for outstation students (charged separately)
