SR TECHNOLOGIES

#101, Santhinilaya, Behind HUDA Maitrivanam

PH: +91-9676126684

HADOOP Development & Admin Course

Hadoop Introduction:-

What is Hadoop? Why Hadoop?
History of Hadoop
What are the different components of Hadoop?

→ HDFS, MapReduce, PIG, Hive, SQOOP, HBASE, OOZIE, Flume, Zookeeper and so on…

What is the scope of Hadoop?

Hadoop Distributed File System (HDFS) (for Storing the Data):-

→ Introduction to HDFS

→ Features of HDFS

→ Daemons of Hadoop

NameNode
Secondary NameNode
JobTracker
DataNode
TaskTracker

→ Basic Configuration for HDFS

→ Data Organization and Replication

→ Rack Awareness, Heartbeat Signal

→ How to Store the Data into HDFS

→ Accessing HDFS (Introduction to Basic UNIX commands)

→ CLI commands

MapReduce using Java (Processing the Data):-

→ Introduction to MapReduce

→ MapReduce Architecture

→ Data flow in MapReduce

Splits
Mapper
Partitioning
Sort and shuffle
Combiner
Reducer
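
The stages listed above can be tried out without a cluster. The following is a minimal sketch in plain Python (not Hadoop code, and not the Java API taught in the course) that imitates the word count data flow on in-memory data. The function names, the sample splits and the byte-sum partitioning rule are assumptions made for illustration only. In this sketch the combiner is applied to each map task's output before the pairs are sent to the reducers.

```python
from collections import defaultdict
from itertools import groupby


def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Pre-aggregate one map task's output locally to reduce shuffle traffic."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def partition(key, num_reducers):
    """Pick a reducer for a key. A byte sum is used here (illustrative rule)
    so that the assignment is the same on every run."""
    return sum(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Sum all the counts received for one key."""
    return key, sum(values)


def run_job(splits, num_reducers=2):
    # One bucket per reducer; each bucket collects pairs from all map tasks.
    buckets = [[] for _ in range(num_reducers)]
    for split in splits:  # each split is processed by its own map task
        mapped = [pair for line in split for pair in mapper(line)]
        for key, value in combiner(mapped):
            buckets[partition(key, num_reducers)].append((key, value))

    results = {}
    for bucket in buckets:
        # Sort so that equal keys are adjacent, then group them per key.
        bucket.sort(key=lambda kv: kv[0])
        for key, group in groupby(bucket, key=lambda kv: kv[0]):
            word, total = reducer(key, (value for _, value in group))
            results[word] = total
    return results


# Two hypothetical input splits, each holding one line of text.
splits = [["big data big cluster"], ["data node data"]]
counts = run_job(splits)
print(counts)
```

Changing num_reducers changes which bucket a word lands in, but not the final counts, which is the property a partitioner has to preserve.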

→ Basic Configuration of MapReduce

→ MapReduce life cycle

→ Writing and Executing the Basic MapReduce Program using Java

→ File Input Formats

→ Joins

Map-side Joins
Reducer-side Joins
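
The difference between the two join styles can be sketched in plain Python (again not Hadoop code). In the map-side version the smaller data set is assumed to fit in memory and is looked up while the larger one is scanned. In the reduce-side version records from both data sets are grouped by the join key first and combined afterwards. The table contents and function names are hypothetical.

```python
from collections import defaultdict


def map_side_join(orders, customers_small):
    """Map-side join: every map task keeps the small table in memory."""
    lookup = dict(customers_small)  # customer_id -> name
    return [(cid, lookup[cid], amount)
            for cid, amount in orders if cid in lookup]


def reduce_side_join(orders, customers):
    """Reduce-side join: tag records by source, group by key, join per key."""
    grouped = defaultdict(lambda: {"C": [], "O": []})
    for cid, name in customers:
        grouped[cid]["C"].append(name)      # tagged as customer record
    for cid, amount in orders:
        grouped[cid]["O"].append(amount)    # tagged as order record

    joined = []
    for cid in sorted(grouped):             # keys reach the reducer sorted
        for name in grouped[cid]["C"]:
            for amount in grouped[cid]["O"]:
                joined.append((cid, name, amount))
    return joined


# Hypothetical sample data: (customer_id, name) and (customer_id, amount).
customers = [(1, "Asha"), (2, "Ravi")]
orders = [(1, 250), (2, 100), (1, 75), (3, 40)]

map_result = map_side_join(orders, customers)
reduce_result = reduce_side_join(orders, customers)
print(sorted(map_result) == sorted(reduce_result))
```

Both functions perform an inner join, so the order for customer 3, which has no matching customer record, is dropped in either case.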

PIG:-

→ Introduction to Apache PIG

→ MapReduce vs PIG

→ Basic PIG programming

→ Modes of Execution in PIG

Local Mode
MapReduce Mode

→ Execution Mechanisms

Grunt Shell
Script
Embedded

→ Operators in PIG

→ PIG UDFs

SQOOP:-

→ Introduction to SQOOP

→ Connecting to a MySQL database

→ SQOOP commands

Import
Export
Eval
Codegen, etc.

→ Joins in SQOOP

HIVE:-

→ Introduction to HIVE

→ HIVE Architecture

→ Tables in HIVE

Managed Tables
External Tables

→ Partition

→ Joins in HIVE

→ HIVE UDFs and UDAFs

HBASE:-

→ Introduction to HBASE and Basic Configurations of HBASE

→ HBASE Architecture

→ SQL vs NoSQL

→ How HBASE differs from an RDBMS

→ Client-side buffering and bulk uploads

Cluster Setup:-

→ Downloading and installing Hadoop

→ Creating a Cluster

→ Increasing and Decreasing the Cluster size

→ Monitoring the Cluster Health

→ Starting and Stopping the Nodes

Introduction to OOZIE, FLUME and ZOOKEEPER, with some sample programs.

Topics Covered in the Hadoop Developer Course

Introduction to Big Data and Hadoop
Hadoop ecosystem concepts
Hadoop MapReduce concepts and features
Developing MapReduce applications
Pig concepts
Hive concepts
Real-time queries with Impala
Real life use cases

Introduction to Big Data and Hadoop

What is Big Data?
What is Hadoop?
Why Hadoop?
History of Hadoop
Hadoop ecosystem
HDFS
MapReduce

Install Hadoop

Single Node Hadoop Setup
Test run Hadoop commands
Hands on

Understanding the Cluster

Writing files to HDFS
Reading files from HDFS
Rack awareness
The five Hadoop daemons

Deep Dive into MapReduce

Before MapReduce
MapReduce overview
MapReduce architecture
Word count problem
Word count flow and solution
MapReduce flow

Developing the MapReduce Application

Data Types
File Formats
Driver, Mapper and Reducer code explained
Configuring the development environment – Eclipse
Writing unit tests
Running locally
Running on a cluster
Hands on

Monitoring MapReduce Job Status

Job submission
Job initialization
Task assignment
Job completion
Job scheduling
Job failures
Shuffle and sort
Hands on

MapReduce Types and Formats

MapReduce types
Input Formats – input splits and records, text input, binary input, multiple inputs and database input
Output Formats – text output, binary output, multiple outputs, lazy output and database output
Hands on

MapReduce Features

Sorting
Joins – Map side and reduce side
MapReduce combiner
MapReduce partitioner
MapReduce distributed cache
Hands on

Hive

Fundamentals
Concepts
Hands-on

Pig

Fundamentals
Concepts
Hands-on

Sqoop

Fundamentals
Concepts
Hands-on

Flume

Fundamentals
Concepts
Hands-on

Case Studies

Explanation of a real-time use case

Thursday, July 17, 2014

Best Online Training For HADOOP BIG DATA