Hadoop Developer Training In Bangalore
Apache Hadoop Developer
=====
-----
* Course Id : BIGD-HDPD
* Duration : 16 Hours
Overview
-----
* Participants learn to install and set up a Hadoop cluster
* Attendees learn the Hadoop MapReduce framework
* The course also covers Apache Sqoop, Apache Oozie, and Apache Hive
Training Objectives
-----
All attendees will:
* Learn the limitations of traditional large-scale systems
* Understand the core components of Hadoop
* Understand the concepts of Mappers and Reducers
* Be able to perform Hive DDL and DML operations
Prerequisites
-----
Basic knowledge of:
* Java
* Linux (an added advantage)
Course Structure
-----
* Our technical courses emphasize hands-on work (typically 80% hands-on, 20% theory)
* Students learn to apply the material to real-world problems
Materials Provided
-----
* All participants receive:
  * PDF of slides
  * PDF of hands-on exercises
  * Access to an instance with the lab environment
Software Requirements
-----
* Any one of:
  * Firefox 27
  * Chrome 40
Hardware Requirements
-----
* Processor: 1.2 GHz
* RAM: 512 MB
* Disk space: 1 GB
* Network connection to the Internet with low latency (<250 ms)
## Day-wise Hadoop Developer Course Outline
-----
## Day 1
-----
* Unit 1: Introduction to Big Data and Hadoop
* Unit 2: Hadoop 1.x
* Unit 3: Hadoop 2.0
* Unit 4: Understanding Hadoop MapReduce Framework
## Day 2
-----
* Unit 5: Working with YARN
* Unit 6: Understanding Apache Sqoop
* Unit 7: Understanding Apache Hive
* Unit 8: Understanding Apache Oozie
## Detailed Hadoop Development Course Outline
-----
Unit 1: Introduction to Big Data and Hadoop
-----
* Why Big Data suddenly became so prominent
* Limitations of traditional large scale systems
* Compare Hadoop architecture with traditional architecture
* Core components of Hadoop
* Understanding Hadoop Master-Slave Architecture
Unit 2: Hadoop 1.x
-----
* Hadoop deployment modes – Standalone, Pseudo-distributed (single node), Fully distributed (multinode)
* Configuration files in a Hadoop Cluster
* Understanding HDFS Architecture
* Learn about NameNode, DataNode, Secondary NameNode
* Learn about JobTracker, TaskTracker
* Anatomy of reading and writing data on HDFS
* Run HDFS and Linux commands
Unit 3: Hadoop 2.0
-----
* Hadoop 1.0 Limitations
* MapReduce Limitations
* History of Hadoop 2.0
* HDFS 2: Architecture
* HDFS 2: High availability
* HDFS 2: Federation
Unit 4: Understanding Hadoop MapReduce Framework
-----
* Overview of the MapReduce Framework
* Use cases of MapReduce
* MapReduce Architecture
* Understand the concepts of Mappers and Reducers
* Anatomy of MapReduce Program
* MapReduce Components – Mapper Class, Reducer Class, Driver code
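To give a feel for the Mapper, Reducer, and Driver roles listed above, here is a minimal word-count sketch. It uses only the JDK and does not call Hadoop's own `org.apache.hadoop.mapreduce` API. The class name `WordCountSketch` and its methods are hypothetical, chosen for illustration; a real Hadoop job would extend Hadoop's `Mapper` and `Reducer` classes and run on a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, JDK-only illustration of the MapReduce data flow (not Hadoop API code).
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key (TreeMap keeps keys sorted).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: wires the map, shuffle, and reduce phases together.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(emitted).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {big=2, cluster=1, data=2, node=1}
        System.out.println(run(List.of("big data big cluster", "data node")));
    }
}
```

In Hadoop itself, the shuffle step is performed by the framework between the map and reduce tasks, so a developer typically writes only the Mapper, the Reducer, and the Driver code.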
Unit 5: Working with YARN
-----
* YARN introduction
* YARN architecture
* Classic MapReduce vs. YARN
* Setting up a cluster
Unit 6: Understanding Apache Sqoop
-----
* How Sqoop works
* Import/Export Data
* Sqoop Architecture
* Sqoop Installation
Unit 7: Understanding Apache Hive
-----
* What is Hive?
* Hive DDL – Create/Show/Drop Database
* Hive DDL – Create/Show/Drop Tables
* Hive DML – Load Files into Tables
* Hive DML – Inserting Data into Tables
* Hive SQL – Select, Filter, Join, Group By
* Hive Architecture and Components
* Hive Data Model and Data Units
Unit 8: Understanding Apache Oozie
-----
* How Oozie works
* Oozie workflow
* Configuring Apache Oozie
* Running Apache Oozie
