Hadoop Administration Training In Bangalore
Hadoop Administration
======
* Course Id : BIGD-HDPA
* Duration : 32 Hours
Overview
—–
* Hadoop Administration training teaches participants storage management, the Hadoop file system, and how to create and manage a Hadoop cluster
* Participants will also learn about Hadoop 2.0, Cloudera Manager and Kerberos
Prerequisites
—–
* Participants should have basic knowledge of the Linux command-line interface
* Basics of Hadoop
* Basics of Big Data
Objectives
—–
* Understand the Hadoop architecture, the Hadoop cluster and the Hadoop administrator’s role
* Understand how to manage, maintain, monitor and troubleshoot a Hadoop cluster
* Understand how the Hadoop Distributed File System (HDFS) works
* Learn how to work with the YARN framework, Cloudera Manager and Kerberos
* Commission and decommission cluster nodes
* Cover Hive, Pig, Oozie, Sqoop and Flume concepts
* Optimize a Hadoop cluster for high performance
Course Structure
—–
* Our technical courses emphasize hands-on practice (typically 80% hands-on, 20% theory)
* Students learn to apply the material to real-world problems
Materials Provided
—–
* PDF of slides and hands-on exercises
* Access to an instance with the lab environment
Software Requirements
—–
Any of the following:
* Any current internet browser
* VNC client
* RDP client
Hardware Requirements
—–
* Processor: 1.2 GHz
* RAM: 512 MB
* Disk space: 1 GB
* Low-latency (<250 ms) network connection to the Internet
Day-wise Hadoop Administration Course Outline
—–
## Day 1
—–
* Unit 1 : Understanding Big Data and Hadoop
* Unit 2 : Hadoop Architecture and Cluster setup
## Day 2
—–
* Unit 3 : Hadoop Cluster Administration & Understanding MapReduce
* Unit 4 : Backup, Recovery and Maintenance
## Day 3
—–
* Unit 5 : Hadoop Cluster: Planning and Management
* Unit 6 : Hadoop 2.0 and Its Features
## Day 4
—–
* Unit 7 : Setting Up Hadoop 2.x with High Availability and Upgrading Hadoop
* Unit 8 : Cloudera Manager
Detailed Course Outline For Hadoop Admin
—–
Unit 1 : Understanding Big Data and Hadoop
—–
* Introduction to big data, limitations of existing solutions
* Hadoop architecture, Hadoop components and ecosystem
* Data loading & reading from HDFS
* Replication rules, rack awareness theory
* Hadoop cluster administrator: roles and responsibilities
Unit 2 : Hadoop Architecture and Cluster setup
—–
* Hadoop server roles and their usage
* Hadoop installation and initial configuration
* Deploying Hadoop in a pseudo-distributed mode
* Deploying a multi-node Hadoop cluster
* Installing Hadoop Clients
* How HDFS works and resolving simulated problems
Unit 3 : Hadoop Cluster Administration & Understanding MapReduce
—–
* Understanding the Secondary NameNode
* Working with a distributed Hadoop cluster
* Commissioning and decommissioning nodes
* Understanding MapReduce
* Understanding schedulers and enabling them
Unit 4 : Backup, Recovery and Maintenance
—–
* Common admin commands such as balancer, trash and import checkpoint
* DistCp, data backup and recovery
* Enabling trash, namespace (name count) quotas and space quotas, manual failover and metadata recovery
Unit 5 : Hadoop Cluster: Planning and Management
—–
* Planning the Hadoop cluster
* Cluster sizing
* Hardware, network and software considerations
* Popular Hadoop distributions, workload and usage patterns
Unit 6 : Hadoop 2.0 and Its Features
—–
* Limitations of Hadoop 1.x
* Features of Hadoop 2.0
* YARN framework, MRv2
* Hadoop high availability and federation
* YARN ecosystem and Hadoop 2.0 cluster setup
Unit 7 : Setting Up Hadoop 2.x with High Availability and Upgrading Hadoop
—–
* Configuring Hadoop 2 with high availability
* Upgrading to Hadoop 2
* Working with Sqoop
* Understanding Oozie
* Working with Hive
* Working with HBase
Unit 8 : Cloudera Manager
—–
* Cluster setup
* Understand Kerberos
* Hive administration, HBase architecture
* HBase setup, Hadoop/Hive/HBase performance optimization
* Cloudera Manager and cluster setup
* Pig setup and working with Grunt
* Why Kerberos is needed and how it helps
