Big Data Hadoop Online Training

Instructor-Led Big Data Hadoop Certification Online Training

About the Course

The Big Data Hadoop Training Course covers Big Data and the Hadoop ecosystem tools in depth. After completing this program you will be able to work on real-world industry use cases.

Hadoop enables distributed storage and processing of Big Data. Understanding Hadoop is a highly valuable skill for anyone working with large amounts of data.

Who can Learn

This training is designed for IT beginners and professionals, application developers, testers, data warehousing professionals, database professionals, and system administrators.

Knowledge of Linux, any database, and basic Java will be an added advantage.

Date: December 12
Weekdays / Weekend: MON - FRI (60 Days)
Timings: 8:00 AM to 9:00 AM (IST)

Hadoop course contents

VirtualBox/VMware

  • Basics
  • Installations
  • Backups
  • Snapshots


  • Basics
  • Commands


  • Why Hadoop?
  • Scaling
  • Distributed Framework
  • Hadoop vs. RDBMS
  • Brief history of Hadoop

Setting up Hadoop

  • Pseudo-distributed mode
  • Cluster mode
  • IPv6
  • SSH
  • Installation of Java and Hadoop
  • Hadoop configuration
  • Hadoop processes (NN, SNN, JT, DN, TT)
  • Temporary directory
  • UI
  • Common errors when running a Hadoop cluster, and their solutions

HDFS - Hadoop Distributed File System

  • HDFS Design and Architecture
  • HDFS Concepts
  • Interacting with HDFS using the command line
  • Interacting with HDFS using Java APIs
  • Dataflow
  • Blocks
  • Replica

Hadoop Processes

  • NameNode
  • Secondary NameNode
  • JobTracker
  • TaskTracker
  • DataNode

MapReduce

  • Developing a MapReduce application
  • Phases in the MapReduce framework
  • MapReduce input and output formats
  • Advanced Concepts
  • Sample Applications
  • Combiner
  • HAR
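
To give a feel for the phases and the combiner listed above, here is a minimal sketch in plain Python. It only simulates the map, combine, shuffle, and reduce steps in memory; it is not Hadoop code, and all function names are hypothetical, chosen for illustration.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def combine(pairs):
    """Combiner: pre-sum counts on the map side to shrink intermediate data."""
    partial = defaultdict(int)
    for word, count in pairs:
        partial[word] += count
    return list(partial.items())

def shuffle(all_pairs):
    """Shuffle/sort: group all values by key, sorted by key."""
    grouped = defaultdict(list)
    for word, count in all_pairs:
        grouped[word].append(count)
    return dict(sorted(grouped.items()))

def reduce_phase(word, counts):
    """Reducer: sum all partial counts for one word."""
    return word, sum(counts)

def word_count(lines):
    """Run the simulated job; each line stands in for one input split."""
    intermediate = []
    for line in lines:
        intermediate.extend(combine(map_phase(line)))
    return dict(reduce_phase(w, c) for w, c in shuffle(intermediate).items())

if __name__ == "__main__":
    print(word_count(["big data big", "data hadoop"]))
```

In this sketch the combiner runs once per line, so the reducer receives one partial count per word per line rather than one record per occurrence.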

Joining Datasets in MapReduce Jobs

  • Map-side join
  • Reduce-side join

MapReduce – Customization

  • Custom Input format class
  • Hash Partitioner
  • Custom Partitioner
  • Sorting techniques
  • Custom Output format class
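
As a rough illustration of the partitioner topics above, the sketch below shows the idea in plain Python. Hadoop's default hash partitioner takes the key's hash code, masks off the sign bit, and takes it modulo the number of reducers. Here `zlib.crc32` is only a stand-in for Java's `hashCode()`, and `first_letter_partition` is a hypothetical custom rule made up for this example.

```python
import zlib

def hash_partition(key, num_reducers):
    """Hash-style partitioning: non-negative hash of the key, modulo reducer count."""
    h = zlib.crc32(key.encode("utf-8"))  # stand-in for Java's key.hashCode()
    return (h & 0x7FFFFFFF) % num_reducers

def first_letter_partition(key, num_reducers):
    """Hypothetical custom partitioner: keys starting a-m go to reducer 0, the rest to 1."""
    if num_reducers < 2:
        return 0  # with a single reducer, everything goes to partition 0
    return 0 if key[:1].lower() <= "m" else 1

if __name__ == "__main__":
    for word in ["apple", "hadoop", "zebra"]:
        print(word, hash_partition(word, 4), first_letter_partition(word, 2))
```

A custom rule like this is useful when related keys must reach the same reducer, but a skewed rule can overload one reducer, which is why partitioner design is tied to load balancing.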

Hadoop Programming Languages


Pig

  • Introduction
  • Installation and Configuration
  • Interacting with HDFS using Pig
  • MapReduce programs through Pig
  • Pig commands
  • Loading, filtering, grouping…
  • Data types, operators…
  • Joins, groups…
  • Sample programs in Pig


  • Basics
  • Installation and Configurations
  • Commands
  • NoSQL database concepts


  • ETL tool (PDI) (Data Warehousing BI Tools)
  • Introduction
  • Creating an RDBMS database
  • Establishing a connection between PDI and the RDBMS database
  • Creating data in Hadoop
  • Establishing a connection between PDI and Hadoop data
  • Summarization



The Motivation for Hadoop

  • Problems with traditional large-scale systems
  • Requirements for a new approach

Hadoop: Basic Concepts

  • An Overview of Hadoop
  • The Hadoop Distributed File System
  • Hands-On Exercise
  • How MapReduce Works
  • Hands-On Exercise
  • Anatomy of a Hadoop Cluster
  • Other Hadoop Ecosystem Components

Writing a MapReduce Program

  • The MapReduce Flow
  • Examining a Sample MapReduce Program
  • Basic MapReduce API Concepts
  • The Driver Code
  • The Mapper
  • The Reducer
  • Hadoop’s Streaming API
  • Using Eclipse for Rapid Development
  • Hands-on exercise
  • The New MapReduce API

Delving Deeper Into The Hadoop API

  • More about ToolRunner
  • Testing with MRUnit
  • Reducing Intermediate Data With Combiners
  • The configure and close methods for Map/Reduce Setup and Teardown
  • Writing Partitioners for Better Load Balancing
  • Hands-On Exercise
  • Directly Accessing HDFS
  • Using the Distributed Cache

Hands-On Exercise

Common Map Reduce Algorithms

  • Sorting and Searching
  • Indexing
  • Machine Learning With Mahout
  • Term Frequency – Inverse Document Frequency
  • Word Co-Occurrence
  • Hands-On Exercise.

Using HBase

  • What is HBase?
  • HBase Architecture & HBase API
  • Managing large data sets with HBase
  • Using HBase in Hadoop applications
  • Hands-on exercise.

Using Hive and Pig

  • Hive Basics
  • Pig Basics
  • Hands-on exercise.

Practical Development Tips and Techniques

  • Debugging MapReduce Code
  • Using LocalJobRunner Mode For Easier Debugging
  • Retrieving Job Information with Counters
  • Logging
  • Splittable File Formats
  • Determining the Optimal Number of Reducers
  • Map-Only MapReduce Jobs
  • Hands-On Exercise.

More Advanced MapReduce Programming

  • Custom Writables and WritableComparables
  • Saving Binary Data using SequenceFiles and Avro Files
  • Creating InputFormats and OutputFormats
  • Hands-On Exercise

Joining Data Sets in MapReduce

  • Map-Side Joins
  • The Secondary Sort
  • Reduce-Side Joins
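
The reduce-side join listed above can be sketched in a few lines of plain Python. This is an in-memory simulation under assumed data shapes (lists of `(customer_id, value)` tuples), not Hadoop code; the function name and the `"C"`/`"O"` source tags are hypothetical.

```python
from collections import defaultdict

def reduce_side_join(customers, orders):
    """Inner-join two datasets on customer id, in the style of a reduce-side join."""
    # Map phase: tag each record with its source and emit it under the join key.
    tagged = [(cid, ("C", name)) for cid, name in customers]
    tagged += [(cid, ("O", item)) for cid, item in orders]

    # Shuffle phase: group all tagged records by join key.
    groups = defaultdict(list)
    for key, value in tagged:
        groups[key].append(value)

    # Reduce phase: for each key, pair every customer record with every order record.
    joined = []
    for key in sorted(groups):
        names = [v for tag, v in groups[key] if tag == "C"]
        items = [v for tag, v in groups[key] if tag == "O"]
        joined.extend((key, n, i) for n in names for i in items)
    return joined

if __name__ == "__main__":
    customers = [(1, "Asha"), (2, "Ravi")]
    orders = [(1, "book"), (1, "pen"), (3, "lamp")]
    print(reduce_side_join(customers, orders))
```

Keys that appear in only one dataset produce no output here, so the sketch behaves like an inner join. A map-side join avoids the shuffle by loading the smaller dataset into memory on each mapper instead.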

Frequently Asked Questions

When does the course start and finish?
The course starts now and never ends! It is a completely self-paced online course - you decide when you start and when you finish.
How long do I have access to the course?
How does lifetime access sound? After enrolling, you have unlimited access to this course for as long as you like - across any and all devices you own.
What if I am unhappy with the course?
We would never want you to be unhappy! If you are unsatisfied with your purchase, contact us in the first 3 days and we will give you a full refund.

Get started now!