Big Data Hadoop Online Training
Instructor-Led Big Data Hadoop Certification Online Training
About the Course
The Big Data Hadoop Training Course covers Big Data and the Hadoop ecosystem tools in depth. After completing this program you will be able to work on real-time industry use cases.
Hadoop provides distributed storage and distributed processing for Big Data. Understanding Hadoop is a highly valuable skill for anyone working with large amounts of data.
Who can Learn
This training is designed for IT beginners and professionals, application developers, testers, data warehousing professionals, database professionals, and system administrators.
Prerequisite
Knowledge of Linux, any database, and basic Java will be an added advantage.
Date | Weekdays / Weekend | Timings |
---|---|---|
December 12 | MON - FRI (60 Days) | 8:00 AM to 9:00 AM (IST) |
Hadoop course contents
VirtualBox/VMware
- Basics
- Installations
- Backups
- Snapshots
Linux
- Basics
- Commands
Hadoop
- Why Hadoop?
- Scaling
- Distributed Framework
- Hadoop vs. RDBMS
- Brief history of Hadoop
Set up Hadoop
- Pseudo mode
- Cluster mode
- IPv6
- SSH
- Installation of Java and Hadoop
- Configuration of Hadoop
- Hadoop processes (NN, SNN, JT, DN, TT)
- Temporary directory
- UI
- Common errors when running a Hadoop cluster, and their solutions
HDFS (Hadoop Distributed File System)
- HDFS Design and Architecture
- HDFS Concepts
- Interacting with HDFS using the command line
- Interacting with HDFS using Java APIs
- Dataflow
- Blocks
- Replica
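To make the blocks and replica topics concrete, here is a minimal arithmetic sketch of how a file maps onto HDFS blocks. The function name `hdfs_footprint` is hypothetical, and the defaults are assumptions: a 128 MB block size (the Hadoop 2 default; Hadoop 1 used 64 MB) and a replication factor of 3. Real clusters may be configured differently.

```python
import math

def hdfs_footprint(file_size_mb, block_size_mb=128, replication=3):
    """Return (number of blocks, raw MB stored across the cluster) for one file."""
    # A file is split into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # A partial block only uses as much disk as the data it holds,
    # so raw storage is the file size times the replication factor.
    stored_mb = file_size_mb * replication
    return blocks, stored_mb

blocks, stored = hdfs_footprint(1000)
print(blocks, stored)
```

With these assumed defaults, a 1000 MB file is split into 8 blocks and occupies 3000 MB of raw cluster storage.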
Hadoop Processes
- NameNode
- Secondary NameNode
- JobTracker
- TaskTracker
- DataNode
MapReduce
- Developing a MapReduce Application
- Phases in the MapReduce Framework
- MapReduce Input and Output Formats
- Advanced Concepts
- Sample Applications
- Combiner
- HAR
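The phases and the combiner listed above can be pictured with a small word count, simulated here in plain Python rather than on a cluster. This is only an illustrative sketch: all function names are hypothetical, each input line stands in for one mapper's input split, and the real framework runs these steps across many machines.

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit (word, 1) for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]

def combine(pairs):
    # Combiner: pre-aggregate a single mapper's output before the shuffle.
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())

def shuffle(all_pairs):
    # Shuffle and sort: group values by key, ordered by key.
    grouped = defaultdict(list)
    for word, count in all_pairs:
        grouped[word].append(count)
    return sorted(grouped.items())

def reduce_phase(word, counts):
    # Reduce: sum the partial counts for one word.
    return word, sum(counts)

def word_count(lines):
    intermediate = []
    for line in lines:  # each line plays the role of one input split
        intermediate.extend(combine(map_phase(line)))
    return dict(reduce_phase(w, c) for w, c in shuffle(intermediate))

result = word_count(["big data big cluster", "data node"])
print(result)
```

The combiner turns the two `("big", 1)` pairs from the first line into one `("big", 2)` pair, which is the kind of reduction in intermediate data that a combiner is meant to give.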
Joining datasets in MapReduce jobs
- Map-side join
- Reduce-side join
MapReduce customization
- Custom Input format class
- Hash Partitioner
- Custom Partitioner
- Sorting techniques
- Custom Output format class
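As a rough illustration of the difference between a hash partitioner and a custom partitioner, the sketch below decides which reducer receives a key. It is plain Python, not Hadoop code. The first function follows the same idea as Hadoop's default `HashPartitioner` (hash of the key modulo the number of reducers), and the second is a hypothetical custom rule invented for this example.

```python
import zlib

def hash_partition(key, num_reducers):
    # Hash-based rule: stable hash of the key, modulo the reducer count.
    # zlib.crc32 is used because Python's built-in hash() for strings
    # changes between interpreter runs.
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def first_letter_partition(key, num_reducers):
    # Hypothetical custom rule: give each reducer one contiguous
    # alphabetical range, so reducer outputs are ordered by range.
    first = key[0].lower()
    if not "a" <= first <= "z":
        return 0  # non-alphabetic keys all go to the first reducer
    return (ord(first) - ord("a")) * num_reducers // 26

for word in ["apple", "mango", "nut", "zebra"]:
    print(word, hash_partition(word, 2), first_letter_partition(word, 2))
```

A custom rule like this trades even load for control over where keys land. If most keys start with the same letter, one reducer does most of the work.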
Hadoop Programming Languages:
Pig
- Introduction
- Installation and Configuration
- Interacting with HDFS using Pig
- MapReduce Programs through Pig
- Pig Commands
- Loading, Filtering, Grouping…
- Data types, Operators…
- Joins, Groups…
- Sample programs in Pig
Hive
- Basics
- Installation and Configurations
- Commands
- NoSQL Database Concepts
Specialties:
- ETL tool (PDI) (Data Warehousing BI Tools)
- Introduction
- Creating an RDBMS database
- Establishing a connection between PDI and an RDBMS database
- Creating data in Hadoop
- Establishing a connection between PDI and Hadoop data
- Summarization
OVERVIEW: HADOOP DEVELOPER
Introduction
The Motivation for Hadoop
- Problems with traditional large-scale systems
- Requirements for a new approach
Hadoop: Basic Concepts
- An Overview of Hadoop
- The Hadoop Distributed File System
- Hands-On Exercise
- How MapReduce Works
- Hands-On Exercise
- Anatomy of a Hadoop Cluster
- Other Hadoop Ecosystem Components
Writing a MapReduce Program
- The MapReduce Flow
- Examining a Sample MapReduce Program
- Basic MapReduce API Concepts
- The Driver Code
- The Mapper
- The Reducer
- Hadoop’s Streaming API
- Using Eclipse for Rapid Development
- Hands-on exercise
- The New MapReduce API
Delving Deeper Into The Hadoop API
- More about ToolRunner
- Testing with MRUnit
- Reducing Intermediate Data With Combiners
- The configure and close methods for Map/Reduce Setup and Teardown
- Writing Partitioners for Better Load Balancing
- Hands-On Exercise.
- Directly Accessing HDFS
- Using the Distributed Cache
- Hands-On Exercise
Common Map Reduce Algorithms
- Sorting and Searching
- Indexing
- Machine Learning With Mahout
- Term Frequency – Inverse Document Frequency
- Word Co-Occurrence
- Hands-On Exercise.
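For the Term Frequency – Inverse Document Frequency topic, the following single-process sketch shows the arithmetic. It assumes one common TF-IDF variant (term count divided by document length, multiplied by the natural log of total documents over documents containing the term); other weightings exist. The document IDs and terms are made-up sample data, and on Hadoop this computation is usually spread over several chained MapReduce jobs rather than run in one function.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs maps doc_id -> list of terms. Returns {(doc_id, term): score}."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for terms in docs.values() for term in set(terms))
    scores = {}
    for doc_id, terms in docs.items():
        counts = Counter(terms)
        for term, count in counts.items():
            tf = count / len(terms)              # share of this document
            idf = math.log(n_docs / df[term])    # rarity across documents
            scores[(doc_id, term)] = tf * idf
    return scores

docs = {"d1": ["hadoop", "hdfs", "hadoop"], "d2": ["hadoop", "pig"]}
scores = tf_idf(docs)
for key in sorted(scores):
    print(key, round(scores[key], 4))
```

A term that appears in every document, such as "hadoop" here, scores zero because its inverse document frequency is log(1).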
Using HBase
- What is HBase?
- HBase Architecture & HBase API
- Managing large data sets with HBase
- Using HBase in Hadoop applications
- Hands-on exercise.
Using Hive and Pig
- Hive Basics
- Pig Basics
- Hands-on exercise.
Practical Development Tips and Techniques
- Debugging MapReduce Code
- Using LocalJobRunner Mode For Easier Debugging
- Retrieving Job Information with Counters
- Logging
- Splittable File Formats
- Determining the Optimal Number of Reducers
- Map-Only MapReduce Jobs
- Hands-On Exercise.
More Advanced MapReduce Programming
- Custom Writables and WritableComparables
- Saving Binary Data using SequenceFiles and Avro Files
- Creating InputFormats and OutputFormats
- Hands-On Exercise
Joining Data Sets in MapReduce
- Map-Side Joins
- The Secondary Sort
- Reduce-Side Joins
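One way to picture how a reduce-side join and the secondary sort fit together is the plain-Python simulation below. It is a sketch under assumptions, not Hadoop code: the customer and order records, the function names, and the 0/1 tagging scheme are all hypothetical. Each mapper tags its records with the source table, and sorting on the composite (key, tag) makes the customer row arrive before that customer's orders.

```python
def map_customers(record):
    cust_id, name = record
    # Tag 0 sorts before tag 1, so the customer row reaches the reducer first.
    return (cust_id, 0), name

def map_orders(record):
    cust_id, item = record
    return (cust_id, 1), item

def reduce_side_join(customers, orders):
    tagged = [map_customers(r) for r in customers]
    tagged += [map_orders(r) for r in orders]
    # Sorting on the composite key imitates the shuffle plus a secondary sort.
    tagged.sort(key=lambda kv: kv[0])
    joined = []
    current_id, current_name = None, None
    for (cust_id, tag), value in tagged:
        if tag == 0:
            # Remember the customer row; only one row is held in memory.
            current_id, current_name = cust_id, value
        elif cust_id == current_id:
            joined.append((cust_id, current_name, value))
        # Orders with no matching customer are dropped (inner join).
    return joined

customers = [(2, "Ravi"), (1, "Asha")]
orders = [(1, "disk"), (2, "cable"), (1, "rack"), (3, "fan")]
print(reduce_side_join(customers, orders))
```

Because the customer row always comes first within a key, the reducer never has to buffer all of a customer's orders, which is the main reason for pairing the join with a secondary sort.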