As part of its mission to provide cost-effective, innovative training in the latest software technologies, Ssk Systems has started an online training service that is accessible worldwide. Our online software training program is designed to provide a rich learning experience through our Live Interactive Environment, which students can access from home over the internet. Our customized solutions focus on each client's specific needs.
Big Data and Hadoop:
In today's world, properly leveraged data can give organizations of all types a competitive advantage. Companies now handle vast amounts of data every day, and there is unparalleled demand for professionals in this space. Learn how to extract useful information from data and increase the ROI of a business by taking one of our Big Data Analytics courses.
Big Data Hadoop and Spark Developer
Course preview
- Introduction to Big Data and the Hadoop Ecosystem
- Challenges of processing big data
- Technologies that support big data
- The Motivation For Hadoop
- History of Hadoop
- Use cases of Hadoop
- RDBMS vs Hadoop
- When to use and when not to use Hadoop
- Ecosystem tour
- Vendor comparison
- Features of HDFS
- 5 daemons of Hadoop
- Name Node and its functionality
- Data Node and its functionality
- Secondary Name Node and its functionality
- Job Tracker and its functionality
- Task Tracker and its functionality
- Data Storage in HDFS
- Introduction about Blocks
- Data replication
- Accessing HDFS through CLI (Command Line Interface)
- Fault tolerance
- Setting up the CDH
- Map Reduce Story
- Map Reduce Architecture
- How Map Reduce works
- Developing Map Reduce
- Map Reduce Programming Model
- Different phases of Map Reduce Algorithm
- Different Data types in Map Reduce
- How to Write a basic Map Reduce Program
- Driver Code
- Mapper
- Reducer
- Creating Input and Output Formats in Map Reduce Jobs
- Text Input Format
- Key Value Input Format
- Sequence File Input Format
- Data localization in Map Reduce
- Combiner (Mini Reducer) and Partitioner
- Hadoop I/O
- Distributed cache
- Basics of Hive, Hive architecture
- Working with Hive and Impala, Hive vs RDBMS, HiveQL and the shell
- Managing tables (external vs managed)
- Data types and schemas
- Partitions and buckets
- Introduction to Impala
- Impala Architecture
- Hive vs Impala
- Exploring Impala
- Introduction to Apache Pig
- SQL vs. Apache Pig
- Different data types in Pig
- Modes of Execution in Pig
- Grunt shell
- Loading data
- Exploring Pig
- HBase Architecture and schema design
- HBase vs. RDBMS
- HMaster and Region Servers
- Column Families and Regions
- Write pipeline
- Read pipeline
- HBase commands
- Introduction to Sqoop
- Sqoop Architecture
- Sqoop Syntax
- Database connection
- Importing & Exporting data
- Introduction to Flume
- Flume Architecture
- Flume Data Flow
- Configuration
- Introduction to Oozie
- Oozie Workflow
- Property file, Coordinator & Bundle
- Introduction to Apache Spark
- Apache Spark Framework
- Working with RDDs
- Using Spark Shell
- Writing Spark Applications
- DataFrames and Datasets
- DataFrame Operations
- Creating & Saving DataFrames from Data Sources
- Transformations & Actions
- Caching & Persisting
- Spark SQL
Interview Preparation
Interested candidates, please respond to sales@ssksystems.com or contact us at 925-262-9383.
Thanks & Regards
Satya,
Ssk Systems,
925-262-9383