The Big Data Paradigm | StackSkills

Introduction to Hadoop

Introduction

You, this course and Us (1:17)

Why is Big Data a Big Deal

The Big Data Paradigm (14:20)
Serial vs Distributed Computing (8:37)
What is Hadoop? (7:25)
HDFS or the Hadoop Distributed File System (11:01)
MapReduce Introduced (11:39)
YARN or Yet Another Resource Negotiator (4:00)

Installing Hadoop in a Local Environment

Hadoop Install Modes (8:32)
Set up a Virtual Linux Instance (For Windows users) (15:31)
Hadoop Standalone mode Install (9:33)
Hadoop Pseudo-Distributed mode Install (14:25)

The MapReduce "Hello World"

The basic philosophy underlying MapReduce (8:49)
MapReduce - Visualized And Explained (9:03)
MapReduce - Digging a little deeper at every step (10:21)
"Hello World" in MapReduce (10:29)
The Mapper (9:48)
The Reducer (7:46)
The Job (12:28)

Run a MapReduce Job

Get comfortable with HDFS (10:59)
Run your first MapReduce Job (14:30)

HDFS and Yarn

HDFS - Protecting against data loss using replication (15:32)
HDFS - Name nodes and why they're critical (6:48)
HDFS - Checkpointing to backup name node information (11:10)
Yarn - Basic components (8:33)
Yarn - Submitting a job to Yarn (13:10)
Yarn - Plug in scheduling policies (14:21)
Yarn - Configure the scheduler (12:26)

Setting up a Hadoop Cluster

Manually configuring a Hadoop cluster (Linux VMs) (13:50)
Getting started with Amazon Web Services (6:25)
Start a Hadoop Cluster with Cloudera Manager on AWS (13:04)
