
Hadoop: Introduction

with expert Barry Solomon




Course at a glance

Included in these subscriptions:

  • Dev & IT Pro Video
  • Dev & IT Pro Power Pack

Release date 6/30/2014
Level Intermediate
Runtime 2h 1m
Closed captioning Included
Transcript Included
eBooks / courseware Included
Hands-on labs N/A
Sample code N/A
Exams Included





Course description

In this course we look at why big data matters in today’s world and how it fits into your organization’s future. We then focus on one big data framework in particular, Hadoop, because it is fully open source and driven by the community. We examine some of the pieces that make up Hadoop and demonstrate some of its functionality. There are many use cases where big data can enhance your organization’s competitive edge: analyzing social media, sensor data, clickstream data, geographic data, emails, and the list goes on. By the end of the course you should have a better understanding not only of what big data and Hadoop are but, more importantly, of where they fit into your organization’s structure and what they bring to the table.

Prerequisites

This course assumes an understanding of working with databases and database systems. You should also be familiar with Linux command syntax.

Learning Paths

This course is part of the following LearnNowOnline SuccessPaths™:
Hadoop

Meet the expert

Barry Solomon has over 23 years of experience as a consultant. He has developed with Fortran, C, C++, Visual Basic, Java, and Visual C#. His extensive database experience includes working with Microsoft Access, Microsoft SQL Server, MySQL, and Oracle. His expertise now includes big data, Hadoop in particular, and its surrounding ecosystem, as data demands have exceeded the limits of most modern database systems.

Course outline



What is Big Data

Purpose of Big Data (40:41)
  • Introduction (00:22)
  • End of the Line (05:16)
  • OLTP and OLAP (03:07)
  • Storage (02:39)
  • Big Data as Supercomputer (05:11)
  • Scalability (02:22)
  • Hard Drives (03:20)
  • Parallelism (02:13)
  • Whose Data is it? (04:16)
  • Being Competitive and Relevant (03:47)
  • What is Big Data (02:06)
  • Variety, velocity and volume (01:31)
  • Leveraging and ROI (01:42)
  • Data Data Everywhere (01:38)
  • Throw it in the Lake of Data (00:52)
  • Summary (00:10)
Use Cases (13:22)
  • Introduction (00:15)
  • Use Cases (02:38)
  • Real Time vs Batch Processing (01:15)
  • What About Databases (02:10)
  • OLTP and OLAP (02:15)
  • Appliances (00:57)
  • Mix and Match (01:03)
  • Schema on Write, on Read (00:50)
  • NoSQL (01:40)
  • Summary (00:12)

Hadoop

Hadoop (37:06)
  • Introduction (00:16)
  • What do I get (01:01)
  • Hadoop (02:49)
  • File System (01:56)
  • MapReduce (01:18)
  • YARN (02:02)
  • Ecosystem (03:03)
  • Pig (03:30)
  • Hive (03:42)
  • Mahout and Oozie (03:05)
  • NoSQL (00:20)
  • Sqoop (01:36)
  • Ambari (01:51)
  • ZooKeeper (01:20)
  • The other pieces (07:16)
  • Tez (01:37)
  • Summary (00:16)
Hadoop Demo (29:50)
  • Introduction (00:20)
  • Where do we go? (05:44)
  • Demo: Download (02:32)
  • Demo: Putty (01:01)
  • Demo: Web Interface (03:56)
  • Demo: Back to Putty (02:57)
  • Demo: PIG (03:00)
  • Demo: HIVE Table (05:40)
  • Demo: Ambari (02:30)
  • Demo: Query (01:56)
  • Summary (00:09)