Course Outline

  • Section 1: Introduction to Big Data & NoSQL
    • Big Data ecosystem
    • NoSQL overview
    • CAP theorem
    • When NoSQL is appropriate
    • Columnar storage
    • HBase and NoSQL
  • Section 2: HBase Intro
    • Concepts and Design
    • Architecture (HMaster and Region Server)
    • Data integrity
    • HBase ecosystem
    • Lab: Exploring HBase
  • Section 3: HBase Data model
    • Namespaces, Tables and Regions
    • Rows, columns, column families, versions
    • HBase Shell and Admin commands
    • Lab: HBase Shell
  • Section 4: Accessing HBase using Java API
    • Introduction to Java API
    • Read / Write path
    • Time Series data
    • Scans
    • MapReduce
    • Filters
    • Counters
    • Co-processors
    • Labs (multiple): Using the HBase Java API to implement time series, MapReduce, filters, and counters
  • Section 5: HBase schema Design: Group session
    • Students are presented with real-world use cases
    • Students work in groups to develop design solutions
    • Discuss, critique, and learn from multiple designs
    • Labs: Implement a scenario in HBase
  • Section 6: HBase Internals
    • Understanding HBase under the hood
    • MemStore / HFile / WAL
    • HDFS storage
    • Compactions
    • Splits
    • Bloom Filters
    • Caches
    • Diagnostics
  • Section 7: HBase installation and configuration
    • Hardware selection
    • Install methods
    • Common configurations
    • Lab: Installing HBase
  • Section 8: HBase ecosystem
    • Developing applications using HBase
    • Interacting with other Hadoop stack components (MapReduce, Pig, Hive)
    • Frameworks around HBase
    • Advanced concepts (co-processors)
    • Labs: Writing HBase applications
  • Section 9: Monitoring And Best Practices
    • Monitoring tools and practices
    • Optimizing HBase
    • HBase in the cloud
    • Real-world use cases of HBase
    • Labs: Checking HBase vitals

Requirements

  • Familiarity with the Java programming language
  • Comfort in a Linux environment (ability to navigate the Linux command line and edit files using vi or nano)
  • A Java IDE such as Eclipse or IntelliJ

Lab environment:

A functional HBase cluster will be provided for students. Students will need an SSH client and a web browser to access the cluster.

Zero Install: There is no need to install HBase software on your own machines!

Duration: 21 hours
