
Course Outline

  1. Big data fundamentals
    • The role of Big Data in the corporate landscape
    • The development phases of a Big Data strategy within an organization
    • Understanding the rationale behind a holistic Big Data approach
    • Essential components of a Big Data Platform
    • Big data storage solutions
    • Limitations of traditional technologies
    • An overview of database types
    • The four dimensions of Big Data
  2. Big data impact on business
    • The business significance of Big Data
    • Challenges associated with extracting valuable data
    • Integrating Big Data with traditional data systems
  3. Big data storage technologies
    • Overview of big data technologies
      • Data storage models
      • Hadoop
      • Hive
      • Cassandra
      • MongoDB
    • Strategies for selecting the appropriate big data technology
  4. Processing big data
    • Connecting to and extracting data from databases
    • Transforming and preparing data for processing
    • Utilizing Hadoop MapReduce for processing distributed data
    • Monitoring and executing Hadoop MapReduce jobs
    • Core building blocks of the Hadoop Distributed File System
    • Understanding MapReduce and YARN
    • Handling streaming data with Spark
  5. Big data analysis tools and technologies
    • Programming with Hadoop using Pig Latin
    • Querying big data with Hive
    • Data mining with Mahout
    • Tools for visualization and reporting
  6. Big data in business
    • Managing and establishing Big Data requirements
    • The business importance of Big Data
    • Selecting the optimal big data tools for specific problems

Data Warehousing Concepts

  • Defining a Data Warehouse
  • Distinctions between OLTP and Data Warehousing
  • Data Acquisition
  • Data Extraction
  • Data Transformation
  • Data Loading
  • Data Marts
  • Dependent vs. Independent Data Marts
  • Database design principles

ETL Testing Concepts

  • Introduction to ETL Testing
  • Software Development Life Cycle (SDLC)
  • Testing methodologies
  • ETL Testing Workflow Process
  • ETL Testing responsibilities within DataStage

Big Data Fundamentals

  • The role of Big Data in the corporate landscape
  • The development phases of a Big Data strategy within an organization
  • Understanding the rationale behind a holistic Big Data approach
  • Essential components of a Big Data Platform
  • Big data storage solutions
  • Limitations of traditional technologies
  • An overview of database types

NoSQL Databases

Hadoop

MapReduce

Apache Spark

Requirements

Participants should have a basic understanding of storage tools and some hands-on experience with them, as well as an awareness of what managing large data sets involves.

Duration: 14 hours
