Course Outline
Introduction to the Stratio Platform
- Overview of Stratio’s architecture and core modules.
- The role of Rocket and Intelligence in the data lifecycle.
- Logging in and navigating the Stratio user interface.
Working with the Rocket Module
- Data ingestion and pipeline creation.
- Connecting data sources and configuring transformations.
- Using PySpark for preprocessing tasks within Rocket.
PySpark Essentials for Stratio Users
- PySpark data structures and operations.
- Control flow: applying for and while loops and if/else conditionals.
- Writing custom functions using 'def' and applying them in workflows.
Advanced Usage of Rocket with PySpark
- Streaming ingestion and transformation techniques.
- Implementing loops and functions in both batch and real-time scenarios.
- Best practices for optimizing performance in PySpark pipelines.
Exploring the Intelligence Module
- Overview of data modeling and analysis features.
- Feature selection, transformation, and exploration.
- The role of PySpark in generating custom analytics and insights.
Building Advanced Analytics Workflows
- Creating user-defined functions (UDFs) within Intelligence.
- Applying conditionals and loops for complex data logic.
- Practical use cases: segmentation, aggregation, and prediction.
Deployment and Collaboration
- Saving, exporting, and reusing workflows.
- Collaborating with team members via the Stratio platform.
- Reviewing outputs and integrating with downstream tools.
Summary and Next Steps
Requirements
- Proficiency in Python programming.
- Fundamental understanding of data analytics or big data processing concepts.
- Basic familiarity with Apache Spark and distributed computing principles.
Target Audience
- Data engineers working with Stratio-based platforms.
- Analysts or developers utilizing Rocket and Intelligence modules.
- Technical teams transitioning to PySpark-based workflows within the Stratio environment.
14 Hours