GPU Programming with CUDA and Python Training Course
CUDA (Compute Unified Device Architecture) is a parallel computing platform and API developed by NVIDIA.
This instructor-led, live training (available online or onsite) is designed for intermediate-level developers who want to leverage CUDA to build Python applications that execute in parallel on NVIDIA GPUs.
By the end of this training, participants will be able to:
- Utilize the Numba compiler to accelerate Python applications running on NVIDIA GPUs.
- Create, compile, and launch custom CUDA kernels.
- Manage GPU memory efficiently.
- Transform CPU-based applications into GPU-accelerated solutions.
Format of the Course
- Interactive lecture and discussion.
- Extensive exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Course Outline
Introduction
- What is GPU programming?
- Why use CUDA with Python?
- Key concepts: Threads, Blocks, Grids
Overview of CUDA Features and Architecture
- GPU vs CPU architecture
- Understanding SIMT (Single Instruction, Multiple Threads)
- CUDA programming model
Setting up the Development Environment
- Installing CUDA Toolkit and drivers
- Installing Python and Numba
- Setting up and verifying the environment
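One possible way to verify the environment from Python is sketched below. It assumes Numba is the GPU library being installed, as the outline states. The helper name `cuda_status` and its return strings are hypothetical and exist only for this illustration.

```python
def cuda_status():
    """Report whether Numba can see a CUDA-capable GPU.

    Hypothetical helper for illustration; the return strings are arbitrary labels.
    """
    try:
        # Import lazily so the check also runs on machines where Numba is absent.
        from numba import cuda
    except ImportError:
        return "numba-missing"
    # cuda.is_available() is True only when a usable driver and device are found.
    return "gpu-ready" if cuda.is_available() else "no-gpu"


print(cuda_status())
```

On a machine with a supported driver and GPU this is expected to report `gpu-ready`. Running `numba -s` on the command line prints a fuller system report.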
Parallel Programming Fundamentals
- Introduction to parallel execution
- Understanding threads and thread hierarchies
- Working with warps and synchronization
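The thread hierarchy can be illustrated on the CPU before any GPU code is written. The sketch below is a plain-Python model, not CUDA code: it enumerates the 1D global index that each (block, thread) pair would compute, and shows how threads within a block group into warps. The function names are hypothetical, and the warp size of 32 is the value used by NVIDIA GPUs.

```python
def global_indices(blocks_per_grid, threads_per_block):
    """CPU-only model: the 1D global index of every (block, thread) pair.

    Mirrors the usual CUDA formula blockIdx.x * blockDim.x + threadIdx.x.
    """
    return [
        block * threads_per_block + thread
        for block in range(blocks_per_grid)
        for thread in range(threads_per_block)
    ]


def warp_of(thread_in_block, warp_size=32):
    """Index of the warp that a thread belongs to within its block."""
    return thread_in_block // warp_size


print(global_indices(2, 4))      # 2 blocks of 4 threads cover indices 0..7
print(warp_of(31), warp_of(32))  # threads 0..31 share warp 0; thread 32 starts warp 1
```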
Working with the Numba Compiler
- Introduction to Numba
- Writing CUDA kernels with Numba
- Understanding @cuda.jit decorators
Building a Custom CUDA Kernel
- Writing and launching a basic kernel
- Using threads for element-wise operations
- Managing grid and block dimensions
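The topics in this module could be sketched as follows, assuming Numba's `@cuda.jit` decorator and `cuda.grid` helper. The names `launch_config` and `add_on_gpu`, and the default of 256 threads per block, are hypothetical choices for illustration rather than course material. Running `add_on_gpu` requires Numba, NumPy, and a CUDA-capable GPU; `launch_config` is plain Python.

```python
def launch_config(n, threads_per_block=256):
    """Return (blocks_per_grid, threads_per_block) so the grid covers n elements."""
    # Round up so a partially filled final block still gets launched.
    blocks_per_grid = (n + threads_per_block - 1) // threads_per_block
    return blocks_per_grid, threads_per_block


def add_on_gpu(x, y):
    """Element-wise x + y on the GPU (needs Numba, NumPy, and a CUDA device)."""
    # Imported here so the module still loads on machines without a GPU stack.
    import numpy as np
    from numba import cuda

    @cuda.jit
    def add_kernel(a, b, out):
        i = cuda.grid(1)   # absolute index of this thread in the 1D grid
        if i < out.size:   # guard: the last block may extend past the array
            out[i] = a[i] + b[i]

    a = np.asarray(x, dtype=np.float32)
    b = np.asarray(y, dtype=np.float32)
    out = np.empty_like(a)
    blocks, threads = launch_config(a.size)
    # Passing host arrays lets Numba copy them to the device and back automatically.
    add_kernel[blocks, threads](a, b, out)
    return out


print(launch_config(1000))  # 1000 elements at 256 threads per block -> (4, 256)
```

Each thread handles one element, so the grid must contain at least as many threads as the array has elements; the bounds check inside the kernel discards the surplus threads in the final block.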
Memory Management
- Types of GPU memory (global, shared, local, constant)
- Memory transfer between host and device
- Optimizing memory usage and avoiding bottlenecks
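Explicit host and device transfers might look like the sketch below, which assumes Numba's `cuda.to_device` and `copy_to_host` calls. The function names and the block size of 128 are hypothetical. `double_on_gpu` requires Numba, NumPy, and a CUDA-capable GPU; `double_reference` is a plain-Python counterpart that can be used to validate the GPU result.

```python
def double_reference(values):
    """CPU reference implementation, useful for validating the GPU result."""
    return [2.0 * v for v in values]


def double_on_gpu(values):
    """Double every element on the GPU using explicit memory transfers."""
    # Imported here so the module still loads on machines without a GPU stack.
    import numpy as np
    from numba import cuda

    @cuda.jit
    def double_kernel(data):
        i = cuda.grid(1)
        if i < data.size:
            data[i] *= 2.0

    host = np.asarray(values, dtype=np.float32)
    device = cuda.to_device(host)            # explicit host -> device copy
    threads = 128
    blocks = (host.size + threads - 1) // threads
    double_kernel[blocks, threads](device)   # works on device memory, no implicit copies
    return device.copy_to_host().tolist()    # explicit device -> host copy


print(double_reference([1.0, 2.5]))
```

Keeping data on the device between kernel launches, rather than letting each launch copy arrays back and forth, is one common way to reduce transfer overhead.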
Advanced Topics in GPU Acceleration
- Shared memory and synchronization
- Using streams for asynchronous execution
- Multi-GPU programming basics
Converting CPU-based Applications to GPU
- Profiling CPU code
- Identifying parallelizable sections
- Porting logic to CUDA kernels
Troubleshooting
- Debugging CUDA applications
- Common errors and how to resolve them
- Tools and techniques for testing and validation
Summary and Next Steps
- Review of key concepts
- Best practices in GPU programming
- Resources for continued learning
Requirements
- Python programming experience
- Experience with NumPy (ndarrays, ufuncs, etc.)
Audience
- Developers
Open Training Courses require 5+ participants.
Testimonials (1)
Very interactive with various examples, with a good progression in complexity between the start and the end of the training.
Jenny - Andheo
Course - GPU Programming with CUDA and Python
Related Courses
Advanced Python: Best Practices and Design Patterns
28 Hours
This intensive, practical course delves into advanced Python techniques, engineering best practices, and widely utilized design patterns to help you develop maintainable, testable, and high-performance Python applications. The curriculum emphasizes modern tooling, type hinting, concurrency models, architectural patterns, and deployment-ready workflows.
This instructor-led, live training (available online or onsite) is designed for intermediate to advanced-level Python developers who aim to adopt professional practices and patterns for production-grade Python systems.
Upon completion of this training, participants will be able to:
- Apply Python typing, dataclasses, and type-checking to enhance code reliability.
- Utilize design patterns and architectural principles to structure robust applications.
- Implement concurrency and parallelism effectively using asyncio and multiprocessing.
- Develop well-tested code using pytest, property-based testing, and CI pipelines.
- Profile, optimize, and harden Python applications for production environments.
- Package, distribute, and deploy Python projects using modern tools and containers.
Course Format
- Interactive lectures and short demonstrations.
- Hands-on labs and coding exercises each day.
- Capstone mini-project integrating patterns, testing, and deployment.
Course Customization Options
- To request customized training or focus areas (data, web, or infrastructure), please contact us to arrange.
Agentic AI Engineering with Python — Build Autonomous Agents
21 Hours
This course delivers practical engineering methodologies for designing, building, testing, and deploying autonomous (agentic) systems using Python. The curriculum covers the agent loop, tool integrations, memory and state management, orchestration patterns, safety controls, and considerations for production environments.
This instructor-led live training, available online or onsite, is designed for intermediate to advanced ML engineers, AI developers, and software engineers aiming to construct robust, production-ready autonomous agents in Python.
Upon completing this training, participants will be able to:
- Design and implement agent loops and decision-making workflows.
- Integrate external tools and APIs to enhance agent functionality.
- Develop short-term and long-term memory architectures for agents.
- Coordinate multi-step orchestrations and enable agent composability.
- Apply best practices for safety, access control, and observability in deployed agents.
Course Format
- Interactive lectures and discussions.
- Hands-on labs focused on building agents using Python and popular SDKs.
- Project-based exercises resulting in deployable prototypes.
Customization Options
- To request a customized training session for this course, please contact us.
Introduction to Data Science and AI using Python
35 Hours
This five-day course provides an introduction to Data Science and Artificial Intelligence (AI).
The training includes practical examples and exercises conducted in Python.
Artificial Intelligence with Python (Intermediate Level)
35 Hours
Artificial Intelligence with Python focuses on building intelligent systems by leveraging Python’s comprehensive ecosystem of AI and machine learning libraries.
This instructor-led live training, available online or onsite, targets intermediate-level Python developers looking to design, implement, and deploy AI solutions using Python.
Upon completing this training, participants will be able to:
- Implement AI algorithms utilizing Python’s core AI libraries.
- Work with supervised, unsupervised, and reinforcement learning models.
- Integrate AI solutions into existing applications and workflows.
- Evaluate model performance and optimize for accuracy and efficiency.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Algorithmic Trading with Python and R
14 Hours
This instructor-led, live training in Serbia (online or onsite) is designed for business analysts who wish to automate their trading using algorithmic strategies, Python, and R.
By the end of this training, participants will be able to:
- Utilize algorithms to rapidly buy and sell securities at specialized increments.
- Lower the costs associated with trading through the use of algorithmic techniques.
- Automatically track stock prices and execute trades.
Applied AI from Scratch in Python
28 Hours
This four-day course provides an introduction to Artificial Intelligence and its practical applications using the Python programming language. Upon completing the course, there is an option to dedicate an additional day to working on a hands-on AI project.
AWS Cloud9 and Python: A Practical Guide
14 Hours
This instructor-led live training, conducted in Serbia (online or onsite), is designed for intermediate-level Python developers aiming to enhance their development experience with AWS Cloud9.
By the end of this training, participants will be able to:
- Set up and configure AWS Cloud9 for Python development.
- Understand the AWS Cloud9 IDE interface and features.
- Write, debug, and deploy Python applications in AWS Cloud9.
- Collaborate with other developers using the AWS Cloud9 platform.
- Integrate AWS Cloud9 with other AWS services for advanced deployments.
Building Chatbots in Python
21 Hours
Chatbots are automated computer programs that mimic human conversation through chat interfaces. They assist organizations in optimizing operational efficiency by offering streamlined and rapid solutions for user interactions.
In this instructor-led live training, participants will acquire the skills necessary to create chatbots using Python.
Upon completion of this training, participants will be able to:
- Grasp the fundamental principles of chatbot development
- Construct, evaluate, deploy, and debug various types of chatbots using Python
Target Audience
- Software Developers
Course Format
- A combination of lectures, discussions, exercises, and intensive hands-on practice
Important Note
- To arrange a customized training session for this course, please get in touch with us.
Administration of CUDA
35 Hours
This instructor-led, live training in Serbia (online or onsite) is designed for beginner-level system administrators and IT professionals who wish to install, configure, manage, and troubleshoot CUDA environments.
Upon completion of this training, participants will be able to:
- Comprehend the architecture, components, and capabilities of CUDA.
- Install and configure CUDA environments.
- Manage and optimize CUDA resources.
- Debug and resolve common CUDA issues.
Bespoke Applied Artificial Intelligence and LLM Engineering with Python
35 Hours
Course Overview
This practical training is tailored for professionals with a data engineering background who wish to develop hands-on skills in artificial intelligence, Python, and large language models. The program emphasizes real-world applications, including the utilization of models, prompt engineering, and the creation of AI-driven solutions. Participants will engage in progressive exercises that transition from foundational concepts to the development of deployable AI workflows.
Training Format
- In-person classroom instruction
- Instructor-led sessions featuring guided practice
- Interactive discussions and analysis of real-world case studies
- Daily practical exercises
Course Objectives
- Comprehend core AI and machine learning concepts pertinent to contemporary applications
- Enhance Python proficiency for AI development and data workflows
- Gain insight into the operation of large language models and effective utilization strategies
- Design and refine prompts to ensure reliable outputs
- Construct end-to-end AI solutions utilizing APIs and frameworks
- Integrate AI capabilities into data engineering pipelines
Scaling Data Analysis with Python and Dask
14 Hours
This instructor-led, live training in Serbia (online or onsite) is tailored for data scientists and software engineers who wish to use Dask with the Python ecosystem to build, scale, and analyze large datasets.
By the end of this training, participants will be able to:
- Configure the environment to begin developing big data processing solutions with Dask and Python.
- Explore the features, libraries, tools, and APIs available in Dask.
- Gain an understanding of how Dask accelerates parallel computing in Python.
- Learn techniques for scaling the Python ecosystem (including NumPy, SciPy, and Pandas) using Dask.
- Optimize the Dask environment to ensure high performance when handling large datasets.
Data Analysis with Python, Pandas and Numpy
14 Hours
This instructor-led, live training in Serbia (online or onsite) is aimed at intermediate-level Python developers and data analysts who wish to enhance their skills in data analysis and manipulation using Pandas and NumPy.
By the end of this training, participants will be able to:
- Set up a development environment that includes Python, Pandas, and NumPy.
- Create a data analysis application using Pandas and NumPy.
- Perform advanced data wrangling, sorting, and filtering operations.
- Conduct aggregate operations and analyze time series data.
- Visualize data using Matplotlib and other visualization libraries.
- Debug and optimize their data analysis code.
FARM (FastAPI, React, and MongoDB) Full Stack Development
14 Hours
This instructor-led, live training (available online or onsite) targets developers who want to utilize the FARM (FastAPI, React, and MongoDB) stack to create dynamic, high-performance, and scalable web applications.
By the conclusion of this training, participants will be able to:
- Configure the necessary development environment integrating FastAPI, React, and MongoDB.
- Comprehend the key concepts, features, and benefits of the FARM stack.
- Build REST APIs using FastAPI.
- Design interactive applications with React.
- Develop, test, and deploy applications (both front end and back end) using the FARM stack.
Developing APIs with Python and FastAPI
14 Hours
This instructor-led live training in Serbia (online or onsite) is intended for developers who want to use FastAPI with Python to build, test, and deploy RESTful APIs with greater speed and ease.
By the end of this training, participants will be able to:
- Set up the required development environment to build APIs with Python and FastAPI.
- Create APIs faster and more easily using the FastAPI library.
- Learn how to create data models and schemas based on Pydantic and OpenAPI.
- Connect APIs to a database using SQLAlchemy.
- Implement security and authentication in APIs using FastAPI tools.
- Build container images and deploy web APIs to a cloud server.
Accelerating Python Pandas Workflows with Modin
14 Hours
This instructor-led, live training in Serbia (online or onsite) is aimed at data scientists and developers who wish to use Modin to build and implement parallel computations with Pandas for faster data analysis.
By the end of this training, participants will be able to:
- Set up the necessary environment to start developing Pandas workflows at scale with Modin.
- Understand the features, architecture, and advantages of Modin.
- Know the differences between Modin, Dask, and Ray.
- Perform Pandas operations faster with Modin.
- Apply the Pandas API and its functions through Modin.