CAC Workshops Listing

Unlock Your Research Potential: Empowering Minds through Training and Events

Explore our Workshop Listing for a diverse range of training sessions tailored to meet your team’s needs. From foundational command line skills to cutting-edge topics like parallel programming, machine learning, and data analytics, our experts provide comprehensive training to enhance your research capabilities.

Listing

Introduction to High-Performance Computing (HPC)

This course provides a comprehensive introduction to the CAC’s high-performance compute (HPC) platform, Frontenac, and the other resources available throughout the national compute ecosystem of the Digital Research Alliance of Canada. By the end of this workshop, you will know how to use the UNIX command line to operate a computer, connect to a cluster, write simple shell scripts, submit, and manage jobs on a cluster using a scheduler, transfer files, and use software through environment modules.

Introduction to Linux

This course provides you with a foundational understanding of the Linux command line interface. Through a combination of theoretical knowledge and hands-on practical exercises, you will learn essential concepts such as file navigation, command-line operations, file permissions, text editing, and basic scripting. By the end of the course, you will have the necessary skills to confidently work in a Linux environment and leverage its power for various computing tasks.

Introduction to REDCap

REDCap is a web-based data capture tool used to create, manage, and deploy research databases and surveys. It has built-in functionalities for data importing and exporting, quality checking, reporting, and basic statistics summarization. This course will include an overview of the REDCap tool, a demonstration of core features, and hands-on exercises. Typically, this course is delivered in 2 parts, 3 hours each (6 hours total). In the first session, we will introduce the platform and explain the basic workflow for developing a REDCap project with multiple forms. In the second session, we will present a case study where you will build your own project and deploy it as a survey study.

Introduction to Python Programming

This course is designed to provide you with a solid foundation in Python programming language. Through a comprehensive curriculum and hands-on coding exercises, participants will learn the fundamentals of Python syntax, data types, functions, and file handling. By the end of the course, you will have gained the essential skills to write Python programs, solve problems, and build the foundation for more advanced Python development. Whether you are a beginner or have some programming experience, this course will equip you with the necessary tools to start your journey in Python programming.

Introduction to Parallel Programming with Python using MPI

Parallel programming is key in high-performance computing. It allows us to run big jobs in a timely manner and leverage a cluster’s resources. In this workshop, you will learn about parallelization, how to write parallel programs, and run them on a parallel system. MPI (Message Passing Interface) will be used in combination with Python.

Text Mining

Text mining is the process of extracting meaning, patterns, and trends from unstructured textual data. Massive amounts of unstructured text are prevalent in research. Traditional machine learning algorithms handle only numerical or categorical data. Existing data analytical platforms provide special components to facilitate the analysis of textual data. This workshop introduces the topic of text mining and provides a tour with hands-on exercises and demonstrations of four text mining tools, each of which supports an interesting and diverse set of features.

Data Preparation

This course provides you with essential knowledge and skills to effectively prepare data for analysis. Starting with an overview of the Data Analytics pipeline and processes, the course explores various statistical and visualization techniques used in Exploratory and Descriptive Analytics to understand historical data. You will then delve into the art of Data Preparation, gaining expertise in data cleaning, handling missing values, detecting, and handling outliers, as well as transforming and engineering features. By the end of the course, you will be equipped with the necessary tools to ensure data quality and integrity, enabling you to make informed decisions and derive valuable insights from their data.

Virtual Assistants on the Cloud (Chatbots)

Virtual assistants, also known as Chatbots, have grown in popularity over the past couple of years. Technology advancements and Artificial Intelligence accomplishments have simplified the seemingly daunting task of creating a virtual assistant. This workshop is an introduction to virtual assistants and their capabilities, using the IBM Watson Assistant platform. Typically, this course is delivered in 2 parts, 3 hours each (6 hours total). In the first session, we will focus on the basics of creating a virtual assistant. In the second session, we will explore advanced features and integrate Watson Assistant with other services using Python. For the second session, it is recommended that you are familiar with the basics of Python programming.

Machine Learning

This interactive course is an introduction to the world of Machine Learning (ML). It discovers some supervised learning algorithms and discusses when and how to use them. It begins by introducing the data pipeline and its processes, before moving on to statistical and visualization approaches to conduct exploratory and descriptive analytics on data to answer the question “what happened in the past?” From there, you will explore the art of data preparation, including data cleaning, missing values, outlier detection, and feature transformation and engineering. Next, we will introduce predictive analytics to answer the question “What will happen in the future?” We will cover techniques for classifying and predicting data for the supervised learning algorithm, such as k-NN, Naïve Bayes, Decision Tree, and Random Forest, and provide guidance in deciding which ones to use. Finally, you will learn about statistical evaluation methods used in comparing the performance of predictive modelling techniques. This course balances theory and practice. You will use practical concepts of ML applications to understand real-world situations. Topics include Data preparation, ML theory, ML process, ML algorithms, and Model evaluation. Typically, this course is delivered in 2 parts, 3 hours each (6 hours total), but can be expanded to 8 hours total to dive deeper into the material.

Introduction to Spark

This introductory course on Apache Spark offers you a comprehensive understanding of this highly popular project within the Hadoop ecosystem. You will gain insights into the Spark environment, its underlying model, and core data abstractions. Additionally, you will be introduced to the Spark SQL API and Spark ML Library API, equipping you with the necessary skills to leverage the power of Apache Spark for data processing, analytics, and machine learning tasks. By the end of the course, you will be empowered to utilize Apache Spark effectively to extract valuable insights from large datasets and drive data-driven decision-making. It is recommended that you are familiar with Python programming, as well as data preparation and ML concepts.

Introduction to Bioinformatics

This introductory course offers you a comprehensive introduction to the field of bioinformatics and its applications in analyzing biological big data. You will gain experience utilizing bioinformatics tools and techniques within an HPC environment, including the CAC’s HPC platform, Frontenac, and the other resources available throughout the national compute ecosystem of the Digital Research Alliance of Canada. By the course’s end, you will have the necessary skills to effectively navigate and analyze biological data using bioinformatics approaches, enabling you to uncover meaningful insights and contribute to advancements in life sciences.

Our Security Standards

Safeguard your research data with our enhanced data centre security standards. We understand the critical importance of protecting your valuable research data. Our state-of-the-art facility is equipped with robust security measures, including advanced access controls, encryption protocols, 24/7 monitoring, physical security systems, and redundant power and cooling infrastructure. With our stringent security standards, you can rest assured that your research data is protected. Focus on your groundbreaking research while we ensure a high level of security for your valuable data.

CAC Support

We are dedicated to enabling your success through innovative computational resources and support. With a focus on user support, training, and consultation services, the CAC ensures you have the necessary tools and expertise to maximize the potential of advanced computing. Whether it is providing technical assistance, delivering training programs, or offering expert consultation, the CAC plays a crucial role in helping you harness the power of computing and data analytics for your innovative projects and discoveries.

Updated Frontenac HPC Cluster - click here for more information