Data Analysis and Visualization using Python


Overview

Python is one of the premier open-source languages: flexible, powerful, easy to learn and use, and backed by strong libraries for data manipulation and analysis. This training is a step-by-step guide to Python and statistical data analysis with extensive hands-on practice. The course is delivered with several activity problems, assignments and scenarios that help participants gain practical experience in data handling, analysis, interpretation and reporting. It starts with basic statistics such as mean, median and mode, then progresses to more advanced techniques such as group comparisons, regression, tests of relationships, classification and clustering, among others.

Target Audience

The course is useful for professionals who use data as part of their work and who need to make decisions from data analysis. Those with a prior understanding of programming and statistics will find the course easier to follow.

Learning Outcomes / Objectives

By the end of this course the participants will be able to:

Easily read and write files of various types in a Python program.
Identify and fix errors in datasets.
Work with Python ‘modules’ and use them for data analysis tasks.
Use libraries such as pandas, NumPy, Matplotlib and scikit-learn, and master concepts such as machine learning in Python, scripts and sequences.
Gain high-level skills in interpreting statistical results and writing reports.

Duration

5 days

Modules / Course Content

Module 1: Introduction

Introduction to Statistical Data Analysis

Introduction to statistical concepts
Descriptive and inferential statistics
Research design
The research/survey process

Overview of Data Science

Introduction to data science
Different sectors using data science
Purpose and components of Python

Data Analytics Overview

Data analytics process
Knowledge check
Exploratory Data Analysis (EDA)
EDA – Quantitative technique
EDA – Graphical technique
Data analytics conclusion or predictions
Data analytics communication
Data types and plotting considerations

Module 2: Statistical Analysis and Business Applications

Introduction to Statistical Data Analysis

Statistical analysis considerations
Population and sample
Statistical analysis process
Descriptive statistics – measures of centre, distribution and dispersion
Inferential statistics (correlation, regression, t-tests, chi-square, etc.)
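
As an illustration of the descriptive measures listed above, here is a minimal sketch using only Python's standard `statistics` module. The `sales` figures are made up for this example and are not course data.

```python
import statistics

# Hypothetical sample: daily sales counts (made-up data for illustration).
sales = [12, 15, 15, 18, 20, 22, 15, 30]

mean_sales = statistics.mean(sales)      # measure of centre: arithmetic mean
median_sales = statistics.median(sales)  # measure of centre: middle of the sorted data
mode_sales = statistics.mode(sales)      # measure of centre: most frequent value
stdev_sales = statistics.stdev(sales)    # measure of dispersion: sample standard deviation

print("mean:", mean_sales)
print("median:", median_sales)
print("mode:", mode_sales)
print("sample standard deviation:", round(stdev_sales, 2))
```

The mean (18.375) sits above the median (16.5) here because the single large value of 30 pulls it upward, which is the kind of observation descriptive statistics are meant to surface.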

Python Environment Setup and Essentials

Anaconda
Installation of Anaconda Python distribution
Data types with Python
Basic operators and functions

Mathematical Computing with Python (NUMPY)

What is NumPy?
NumPy vs list
Installation
NumPy arrays
Built-in methods of NumPy (arange; zeros and ones; linspace; eye; random)
Array attributes and methods (reshape; max, min, argmax, argmin; shape; dtype)
NumPy indexing and selection
Broadcasting
Indexing a 2D array (matrices)
Selection
NumPy operations (arithmetic; universal array functions)
Vectorization
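
The NumPy topics above can be sketched in a few lines. This is an illustrative example with made-up values, assuming NumPy is installed; it is not taken from the course material.

```python
import numpy as np

# Built-in array constructors
a = np.arange(6)                  # integers 0..5
zeros = np.zeros((2, 3))          # 2x3 array of 0.0
grid = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced points, both ends included
identity = np.eye(3)              # 3x3 identity matrix

# Attributes and methods
m = a.reshape(2, 3)               # same data viewed as 2 rows, 3 columns
print(m.shape, m.dtype, m.max(), m.argmax())

# Indexing a 2D array: row 1, column 2
element = m[1, 2]

# Selection with a boolean mask: keep the even values only
evens = m[m % 2 == 0]

# Broadcasting: the 1-D row is applied to every row of m
shifted = m + np.array([10, 20, 30])

# Vectorization: a universal function works elementwise, no Python loop
roots = np.sqrt(m)
```

Compared with a plain list, the array operations above run over whole blocks of data at once, which is the main practical difference the "NumPy vs list" topic refers to.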

Module 3: Scientific Computing with Python (SCIPY)

Introduction to SciPy
SciPy sub-packages – integration and optimisation
Calculating eigenvalues and eigenvectors
Using SciPy to solve a linear algebra problem
Use SciPy to define random variables for random values
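
A minimal sketch of the SciPy topics above, assuming SciPy and NumPy are installed. The function, matrix and distribution are chosen only for illustration.

```python
import numpy as np
from scipy import integrate, linalg, optimize, stats

# Integration: area under x**2 between 0 and 1 (the exact value is 1/3)
area, abs_error = integrate.quad(lambda x: x ** 2, 0.0, 1.0)

# Optimisation: minimise (x - 2)**2, whose minimum is at x = 2
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

# Linear algebra: solve A @ x = b, i.e. 3x + y = 9 and x + 2y = 8
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = linalg.solve(A, b)

# Eigenvalues and eigenvectors (eigh is used because A is symmetric)
eigenvalues, eigenvectors = linalg.eigh(A)

# Random variables: a standard normal distribution and a few draws from it
rv = stats.norm(loc=0.0, scale=1.0)
samples = rv.rvs(size=5, random_state=0)

print(area, result.x, solution, eigenvalues, rv.cdf(0.0))
```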

Data Manipulation with PANDAS

Introduction to Pandas
DataFrame in Pandas
Viewing and opening data
Dealing with missing values
Data operations
Reading and writing files
Pandas SQL operation
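
The pandas topics above might look like the following in practice. This is a sketch under stated assumptions: the CSV content is made up, an in-memory buffer stands in for a file on disk, and `groupby` is used here as one SQL-like operation (the course may cover others).

```python
import io

import pandas as pd

# Made-up CSV text standing in for a file on disk.
csv_text = """region,product,units
North,A,10
North,B,
South,A,7
South,B,3
"""

# Reading data into a DataFrame
df = pd.read_csv(io.StringIO(csv_text))

# Viewing the first rows and counting missing values
print(df.head())
missing = int(df["units"].isna().sum())

# Dealing with missing values: fill with the column mean
df["units"] = df["units"].fillna(df["units"].mean())

# SQL-like operation: the equivalent of SELECT region, SUM(units) ... GROUP BY region
totals = df.groupby("region")["units"].sum()

# Writing the cleaned data back out as CSV
out = io.StringIO()
df.to_csv(out, index=False)
```

Filling with the mean is only one option; dropping incomplete rows with `dropna()` is another, and the right choice depends on the dataset.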

Machine Learning with SCIKIT-LEARN

Introduction to machine learning
Understanding data sets and feature extraction
Problem types and learning models
How to train, test and optimise models
Considerations for supervised learning models
Scikit-Learn
Supervised learning models – Linear regression, logistic regression
Unsupervised learning models
Pipeline
Model persistence and evaluation
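
To make the train, test, pipeline and persistence steps above concrete, here is a minimal sketch assuming scikit-learn is installed. The iris dataset, the scaler and the split parameters are choices made for this example, not necessarily those used in the course.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in dataset: features X and class labels y
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Pipeline: scale the features, then fit a logistic regression classifier
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Evaluation: mean accuracy on the held-out rows
accuracy = model.score(X_test, y_test)
print("test accuracy:", accuracy)

# Persistence: serialise the fitted pipeline and restore it
restored = pickle.loads(pickle.dumps(model))
```

Putting the scaler inside the pipeline means it is fitted on the training rows only, which avoids leaking information from the test set.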

Module 4: Natural Language Processing with SCIKIT-LEARN

Overview of Natural Language Processing
Applications of Natural Language Processing
Libraries – Scikit-Learn
Extraction considerations
Scikit-Learn – model training and grid search

Data Visualisation in Python Using MATPLOTLIB

Introduction to data visualisation
Line properties
(x, y) plot and subplots
Types of plots
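
A short sketch of the plotting topics above, assuming Matplotlib is installed. The data are made up, and the figure is written to an in-memory buffer so the example runs without a display; in practice `fig.savefig("some_name.png")` or `plt.show()` would be more typical.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Made-up data for illustration
x = [0, 1, 2, 3, 4]
y = [value ** 2 for value in x]

# Two subplots side by side
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

# (x, y) line plot with explicit line properties
(line,) = ax_line.plot(x, y, color="tab:blue", linestyle="--", linewidth=2, marker="o")
ax_line.set_xlabel("x")
ax_line.set_ylabel("x squared")
ax_line.set_title("Line plot")

# A different plot type drawn from the same data
ax_bar.bar(x, y)
ax_bar.set_title("Bar plot")

fig.tight_layout()

# Render the figure as PNG bytes
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```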

Module 5: Web Scraping with Beautiful Soup

Web scraping and parsing
Knowledge check
Understanding and searching the tree
Navigating options and modification options of a tree
Parsing and printing documents
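
The tree operations above can be sketched as follows, assuming the `beautifulsoup4` package is installed. The HTML snippet is made up and parsed from a string, so no website is contacted.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded page.
html = """<html><head><title>Courses</title></head>
<body>
<ul>
<li class="course"><a href="/python">Python</a></li>
<li class="course"><a href="/stats">Statistics</a></li>
</ul>
</body></html>"""

# Parsing: build the tree with Python's built-in HTML parser
soup = BeautifulSoup(html, "html.parser")

# Searching the tree
title = soup.title.string
links = [a["href"] for a in soup.find_all("a")]

# Navigating: from the first list item to its next sibling item
first_item = soup.find("li", class_="course")
next_item = first_item.find_next_sibling("li")

# Modifying: replace the link text of the first item
first_item.a.string = "Python for Data Analysis"

# Printing the document with indentation
print(soup.prettify())
```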

Integration with Hadoop MapReduce and Spark

Big data solutions in Python
Big Data and Hadoop
Hadoop core components
Python integration with HDFS using Hadoop streaming
Using Hadoop streaming for calculating word count
Python Integration with Spark using PySpark
Using PySpark to determine word count
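
The word-count exercise above follows a map, sort, reduce pattern. The sketch below imitates that pattern locally in plain Python so it runs without a cluster; the function names and the sample text are hypothetical. With Hadoop streaming, the mapper and reducer would instead be separate scripts reading from standard input, and in PySpark the same idea is usually written as a `flatMap`, `map`, `reduceByKey` chain on an RDD.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.lower().split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts for each word. Records must arrive sorted by word,
    which Hadoop guarantees between the map and reduce phases."""
    parsed = (record.split("\t") for record in sorted_records)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        yield word, sum(int(count) for _, count in group)


# Made-up input lines; sorted() stands in for Hadoop's shuffle-and-sort step
text = ["big data big ideas", "data pipelines"]
counts = dict(reducer(sorted(mapper(text))))
print(counts)
```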

Training Methodology

The course will employ a hands-on, practical approach to ensure participants develop both conceptual understanding and technical proficiency. Each module will integrate interactive lectures, guided software demonstrations, and individual or group exercises based on real-world illustrations. Participants will receive continuous feedback and personalized coaching to reinforce learning. By the end of the training, they will have completed a mini project that demonstrates their ability to apply the acquired skills in a practical context.

More Details

Upon successful completion of this course, participants will be issued a certificate.

Month	Dates	Kenya	Rwanda	Nigeria
Jan	26th - 30th Jan, 2026	Register	Register	Register
Feb	23rd - 27th Feb, 2026	Register	Register	Register
Mar	16th - 20th Mar, 2026	Register	Register	Register
Apr	20th - 24th Apr, 2026	Register	Register	Register
May	25th - 29th May, 2026	Register	Register	Register
Jun	22nd - 26th Jun, 2026	Register	Register	Register
Jul	27th - 31st Jul, 2026	Register	Register	Register
Aug	24th - 28th Aug, 2026	Register	Register	Register
Sep	21st - 25th Sep, 2026	Register	Register	Register
Oct	26th - 30th Oct, 2026	Register	Register	Register
Nov	23rd - 27th Nov, 2026	Register	Register	Register
Dec	14th - 18th Dec, 2026	Register	Register	Register