Data Analysis and Visualization using Python

Overview

Python is a flexible, powerful open-source language that is easy to learn and use, and it has strong libraries for data manipulation and analysis. This training is a step-by-step guide to Python and statistical data analysis with extensive hands-on practice. The course is delivered through activity problems, assignments and scenarios that help participants gain practical experience in data handling, analysis, interpretation and reporting. It starts with basic statistics such as the mean, median and mode and progresses to more advanced techniques such as group comparisons, regression, tests of relationships, classification and clustering.

Target Audience

The course is useful for professionals who use data as part of their work and who need to make decisions based on data analysis. Those with a prior understanding of programming and statistics will find the course easier to follow.

Learning Outcomes / Objectives

By the end of this course the participants will be able to:
  • Easily read and write files of various types in a Python program.
  • Identify and fix errors in datasets.
  • Work with Python ‘modules’ and use them for data analysis tasks.
  • Use libraries such as pandas, NumPy, Matplotlib and scikit-learn, and master concepts such as Python machine learning, scripts, and sequences.
  • Gain high-level skills in interpreting statistical results and writing reports.

Duration

5 days

Modules / Course Content

Module 1: Introduction
Introduction to Statistical Data Analysis
  • Introduction to statistical concepts
  • Descriptive and inferential statistics
  • Research design
  • The research/survey process
Overview of Data Science
  • Introduction to data science
  • Different sectors using data science
  • Purpose and components of Python
Data Analytics Overview
  • Data analytics process
  • Knowledge check
  • Exploratory Data Analysis (EDA)
  • EDA – Quantitative technique
  • EDA – Graphical technique
  • Data analytics conclusion or predictions
  • Data analytics communication
  • Data types and plotting considerations
Module 2: Statistical Analysis and Business Applications
Introduction to Statistical Data Analysis
  • Statistical analysis considerations
  • Population and sample
  • Statistical analysis process
  • Descriptive statistics – measures of central tendency, distribution and dispersion
  • Inferential statistics (correlation, regression, t-tests, chi-square, etc.)
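
As a rough illustration of the descriptive measures listed above, the following sketch uses Python's standard `statistics` module. The exam scores are made-up data, and the variable names are chosen only for this example.

```python
import statistics

# Hypothetical sample: exam scores for nine participants (made-up data).
scores = [62, 71, 71, 75, 78, 80, 84, 90, 95]

mean_score = statistics.mean(scores)      # arithmetic mean (measure of centre)
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequent value
stdev_score = statistics.stdev(scores)    # sample standard deviation (dispersion)

print("mean:", round(mean_score, 2))
print("median:", median_score)
print("mode:", mode_score)
print("standard deviation:", round(stdev_score, 2))
```

The mean, median and mode describe the centre of the data, while the standard deviation describes how widely the values are spread around it.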
Python Environment Setup and Essentials
  • Anaconda
  • Installation of Anaconda Python distribution
  • Data types with Python
  • Basic operators and functions
Mathematical Computing with Python (NUMPY)
  • What is NumPy?
  • NumPy arrays vs Python lists
  • Installation
  • NumPy arrays
  • Built-in methods of NumPy (arange; zeros and ones; linspace; eye; random)
  • Array attributes and methods (reshape; max, min, argmax, argmin; shape; dtype)
  • NumPy indexing and selection
  • Broadcasting
  • Indexing a 2D array (matrices)
  • Selection
  • NumPy operations (arithmetic; universal array functions)
  • Vectorization
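
The NumPy topics above can be sketched in a few lines. This is a minimal example with arbitrary values, intended only to show what array creation, attributes, 2D indexing, broadcasting and universal functions look like in practice.

```python
import numpy as np

# Array creation with built-in helpers
a = np.arange(6)                  # integers 0..5
m = a.reshape(2, 3)               # same data viewed as a 2x3 matrix
zeros = np.zeros((2, 3))          # 2x3 array of 0.0
grid = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced points from 0 to 1
identity = np.eye(3)              # 3x3 identity matrix

# Attributes and methods
print(m.shape, m.dtype)
print(m.max(), m.argmax())        # largest value and its flat index

# Indexing and selection on a 2D array
print(m[1, 2])                    # element in row 1, column 2
print(m[:, 1])                    # every row, column 1
print(m[m > 2])                   # boolean selection

# Broadcasting: the length-3 vector is added to each row of the 2x3 matrix
shifted = m + np.array([10, 20, 30])

# Universal function applied element-wise (vectorized, no Python loop)
roots = np.sqrt(m)
```

Vectorized expressions such as `m + ...` and `np.sqrt(m)` operate on whole arrays at once, which is the main practical difference from looping over a Python list.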
Module 3: Scientific Computing with Python (SCIPY)
  • Introduction to SciPy
  • SciPy subpackages – integration and optimisation
  • Calculating eigenvalues and eigenvectors
  • Using SciPy to solve a linear algebra problem
  • Using SciPy to define random variables and draw random values
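
The SciPy topics above might look like the following in code. This is an illustrative sketch: the small linear system, the integrand and the function being minimised are arbitrary choices made for the example.

```python
import numpy as np
from scipy import integrate, linalg, optimize, stats

# Linear algebra: solve 3x + y = 9 and x + 2y = 8
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

# Eigenvalues and eigenvectors of the symmetric matrix A
eigenvalues, eigenvectors = linalg.eigh(A)

# Integration: area under sin(x) between 0 and pi
area, abs_error = integrate.quad(np.sin, 0, np.pi)

# Optimisation: minimise (t - 2)^2 over a single variable
result = optimize.minimize_scalar(lambda t: (t - 2.0) ** 2)

# Random variables: a standard normal distribution
normal = stats.norm(loc=0, scale=1)
samples = normal.rvs(size=5, random_state=0)  # five reproducible draws

print(x, eigenvalues, area, result.x, normal.cdf(0))
```

`linalg.eigh` is used here because the example matrix is symmetric; for a general square matrix `linalg.eig` is the usual choice.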
Data Manipulation with PANDAS
  • Introduction to Pandas
  • DataFrame in Pandas
  • Viewing and opening data
  • Dealing with missing values
  • Data operations
  • Reading and writing files
  • SQL-style operations in Pandas
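
A short sketch of the Pandas workflow outlined above: reading data, inspecting it, handling missing values, and combining tables in a SQL-like way. The CSV content, column names and manager names are hypothetical and exist only for this illustration.

```python
import io

import pandas as pd

# Hypothetical CSV data with two missing cells (made up for this example).
csv_text = """region,units,price
North,10,2.5
South,,3.0
North,7,2.5
South,4,
"""

# Reading: a file path would normally be passed; StringIO keeps this self-contained.
df = pd.read_csv(io.StringIO(csv_text))
print(df.head())

# Dealing with missing values
missing_counts = df.isna().sum()
clean = df.fillna({"units": 0, "price": df["price"].mean()})

# Data operations: a derived column and a grouped aggregate
clean["revenue"] = clean["units"] * clean["price"]
by_region = clean.groupby("region")["revenue"].sum()

# SQL-style left join against a second table
managers = pd.DataFrame({"region": ["North", "South"], "manager": ["Ada", "Ben"]})
joined = clean.merge(managers, on="region", how="left")

# Writing: export the summary as CSV text
print(by_region.to_csv())
```

Filling missing prices with the column mean is only one possible strategy; dropping incomplete rows with `dropna()` is another common choice, depending on the analysis.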
Machine Learning with SCIKIT-LEARN
  • Introduction to machine learning
  • Understanding data sets and feature extraction
  • Problem types and learning models
  • How to train, test and optimise models
  • Considerations for supervised learning models
  • Scikit-Learn
  • Supervised learning models – Linear regression, logistic regression
  • Unsupervised learning models
  • Pipeline
  • Model persistence and evaluation
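
The train, test and persist cycle listed above can be sketched as follows. This is a minimal example that assumes scikit-learn's bundled iris data set and a logistic regression classifier; the split ratio and step names are arbitrary choices for illustration.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in data set: features X and class labels y
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Pipeline: scale the features, then fit a supervised classifier
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=200)),
])
model.fit(X_train, y_train)

# Evaluation: mean accuracy on the held-out data
accuracy = model.score(X_test, y_test)
print("test accuracy:", round(accuracy, 3))

# Persistence: serialise the fitted pipeline and load it back
restored = pickle.loads(pickle.dumps(model))
```

Wrapping the scaler and the classifier in one pipeline means the scaling learned from the training data is applied to the test data automatically.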
Module 4: Natural Language Processing with SCIKIT-LEARN
  • Overview of Natural Language Processing
  • Applications of Natural Language Processing
  • Libraries – scikit-learn
  • Extraction considerations
  • Scikit-Learn – model training and grid search
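
As one possible illustration of text feature extraction combined with model training and grid search, the sketch below classifies a tiny made-up corpus. The sentences, labels and parameter grid are hypothetical, and a real exercise would need far more data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus (made-up sentences); 1 = positive, 0 = negative.
texts = [
    "great course and clear examples",
    "excellent trainer and useful exercises",
    "really enjoyed the practical sessions",
    "helpful material and great pace",
    "boring lectures and poor slides",
    "confusing content and bad examples",
    "poor audio and wasted time",
    "terrible pace and unclear exercises",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Feature extraction (TF-IDF) followed by a classifier
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Grid search over n-gram range and regularisation strength, 2-fold cross-validation
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(texts, labels)

print(search.best_params_)
prediction = search.predict(["clear and useful examples"])
print(prediction)
```

The double-underscore names in `param_grid` follow scikit-learn's convention for addressing a parameter of a named pipeline step.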
Data Visualisation in Python Using MATPLOTLIB
  • Introduction to data visualisation
  • Line properties
  • (x, y) plot and subplots
  • Types of plots
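
The plotting topics above can be sketched with a two-panel figure. The data values and labels are arbitrary, and the non-interactive `Agg` backend is an assumption made so the script runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# One figure with two subplots side by side
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

# (x, y) line plot with explicit line properties
ax_line.plot(x, np.sin(x), color="tab:blue", linestyle="--", linewidth=2, label="sin(x)")
ax_line.set_xlabel("x")
ax_line.set_ylabel("sin(x)")
ax_line.set_title("Line plot")
ax_line.legend()

# A different plot type: bar chart of made-up category counts
ax_bar.bar(["A", "B", "C"], [3, 5, 2])
ax_bar.set_title("Bar chart")

fig.tight_layout()

# Save as PNG into an in-memory buffer; a file name could be passed instead
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

In an interactive session or notebook, the backend line can be dropped and `plt.show()` used to display the figure.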
Module 5: Web Scraping with Beautiful Soup
  • Web scraping and parsing
  • Knowledge check
  • Understanding and searching the tree
  • Navigation and modification options for a tree
  • Parsing and printing documents
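
A brief sketch of parsing, searching, navigating and modifying a document tree with Beautiful Soup. The HTML snippet is made up for this example, so no live web page is fetched.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML document (made up for illustration)
html = """<html><head><title>Course list</title></head>
<body><ul id="courses">
<li class="course"><a href="/python">Python</a></li>
<li class="course"><a href="/r">R</a></li>
</ul></body></html>"""

# Parsing with Python's built-in HTML parser
soup = BeautifulSoup(html, "html.parser")

# Searching the tree
title = soup.title.string
links = [a["href"] for a in soup.find_all("a")]

# Navigating: from the first list item up to its parent element
first_item = soup.find("li", class_="course")
parent_id = first_item.parent["id"]

# Modifying: replace the text of the first link
first_item.a.string = "Python for Data Analysis"

# Printing the document in an indented form
print(soup.prettify())
```

When scraping a real site, the HTML would first be downloaded with an HTTP client, and the site's terms of use should be checked beforehand.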
Integration with Hadoop MapReduce and Spark
  • Big data solutions in Python
  • Big Data and Hadoop
  • Hadoop core components
  • Python integration with HDFS using Hadoop streaming
  • Using Hadoop streaming for calculating word count
  • Python Integration with Spark using PySpark
  • Using PySpark to determine word count
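
The word-count exercise follows a map, sort and reduce pattern. The sketch below imitates that pattern in plain Python, without a Hadoop or Spark installation: the function names `mapper` and `reducer` and the sample lines are hypothetical, and `sorted()` stands in for the shuffle-and-sort step that Hadoop performs between the two phases.

```python
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.lower().split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts of consecutive records that share the same word."""
    pairs = (record.split("\t") for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield word, sum(int(count) for _, count in group)


# Made-up input lines standing in for a text file stored in HDFS
text = ["big data with python", "python with hadoop", "big python"]

# sorted() plays the role of the shuffle-and-sort phase
counts = dict(reducer(sorted(mapper(text))))
print(counts)
```

With Hadoop streaming, the mapper and reducer would typically be separate scripts reading from standard input and writing to standard output. In PySpark, the same idea is usually expressed with the RDD methods `flatMap`, `map` and `reduceByKey`.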

Training Methodology

The course will employ a hands-on, practical approach to ensure participants develop both conceptual understanding and technical proficiency. Each module will integrate interactive lectures, guided software demonstrations, and individual or group exercises based on real-world illustrations. Participants will receive continuous feedback and personalized coaching to reinforce learning. By the end of the training, they will have completed a mini project that demonstrates their ability to apply the acquired skills in a practical context.

More Details

Upon successful completion of this course, participants will be issued a certificate.

Registration

Registration as an individual (Onsite course delivery)
Click on the Register button aligned with your course dates and venue from the table provided.

    Registration as an individual (Online course delivery)
    Click the NEXT button (below ↓) to view dates and/or register for this course in online instructor-led delivery mode.

    Available Online Course Dates

    • January 2026: 12 – 16 Jan
    • February 2026: 9 – 13 Feb
    • March 2026: 9 – 13 Mar
    • April 2026: 6 – 10 Apr
    • May 2026: 11 – 15 May
    • June 2026: 8 – 12 Jun
    • July 2026: 13 – 17 Jul
    • August 2026: 10 – 14 Aug
    • September 2026: 21 – 25 Sep
    • October 2026: 12 – 16 Oct
    • November 2026: 9 – 13 Nov
    • December 2026: 14 – 18 Dec

    Group Registration

      Registration as a group (either onsite or online course delivery modes)
      Click the NEXT button (below ↓) to register a group for this course.
