Data Science Machine Learning

Data Science - Enterprise Edition

Multiple added modules for advanced concepts

Comprehensive training on software foundation skills and soft skills is included in the syllabus package at no additional charge. Refer to the course menu.

 

 

 

Machine Learning and Deep Learning with Python programming, demonstrated through real-time projects across multiple business domains

 

Module 1 – Data Engineering Architecture

 

This understanding is a must for professionals who want to showcase experience in Data Engineering. Data Science is just one part of the Data Engineering ecosystem. You will be trained on high-level designs created for MNCs.

 

Syllabus:

 

  • Data architecture principles for building a data platform

  • Project methodologies, deliverables and designs

  • Data Integration concepts. ER & Dimensional modelling.

  • OLTP & OLAP: Reports, Dimensions, Facts, Star & Snowflake Schemas

  • Slowly Changing Dimensions

 

Module 2 – Python programming

 

Data Science is one of the application streams of software development. There is no substitute for scaling up as a full-stack programmer.

 

Syllabus:

 

  • Programs involving VM architecture

  • Popular IDEs for Python

  • Conditions, Loops, Functions, Comprehensions & Lambda functions

  • OOPs: Objects, Classes, Constructors, Methods, Modifiers, Polymorphism, Overloading, Overriding, Abstract classes, Interfaces, Multi-Threading, Multi-Processing, GPUs, Packages, Exception handling, File I/O

  • OS Module & Regular Expression

  • Data Structures: Lists, Dictionaries, Tuples & Sets

  • NumPy Library: Data Handling

  • All important Libraries and their significance

  • Pandas Library: Data handling operations

  • Python to database connectivity with SQLite
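
As a small taste of the topics above, the sketch below combines a list comprehension, a lambda function and Python's built-in sqlite3 module against an in-memory database. The table name `scores` and the sample rows are invented for illustration and are not taken from the course material.

```python
import sqlite3

# Hypothetical sample data: (student, marks) pairs invented for this sketch.
rows = [("asha", 82), ("ravi", 67), ("meena", 91)]

# List comprehension: keep only the students scoring 70 or more.
passed = [name for name, marks in rows if marks >= 70]

# Lambda function as a sort key: order the rows by marks, highest first.
ranked = sorted(rows, key=lambda r: r[1], reverse=True)

# SQLite through the standard library, using an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, marks INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", rows)
average = conn.execute("SELECT AVG(marks) FROM scores").fetchone()[0]
conn.close()

print(passed, ranked[0][0], average)
```

The `?` placeholders let sqlite3 bind the values itself, which is generally safer than building SQL strings by hand.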

 

Module 3: Data Science in a nutshell

 

A broad view of Data Science for strategic business decisions

 

  • Definition of AI, DS, ML, DL and their Applications

  • Supervised, Semi-Supervised, Unsupervised and Reinforcement Learning

  • Different Roles & Responsibilities in Data Science

  • Programming languages for Data Science and their significance

  • Pros, Cons and Challenges of Data Science

  • Data Properties and processing steps

 

Module 4: Descriptive statistics & Exploratory Data Analysis

 

Before predicting or building models, this is the common process for understanding data.

 

  • Descriptive vs Inferential statistics

  • Mean, Median, Mode, Standard Deviation, Variance & Correlation

  • Pearson’s correlation coefficient

  • Outliers & IQR

  • Distributions: Uniform, Normal & Standard Normal

  • Central Tendency

  • Measures of Variability

  • Modality

  • Chebyshev’s & Markov’s theorems

  • Skewness

  • Kurtosis
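
To give a feel for the measures listed above, here is a minimal sketch using only Python's standard library. The sample values are hypothetical (40 is planted as an outlier), and the `pearson` helper is written out by hand for illustration rather than taken from any library.

```python
import math
import statistics

# Hypothetical sample invented for illustration; 40 is a deliberate outlier.
data = [2, 4, 4, 5, 7, 9, 40]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
variance = statistics.variance(data)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(data)      # sample standard deviation

# Quartiles and IQR; points beyond 1.5 * IQR from the quartiles are flagged.
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < low or x > high]


def pearson(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


r = pearson([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(mean, median, mode, outliers, r)
```

Note how the single outlier pulls the mean well above the median, which is one reason both measures are examined together.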

 

Module 5 - Data Visualization

 

Visualization is the key to interpreting data.

 

  • Libraries: matplotlib & seaborn

  • Line chart, bar chart, heatmap, scatter plot, swarm plot and regression charts; outliers

  • Use cases and projects

  • Many more chart types are covered within the individual modules

 

Other visualizations are taught alongside the individual models.

 

Module 6 - Inferential statistics

 

Inferring strategic outcomes before making business decisions. ML models build on the concepts below for implementation.

 

  • Binomial theorem

  • Probability Distributions

  • Bayesian Statistics

  • Hypothesis testing

  • Z & t tests and P-value

  • Libraries: statsmodels & scipy
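
As an illustration of hypothesis testing with a Z test and P-value, the sketch below runs a one-sample, two-sided z-test. It uses `statistics.NormalDist` from the standard library rather than statsmodels or scipy, purely to stay dependency-free. The function name `one_sample_z_test` and all the numbers are assumptions made up for this example.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, pop_std, n):
    """Return (z, two-sided p-value) for a one-sample z-test.

    Assumes the population standard deviation is known.
    """
    standard_error = pop_std / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical figures: 100 observations averaging 52, tested against a
# population mean of 50 with a known standard deviation of 10.
z, p = one_sample_z_test(sample_mean=52, pop_mean=50, pop_std=10, n=100)
print(round(z, 2), round(p, 4))
```

With these figures the p-value falls just under 0.05, so the null hypothesis would be rejected at the 5% level but not at the 1% level.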

 

Module 8 - Linear & Polynomial Regression

 

One of the most widely used supervised learning models, used to understand the linear relationship between a dependent variable and the parameters that influence it. It works on labelled data.

 

  • Machine Learning Data Types

  • Univariate analysis

  • Linear Relationships

  • Approaches: OLS (SSE), RMSE, MAPE, MAE

  • R-square & Adjusted R-Square

  • Data Sampling – Population & Samples

  • Train & Test split

  • Linear Regression Model

  • Understanding LR Coefficients

  • Approaches to build regression models

  • Interpreting results

  • Selecting influencing Independent Variables

  • Assumptions of Linear Regression

  • Overfit and Underfit models

  • Polynomial Regression – Extension of Linear Regression

  • Projects
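
The OLS fit and the error measures listed above can be sketched in a few lines of plain Python for the single-variable case. The helper names `fit_simple_ols` and `evaluate` and the toy data are hypothetical. In practice a library such as statsmodels would normally do this work.

```python
import math


def fit_simple_ols(xs, ys):
    """Least-squares slope and intercept for one independent variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def evaluate(xs, ys, slope, intercept):
    """Return SSE, RMSE, MAE and R-square for the fitted line."""
    n = len(ys)
    mean_y = sum(ys) / n
    preds = [intercept + slope * x for x in xs]
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    sst = sum((y - mean_y) ** 2 for y in ys)
    mae = sum(abs(y - p) for y, p in zip(ys, preds)) / n
    return {"SSE": sse, "RMSE": math.sqrt(sse / n), "MAE": mae, "R2": 1 - sse / sst}


# Toy data lying exactly on y = 2x + 1, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_simple_ols(xs, ys)
metrics = evaluate(xs, ys, slope, intercept)
print(slope, intercept, metrics)
```

Because the toy data is perfectly linear, the errors come out as zero and R-square as one. Real data would be split into train and test sets before these measures are compared.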

 

Module 9 - Logistic Regression

 

An essential supervised learning model for predicting a binary (binomial) outcome from the independent parameters

 

  • Linear vs Logistic Regression

  • Cheat code

  • MLE (Maximum Likelihood Estimation)

  • Plotly & graph_objs Libraries

  • Exploratory Data Analysis

  • Logit summary from Model

  • ROC curves

  • Inference

  • Projects

 

Module 10 - Decision Tree & Random Forest Models

 

Tree models can be applied to both supervised and unsupervised learning. They are also significant in feature engineering.

 

  • Introduction

  • Tree structure

  • Gini Index & Entropy

  • Tuning – CART regression

  • Graphviz library

  • Overfit & Underfit Models

  • Bias & Variance

  • Random Forest Concepts

  • Feature Engineering projects
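
The Gini Index and Entropy mentioned above are the impurity measures a tree uses to choose its splits. A minimal sketch in plain Python follows. The function names and the sample labels are made up for illustration.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: minus the sum of p * log2(p) per class."""
    n = len(labels)
    return -sum(
        (count / n) * math.log2(count / n) for count in Counter(labels).values()
    )


# Hypothetical node contents: an evenly mixed node and a pure node.
mixed = ["yes", "yes", "no", "no"]
pure = ["yes", "yes", "yes", "yes"]

print(gini(mixed), entropy(mixed))
print(gini(pure), entropy(pure))
```

Both measures are at their highest for an evenly mixed node and drop to zero for a pure one, which is why a tree prefers splits that lower them.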

 

Module 11 - Clustering Models, Hierarchical & K-Means

 

Unsupervised learning models that group unlabelled data into clusters

 

  • Unsupervised Learning

  • Content-based & Collaborative Filtering

  • Recommendation Engine

  • Ward's method & Dendrogram

  • K-Means versus Hierarchical clustering

  • Visualizing Clusters

  • Limitations of K-Means

  • Tuning: Silhouette Analysis & Elbow Method

  • Projects

 

Module 12 - SVM (Support Vector Machines)

 

A widely used model, often chosen for its accuracy on smaller datasets

 

  • Introduction to SVM

  • Hyperplane

  • C & Gamma parameters

  • Applying to multidimensional data

  • Kernel Trick

  • Tuning SVM parameters

  • Projects

 

Module 13 - Ensemble ML Algorithms: Bagging, Boosting, Voting

 

These models combine the estimates of other models and play an important supporting role

 

  • Demonstration with Project

  • Ensemble concept

  • Combining Model Predictions into an Ensemble Prediction

  • Bagging Algorithms – Decision Tree & Random Forest

  • Extra Trees Classifier

  • AdaBoost

  • Stochastic Gradient Boosting

  • Voting Ensemble

  • Model Comparison

  • Summary

  • Projects

 

Module 14 - Miscellaneous Concepts

 

  • Cross Validation, Lasso & Ridge Regression concepts

  • Chi-square test & PCA - Principal Component Analysis

  • Targeting Multicollinearity with Python

  • VIF (Variance Inflation Factor)

 
