Machine Learning with Python

Course Syllabus – Machine Learning with Python

1. Python Fundamentals:

 

  • Python Basics: Take your first steps in the world of Python. Discover the different data types and create your first variable.
  • Python Lists: Get to know the first way to store many different data points under a single name. Create, subset and manipulate lists in all sorts of ways.
  • Functions and Packages: Learn how to get the most out of other people’s efforts by importing Python packages and calling functions.
  • NumPy: Write fast code with NumPy (Numerical Python), a package for efficiently storing and doing calculations with large amounts of data.
  • Matplotlib: Create different types of visualizations depending on the message you want to convey. Learn how to build complex and customized plots based on real data.
  • Control Flow and Pandas: Write conditional constructs to tweak the execution of your scripts, and get to know the pandas DataFrame, the key data structure for data science in Python.
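
To give a feel for the first few topics, here is a minimal sketch of variables, lists, an imported package and control flow. It uses only the standard library, and all names and values (`savings`, `areas`, and so on) are made up for illustration rather than taken from the course material.

```python
import math

# Variables of different built-in types (illustrative values).
savings = 100                 # int
growth_rate = 1.1             # float
label = "compound interest"   # str

# A list stores many data points under a single name.
areas = [11.25, 18.0, 20.0, 10.75, 9.5]

# Subsetting: indexing is zero-based, and negative slices count from the end.
first_area = areas[0]
last_two = areas[-2:]

# Manipulating: replace one element and append another.
areas[1] = 18.5
areas.append(24.5)

# Calling a function or constant from an imported package.
circle_area = math.pi * 2 ** 2

# Control flow: a conditional inside a loop keeps only the larger areas.
large = []
for area in areas:
    if area > 15:
        large.append(area)
```

NumPy arrays and pandas DataFrames build on the same ideas (indexing, slicing, filtering) but apply them to whole columns of data at once.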

2. Probability and Statistical Methods:

Introduction to random variables, probability theory, conditional probability, Bayes' Theorem.

  • Central tendencies (Mean, Median, Mode); Measures of spread (Range, Variance, Standard Deviation); Basics of Probability Distributions; Expectation and Variance of a variable.
  • Discrete probability distributions: Geometric, Poisson.
  • Continuous probability distributions: Exponential, Normal distribution; t-distribution
  • Central Limit Theorem; Sampling distributions; Confidence Intervals, Hypothesis Testing.
  • Statistical hypothesis testing methods such as the chi-square test, t-test, z-test, F-test and ANOVA.
  • Covariance and Correlation.
  • Hands-on implementation of each of these methods will be conducted in R.
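
Several of the quantities listed above can also be computed in a few lines of Python. The following is a minimal sketch using the standard `statistics` module; the sample values and the helper name `bayes_posterior` are assumptions chosen for illustration.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

# Central tendencies.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of spread (population versions).
value_range = max(data) - min(data)
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)


def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A | B) from P(A), P(B | A) and P(B | not A), via Bayes' Theorem."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical screening test: 1% prevalence, 90% sensitivity, 5% false positives.
posterior = bayes_posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
```

With these hypothetical numbers the posterior is only about 15%, which is the usual illustration of why conditional probabilities are easy to misjudge when the prior is small.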

3. Statistics and Probability in Decision Modelling:

  • Linear Regression and Logistic Regression: two powerful techniques used to solve prediction and classification problems.
  • A very brief math refresher on calculus and gradient descent, and on how it arrives at an optimal or suboptimal solution.
  • Relationship between multiple variables: Regression (Linear, Multivariate Linear Regression) in prediction. 
  • Least squares method. 
  • Identifying significant features, feature reduction using AIC, multi-collinearity check, observing influential points, etc. 
  • Checking and validating linear fit, model assumptions and taking actions.
  • Hands-on R session on linear and logistic regression.
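
The least squares method and gradient descent mentioned above can be compared on a one-variable linear regression. This is a minimal pure-Python sketch, not the course's own implementation; the function names, learning rate, step count and toy data are assumptions.

```python
def fit_least_squares(xs, ys):
    """Closed-form least squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def fit_gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Minimise the mean squared error by repeated gradient steps."""
    n = len(xs)
    slope, intercept = 0.0, 0.0
    for _ in range(steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Toy data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

ls_slope, ls_intercept = fit_least_squares(xs, ys)
gd_slope, gd_intercept = fit_gradient_descent(xs, ys)
```

On this data both approaches recover a slope near 2 and an intercept near 1. If the learning rate is too large, gradient descent diverges instead, which is one reason the closed-form solution is preferred when it is available.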

4. Algorithms in Machine Learning:

Unsupervised learning:

  • Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behaviour. (K-Means)
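
As a rough illustration of K-Means on the customer-grouping example, the sketch below alternates between assigning points to the nearest centroid and recomputing centroids. The data (visits per month, average spend) is invented, and initialising from the first `k` points is a simplifying assumption; practical implementations usually use random or k-means++ initialisation.

```python
def k_means(points, k, iterations=20):
    """Cluster 2-D points; returns the centroids and the last cluster assignment."""
    centroids = list(points[:k])  # naive initialisation (assumption)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append((
                    sum(p[0] for p in cluster) / len(cluster),
                    sum(p[1] for p in cluster) / len(cluster),
                ))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical customers: (visits per month, average spend).
customers = [(1, 10), (2, 12), (1.5, 11), (8, 50), (9, 52), (8.5, 51)]
centroids, clusters = k_means(customers, k=2)
```

The two resulting groups correspond to occasional low spenders and frequent high spenders.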

Supervised learning:

  • Decision trees.
  • Support vector machines
  • Random Forest 
  • Ensemble modelling 
  • Bagging and boosting, and their impact on bias and variance
  • AdaBoost
  • XGBoost
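
To make the bagging idea concrete, here is a minimal sketch that trains several one-threshold classifiers (decision stumps) on bootstrap samples and combines them by majority vote. It covers bagging only, not boosting; all function names, the toy data and the seed are assumptions for illustration.

```python
import random


def fit_stump(samples):
    """Pick the threshold with the fewest training errors; predict 1 when x > threshold."""
    best_threshold, best_errors = None, None
    for threshold, _ in samples:
        errors = sum((x > threshold) != (label == 1) for x, label in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def fit_bagged_stumps(samples, n_models=25, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    thresholds = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]
        thresholds.append(fit_stump(bootstrap))
    return thresholds


def predict(thresholds, x):
    """Majority vote over all stumps."""
    votes = sum(x > t for t in thresholds)
    return 1 if votes > len(thresholds) / 2 else 0


# Toy 1-D data: values 1 to 5 belong to class 0, values 6 to 10 to class 1.
data = [(x, 0) for x in range(1, 6)] + [(x, 1) for x in range(6, 11)]
model = fit_bagged_stumps(data)
```

Individual stumps vary with the bootstrap sample they see, but averaging their votes smooths out that variation, which is the variance reduction that bagging aims for.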

5. Text Mining and Natural Language Processing:

Introduction to the fundamentals of information retrieval; language modeling

  • n-gram models of language 
  • Smoothing 
  • Probabilistic language models 
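
The three topics above come together in a smoothed bigram model. The sketch below estimates bigram probabilities with add-one (Laplace) smoothing, which is just one of several smoothing schemes; the corpus and function names are made up for illustration.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams, with sentence boundary markers."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_probability(prev, word, unigrams, bigrams):
    """P(word | prev) with add-one smoothing over the vocabulary."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


corpus = ["the cat sat", "the dog sat", "the cat ran"]
unigrams, bigrams = train_bigram_model(corpus)

p_seen = bigram_probability("the", "cat", unigrams, bigrams)    # bigram occurs twice
p_unseen = bigram_probability("the", "ran", unigrams, bigrams)  # bigram never occurs
```

Without smoothing, the unseen bigram would get probability zero and so would any sentence containing it; add-one smoothing gives it a small non-zero share instead.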

Feature engineering: 

  • TF and IDF 
  • Bag-of-words (BoW) technique, word2vec.
  • Thinking about the math behind text; Properties of words; Vector Space Model 
  • Evaluation Metrics for Ranking 
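
For the bag-of-words and TF-IDF items, a minimal sketch is shown below. It uses raw term frequency divided by document length and an unsmoothed logarithmic IDF, which is one common variant among several; the documents and the function name `tf_idf` are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dictionary per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # the bag-of-words representation
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


docs = ["the cat sat", "the dog barked", "the cat purred"]
vectors = tf_idf(docs)
```

A word that appears in every document, such as "the" here, gets a weight of zero, while a word unique to one document gets the highest weight in that document.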

Natural Language Processing 

  • Stemming, Phrase identification, word sense disambiguation 
  • POS tagging 
  • Parsing and semantic structures 
  • Coreference resolution 

Topic Modelling using LDA
