Duration: 150 hours across 10 months (includes 10 hours of examination)
INR 2,00,000 + GST (inclusive of tax)
B.E., B.Tech., MCA, or M.Sc. (CS or IT) graduates as of the programme start date, i.e., the Technical Orientation date
Candidates should have a basic understanding of mathematical topics such as Linear Algebra, Calculus, and Probability and Statistics, and should be familiar with programming in either C or Python
Programme Coordinator - Prof. C. Chandra Sekhar
Professor, Department of Computer Science & Engineering, IITM
Prof. Sekhar’s expertise and research interests include speech recognition, neural networks, kernel methods, machine learning, deep learning and metric learning. He is the author of many research papers published in peer-reviewed national and international journals and conferences. In 2016, he received the Srimathi Marti Annapurna Gurunath Award for Excellence in Teaching from IITM.
Programme Faculty – Dr. Dileep A.D.
Associate Professor, School of Computing and Electrical Engineering, IIT Mandi
Dr. Dileep received his M.Tech. and Ph.D. in Computer Science and Engineering from IIT Madras. His research interests include pattern recognition, kernel methods of pattern analysis, machine learning for speech technology, computer vision, and cloud and telecom networks. In 2020, he received the Teaching Honour Roll Award for Excellence in Teaching for the academic year 2019-20 at IIT Mandi.
Motivation for Programme, Overview of Programme, Expected Outcomes of Programme
Function approximation (Regression), Classification, Clustering, Ranking, Information retrieval; Text processing applications - Text classification, Parts-of-speech tagging, Named entity recognition, Text summarization, Text question answering, Machine translation; Image and video processing applications - Image classification, Image annotation, Image captioning, Video classification, Video captioning, Visual question answering, Visual common sense reasoning; Speech processing applications - Speech recognition, Speaker recognition, Speech emotion recognition, Spoken language recognition, Text-to-speech synthesis, Speech-to-speech translation
Data representation - Feature extraction, Representation learning, Embeddings
Supervised learning, Unsupervised learning, Semi-supervised learning, Active learning, Self-supervised learning, Transfer learning, Domain adaptation - Zero-shot, One-shot and Few-shot learning; Federated learning
Review of basics of mathematical topics. Linear Algebra: Vectors and Matrices, Inner product of vectors, Matrix multiplication by a vector, Eigenvalues and eigenvectors of a matrix, Singular value decomposition of a matrix.
Calculus: Differentiation with one variable, Differentiation with multiple variables, Differentiation of a vector and a matrix, Unconstrained optimization problem solving.
Probability and Statistics: Random variables, First order and second order statistics, Independent variables, Uncorrelated variables, Sum and Product Laws of probability, Probability distributions, Bayes' rule.
Linear model for regression, Supervised learning, Parameter estimation - Maximum likelihood method, Overfitting, Regularization, Ridge regression, Lasso
K-nearest neighbours classifier, Bayes classifier, Normal density function, Maximum likelihood estimation, Gaussian mixture model, Naïve Bayes classifier, Decision surfaces, Dimension reduction methods - Principal component analysis, Linear discriminant analysis
McCulloch-Pitts neuron, Perceptron learning rule, Perceptron convergence theorem, Sigmoidal neuron, Softmax function, Multilayer feedforward neural network, Error backpropagation method, Gradient descent method, Stochastic gradient descent method, Stopping criteria, Logistic regression based classifier
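As an illustration of the softmax function listed above, the following is a minimal sketch in plain Python. The function name `softmax` and the sample scores are chosen here for illustration and are not taken from the programme material. Subtracting the maximum score before exponentiation is a common numerical-stability step and does not change the result.

```python
import math

def softmax(scores):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    m = max(scores)                            # shift by the max to avoid overflow
    exps = [math.exp(s - m) for s in scores]   # exponentiate the shifted scores
    total = sum(exps)
    return [e / total for e in exps]           # normalise so the outputs sum to 1

# Illustrative scores (hypothetical values): a larger score gets a larger probability.
probs = softmax([2.0, 1.0, 0.1])
```

The outputs are positive, sum to 1, and preserve the ordering of the input scores, which is why softmax is typically used at the output layer of a multi-class classifier.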
Deep feedforward neural networks (DFNNs), Optimization methods: Generalized delta rule, AdaGrad, RMSProp, Adadelta, Adam, Second order methods; Regularization methods: Dropout, DropConnect; Batch normalization
Basic CNN architecture, Rectified Linear Unit (ReLU), 2-D Deep CNNs: LeNet, AlexNet, VGGNet, GoogLeNet, ResNet; Image classification using 2-D CNNs; 3-D CNN for video classification; 1-D CNN for text and audio processing; Vector of Locally Aggregated Descriptors (VLAD) method for aggregation - NetVLAD
Architecture of an RNN, Unfolding an RNN, Backpropagation through time, Vanishing and exploding gradient problems in RNNs, Long short-term memory (LSTM) units, Gated recurrent units, Bidirectional RNNs, Deep RNNs
Encoder-decoder paradigm, Image and video captioning models, Machine translation, Text processing models, Representation of words: Word2Vec, GloVe
Attention-based models, Scaled dot-product attention, Multi-head attention (MHA), Self-attention MHA, Cross-attention MHA, Position encoding, Encoder and decoder modules in a transformer, Sequence-to-sequence mapping using transformer, Machine translation using transformer model, Vision transformer for image classification, Video captioning using transformer model, Bidirectional encoder representations from transformers (BERT) model for text processing, Pre-training a BERT model, Fine-tuning a BERT model for text processing tasks, Vision-and-Language BERT (ViLBERT) for image and video processing tasks, Text and Visual question answering and reasoning using transformer models
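As an illustration of the attention topic in this module, the following is a minimal sketch in plain Python of attention computed as softmax(QK^T / sqrt(d_k)) V for a single head, without masking or learned projections. The function names and the toy query, key and value matrices are assumptions made for this example and are not taken from the programme material.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(s - m) for s in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Q: list of query vectors, K: list of key vectors, V: list of value vectors.

    Returns one output vector per query: a weighted average of the rows of V,
    with weights given by softmax of the scaled query-key dot products.
    """
    d_k = len(K[0])                      # key dimension used for scaling
    outputs = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)        # attention weights over the keys
        outputs.append([sum(w * v[j] for w, v in zip(weights, V))
                        for j in range(len(V[0]))])
    return outputs

# Toy example (hypothetical values): the query matches the first key best,
# so the output lies closer to the first value vector.
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
out = scaled_dot_product_attention(Q, K, V)
```

Multi-head attention applies this same operation in parallel to several learned projections of the queries, keys and values and concatenates the results.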
Image generation models, Architecture and training of a GAN, Deep convolutional GAN, CycleGAN, Conditional GAN, Super-resolution GAN, Applications of GANs for image processing
Introduction to reinforcement learning, Markov decision processes, Policy gradients, Temporal difference learning, Q-learning, Deep reinforcement learning - Deep policy gradient, Deep Q-learning; Text processing using deep reinforcement learning - Text classification, Text summarization
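As an illustration of the Q-learning topic above, the following is a minimal sketch in plain Python of one tabular Q-learning update, Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)). The function name, the state and action labels, and the values of the learning rate `alpha` and discount factor `gamma` are assumptions made for this example and are not taken from the programme material.

```python
def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.9):
    """Apply one temporal-difference update to a tabular Q-function in place."""
    best_next = max(q_table[next_state].values())   # greedy value of the next state
    td_target = reward + gamma * best_next          # bootstrapped target
    td_error = td_target - q_table[state][action]   # temporal-difference error
    q_table[state][action] += alpha * td_error      # move the estimate toward the target
    return q_table[state][action]

# Hypothetical two-state table: states "s0" and "s1", actions "left" and "right".
q_table = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 1.0},
}
new_value = q_learning_update(q_table, "s0", "right", reward=0.5, next_state="s1")
```

Deep Q-learning replaces the table with a neural network that approximates Q(s, a), while keeping the same form of target.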