Course Overview, What Is Data Mining and Its Origins, Typical Data Mining Tasks, Data Mining Applications/Examples, Data Mining vs. OLAP and Statistics, Introduction to Classification/Decision Trees

Model Interpretation of Classification Trees, Measures of Node Impurity, Computation of GINI Index, Computation of Entropy and Misclassification Error, Induction of Classification Trees, Handling of Continuous and Multi-State Variables, Data Preparation, Normalization, Outlier Detection

Discretization using Value Reduction, Overview of Chi-Square Test, ChiMerge Discretization, KNIME Demo (using German Credit Card Data), Model Evaluation, Accuracy, Weighted Accuracy, Recall and Precision, Receiver Operating Characteristic (ROC) Curve

Model Evaluation (Holdout, k-Fold Cross-Validation), Sampling with Replacement (Bootstrapping), Ensemble Methods (Bagging and Boosting), Random Forest, Stacking, Lazy Learner vs. Eager Learner, k-Nearest Neighbor: Pros and Cons

Clustering: Basic Concepts and Popular Types, Applications, K-Means: Concepts, Working, Limitations, Schemes to Handle Initial Centroid Problems in K-Means, Hierarchical Clustering: Single/Complete/Average Linkages, Validity of Clusters: External and Internal Metrics

KNIME Demo (K-Means and Fuzzy c-Means Clustering with Relative Index and External Index, Hierarchical Clustering), Distance Computation for Mixed Type Variables: Interval-Scaled, Symmetric and Asymmetric Binary, Categorical and Ordinal, Fuzzy c-Means

Association Rule Mining, Apriori Algorithm, Frequent Itemset and Rule Generation, Support, Confidence, Interest and Lift, Handling of Continuous and Categorical Data, min-Apriori, Multi-level Association Rules, KNIME Demo, Principal Component Analysis
