Department of Computer Science and Engineering
| B.Tech. III (CO) Semester - 6 | L | T | P | C |
| CO314 : DATA SCIENCE (EIS-II) | 3 | 0 | 0 | 3 |
| COURSE OBJECTIVES | ||||
|
|
||||
| COURSE OUTCOMES | ||||
After successful completion of this course, the student will be able to:
|
||||
| COURSE CONTENT | ||||
| INTRODUCTION TO PARADIGMS FOR DATA MANIPULATION, LARGE SCALE DATA SETS | (14 Hours) |
|||
| MapReduce (Hadoop) and software interfaces (e.g., Hive, Pig): moving from traditional warehouses to MapReduce. Distributed databases and distributed hash tables, near-real-time query. |
||||
| LARGE-SCALE ITERATIVE ALGORITHMS | (16 Hours) |
|||
| ML at large scale (distributed supervised and unsupervised learning). |
| Feature hashing |
| Topic models (LDA) |
| Large-scale SVD and NMF for spectral clustering |
| Inverted-index and LSH-based clustering |
| Large-scale k-means clustering |
||||
| VISUALIZATION | (08 Hours) |
|||
| Graph visualization |
| Data summaries |
| Hypothesis testing, ML model-checking and comparison |
||||
| ADVANCED TOPICS | (04 Hours) |
|||
| (Total Contact Time: 42 Hours) | ||||
| BOOKS RECOMMENDED | ||||
| ||||