Data Science Foundation
There is much debate among scholars and practitioners about what data science is, and what it isn’t. Does it deal only with big data? What constitutes big data? Is data science really that new? How is it different from statistics and analytics?
One way to consider data science is as an evolutionary step in interdisciplinary fields like business analysis that incorporate computer science, modeling, statistics, analytics, and mathematics. Data science is the study of the generalizable extraction of knowledge from data, and the key word is science. It builds on techniques and theories from many fields, including signal processing, mathematics, probability models, machine learning, statistical learning, computer programming, data engineering, pattern recognition and learning, visualization, uncertainty modeling, data warehousing, and high-performance computing, with the goal of extracting meaning from data and creating data products.
From government, social networks and e-commerce sites to sensors, smart meters and mobile networks, data is being collected at an unprecedented speed and scale. Data science can put big data to use.
[Infographic statistics: average number of “likes” and “comments” posted on Facebook daily; percentage of the world’s data produced in the last two years; projected volume of e-commerce transactions in 2016.]
Data science is not restricted to big data, although the growing scale of data makes big data an important aspect of data science.
- Introduction to Data Science
- Data in the Data Science Ecosystem
- Data Sets: Training and Testing Data Sets
- Volume, Variety, Velocity and Value
- Structured Data, Unstructured Data, Text Data
- Metadata Modelling / KDM Standard
- MOF, KDD and KDDML
- Data Mining Group and PMML Standard
- Unified Modeling Language - Metadata Modelling
- Text Data and Classification
- Global Standards - UNSPSC, DBpedia
- Data Science Solutions Using Platforms: IaaS, PaaS and SaaS Based Solution Approach
- Big Data and Big Data Technology, Tools and Platform
- Why Big Data?
- Hadoop Framework
- MapReduce
- Services Attached to the Hadoop Framework (HBase, Hive, ZooKeeper, Cassandra, MongoDB)
- API and Its Integration Model
- Data Science Platform
- Virtual Infrastructure Platform and Public Cloud
- AWS Elastic MapReduce Platform
- Apache Spark, Spark SQL, Apache Storm
- Machine Learning Platform
- API design and Model for platform
- Data Platform - MapR
- Data Science Services Platform
- Data set Design and Model using UML Infrastructure and MOF
- KDM and KDDML for Numeric Data
- Content / Text Data Modeling
- Text/Content Extraction
- XML Pipeline for Text Aggregation / Transformation
- Semantic Content, Annotation
- OWL / RDF / RDF Graph Standard
- Ontology, Vocabulary, Linked Platform
- UIMA and NLP for text analysis
- Algorithm and Machine Learning
- Classification of Data - Support Vector Machine
- Clustering of Data - K-means
- Collaborative Filtering & Recommendation Engine / easyrec Engine
- Business Intelligence Platform
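
As a small illustration of one topic from the outline above, the MapReduce programming model can be sketched in plain Python. This is only a single-process sketch of the map, shuffle and reduce steps, not Hadoop code; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample sentences are assumptions made up for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = []
    for doc in documents:
        pairs.extend(map_phase(doc))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    docs = ["big data needs data science", "data science uses big data"]
    print(word_count(docs))
```

In a real cluster the map and reduce functions run in parallel on many machines and the framework performs the shuffle; here everything runs in one process so the data flow is easy to follow.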