0.1 Introduction to Data Science

What is Data Science and how did we get here?

“Data Science is the study (practice) of integrating mathematics, programming, and data to make meaningful inferences about the world around us.”

Data science is distinct from related fields such as computer science, statistics, and information science.

Computer science is the study of algorithms, data structures, and programming methodologies. It differs from data science in that it does not, by itself, aim to make meaningful inferences about our world.

Statistics is a mathematical discipline concerned with the collection, description, and interpretation of data. This definition is closer to data science, but it leaves out the handling of the data itself, such as data formats, archiving, and access.

Information science is a field primarily concerned with the analysis, collection, classification, manipulation, storage, retrieval, movement, dissemination, and protection of information. Its focus is much more on the physical data (library collections, archives, and databases) and on how users consume this information, but it misses the computational and mathematical side of data science.

You may think of data science as the intersection of computer science, statistics, and information science.