
DATA SCIENCE: Defining the Pieces of the Data Puzzle


Top Data Science Terms Defined

MACHINE LEARNING

Machine Learning involves training computers, through repeated presentation of observations and outcomes, to make predictions that are not obvious to a person.
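
As a rough illustration of the machine learning definition, the sketch below trains a perceptron, one of the simplest learning algorithms, by repeatedly presenting it with observations and their known outcomes. The data set (visits and purchases per customer, and whether the customer returned) and all function names are hypothetical, chosen only for this example; real projects would normally use an established library rather than hand-written code.

```python
# Minimal perceptron sketch: learns from repeated presentation of
# (observation, outcome) pairs. Data and names are hypothetical.

def predict(weights, bias, observation):
    """Return 1 if the weighted score is positive, otherwise 0."""
    score = sum(w * x for w, x in zip(weights, observation)) + bias
    return 1 if score > 0 else 0

def train(examples, epochs=20, learning_rate=1.0):
    """Return (weights, bias) learned from (observation, outcome) pairs."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):  # repeated presentation of the same examples
        for observation, outcome in examples:
            error = outcome - predict(weights, bias, observation)
            # Adjust the model only when its prediction was wrong.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, observation)]
            bias += learning_rate * error
    return weights, bias

# Hypothetical observations: (visits, purchases) -> customer returned (1) or not (0)
examples = [((1, 1), 0), ((2, 1), 0), ((4, 5), 1), ((5, 4), 1)]
weights, bias = train(examples)

# Predict the outcome for a customer the model has not seen before.
print(predict(weights, bias, (6, 6)))
```

The training loop only changes the weights when a prediction is wrong, so once every example is classified correctly, further passes leave the model unchanged.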

DATA ANALYTICS

Data analytics is the most common use of data in organisations, where it is used to produce reports. Data analysts often extract data from relational databases (such as SQL Server and Oracle) and present it as reports and corporate dashboards.
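
A typical data analytics task, extracting data from a relational database and presenting it as a report, might look like the sketch below. It assumes an in-memory SQLite database as a stand-in for a corporate system such as SQL Server or Oracle; the `sales` table and its figures are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for a corporate relational database.
# The table name, columns and figures are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 60.0),
     ("South", 40.0), ("East", 50.0)],
)

# Extract: aggregate the raw rows into one summary row per region.
report = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()
conn.close()

# Present: print a plain-text report, largest region first.
for region, total in report:
    print(f"{region:<10}{total:>10.2f}")
```

In practice the presentation step would usually feed a dashboard tool rather than print to the console, but the extract-then-summarise pattern is the same.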

Skills Required: DATA SCIENCE
• Advanced usage of statistical tools and methods
• Programming in at least one data science language (e.g. Python, R)
• Extract and manipulate data from diverse data sources
• Use machine learning methods, such as clustering and random forests

Skills Required: DATA ANALYTICS
• Manipulate databases using SQL
• Use dashboard tools and design effective dashboards
• Utilise statistical tools to maintain data integrity
• Produce effective, clear charts that inform, rather than confuse, decision-makers

DATA SCIENCE

Data science is an umbrella term that encompasses the other data disciplines described here. Data scientists often attempt to create new knowledge from existing data, e.g. by producing predictions.

ARTIFICIAL INTELLIGENCE (AI)

Artificial Intelligence (AI) has been around since the 1950s, but today’s AI researchers use cutting-edge technologies such as deep learning (which builds on multi-layer neural networks), Natural Language Processing, or NLP (used in conversational user interfaces), and image processing (as used in products such as self-driving cars).

Skills Required: MACHINE LEARNING
• Expertise in one or more modelling techniques/tools
• Understand the statistical basis of algorithms
• Programming in popular machine learning languages, such as Python
• Work closely with data scientists to ensure machine learning technology delivers results for the organisation

Skills Required: ARTIFICIAL INTELLIGENCE (AI)
• Highly specialised; skill requirements are determined by area of research and niche expertise

BIG DATA

Big data describes working with data that is too large to be processed using standard tools (e.g. a workstation or a single server). The two most common big data platforms are Hadoop and Spark. Platforms like Tableau are popular with those looking to do data analytics with big data.

Skills Required: BIG DATA
• Operate and manage clusters of networked computers
• Maintain high availability of the cluster
• Understand how cyber security issues can affect big data
• Programming using enterprise languages, such as Java and Scala

Learn More at LearningTree.co.uk/DataScience

Data Science: Defining the Pieces of the Data Puzzle. UK Version  