Data Science Training in Hyderabad

GYANVRIKSH INTERACTIVE PVT LTD, III Floor, QZ Plaza, Opp Haveli Restaurant, Kothaguda, Kondapur

The Power of Statistical Science

DATA SCIENCE

Statistics

R Programming

Machine Learning

Project Execution

Tips for Future Data Scientists

1. Be flexible and adaptable: There is no single tool or technique that always works best.

2. Cleaning data is most of the work: Knowing where to find the right data, how to access it, and how to properly format and standardize it is a huge task. It usually takes more time than the actual analysis.

3. It is not all about building models: As the previous tip suggests, you must have skills beyond model building.

4. Know the fundamentals of structuring data: Gain an understanding of relational databases. Also learn how to collect and store good data. Not all data is useful.

5. Document what you do: This is important for others and for your future self. A sub-tip: learn version control.

6. Know the business: Every business has different goals. It is not enough to do analysis just because you love data and numbers. Know how your analysis can make more money, positively impact more customers, or save more lives. This matters most when you need others to support your work.

7. Practice explaining your work: Presentation is essential for data scientists. Even if you think you are an excellent presenter, it always helps to practice. You don’t have to be comfortable in front of an audience, but you must be capable in front of one. Take every opportunity you can get to speak in front of a crowd. It also helps to build your reputation as an expert.

8. Spreadsheets are useful: Although they lack some of the computational power of other tools, spreadsheets are still widely used and understood by the business world. Don’t be afraid to use a spreadsheet if it can get the job done.

9. Don’t assume the audience understands: Many non-data-science audiences will not have a solid understanding of math, and most will have forgotten their basic high school and college mathematics. Explain concepts such as correlation in plain language and avoid equations. Audiences understand visuals, so use them to explain concepts.

Reach us @ 040 – 4026 2121 / 903 007 2121 …Empowering with Education & Expertise

10. Be ready to continually learn: I do not know a single data scientist who has stopped learning. The field is large and expanding daily.

11. Learn the basics: Once you have a firm understanding of the basics in mathematics, statistics, and computer programming, it will be much simpler to continue learning new data science techniques.

CONTENT

Session 1: Understanding the Basics of Statistics

Basic Statistical Concepts

• Statistical Terminology and Basic Notations
• Importance of Data and Numbers (domain specific)
• Measures of Central Tendency & Measures of Dispersion
• Variance and its importance across the business
• Legendre’s Least Squares Principle
• Scatter Diagram and Data Point Distribution
• Trend Lines and Trend Patterns
• Outlier and Missing Value Treatment Analysis
• Central Limit Theorem

Basic Probability

• Probability Terminology and Notations
• Sample Space, Events and Experiments
• Probability Rules & Probability Types
• Bayes Theorem & Error Matrix
• Probability Scores and their importance in the banking domain
• Discussion on Churn Probability
• P-Value Significance in model outputs


Understanding Distributions

• Discrete and Continuous Distributions
• Binomial Distribution & Poisson Distribution
• Exponential Distribution & t-Distribution
• Normal/Gaussian Distribution
• Concepts of Confidence Intervals
• Industry examples for understanding the distributions

Advanced Statistical Concepts

• Theory of Hypothesis Testing
• Small Sample and Large Sample Tests (t & Chi-square testing)
• ANOVA (One-way and Two-way)
• Explanation of F tests and Z tests in summary outputs
• Theory of Association
• Bivariate and Multivariate Analysis
• Importance of Linearity
• Correlation (Positive Correlation, Negative Correlation and Types of Correlation)
• Regression Theory and Assumptions

Exploratory Data Analysis

• Business Data Understanding
• Data Management (Cleansing, Formatting, Tabulating, Transpose & Recode)
• Summarizing the information
• Designing and laying out the variable patterns
• Performing Descriptive Analysis
• Performing trend analysis and flagging spikes and dips
• Normalizing and transforming the variables
• Missing Values Treatment Analysis
• Outlier Detection and Replacement
• Generating the reports (Excel Pivot, VLOOKUP & Tableau usage)


Session 2: Introduction to R in Statistics

Introduction to R Programming

• The importance of R in analytics
• Installing R and other packages
• Performing basic R operations
• RStudio installation and guidance

Basics, Data Understanding

• Data types in R and their uses
• Built-in functions in R & subsetting methods
• Summary and structure of data; head() and tail() for inspecting data
• Reading and writing data; matrices, lists, factors, data frames
• Functions and loop functions

Preprocessing of Data

• Handling Missing Values
• Changing Data Types
• Data Binning Techniques
• Dummy Variables

Modelling & Validation

• Splitting of data: Test & Train
• Dependent & Independent Variables
• Machine Learning Algorithm
• Error Terms Calculation
• Accuracy & Precision

Data Visualization

• Histograms, Bar Plots and Plotting Graphs
• Customizing Graphical Parameters
• Usage of the ggplot2 package


Session 3: Machine Learning and its Business Applications

Machine learning is a type of artificial intelligence (AI) that gives computers the ability to learn without being explicitly programmed. It focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data. This session covers the following topics:

Introduction to Modeling

• Building the Statistical Model
• Evaluating the Model
• Model Deployment

Supervised Learning

Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.

• Linear Regression
• Logistic Regression
• Nonlinear Regression
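As an illustration of the supervised learning idea described above, the following is a minimal sketch in Python of fitting a straight line to labeled examples by least squares and then mapping a new input. The function names fit_line and predict and the sample data are hypothetical, chosen only for this example; the course itself works in R.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to labeled pairs."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply the inferred function to a new, unseen input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training examples: each input is paired with a desired output.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)
print(predict(model, 5.0))
```

The training pairs play the role of the labeled data, and the fitted line is the inferred function that is then applied to an input the model has not seen.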

Machine Learning

• Naïve Bayes Classification
• Neural Networks
• Decision Trees
• CART
• C&R Tree
• Classification Techniques
• CHAID Algorithm
• Support Vector Machines (SVM)

• K Nearest Neighbour

Unsupervised Learning

Unsupervised learning is a type of machine learning used to draw inferences from datasets consisting of input data without labeled responses. The most common unsupervised learning method is cluster analysis, which is used in exploratory data analysis to find hidden patterns or groupings in data.

• Concept of Clustering
• K-means Clustering
• Hierarchical Clustering
• Discriminant Analysis
• Principal Component Analysis
• Data Reduction Techniques
• Factor Analysis
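To make the idea of cluster analysis concrete, here is a minimal sketch of K-means clustering in Python using only the standard library. It assumes a naive initialization (the first k points serve as starting centroids) and a fixed number of iterations. The function names kmeans and dist2 and the sample points are hypothetical, and a real project would normally rely on an established library instead.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Group unlabeled points into k clusters.

    Assumption for this sketch: the first k points are used as the
    initial centroids, which keeps the example deterministic.
    """
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

centroids, clusters = kmeans(points, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No labels are supplied at any point: the grouping emerges only from the distances between the points, which is what distinguishes this from the supervised case.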

Time Series Analysis

• Decomposition of Time Series
• Trend and Seasonality Detection and Forecasting
• Smoothing Techniques
• Understanding ACF & PACF plots
• ARIMA Modeling
• Holt-Winters Method

Optimization & Regularization

• Gradient Descent
• Simulated Annealing
• Genetic Algorithm: Basics
• Dimensionality Reduction


