Statistics Vocabulary

Taken from: Psychology Illinois State

For a bar graph, a vertical bar is drawn above each score (or category) so that 1) the height of the bar corresponds to the frequency, and 2) there is a space separating each bar from the next. A bar graph is used when the data are measured on a nominal or an ordinal scale.

The binomial distribution results from situations in which each observation has exactly two possible outcomes.

Central Limit Theorem: For any population with mean μ and standard deviation σ, the distribution of sample means for sample size n will approach a normal distribution with a mean of μ and a standard deviation of σ/√n as n approaches infinity.

Central tendency is a statistical measure that identifies a single score as representative of an entire distribution. The goal of central tendency is to find the single score that is most typical or most representative of the entire group.

A confounding variable is an uncontrolled variable that is unintentionally allowed to vary systematically with the independent variable.

A constant is a characteristic or condition that does not vary, but is the same for every individual.

Constructs are hypothetical concepts that are used in theories to organize observations in terms of underlying mechanisms.

A control group is a condition of the independent variable that does not receive the experimental treatment. Typically, a control group either receives no treatment or receives a neutral, placebo treatment. The purpose of a control group is to provide a baseline for comparison with the experimental group.

For a continuous variable, there are an infinite number of possible values that fall between any two observed values. A continuous variable is divisible into an infinite number of fractional parts.

With the correlational method, two variables are observed to see if there is a relationship.

Data (this is plural, so we say "the data are", not "the data is") are measurements or observations. A data set is a collection of measurements or observations. A datum (singular) is a single measurement or observation and is commonly called a score or raw score.

Degrees of freedom describe the number of scores in a sample that are free to vary.
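The Central Limit Theorem entry above can be checked on a very small case. The sketch below assumes a hypothetical population of four scores and enumerates every possible sample of size n = 2 drawn with replacement; the population values and variable names are illustrative only. For independent draws, the mean of the sample means equals μ and their standard deviation equals σ/√n at any n, while the normal shape is what emerges as n grows.

```python
from itertools import product
from math import sqrt
from statistics import fmean, pstdev

population = [2, 4, 6, 8]   # hypothetical population, chosen for illustration
n = 2                       # sample size

mu = fmean(population)      # population mean
sigma = pstdev(population)  # population standard deviation

# Every possible ordered sample of size n, drawn with replacement.
sample_means = [fmean(sample) for sample in product(population, repeat=n)]

mean_of_means = fmean(sample_means)
sd_of_means = pstdev(sample_means)

print(f"mu = {mu}, sigma = {sigma:.3f}")
print(f"mean of sample means = {mean_of_means}")
print(f"SD of sample means   = {sd_of_means:.3f}")
print(f"sigma / sqrt(n)      = {sigma / sqrt(n):.3f}")
```

The last two printed values should agree, which is the σ/√n relationship stated in the theorem.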

The dependent variable is the one that is observed for changes in order to assess the effect of the treatment.

Descriptive statistics are statistical procedures used to summarize, organize, and simplify data.

A discrete variable consists of separate, indivisible categories. No values can exist between two neighboring categories.

The distribution of sample means is the collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population.

An experimental group is a condition of the independent variable that does receive the experimental treatment.

In the experimental method, one variable is manipulated while changes are observed in another variable. To establish a cause-and-effect relationship between the two variables, an experiment attempts to eliminate or minimize the effect of all other variables by using random assignment and by controlling or holding constant other variables that might influence the results.

A frequency distribution is an organized tabulation of the number of individuals located in each category on the scale of measurement.

For a histogram, vertical bars are drawn above each score so that 1) the height of the bar corresponds to the frequency, and 2) the width of the bar extends to the real limits of the score. A histogram is used when the data are measured on an interval or a ratio scale.

A hypothesis is a prediction about the outcome of an experiment. In experimental research, a hypothesis makes a prediction about how the manipulation of the independent variable will affect the dependent variable.

Hypothesis testing is an inferential procedure that uses sample data to evaluate the credibility of a hypothesis about a population.

The independent variable is the variable that is manipulated by the researcher. In behavioral research, the independent variable usually consists of the two (or more) treatment conditions to which subjects are exposed. The independent variable consists of the antecedent conditions that were manipulated prior to observing the dependent variable.

Inferential statistics consist of techniques that allow us to study samples and then make generalizations about the populations from which they were selected.

The interquartile range (IQR) is the distance between the first quartile and the third quartile. It therefore corresponds to the middle 50% of the scores in the distribution.

An interval scale consists of ordered categories where all of the categories are intervals of exactly the same size. With an interval scale, equal differences between numbers on the scale reflect equal differences in magnitude. However, ratios of magnitudes are not meaningful.
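The interquartile range entry above can be illustrated with Python's standard library. This is a sketch on hypothetical scores; note that several conventions exist for locating quartiles, and `statistics.quantiles` with its default "exclusive" method is only one of them, so a hand calculation from a textbook may give slightly different cut points.

```python
from statistics import quantiles

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # hypothetical data

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(scores, n=4)

iqr = q3 - q1  # distance between the first and third quartiles

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}")
print(f"IQR = {iqr}")
```

The middle 50% of the scores lie between Q1 and Q3, so the IQR ignores the extreme scores at either end.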

In a frequency distribution polygon (or a line graph), a single dot is drawn above each score so that 1) the dot is centered above the score, and 2) the height of the dot corresponds to the frequency. A continuous line is then drawn connecting these dots. The graph is completed by drawing a line down to the X-axis (zero frequency) at each end of the range of scores.

The mean for a distribution is the sum of the scores divided by the number of scores.

The median is the score that divides a distribution exactly in half. Exactly 50% of the individuals in a distribution have scores at or below the median. The median is equivalent to the 50th percentile.

In a frequency distribution, the mode is the score or category that has the greatest frequency.

A nominal scale consists of a set of categories that have different names. Measurements on a nominal scale label and categorize observations, but do not make any quantitative distinctions between observations.

An operational definition defines a construct in terms of specific operations or procedures and the measurements that result from them. Thus, an operational definition consists of two components: First, it describes a set of operations or procedures for measuring a construct. Second, it defines the construct in terms of the resulting measurements.

An ordinal scale consists of a set of categories that are organized in an ordered sequence. Measurements on an ordinal scale rank observations in terms of size or magnitude.

A parameter is a value, usually a numerical value, that describes a population. A parameter may be obtained from a single measurement, or it may be derived from a set of measurements from the population.

A population is the set of all individuals of interest in a particular study.

The power of a statistical test is the probability that the test will correctly reject a false null hypothesis. So power is 1 - β, where β is the probability of a Type II error (failing to reject a false null hypothesis).

The quasi-experimental method examines differences between pre-existing groups of subjects (for example, men vs. women) or differences between groups of scores obtained at different times (for example, before treatment vs. after treatment). The variable that is used to differentiate the groups is called the quasi-independent variable, and the score obtained for each individual is the dependent variable.

A random sample must satisfy two requirements:
1. Each individual in the population has an equal chance of being selected.
2. If more than one individual is to be selected for the sample, there must be constant probability for each and every selection.
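The frequency distribution, mean, median, and mode entries above can be tried out on a small data set. The following is a minimal sketch using hypothetical scores and Python's standard library. One assumption to note: `statistics.median` returns the midpoint of the two middle scores when the count is even, which may differ from an interpolated median computed with real limits.

```python
from collections import Counter
from statistics import mean, median, mode

scores = [3, 5, 5, 6, 7, 8, 8, 8, 9, 10]  # hypothetical data

# Frequency distribution: how many individuals obtained each score.
frequency = Counter(scores)
print(" X   f")
for x in sorted(frequency, reverse=True):
    print(f"{x:2d}  {frequency[x]:2d}")

average = mean(scores)       # sum of the scores divided by the number of scores
middle = median(scores)      # the score that divides the distribution in half
most_common = mode(scores)   # the score with the greatest frequency

print(f"mean = {average}, median = {middle}, mode = {most_common}")
```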

Random selection, or random sampling, is a process for obtaining a sample from a population that requires that every individual in the population have the same chance of being selected for the sample. A sample obtained by random selection is called a random sample.

The range is the difference between the upper real limit of the largest (maximum) X value and the lower real limit of the smallest (minimum) X value.

The rank or percentile rank of a particular score is defined as the percentage of individuals in the distribution with scores at or below the particular value. When a score is identified by its percentile rank, the score is called a percentile.

A ratio scale is an interval scale with the additional feature of an absolute zero point. With a ratio scale, ratios of numbers do reflect ratios of magnitude.

For a continuous variable, each score actually corresponds to an interval on the scale. The boundaries that separate these intervals are called real limits. The real limit separating two adjacent scores is located exactly halfway between the scores. Each score has two real limits, one at the top of its interval called the upper real limit, and one at the bottom of its interval called the lower real limit. Note that the upper real limit of one interval is also the lower real limit of the next higher interval.

A sample is a set of individuals selected from a population, usually intended to represent the population in a study.

Sampling error is the discrepancy, or amount of error, that exists between a sample statistic and the corresponding population parameter.

A sampling distribution is a distribution of statistics obtained by selecting all the possible samples of a specific size from a population.
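The real limits, range, and percentile rank entries above translate directly into a few lines of code. This sketch assumes scores measured in whole units, so each score's real limits sit half a unit on either side; the function names and the data are hypothetical, introduced only for illustration.

```python
def real_limits(score, unit=1.0):
    """Return (lower, upper) real limits, halfway to the adjacent scores."""
    half = unit / 2
    return score - half, score + half

def range_with_real_limits(scores, unit=1.0):
    """Upper real limit of the maximum minus lower real limit of the minimum."""
    _, upper = real_limits(max(scores), unit)
    lower, _ = real_limits(min(scores), unit)
    return upper - lower

def percentile_rank(scores, value):
    """Percentage of individuals with scores at or below the given value."""
    at_or_below = sum(1 for x in scores if x <= value)
    return 100 * at_or_below / len(scores)

scores = [3, 5, 5, 6, 7, 8, 8, 8, 9, 10]  # hypothetical data

print(real_limits(8))                  # limits of the interval around X = 8
print(range_with_real_limits(scores))  # 10.5 - 2.5
print(percentile_rank(scores, 7))      # percent of scores at or below 7
```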
Sampling with replacement is a sampling method in which each selected individual is returned to the population before the next individual is selected.

In essence, the standard deviation measures how far off all of the individuals in the distribution are from a standard, where that standard is the mean of the distribution.
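The description of the standard deviation above can be unpacked step by step. The sketch below treats a hypothetical set of five scores as a complete population, computes the deviations from the mean by hand, and compares the result with `statistics.pstdev`. For a sample, the sum of squared deviations is divided by n - 1 (the degrees of freedom) instead of n, which is what `statistics.stdev` does.

```python
from math import sqrt
from statistics import fmean, pstdev, stdev

scores = [1, 9, 5, 8, 7]  # hypothetical data

mu = fmean(scores)                               # the "standard" is the mean
deviations = [x - mu for x in scores]            # how far each score is from it
sum_of_squares = sum(d ** 2 for d in deviations)

population_sd = sqrt(sum_of_squares / len(scores))    # divide by N
sample_sd = sqrt(sum_of_squares / (len(scores) - 1))  # divide by n - 1

print(f"mean = {mu}, SS = {sum_of_squares}")
print(f"population SD = {population_sd:.3f} (pstdev: {pstdev(scores):.3f})")
print(f"sample SD     = {sample_sd:.3f} (stdev:  {stdev(scores):.3f})")
```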

A standardized distribution is composed of transformed scores that result in predetermined values for μ and σ, regardless of their values for the raw score distribution. Standardized distributions are used to make dissimilar distributions comparable.

A standard score is a transformed score that provides information about its location in a distribution. A z-score is an example of a standard score.

A statistic is a value, usually a numerical value, that describes a sample. A statistic may be obtained from a single measurement, or it may be derived from a set of measurements from the sample.

Statistics, as researchers use the term, refers to a set of methods and rules for organizing, summarizing, and interpreting information.
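The standardized distribution entry above describes a two-step transformation: convert each raw score to a z-score, then rescale to the predetermined mean and standard deviation. A minimal sketch follows; the target values of 50 and 10, the function name, and the raw scores are assumptions made for illustration, and the function assumes the raw scores are not all identical (otherwise σ would be zero).

```python
from statistics import fmean, pstdev

def standardize(scores, new_mean, new_sd):
    """Transform scores so the result has the requested mean and SD."""
    mu = fmean(scores)
    sigma = pstdev(scores)  # treats the scores as a complete population
    z_scores = [(x - mu) / sigma for x in scores]
    return [new_mean + z * new_sd for z in z_scores]

raw = [57, 64, 43, 70, 66]  # hypothetical raw scores
standardized = standardize(raw, new_mean=50, new_sd=10)

print([round(x, 1) for x in standardized])
print(f"new mean = {fmean(standardized):.1f}, new SD = {pstdev(standardized):.1f}")
```

Each individual keeps the same relative position in the new distribution, which is why two differently scaled tests become comparable after standardizing both.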

Stem and leaf displays break each number down into a left part called the stem and a right part called the leaf. If the numbers have two digits, the left digit is the stem and the right digit is the leaf. The display gives a picture of the distribution while still allowing all of the individual data points to be recovered.

The t statistic is used to test hypotheses about μ when the value for σ² is not known. The formula for the t statistic is similar in structure to that for the z-score, except that the t statistic uses estimated standard error.

Variability provides a quantitative measure of the degree to which scores in a distribution are spread out or clustered together.

A variable is a characteristic or condition that changes or has different values for different individuals.

A z-score specifies the precise location of each X value within a distribution. The sign of the z-score (+ or -) signifies whether the score is above the mean or below the mean. The numerical value of the z-score specifies the distance from the mean by counting the number of standard deviations between X and μ.
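The z-score and t statistic entries above share the same structure: a difference from μ divided by a measure of spread. The sketch below shows the two side by side under common textbook definitions, z = (X - μ)/σ for a single score and t = (M - μ)/(s/√n) for a sample mean, where s/√n is the estimated standard error. The function names, sample values, and hypothesized μ are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev

def z_score(x, mu, sigma):
    """Number of standard deviations between X and the population mean."""
    return (x - mu) / sigma

def t_statistic(sample, mu):
    """One-sample t and its degrees of freedom, using estimated standard error."""
    n = len(sample)
    sample_mean = fmean(sample)
    s = stdev(sample)             # sample SD (divides by n - 1)
    estimated_se = s / sqrt(n)    # stands in for the unknown sigma / sqrt(n)
    return (sample_mean - mu) / estimated_se, n - 1

z = z_score(70, mu=50, sigma=10)          # positive sign: above the mean
t, df = t_statistic([1, 9, 5, 8, 7], mu=4)

print(f"z = {z}")
print(f"t = {t:.3f} with df = {df}")
```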


Published on Oct 17, 2013

This is a Glossary of Statistical Terms in English.
