KIN 610 Quantitative Analysis of Research

Key Terms & Definitions Notes from Lectures, Textbook, & Blackboard.Com Weblinks

(Last Updated On: 11-3-01)

Text: Statistics in Kinesiology by William J. Vincent

Instructor: Dr. Ann Maliszewski

Website As: www.ronjones.org/csun610notes.htm

  Statistical Symbols:

  • *=multiply symbol
  • N=whole population
  • n=sample population
  • μ=population mean
  • σ=population standard deviation
  • σM=standard error of mean when SD of population is known
  • S or SD=standard deviation
  • S^2=variance
  • X=single raw score
  • Bar X= sample mean
  • α=area under normal curve for rejection of null hypothesis (H0)

Absolute Value: The positive value.  Drop negative if present.

  • r=-.37 but absolute value is needed so that is r=.37 (no negative).

Alpha: Area under normal curve for rejection of null hypothesis (H0).

ANCOVA: (Analysis of Covariance) The adjustment of dependent variable mean values to account for the influence of one or more covariates that are not controlled by the research design.

  • Used when groups differ on some variable at the start, e.g. comparing diet vs. exercise for weight loss when the two groups begin at different weights.  The research design should try to reduce the chance that the means start out different, i.e. try to eliminate the difference to begin with.
  • Adjusts for unequal baseline starting point i.e. adjusts for different means at the beginning.
  • Use this if after random assignment, the numbers still don’t match.

ANOVA: (Analysis of Variance) An F value that represents the ratio of between-group and within-group variance.  Used when you want to test whether the differences among two or more means are statistically significant.  (Adjusting post-test values because some factor wasn’t equivalent during pre-test is the job of ANCOVA.)

  • In a regression output, the ANOVA results tell whether adding X to the equation better enables you to predict Y.  If X doesn’t help you predict Y any better, then the slope of the line can’t be distinguished from zero (flat).

ANOVA/Factorial Analysis (or 2-Way): Analysis of variance performed on more than one factor (e.g. the effects of gender (factor A) and treatment (factor B) analyzed simultaneously).  Factorial ANOVA permits evaluation of the interaction of the factors on the DV.

  • Do factors (IV) affect outcome (DV)?
  • Examines overall (or global) means for each factor.
  • Example: Age & Gender for risk factors of Heart Disease
    • Main Effect: If there is NO significant interaction, then ask is there an age effect & gender effect?
    • Interaction: Looks at combination of the two factors i.e. the age by gender interaction.
      • If no interaction, then the lines would be parallel.  If interaction exists, then the lines are not parallel and may cross i.e. menopause for women during 66-75 age group was example given in class.
      • If interaction is significant, then don’t even look at main effects.  If no significance, then you are only looking at “main” effects.
      • Perform Post Hoc to determine where the significant differences occur.

  • Example: Do grades change over years with boys and girls?
    • Factors= gender (2 levels), grade (4 levels)=this is a “2X4 study”
      • Are differences similar over time? i.e. interaction
      • If no significant interaction between the two factors, then look at main effects.
      • Perform Post Hoc to determine where the significant differences occur.

ANOVA/One-Way: Compares two or more means.

  • With only two means, a T-Test can also be used for the same data.
  • Compares differences among group means (“groups” of different individuals).
  • Means are NOT related!  They are randomly assigned and unpaired.
    • One-Way=individual, not related, unpaired
    • Use the “unpaired” T-Test because the groups are NOT related, but don’t just run repeated T-Tests: each extra test increases the chance of finding significance when it isn’t there, so the alpha/p level must be adjusted to make it more rigid.
    • Divide alpha/p value by # of comparisons performed then use the adjusted and more rigid p value to find significance.  You can also use ANOVA, which compares all of the means in one test instead of repeating T-Tests.
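
The F ratio described for ANOVA can be sketched by hand for the one-way case.  This is only an illustration: the function name and the three small groups of scores are made up for the example and are not from the lecture or textbook.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return F = between-group variance / within-group variance."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)                # number of groups
    n_total = len(all_scores)      # total number of scores

    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: how far each score is from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)        # df between = k - 1
    ms_within = ss_within / (n_total - k)    # df within = N - k
    return ms_between / ms_within

# Hypothetical scores for three unrelated groups.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value = one_way_anova_f(groups)
print(f"F = {f_value:.2f}")
```

The F value still has to be compared against a critical value (or converted to a p value) for the given degrees of freedom before calling the difference significant.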

ANOVA/Repeated Measures: Two or more means from the same people so data IS related now.

  • Different conditions but same people.
  • Only used on “different people” when you have identical twins or litter mates if testing animals.
  • STATEMENT: There is statistical difference among means (F=?, p=?).
  • Then do a Post Hoc to determine where these differences exist.  Examples of Post Hoc Tests are: Tukey HSD, Fisher, & Scheffe’s.

ANOVA Table: (Subpoint of Step Wise Regression Analysis) Does the next variable add strength to the F ratio (or equation)?

Axes:  Manipulating the axes can alter the appearance of the graph.  Need to make them as “realistic” as possible.

Best Fit Line: Line on scatter plot that best indicates the relationship between plotted values; line on scatter plot that balances positive and negative residual values so that they sum to zero. 

Between-Within: Factorial ANOVA comparing independent groups (between) measured two or more times (within) that is sometimes called a “mixed model.”

Bias: The factors operating on a sample so it is not representative of the population from which it was drawn.

Bimodal: A distribution of values with two modes (more than two modes is multi-modal).

Bonferroni Adjustment: Adjustment of p value (probability of error) to correct for a familywise error rate when making multiple comparisons on the same set of subjects. 

  • To perform the adjustment, divide the single-test alpha by the number of tests to be performed.
  • Example: If five tests are to be made at p<.05, then divide .05 by 5=.01.  The new alpha to keep the same level of significance is then p<.01.
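
The adjustment above is simple arithmetic, sketched here with the same numbers as the example (five tests at .05).  The function names are made up for illustration.

```python
def bonferroni_alpha(alpha, n_tests):
    """Divide the single-test alpha by the number of tests performed."""
    return alpha / n_tests

def is_significant(p_value, alpha, n_tests):
    """A result counts as significant only if p beats the adjusted alpha."""
    return p_value < bonferroni_alpha(alpha, n_tests)

adjusted = bonferroni_alpha(0.05, 5)
print(f"Adjusted alpha: {adjusted:.2f}")

# A p value of .03 would pass a single test at .05 but not five tests.
print(is_significant(0.03, 0.05, 1))
print(is_significant(0.03, 0.05, 5))
```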

Central Tendency:  Values that describe the middle, or central, characteristics of a set of data.  The three values of central tendency are the mode, median, and mean, plus the Confidence Interval, which describes your confidence that a statement is true.

  • Four measures of Central Tendency:
    1. Mean
    2. Median
    3. Mode
    4. Confidence Interval (95% is usually the minimum standard)
  • The typical or average.
  • The central characteristics of the data i.e. what is typical or what is average?
  • Example: What strength is needed to be a “good rower”?  This relates to a specific group.  Use arithmetic generalization about the group being evaluated i.e. mode, median, and mean.

Coefficient of Variability:

  • (std dev/mean)*100 (*=multiply)
  • Looks at two data sets and determines which one is more variable.
  • Good for group data but also individual data especially for specific athletes.
  • Variability=% of overall score
  • Range
  • Sum of Squared Deviations of individual scores of mean variance (S squared)
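
A minimal sketch of using the coefficient of variability to decide which of two data sets is more variable.  The two data sets are hypothetical, and the sample standard deviation (n-1) is assumed.

```python
from statistics import mean, stdev

def coefficient_of_variability(scores):
    """(std dev / mean) * 100, using the sample standard deviation."""
    return stdev(scores) / mean(scores) * 100

# Hypothetical data sets measured on very different scales.
set_a = [8, 10, 12]
set_b = [90, 100, 110]

cv_a = coefficient_of_variability(set_a)
cv_b = coefficient_of_variability(set_b)
print(f"Set A: {cv_a:.1f}%  Set B: {cv_b:.1f}%")
```

Set B has the larger standard deviation, but Set A is more variable relative to its own mean, which is the comparison this statistic is meant for.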

Condition:  Equivalent in meaning to “level or variables” and “treatment.”

Correlation: A numerical coefficient between +1.00 and –1.00 that indicates the extent to which two variables are related (determines relationship) or associated; the extent to which the direction and size of deviations from the mean in one variable are related to the direction and size of deviations from the mean in another variable. 

  • Examines the relationship between two variables measured on the SAME person i.e. a “bivariate” measurement.
  • If the known X doesn’t help predict Y, then no correlation exists and the slope of the line is flat. 
  • A correlation describes the relationship for the “group” as a whole and only the group!
  • P value tells significance of relationship and r value tells its strength.

Correlation Coefficient: Reflects the direction of the slope of the line and how closely the points fall around it.

Correlation Graph: How do factors X and Y change relative to one another?

  • Regular Line=Line of best fit
  • Distance is equal between upper and lower points in relation to line i.e. the mean of points=the line.
  • Sum of all distances below and above the line=equivalent for both.
  • Downward-sloping lines are called either “negative” or “indirect” (upward-sloping lines are “positive” or “direct”).  The terms mean the same thing, so just pick the one you will use and stick with it.
  • If Y increases 1, then X increases 1.  A perfect positive is r=+1 and a perfect negative is r=-1.  Both are equally strong—just positive/direct or negative/indirect. 
  • ↓ P Value=Less chance of error and more impressive
  • ↑ P Value=More chance of error and less impressive
  • y=mx+b
    • y=predicted value on the Y axis
    • m=slope of line (Example: For every unit you went up in height, how much did you go up for weight?)
    • x=measured value on the X axis (the multiple m that precedes x tells how much y increases per unit of x)
    • b=y intercept
  • Excel Tip: Use “scattergram” and not “line gram” in Excel.
    • Tools>Add-Ins>check Analysis ToolPak>then Tools>Data Analysis>Correlation>then highlight both columns
  • Multiple R=Pearson r in the regression output
  • R^2: How much meaning is behind the correlation?  How much is a true connection?
  • ANOVA: Does knowing X help you better predict Y?  Was it significant?  Look at the significance of the ANOVA and if <.05, then it is significant. 
  • Analysis of Variance=looks at difference
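
The r value and the y=mx+b line of best fit can be computed by hand as sketched below.  The height and weight numbers are invented for illustration, and the function names are not from the course materials.

```python
from math import sqrt

def _sums(xs, ys):
    """Return mean of X, mean of Y, and the sums of cross-products and squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy

def pearson_r(xs, ys):
    """Correlation coefficient between -1.00 and +1.00."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)

def best_fit_line(xs, ys):
    """Return slope m and y intercept b for y = m*x + b."""
    mean_x, mean_y, sxy, sxx, _ = _sums(xs, ys)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

# Hypothetical heights (inches) and weights (pounds) on the SAME people.
heights = [60, 64, 68, 72]
weights = [120, 140, 150, 180]

r = pearson_r(heights, weights)
m, b = best_fit_line(heights, weights)
residuals = [y - (m * x + b) for x, y in zip(heights, weights)]
print(f"r={r:.2f}, slope={m:.2f}, intercept={b:.2f}")
print(f"sum of residuals={sum(residuals):.6f}")
```

The residuals above and below the line cancel out, which is what makes it the line of best fit.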

Correlation “r” Value: Use absolute value of “r” to determine strength.  The closer the absolute value of “r” is to 1, the stronger the relationship.

  • 0 slope=0 r value (the weakest link; no relationship observable; line can be perfectly horizontal or perfectly vertical)

Confidence Interval: (LOC) The amount of confidence that can be placed in a conclusion; a value expressed as a percentage that establishes the probability that a statement is correct.

  • How confident are you in the estimate?  Confidence is affected by the spread of scores.
  • ↑N>reduced effect of variability around the mean
  • LOC=Mean +/- Z(SE of mean)
    • The Z Score sets the desired probability of error i.e. Z Score of 1.96=95% CI & p=.05.
    • p=.05 goes with 95% confidence
    • 95%=1.96 (1.96 SD away from mean)
    • The probable mean is somewhere between these +/- upper and lower “estimated” limits.
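
A small sketch of LOC=Mean +/- Z(SE of mean) at the 95% level.  The scores are hypothetical, and the standard error is assumed to be the sample SD divided by the square root of n.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval(scores, z=1.96):
    """Return the lower and upper limits of Mean +/- Z * (SE of mean)."""
    m = mean(scores)
    se = stdev(scores) / sqrt(len(scores))   # standard error of the mean
    return m - z * se, m + z * se

# Hypothetical sample: mean=10, SD=2, n=4, so SE of mean=1.
scores = [7, 11, 11, 11]
low, high = confidence_interval(scores)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

A larger n shrinks the standard error, so the upper and lower limits move closer to the mean.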

Continuous Variables: A variable that theoretically can assume any value such as: distance, force, and time.

Cumulative Frequency:

Data: Information gathered by measurement.

Decile: One-tenth of the range of values.

Degrees of Freedom: The number of values in a data set that are free to vary when restrictions are imposed on the set.

  • df=n-1  *Because one value has already been calculated from the scores (the mean), you must subtract 1 (1=steps already done) from n (n=number of scores).
  • Must establish mean first before getting DF.
  • Use n-1 when you only have a sample.  If you have N, you don’t need to estimate the whole group because you already have it with your N.  Most of the time it is NOT feasible to test the whole N, so you have to get a sample of n.

Dependent Variable: Depends on something you are manipulating i.e. depends on the independent variable.

  1. The effect or consequent of the IV; also called the yield.
  2. Variable whose value is partially determined by the effects of other variables.  It is not free to assume any value.  It is usually the variable that is measured in the research design. 

Descriptive Statistics:  Mean, mode, median, range, standard deviation, variance, sum of squares.

  • Used to describe the nature of a data set and the population it reflects.  Examples are central tendency (mode, median, mean) and variability.

Deviations: Σ (X-mean)

Discrete Variables: A variable that is limited in its assessment to certain values, usually integers i.e. the data is not continuous; there are gaps between values in the range of data.  Example is gender.

  • Variable limited to certain numbers, usually whole numbers and integers such as counting people or heartbeats. 

EXCEL Symbols:

  • Square Root=^0.5 or =sqrt(cell) (^ by itself is the power symbol)

EXCEL Formulas & Directions: *Note: [ ] symbols=cell boundaries

  • Coefficient of Variance: [=(sd/mean)*100]
    • Enter cell# for SD then cell# for mean—don’t have to enter actual numerical values.
  • Data Analysis Option: >descriptive statistics>input cell ranges (cell:cell)>summarize statistics (then expand columns to read better)
  • Mean: [=average(beginning cell#:ending cell#)]
  • Median: [=median(cell:cell)]
  • Mode: [=mode(cell:cell)]
  • Powers=^ [10 squared would be (10)^2 in the formula bar]
  • Standard Deviation: [=stdev(cell:cell)]
  • Sum of Squares: [=devsq(cell:cell)] (sumsq squares the raw scores, not the deviations from the mean)
  • Variance: [=var(cell:cell)]
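
For comparison, the same summary statistics can be sketched in Python with the standard `statistics` module.  This is not part of the course materials; the scores are made up, and the sample (n-1) formulas are assumed to match Excel's stdev and var.

```python
import statistics as st

# Hypothetical column of scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean_score = st.mean(scores)           # like =average(cell:cell)
median_score = st.median(scores)       # like =median(cell:cell)
mode_score = st.mode(scores)           # like =mode(cell:cell)
sd = st.stdev(scores)                  # like =stdev(cell:cell), divides by n-1
variance = st.variance(scores)         # like =var(cell:cell), divides by n-1
# Sum of squared deviations from the mean, like =devsq(cell:cell).
sum_of_squares = sum((x - mean_score) ** 2 for x in scores)
coefficient_of_variance = sd / mean_score * 100

print(mean_score, median_score, mode_score)
print(round(sd, 2), round(variance, 2), sum_of_squares)
print(round(coefficient_of_variance, 1))
```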

External Validity: The ability to generalize the results of an experiment to the population from which the samples were drawn.

Factor: A component in the design of a study that is combined with other factors to answer multiple questions about the data; a virtual variable that is the result of a combination of two or more variables in a factor analysis design. 

Factorial ANOVA: *(See ANOVA/Factorial Analysis) Analysis of variance performed on more than one factor (e.g. the effects of gender (factor A) and treatment (factor B) analyzed simultaneously).  Factorial ANOVA permits evaluation of the interaction of the factors on the DV.

F Value: Determined by ANOVA.

Figures: Figure titles go on bottom.

Fisher Post Hoc Test: Not used much.

Frequency Distribution:  History of scores.

Frequency:

Graph: A diagrammatic representation of quantities designed to show their relative values; visual representation of data.

Grouped Frequency Distribution: An ordered listing of the values of a variable organized into groups with a frequency column indicating the number of cases included in each group. 

Groupings: General rule is 6-15 intervals.  <6 loses meaning. 

Histogram: A graph plotting blocked scores against frequency; it is commonly known as a bar graph.

Independent Variable:  The part of the experiment that the researcher is manipulating; also called the experimental or treatment variable.

  • The one the researcher is trying to understand.
  • A Categorical Variable (also called a Moderator Variable) is a kind of independent variable except that it cannot be manipulated, for example, age, race, or sex. 
  • Variable that is free to vary and that is not dependent on the influence of another variable; a variable in the research design that is permitted to exert influence over other variables (i.e. the DV) in the study.  The IV is usually controlled by the research design. 

Integer: A whole number; a natural number, the negatives of these numbers, and zero.

Interval Level of Measurement:

  • “Temperature” would be interval because it is continuous with equally spaced values on a scale.
  • Zero does not denote absence of measure.
  • There is no proportional comparison among scores i.e. one temperature is not twice as cool or twice as hot as another.
  • Keep range narrowed to show realism.

Interval Size: The numerical size of each group in a group frequency distribution i.e. the number of data points in the group.

Kurtosis: A measure of the vertical deviation from normality (amount of peakedness or flatness) in the plot of a data set. 

Labeling Figures & Tables:  Don’t label as “histogram, etc.” if it is obviously a histogram, etc. 

  • Use period at end of title
  • Use sentence structure with only first word capitalized
  • Label axes clearly
  • Put units in full words i.e. Height (in inches)
  • Frequency=Y axis
  • =X axis
  • State in the title what sample or population the data were taken from
  • Table titles go on top; figure titles go on bottom

Leptokurtic: Curve that is more peaked than mesokurtic (normal) curve.

  • NM=Arched like a leopard about to leap.

Level or Variables:  Equivalent in meaning to “condition” and “treatment.”

Level of Confidence: See “Confidence Interval”

Levels of Measurement:  Nominal, ordinal, interval, ratio.

Line of Best Fit:  Determined by Scattergram points i.e. an “estimate.”

MANOVA: (Multivariate Analysis of Variance) Simultaneous analysis of two or more dependent variables in a research design using analysis of variance. 

  • Similar to others (ANOVA, ANCOVA, etc.) but with more than one outcome (DV) variable i.e. the example given for HD risk used not just cholesterol but also systolic BP as another risk factor for heart disease. 

Mean: The arithmetic average score in a distribution.

  • (ΣX/n) (sum of all scores divided by total # of scores or “n”)
  • Provides information about the central tendency of the distribution but is affected by extreme scores.
  • More useful for symmetrical data because outliers can dramatically skew results.
  • Recommended for interval (continuous with equally spaced values on a scale; no proportional comparison) and ratio (continuous; units are equally distanced; values are proportional) data.
  • With symmetrical data, mean is more reflective of “true” central tendency.
  • Trimmed Means: Takes away extremes or ends (10-15%) of bell curve.
    • If trimmed mean and mean are different, then at least one extreme score was trimmed off in the process.  If they are the same, then the extreme scores were not significant. 
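
A minimal sketch of a trimmed mean next to the ordinary mean.  The scores are hypothetical, and trimming 10% from each end of the ranked scores is an assumption taken from the 10-15% range in the notes.

```python
from statistics import mean

def trimmed_mean(scores, proportion=0.10):
    """Drop the given proportion of scores from EACH end, then average the rest."""
    ranked = sorted(scores)
    k = int(len(ranked) * proportion)   # number of scores to drop per end
    kept = ranked[k:len(ranked) - k]
    return mean(kept)

# Hypothetical scores with one extreme low and one extreme high value.
scores = [1, 4, 5, 5, 5, 5, 5, 5, 6, 50]

plain = mean(scores)
trimmed = trimmed_mean(scores)
print(f"mean={plain}, trimmed mean={trimmed}")
```

Here the two values differ, which signals that at least one extreme score was pulling the ordinary mean.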

Measurement: The process of comparing to a standard.

Median:  The 50th percentile or the score that falls midway in the range of ordered values.

  • (Mdn)  The number EXACTLY in the middle of a rank order distribution (simple to find if there is an odd number of scores); the “true” value in the middle.
  • “Even Number” Formula: [X (n/2) + X ([n/2] +1)] /2
  • For odd numbers, the Mdn is simply the score in the middle of the distribution.
  • Mdn is the center score in a ranked distribution.
  • Mdn is a measure of central tendency NOT affected by extreme scores because it’s in the middle of “rank” order.  Rank order does not factor “values” of data points.
  • Mdn is for ordinal and/or highly skewed data.
  • 9, 8, 7, 7, 5, 4, 3 (n=7>odd number so middle # is 7; 7=Mdn)
  • 10, 7, 6, 5, 4, 3 (n=6>even number so use formula; 6+5=11; 11/2=5.5; 5.5=Mdn)
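
The odd and even cases can be sketched as follows, using the two score lists from the examples above.  The function name is made up; Python counts positions from 0, so positions n/2 and (n/2)+1 in the formula become `mid - 1` and `mid` in the code.

```python
def median_score(scores):
    """Return the middle score of a rank order distribution."""
    ranked = sorted(scores, reverse=True)   # highest to lowest
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the single score in the middle.
        return ranked[mid]
    # Even n: average the two scores on either side of the middle.
    return (ranked[mid - 1] + ranked[mid]) / 2

print(median_score([9, 8, 7, 7, 5, 4, 3]))   # odd number of scores
print(median_score([10, 7, 6, 5, 4, 3]))     # even number of scores
```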

Mesokurtic: The typical, bell-shaped, normal curve.

Mode:  The score in a distribution of values that occurs the most often.

  • Most often occurring score in a set of scores.
  • Tells central tendency in a “normal” curve.
  • When 2 or more scores occur the most, there is a bi-modal or multi-modal distribution and they together serve as the modes.
  • Easiest to detect when the distribution is in rank order (highest to lowest or vice versa) or there is a frequency column (highest number).
  • Does not tell a lot about the central tendency unless the distribution is normal. 
  • 10, 12, 13, 13, 14, 15 (normal distribution; mode=13)
  • 10, 10, 12, 14, 15, 16 (skewed & abnormal distribution; mode=10)
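
A short sketch of finding the mode(s), including the bi-modal case, with `statistics.multimode` from the Python standard library (Python 3.8 or later).  The first two lists are the examples above; the third is invented to show two modes.

```python
from statistics import multimode

def modes(scores):
    """Return every score that ties for most frequent."""
    return multimode(scores)

print(modes([10, 12, 13, 13, 14, 15]))   # one mode
print(modes([10, 10, 12, 14, 15, 16]))   # one mode, at the low end
print(modes([10, 10, 12, 14, 14, 16]))   # hypothetical bi-modal set
```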

Multiple Regression Analysis: Many IVs predicting just one DV, where each variable entered into the regression is chosen based on the strength of its relationship with the outcome variable.

  • Used if you have several X’s from which you want to predict a Y.
  • Subform is “Step Wise Regression Analysis”

N:  # of cases or “total” population.

n:  # of those “sampled” or the sub sample.

Nominal Level of Measurement: (categorical)

  • “Gender” would be nominal because only two categories (or options) and these categories are not ranked or ordered.  Gender is categorical and discrete.

Normal Curve: A curve, which has known characteristics, formed by the bilaterally symmetrical, bell-shaped distribution of values around the mean.

Normal Distribution: Peak occurs in middle of bell-shaped curve at 50th percentile.

Null Hypothesis: (H subscript 0) Hypothesis that predicts absence of a relationship among subjects or no differences between or among subjects.  It is typically the hypothesis that is tested statistically. 

One-Tailed Test: Test of research hypothesis wherein the difference between two mean values is predicted to be significant.  It uses only one tail of the normal curve. 

Order of Operations:

  • Please Excuse My Dear Aunt Sally
  • Parentheses, Exponents, Multiply, Divide, Add, Subtract
  • ( ), Exp., X, ÷, +, -

Ordinal Level of Measurement: (categorical)

  • “Height” would be ordinal if it were only ranked from highest to lowest (tallest, second tallest, etc.), because the intervals between ranks are NOT evenly spaced.

Ordinal Scale: A nonparametric listing of data based on order without consideration of the absolute value of each data point i.e. listing from highest to lowest or first, second, third, etc.

  • “Height” would be ordinal if height was classified as at, above, or below average.  Groupings can be ranked relative to one another.  Intervals are not equally spaced or proportional. 

Outlier: A value in a data set that lies beyond the limits of the typical scores. 

Paired T-Test: Compares two means from the same people.  If not sure about direction, use two-tailed test.  If you use one-tailed test, must justify why you suspect this.

Parameter: A characteristic of a population. 

Parametric: Data that meet the assumptions of normality.

Percentile: A point or position on a continuous scale of 100 theoretical divisions such that a certain fraction of the population of scores lies at or below that point. 

  • %=what area of curve falls above or below mean.

Platykurtic: A curve that is more flat than a mesokurtic (normal) curve.

  • NM=flat like a platypus’ tail.

Population: Group of people, places, or things that have at least one common characteristic.

  • Sample must be “random” but can be in different areas.
  • Must reflect that population but note limits to exact population.

Post Hoc Tests: (p. 161) Identifies the groups that differ significantly; goes beyond just indicating that a difference exists at all.  Tells where the mean differences occur and how significant they are. 

  • Performed after ANOVA tests
  • Examples: Scheffe’s Confidence Interval and Tukey’s Honestly Significant Difference (HSD).
  • Similar to T-Test except they have a correction for familywise alpha errors.
  • Some are more conservative.  Conservative means they are less powerful i.e. they require larger mean differences before significance can be found.
  • Conservative tests offer greater protection from Type I errors but are more susceptible to Type II errors. 

Power in Statistics:  “Power” in your study is whether you have the power to find significance.

  • Stat Power=↑N or ↑n
  • ↑N=↓what you need to find significance

Probability of Error: The probability that a statement is incorrect; a value expressed as a decimal that establishes the probability that a statement is incorrect.

P Value: Tells the significance of a relationship.

  • p<.05 (less than 5% chance for error)

Quartile: One-fourth of the range of values.

Quintile: One-fifth of the range of values.

R=Range

Random Sample: A sample taken from a population where every member of the population has an equal chance of being selected in the sample.

Range: The numerical distance from the highest to the lowest score.

  • (max-min=range) Tells extremes of distribution.

Rank Order Distribution: An ordered listing of data in a single column.

Ratio Level of Measurement:

  • “Age” would be ratio because it’s continuous, units are equally distanced i.e. the time between 3 and 4 years is the same as the time between 7 and 8 years old.
  • Values are proportional i.e. 4 years is twice as long as 2 years.

Ratio Scale: A parametric scale of measurement based on order, with equal units of measurement and an established zero point i.e. data based on time, distance, force, or counting events.

Real Limits: The assumed upper and lower values for a group in a grouped frequency distribution that include all possible values on a continuous scale.

Regression: A method of predicting values on the Y-variable based on a value on one or more X-variables and the relationship between the variables; a statistical term meaning prediction. 

  • Also gives Analysis of Variance i.e. knowing X helps to better predict Y.
  • To have a “usable” regression equation, the correlation MUST BE significant.  If X and Y are NOT related, then you cannot predict one from the other!
  • Be careful not to flip X and Y values.

Regression Analysis: Predicts one variable from another.

  • Get a Y intercept i.e. y=mx+b

Reliability: A measure of the consistency of the data when measurements are taken more than once under the same conditions.

Repeated Measures: Measuring the same set of subjects more than once as in a pre-post comparison; same as within-subjects design.

Repeated T-Test: Must use Bonferroni Adjustment to compensate for multiple tests.

R Value:  Tells the strength of the relationship.

  • R=0 means there is no slope and line is flat.  A value of 2 refers to the skewness statistic, not to R (which can’t go beyond +/- 1), and symbolizes a significant skew. 
  • Significant Skew: Can’t run parametric statistics and must run non-parametric stats instead.

Sample: A portion or fraction of a population.

Sampling Error: The amount of error in the estimate of a population parameter based on a sample.

Scale “Identification” of Variables:  (nominal, ordinal, ratio, or interval)

Scale “Types” of Variables:  (discrete or continuous)

Scattergram: Graph plotting X against Y.

  • Find X, then match with Y, repeat for each X, etc.
  • Estimates so will still have some error.
  • Determines “line of best fit.”

Scatter Plot: A scattering of individual points that produces a visual picture of the relationship between the variables.

Scheffe’s Confidence Interval (I): Post hoc test conducted after a significant ANOVA to determine the significance of all possible combinations of cell contrasts. 

  • Adjust p value for each contrast. (p. 161)

Significant: A statistical term meaning that a relationship or a mean difference is not due solely to chance.

Simple Frequency Distribution: An ordered listing of the values of the variable with a frequency column that indicates the number of cases for each value.

Skewed: A plot of values that is not normal i.e. a disproportionate number of subjects fall toward one end of the scale—the curve is not bilaterally symmetrical.

Skewness: A measure of lateral deviation from normality (bilateral symmetry) in the plot of a data set that reflects asymmetry.

  • Negative Skew: low score/outlier present (tail points toward the low end)
  • Positive Skew:  high score/outlier present (tail points toward the high end)
  • No Skew:  symmetric with no extreme high/low scores or outliers

Sorting:

Standard Deviation: A measure of the spread, or dispersion, of the values in a parametric data set standardized to the scale of the unit of measurement of the data; the square root of the average of the squared deviations around the mean.

  • Shows how much a typical score deviates from the mean
  • Population or N:  √[Σ (X-mean)^2/N]
  • Sample or n:  √[Σ (X-mean)^2/(n-1)]
  • √ of variance (variance=S squared)
  • SD should always be positive (+) as well as variance
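
The population (N) and sample (n-1) formulas can be sketched side by side.  The scores are hypothetical and the function names are made up for the example.

```python
from math import sqrt

def population_sd(scores):
    """Square root of the sum of squared deviations divided by N."""
    m = sum(scores) / len(scores)
    return sqrt(sum((x - m) ** 2 for x in scores) / len(scores))

def sample_sd(scores):
    """Square root of the sum of squared deviations divided by n-1."""
    m = sum(scores) / len(scores)
    return sqrt(sum((x - m) ** 2 for x in scores) / (len(scores) - 1))

# Hypothetical scores: mean=5, sum of squared deviations=32.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(f"population SD={population_sd(scores):.2f}")
print(f"sample SD={sample_sd(scores):.2f}")
```

Dividing by n-1 always gives the slightly larger value, which allows for the extra uncertainty of working from a sample instead of the whole population.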

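Standard Error of the Mean: The numeric value that indicates the amount of error in the estimate of a population mean based on a sample mean; the standard deviation of the distribution of sample means.  (The amount of error in the prediction of a Y value in bivariate or multivariate regression is the standard error of the estimate.)

  • SEm=SD/√n (SD of sample; n=sample size)
  • Acts as a standard deviation on the normal curve of sample means.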

Standard Score: (p. 71)  A score that is derived from raw data and that has a standard basis for comparison i.e. it has a known central tendency and a known variability.

  • Four types: percentile (%), T score, Z score, & stanines

Stanine: A standard score based on the division of the normal curve into nine sections, each of which is one-half of a standard deviation wide, with a mean of 5 and a range of 1 to 9. 

Stratified Sample: A series of samples taken from various subgroups of a population so that the subgroups are represented in the total sample in the same proportion that they are found in the population. 

Statement Sentences: (Examples)

  • ANOVA/Repeated Measures=There is statistical difference among means (F=?, p=?).
  • Correlations:
    • (r=.49, df=35, p<.01) There is a low but positive relationship that is significant between male weight and height.
    • (r=.62, df=21, p<.01) There is a low but positive relationship that is significant between male weight and height.
    • (r=.17, df=35, p>.01) There is a positive relationship that is weak and not significant between male weight and resting heart rate.
    • (r=.28, df=21, p>.10) There is a positive relationship that is weak and insignificant between female weight and resting heart rate.
  • Non-significant correlation=There is no significant correlation between X & Y (need to spell out what X & Y represents) in older men (r=?, p>.05).
    • If you are reporting the “real p value,” then round to two places.
  • Probability of error=(r=.42, p<.05) There is a positive relationship that is strong between variable one (describe) and variable two (describe).
    • If not “strong,” then label “moderate” or “weak.”
  • Significant difference between means=There is a significant difference between the mean of group one (G1 Mean=?, SD=?) and the mean of group two (G2 Mean=?, SD=?) for female athletes (F=?, p<.05). 
    • Need to spell out what is “group one” and “group two.”
    • F Value= represents the ratio of between-group and within-group variance.
  • Significant relationship=There is a significant relationship (need to state whether it is positive or negative) between X & Y in older men (r=-.76, p<.05).  As X increases, Y decreases.
    • Need to spell out what X and Y represent.

Statistic: A characteristic of a sample.

Statistics: A mathematical technique by which data are organized, treated, and presented for interpretation and evaluation.

Step Wise Regression Analysis (subform of Multiple Regression Analysis): Adds one variable at a time but only the ones that add strongly to the equation.

  • Only adds measures that add to the equation that are related.
  • Subpoint is “ANOVA Table”

Sum of Squares:  The sum of the squares of the deviation from a mean of a set of scores.  Reflects deviation around the mean. 

  • Sum of each squared deviation.
  • Σ (X-mean)^2
  • Example: If mean height for class is 66” then I’m 72” or +6” over the mean.  The deviation is 6”.  6 squared (6x6) is 36.  Each student in the class needs to do the same thing then ALL the squared deviations are summed.
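
The class-height example can be sketched with a tiny made-up class of three students whose mean height is 66 inches.  Only the 66" mean and the 72" student come from the example above; the other heights are assumptions.

```python
def sum_of_squares(scores):
    """Square each score's deviation from the mean, then add them all up."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

# Hypothetical class heights in inches (mean=66).
heights = [60, 66, 72]

my_deviation = 72 - sum(heights) / len(heights)
print(f"my deviation={my_deviation}, squared={my_deviation ** 2}")
print(f"sum of squares={sum_of_squares(heights)}")
```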

Symmetrical Distribution:  If median and mean are close, probably a symmetrical distribution.

Tables:

  • Table titles go on top.

Tails: If you have a good idea of what way the hypothesis will go, then you’d use a one-tailed test.  If you aren’t sure which way the hypothesis will go, then you’d need to use a two-tailed test.

Tally:

Terminal Statistic: A statistic that does not provide information that can be used in further analysis of the data.

Titles:  Table titles go on top.  Figure titles go on bottom.

Treatment:  Equivalent in meaning to “condition” and “level or variables.”

T Score: A standard score with a mean of 50 and a standard deviation of 10.

  • Often used in education scores. 
  • T scores have a mean of 50 and a range of 0-100 that is easier for the public to understand in comparison to the Z score that has a mean of 0 (most people think “0” means no value and thus don’t realize that a Z score of 0 sits at the 50th percentile). 
  • It is very unlikely that a T Score would be less than 20 or more than 80 because these figures represent Z Scores of –3 and +3 i.e. Z Score % of 99.74% which is quite a high % value.
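
Since a T score has a mean of 50 and an SD of 10, it can be obtained from a Z score as T=50+10Z.  A minimal sketch (the function name is made up):

```python
def t_score_from_z(z):
    """Rescale a Z score (mean 0, SD 1) to a T score (mean 50, SD 10)."""
    return 50 + 10 * z

for z in (-3, 0, 3):
    print(f"Z={z:+d} gives T={t_score_from_z(z)}")
```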

T-Test: Compares two means i.e. evaluates significant difference between group means.

  • Gives positive value
  • P value tells level of significance

T Tests (compares 2 means)               ANOVA (2 or more means can be compared)
  1. Paired (same people over and over)  =Repeated Measures (Trial 1 vs. 2, vs. 3, etc.)
  2. Unpaired (Group 1 vs. 2)            =One-Way (G1 vs. G2, G3, G4, etc.)

 

Tukey’s Honestly Significant Difference (HSD): Post hoc test conducted after a significant ANOVA to determine the significance of pairwise cell contrasts. 

  • Used for repeated measures and most often used to check for significance (similar to the Bonferroni). 

Two-Tailed Test: Test of null hypothesis wherein a difference between two mean values is predicted to be zero.  It uses both tails of the normal curve. 

Type I Error: Rejection of null hypothesis when it is really true.

  • You reject null but shouldn’t have.

Type II Error: Acceptance of the null hypothesis when it is really false. 

  • Don’t reject null but should have.

Unpaired T-Test: Compares two different populations that don’t have correlated data i.e. two different schools. 

Validity: The soundness or correctness of a test or instrument in measuring what it is designed to measure i.e. the truthfulness of the test or instrument.

Variability:  Scatter of scores.

  • Range (useless if you have an extreme score)
  • Doesn’t tell about dispersal
  • “Describe” means to explain meaning specific to that population data set.
  • The four measures of Variability are:
    1. Range
    2. Sum of Squares
    3. Variance (S^2)
    4. Standard Deviation

Variable: A characteristic of a person, place, or thing that can assume more than one value.

  • “Variables” or “Level” is equivalent in meaning to “condition” and “treatment.”

Variance:  A measure of the spread, or dispersion, of the values in a parametric data set; the average of the squared deviations around the mean. 

  • Average of squared deviations from mean
  • Shows how much the data vary among the sample
  • S squared=Σ (X-mean)^2/(n-1)
  • Should always be positive (+)
  • Variability is error (in part):
    • ↑Variability>↓Confidence (clear separation in scatter plot between points)
    • ↓Variability>↑Confidence (no clear area of separation so can’t be as confident if you wanted to just pull out one sample as an example of the whole population)

X Axis:

Y Axis: Frequency

Y Intercept: Point where the extension of the best fit line intercepts the Y-axis. 

Z Score: (Z % areas on p. 70) A standard score with a mean of 0 and a SD of 1.

  • Example was PE comparison of two different years of classes.  A Z score makes both years equivalent so scores can be compared.
  • Appendix Table: tells area under the curve and only how far you are from the mean. 
  • On-Line Table: tells total area from negative (-) to positive (+) under the curve.
  • By using Z score table, you know percentile ranking (%=what area of curve falls above or below the mean).
  • To compute Z Score: Z=(x-mean)/SD
  • Z Mean is 0=always 50% above/50% below
    • If you are above the mean, 50% or more of sample is below you.
  • Z scores typically fall between +/- 3.5 but not beyond +/- 5.
  • Have a Z score but need to find %:  Z=-1>-1=15.87%
    • State Above: 84% of group scored above this person who was at the 16th percentile based on the normal distribution.
    • The % of area under the curve that falls below the score tells the person’s standing in the population. 
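
A sketch of computing a Z score and looking up the area below it, using `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of the printed Z table.  The raw score, mean, and SD are made up so that Z comes out to -1.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Z = (x - mean) / SD."""
    return (x - mean) / sd

def percent_below(z):
    """Percent of the normal curve that falls below a given Z score."""
    return NormalDist().cdf(z) * 100

# Hypothetical raw score of 60 in a class with mean 70 and SD 10.
z = z_score(60, 70, 10)
below = percent_below(z)
print(f"Z={z}, {below:.2f}% scored below, {100 - below:.2f}% scored above")
```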


Ron Jones/www.ronjones.org (11-3-01)

 
