# Statistics: Concept and definitions

The first module of “Practical Statistics for HCI” asks you to research the meanings of terms used in statistics (e.g., factor, level, and so forth). Here are my answers. There may be mistakes, and I would be very glad if you pointed them out.

Factor and Level

A factor of an experiment is a variable that you control. You observe how the result of the experiment changes as you set the factor to different levels, that is, the possible values or magnitudes the factor can take. The terms independent variable, predictor variable, feature, and input variable are used interchangeably with “factor”. Terms such as treatment and group are used for “level”, depending on the literature.

Independent variable

As mentioned above, the term independent variable is similar to factor. One source suggests that “independent variable” implies causality (i.e., the value of a dependent variable varies because of a change in the independent variable), whereas “factor” does not imply any causality.

Dependent variable

The dependent variable is the value that is measured in an experiment. You test whether it changes when the value of an independent variable changes.

Measure

There are four types of measurement scales in statistics: nominal, ordinal, interval, and ratio. I will explain these terms shortly.

Trial

A trial is a single instance of an experiment. For example, imagine you are measuring the height of everyone in your research team. Here, the act of measuring one person’s height is a trial.

Covariate

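Covariate is sometimes used as another term for factor or independent variable. More often, it refers to a variable that you measure rather than control, usually a continuous one, and that you include in the analysis because it may affect the dependent variable.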

Within-subjects design and within-subjects factor (repeated measures)

The within-subjects design is a type of experiment where each participant is tested on more than one level of a factor. The independent variable in a within-subjects design is called a within-subjects factor.

For example, imagine a usability test of two user interfaces, A and B. If each participant in your study uses both A and B and you measure the usability (e.g., task error rate, task completion time), the usability test is a within-subjects design and the interface is the within-subjects factor.

Between-subjects design and between-subjects factor

The between-subjects design is a type of experiment where each participant is tested on one and only one level of a factor. The between-subjects factor refers to the independent variable in a between-subjects design.

For example, imagine a usability test of two user interfaces, A and B. If each participant uses only one of A or B and you measure the usability, the test is a between-subjects design.
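
To make the difference between the two designs concrete, here is a minimal Python sketch of who gets tested on what. The participant IDs and the alternating assignment are assumptions made up for illustration; a real between-subjects study would normally assign participants to levels at random.

```python
# Hypothetical participants and interfaces, used only for illustration.
participants = ["P1", "P2", "P3", "P4"]
levels = ["A", "B"]  # levels of the factor "interface"

# Within-subjects: every participant is tested on every level.
within = {p: list(levels) for p in participants}

# Between-subjects: every participant is tested on exactly one level.
# Alternating assignment keeps the groups the same size; it stands in
# for the random assignment a real study would use.
between = {p: [levels[i % len(levels)]] for i, p in enumerate(participants)}

print(within)
print(between)
```

In the within-subjects mapping each participant appears with both A and B, while in the between-subjects mapping each participant appears with only one of them.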

Factorial design

The factorial design is an experimental design with two or more factors, in which the levels of each factor are typically combined with the levels of every other factor.
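
As a rough illustration, the conditions of a factorial design can be listed by crossing the levels of the factors. The sketch below assumes a fully crossed 2 x 3 design; the factor names and levels are hypothetical.

```python
from itertools import product

# Hypothetical factors and levels for a 2 x 3 factorial design.
factors = {
    "interface": ["A", "B"],
    "device": ["phone", "tablet", "desktop"],
}

# Each condition is one combination of levels, one level per factor.
conditions = [dict(zip(factors, combo)) for combo in product(*factors.values())]

for condition in conditions:
    print(condition)
print(len(conditions))  # 2 * 3 = 6 conditions
```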

Main effect

The main effect is the effect of one independent variable on a dependent variable, averaged across the levels of any other independent variables. The main effect does not take into account interactions between two or more independent variables.
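
The "averaging across the levels" part can be sketched with a few lines of Python. The cell means below are invented numbers for a hypothetical 2 x 2 design (interface x device), not real data, and the helper `marginal_mean` is only an illustrative name.

```python
from statistics import mean

# Made-up cell means (task completion time in seconds) for a 2 x 2 design.
# Keys are (interface, device); none of these numbers come from real data.
cell_means = {
    ("A", "phone"): 30.0,
    ("A", "desktop"): 20.0,
    ("B", "phone"): 40.0,
    ("B", "desktop"): 30.0,
}

def marginal_mean(interface):
    """Average the cell means for one interface across all devices."""
    return mean(v for (i, _), v in cell_means.items() if i == interface)

# Main effect of interface: difference between the two marginal means.
main_effect_interface = marginal_mean("B") - marginal_mean("A")
print(main_effect_interface)  # 35.0 - 25.0 = 10.0
```

A statistical test such as ANOVA would then tell you whether a difference of this size is likely to be more than noise; the sketch only shows what is being compared.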

Interaction

An interaction occurs in a factorial design when the effect of one factor on the dependent variable depends on the level of another factor.
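
One simple way to see an interaction in a 2 x 2 design is to compare the effect of one factor at each level of the other (a "difference of differences"). The sketch below uses invented cell means for hypothetical factors interface and device.

```python
# Made-up cell means (task completion time in seconds) for a 2 x 2 design.
# Keys are (interface, device); the numbers are invented for illustration.
cell_means = {
    ("A", "phone"): 30.0,
    ("A", "desktop"): 20.0,
    ("B", "phone"): 50.0,
    ("B", "desktop"): 22.0,
}

# Effect of interface (B minus A) at each level of device.
effect_on_phone = cell_means[("B", "phone")] - cell_means[("A", "phone")]
effect_on_desktop = cell_means[("B", "desktop")] - cell_means[("A", "desktop")]

# If the two effects differ, the effect of interface depends on device.
interaction = effect_on_phone - effect_on_desktop

print(effect_on_phone, effect_on_desktop, interaction)  # 20.0 2.0 18.0
```

Here interface B is much slower than A on the phone but barely slower on the desktop, so the effect of interface cannot be described without naming the device. If the two differences were equal, the contrast would be zero and there would be no interaction in these cell means.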

Mixed factorial design

The mixed factorial design is, as the name suggests, a mixture of the between-subjects design and the within-subjects design. In the mixed factorial design, at least one independent variable is a between-subjects factor and at least one independent variable is a within-subjects factor.

Confound

The confound, or confounding variable, is a variable that is not considered in the study but correlates with both the dependent variable and the independent variable.

Imagine you are testing the effect of the time participants spend working out (independent variable) on a health index (dependent variable). Let’s say you found that the more participants work out, the healthier they are. In such a setting, age could be a confound. For example, suppose you have 10 participants who are 20-30 years old (young) and 10 participants who are 60-70 years old (old). I would guess that the young group works out more than the old group, and that the young group is also likely to be healthier than the old group. The problem here is that you cannot be sure whether the result comes from participants working out more or from the fact that they are young.
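
The workout example can be sketched with a small, entirely made-up data set. The numbers are chosen so that age drives both workout time and health, and the helper `health_gap` and the 4-hour threshold are assumptions for illustration only.

```python
from statistics import mean

# Made-up records: (age_group, weekly_workout_hours, health_index).
# Young participants mostly work out more and are healthier overall.
records = [
    ("young", 6, 80), ("young", 2, 80), ("young", 6, 82), ("young", 6, 78),
    ("old", 2, 60), ("old", 6, 60), ("old", 2, 62), ("old", 2, 58),
]

def health_gap(rows):
    """Mean health of those who work out more than 4 hours minus the rest."""
    high = [h for _, hours, h in rows if hours > 4]
    low = [h for _, hours, h in rows if hours <= 4]
    return mean(high) - mean(low)

# Ignoring age, working out more looks beneficial.
overall_gap = health_gap(records)

# Looking within each age group, the gap disappears in this toy data.
gap_by_age = {
    age: health_gap([r for r in records if r[0] == age])
    for age in ("young", "old")
}

print(overall_gap)  # 10
print(gap_by_age)   # {'young': 0, 'old': 0}
```

In this toy data the apparent benefit of working out comes entirely from the age groups being mixed together, which is exactly the ambiguity a confound creates.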

Control

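By controlling, you try to separate the effect of a confound from the effect of the independent variable, for example by holding the confound constant, by randomly assigning participants to levels, or by including the confound in the analysis.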

Carryover effect and counterbalancing