Bahariniya S, Ezatiasar M, Madadizadeh F. A Brief Review of the Types of Validity and Reliability of scales in Medical Research. JCHR. 2021; 10(2):100-102
Dear Editor in Chief
Reliability and validity are concepts used to evaluate the quality of the psychometric properties of a scale in medical research. Reliability relates to the reproducibility of a scale, and validity relates to its accuracy. Both should be checked whenever a scale or questionnaire is used. This letter provides a brief overview of the types of validity and reliability in medical research.
  • Reliability
Reliability concerns whether a measuring instrument yields the same results, or drastically different ones, when it is applied repeatedly under the same conditions to the same group of subjects (1). In other words, reliability refers to the stability of the results across repeated measurements with a scale.
The different types of reliability measurement methods are as follows (2):
  1. Test-retest reliability
In this method, the subjects are measured twice with the same instrument, at an interval of one or two weeks. The correlation between the scores recorded in the first and second measurements is then calculated, usually as the intraclass correlation coefficient (ICC) from a two-way mixed-effects model with the absolute agreement type.
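As an illustration of the test-retest calculation, the following is a minimal sketch of a single-measurement, absolute-agreement ICC computed from two-way ANOVA mean squares. The function name `icc_absolute_agreement` and the example scores are hypothetical and do not come from the cited references. The sketch assumes complete data (every subject measured on every occasion) and uses the standard single-measure absolute-agreement formula, which to our understanding has the same computational form under the two-way mixed and two-way random models.

```python
def icc_absolute_agreement(scores):
    """Single-measurement, absolute-agreement ICC from a two-way ANOVA.

    scores: list of rows, one per subject; each row holds that subject's
    score on each measurement occasion (two columns for test-retest).
    Assumes complete data with no missing values.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of measurement occasions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects (rows), occasions (columns), and error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement penalizes systematic shifts between occasions,
    # which enter through the column mean square.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores for five subjects measured two weeks apart.
test_retest = [[12, 13], [18, 17], [25, 26], [9, 11], [21, 20]]
print(round(icc_absolute_agreement(test_retest), 3))
```

In practice, an established statistical package would normally be used, since it also reports confidence intervals for the ICC.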
  2. Interrater reliability
This method examines whether different people obtain the same results when they use the same tool. The ICC and the Kappa statistic are used to measure interrater reliability for ordinal/interval and nominal responses, respectively (2).
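For nominal responses from two raters, the Kappa statistic can be sketched as follows. This assumes Cohen's kappa (two raters, unweighted); the function name `cohen_kappa` and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters and nominal categories."""
    n = len(rater_a)
    # Observed proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of eight cases by two raters.
a = ["mild", "mild", "severe", "none", "mild", "severe", "none", "none"]
b = ["mild", "severe", "severe", "none", "mild", "severe", "mild", "none"]
print(round(cohen_kappa(a, b), 3))
```

Kappa corrects the raw agreement rate for the agreement that would be expected if both raters assigned categories independently at their own base rates.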
  3. Internal consistency
This method examines the internal consistency of the questions, that is, whether the results from different parts of the tool agree with the overall results of the tool.
Kuder-Richardson Formula 20 (KR20), Cronbach's alpha (α), and person separation reliability (R) are three criteria for calculating internal consistency. The internal consistency of a tool is strongly influenced by random responses: as the number of random responses in the data increases, the degree of internal consistency decreases (3).
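Two of these criteria can be sketched in a few lines. The code below is an illustration only: the function names `cronbach_alpha` and `kr20` and the example responses are hypothetical, and complete data are assumed. Population variances are used throughout; for Cronbach's alpha the choice between population and sample variance does not change the result, provided the same choice is made for the items and for the total score. Person separation reliability comes from Rasch modelling and is not covered here.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha. items: one row per respondent, one column per item."""
    k = len(items[0])
    item_vars = [pvariance([row[j] for row in items]) for j in range(k)]
    total_var = pvariance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def kr20(items):
    """KR20 for dichotomous items scored 0 or 1."""
    n = len(items)
    k = len(items[0])
    # p is the proportion of respondents scoring 1 on each item.
    p = [sum(row[j] for row in items) / n for j in range(k)]
    pq = sum(pj * (1 - pj) for pj in p)
    total_var = pvariance([sum(row) for row in items])
    return k / (k - 1) * (1 - pq / total_var)


# Hypothetical answers of six respondents to four yes/no (1/0) questions.
answers = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]
print(round(kr20(answers), 3), round(cronbach_alpha(answers), 3))
```

For 0/1 items the two functions return the same value, because the variance of a dichotomous item is p(1 - p); KR20 is the special case of alpha for such items.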
  • Validity
Validity concerns whether a test achieves the goal it was designed for. The term derives from a word meaning "correct and permissible," and it denotes "correctness." In other words, validity is how accurately a scale measures what it is designed to measure. A highly valid study produces results that are consistent with reality.
The different types of validity measurement methods are as follows:
  1. Face validity
Face validity deals with the appearance of the tool and asks whether the scale appears, on its face, to be valid. Rewording items can usually improve face validity to some extent. Specifically, face validity examines the difficulty, appropriateness, and ambiguity of the questions in a scale or questionnaire. It can be assessed in two ways, qualitatively and quantitatively. In the qualitative method, several subjects or specialists (usually 5 to 10 people) are interviewed about the level of difficulty, appropriateness, and ambiguity in the questions.
In the quantitative method, respondents are asked to rate each question in the questionnaire as "Not at all important," "Slightly Important," "Important," "Fairly Important," or "Very Important." The average rating given to each question is then multiplied by the percentage of people who considered that question important or very important, which yields a score for each question. Questions with a score of less than 1.5 are removed from the set of questions (4).
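The quantitative score described above can be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: the five response options are coded 1 to 5, a rating of 4 or 5 counts as "important or very important," and the percentage is expressed as a fraction between 0 and 1 so that the 1.5 cut-off is on a meaningful scale. The function name `impact_score` and the ratings are hypothetical.

```python
def impact_score(ratings, important_from=4):
    """Item impact score: mean rating times the share of 'important' ratings.

    ratings: importance ratings for one question, coded 1 to 5 (assumed coding).
    important_from: lowest rating counted as important (assumed to be 4).
    """
    mean_rating = sum(ratings) / len(ratings)
    share_important = sum(r >= important_from for r in ratings) / len(ratings)
    return mean_rating * share_important


# Hypothetical ratings of one question by ten respondents.
ratings = [5, 5, 4, 4, 3, 2, 5, 4, 1, 5]
score = impact_score(ratings)
print(round(score, 2), "keep" if score >= 1.5 else "remove")
```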
  2. Content validity
In this method, the content of the scale is examined. Content validity can be evaluated qualitatively and quantitatively. In the qualitative method, experts are asked to comment on the grammar of the questions and on how the response options of each question are scored. Any corrections they consider necessary should be made according to their opinion (5). In the quantitative method, the content validity ratio (CVR) is calculated to examine the necessity of each question, and the content validity index (CVI) is calculated to check the relevance of each question to the purpose of the research (6).
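The two quantitative indices can be sketched as follows. The text does not give the formulas, so the sketch assumes the commonly used definitions: Lawshe's CVR, based on the number of experts who rate a question as essential, and an item-level CVI equal to the proportion of experts who rate the question 3 or 4 on a four-point relevance scale. The function names and the panel data are hypothetical.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: ranges from -1 (no expert) to +1 (every expert)."""
    half = n_experts / 2
    return (n_essential - half) / half


def item_cvi(relevance_ratings, relevant_from=3):
    """Item-level CVI: share of experts rating the item as relevant.

    relevance_ratings: ratings on an assumed 1 to 4 relevance scale.
    relevant_from: lowest rating counted as relevant (assumed to be 3).
    """
    relevant = sum(r >= relevant_from for r in relevance_ratings)
    return relevant / len(relevance_ratings)


# Hypothetical panel: 8 of 10 experts rate a question as essential,
# and five experts rate its relevance on the 1 to 4 scale.
print(content_validity_ratio(8, 10))
print(item_cvi([4, 3, 4, 2, 4]))
```

The minimum acceptable CVR depends on the size of the expert panel, so the result should be compared with a published critical-value table rather than a fixed cut-off.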
  3. Construct validity
Construct validity refers to how well a scale reflects the theoretical construct it targets, that is, whether the results produced by the tool are consistent with theoretical evidence. This type of validity can be checked with exploratory and confirmatory factor analysis.
  4. Criterion validity
Criterion validity compares the results of the scale with the results of other existing scales that measure the same concept. Its two most common forms are predictive validity and concurrent validity (other types are convergent and discriminant). In predictive validity, the current scale is used to predict the future characteristics of individuals. In concurrent validity, the participants are assessed at the same time with the current instrument and an existing instrument, and the results are compared. The Pearson correlation coefficient is a measure used to check this type of validity (8).
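As an illustration of the concurrent case, the Pearson correlation between scores on a new scale and scores on an existing reference scale can be sketched as follows. The function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical scores of six participants on the new and the existing scale.
new_scale = [14, 22, 9, 30, 18, 25]
existing_scale = [16, 21, 11, 28, 20, 23]
print(round(pearson_r(new_scale, existing_scale), 3))
```

A coefficient close to 1 suggests that the new scale ranks participants in much the same way as the reference scale.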
In summary, the validity and reliability of a tool are major concerns for health researchers. Tools that lack the required validity and reliability do not provide dependable results. Researchers need to be familiar with the concepts of validity and reliability briefly presented in this article so that they can apply them properly in their work and thereby improve the quality of health research.
Authors' contributions
S.B. and F.M. conceived of the presented idea. S.B. wrote the manuscript with support from F.M. S.B. and F.M. read the manuscript and verified it.
Conflict of interest
The authors had no conflict of interest.
