American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Washington, DC: Author.
Barman, A. (2011). Feasibility of applying classic test theory in testing reliability of student assessment. International Medical Journal, 18(2), 110–113.
Beck, A. T., Steer, R. A., & Garbin, M. G. (1988). Psychometric properties of the Beck Depression Inventory: Twenty-five years of evaluation. Clinical Psychology Review, 8(1), 77–100.
Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Manual for the Beck Depression Inventory—II. San Antonio, TX: Psychological Corporation.
Beckman, T. J., Cook, D. A., & Mandrekar, J. N. (2005). What is the validity evidence for assessments of clinical teaching? Journal of General Internal Medicine, 20(12), 1159–1164.
Bergkvist, L., & Rossiter, J. R. (2007). The predictive validity of multiple-item versus single-item measures of the same constructs. Journal of Marketing Research, 44(2), 175–184.
Betty Ford Institute Consensus Panel. (2007). What is recovery? A working definition from the Betty Ford Institute. Journal of Substance Abuse Treatment, 33(3), 221–228.
Blalock, H. M. (1968). The measurement problem: A gap between the languages of theory and research. In H. M. Blalock, Jr., & A. B. Blalock (Eds.), Methodology in social research (pp. 5–27). New York: McGraw-Hill.
Blalock, H. M. (1984). Basic dilemmas in the social sciences. Beverly Hills: Sage.
Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.
Bollen, K., & Lennox, R. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110(2), 305–314.
Borsboom, D. (2008). Latent variable theory. Measurement, 6, 25–53.
Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111(4), 1061–1071.
Brayfield, A. H., & Rothe, H. F. (1951). An index of job satisfaction. Journal of Applied Psychology, 35(5), 307–311.
Brennan, R. (2000). (Mis)conceptions about generalizability theory. Educational Measurement: Issues and Practice, 19(1), 5–10.
Brennan, R. L. (2006). Perspectives on the evolution and future of educational measurement. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 1–16). Westport, CT: Praeger.
Brennan, R. L. (2011). Generalizability theory and classical test theory. Applied Measurement in Education, 24, 1–21.
Cabrera-Nguyen, P. (2010). Author guidelines for reporting scale development and validation results in the Journal of the Society for Social Work and Research. Journal of the Society for Social Work and Research, 1(2), 99–103.
Campbell, D. T. (1960). Recommendation for APA test standards regarding construct, trait, or discriminant validity. American Psychologist, 15, 546–553.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the Multitrait–Multimethod Matrix. Psychological Bulletin, 56, 81–105.
Cattell, R. B. (1978). Scientific use of factor analysis in behavioral and life sciences. New York: Plenum.
Cizek, G., Bowen, D., & Church, K. (2010). Sources of validity evidence for educational and psychological tests: A follow-up study. Educational and Psychological Measurement, 70(5), 732–743.
Cizek, G., Rosenberg, S. L., & Koons, H. H. (2008). Sources of validity evidence for educational and psychological tests. Educational and Psychological Measurement, 68(3), 397–412.
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in scale development. Psychological Assessment, 7(3), 309–319.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. New York: Lawrence Erlbaum.
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997–1003.
Corcoran, K., & Fischer, J. (2000). Measures for clinical practice. New York: The Free Press.
Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78(1), 98–104.
Cronbach, L. J. (1950). Further evidence on response sets and test design. Educational and Psychological Measurement, 10, 3–31.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281–302.
Davis, L. L. (1992). Instrument review: Getting the most from a panel of experts. Applied Nursing Research, 5(4), 194–197.
Diamantopoulos, A., Riefler, P., & Roth, K. P. (2008). Advancing formative measurement models. Journal of Business Research, 61(12), 1203–1218.
Diamantopoulos, A., & Siguaw, J. A. (2006). Formative versus reflective indicators in organizational measure development: A comparison and empirical illustration. British Journal of Management, 17(4), 263–282.
Diamantopoulos, A., & Winklhofer, H. M. (2001). Index construction with formative indicators: An alternative to scale development. Journal of Marketing Research, 38(2), 269–277.
Dickersin, K., Min, Y. I., & Meinert, C. L. (1992). Factors influencing publication of research results. JAMA, 267(3), 374–378.
Edelen, M. O., & Reeve, B. B. (2007). Applying item response theory (IRT) modeling to questionnaire development, evaluation, and refinement. Quality of Life Research, 16(1), 5–18.
Embretson, S. (1983). Construct validity: Construct representation versus nomothetic span. Psychological Bulletin, 93(1), 179–197.
Embretson, S. (2010). Cognitively based assessment and the integration of summative and formative assessments. Measurement: Interdisciplinary Research & Perspective, 8(4), 180–184.
Fanelli, D. (2010). “Positive” results increase down the hierarchy of the sciences. PLoS One, 5(4), e10068.
Feldman, R. A., & Siskind, A. B. (1997). Outcomes measurement in the human services. Washington, DC: National Association of Social Workers.
Freese, J. (2007). Replication standards for quantitative social science: Why not sociology? Sociological Methods & Research, 36(2), 153–172.
French, D. P., & Sutton, S. (2011). Does measuring people change them? The Psychologist, 24, 272–274.
Friedman, M. (1975). Interview with Richard Heffner on The Open Mind. http://www.thirteen.org/openmind/public-affairs/living-within-our-means/494/
Frisbie, D. A. (2005). Measurement 101: Some fundamentals revisited. Educational Measurement: Issues and Practice, 24(1), 21–28.
Gehlert, S. J. (1994). The applicability of generalizability theory to social work research and practice. The Journal of Social Service Research, 18, 73–88.
Gillespie, D. F. (1988). Barton’s theory of collective stress is a classic and worth testing. International Journal of Mass Emergencies and Disasters, 6(3), 345–361.
Graham, J. R. (1987). The MMPI: A practical guide. Oxford University Press.
Green, S. B., Lissitz, R. W., & Mulaik, S. A. (1977). Limitations of coefficient alpha as an index of test unidimensionality. Educational and Psychological Measurement, 37(4), 827–838.
Guilford, J. P. (1954). Psychometric methods. New York: McGraw-Hill.
Guo, B., Perron, B., & Gillespie, D. F. (2008). A systematic review of structural equation modeling in social work research. British Journal of Social Work, 39, 1556–1574.
Hallgren, K. A. (2012). Computing inter-rater reliability for observational data: An overview and tutorial. Tutorials in Quantitative Methods for Psychology, 8(1), 23–34.
Hambleton, R. K., Swaminathan, H. Y., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park: Sage.
Hattie, J. (1985). Methodology review: Assessing unidimensionality of tests and items. Applied Psychological Measurement, 9, 139–164.
Haynes, S. N., Richard, D., & Kubany, E. S. (1995). Content validity in psychological assessment: A functional approach to concepts and methods. Psychological Assessment, 7(3), 238–247.
Heggestad, E. D., George, E., & Reeve, C. L. (2006). Transient error in personality scores: Considering honest and faked responses. Personality and Individual Differences, 40, 1201–1211.
Hinkin, T. R. (1995). A review of scale development practices in the study of organizations. Journal of Management, 21(5), 967–988.
Hogan, T. P., & Angello, J. (2004). An empirical study of reporting practices concerning measurement validity. Educational and Psychological Measurement, 64, 802–812.
Hogan, T. P., Benjamin, A., & Brezinski, K. L. (2000a). Reliability methods: A note on the frequency of use of various types. Educational and Psychological Measurement, 60(4), 523–531.
Hogan, T. P., Benjamin, A., & Brezinski, K. L. (2000b). Reliability methods: Frequency of use of various types. Educational and Psychological Measurement, 60, 523–531.
Howell, R. D. (2008). Observed variables are indeed more mysterious than commonly supposed. Measurement: Interdisciplinary Research & Perspective, 6, 97–101.
Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Thousand Oaks, CA: Sage Publications, Inc.
Jarvis, C. B., MacKenzie, S. B., & Podsakoff, P. M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of Consumer Research, 30(2), 199–218.
Johnston, M. (1999). Mood in chronic disease: Questioning the answers. Current Psychology, 18, 71–87.
Kane, M. T. (2001). Current concerns in validity theory. Journal of Educational Measurement, 38(4), 319–342.
Kane, M. (2011). The errors of our ways. Journal of Educational Measurement, 48(1), 12–30.
Kashy, D. A., Donnellan, M. B., Ackerman, R. A., & Russell, D. W. (2009). Reporting and interpreting research in PSPB: Practices, principles, and pragmatics. Personality and Social Psychology Bulletin, 35(9), 1131–1142.
Kelvin, W. T. (1871). Presidential inaugural address to the General Meeting of the British Association, Edinburgh.
Kerlinger, F. (1968). Foundations of behavioral research. New York: Holt, Rinehart and Winston.
King, G. (1995). Replication, replication. PS: Political Science and Politics, 28, 444–452.
Kishton, J. M., & Widaman, K. F. (1994). Unidimensional versus domain representative parceling of questionnaire items: An empirical example. Educational and Psychological Measurement, 54(3), 757–765.
Kline, P. (1979). Psychometrics and psychology. London: Academic Press.
Kline, R. B. (2010). Principles and practice of structural equation modeling. New York: Guilford Press.
Kline, R. B. (2012). Assumptions of structural equation modeling. In R. Hoyle (Ed.), Handbook of structural equation modeling (pp. 111–125). New York: Guilford Press.
Kuhn, T. S. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.
Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3, 635–694.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley Publishing Company.
MacKenzie, S. B. (2003). The dangers of poor construct conceptualization. Journal of the Academy of Marketing Science, 31(3), 323–326.
Madsen, D. (2004). Stability coefficient. In M. S. Lewis-Beck, A. Bryman, & T. F. Liao (Eds.), Encyclopedia of social research methods (pp. 1064–1065). Thousand Oaks, CA: SAGE.
Marsh, H. W. (1996). Positive and negative global self-esteem: A substantively meaningful distinction or artifactors? Journal of Personality and Social Psychology, 70(4), 810.
Mayfield, D., McLeod, G., & Hall, P. (1974). The CAGE questionnaire: Validation of a new alcoholism screening instrument. American Journal of Psychiatry, 131(10), 1121–1123.
Maxwell, J. C. (1870). Address to the Mathematical and Physical Sections of the British Association, Liverpool.
McCambridge, J., & Day, M. (2007). Randomized controlled trial of the effects of completing the Alcohol Use Disorders Identification Test questionnaire on self-reported hazardous drinking. Addiction, 103, 241–248.
McGrath, J. E. (1982). Dilemmatics: The study of research choices and dilemmas. In J. E. McGrath, J. Martin, & R. A. Kulka (Eds.), Judgment calls in research (pp. 69–102). Beverly Hills: Sage.
McGrath, R. (2005). Conceptual complexity and construct validity. Journal of Personality Assessment, 85(2), 112–124.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York: Macmillan.
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749.
Mileti, D. (1999). Disasters by design: A reassessment of natural hazards in the United States. Washington, DC: National Academies Press.
Mills, C. W. (1959). The sociological imagination. New York: Oxford University Press.
Mowday, R. T., Steers, R. M., & Porter, L. W. (1979). The measurement of organizational commitment. Journal of Vocational Behavior, 14(2), 224–247.
Muchinsky, P. M. (1996). The correction for attenuation. Educational and Psychological Measurement, 56(1), 63–75.
Mulaik, S. A. (1994). The critique of pure statistics: Artifact and objectivity in multivariate statistics. In B. Thompson (Ed.), Advances in social science methodology (pp. 241–289). Greenwich, CT: JAI.
Netemeyer, R. G., Bearden, W. O., & Sharma, S. (2003). Scale development in the social sciences: Issues and applications. Palo Alto: Sage Publications.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Perron, B. E., Zeber, J. E., Kilbourne, A. M., & Bauer, M. S. (2009). A brief measure of perceived clinician support by patients with bipolar spectrum disorders. The Journal of Nervous and Mental Disease, 197(8), 574–579.
Raykov, T., & Marcoulides, G. A. (2011). Classical item analysis using latent variable modeling: A note on a direct evaluation procedure. Structural Equation Modeling, 18(2), 315–324.
Rogers, W. M., & Schmitt, N. (2004). Parameter recovery and model fit using multidimensional composites: A comparison of four empirical parceling algorithms. Multivariate Behavioral Research, 39, 379–412.
Rossiter, J. R. (2002). The C-OAR-SE procedure for scale development in marketing. International Journal of Research in Marketing, 19(4), 305–335.
Rozeboom, W. W. (1966). Scaling theory and the nature of measurement. Synthese, 16, 170–233.
Rubio, D. M., Berg-Weger, M., Tebb, S. S., Lee, E. S., & Rauch, S. (2003). Objectifying content validity: Conducting a content validity study in social work research. Social Work Research, 27(2), 94–104.
Schmidt, F. (2010). Detecting and correcting lies that data tell. Perspectives on Psychological Science, 5(3), 233–242.
Schmidt, F. L., & Hunter, J. E. (1996). Measurement error in psychological research: Lessons from 26 research scenarios. Psychological Methods, 1(2), 199–223.
Schmidt, F. L., & Hunter, J. E. (1999). Theory testing and measurement error. Intelligence, 27(3), 183–198.
Schmidt, F. L., Le, H., & Ilies, R. (2003). Beyond alpha: An empirical examination of the effects of different sources of measurement error on reliability estimates for measures of individual differences constructs. Psychological Methods, 8(2), 206–224.
Schmidt, F. L., Viswesvaran, C., & Ones, D. (2000). Reliability is not validity and validity is not reliability. Personnel Psychology, 53(4), 901–912.
Schumacker, R. E., & Lomax, R. G. (2004). A beginner’s guide to structural equation modeling. New York: Lawrence Erlbaum.
Sellers, S. L., Mathiesen, S., Perry, R., & Smith, T. (2004). Evaluation of social work journal quality: Citation vs reputation approaches. Journal of Social Work Education, 40(1), 143–160.
Society for Social Work and Research. (2005). Peer review and publication standards in social work journals: The Miami statement. Social Work Research, 29(2), 119–121.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103(2648), 677–680.
Streiner, D. L., & Norman, G. R. (2008). Health measurement scales: A practical guide to their development and use. Oxford University Press.
Suddaby, R. (2010). Editor’s comments: Construct clarity in theories of management and organization. Academy of Management Review, 35(3), 346–357.
Thompson, B., & Daniel, L. G. (1996). Factor analytic evidence for the construct validity of scores: A historical overview and some guidelines. Educational and Psychological Measurement, 56(2), 197–208.
Viswanathan, M. (2005). What causes measurement error? In T. Alpern, M. Crouppen, L. Lech, L. C. Shaw, & M. Viswanathan (Eds.), Measurement error and research design (1st ed., pp. 135–148). Thousand Oaks: Sage.
Williams, L. J., & O’Boyle, E. H. (2008). Measurement models for linking latent variables and indicators: A review of human resource management research using parcels. Human Resource Management Review, 18, 233–242.
Worthington, R. L., & Whittaker, T. A. (2006). Scale development research: A content analysis and recommendations for best practices. The Counseling Psychologist, 34(6), 806–838.