With all the investments that colleges and universities make in trying to develop their academic leaders—sending them to conferences and workshops, creating their own in-house professional development programs, assigning new leaders to mentors, and so on—institutions want to know whether they’re getting any return on their investment. In short, does the leadership development that current and prospective academic leaders participate in make any real difference? If so, what difference does it make? And in either case, how do we know?

Two of the most common approaches we can take to assess the impact of leadership development are the Kirkpatrick Model and the Holton Model. The former is named for Donald Kirkpatrick, who taught for many years at the University of Wisconsin and served as president of the American Society for Training and Development (ASTD; now the Association for Talent Development, or ATD). Kirkpatrick believed there were four levels at which development could affect a participant.

Level 1: Reaction

This level of reaction measures whether people liked the development. It is generally assessed as soon as the workshop or development program is over. The participants are given a survey and asked whether the development met their needs. Did it cover the right topics? What other topics should have been included? Was the person in charge of the development effective? What was good about the development, and what could have been done better? Level 1 assessments tell the institution the degree to which participants were satisfied by the program.

Level 2: Learning

This level of learning measures what participants took away from the development. It is generally assessed at multiple points throughout the program. For example, a pretest/posttest approach can be used to determine what the participants knew about the topic before the program began as opposed to what they knew about the topic after the program concluded. While the posttest is typically given immediately after the completion of the development, it may be beneficial to delay it until one to three months after the program so that the results reflect long-term rather than merely short-term memory. Level 2 assessments tell the institution the degree to which participants were informed by the program.
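As a rough illustration of the pretest/posttest comparison described above, the following Python sketch computes the average change in score per participant. The function name `mean_gain` and all of the scores are hypothetical, invented only for this example; a real assessment would use the institution's own instrument and would likely call for a more careful statistical treatment.

```python
from statistics import mean


def mean_gain(pre_scores, post_scores):
    """Average per-participant change from pretest to posttest."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("each participant needs both a pretest and a posttest score")
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))


# Hypothetical scores (percent correct) for five participants.
pre = [40, 55, 60, 35, 50]
immediate_post = [80, 85, 90, 70, 75]   # posttest given right after the program
delayed_post = [70, 80, 85, 60, 70]     # same posttest given again two months later

immediate_gain = mean_gain(pre, immediate_post)
delayed_gain = mean_gain(pre, delayed_post)

print(f"Gain right after the program: {immediate_gain} points")
print(f"Gain two months later: {delayed_gain} points")
```

Comparing the immediate gain with the delayed gain gives a sense of how much of what participants learned was retained beyond short-term memory.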

Level 3: Behavior

This level of behavior measures whether participants changed what they do because of the development. At this level of assessment, it’s useful to survey not just the participants themselves but also the stakeholders working with those participants. Do the academic leaders’ supervisors, peers, and subordinates notice any improvement in the area covered by the leadership development? Is the material from the workshop or program actually being applied? Level 3 assessments are typically conducted three months to a year after the conclusion of the development. They tell the institution the degree to which participants were transformed by the program.

Level 4: Results

This level of results measures whether the institution derived any tangible benefit from the development. Did the investment made in the workshop or program pay off? Has student learning increased? Have interpersonal conflicts decreased? Have resources been saved as a result of the changes in behavior observed during the Level 3 assessment? Results may not be observed for a year or more after the development has occurred, so Level 4 assessment requires institutions to exercise a good deal of patience. This type of assessment tells the institution the degree to which the institution benefited from the program.

Like all forms of assessment, the Kirkpatrick Model shouldn’t be confused with individual evaluation. It is intended to tell you how well the development program functioned, not how well individual participants did during their development. It may well be that some academic leaders liked the development and learned a lot (Levels 1 and 2) but were not able to implement certain aspects of what they learned (Levels 3 and 4) because of external circumstances such as a lack of funds or resistance from their boss.

Most institutions that provide leadership development engage in Level 1 assessment. They distribute surveys after a session is completed and find out whether participants enjoyed the program. Far fewer engage in Level 2 assessment, and Levels 3 and 4, although not unheard of in the corporate world, are quite rare in higher education. As a result, most assessment data collected on leadership development programs consist of satisfaction data. They tell institutions very little about what participants learned from the program, how their behavior changed, or whether academic leadership improved as a result.

For this reason, there are two primary approaches that colleges and universities can use to improve the way in which they assess the impact of academic leadership development. Larger institutions where leadership development centers have a sizable staff can assign impact assessment as part of someone’s duties. If a development program is relatively complex, consisting perhaps of several dozen workshops each semester and a cohort-based leadership program or mentorship opportunity, assessment can become almost a full-time job. Because not all four of the Kirkpatrick levels can be assessed immediately, someone needs to track the best time to follow up with participants and their stakeholders after the development is completed, conduct the assessment, and interpret the results.
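The tracking task described above can be sketched in a few lines of Python. This is only an illustration: the earliest follow-up points are taken from the timing given for each Kirkpatrick level earlier in the article, a month is approximated as 30 days, and the names `EARLIEST_FOLLOW_UP_DAYS` and `follow_up_schedule`, along with the program end date, are hypothetical.

```python
from datetime import date, timedelta

# Earliest follow-up point for each level, in days after the program ends.
# Assumption: one month is approximated as 30 days.
EARLIEST_FOLLOW_UP_DAYS = {
    "Level 1: Reaction": 0,                      # as soon as the program is over
    "Level 2: Learning (delayed posttest)": 30,  # one to three months later
    "Level 3: Behavior": 90,                     # three months to a year later
    "Level 4: Results": 365,                     # a year or more later
}


def follow_up_schedule(program_end):
    """Return the earliest follow-up date for each level of assessment."""
    return {
        level: program_end + timedelta(days=days)
        for level, days in EARLIEST_FOLLOW_UP_DAYS.items()
    }


# Hypothetical program that ended on May 15, 2024.
schedule = follow_up_schedule(date(2024, 5, 15))
for level, due in schedule.items():
    print(f"{level}: follow up on or after {due.isoformat()}")
```

A schedule like this only marks when each window opens; someone still has to decide, within each window, when participants and their stakeholders are best surveyed.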

The other approach is to fold leadership assessment into the responsibilities of an office of institutional research. Because this type of office is already well prepared to assess student learning outcomes and conduct program reviews, it has the expertise needed to perform this type of ongoing research. What may be lacking, however, is upper administration’s desire to add to the responsibilities of a unit that, at most colleges and universities, is already severely overworked.

The Kirkpatrick Model, though widely known in the area of human resource development, has its critics. Unless it is completely incorporated into an institution’s program review process, it tells you little about whether a leadership development program is sustainable or central to the institution’s mission. In addition, it doesn’t account for such factors as the participants’ motivation to learn, their attitude, and the institutional or departmental climate. For this reason, we’ll explore a second major approach to assessing leadership development, the Holton Model, in Part 2 of this article.


Gmelch, W.H., & Buller, J.L. (2015). Building Academic Leadership Capacity: A Guide to Best Practices. San Francisco, CA: Jossey-Bass.

Kirkpatrick, D.L., & Kirkpatrick, J.D. (2006). Evaluating Training Programs: The Four Levels. San Francisco, CA: Berrett-Koehler.

Kirkpatrick, J.D., & Kirkpatrick, W.K. (2016). Kirkpatrick’s Four Levels of Training Evaluation. Alexandria, VA: ATD Press.

Jeffrey L. Buller is director of leadership and professional development at Florida Atlantic University and senior partner in ATLAS: Academic Training, Leadership & Assessment Services. His latest book, the second edition of The Essential Academic Dean or Provost: A Comprehensive Desk Reference, is available from Jossey-Bass.


Reprinted from “Assessing the Impact of Leadership Development: Part 1, The Kirkpatrick Model” in Academic Leader 33.8(2017)1,6 © Magna Publications. All rights reserved.