How can multimedia enhance cognition?

The Cognitive Theory of Multimedia Learning applied to instructional products

The cognitive theory of multimedia learning (CTML) (Clark & Mayer, 2024) is increasingly vital to instructional design. CTML is grounded in four fundamental assumptions that elucidate how humans process information: dual channel, limited capacity, short duration, and active processing (Clark & Mayer, 2024). Effective application of these principles can minimize extraneous processing, manage essential processing, and foster generative processing.

Design in Context

The Learning and Development (L&D) department at Road Nation Car Insurance was tasked with training employees on new insurance claims management software that would be used company-wide. Instructional videos were designed and developed to train employees on software use. These videos consisted of screen captures that demonstrated different actions being performed in the software, with descriptions of each task presented via on-screen text. The videos required learners to follow mouse movements on screen to learn the workflow. Initial performance assessments revealed that 85% of employees were unable to successfully complete claim processes in the new software within two standard deviations of the average completion time. This resulted in a 30% increase in the claims backlog and a 15% decrease in customer satisfaction ratings as measured by customer satisfaction surveys.
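The completion-time criterion in this scenario can be made concrete with a short calculation. The sketch below is illustrative only. It assumes that the "average completion time" comes from a reference group (for example, pilot users) rather than from the trainees themselves, and that only times slower than the upper bound count as misses. The function name `share_outside_benchmark`, the benchmark values, and the sample times are all hypothetical and are not taken from the case.

```python
def share_outside_benchmark(times, benchmark_mean, benchmark_sd, k=2.0):
    """Return the fraction of completion times above benchmark_mean + k * benchmark_sd."""
    if not times:
        raise ValueError("times must not be empty")
    upper_bound = benchmark_mean + k * benchmark_sd
    too_slow = [t for t in times if t > upper_bound]
    return len(too_slow) / len(times)


# Hypothetical data (minutes per claim); invented for illustration.
benchmark_mean = 12.0
benchmark_sd = 2.0
trainee_times = [15.5, 18.0, 17.2, 21.0, 16.4, 19.8, 14.0, 22.5, 16.9, 18.3]

share = share_outside_benchmark(trainee_times, benchmark_mean, benchmark_sd)
print(f"{share:.0%} of trainees exceeded the benchmark window")
```

Stating where the benchmark comes from matters: if the mean and standard deviation were computed from the same trainees, only a small share of them could fall outside a two-standard-deviation window, so a figure as high as 85% implies an external reference group.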

Discussion Questions

What mistakes do you think the team made when designing and developing these instructional videos, based on previous experiences you’ve had with similar training?
What changes should be made to the videos to improve their effectiveness? What helped you reach that decision?

Key Design Principles

Key principles for designing effective multimedia include:

Design for both the visual/pictorial and the auditory/verbal channels of our information processing systems (Mayer, 2005). This can be done by using strategies from the Cognitive Theory of Multimedia Learning.

Avoid overwhelming either information processing channel by using strategies that manage cognitive load.

Integrate accessibility features to maximize perceivability and operability for all learners.

Introduction

The cognitive theory of multimedia learning (CTML) (Clark & Mayer, 2024) is increasingly vital to instructional design, particularly as technology continues to evolve and reshape how we deliver and engage with learning materials. CTML provides a framework for understanding how learners process information through multimedia formats, such as videos, animations, and interactive modules. As instructional environments become more reliant on digital tools, especially as a result of the COVID-19 pandemic (Adedoyin & Soykan, 2023), it is essential to explore how these principles can guide learning design practices and enhance learning outcomes.

CTML is rooted in cognitive theories that describe how individuals process information. Baddeley's (1992) model of working memory, cognitive load theory (CLT) (Sweller et al., 1990), and Wittrock's (1992) generative learning theory provide foundational insights into learning processes. Baddeley's (1992) model posits that working memory consists of multiple components, including the phonological loop, the visuospatial sketchpad, and the central executive, which together manage the processing and storage of information. When learners are presented with both visual and auditory information, they can use separate channels for processing, thereby enhancing their ability to retain and manipulate information.

Furthermore, Sweller et al. (1990) emphasized the importance of managing cognitive load to enhance learning outcomes. Instructional materials should be designed to minimize extraneous cognitive load, that is, mental effort imposed by ineffective or unnecessary tasks that do not directly contribute to learning or problem-solving. Learners should instead be able to focus on intrinsic load, the effort needed to process the inherent complexity of the learning material, and germane load, the effort required to construct and automate mental models and encode knowledge into long-term memory.

Wittrock's (1992) generative learning theory further complements these cognitive foundations by focusing on the active role of learners in constructing meaning from instructional materials. Wittrock (1992) argued that learners engage in generative processes, such as summarizing, questioning, and making connections, which enhance comprehension and retention. This theory underscores the importance of designing multimedia materials that encourage active engagement, allowing learners to generate their own understanding rather than passively receiving information.

Principles of Multimedia Learning

CTML is grounded in four fundamental assumptions that elucidate how humans process information: dual channel, limited capacity, short duration, and active processing (Clark & Mayer, 2024). The dual channel assumption holds that humans have separate channels for processing visual and auditory information. This assumption is supported by the notion that the brain can process information through these two modalities simultaneously, which enhances learning when the channels are used together effectively (Mayer & Moreno, 2003). The limited capacity assumption highlights the constraints of working memory, which can only hold a finite amount of information at any given time. The short duration assumption refers to the temporary nature of sensory memory, which retains information for only a brief period before it is either processed further or lost. Finally, the active processing assumption emphasizes the necessity for learners to engage actively with the material to construct meaningful knowledge. This involves not only receiving information but also organizing, integrating, and applying it (Mayer, 2024).

Given these characteristics and limitations, Clark and Mayer (2024) proposed principles that guide the design of effective multimedia instructional materials to manage mental load. This chapter focuses on eight of them: coherence, signaling, redundancy, contiguity, segmenting, pre-training, personalization, and embodiment. Each principle enhances the learning experience and is aligned with how humans process information:

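The coherence principle suggests that people learn better when extraneous material is excluded rather than included.
The signaling principle suggests that learners benefit from cues that highlight the organization and important aspects of the material.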
Following the redundancy principle, designers should present graphics with narration alone rather than adding on-screen text that duplicates the narration.
The contiguity principle suggests that learners also perform better when corresponding words and pictures are presented close together rather than separated.
The segmenting principle posits that people learn better when information is presented in manageable segments rather than as an uninterrupted flow.
The pre-training principle emphasizes that learners benefit from prior exposure to key concepts before engaging with more complex material.
The personalization principle suggests that learners learn better when the instructional material is presented in a conversational style rather than a formal tone.
Finally, the embodiment principle indicates that learning is more effective when on-screen agents perform relevant gestures and maintain eye contact.

Effective application of these principles can minimize extraneous processing, manage essential processing, and foster generative processing. This is not an exhaustive list of the principles of multimedia learning; rather, these principles were selected for their broad applicability to a variety of multimedia types.

Minimizing Extraneous Processing

To optimize learning through multimedia, it is essential to create instructional materials that leverage both auditory and visual modalities. For instance, narration can provide verbal explanations while accompanying visuals, such as infographics or videos, illustrate the concepts being discussed. This approach allows learners to process information through both channels simultaneously, which has been shown to improve comprehension and transfer (Mayer & Moreno, 2003). The redundancy and contiguity principles relate to this guidance.

The image below illustrates the application of these principles in an instructional video discussing best practices for use of Articulate Rise 360, a web-based e-Learning development tool. Narration is used to present instructional content, while the visuals provide an example that contextualizes the concept being discussed (Proximity). This practice highlights the redundancy principle. Furthermore, temporal and spatial contiguity are also considered in the video as the proximity between the elements presented on the screen is emphasized with the blue arrow as the concept is discussed in the narration.

A screenshot of a video demonstrating contiguity.

Figure 1. Example of contiguity

It is also crucial to eliminate seductive details such as unnecessary animations, non-relevant content, and multimedia elements that do not contribute to the learning objectives in instructional materials (Rey, 2012). For example, in a module that introduces a scientific concept, superfluous animations or distracting background music should be avoided, as they can overwhelm learners and detract from the core content. Instead, the focus should be on clear, relevant visuals that directly relate to content and contribute to understanding. The coherence principle relates to this practice.

Effective signaling within instruction is also important for promoting learning and decreasing cognitive overload. Appropriate signaling can guide learners' attention to essential information and enhance their ability to process and retain knowledge, in addition to their satisfaction (Sung & Mayer, 2012). For instance, using visual cues such as color changes or animations to highlight specific content or elements in an instructional video can significantly improve comprehension. Additionally, employing verbal cues in the narration to emphasize critical concepts can further direct learners' focus. Modern video editing software, such as TechSmith Camtasia, Adobe Premiere, and DaVinci Resolve, simplifies the application of the signaling principle in videos by offering user-friendly features for adding on-screen emphasis. See Figures 2 and 3 for examples of signaling in a training video on performing statistical analysis in the software JASP.

Screenshot from JASP software showing a highlighted section.

Figure 2. Example of signaling: Highlight

Screenshot from software showing highlighted content, zoomed in.

Figure 3. Example of signaling: Zoom in

Managing Essential Processing

Providing pre-training relevant to the target audience is an effective strategy that makes content complexity manageable from the start, thus avoiding cognitive overload. Pre-training involves providing learners with foundational knowledge before delving into more complex material. This approach prepares learners to engage with the content actively and enhances their ability to integrate new information with prior knowledge. Before introducing a new process, product, or concept, a brief overview of essential terminology that learners will encounter can help them build a framework for understanding the content that follows (Mayer, 2021). When teaching novice e-Learning developers how to use an interactive development tool such as Articulate Storyline, for example, it is important to introduce relevant terminology associated with important functionality, such as variables, triggers and states.

The segmenting principle can also assist in managing essential processing. Segmentation involves grouping related information into smaller units, making it easier for learners to process and remember content (Biard et al., 2018). For instance, a procedural video on a software application can present features in distinct sections marked by visual cues (such as title slides) or by platform functionality, such as YouTube’s chapters (Seidel, 2024). That said, segmentation strategies can depend on contextual and platform variables. For example, a learning management system such as Canvas does not offer the chapter feature included in YouTube; thus, designers should segment video in other ways, such as breaking content into separate parts or adding title slides before a new concept is introduced.
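As one way to put platform-based segmenting into practice, the sketch below turns a list of segment start times into the timestamp lines that a video description can carry. It is a minimal illustration under stated assumptions: the helper names `format_timestamp` and `build_chapter_list` and the segment titles are hypothetical. The only rule it enforces is that the first chapter starts at 0:00. Other platform requirements, such as a minimum number or length of chapters, should be checked against the platform's current documentation.

```python
def format_timestamp(seconds):
    """Format a non-negative number of whole seconds as M:SS or H:MM:SS."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def build_chapter_list(segments):
    """Turn (start_seconds, title) pairs into chapter lines for a video description."""
    ordered = sorted(segments)
    if not ordered or ordered[0][0] != 0:
        raise ValueError("the first segment must start at 0 seconds")
    return "\n".join(f"{format_timestamp(start)} {title}" for start, title in ordered)


# Hypothetical segments for a software walkthrough video.
segments = [
    (0, "Overview"),
    (45, "Opening a new claim"),
    (190, "Attaching documents"),
]

chapters = build_chapter_list(segments)
print(chapters)
```

The same segment list can also drive title slides or separate video files on platforms that lack a chapter feature, so the segmentation plan is kept in one place.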

Fostering Generative Processing

Fostering generative processing through multimedia design is essential for creating effective learning experiences that engage learners and facilitate meaningful understanding. Learners need to actively engage with the material for generative processing to occur, and this engagement can be facilitated by applying the personalization and embodiment principles (Clark & Mayer, 2024; Mayer, 2021). Using a conversational style in instruction can significantly enhance learner engagement and facilitate generative processing. Research shows that learners respond more positively to an instructor who uses a friendly, informal tone than to one who uses formal, academic language. Positive emotions can also enhance generative processing (Clark & Mayer, 2024).

The use of on-screen agents or avatars can provide a presence that guides learners through the material. An on-screen agent in a language learning module can, for example, model pronunciation and engage learners in practice exercises, promoting active participation and reinforcing learning. On-screen agents can also enhance learning by directing learners with their gaze, eye contact, and pointing gestures (Li et al., 2023; Wang et al., 2018). Tools such as user-friendly animation creators (Vyond, Powtoon) as well as AI-powered talking head creators (Synthesia, D-ID) can facilitate the creation of on-screen agents and the effective application of the embodiment principle. Figure 4 below provides an illustration of an on-screen agent created using Vyond.

Figure 4. Example of on-screen agent

Design Challenge

Integrate the principles of Coherence, Signaling, Contiguity, Segmenting, Pretraining, Personalization, and Embodiment within one instructional video or e-Learning module that effectively trains employees in the basics of cybersecurity, such as ensuring password security as well as recognizing and reporting phishing attempts. Give your video to a peer and ask them to point out the CTML principles in your video. If they struggle to identify the principles in your video, ask them how you could make that part of the video clearer.

Conclusion

CTML reflects the ongoing advancements in empirical research and technology, which continuously reshape our understanding of how learners process information (Mayer, 2024). New methods by which learners can interact with content will also continue to emerge. For example, personalized feedback (Becerra et al., 2024) and adaptive learning pathways (Wang et al., 2023), increasingly enabled by AI technologies, might enhance the generative learning process, enabling learners to construct knowledge more effectively.

It is important to emphasize that designing instruction based on principles of multimedia learning does not guarantee the success of instruction. The design of multimedia content should also adhere to accessibility standards, such as those defined by the World Wide Web Consortium (W3C) (World Wide Web Consortium, 2024). At a minimum, instructional materials should include functionality that allows learners with disabilities to navigate and perceive content. This includes providing closed captions for videos (see Figure 1), supplying alternative text for images, and ensuring that all content can be navigated by keyboard. By adhering to these standards, instructional designers can create materials that are accessible to all learners, thereby enhancing the overall learning experience.
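Closed captions are commonly delivered as a WebVTT text file that a video player loads alongside the video. The sketch below shows one way such a file could be generated from a narration script. The helper names `vtt_timestamp` and `build_webvtt`, the cue timings, and the caption text are hypothetical. It covers only the basic cue structure, not styling or positioning.

```python
def vtt_timestamp(seconds):
    """Format seconds as the HH:MM:SS.mmm timestamp used in WebVTT cues."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues):
    """Build WebVTT text from (start_seconds, end_seconds, caption) tuples."""
    blocks = ["WEBVTT"]  # required file header
    for start, end, text in cues:
        if end <= start:
            raise ValueError("each cue must end after it starts")
        blocks.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}\n{text}")
    # Cues are separated from the header and from each other by blank lines.
    return "\n\n".join(blocks) + "\n"


# Hypothetical narration cues for a training video.
cues = [
    (0.0, 3.5, "Welcome to the claims software overview."),
    (3.5, 7.25, "First, open a new claim from the dashboard."),
]

captions = build_webvtt(cues)
print(captions)
```

Keeping caption timing aligned with the narration also supports temporal contiguity, since the text appears while the corresponding action is on screen.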

Knowledge Check

Why is it important to minimize extraneous processing, and what strategies can be used to achieve this in multimedia instruction?
What is the coherence principle, and how does it affect the design of multimedia learning materials?
Explain the signaling principle and provide an example of its application in multimedia learning.
What is the segmenting principle, and how does it help manage essential processing?
How can pre-training help learners?
Explain how the embodiment principle enhances multimedia learning and provide an example of its application.

Richard E. Mayer, Roxana Moreno, Ruth Colvin Clark, Logan Fiorella, John Sweller, Alan Baddeley, and Merlin C. Wittrock are key scholars to review related to this topic, including the following book chapters and articles:

Clark, R. C., & Mayer, R. E. (2024). Chapter 2: How people learn from e-courses. In Clark, R. C., & Mayer, R. E. E-learning and the science of instruction: Proven guidelines for consumers and designers of multimedia learning (5th ed.). John Wiley & Sons.

Clark, R. C., & Mayer, R. E. (2024). Chapter 18: Designing effective instructional video. In Clark, R. C., & Mayer, R. E. E-learning and the science of instruction: Proven guidelines for consumers and designers of multimedia learning (5th ed.). John Wiley & Sons.

Mayer, R. E. (2024). The past, present, and future of the cognitive theory of multimedia learning. Educational Psychology Review, 36(1), 8. https://doi.org/10.1007/s10648-023-09842-1

Swann, W. (2023). Developing just-in-time video resources to support online and remote teaching. International Journal of Designs for Learning, 14(1), 1–10. https://doi.org/10.14434/ijdl.v14i1.34538

References

Adedoyin, O. B., & Soykan, E. (2023). Covid-19 pandemic and online learning: the challenges and opportunities. Interactive Learning Environments, 31(2), 863-875. https://doi.org/10.1080/10494820.2020.1813180

Baddeley, A. (1992). Working memory. Science, 255(5044), 556-559.

Becerra, Á., Mohseni, Z., Sanz, J., & Cobos, R. (2024, May). A generative AI-based personalized guidance tool for enhancing the feedback to MOOC learners. In 2024 IEEE Global Engineering Education Conference (EDUCON). IEEE.

Biard, N., Cojean, S., & Jamet, E. (2018). Effects of segmentation and pacing on procedural learning by video. Computers in Human Behavior, 89, 411-417. https://doi.org/10.1016/j.chb.2017.12.002

Clark, R. C., & Mayer, R. E. (2024). E-learning and the science of instruction: Proven guidelines for consumers and designers of multimedia learning. John Wiley & Sons.

Li, W., Wang, F., & Mayer, R. E. (2023). How to guide learners' processing of multimedia lessons with pedagogical agents. Learning and Instruction, 84, 1-13. https://doi.org/10.1016/j.learninstruc.2022.101729

Mayer, R. E. (2005). Cognitive theory of multimedia learning. In R. E. Mayer (Ed.), The Cambridge handbook of multimedia learning (pp. 31–48). Cambridge University Press. https://doi.org/10.1017/CBO9780511816819.004

Mayer, R. E. (2021). Evidence-based principles for how to design effective instructional videos. Journal of Applied Research in Memory and Cognition, 10(2), 229-240. https://doi.org/10.1016/j.jarmac.2021.03.007

Mayer, R. E. (2024). The past, present, and future of the cognitive theory of multimedia learning. Educational Psychology Review, 36(1), 8. https://doi.org/10.1007/s10648-023-09842-1

Mayer, R. E., Heiser, J., & Lonn, S. (2001). Cognitive constraints on multimedia learning: When presenting more material results in less understanding. Journal of Educational Psychology, 93(1), 187. https://doi.org/10.1037/0022-0663.93.1.187

Mayer, R. E., & Moreno, R. (2003). Nine ways to reduce cognitive load in multimedia learning. Educational Psychologist, 38(1), 43-52. https://doi.org/10.1207/S15326985EP3801_6

Rey, G. D. (2012). A review of research and a meta-analysis of the seductive detail effect. Educational Research Review, 7(3), 216-237. https://doi.org/10.1016/j.edurev.2012.05.003

Seidel, N. (2024). Short, long, and segmented learning videos: From YouTube practice to enhanced video players. Technology, Knowledge, and Learning. https://doi.org/10.1007/s10758-024-09745-2

Sung, E., & Mayer, R. E. (2012). Affective impact of navigational and signaling aids to e-learning. Computers in Human Behavior, 28(2), 473-483. https://doi.org/10.1016/j.chb.2011.10.019

Sweller, J., Chandler, P., Tierney, P., & Cooper, M. (1990). Cognitive load as a factor in the structuring of technical material. Journal of Experimental Psychology: General, 119(2), 176-192. https://doi.org/10.1037/0096-3445.119.2.176

Wang, F., Li, W., Mayer, R. E., & Liu, H. (2018). Animated pedagogical agents as aids in multimedia learning: Effects on eye-fixations during learning and learning outcomes. Journal of Educational Psychology, 110(2), 250-268. https://doi.org/10.1037/edu0000221

Wang, S., Christensen, C., Cui, W., Tong, R., Yarnall, L., Shear, L., & Feng, M. (2023). When adaptive learning is effective learning: comparison of an adaptive learning system to teacher-led instruction. Interactive Learning Environments, 31(2), 793-803. https://doi.org/10.1080/10494820.2020.1808794

Wittrock, M. C. (1992). Generative learning processes of the brain. Educational Psychologist, 27(4), 531-541. https://doi.org/10.1207/s15326985ep2704_8

World Wide Web Consortium. (2024). Web content accessibility guidelines (WCAG) overview. Retrieved October 3, 2024, from https://www.w3.org/WAI/standards-guidelines/wcag/