Edinburgh Research Explorer
Measuring morality in infancy

We conducted a scoping review of methods that have been used to measure infants' (under age 2) moral development. We aimed to assess the state of knowledge through a thematic overview of the methodologies that have been used and the specific constructs studied. We found that the majority of studies used an experimental methodology, and within this, infants' actual behaviour and their evaluations were the most common sources of information. An evidence map depicting concept delineation between studies and presenting concepts as related to an underlying moral sense, as prosocial (emotion and behaviour), and as antisocial components (emotion and behaviour) is provided. Just under one-third of studies were longitudinal, and a high percentage reported a statistically significant longitudinal relation for moral development. Results highlight a need for measures that can be used longitudinally at different stages of development so that trajectories can be observed and mapped to behavioural outcomes, such as conduct problems.

One of the hallmarks of early social development is the emergence of the ability to reason about choices that have a moral component, for example, obeying the rules. Infants begin to reason about actions that have moral underpinnings at a very early age, within the first year of life (e.g., Hamlin, Wynn, & Bloom, 2007; Jin, Houston, Baillargeon, Groh, & Roisman, 2018; Meristo & Surian, 2014), and as they mature they become increasingly able to reason about complex situations in which they must juggle multiple social and moral rules (Smetana, Metzger, Gettman, & Campione-Barr, 2006). With the increase in the study of morality (e.g., see reviews by Dahl & Killen, 2018; Hamlin, 2013; Lapsley & Carlo, 2014) has come a myriad of methods for measuring morality in infancy, as well as challenges in aligning definitions and operationalizations of morality in a way that optimally advances the field in future studies.
Illuminating infants' emerging sense of morality is of considerable value for understanding the early foundation of moral behaviour and how this develops over time. For example, the foundation of early moral choices and behaviour may set the stage for developmental patterns through childhood and adolescence, and may explain how some children tend towards a trajectory of morally and socially accepted behaviour from an early age, while others tend towards trajectories of unaccepted behaviour, such as delinquency and violence. To date, this has been neglected as a focus of empirical research, despite the considerable ability of early social behaviour (e.g., aggression) to predict later social behaviour (Campbell, 1991; Loeber, 1991; Moffitt & Caspi, 2001; Patterson, DeGarmo, & Knutson, 2000; Shaw, Bell, & Gilliom, 2000; White, Moffitt, Earls, & Robins, 1990).
To support such research, there is value in first conducting an overview of existing studies that have focused on the early moral development of infants (2 years of age and under). We used a scoping review technique (Arksey & O'Malley, 2005; Colquhoun et al., 2014), following the guidelines of the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR). Scoping reviews are used to identify and synthesize a broad body of literature that is not focused on a specific set of questions. For example, a scoping review tends to be useful for 'answering broad questions, such as "What information has been presented on this topic in the literature"' (Sucharew & Macaluso, 2019, p. 416). Scoping reviews are well-suited for large bodies of literature that are heterogeneous (Peters et al., 2015), and can be useful for assessing the scope of research literature (Grant & Booth, 2009). This approach can also be useful for revealing themes and identifying gaps in the literature (Peters et al., 2015).
Our scoping review surveyed studies of moral development in infancy to (a) assess the methods that have been used to measure early moral development, (b) characterize those methods and how they map onto domains of moral development, and (c) potentially identify early flags for conduct problems within the moral domain of infants' behaviour. The findings of our scoping review can be used to assess the current state of the evidence, and to inform future study designs in the field.

| Defining morality
Morality is a broad term and it is thus important to consider how it has been and can be defined. Infants' early moral development provides a foundation for how they engage in society in terms of right (or accepted) versus wrong (unaccepted, against the rules or laws) behaviour. Morality is not defined in a consistent manner in the field, a situation complicated by arguments that what is moral for one may not be moral for another (Greene, 2007; Turiel, 2015; Wynn & Bloom, 2014). Multiple perspectives overlap, with perhaps the key divergences being where a sense of morality originates and whether it has a stable or dynamic developmental process. For example, some researchers argue that morality is intuitive and that infants have an innate social core (e.g., Hamlin, 2013; Van de Vondervoort & Hamlin, 2016). Some researchers consider morality to derive more from a developmental process that occurs through exposure to culture and rational thinking (e.g., Thompson, 2012; Wynn et al., 2018). Others take a constructivist approach, which holds that morality is constructed as part of a developmental process (Dahl & Killen, 2018). This latter perspective views morality with a focus on how one's actions affect third parties, or, in other terms, 'prescriptive norms concerning others' welfare, rights, fairness, and justice' (Dahl & Killen, 2018, p. 2; as used by Killen & Rutland, 2011; Turiel, 2015). We adopt this latter definition as the framework for our review and expand on this perspective to include the element that morality stems from an internal standard, which becomes the basis that determines right actions from wrong actions, with such a standard being reflected in part through external rules that act as social regulators for the benefit and protection of society. We highlight that this definition is not specific to or dependent on age or developmental abilities; rather, it represents a conceptual umbrella definition for morality, as children's moral development is a gradual process from early infancy onwards through adulthood.
Operationalizing morality. Morality as a construct can be divided into three subconstructs that are more easily operationalized in child and infant-based research. The first is moral cognition or moral judgement, which is the ability to reason or think about value-based decisions (Buon, Seara-Cardoso, & Viding, 2016). The ability to measure moral cognition is increasing due to advances in the field of neuroscience (Narvaez & Vaydich, 2008), and one example of how this has affected the ability to study moral cognition is the use of eye-tracking in infancy studies (e.g., Cowell & Decety, 2015). In young preverbal infants, looking time has been used as a proxy for moral cognition (e.g., Meristo, Strid, & Surian, 2016), following the violation of expectation (VoE) approach, in which looking time is used to distinguish between expected and unexpected events (e.g., Hamlin et al., 2007; He, Bolz, & Baillargeon, 2011; Hespos & Baillargeon, 2008; Jin et al., 2018). However, this approach has also been criticized because looking time does not necessarily represent infants' true preferences as a proxy for their moral thinking on a subject (Tafreshi, Thompson, & Racine, 2014).
The second subconstruct within morality is moral emotions. Moral emotions are emotions that arise based on evaluations of how one's actions may affect another, as well as those that arise based on evaluations of another's emotional state (Malti & Dys, 2015). Following from our above-expanded definition of morality, moral emotions may also arise based on violations of one's own internal standard. Moral emotions include those such as guilt, shame, remorse, compassion, gratitude, and elevation (Malti & Dys, 2015; Tangney et al., 2007), and can be connected to one's own feelings or feelings for another.
The third subconstruct within morality is moral behaviour or action, which is manifest behaviour that has a moral foundation or component to it, for example, honesty, obeying the rules, and treating others in socially accepted and ethically correct ways (Krettenauer, 2012). Many studies that have focused on the early emergence of morality have, broadly speaking, studied early social behaviour that is thought to be a precursor to early moral judgement abilities (Dahl & Killen, 2018). This can be seen in, for example, seminal works by Warneken and Tomasello (2006), who studied spontaneous helping behaviour in toddlers. Behaviour in the moral domain has a value-based component, and often impacts the self or another person (e.g., behaviour that is considered right or wrong, and behaviour that can harm the self or another person; Smetana, 2013), whereas social behaviour is largely based on social conventions or personal choice (Killen & Smetana, 2015). However, given that many studies have relied on measures of social behaviour as a proxy or precursor to moral judgement, this systematic and scoping review included measures of both moral behaviour and social behaviour with a moral component (e.g., sharing and comforting) to ensure that overlap was captured.
Although these constructs (moral cognition, emotions, and behaviour) are correlated, the correlations are far from unity, suggesting that they are not only conceptually but also empirically distinct (e.g., Krettenauer, 2012; Lakoff & Johnson, 1999; Lavoie, Nagar, & Talwar, 2017; Lavoie, Leduc, Crossman, & Talwar, 2016). This distinctness can also be observed at a developmental level, where many of the contributing developmental processes are maturing at different rates (Krettenauer, 2012).

| Development of early morality
Evidence of infants' morality can be detected early. It can be seen, for example, in even young infants' general preferences for prosocial characters who help others, rather than antisocial characters who work against others' efforts (Dunfield & Kuhlmeier, 2010; Hamlin et al., 2007; Jin & Baillargeon, 2017; Meristo & Surian, 2014; Olson & Spelke, 2008; Ziv & Sommerville, 2017). It can also be seen in their tendency to spontaneously help adults around them early in their second year of life (Warneken & Tomasello, 2006), and in their desire to repair the harm that they have accidentally caused to another person (Hepach, Vaish, & Tomasello, 2017). Young children's sense of morality can also manifest in ways such as obeying the rules placed on them by external control agents (e.g., parents, teachers, and early childhood educators) and behaving in ways that support the welfare of others within their peer groups. For example, recent studies have measured toddlers' evaluations of defending actions and their expectations of the rewards and punishments following an act of aggression (e.g., Geraci, 2020a; Geraci, 2020b; Geraci & Surian, 2021), and have found, largely, that toddlers prefer characters who defend other characters who had themselves been victimized. The results of such studies suggest that even in infancy, babies have a sense of morality, for example, around helping and hurting actions, and highlight a likely innate component to morality, following the natural-tendency view (Bloom, 2013; Hamlin et al., 2007). These findings also highlight the importance of the infancy stage in establishing a foundational understanding of morality in children's lives.
Building from a natural-tendency view of a possible innate aspect of morality, there is also the consideration that infants do not develop in a vacuum, but rather in a complex ecosystem (Bronfenbrenner, 2005) that includes extensive early social interaction with parents and caregivers. Following the social-interactional view that early behaviours develop through social interactions (e.g., Dahl, 2015), one of the key ways in which early morality develops is through socialization between caregivers and the child. This socialization begins early in infancy, as infants are tended by parents and caregivers who respond to their needs, such as feeding and comforting (Hammond et al., 2017). One study found that infants with secure caregiver attachment were more likely to look longer at an unresponsive caregiving scenario than those without secure attachment, which suggests that the infants with secure attachment were surprised by the unresponsive caregiving scenario (Johnson, Dweck, & Chen, 2007). These findings highlight an association between early caregiver responsiveness, through attachment, and an infant's socio-moral evaluations.
Further, the parenting approaches used in the home serve to shape children's moral behaviour. For example, parents who use warmth and support over power assertion and punishment promote compliant behaviour from an early age (Kochanska & Aksan, 2006). Through childhood, parents' own moral standards such as sensitivity to justice and fairness help to shape the development of moral behaviour according to the child's individual characteristics (e.g., Padilla-Walker & Christensen, 2011). For example, toddlers with a higher fear response internalize attitudes emphasized by parents with gentle discipline techniques, whereas children with lower fear responses do not obtain the same arousal from discipline, and may respond better to discipline that is based on mother-child cooperation and secure attachment (Kochanska, 1993, 1995; Kochanska, Aksan, & Joy, 2007). Early morality, then, while it likely has a natural component, is also constantly being shaped and moulded according to social interactions, for example, responsiveness to need, praise, and withdrawal of attention or affection.
Infants' moral cognition continues to develop as they age and mature (Thompson, 2012) and gain increasingly complex cognitive skills. For example, toddlers' attentive imitation of their mothers during a teaching task, which requires social-cognitive skills such as theory of mind, is associated with a stronger conscience in early childhood (Forman, Aksan, & Kochanska, 2004). In sum, the development of morality is affected by multiple processes, including complex interactions with the social environment (such as parent and caregiver socialization) as well as cognitive processes and skills.
Early morality and later problematic behaviours? A longstanding question among researchers, parents, and professionals working with young children is which types of early behaviours flag the potential onset of later problematic behaviours. With the identification of early-onset behaviours also comes the ability to act in a preventative manner.
Research findings with school-age children suggest that moral behaviour during some experimental paradigms is associated with behaviour problems (e.g., Lavoie, Wyman, Crossman, & Talwar, 2018; Mugno et al., 2019). Thus, there is the potential that early morality may act as a predictive indicator for problem behaviours that may become more manifest as the infant or toddler ages. In support of this, studies have found associations between early infant behaviour or emotions, and later problem behaviour. For example, infants' negative reactivity during a mildly distressing activity (Still Face Paradigm, where the mother keeps a still face during an interaction with the child) is associated with fewer symptoms of oppositional-defiance and conduct disorder in childhood (Wagner et al., 2016). More specifically related to morality, infants' preferences for prosocial characters are associated with fewer callous-unemotional traits in early childhood (Tan, Mikami, & Hamlin, 2018). In our scoping review, we explored whether early measures of morality may have the potential to flag later behaviours of concern through the assessment of studies that included a longitudinal component.

| Purpose for scoping review of measuring morality in infancy
Given the relation between early trajectories of problematic behaviour and later problematic behaviour through childhood and adolescence (Loeber, 1991; Moffitt & Caspi, 2001), the study of early childhood moral development may be used to inform prevention. One key to being able to conduct this type of study is knowledge of the available methods that can be used to validly, reliably, and feasibly assess moral development. For this reason, we have conducted a scoping review (Colquhoun et al., 2014) of the methods used to study infants' moral development. As mentioned, scoping reviews are a literature-gathering method for broadly assessing the literature on a specific topic to assist in knowledge synthesis, mapping important concepts, and identifying gaps in the field (Arksey & O'Malley, 2005; Colquhoun et al., 2014).
In our scoping review, we have systematically identified and collected empirical studies that have measured moral development in infancy to generate a thematic overview of the types of methods that have been used as well as the specific constructs within moral development that have been the focus of empirical study. Our secondary aim within this review was to assess whether existing methods in the field may be able to capture variation in moral behaviour that may flag the onset of conduct problems.

| METHOD
We conducted a scoping review (Arksey & O'Malley, 2005; Colquhoun et al., 2014) of moral development in infancy and followed the PRISMA-ScR as a guideline. Scoping reviews are useful for broad searches that are not restricted to a fixed question or set of questions (Sucharew & Macaluso, 2019), and that can be used to assess the scope of the research literature (Peters et al., 2015). Given that we wanted to generate a snapshot of the current scope of the methods used to measure moral development in infancy and the specific concepts studied within moral development in infancy, we selected a scoping review approach.
In our searches, we used the specific terms 'moral development' and 'infant' or 'toddler' within the main library catalogue and the included databases PsycInfo, Scopus, PsycArticles, and Social Science Citation Index (from 1990 onwards), with 6,748 results returned. We conducted a text analysis to determine the topical scope of the results returned by the overall search. A simple word frequency analysis of the titles and abstracts of all returned results demonstrated that common terms yielded from the search included children (n = 4,334), social ), and infants (n = 1,578), which highlights the scope of the returned results.
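The word frequency analysis described above can be illustrated with a minimal sketch. It is not the procedure used in the review, whose tooling is not reported; the function name `term_frequencies`, the record fields `title` and `abstract`, the stopword list, and the two example records are all assumptions made for illustration.

```python
import re
from collections import Counter


def term_frequencies(records, stopwords=frozenset()):
    """Count lowercase word occurrences across titles and abstracts."""
    counts = Counter()
    for record in records:
        # Combine the two text fields; missing fields count as empty text.
        text = f"{record.get('title', '')} {record.get('abstract', '')}".lower()
        # Words are runs of letters, optionally with one internal apostrophe.
        words = re.findall(r"[a-z]+(?:'[a-z]+)?", text)
        counts.update(w for w in words if w not in stopwords)
    return counts


# Hypothetical records for illustration only (not data from the review).
records = [
    {"title": "Infants prefer helpers",
     "abstract": "Infants looked longer at social events."},
    {"title": "Social evaluation in children",
     "abstract": "Children and infants evaluate social agents."},
]

freq = term_frequencies(records, stopwords={"in", "at", "and"})
print(freq.most_common(3))
```

Counting raw occurrences, as here, can differ from counting the number of records that contain a term; the source does not state which of the two the reported n values represent.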
Screening process. We conducted an initial screen of all of the articles returned by our search criteria above. We retained for further review and inclusion studies that (a) were empirical, (b) had a moral development component (broadly defined to include prosocial: socially accepted and expected behaviour that benefits another person; or antisocial: unaccepted behaviour that can hurt the child or others), (c) had a child sample population of 2 years and under (24 months and under), and (d) had an English translation accessible. Each article that met these criteria was retained in citation management software for coding. This screening process resulted in 70 articles. An additional 19 studies were added from hand searches of key articles that had been returned through the scoping review search, for a final total of 89 articles. All retained articles were assessed for clarity and rigour of the method and results, and each article passed a minimum threshold for low risk of bias.
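The four inclusion criteria and the resulting article counts can be expressed as a short sketch. The `Candidate` fields and the `passes_screen` function are hypothetical names, and operationalizing criterion (c) as "the youngest participants are 24 months or under" is an assumption; the screening itself was a judgement made by the reviewers, not an automated filter.

```python
from dataclasses import dataclass

MAX_AGE_MONTHS = 24  # criterion (c): sample aged 24 months and under


@dataclass
class Candidate:
    # Hypothetical screening fields; names are illustrative only.
    is_empirical: bool           # criterion (a)
    has_moral_component: bool    # criterion (b): prosocial or antisocial focus
    youngest_age_months: float   # used for criterion (c), an assumed reading
    english_available: bool      # criterion (d)


def passes_screen(c: Candidate) -> bool:
    """Return True only if all four inclusion criteria are met."""
    return (
        c.is_empirical
        and c.has_moral_component
        and c.youngest_age_months <= MAX_AGE_MONTHS
        and c.english_available
    )


# Counts reported in the text: database screen plus hand searches.
retained_from_screen = 70
added_from_hand_search = 19
final_total = retained_from_screen + added_from_hand_search
print(final_total)
```

The arithmetic at the end simply reproduces the reported total: 70 articles from the screen plus 19 from hand searches gives the 89 articles reviewed.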
Coding process. Each article was reviewed in duplicate and information regarding the study method and outcome was extracted. We extracted (a) the primary method within the study (experimental: any type of procedure for measuring infants' behaviour in a specific type of paradigm or procedure; observational: structured or unstructured observations of free-activity such as unscripted interactions or play; or questionnaire-based); (b) main outcome type (moral behaviour: the child's own moral behaviour during an activity, e.g., sharing; moral evaluation: the child's moral evaluation of a scenario or event, e.g., whether they determine an action was good or bad; ability to identify emotions: the child identifies emotions; or child's own emotions: the child's own emotionality, e.g., how they reacted to a specific scenario); (c) sample characteristics (participant number and age range); (d) specific aspect of morality measured by the outcome variable (e.g., fairness, empathic concern, and compliance with prohibitions); (e) whether the study was longitudinal; and (f) if longitudinal, whether there was a relation between the time 1 (T1) outcome and the time 2 (T2) outcome to assess for predictive ability and associations between measures of morality and other factors.
Although we were most interested in assessing whether early measures of morality could predict later child behaviour, we looked at longitudinal relations more broadly to explore whether there was an evidence basis for predictive associations between morality and other factors (and vice versa, in studies where such was the case). A coding manual is available in Data S2 for further detail.
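The extraction fields (a) to (f) lend themselves to a simple record structure. The sketch below is one possible representation, assuming hypothetical names (`CodedStudy`, `Method`, `Outcome`, `tally`) and invented example rows; it does not reproduce the coding manual in Data S2.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Method(Enum):
    EXPERIMENTAL = "experimental"
    OBSERVATIONAL = "observational"
    QUESTIONNAIRE = "questionnaire"


class Outcome(Enum):
    MORAL_BEHAVIOUR = "moral behaviour"
    MORAL_EVALUATION = "moral evaluation"
    IDENTIFY_EMOTIONS = "ability to identify emotions"
    OWN_EMOTIONS = "child's own emotions"


@dataclass(frozen=True)
class CodedStudy:
    method: Method                          # (a) primary method
    outcome: Outcome                        # (b) main outcome type
    n_participants: int                     # (c) participant number
    age_range_months: Tuple[float, float]   # (c) youngest and oldest age
    aspect: str                             # (d) e.g. "fairness"
    longitudinal: bool                      # (e) longitudinal design or not
    t1_t2_relation: Optional[bool] = None   # (f) coded only if longitudinal

    def __post_init__(self):
        # Field (f) is defined only for longitudinal studies.
        if not self.longitudinal and self.t1_t2_relation is not None:
            raise ValueError("T1-T2 relation is coded only for longitudinal studies")


def tally(studies):
    """Count coded studies by (method, outcome) pair."""
    return Counter((s.method, s.outcome) for s in studies)


# Invented rows for illustration only (not extracted from the review).
example = [
    CodedStudy(Method.EXPERIMENTAL, Outcome.MORAL_EVALUATION, 32, (6, 10),
               "preference for prosocial characters", False),
    CodedStudy(Method.OBSERVATIONAL, Outcome.MORAL_BEHAVIOUR, 60, (14, 24),
               "compliance with prohibitions", True, True),
]
print(tally(example))
```

A tally of this kind, grouped by method and outcome, is the sort of summary that a cross-tabulation of study types and main outcomes would draw on.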
To synthesize the data, we extracted results from each of the studies in a table format, and subsequently used an evidence mapping approach (Hetrick, Parker, Callahan, & Purcell, 2010; Miake-Lye, Hempel, Shanman, & Shekelle, 2016) to visualize the spread and overlap of aspects of morality that were measured across studies. The organization of themes was guided by theoretical discussions by Thompson (2012) and Malti and Dys (2015), and is discussed further in our results. We used both of these visual outputs to gauge the scope of the current knowledge base on moral development in infancy, as well as to identify gaps in the field to generate recommendations for future study.

| Sample size and age range
There were 185 studies/experiments across 89 articles included in the scoping review. Details about the studies are presented in Table 1. The sample sizes ranged from 4 to 1,204 participants within a study, with a substantial skew towards sample sizes of 40 participants and under (see Figure 1 for a breakdown of sample sizes of all experiments, including across time points). The age range of participants in the studies included in the scoping review was 24 months and under. The youngest participants were 1 month old, and the oldest were 2 years old (see Figure 2 for the number of experiments that included participants 1-2 years old vs. those under 1 year old). There were a greater number of experiments that included participants 1-2 years old than experiments with younger infants under 1 year old.

| Type of study and main outcomes
The majority of the studies included in this scoping review relied on primarily experimental methods (n = 73), followed by primarily observational (n = 14), and questionnaire-based (n = 2). The child's own moral behaviour and the child's moral evaluations were overwhelmingly the predominant primary outcomes within studies (see Figure 3).
Moral behaviour. This approach for measuring moral development relied on infants' actual behaviour, and was captured through questionnaire (n = 2), observational (n = 10), and experimental (n = 28) methods. A seminal study by Warneken and Tomasello (2006, 2007) employed 6-10 scenarios (or paradigms) with toddlers aged 14, 18, and 24 months to test toddlers' tendency to behave in helpful ways to the experimenter and found that infants tended to help the experimenter with out-of-reach objects, but not with more complex tasks. Other studies have measured behaviours such as cooperation (Warneken & Tomasello, 2007), compliance with instructions (e.g., Dahl, 2016; Lickenbrock et al., 2013; Poehlmann et al., 2012), comforting another (e.g., Chiarella & Poulin-Dubois, 2018; Dunfield, Kuhlmeier, O'Connell, & Kelley, 2011), and sharing (e.g., Dunfield et al., 2011; Paulus et al., 2015; Ziv & Sommerville, 2017). Table 1 presents a complete list of the aspect of moral development that was the focus of study within each article. Figure 4 maps the moral development concepts and includes a frequency count for overlapping outcomes.
a For additional details, including time points that were outside of the 2 years and under (24 months and under) sample inclusion criteria, please consult the paper.
FIGURE 1 Sample size distribution (number of participants) across experiments listed in Table 1
FIGURE 2 Distribution of studies with infants 1-2 years and infants under 1 year, according to the type of method
FIGURE 3 Number of studies according to the type of method and main measurement outcome
There was considerable variance in the methodologies employed to study infants' moral behaviour. Some studies opted to use an experimental approach through the development of standardized paradigms to assess infants' responses during a controlled laboratory scenario, such as dropping an item and watching to see if the child would assist (e.g., Warneken & Tomasello, 2006). Other studies used an observational approach and observed how infants would react and respond to naturalistic events that emerged during the assessment, such as complying with parental requests or instructions (e.g., Dahl, 2016; Poehlmann et al., 2012), to measure moral behaviour within the natural context of daily life.
Moral evaluations. This approach for measuring moral development captured infants' evaluations of moral situations (broadly defined again to include prosocial or antisocial situations) and was entirely captured using experimental methods (n = 38). The predominant method evaluated infants' preferences for prosocial (vs. neutral or antisocial) characters by measuring indicators such as looking time and reaching for a prosocial character (over an antisocial character); this measurement approach was commonly used with adaptations. A seminal study in the field by Hamlin et al. (2007) found that 6- and 10-month-old infants were more likely to reach for a character who had behaved prosocially (helped another character), over a neutral or negative character (who hindered another character from reaching their goal). They also found that infants were more likely to look longer at an event in which the main character approached the negative character (who had hindered it previously), which, following a VoE hypothesis, likely indicates that the infants were surprised by the event. The authors use these results to argue for innate social evaluation preferences in infants (as a 'foundation to moral thought and action' p. 557).
Another common aspect of moral development measured using evaluative methods was infants' fairness expectations, specifically of the fair allocation of resources. Using looking time and reaching for characters, a methodology similar to the experimental paradigms employed for studying infants' preferences for prosocial versus antisocial agents, these studies found that infants tend to prefer agents who allocate resources similarly across group members (e.g., Burns & Sommerville, 2014; Geraci & Surian, 2011; Meristo et al., 2016; Meristo & Surian, 2014).
FIGURE 4 Concept map of specific aspects of moral development studied. Concepts are organized thematically, with main concepts in large bold text, and related concepts in close proximity
Infants' moral emotions. Two studies focused on infants' ability to identify emotions and 10 studies focused on infants' own emotions. The former looked at infants' abilities to detect maternal shifts in emotion (Hatzinikolaou & Murray, 2010) and at their ability to identify empathy (Lyubchik & Schlosser, 2010), using observational methods and experimental methods, respectively. Within studies that measured infants' own emotions, guilt (Barrett, 2005; Kochanska et al., 2002) and empathy (Campbell et al., 2015; Huang et al., 2017) were the two most common emotions studied, with two studies on each, all employing an experimental methodology to measure the infants' own emotions within a controlled setting.
Evidence mapping of moral development outcomes. We used an evidence-mapping methodology (Hetrick et al., 2010; Miake-Lye et al., 2016) to visually organize and synthesize the scope of moral development that was captured by the studies in this review, as well as to identify gaps (discussed in the next section). Figure 4 presents our evidence map of the specific measures within the scope of moral development that were studied. Main concepts are represented by large bold text and are organized thematically. The organization of themes was guided by theoretical discussions by Thompson (2012) and Malti and Dys (2015).
We conceptually organized first the centre, or core, as an underlying sense of morality, that is, the place from which moral decisions or behaviours flow. Concepts above the top line represent aspects of morality that are fundamental, but not explicitly visible per se. In other words, they are concepts that need to be operationalized to be measured. This includes topics such as the moral self (three articles in the review), the conscience (four articles in the review), and moral awareness.
Directly below 'underlying morality', we placed aspects of moral development that can be considered operationalizations of the core of morality and that are not necessarily 'good' nor 'bad', but rather operationalized concepts that stem from a moral core. This includes self-regulation (two articles in the review), social evaluations (20 articles across types of social evaluations), and moral emotions that stem from moral awareness, such as guilt (two articles), and emotional responses to jealousy (one article).
To the left and right (respectively), and below these concepts, we have placed prosocial components and antisocial components to visually depict our conceptualization of them stemming from a deeper underlying sense of morality and awareness, and to distinguish them from moral awareness. Prosocial components include behaviours such as compliance (six articles), helping (11 articles), and fairness expectations (13 articles). Antisocial components include behaviours under the umbrella of antisocial behaviour (e.g., externalizing behaviour and disruptive behaviour) and moral disengagement.
Longitudinal findings. There were 30 studies included in our review that reported a longitudinal component. Studies commonly used longitudinal methods to examine moral development as an outcome at a later testing point with early predictors such as parenting (Kochanska, Forman, Aksan, & Dunbar, 2005; Kochanska, Kim, Barry, & Philibert, 2011), attachment (Kok et al., 2013), and emotion regulation (Eiden, Lewis, Croff, & Young, 2002; Feldman, 2009; Morrell & Murray, 2003; Spinrad & Stifter, 2006). No study repeated the same task and measure longitudinally (or an adapted version that might avoid infant familiarity with the task) to track a specific aspect of moral development.
Eight longitudinal studies had a specific focus on behaviour problems (Eiden et al., 2002; Hecke et al., 2007; Hyde et al., 2010; Kochanska et al., 2008; Morrell & Murray, 2003; Poehlmann, 2012; Tan et al., 2018; Wagner et al., 2016). These studies help to shed light on whether early morality may be useful for predicting later problematic behaviours. For example, Tan et al. (2018) found that infants' early identification of prosocial characters was associated with fewer callous-unemotional traits in early childhood. Of these studies, six were experimental in focus, five were observational in focus, and one was questionnaire-based. Four studies were primarily multi-method, combining these methods to measure morality. Together, these findings suggest that there are direct (e.g., attachment and emotion regulation) and indirect (e.g., parenting) factors related to moral development that can be used longitudinally to predict moral, pro-, and antisocial behaviour, but that studies have not tracked a specific element of moral development over time, whether with the same task or adapted ones.

| Summary
The purpose of this scoping review was to survey the field for studies of moral development in infancy and to synthesize the types of methods used and the specific aspects of moral development studied. We found that the overwhelming majority of studies used an experimental methodology to collect information on moral development, and within this, infants' actual behaviour and their evaluations were the most common sources of information. An evidence map depicts concept delineation between studies and presents concepts as related to an underlying moral sense, as prosocial components (emotion and behaviour), and as antisocial components (emotion and behaviour). Just under one-third of the studies were longitudinal, and a high percentage of them reported a statistically significant longitudinal relation for moral development.

| Measures of moral development
Infants' own behaviour. We found that a common method of studying moral development was studying infants' own behaviour. Within these studies, the focus tended to lean towards the measurement of prosocial behaviours in infancy, for example, sharing with another individual and comforting another individual (Dunfield et al., 2011; Paulus et al., 2015; Ziv & Sommerville, 2017). The methods used for measuring moral behaviour ranged from unstructured observations of the infant's behaviour and questionnaires completed by parents about the infant's behaviour to structured experimental paradigms that test the infant's behavioural response to a specific scenario. Of note, one of the great challenges in measuring infants' moral behaviour is that many developmental skills are needed to truly assess moral behaviour. For example, regarding sharing specifically, researchers have found that young children (2-5 years old) are not able to share fairly, not because they do not want to share, but because their numerical abilities are still developing (i.e., they are not yet fully competent in the complex mathematics of equitable resource distribution; Chernyak, Harris, & Cordes, 2018). This highlights some of the methodological challenges inherent to studying moral behaviour with infants.
One positive finding to emerge regarding infants' moral behaviour is that it is highly feasible to measure moral behaviour, even in infancy. Studies included in this review measured an aspect of moral behaviour as early as the first 4 months of life (e.g., Eiden et al., 2002; Morrell & Murray, 2003), which opens the possibility of developing future measures to capture infants' early moral behaviour, for example, empathy, resource allocation, and compliance, in novel paradigms.
Infants' moral evaluations. The predominant method of measuring moral evaluations was to study infants' social evaluations of a helping agent versus a hindering agent (Hamlin et al., 2007), which, although not directly a moral question per se (Carpendale & Hammond, 2016), does stem from a moral awareness of right versus wrong social actions. Since that study, Holvoet, Scola, Arciszewski, and Picard (2016) identified an additional 27 experiments across 12 studies (publications) that employed this or a similar methodology, with mixed results (for a synopsis of the additional studies, see the systematic review by Holvoet et al., 2016). Specifically, Holvoet and colleagues report that, of the 27 experiments, 18 found a preference for a prosocial character. The inconsistent findings may highlight another underlying methodological challenge, which is that it is difficult to capture infants' evaluations given their developmental capacity in aspects such as attention span, emotional self-regulation in the face of uncomfortable distractions (e.g., fatigue and hunger), and pre-verbal ability.
Perhaps because of these inherent measurement difficulties, the series of findings from the helper/hinderer paradigm has been critiqued methodologically and conceptually. For example, there is debate over the interpretation of looking time and over whether results are driven by perceptual factors, such as whether the main climber's gaze is directed towards the target goal, rather than by social evaluation (Holvoet et al., 2016). Despite the inconsistent findings, a large number of studies have successfully (in terms of design, low attrition, completion rates, and theoretically supported interpretable findings) employed an evaluative method, which suggests that it is possible to measure infants' social evaluations, but that there is still room for improvement in the specifics of measurement methods. For example, it may be of interest for studies to employ alternative methods for measuring social evaluations that do not rely on elements such as looking time or vignette-based scenarios, in order to develop a broader evidence base on very early morality that can then be judged against the findings from the helper/hinderer paradigm.
Some studies in the field have integrated measures such as eye-tracking and pupil dilation as a measure of arousal (e.g., Cowell & Decety, 2015; Geraci & Surian, 2011; Gredeback et al., 2015; Kanakogi et al., 2017). Future studies may consider comparing results from a social evaluative task using the standard measure of looking time with results from neurocognitive measures (e.g., eye-tracking and functional near-infrared spectroscopy; fNIRS), as well as from observational methods such as providing doll figures of the vignette characters for the infant to play with and measuring the infant's behaviour. The considerable interest in measuring infant moral evaluations, gauged by the number of studies in this area, suggests that resolving these methodological challenges should be a key focus of future research. Future studies will be able to build on the ground-breaking work that has already begun in this area, towards a more in-depth understanding of how infants think about and evaluate moral situations.
Infants' moral emotions. Within studies that measured moral emotions, experimental methods for measuring infants' own emotions were most commonly employed. Across the 12 included studies that examined moral emotions (including studies that measured infants' own emotions and infants' ability to identify emotions), empathy was the most frequently studied moral emotion (three studies: Campbell et al., 2015; Huang et al., 2017; Lyubchik & Schlosser, 2010), followed by guilt (two studies: Barrett, 2005; Kochanska et al., 2002). Experimental methods were commonly used, but no specific paradigm or scenario predominated. The study of infants' moral emotions may be a promising avenue for future research on infants' morality given the relatively restricted range of emotions that have been studied, as well as the fact that moral emotions can be measured without the need for language skills, which makes them possible to study even in very young infants.

| Longitudinal measures
Within the scoping review, just under one-third of the studies included a longitudinal component. Given that moral behaviour during experimental paradigms differentiates between children with typical versus atypical levels of problem behaviours (e.g., Lavoie et al., 2018; Mugno et al., 2019), one of the secondary aims of our review was to explore the possibility that a measure of moral development may be able to act as an early flag for the potential onset of behaviour problems, early enough for preventative support and action. We found that it was possible to predict moral behaviour and awareness (e.g., conscience), including problem behaviours, using factors directly (e.g., attachment and emotion regulation) and indirectly (e.g., parenting) related to morality. The majority of such studies focused on measuring infants' behaviour (e.g., Eiden et al., 2002; Forman et al., 2004; Kochanska, Aksan, Knaack, & Rhines, 2004), but two studies did measure infants' evaluations (Sodian et al., 2016; Tan et al., 2018).
However, no study focused on using a measure of morality to predict morality at a different time point. This is a key direction for future studies that include a longitudinal component and that also focus on the onset and development of behaviour problems.

| Limitations and future directions
A scoping review is by nature very broad, and while it is a strength to have an overview of a particular field, there are also limitations to trying to gather all of the information. For example, given that aspects of moral development do not map directly onto the literal term 'moral', it is possible that some studies of relevance were not captured by our search strategy. However, we did find that studies reporting prosocial and antisocial behaviour components were returned in our search results, which does help to minimize the number of articles that may have been missed.
Future searches could focus on more specific aspects of moral development, such as virtues (e.g., courage and gratitude), to provide more in-depth analyses of infants' moral development as well as the factors that may predict moral development.
We also sought to gather and organize the information thematically, but in doing so, some of the more nuanced interrelations were not given much emphasis or focus. This scoping review would be well complemented by more directed searches, particularly on the longitudinal aspect of moral development and the information that this can provide to help improve behavioural outcomes for children over the long term.
Further, it is possible that 'morality' in infancy looks somewhat different from morality in other age groups given preverbal abilities and other developmentally contingent challenges. That is, infants at this age cannot yet tell researchers what they think morality or moral behaviour is; instead, assumptions are made based on observations of infants' behaviour in naturalistic contexts or in experimental paradigms that have to be carefully designed to exclude alternative explanations. These limitations need to be carefully considered in the design of future research methods and in the interpretation of findings.
Based on the results of the studies included in this scoping review, there are several promising areas for future research. One area that stands out as a key gap is a need for studies that measure both infants' evaluation of a moral subject and their behaviour. Given that an important aspect of moral development is both knowing and doing (I know what is moral, and my behaviour will either reflect or challenge that knowledge), there is a gap to be filled by infancy studies that consider both the evaluation of moral subjects and the doing of moral tasks. In our scoping review, one study included both an evaluation and a behavioural measure (Vaish, Carpenter, & Tomasello, 2009), and there is room in the literature to examine further whether evaluation and behaviour are concordant or discordant.
Another gap that stands out is topical. The field of moral development has a large scope, with many relevant topics to study and to trace from infancy through childhood. Roughly one-quarter of the studies included some component of emotionality and covered quite a broad spectrum within this, which limits the current state of understanding of infants' moral emotions as well as the precursors to such emotions. Further, the development of specific virtues such as courage and gratitude, and also of broader concepts such as justice, leaves ample room for future studies on the development of morality in infancy.
Additional research coverage and evidence in these areas will provide valuable information that will deepen understanding of how infants develop a moral sense and of how this development may serve as an indicator of early trajectories of behavioural problems before they become a concern.
In terms of methodological gaps, there is a great amount of space for the development of longitudinal studies to track moral development over time, beginning in infancy. Although roughly one-third of the studies included in this scoping review did include a longitudinal element, no studies tracked a specific aspect of moral development with the same (or an adapted) measure at multiple time points. This would be of considerable value for future studies to consider, as it would provide information about emerging morality as well as how morality develops and changes over time, including the types of factors that influence how morality changes from early infancy through childhood.

| CONCLUSIONS
The purpose of our scoping review was to gather information about methods for studying moral development and morality in infancy, and to synthesize and present this information to inform the design of future studies, particularly longitudinal studies, that measure morality in infancy. We found that measurement of infants' moral evaluations and behaviour is quite feasible, and that both of these aspects may also be measured longitudinally with the potential to act as an early flag for future problem behaviours, which is an area for further study.
TABLE 1 (Continued) Study characteristics