Impact of group structure and process on multidisciplinary evidence-based guideline development: an observational study.

RATIONALE, AIMS AND OBJECTIVES
This paper presents selected results from a study investigating the impact of small group processes on the development of clinical practice guidelines by multidisciplinary panels. Observations of one panel developing a guideline for primary care over several months are reported here.


METHODS
Non-participant observation with content analysis of transcripts aided by field notes.


RESULTS
Bales's interaction process analysis was used to categorize interactions in terms of their task-oriented or socioemotional qualities. This revealed a well-functioning, task-oriented group characterized by predominantly positive social behaviours. However, a breakdown of dialogue by speaker indicated a marked effect of professional role and status on the level of contribution to group discussions. This, and marked changes in panel composition across meetings, has implications for the multidisciplinarity of decision-making in such groups and hence for the acceptance and implementation of their outputs.


CONCLUSIONS
These findings are likely to generalize to other health care settings in view of the growing emphasis on multidisciplinary decision-making and the clear status hierarchies inherent within the medical and allied fields.


Introduction
The last decade has seen a proliferation of clinical practice guidelines, defined by the Institute of Medicine (IOM) as 'systematically developed statements to assist practitioner and patient decisions about appropriate health care for specific clinical circumstances' (Field & Lohr 1992, p. 39). In the UK, guidelines are assuming a growing importance with increasing moves towards clinical audit and regulation of professional behaviour, as embodied in the clinical governance agenda (Secretary of State for Health 1998). Despite the IOM definition, early guideline development processes were decidedly unsystematic and opinion-based, often resulting in invalid and unworkable recommendations that were unacceptable to practitioners. Methodological advances towards evidence-based development have undoubtedly improved the quality of the guidelines disseminated to health professionals (e.g. Cluzeau et al. 1999; Grimshaw et al. 1995). The methodological 'gold standard' for guideline development proposed by bodies such as the Scottish Intercollegiate Guidelines Network, the North of England Evidence-Based Guideline Group and the US Agency for Health Care Policy and Research incorporates a multidisciplinary process of literature search, critical appraisal and discussion of evidence and contextual issues for implementation. This aims to bring together the benefits of a systematic review, expert opinion and stakeholder participation, with the intention of producing scientifically valid, evidence-linked recommendations that are acceptable to users and feasible to implement. Nevertheless, guideline development in such groups is rarely as systematic and structured as descriptions of the method would imply. Indeed, whilst certain requirements are expected of such groups, the way in which they go about their task (e.g. how they divide subtasks, order the development process or determine the time to be devoted to each topic) is largely unstructured and will vary from group to group.
This may depend on the nature of the task (e.g. how wide in scope it is, for whom the guideline is designed, how much evidence is available), the leadership of the group (e.g. how strict the chairman is), its composition (e.g. available expertise) and the amount of supplementary support available (e.g. librarians). This flexibility increases the likelihood that factors other than scientific evidence and balanced contextual information will be brought to bear in the development process, namely psychosocial or 'small group' processes. The risk of such factors impacting upon the evidence-based development process is exacerbated by the multidisciplinarity of such groups since, in health care, the term multidisciplinary denotes multistatus and there is a wealth of evidence from psychology linking status with influence and control.
Several authors have noted this potential source of bias in evidence-based group decision-making. For example, as early as 1995, Grimshaw and colleagues wrote: 'the effectiveness of clinical guidelines depends at least as much on the quality of the consensus development as on the quality of the evidence base' (Grimshaw et al. 1995, p. 46). Readers of the British Medical Journal will no doubt recognize Isaacs & Fitzgerald's (1999) more tongue-in-cheek description of the alternatives to evidence-based decision-making, as shown in Box 1.

Box 1. Seven alternatives to evidence-based medicine (Isaacs & Fitzgerald 1999)
Eminence-based medicine: The more senior the colleague, the less importance he or she places on the need for anything as mundane as evidence. Experience, it seems, is worth any amount of evidence. These colleagues have a touching faith in clinical experience, which has been defined as 'making the same mistakes with increasing confidence over an impressive number of years'.
Vehemence-based medicine: The substitution of volume for evidence is an effective technique for browbeating your more timorous colleagues and for convincing relatives of your ability.
Eloquence-based medicine: The year-round suntan, carnation in the button hole, silk tie, Armani suit and tongue should all be equally smooth. Sartorial elegance and verbal eloquence are powerful substitutes for evidence.
Providence-based medicine: If the caring practitioner has no idea of what to do next, the decision may be best left in the hands of the Almighty. Too many clinicians, unfortunately, are unable to resist giving God a hand with the decision-making.
Diffidence-based medicine: Some doctors see a problem and look for an answer. Others merely see a problem. The diffident doctor may do nothing from a sense of despair. This, of course, may be better than doing something merely because it hurts the doctor's pride to do nothing.
Nervousness-based medicine: Fear of litigation is a powerful stimulus to over-investigation and over-treatment. In an atmosphere of litigation phobia, the only bad test is the test you didn't think of ordering.
Confidence-based medicine: This is restricted to surgeons.

Whilst such anecdotal accounts abound, there has been little systematic study of potential psychosocial confounders in this context. This is probably attributable to: (a) lack of knowledge amongst health services researchers of appropriate methods of investigation, and (b) the practicalities of including such an evaluation in what is already a complicated and time-pressured task. Some pertinent work has been conducted with structured consensus development (detection) groups, in which the process of developing a guideline (or similar product, such as a set of review criteria) is more tightly controlled and may involve anonymous rating or ranking of alternatives (Murphy et al. 1998). This has, for example, identified an influence of professional affiliation on judgements of the value of clinical interventions, such that those who perform the intervention rate it as more appropriate and necessary than those who do not (Leape et al. 1992; Coulter et al. 1995; Kahan et al. 1996). However, the major source of information about potential small group processes in guideline development is the literature from social and organizational psychology, although it should be acknowledged that this is derived largely from laboratory-based, experimental research. Whilst the embedded fields of social cognition and group decision-making are clearly relevant to guideline development (see, for example, Murphy et al. 1999; Pagliari et al. 2001), it is the literature on social influence and group dynamics that is of greatest relevance to the data reported here. The key messages for this study can be summarized as follows:
• groups are complex and dynamic 'organisms' whose composition, aims and social structure change over time (e.g. Lewin 1951; Tuckman 1984) and whose task performance is affected by the interaction style of their members (Bales 1950);
• the behaviour of group members (and the group as a whole) is affected by social influence arising from majority pressure (conformity, e.g. Asch 1951) or from particular individuals (compliance and obedience, e.g. Milgram 1965);
• degree of social influence is positively associated with the status of the individual or collective source (e.g. Driskell & Mullen 1990), and
• control of verbal interaction (amount of verbal input and turn-taking behaviour) is both a source and an indicator of status within groups (e.g. Ng & Bradac 1993).
As noted in the Abstract, this paper reports selected findings from a larger study, which aimed to investigate the influence of small group processes on guideline development using a range of qualitative and quantitative methods. The narrower aims here are to present a structured method for examining group dynamics in guideline development and to highlight the broad impact of status differentials on group decision-making. Our broad research questions can be summarized as follows:
1. What pattern of interpersonal interaction characterizes this group and is there evidence of aberrant behaviour that may be particularly detrimental to the group process or task achievement?
2. How does the composition of the group change over time and what implications may this have for successful guideline development?
3. To what extent does the professional role and status of group members influence their participation in the guideline development process? (How truly multidisciplinary is the decision-making process?)

Sample and context
The wider study relied on opportunistic sampling of multidisciplinary guideline development groups starting and ending their process within the project's funding boundaries. Data reported here represent one such group developing an evidence-based guideline for general practice, observed over four 3-hour meetings taking place during a 12-month period. Group composition changed to some extent between meetings: the core membership numbered 19, but 27 persons were present on at least one occasion during the development process. Table 1 provides a breakdown of participants, specifying their group membership (core or ancillary/observer), their broad professional category and, if different, their specific role within the group (e.g. chairman). For convenience, participants have been listed in clusters that reflect these roles. At the top of the list is the group chairman, followed by a high-status member of the guideline development organization who is also a consultant professor in a relevant area. Following these are five secondary care consultants with expertise in various pertinent areas. Next are two other 'special experts', the first being a professor of medicine who was previously involved in developing a guideline-type product on the same topic, and the second a professor of public health medicine involved in developing a similar product from the point of view of financial costs. Listed next are the three general practitioners (GPs) in the group. After these are a nurse, a dietician and a pharmacist, all of whom have expertise in various aspects of the guideline topic [categorized as professions allied to medicine (PAMs), for convenience]. Next are four generic experts or advisors: a health economist, a patient advocate, a clinical auditor and a librarian assigned to the group by the development organization. All 19 of these individuals can be considered part of the core guideline development group.
The remaining eight persons present at one or more meetings were editorial and secretarial staff, whose role was to provide specific instructions or information, and non-participant observers (the researcher and the chairman's assistants). The latter are included for completeness only.

Procedure and tools
Groups were observed from a short distance by the first author, who took anthropological field notes and manually recorded the order of speakers to aid later transcription. Dialogue was audio-recorded onto 4-hour VHS tape using a powerful but discreet boundary microphone placed at the centre of the meeting table. Audiotapes of meetings were transcribed with the aid of notes identifying speaker order. Transcripts were coded using the framework developed by Bales (1950) for categorizing interactions within groups, known as interaction process analysis (IPA, see Fig. 1). Coding was facilitated by reference to field notes. Codes were applied to each 'utterance', defined as a dialogue string emanating from one speaker before interruption by another speaker. Frequency analyses of the IPA codes were conducted to reveal predominant patterns of interaction within and across meetings and to highlight any aberrant interaction patterns (e.g. excessive antagonism). Representative sections of text were double-coded by the second author and an associate to assess inter-rater reliability. The guidelines for IPA coding are clear and consequently agreement between raters approached 100%.
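The utterance definition and the agreement check described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function names, the (speaker, text) input format and the example data are hypothetical, and simple percentage agreement is assumed because the text reports agreement as a percentage without naming a statistic.

```python
from itertools import groupby


def segment_utterances(turns):
    """Merge consecutive (speaker, text) fragments from the same speaker.

    This follows the definition in the text: an utterance is a dialogue
    string from one speaker before interruption by another speaker.
    """
    utterances = []
    for speaker, fragments in groupby(turns, key=lambda turn: turn[0]):
        text = " ".join(fragment for _, fragment in fragments)
        utterances.append((speaker, text))
    return utterances


def percent_agreement(codes_a, codes_b):
    """Percentage of utterances given the same IPA code by two raters."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both raters must code the same utterances")
    if not codes_a:
        raise ValueError("no coded utterances supplied")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical transcript fragments (illustrative only, not study data).
turns = [
    ("A", "Shall we start"),
    ("A", "with the evidence tables?"),
    ("J", "Yes."),
    ("A", "Good."),
]
utterances = segment_utterances(turns)

# Hypothetical IPA codes assigned by two raters to four utterances.
agreement = percent_agreement([6, 5, 3, 6], [6, 5, 3, 4])
```

A chance-corrected statistic such as Cohen's kappa could be substituted for percentage agreement where codes are very unevenly distributed.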
To examine the impact of status differentials, a percentage figure indicating the relative number of words spoken by each participant was computed to give an indication of their proportionate contribution to the overall group discussion (see Results section for details).

Interaction process analysis (research question 1)
The frequencies with which the 12 IPA codes appeared across the four meetings are shown in Fig. 2. The distribution of codes shows the pseudonormal pattern typically found in task-oriented groups (McGrath 1984). Codes relating to task performance (4-9) occur more frequently than those relating to socio-emotional behaviours (1-3 and 10-12). Interactions concerned with giving information (codes 4-6) are more frequent than those aimed at eliciting information (codes 7-9), as would be expected where task solution involves information exchange and interpretation and where roles during meetings largely involve reporting of information gathered by individuals between meetings. Positive socio-emotional interactions (codes 1-3) are more common than negative ones (codes 10-12), suggesting a healthy level of interpersonal interaction. All individual meetings followed the same general patterns. The most frequently used individual code relates to giving information, clarification or confirmation (code 6), while opinion-giving (code 5) is the next most frequent. The relatively high frequency of the latter code is largely attributable to the fact that it covers evaluation and analysis, critical aspects of evidence-based decision-making. Nevertheless, these results also suggest a greater focus on scientific evidence than on personal opinion. Suggestions are given (code 4) less frequently than they are asked for (code 9). Of the positive socio-emotional codes, the most frequently used is code 2, which covers joking and expressions of satisfaction, followed by code 3, which deals with agreement. There is relatively little disagreement (code 10) or tension and antagonism (codes 11-12).
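The frequency analysis described above amounts to tallying codes individually and by the four code ranges named in the text. The following Python sketch shows one way this might be done; the category labels, function names and example codes are hypothetical and are not taken from the study data.

```python
from collections import Counter

# Groupings of the 12 IPA codes as described in the text.
# The label strings themselves are illustrative.
CATEGORIES = {
    "positive socio-emotional": range(1, 4),    # codes 1-3
    "task: giving": range(4, 7),                # codes 4-6
    "task: asking": range(7, 10),               # codes 7-9
    "negative socio-emotional": range(10, 13),  # codes 10-12
}


def category_of(code):
    """Return the category label for a single IPA code (1-12)."""
    for name, codes in CATEGORIES.items():
        if code in codes:
            return name
    raise ValueError(f"not an IPA code: {code}")


def summarize(coded_utterances):
    """Count utterances per individual code and per category."""
    code_counts = Counter(coded_utterances)
    category_counts = Counter(category_of(c) for c in coded_utterances)
    return code_counts, category_counts


# Hypothetical codes for ten utterances (not study data).
example_codes = [6, 6, 5, 3, 7, 6, 2, 5, 10, 9]
code_counts, category_counts = summarize(example_codes)
```

Running the same tally separately for each meeting would allow the within-meeting patterns to be compared with the overall distribution.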

Changes in group composition over time (research question 2)
As noted previously, 27 individuals attended at least one of the four meetings examined here. Of these, 19 were official members of the core guideline development group and the rest ancillary support staff or observers. Table 2 shows the pattern of attendance amongst core group members. Evidently, group composition changed somewhat across meetings, as new members joined the group or existing members failed to attend. Only four members of the core group attended all four meetings: the chairman, a consultant and two GPs. Most core members attended at least two of the four meetings, although the patient representative, the economist and one specialist were only present on a single occasion.
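An attendance log of this kind can be summarized with a few lines of Python. The sketch below is illustrative only: the member labels and attendance sets are hypothetical and do not reproduce Table 2.

```python
def attendance_summary(attendance, n_meetings):
    """Identify members present at every meeting and at only one.

    attendance maps each member to the set of meeting numbers attended.
    """
    counts = {member: len(meetings) for member, meetings in attendance.items()}
    always = sorted(m for m, c in counts.items() if c == n_meetings)
    once = sorted(m for m, c in counts.items() if c == 1)
    return always, once


# Hypothetical attendance log for a four-meeting process (not study data).
attendance = {
    "chair": {1, 2, 3, 4},
    "consultant_1": {1, 2, 3, 4},
    "gp_1": {1, 2, 3, 4},
    "nurse": {2, 3},
    "economist": {3},
}
always_present, present_once = attendance_summary(attendance, n_meetings=4)
```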

Hierarchical contributions in multidisciplinary discussions (research question 3)
The impact of interprofessional differences and status hierarchies on group discussions can be examined by calculating the number of words contributed by each speaker across the guideline development process. (The number of utterances was also calculated to examine the nature of each speaker's interaction, e.g. a few long speeches versus frequent minor contributions. The pattern of results was very similar to that for word count and for this reason, these data are not reported here.) Since not all members were present at every meeting, total word counts provide a distorted picture of the overall relative contributions of individual group members. For this reason, a percentage score was calculated for each speaker by dividing their total word count by the total number of words spoken in the meetings attended by that individual. These estimates of general tendency to contribute to discussions are shown in graphical form in Fig. 3. Only the contributions of core group members are considered in Fig. 3, since the other participants did not contribute to the substance of the guideline or the process of evidence-based decision-making. Figure 3 reveals a clear relationship between professional role or status and level of contribution to group discussions. The chairman (A) has by far the largest contribution, as would be expected from his explicit role as facilitator. The secondary care consultants (B, C, D, E, F, G) form a clear group of very active contributors, as do the two experts with previous experience of developing guideline-type products on the same topic (H, I). Indeed, participant H has the second-highest word count of all, reflecting both his advisory role in the second meeting and his dominant personality. The three GPs (J, K, L) are much less active and the nurse and PAMs (M, N, O) less active still. The generic experts offer a higher level of contribution than either the GPs or the nurse/PAMs, reaching levels similar to those of the consultants.
Most reluctant to contribute were the nurse and the pharmacist.
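The percentage score defined above (a speaker's total word count divided by the total words spoken in the meetings that speaker attended) can be sketched in Python as follows. The data structure, function name and word counts are hypothetical. The sketch assumes that a member who attended a meeting but said nothing is recorded with a count of 0, and that absence is represented by having no entry for that meeting.

```python
def contribution_percentages(word_counts):
    """Percentage contribution per speaker, adjusted for attendance.

    word_counts maps meeting -> {speaker: words spoken}. A speaker who
    was absent from a meeting has no entry for that meeting.
    """
    meeting_totals = {m: sum(c.values()) for m, c in word_counts.items()}
    speakers = {s for counts in word_counts.values() for s in counts}
    percentages = {}
    for speaker in speakers:
        attended = [m for m, counts in word_counts.items() if speaker in counts]
        own_words = sum(word_counts[m][speaker] for m in attended)
        all_words = sum(meeting_totals[m] for m in attended)
        percentages[speaker] = 100.0 * own_words / all_words if all_words else 0.0
    return percentages


# Hypothetical word counts for two meetings (not study data).
word_counts = {
    1: {"A": 500, "B": 300, "M": 200},
    2: {"A": 600, "B": 400},
}
percentages = contribution_percentages(word_counts)
```

In this example speaker M attended only the first meeting, so the 200 words are compared with the 1000 words spoken in that meeting rather than with the 2000 spoken overall.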

Discussion
Interaction process analysis demonstrated that the group was clearly task-orientated and meetings were mainly concerned with the exchange of information or opinion. Closer inspection of the data revealed this to be primarily explained by discussions of evidence, showing that the group was appropriately focused. The group was also characterized by generally positive interpersonal relations, with little antagonism or conflict evident. These data were corroborated by the researcher's observations, which indicated that where conflict arose it was managed effectively and did not escalate or transfer into subsequent meetings. Not evident from the IPA analysis is the role of the chairman in facilitating the smooth group process. This chairman had considerable expertise in leading small groups and was both task-orientated and attentive to individual participants. Although the interactions of this group did appear to be appropriate, it should be borne in mind by guideline developers that differences in group management strategies (e.g. too laissez-faire or too dictatorial) and members' personalities (e.g. combativeness) may affect the shape of the curve shown in Fig. 2.
Logs of attendance revealed differences in the group composition over time, with only four members of the core group being present on every occasion. This is to some extent inevitable, since the first meeting of such a group may involve decisions about appropriate representation (four participants joined after meeting one). Absences of existing members nevertheless accounted for much of the change. Such differences raise issues about the potential multidisciplinarity of decision-making within particular meetings and across the development process as a whole. As noted in the introduction, studies of structured consensus development groups indicate a marked influence of professional affiliation on scientific decision-making (e.g. Leape et al. 1992;Coulter et al. 1995;Kahan et al. 1996). The balance of disciplines represented within a decision-making event may therefore be reflected in the outcomes of the group, although the process of evidence-based guideline development (which involves explicitly grading each recommendation according to the strength of the evidence supporting it) should, to some extent, ameliorate this. Just as importantly, changes in panel composition may have hindered the development process. For example, observations indicated that the absence of key members in certain meetings meant that particular agenda items could not be fully explored as planned, limiting the ability of the group to move on to a different topic. Likewise, where topics were discussed in the absence of key stakeholders it often became necessary for the group to revisit the topic at a later meeting, adding to the already considerable time needed to develop the draft guideline. Maximizing the stability of the group composition on subsequent occasions will undoubtedly help to improve the flow of meetings and the progress of guideline development.
The clear relationship between status and contribution to discussions and decision-making, shown in Fig. 3, substantiates anecdotal accounts of the impact of medical hierarchies and is consistent with studies of other decision-making groups (e.g. Kirchler & Davis 1986; Vinokur et al. 1985). The marked difference in contribution between consultants and experts versus GPs and nurse/PAMs is in keeping with what has previously been observed in psychological studies on social influence. Not only are the former two categories of participant of higher status than the latter two, but there are more of them (eight versus six). Both of these factors are associated with social influence effects (e.g. Asch 1955). It is also interesting to note that the three least active contributors are female. (The group contained only two other female participants, one of these being a GP and the other a patient representative, whose professional role involves assertive advocacy of consumer issues.) Whilst it is not possible to draw clear conclusions from this observation, it is in line with findings in the literature suggesting an association between gender and status in social interaction (e.g. Eagly 1983; Eagly & Chrvala 1986).
A factor that is likely to have exacerbated the influence of status differentials on participants' willingness to contribute to group discussions is the top-heavy composition of the group. It is well established within the psychology of social influence that as the number and status of the majority increase, individuals become more reluctant to voice dissenting views (e.g. Asch 1955). Peer support can reduce this effect, even if the peer does not have exactly the same view (e.g. Allen & Levine 1971).
Unfortunately, in a top-heavy multidisciplinary group, those members with the lowest status, and hence the greatest need for peer support, are the least likely to be accompanied by someone else from the same profession. In this group, for example, there was only one nurse on the panel. Given the need to restrict numbers in guideline development panels, and the need for high-level expertise, it may not be possible to provide every member with a peer from the same profession; however, it may be sufficient to match individuals at the lower end with someone from the same level of the status hierarchy (e.g. as in this case, a senior nurse and a dietician).

Summary and implications
Interaction process analysis has been used successfully to examine the broad content of group discussions in this multidisciplinary guideline development panel. This group functioned well in terms of both task content and process. Such a method may be used in future evaluations of medical decision-making panels to provide a more objective record of group processes than is possible from informal observations. However, it should be acknowledged that the data generated are limited in scope and a more detailed content analysis based on qualitatively derived codes may be necessary to fully capture the individual character of such a group.
Attendance problems resulted in changing group composition across meetings. This has implications both for the multidisciplinarity of decisions and for the efficiency of the guideline development process. Group composition should ideally be kept constant throughout the development process to ensure that all stakeholder groups have appropriately considered the recommendations and to minimize the need for repetition, and hence the time needed to complete the development process.
Analysis of individual members' contributions to group discussions revealed a strong association between status and participation. Status hierarchies are an inevitable part of medical culture. Although no attempt has been made here to examine the content of the guideline developed by this panel, it is apparent that such status hierarchies may affect the multidisciplinarity of decision-making in interprofessional groups. This may be exacerbated by unbalanced or top-heavy group composition. Those charged with composing such groups should make efforts to balance the group and ensure peer support for all participants. In guideline development, it may be useful to ask group members to evaluate each recommendation independently prior to the launch of the guideline (as in the Delphi or nominal group methods), since this may highlight areas of interprofessional disagreement that may not have been evident during group discussions. In composing a panel, efforts should be made to select a chairperson with knowledge of the psychology of small groups and expertise in their leadership.