Development action with informed and engaged societies
As of March 15, 2025, The Communication Initiative (The CI) platform is operating at a reduced level, with no new content being posted to the global website and registration/login functions disabled. (La Iniciativa de Comunicación, or CILA, will keep running.) While many interactive functions are no longer available, The CI platform remains open for public use, with all content accessible and searchable until the end of 2025.

Please note that some links within our knowledge summaries may be broken due to changes in external websites. The denial of access to the USAID website has, for instance, left many links broken. We can only hope that these valuable resources will be made available again soon. In the meantime, our summaries may still help by capturing key insights from those resources.

A heartfelt thank you to our network for your support and the invaluable work you do.

The Drum Beat 302 - Evaluating Health Communication Programmes

Issue #302

***

This Drum Beat is one of a series of commentary and analysis pieces. Dr. Jane Bertrand, [who was, at the time of this publication] Director of the Johns Hopkins Bloomberg School of Public Health Center for Communication Programs (JHUCCP), outlines evaluation methods that are useful in assessing programme design, implementation, and effects, as well as some of the challenges faced by evaluators of health communication programmes. This commentary draws on Dr. Bertrand's extensive experience in research and evaluation related to reproductive health, as well as the work of her colleagues at JHUCCP. What follows is Dr. Bertrand's perspective - NOT that of the Partners collectively or individually.

We are interested in featuring a range of critical analysis commentaries on the communication for change field. These will appear regularly on the first Monday of each month and are meant to inspire dialogue throughout the month. We cannot guarantee to feature your commentary, as we have a limited number of issues to publish each year, but if you wish to contribute, please contact Deborah Heimann at dheimann@comminit.com. Many thanks!

***

Evaluating Health Communication Programmes

Communication is an integral element of public health interventions designed to influence social norms or change the behaviour of individuals, families, and communities. Communication operates at multiple levels: in the political arena, in health service delivery systems, and within communities and families. Some communication programmes take the form of discrete activities within a short period of time (e.g., a public service advertising campaign or a series of community events). Others consist of integrated, coordinated activities over extended periods of time (e.g., a national campaign for HIV/AIDS stigma reduction, infection prevention, and palliative care using a combination of mass media, community mobilisation, and interpersonal communication and counselling in health service facilities).

Many take for granted the effectiveness of communication for achieving results. Presidential candidates in the United States spare no expense to get their message to the voting public via mass media and personal contact. Coca-Cola does not stop advertising, even though the product has nearly universal recognition around the world. At the other end of the spectrum are the skeptics who question the value of communication programmes. They would argue, "If communication were effective, why would HIV/AIDS continue to spread?"

How can we determine whether a given communication programme "makes a difference?" The field of programme evaluation is steadily evolving to answer fundamental questions regarding the effectiveness of communication programmes, as well as related issues of programme design and implementation. The purpose of this commentary is to suggest one approach to understanding the different types of evaluation used in connection with behaviour change communication (BCC) programmes, especially in the international context.

There are three primary types of evaluation that span the life of a programme: formative, process, and summative evaluation (Rossi, Lipsey, and Freeman, 2004). Small non-governmental organisations (NGOs) with limited resources may opt to perform only one of these types of evaluation, whereas a major communication programme with national scope would be remiss to exclude any of them.

Formative evaluation refers to the activities undertaken to furnish information that will guide programme design. This information helps programme planners to determine who is most affected by the problem; identify the needs of specific subgroups; ascertain existing knowledge, beliefs, and attitudes; determine levels of access to services, information, social support, and other resources; understand barriers to action; and determine communication habits and preferences. The sources of these data are multiple: existing epidemiologic and demographic reports, secondary analyses of existing data, primary data collection among the intended audience (often in the form of a baseline that subsequently will serve for evaluation purposes as well), media ratings data, service statistics, and other programme records. In addition, formative research generally includes qualitative research that taps into the opinions, aspirations, fears, beliefs, and other key psychological factors that influence a given health behaviour. Common qualitative research techniques include focus groups, in-depth interviews, direct observation, and a wide variety of guided or self-directed participatory methods.

Programme planners use this formative information in designing a communication strategy for the given project or intervention. Formative evaluation answers questions related to the objectives to be achieved, the intended audience, the potentially most effective channels, messages that draw on positive sources of motivation and address barriers to change, and other key elements.

Process evaluation involves tracking programme implementation once the programme is launched. It determines whether the programme is delivered as intended to the intended recipients (Rossi, Lipsey, and Freeman, 2004). In its simplest form, process evaluation - also known as implementation assessment - monitors the activities conducted in relation to the proposed scope of work and timetable. It answers the question: to what extent is the project implemented according to plan? This question is important to managers, evaluators, and beneficiaries alike. If the project is not on schedule in implementing the planned activities, the programme manager will want to know this in a timely manner and take action to redress the delays and other shortcomings. Moreover, programme personnel want to understand the programme dynamics so that they can replicate successful components and eliminate ineffective elements in future efforts.

For the evaluator, process evaluation serves two important functions. First, it provides the necessary information to satisfy stakeholders (including donors and beneficiaries) that the project is on track. Second, it provides important documentation of what activities actually took place, as well as other concurrent events that might affect the outcome. This becomes important in relation to summative evaluation. For example, if a given project fails to achieve its objective (that is, the expected change does not occur), it is important to determine whether this results from implementation failure (the project was not implemented satisfactorily) or programme theory failure (the project was implemented as planned but failed to bring about the expected change because the strategy or materials were conceptually flawed, or because the programme failed to take into account important explanatory factors or adjust to historical events) (Rossi, Lipsey, and Freeman, 2004).

In some applications, process evaluation may also measure quality of service, access, and reach. Many health interventions have some aspect of service delivery, and programmes often seek to improve the quality of services in an effort to attract and retain more clients. Thus, studies of quality of care are one type of process evaluation. Similarly, projects often have the objective of increasing access to services by expanding the number of sites, training additional personnel, recruiting a new type of personnel (e.g., bilingual staff), and so forth. Finally, process evaluation of communication programmes may also include a measure of the reach of the communication, that is, the percentage of the population exposed to or able to access specific elements of the programme.
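
To make the reach measure concrete, the short sketch below shows one way it might be computed from endline survey data. It is only an illustration under stated assumptions: the file name and the "exposed" and "weight" columns are hypothetical, not drawn from any programme described here.

```python
# Minimal sketch of computing programme reach from survey data.
# Assumes a hypothetical CSV with one row per respondent, a binary
# "exposed" column (1 = recalls or accessed the campaign) and a
# "weight" column of survey sampling weights.
import pandas as pd

survey = pd.read_csv("endline_survey.csv")

# Unweighted reach: share of respondents exposed to the programme.
unweighted_reach = survey["exposed"].mean()

# Weighted reach: adjusts for the sampling design.
weighted_reach = (survey["exposed"] * survey["weight"]).sum() / survey["weight"].sum()

print(f"Reach (unweighted): {unweighted_reach:.1%}")
print(f"Reach (weighted):   {weighted_reach:.1%}")
```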

In short, process evaluation tracks how the project is implemented, including changes in how services are delivered (e.g., quality of care). However, it stops short of answering the question: did the desired behaviour change occur among members of the intended audience?

Process evaluation is often underutilised. The primary reason is that programme directors, evaluators, and donor agency staff are generally more interested in measuring what the programme achieves (summative evaluation - see below) than what activities the programme carries out (process evaluation). Indeed, one often hears frustration over programmes that simply evaluate "outputs" (e.g., the number of posters printed, persons trained, or community events held) but do not take the next step of assessing actual behaviour change. However, neglecting process evaluation is shortsighted, because process evaluation helps to explain why change did or did not occur.

Summative evaluation measures the extent to which change occurs, consistent with the objectives of the programme. For health communication programmes, the primary objective is usually a health-related behaviour. Illustrative outcomes in the developing country context include the following:

  • Family planning: contraceptive use
  • HIV/AIDS prevention: abstinence, monogamy, condom use
  • Malaria: use of bed nets
  • Safe motherhood: delivery with a skilled birth attendant
  • Child survival: exclusive breastfeeding to 6 months, use of ORS for diarrhoea

The explicit or implicit theories underlying most BCC programmes point to psychosocial factors that lead to behaviour change. These often figure as objectives as well: to increase knowledge, influence social norms, change attitudes, or improve self-efficacy to overcome barriers. Thus, summative evaluation may assess changes in these initial outcomes as well as the behaviour outcome of interest. 

Summative evaluation addresses the question: did the programme make a difference? That is, did it have an impact? The strength of evidence to answer these questions varies depending on the study design and/or statistical techniques used. Not surprisingly, the methods needed to produce the strongest evidence are often the most costly and/or require knowledge of advanced analytic techniques, making it difficult for small projects with limited budgets to use the most rigorous evaluation methods. The levels of evidence, from strongest to weakest, are as follows:

  1. Change occurred, which can be attributed to the intervention/programme. The evidence at this level demonstrates cause-and-effect; that is, the intervention produced the desired results. Randomised controlled trials (RCTs - i.e., experimental designs) are often touted as the only way to definitively establish causality, but RCTs are not an appropriate choice for evaluating large-scale programmes with a mass media component. Even if evaluators wanted to, they could not randomly assign members of the general population to be exposed (experimental group) or not exposed (control group) to a full-coverage programme that potentially reaches everyone in the population. In fact, it is rarely possible to find a group of unexposed individuals who are comparable to those exposed in terms of socio-economic status (SES) and access to media.

    In isolated cases, it is possible to use experimental designs to evaluate small-scale pilot projects, including clinic-based interventions (i.e., where it is possible to control exposure to the communication intervention). Even then, local authorities often object to withholding potentially life-saving information or treatment from selected segments of the population on ethical or political grounds.

    Another limitation of RCTs - not unique to those involving communication programmes - is that the results may have low external validity (i.e., they may not be generalisable to the larger population). For these reasons, evaluations of full-scale communication programmes rarely use experimental designs.
  2. Change occurred, which is associated with exposure to the intervention. Most evaluations of communication programmes in a developing country context take one of two forms: (a) quasi-experimental or (b) posttest-only cross-sectional designs. Both use advanced statistical techniques (e.g., propensity scores, structural equation modelling) to control for common threats to validity. Such studies provide plausible evidence that the communication programme produced specific changes in the desired outcome(s) (Victora et al., 2004; Habicht et al., 1999).

    The three most widely utilised quasi-experimental study designs are i) the pretest-posttest separate sample design, ii) the pretest-posttest non-equivalent control group design, and iii) time series (Fisher and Foreit, 2002). Those working in the field of programme evaluation rely on these study designs because they are the most viable means of evaluating full-coverage, field-based programmes. Two advantages of such studies are (1) high external validity (for example, one can address the question of the effects of a national campaign on selected outcomes) and (2) feasibility. The disadvantages relate to the inability to answer with certainty the question: "what would have happened in the absence of this programme?" (Although the use of propensity score analysis largely overcomes this problem.) Moreover, these designs require significant financial and analytic resources.

    Evaluators are making increasing use of analyses of posttest-only cross-sectional data using statistical techniques that link changes in the desired outcome to levels of exposure to the communication programme. One approach of this type is dose-response analysis, which establishes the level of exposure of respondents to the communication programme and then uses this variable in a regression analysis along with SES factors and access to media. It allows the evaluator to test whether those with higher levels of exposure are more likely to report desired behaviours, controlling for SES. A more statistically advanced method of establishing this type of association involves propensity scores. Using posttest-only survey data, the evaluator uses regression analysis to establish the likelihood that a given respondent was exposed to the communication programme in question, based on available data on SES and access to media. From this, the evaluator is able to create a statistically equivalent control group that matches the characteristics of respondents actually exposed to the programme on all measured factors except exposure to the communication. By assessing the differential levels of change among those exposed and the statistically created control group, the evaluator can estimate the effects of the communication programme on the outcome behaviour and address the question of what would have happened in the absence of the communication programme (Kincaid and Do, forthcoming 2006). (A simplified sketch of this matching logic appears after this list.)

    The advantages of the posttest-only methods are that they provide very strong evidence of communication effects while relying on feasible study designs. The disadvantages are twofold. First, certain factors that may affect the health outcomes are not readily measurable in surveys; one commonly cited example is motivation. Second, in cross-sectional studies the evaluation cannot rule out reverse causation: that the respondent was attentive to the communication programme and recalled its contents in a subsequent survey precisely because he/she was already practising the behaviour (e.g., using modern contraception). Although tests for endogeneity provide a partial solution to this problem, they do not satisfy some critics.
  3. Change occurs in the desired outcomes following the intervention. Those unfamiliar with study designs and threats to validity (including many policy makers) may be satisfied with evidence that the programme was carried out and that the desired change occurred subsequently. This type of trend data constitutes "adequate evidence" of change, to use the language of Habicht et al. (1999). Programmes that have limited budgets or few personnel trained in evaluation may find it useful to track trends in service statistics or survey data instead of undertaking a full evaluation. The evidence from these trends is preferable to none at all, but it does not stand up to scientific scrutiny because it is not possible to rule out other factors that could also explain the change (e.g., other programmes with similar messages, secular trends).
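
To illustrate the matching logic described in point 2 above, the short sketch below estimates a programme effect from posttest-only survey data. It is a minimal illustration under stated assumptions, not the analysis used in any of the studies cited: the file name, the "exposed" and "behaviour" columns, the SES and media-access covariates, and the choice of 1-to-1 nearest-neighbour matching are all hypothetical.

```python
# Minimal sketch of a posttest-only, propensity-score analysis.
# Assumes a hypothetical cross-sectional survey with a binary "exposed"
# column (recall of the campaign), a binary "behaviour" outcome (e.g.,
# current contraceptive use), and SES / media-access covariates.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

survey = pd.read_csv("endline_survey.csv")
covariates = ["education", "wealth_index", "urban", "owns_radio", "owns_tv"]

# Step 1: model the probability of exposure from SES and media access.
ps_model = LogisticRegression(max_iter=1000)
ps_model.fit(survey[covariates], survey["exposed"])
survey["pscore"] = ps_model.predict_proba(survey[covariates])[:, 1]

# Step 2: for each exposed respondent, find the unexposed respondent with
# the closest propensity score (1-to-1 nearest-neighbour matching),
# creating a statistically constructed control group.
exposed = survey[survey["exposed"] == 1]
unexposed = survey[survey["exposed"] == 0]
matcher = NearestNeighbors(n_neighbors=1).fit(unexposed[["pscore"]])
_, idx = matcher.kneighbors(exposed[["pscore"]])
matched_controls = unexposed.iloc[idx.ravel()]

# Step 3: the difference in the behaviour between the exposed group and its
# matched control group estimates the effect of exposure on the outcome.
effect = exposed["behaviour"].mean() - matched_controls["behaviour"].mean()
print(f"Estimated effect of exposure on the behaviour: {effect:.3f}")
```

In practice an evaluator would also check covariate balance after matching and use standard errors appropriate to matched samples; a dose-response analysis follows the same spirit but regresses the behaviour on a graded exposure measure plus the SES controls.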


One challenge of programme evaluation of any kind is the tradeoff between methodological rigour and ownership of the process by those responsible for the programme. For example, conducting impact analysis using an experimental design or multivariate analysis requires technical skills rarely available among the staff responsible for designing and implementing the programme. These evaluations often need to be carried out by external individuals or organisations. However, evaluations conducted by external evaluators may leave local staff distanced from the evaluation process. In response, many organisations have fostered participatory evaluation approaches, which create a far greater sense of ownership among key stakeholders. However, participatory evaluation may not meet the methodological rigour of the scientific community to measure effectiveness. All of the approaches above and others along this continuum are valuable in different contexts; practitioners and donors need to determine in advance what best fills their needs for evaluation.

Terminology in Programme Evaluation 

The word "impact" carries different meanings for different people.

Evaluation purists object to the use of the word "impact" in situations where there is no compelling evidence of change or that the intervention caused that change. In the most egregious cases, one will hear claims that "that programme really had impact," based on the perception that the programme was well-liked or that it reached a large audience, especially if a celebrity was associated with it. In the absence of evidence, this is mere speculation. Evaluation purists also object to claims of impact when the desired change may have occurred but the study design cannot rule out other factors that may have contributed to that change.

Three widely used but distinct meanings of the term "impact" are as follows:

  1. "impact" referring to the long-term outcome or the ultimate objective of the programme, such as a reduction in fertility, morbidity, or mortality. For example, some might consider the impact of a successful family planning programme to be a reduction in fertility;
  2. "impact" referring to the "immediate, short-term, or intermediate effects of a health promotion programme; variables that are available within the time frame of most evaluations, which in the absence of data on expected final outcomes, provide some evidence of progress"; the term is used this way in the context of the PRECEDE framework (Green and Kreuter, 1999). Note: in the PRECEDE framework, impact evaluation occurs before outcome evaluation. This is just the reverse of its usage directly above, where impact is considered to be the ultimate objective of the programme;
  3. "impact" referring to actual cause-and-effect, based on an assessment using a methodologically sound study design (e.g., experimental or quasi-experimental design) and analytic technique. Purists will require probabilistic evidence, whereas practitioners may be satisfied with plausible evidence of causal effects.


The overlap of terminology for different types of evaluation

While the three main types of programme evaluation described above (formative, process, summative) are widely used in evaluation circles, this is not to the exclusion of other terms. Many readers may be wondering: what about "inputs-process-outputs-outcomes?" What exactly does "monitoring and evaluation" (M&E) cover? And how does M&E differ from programme evaluation? 

It is safe to say that experts differ on how these terms relate to one another. In connection with this commentary, I have developed a chart that captures these various terms and attempts to indicate what each measures, available on The Communication Initiative website (please click here for the chart [PDF]). Definitions of all the terms contained in the figures, some but not all of which are described earlier in this commentary, accompany the chart in an Appendix (please click here for the Appendix).

In conclusion, programme evaluation has a certain logic, which this commentary attempts to explain. It follows a progression across the life of the project: from formative to process to summative evaluation. Many stakeholders want to know whether a given programme "has made a difference" (i.e., has had impact). Depending on the study designs that are feasible under the circumstances and the level of financial and human resources available, evaluators can produce evidence with differing levels of rigour. Most evaluators would argue in favour of the most rigorous evaluation possible given available funds. Whatever the final decision, it is useful for all stakeholders to understand the distinction between these levels of evidence, as well as the terminology used in programme evaluation.

Jane T. Bertrand, PhD, MBA
Chair, Dept. of Health Systems Management
Tulane University School of Public Health and Tropical Medicine
bertrand@tulane.edu

***

References

  • Fisher, A. and J. Foreit (2002). Designing HIV/AIDS Intervention Studies: An Operations Research Handbook. Population Council.
  • Green, L.W. and M.W. Kreuter (1999). Health Promotion Planning: An Educational and Ecological Approach. 3rd edition. Boston: McGraw Hill.
  • Habicht, J.P., C.G. Victora, and J.P. Vaughan (1999). Evaluation designs for adequacy, plausibility and probability of public health programme performance and impact. International Journal of Epidemiology 28(1): 10-18.
  • Kincaid, D.L. and M.P. Do (forthcoming, 2006). Impact of an Entertainment-Education Television Drama on Health Knowledge and Behavior in Bangladesh: An Application of Propensity Score Matching. Journal of Health Communication.
  • Rossi, P.H., M.W. Lipsey, and H.E. Freeman (2004). Evaluation: A Systematic Approach. 7th edition. Thousand Oaks, CA: Sage Publications.
  • Victora, C.G., J.P. Habicht, and J. Bryce (2004). Evidence-Based Public Health: Moving Beyond Randomized Trials. American Journal of Public Health 94(3): 400-405.


***

Please participate in a Pulse Poll related to this commentary

Jane Bertrand [formerly] of JHUCCP argues that "participatory evaluation may not meet the methodological rigour of the scientific community to measure effectiveness."

Do you agree or disagree?

***

RESULTS of past Pulse Poll

Rather than getting back into the trenches of the 1990s, the challenge today - in times of a strong treatment agenda - is to reassess our communication science, models, and practice when putting HIV/AIDS prevention on the agenda.

Agree: 94.12%
Disagree: 0.00%
Unsure: 5.88%
Total number of participants = 17

***

This issue of The Drum Beat is meant to inspire dialogue and conversation among the Drum Beat network. 

To read discussion contributions please click here.

***

This issue of The Drum Beat is an opinion piece and has been written and signed by the individual writer. The views expressed herein are the perspective of the writer and are not necessarily reflective of the views or opinions of The Communication Initiative or any of The Communication Initiative Partners. 

***

The Drum Beat seeks to cover the full range of communication for development activities. Inclusion of an item does not imply endorsement or support by The Partners.

Please send material for The Drum Beat to the Editor - Deborah Heimann dheimann@comminit.com

To reproduce any portion of The Drum Beat, see our policy.

To subscribe, click here.

Comments

Submitted by Anonymous (not verified) on Wed, 01/28/2009 - 03:07

A full and clear non-technical discussion of the impact evaluation dilemma and options for addressing it: thank you

Submitted by Anonymous (not verified) on Sun, 03/02/2008 - 16:07

This is one of the most concise overviews describing the nature and status of evaluations of health communication programs that I've encountered. Excellent sources cited, brief and easy to read -- thanks!

Submitted by Anonymous (not verified) on Sun, 06/28/2009 - 02:10

Immensely useful and directly pertaining to the efforts to strengthen the ongoing endeavors.
Thanks and best wishes,
Dr. Rajesh Gopal, AIDS Control, Gujarat, India.