Institute of Development Studies
In his later writings the philosopher Wittgenstein concludes that in most cases ‘the meaning of a word is its use in the language’ (Wittgenstein 1958). Loosely interpreted, the advice is not to worry excessively about dictionary definitions or, in the present instance, about seeking to establish precise dividing lines between different research activities. We will try to follow that advice in this text by accepting that a very wide range of contexts and activities have been and should continue to be seen as within the boundaries of both ‘Health Systems Research’ and ‘Implementation Research’. We will set out our use of these terms in this chapter but make no special claims for the value of our interpretation over the many others that can be found in the recent literature.
1. Health systems
Following the above guideline, we would consider research on the impact of reforms to the UK National Health Service (Allen 2013), on the diverse range of formal and informal health providers in Bangladesh (Ahmed et al. 2013), the activities of patent medicine vendors in Nigeria (Oladapo and Lucas 2012), or on community-based insurance in Laos (Alkenbrack and Lindelow 2013) as mainstream examples of health systems research. We would also include research on household healthcare-seeking behaviour (e.g. Diaz et al. 2013) or coping strategies in response to illness (e.g. Rahman et al. 2013). To what extent can we characterise these very diverse studies as research on ‘Health Systems’?
The WHO defines a health system as:
all organizations, people and actions whose primary intent is to promote, restore or maintain health. . . . It includes, for example, a mother caring for a sick child at home; private providers; behaviour change programmes; vector-control campaigns; health insurance organizations; occupational health and safety legislation. It includes inter-sectoral action by health staff, for example, encouraging the ministry of education to promote female education, a well known determinant of better health.
(WHO 2007: 2)
This is an interesting definition in a number of respects. First, it omits words such as ‘interacting’ or ‘interdependent’, which we would find in most dictionary definitions of the word ‘system’, though the same document later stresses the importance of interactions between health system components or ‘building blocks’. Second, it defines the ‘system boundary’ – which organisations, people and actions are considered part of the system and which not – in terms of their ‘primary intent’. This seems a potentially elusive criterion and one that must essentially involve a subjective judgement. Should we be examining, for example, the constitution of the Indian Ministry of Health and Family Welfare, the professional codes of conduct of Cambodian doctors or the mission statements of Ugandan Health NGOs to assess their health system status? Moreover, if individuals or organisations are contributing to the promotion, restoration or maintenance of health, to what extent does it matter why they are doing so? On the other hand, should we include those who have the best of intentions but who, perhaps because they have limited health knowledge, do more harm than good?
In line with the ‘primary intent’ requirement, health systems themselves are seen by the WHO as having goals, specifically, ‘improving health and health equity, in ways that are responsive, financially fair, and make the best, or most efficient, use of available resources’ (WHO 2000). As might be expected from a United Nations agency, the overall approach seems most appropriate for a ‘national health system’ with a defined organisational structure that operates under the direction of a well-intentioned government. Over recent years the WHO has expended considerable energy on devising methodologies to assess the comparative performance of such systems (Murray and Evans 2003) and producing guidance on ‘health systems strengthening’, which tends to focus on the central role that should be played by governments in terms of strategic planning, regulation and accountability, if not necessarily in service provision.
Though private providers are specifically identified in the list of organisations within the health system, it is not at all clear which, if any, private sector actors might be included within the WHO definition if strictly applied. Is the ‘primary intent’ of an international pharmaceutical company to ‘promote, restore or maintain health’ or to meet the expectations of its owners for a substantial return on their investment? In many countries there is a legal obligation on the directors of all companies with shareholders to act in the ‘best interest’ of those shareholders. It would seem unlikely that those interests would always align with the WHO criterion. Similarly, patent medicine vendors in Nigeria are the main providers of anti-malarial drugs to the rural population, even though the law prohibits them from doing so (Goodman et al. 2007). They are typically marginalised workers on very low incomes whose ‘primary intent’ will in almost all cases be to support themselves and their families by responding to the demands of their clients. On the other hand, they would probably claim, and often genuinely believe, that they are providing a useful service to those clients, who have very limited access to any formal, regulated health services.
The current focus of WHO work on health systems is on the need for them to deliver ‘Universal Health Coverage’ (UHC):
ensuring that all people have access to needed promotive, preventive, curative and rehabilitative health services, of sufficient quality to be effective, while also ensuring that people do not suffer financial hardship when paying for these services
Again, note that the focus is on health system outcomes. This is a perfectly rational approach, given that the ‘primary intent’ of the WHO is not to advance our knowledge of health systems but to improve the health of the world’s population and that the primary mechanism available to them is to persuade national governments to play a lead role in improving the overall system that is currently delivering health outcomes in a given country. The ‘building blocks’ methodology of the WHO Health Systems Framework (WHO 2010) is an extension of this strategy. The aim is to identify a number of essential functions of a health system in order to consider the extent to which that system is meeting, or is capable of meeting, performance targets in terms of:
- Health services
- Health workforce
- Health information
- Medical products, vaccines and technologies
- Health financing
- Leadership and governance
For example, under the health services building block, the system should deliver ‘effective, safe, quality personal and non-personal health interventions to those that need them, when and where needed, with minimum waste of resources’ (WHO 2010: 3). Using this approach, health systems strengthening can be defined in terms of (a) determining the extent to which any given component is failing to deliver its expected outcomes, (b) analysing the reasons for that failure – which may lie in its interactions with other components – and (c) implementing actions that will remedy the situation. Again, it seems evident that government will play the key role in this process, possibly in collaboration with international agencies where resources are highly constrained or the national capacity for health systems strengthening is limited.
As indicated, we would see the above approach to health systems as one that may well be appropriate for the aims and procedures of the WHO. However, here we are concerned with health systems research. Gaining knowledge as to how specific health systems work is our primary intent. We assume that those working in this area will wish to use their research findings to influence policy in such a way as to ‘promote, restore or maintain health’, and a later chapter will provide guidance as to how this may best be achieved, but our first priority is to understand the health system that is the focus of our research – gathering and interpreting evidence about the complex interplay between the various actors who are engaged in what we identify as health-related activities (Peters 2014).
An interesting illustration of the possibilities for alternative approaches to the analysis of health systems is provided by Ahmed et al. (2013) in a study of the health sector in Bangladesh. The context within which that study is set will be very familiar to those who have worked on health systems in resource-poor environments. There are a multiplicity of health providers offering a variety of allopathic and alternative treatments, in this case including Ayurvedic, Unani and homeopathic remedies. Transactions are typically on a ‘cash-for-service’ basis, even in the public sector. Poorer clients have very limited access to qualified providers (doctors, nurses, midwives, pharmacists) and rely on unlicensed village doctors, drug sellers, traditional healers, community health workers and traditional birth attendants. In this environment, the formal health sector regulatory framework has limited relevance for the great majority of the population and there is a yawning gap between the formal ‘Bangladesh Health System’, as defined in government policy statements, and the reality on the ground. To address this reality Ahmed et al. develop a conceptual framework that:
challenges static and antiquated notions of policy and governance identified, for example, in the building block approach of the WHO Health Systems Framework or in the efforts to align development partners around a single country health plan. The complex and chaotic nature of health systems is unlikely to be tamed by these relatively naive notions of command and control health systems governance.
(Ahmed et al. 2013: 1753)
Two relatively recent approaches to health systems have had a considerable influence on research in this area. The first is usually described as work on ‘health markets’. At one level it involves an exploration of the role of private health providers, driven by a recognition of the extent to which in many countries health services are purchased in the same way as other services and commodities: “in at least 19 countries in Asia and 15 countries in Africa – including many of the world’s most populous nations (Bangladesh, China, India, Nigeria, and Pakistan) – more than half of total health expenditures are private out-of-pocket transactions” (Lagomarsino et al., 2009: 2). More generally, it questions the continuing relevance of many of the standard ways of characterising the health sector, recognising that health systems have become increasingly pluralistic. Old barriers between private/public, modern/traditional, and formal/informal health providers seem to be breaking down (Bloom et al. 2014; Peters and Bloom 2012). Bloom et al. (2008: 2077) suggest that health systems can be more usefully considered as complex “knowledge economies which produce and mediate access to health knowledge embedded in people, services and commodities”. Their work focuses attention on the ‘stocks and flows’ of health knowledge: how its value is determined, who possesses it and how others gain access to it. This requires a shift away from traditional health systems analysis, with its concerns with public and private sectors, modern and traditional providers, etc., focusing attention instead on power relations and the ways in which it might be possible to construct new forms of “social contracts for health care which build on existing areas of competence and good practice, whether mediated by states, markets or other institutional actors” (Bloom et al. 2008: 2085).
The second approach, which has captured the imagination of many leading health system researchers over recent years, is based on the observation that health systems have all the characteristics of complex adaptive systems (CAS) (Bloom 2014; Rickles et al., 2007; Tan et al., 2005; World Bank 2007). A wide range of actors with diverse objectives act at multiple levels and interact through dynamic and multifaceted networks. As de Savigny and Adam (2009) point out, an intervention in one area will typically have consequences, often unforeseen, for many others.
“every health intervention, from the simplest to the most complex, has an effect on the overall system. Seemingly simple interventions targeting one health system entry point have multiple and sometimes counterintuitive effects elsewhere in the system” (de Savigny and Adam 2009: 30).
Complex adaptive systems have the capability to self-organise, adapt, and learn from experience. They can change in a highly non-linear fashion over time, and are not easily controlled or predictable. It is not unusual for a CAS to show limited responses to apparently major interventions but then to change suddenly when a tipping point is reached (Gladwell 2000). In Chapter 4 we will discuss various aspects of CAS phenomena that are relevant to the analysis of health systems, including path dependency, feedback loops, scale-free networks, emergent behaviour, and phase transitions or tipping points (Paina and Peters 2011).
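The tipping-point behaviour described above can be sketched with a deliberately simple threshold model. This is our own illustrative toy, not a model drawn from the CAS literature cited: each agent adopts an innovation once the fraction of adopters in the system reaches that agent's personal threshold, so coverage barely moves below a critical level of initial seeding and then jumps discontinuously.

```python
def adoption_cascade(n_agents, thresholds, n_seeds, max_rounds=500):
    """Toy threshold model of a tipping point.

    An agent adopts once the overall adoption fraction meets its
    personal threshold. Returns the final adoption fraction.
    """
    adopted = [i < n_seeds for i in range(n_agents)]
    for _ in range(max_rounds):
        frac = sum(adopted) / n_agents
        new = [a or frac >= t for a, t in zip(adopted, thresholds)]
        if new == adopted:          # system has settled
            break
        adopted = new
    return sum(adopted) / n_agents

# Every agent needs 20% coverage before adopting (hypothetical figures).
thresholds = [0.2] * 100
print(adoption_cascade(100, thresholds, n_seeds=19))  # 0.19 - intervention fizzles
print(adoption_cascade(100, thresholds, n_seeds=20))  # 1.0  - tipping point crossed
```

Seeding 19 agents leaves the system essentially unchanged; seeding one more triggers a full cascade — the kind of highly non-linear response to an apparently marginal change in an intervention that the CAS literature warns about.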
2. A general framework for research on health systems
To provide a framework for the development of research strategies that can be used in exploring the wide range of health systems referred to above, we will adopt a generic approach to their definition, adapted from that proposed by Bishai (undated). We first define as ‘agents’ all those individuals who are considered to play a role in a given health system (doctors, nurses, managers, drug sellers, patients, carers, etc.). These agents may come together with a common purpose to form various identifiable ‘units’ (organisations or groups – ministries of health, hospitals, health centres, health insurance agencies, unions, households, etc.), which can be regarded as both capable of making decisions and responsible for any actions undertaken as a result of those decisions. For example, we might hold an individual hospital doctor responsible for their poor treatment of a specific patient but hold the hospital management responsible for their collective failure to employ sufficient doctors. Finally, we can consider the rules or ‘institutions’ that govern or at least influence the behaviour of these agents and units. The term ‘institutions’ is used to cover not only the relevant legal frameworks that regulate the health system but also any established procedures, protocols, guidelines, codes of conduct, accepted behavioural norms, etc., that agents and units are expected to observe.
Given the above, we can define a health system very generally as ‘an interacting collection of agents, units, and institutions concerned with human health’. Note that this definition can encompass national systems such as the UK NHS, the patent medicine vendors in a given state of Nigeria, community-based insurance schemes, and rural households in Bangladesh who are coping with the impact of healthcare costs. We make no a priori judgement as to the benevolence of the individuals or organisations involved, or as to the virtue of the institutions that influence their behaviour. Our initial aim will be to understand how a given system operates, though usually with the implicit intention of identifying potential ways to improve that operation in order to generate better health outcomes. The above implies that in any given context there may be multiple ways to define health systems. Health systems are essentially conceptual models of reality. “The concept of a ‘health system’ is a heuristic device for understanding a complex reality. Analysts draw different boundaries around the system depending on the questions they are trying to answer” (Bloom 2014: 161). The important question is not the extent to which they precisely mirror that reality – all economic and social models involve drastic simplification – but the extent to which they are useful in predicting and explaining observable outcomes. Thus, the first task of a health systems researcher will be to decide how they will identify the types of agents, units and institutions with which they will be concerned and how to specify the system boundary.
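One purely illustrative way to make the agents/units/institutions vocabulary concrete is to write it down as a minimal data schema. Everything here — the class names, fields, and the example entries — is our own assumption for exposition, not a standard representation:

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """An individual playing a role in the system."""
    name: str
    role: str                     # e.g. 'doctor', 'drug seller', 'patient'

@dataclass
class Unit:
    """Agents acting with a common purpose (hospital, household, ...)."""
    name: str
    members: list                 # list of Agent

@dataclass
class Institution:
    """A rule, norm or protocol that influences behaviour."""
    name: str
    formal: bool                  # legal framework vs behavioural norm

@dataclass
class HealthSystem:
    """'An interacting collection of agents, units, and institutions
    concerned with human health' - the boundary is the researcher's choice."""
    agents: list = field(default_factory=list)
    units: list = field(default_factory=list)
    institutions: list = field(default_factory=list)

# Hypothetical example: a single drug shop and the norm governing it.
vendor = Agent('village vendor', 'patent medicine vendor')
shop = Unit('village drug shop', members=[vendor])
norm = Institution('informal dispensing norm', formal=False)
system = HealthSystem(agents=[vendor], units=[shop], institutions=[norm])
```

The point of the sketch is the boundary decision: two researchers studying the same setting could legitimately populate `system` with quite different agents, units and institutions, depending on the questions they are trying to answer.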
In quantitative studies, for example those using questionnaire surveys or analysing routine data, it will often be necessary to make such decisions at the start of the research process. To give a simple example, if the ‘units’ to be surveyed include private clinics, you will need to specify how private clinics are to be defined and sampled – often a far from simple task. Similarly, if the questionnaire survey is to gather information on the institutional context, for example rules on incentive payments, you will have to decide on the importance of informal payments and the extent to which you will attempt to explore the behavioural norms that govern such payments. As we will discuss in Chapter 7, one advantage of qualitative studies is that such definitions can be allowed to emerge and evolve during the research activity, though this advantage has to be balanced against the need to defend against challenges of subjectivity and bias that you will almost inevitably face from those who wish to dispute your findings.
Whether the health system under discussion is defined before or during the research activity, we would argue that the primary obligation on any researcher is for transparency – they must go out of their way to ensure that those who read, and may even use, their findings fully understand the assumptions made in arriving at those findings. Good researchers should have the confidence to expose themselves to critical evaluation of both their conceptual models and their methodologies, especially if they have expectations that their work may have a significant influence on the formulation of health policy.
3. Implementation research
In his first address as incoming President of the World Bank Group, Jim Kim identified the ‘next frontier’ for the Group as:
“helping to advance a ‘science of delivery’. Delivery isn’t easy – it’s not as simple as just saying ‘this works, this doesn’t’. Effective delivery demands context-specific knowledge. It requires constant adjustments, a willingness to take smart risks, and a relentless focus on the details of implementation”.
Implementation research (IR) can be seen as the means by which we can develop a ‘science of delivery’. It has been defined as: ‘scientific inquiry into questions concerning implementation – the act of carrying an intention into effect’ (Peters et al. 2013a). In the health sector, the focus on implementation research has arisen partly from a long-standing sense of frustration that interventions for which there appears to be strong evidence indicating the potential for substantial reductions in levels of morbidity and/or mortality in high-risk populations are either not being used or not being used effectively (e.g. Bhutta et al. 2014; Darmstadt et al. 2005). The primary objective of IR in health is therefore seen as the effective and efficient integration of such innovations into existing health systems, “to improve the uptake . . . of research findings into routine and common practices” (Padian et al. 2011:199).
As indicated above, there are multiple definitions of IR, often arising from the specific concerns of those working in different areas of health research. For example, those concerned with innovations in medical science, such as the development of new pharmaceuticals, will often use the term ‘Translational Research’ (Drolet and Lorenzi 2011) in relation to the overall process by which those innovations move from the laboratory to various stages of clinical trials on human subjects and then on to clinical practice. In the present text, because we are focusing on IR in the context of research on health systems, we will be concerned with the final phase in this process, the “integration of research findings and evidence-based interventions into health care policy and practice”. We therefore exclude discussion of laboratory research to develop new drugs or medical technologies, and clinical trials to test the efficacy and safety of those drugs or technologies.
Our overall concern is thus to determine how best to apply health innovations that have proved successful in carefully controlled environments (laboratories, clinical trials, small pilot exercises, etc.) in a wider context. This requires the design of some form of intervention, which we will use as a general term to cover a range of activities including policy changes, programmes and projects. These have to be implemented, which typically involves actions by a collection of individuals that will here be described as the implementation team. In general we will assume that IR is best undertaken by ‘insiders’ – here defined as individuals who work alongside the implementation team, though with their own terms of reference and independently funded – and that the research questions they address are generated by identifying the constraints and challenges encountered during the implementation process. The scope of IR studies and the range of issues addressed can be very wide, including “the factors affecting implementation . . ., the processes of implementation themselves . . ., and the outcomes, or end-products of the implementation under study” (Peters et al. 2013b: 27).
Any implementation will take place within the context of an existing health system, which is in turn embedded in a broader physical, social, economic, institutional and, often overlooked, historical context (Grundy et al. 2014). A myriad factors may thus impact on the relative success or failure of that implementation, the great majority of which will be outside the control of the implementation team. The most appropriate strategy will often be ‘constrained adaptation’ – modification of the intervention design to allow for contextual factors but not to the extent that the primary aims of the intervention may be subverted.
The distinction between IR and the closely related activity designated ‘Operations Research’ (OR) is hard to pin down. For example, Zachariah et al. (2009) define OR as “the search for knowledge on interventions, strategies, or tools that can enhance the quality, effectiveness, or coverage of programmes in which the research is being done” (Zachariah et al. 2009: 711), which bears a close similarity to the definition of IR provided above. In practice, definitions vary from agency to agency. An important framework document from the Global Fund (2008) on IR and OR makes no attempt to differentiate between them, other than to provide an annex listing a selection of these definitions. In this text we make a pragmatic distinction, drawing on the following definitions:
the use of systematic research techniques for program decision-making to achieve a specific outcome. OR provides policymakers and managers with evidence that they can use to improve program operations. (WHO 2003: 3)
a. identify common implementation problems . . . b. develop practical solutions to these problems . . . c. determine . . . the best way to introduce these new implementation strategies into the health system and facilitate their full scale implementation, evaluation and modification, as required.
Note that the first definition focuses on the use of research to improve the operation of a given programme – to improve the implementation of the programme within which the research is undertaken. It is not unusual for such a programme to have an OR component, with terms of reference that require those working on this component to focus on research that project managers may be able to use to enhance the implementation process. The definition implies that OR is really only useful if it suggests actions that can be undertaken by the project management. In contrast, the second requires the IR team to use its independently funded resources to propose implementation strategies that can be integrated into the health system, by exploring not only issues that might hamper the current implementation but those that might be encountered in other contexts. In doing so, they may well be able to feed those strategies back into the implementation in which they are embedded, playing an OR role, but their primary task is to look outwards and explore the broader implications of their research in terms of seeking to maximise access to the benefits of the innovation.
To take a simple example, if an innovative incentive scheme for community health workers in a given region was being subverted by demands from higher-level health officials to be included in the scheme, an OR solution might be to ask the head of the regional government to negotiate with those officials, if that individual were strongly supportive of the intervention. From an IR perspective, this would raise a number of questions about the possibilities for both sustainability – how long will that individual remain as head of government – and scaling up. For example, how likely is it that similar issues will be encountered in other regions? Are there plausible alternative strategies that could be employed where the regional head was not willing to intervene or perhaps took the side of the officials? To what extent could the incentive scheme be adapted to gain acceptance among health officials while still delivering most of the anticipated gains in terms of health outcomes?
We would argue that the role of IR in terms of encouraging potentially wide-ranging reforms to the operation of health systems has implications for both the overall approach that researchers should assume and the research methodologies and methods that they should adopt. The failure of a time-limited project or programme in a given location will involve a waste of valuable resources and may delay or hinder the introduction of a potentially valuable innovation. The failure of a major health system reform could have far more serious consequences, both in terms of the size of the investment involved and the number of individuals affected. Researchers whose primary objective may be to encourage the widespread uptake of a health innovation and to influence implementation practice at scale have to ensure that their recommendations are backed by the most rigorous and persuasive research findings.
In Chapter 2, we will consider the overall nature of the innovation process and relevant aspects of research design. We would suggest that the primary requirement is a very clear initial decision as to what the IR activities are intended to achieve. On the one hand, we are assuming that all those involved will be primarily concerned with influencing policy, and that objective should be one starting point for research design – what types of research are most likely to generate findings that will be accepted into the policy process? On the other, we would argue that this objective must be carefully balanced against the demand for quality and rigour identified above.
We suggest that there are four broad areas that require special attention. The first, which we will address in Chapter 3, is the need to systematically review and evaluate the relevant existing literature, which will almost always be far more extensive than most researchers assume. The second requirement, discussed in Chapter 4, is to develop an in-depth understanding of the intervention that is to be implemented. This involves detailed knowledge of each step in the overall process that is intended to deliver the intended benefits, the assumptions that must hold for this process to function as planned, and the indicators that will allow those managing the intervention to monitor whether the implementation is on track. Because implementation is a dynamic process that will invariably diverge from the original plan, it will also be necessary to devise strategies that allow the researchers to be aware of any important modifications to that plan, focusing on the extent to which these are linked to the failure of the original assumptions or to contextual factors that had not been fully appreciated when the plan was devised. As indicated above, one key issue will be the extent to which adaptation risks impacting on expected outcomes by threatening the fidelity of the implementation – has adaptation changed any essential features of the intervention? Are we in practice implementing a reform that differs substantially from the one intended? As previously suggested, we would argue that only ‘insiders’ – researchers fully engaged with those undertaking the implementation – can hope to comprehend the process at this level of detail.
The third area is that of context, which we will address in Chapter 5. Here, as discussed above, we start from an analysis of the health systems context, with the health system and its boundaries carefully defined by the research team. We then need to explore the broader physical, social, economic, institutional and historical context within which the health system is located to identify those factors that may potentially influence implementation outcomes. We would see institutional analysis and stakeholder analysis as central to this process. Experience suggests that too often these are seen as peripheral activities, with rote procedures generating simplistic findings that play little part in the implementation process. Finally, in terms of overall approach, we argue the need for early and close engagement with key stakeholders, including those who may eventually play a central role in the integration of research findings into routine practice.
In Chapter 6 we will then address another often neglected aspect of health systems research, that of ethical issues. While ethical concerns have played a central role in the design of clinical research studies, there is often an implicit assumption that non-clinical health sector research is exempt from the strict observance of ethical standards. We will demonstrate why this assumption is unacceptable.
The ambitious nature of IR also has implications for data collection, analysis and interpretation, as we will argue in Chapter 7. We have argued that it should be independently funded but recognise that this will probably mean that it will have very limited resources with which to pursue its very ambitious objectives. A key skill will therefore be to use those resources most effectively, which involves careful allocation between a potentially wide range of research activities. One guiding principle should be that of transparency. The objective of initiating large-scale reform of some aspect of the health system implies a need to influence a variety of key stakeholders, many of whom will have limited knowledge of data collection and analysis procedures. The overriding obligation of researchers is to the population who may be affected if their findings are put into practice. Seeking to win over policymakers and other stakeholders by exercising analytical or presentational skills that mask underlying data limitations is not the way to meet that obligation. Two broad approaches to these issues, labelled by convention ‘qualitative’ and ‘quantitative’, though we recognise the inherent limitations of this division, are explored in Chapters 8 and 9.
As we will discuss in Chapter 10, the above obligation also imposes a requirement to present findings in a manner that can be generally understood and correctly interpreted. This is not to imply that sophisticated analytical techniques, for example econometric modelling, should not be used in IR, but we should be very wary of persuading policymakers of the likely benefits of a major reform based purely on the findings of such an analysis unless there were strongly supportive evidence from other sources.
At a deeper level, the potential importance of IR findings requires that researchers be sufficiently reflexive that they do not ‘fool themselves’ into believing that they have fully understood the nature of the implementation process by the routine application of qualitative and/or quantitative analytical procedures. Understanding typically requires that much more time is spent in careful consideration than in the manipulation of data. In terms of methodology and methods we would see the key areas as: sampling methodology – both for ‘qualitative’ and ‘quantitative’ research to avoid bias and accusations of bias; a preference for time series data where possible because the aim is to track processes; attention to the probability/frequency distributions of key variables to allow the thoughtful choice of appropriate summary statistics that encourage correct interpretation; and the need for greater attention than is often the case in academic research to underlying assumptions.
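The point about matching summary statistics to the frequency distribution of a variable can be made with a small worked example. Out-of-pocket health payments are typically right-skewed, with a few catastrophic cases; the figures below are invented purely for illustration:

```python
import statistics

# Hypothetical household out-of-pocket payments (currency units);
# two catastrophic cases dominate an otherwise modest distribution.
payments = [2, 3, 3, 4, 5, 5, 6, 8, 150, 400]

print(statistics.mean(payments))    # 58.6 - pulled up by the two outliers
print(statistics.median(payments))  # 5.0  - the typical household's experience
```

Reporting only the mean would suggest heavy spending across the board; reporting only the median would hide the catastrophic cases entirely. A thoughtful choice here means presenting both, or the full distribution, so that policymakers interpret the evidence correctly.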
Overall, we would argue that as researchers we have too often claimed to understand how complex health system interventions function on the basis of very flimsy evidence, typically involving relatively short visits to the field, limited interaction with key stakeholders and one or two cross-sectional surveys of providers and/or intended beneficiaries that manage to be both complicated and simplistic. The argument in this text, as echoed in a recent book on the evaluation of complex interventions (Patton 2011), is that to reach a position from which we are willing to pass judgement on the advisability of scaling up or relocating a health system intervention that may have major implications for the health and well-being of the population, we need to take implementation research much more seriously than in the past and reconsider the amount of time, effort and resources that we are prepared to allocate to the task.
Ahmed, S.M.; Evans, T.G.; Standing, H. and Mahmud, S. (2013) ‘Bangladesh: Innovation for Universal Health Coverage 2: Harnessing Pluralism for Better Health in Bangladesh’, The Lancet 382: 1746–55
Alkenbrack, S. and Lindelow, M. (2013) ‘The Impact of Community-Based Health Insurance on Utilisation and Out-Of-Pocket Expenditures in Lao People’s Democratic Republic’, Health Economics, http://onlinelibrary.wiley.com/doi/10.1002/hec.3023/abstract (accessed 3 March 2015)
Allen, P. (2013) ‘An Economic Analysis of the Limits of Market Based Reforms in the English NHS’, BMC Health Services Research, 13 (Supplement 1): S1 www.biomedcentral.com/1472-6963/13/S1/S1 (accessed 28 January 2015)
Bhutta, Z.A.; Das, J.K.; Bahl, R.; Lawn, J.E.; Salam R.A.; Paul, V.K.; Sankar, M.J.; Blencowe, H.; Rizvi, A.; Chou, V.B. and Walker, N. (2014) ‘Can Available Interventions End Preventable Deaths in Mothers, Newborn Babies, and Stillbirths, and at What Cost?’, The Lancet 384: 347–70
Bishai, D. (undated) ‘The Building Blocks of Health Systems’, online course, Future Health Systems, www.futurehealthsystems.org/online-course/module-1-the-building-blocks-of-health-systems.html (accessed 20 July 2014)
Bloom, G. (2014) ‘History, Complexity and Health Systems Research’, Social Science and Medicine 117: 160–61
Bloom, G. and Standing, H. (2001) ‘Pluralism and Marketisation in the Health Sector: Meeting Needs in Contexts of Social Change in Low- and Middle-Income Countries’, IDS Working Paper 136, Brighton, UK: Institute of Development Studies
Bloom, G.; Standing, H. and Lloyd, R. (2008) ‘Markets, Information Asymmetry and Health Care: Towards New Social Contracts’, Social Science and Medicine 66.10: 2076–87
Bloom, G.; Wilkinson, A.; Standing, H. and Lucas, H. (2014) ‘Engaging with Health Markets in Low- and Middle-Income Countries’, IDS Working Paper 443, Brighton, UK: Institute of Development Studies, www.ids.ac.uk/publication/engaging-with-health-markets-in-low-and-middle-income-countries (accessed 28 January 2015)
Darmstadt, G.L.; Bhutta, Z.A.; Cousens, S.; Adam, T.; Walker, N. and de Bernis, L. (2005) ‘Evidence-based, Cost-effective Interventions: How Many Newborn Babies Can We Save?’, The Lancet 365.9463: 977–88
de Savigny, D. and Adam, T. (eds) (2009) Systems Thinking for Health Systems Strengthening, Geneva: World Health Organization, www.who.int/alliance-hpsr/resources/9789241563895/en/ (accessed 28 January 2015)
Diaz, T.; George, A.S.; Rao, S.R.; Bangura, P.S.; Baimba, J.B.; McMahon, S.A. and Kabano, A. (2013) ‘Healthcare Seeking for Diarrhoea, Malaria and Pneumonia among Children in Four Poor Rural Districts in Sierra Leone in the Context of Free Health Care: Results of a Cross-Sectional Survey’, BMC Public Health 13: 157, www.biomedcentral.com/1471-2458/13/157 (accessed 28 January 2015)
Drolet, B.C. and Lorenzi, N.M. (2011) ‘Translational Research: Understanding the Continuum from Bench to Bedside’, Translational Research 157.1: 15
Fixsen, D.L.; Naoom, S.F.; Blase, K.A.; Friedman, R.M. and Wallace, F. (2005) ‘Implementation Research: A Synthesis of the Literature’, National Implementation Research Network, FPG Child Development Institute, University of North Carolina, http://nirn.fpg.unc.edu/resources/implementation-research-synthesis-literature (accessed 28 January 2015)
Gilson, L. (ed.) (2012) ‘Health Policy and Systems Research: A Methodology Reader’, Geneva: Alliance HPSR/WHO, www.who.int/alliance-hpsr/resources/reader/en/ (accessed 28 January 2015)
Gladwell, M. (2000) The Tipping Point: How Little Things Can Make a Big Difference, London: Abacus
Global Fund (2008) ‘Framework for Operations and Implementation Research in Health and Disease Control Programs’, The Global Fund, http://whqlibdoc.who.int/publications/2008/9292241109_eng.pdf (accessed 28 January 2015)
Goffman, E. (1956) The Presentation of Self in Everyday Life, New York: Doubleday
Goodman, C.; Brieger, W.; Unwin, A.; Mills, A.; Meek, S. and Greer, G. (2007) ‘Medicine Sellers and Malaria Treatment in sub-Saharan Africa: What do They do and How Can their Practice be Improved?’, American Journal of Tropical Medicine and Hygiene 77.6 (Supplement): 203–18
Grundy, J.; Hoban, E.; Allender, S. and Annear, P. (2014) ‘The Inter-Section of Political History and Health Policy in Asia – The Historical Foundations for Health Policy Analysis’, Social Science and Medicine 117: 150–59
Lagomarsino, G.; Nachuk, S. and Singh Kundra, S. (2009) ‘Public Stewardship of Private Providers in Mixed Health Systems: Synthesis Report from the Rockefeller Foundation-Sponsored Initiative on the Role of the Private Sector in Health Systems in Developing Countries’, www.rockefellerfoundation.org/uploads/files/f5563d85-c06b-4224-bbcd-b43d46854f83-public.pdf (accessed 28 January 2015)
Murray, C.J.L. and Evans, D.B. (eds) (2003) Health Systems Performance Assessment: Debates, Methods and Empiricism, Geneva: World Health Organization, www.who.int/health_financing/documents/cov-hspa/en/ (accessed 28 January 2015)
Oladepo, O. and Lucas, H. (2012) ‘Improving the Performance of Patent Medicine Vendors in Nigeria’, in G. Bloom, B. Kanjilal, H. Lucas and D. Peters (eds), Transforming Health Markets in Asia and Africa: Improving Quality and Access, London: Earthscan
Padian, N.S.; Holmes, C.B.; McCoy, S.I.; Lyerla, R.; Bouey, P.D. and Goosby, E.P. (2011) ‘Implementation Science for the US President's Emergency Plan for AIDS Relief (PEPFAR)’, Journal of Acquired Immune Deficiency Syndromes 56.3: 199–203
Paina, L. and Peters, D.H. (2011) ‘Understanding Pathways for Scaling Up Health Services Through the Lens of Complex Adaptive Systems’, Health Policy and Planning 27: 365–73
Patton, M.Q. (2011) Developmental Evaluation: Applying Complexity Concepts to Enhance Innovation and Use, New York: Guilford Press
Peters, D.H. (2014) ‘The Application of Systems Thinking in Health: Why Use Systems Thinking?’, Health Research Policy and Systems 12: 51, www.health-policy-systems.com/content/12/1/51 (accessed 28 January 2015)
Peters, D.H.; Adam T.; Alonge, O.; Agyepong, I.A. and Tran, N. (2013a) ‘Implementation Research: What It Is and How To Do It’, BMJ 347: f6753
Peters, D.H.; Tran, N.T. and Adam T. (2013b) ‘Implementation Research in Health: A Practical Guide’, Alliance for Health Policy and Systems Research, World Health Organization, www.who.int/alliance-hpsr/alliancehpsr_irpguide.pdf (accessed 28 January 2015)
Peters, D. and Bloom, G. (2012) ‘Bring Order to Unregulated Health Markets’, Nature 487.7406: 163–65
Rahman, M.M.; Gilmour, S.; Saito, E.; Sultana, P. and Shibuya, K. (2013) ‘Self-reported Illness and Household Strategies for Coping with Health-Care Payments in Bangladesh’, Bulletin of the World Health Organization 91: 449–58, www.scielosp.org/pdf/bwho/v91n6/13.pdf (accessed 28 January 2015)
Rickles, D.; Hawe, P. and Shiell, A. (2007) ‘A Simple Guide to Chaos and Complexity’, Journal of Epidemiology and Community Health 61.11: 933–37
Tan, J.; Wen, J.H. and Awad, N. (2005) ‘Health Care and Service Delivery Systems as Complex Adaptive Systems’, Communications of the ACM 48.5: 36–44
TDR (2005) UNICEF/UNDP/WHO/World Bank Special Program for Research and Training in Tropical Diseases (TDR) Implementation Research and Methods (IRM), ‘Guidelines for Preparing a Grant Application to the TDR Steering Committee for Implementation Research’, http://geneplot.plasmodb.org/publications/tdr/grants/workplans/pdf/ir_guidelines.pdf (accessed 28 January 2015)
WHO (2010) ‘Monitoring the Building Blocks of Health Systems: A Handbook of Indicators and Their Measurement Strategies’, www.who.int/healthinfo/systems/WHO_MBHSS_2010_full_web.pdf?ua=1 (accessed 28 January 2015)
WHO (2007) ‘Everybody’s Business: Strengthening Health Systems to Improve Health Outcomes. WHO’s Framework for Action’, Geneva: World Health Organization, www.who.int/healthsystems/strategy/everybodys_business.pdf (accessed 28 January 2015)
WHO (2003) ‘Expanding Capacity for Operations Research in Reproductive Health: Summary Report of a Consultative Meeting, WHO, Geneva, Switzerland, December 10–12, 2001’, Geneva: World Health Organization, www.who.int/reproductivehealth/publications/general/RHR_02_18/en/ (accessed 28 January 2015)
WHO (2000) World Health Report 2000, Geneva: World Health Organization
Wittgenstein, L. (1958) Philosophical Investigations (translated by G.E.M. Anscombe), second edition, Oxford: Blackwell
World Bank (2007) Healthy Development: The World Bank Strategy for Health, Nutrition, and Population Results, Washington, DC: World Bank
Zachariah, R.; Harries, A.D.; Ishikawa, N. et al. (2009) ‘Operational Research in Low-Income Countries: What, Why, and How?’, The Lancet Infectious Diseases 9: 711–17
Director, Centre for Studies in Family Medicine,
Schulich School of Medicine & Dentistry,
Western University, Ontario
1. Innovation in Health Systems
“(R)ather than asking how research evidence can be made more influential, academics should aim to understand what influences and constitutes policy” (Oliver et al., 2014: 1).
In this chapter we will consider the nature of change in health systems, and how small innovations in technology or delivery complement large changes in policy or program design to improve outcomes of care. We provide guidance to health systems evaluators and researchers to help you deploy your scientific skills to design and implement innovations in healthcare, and to assess their impact on health status and care delivery. We hope you will use some of these ideas to support the efforts of other stakeholders to influence decision-making in health systems, in order to improve the health care experience of patients, communities and providers, health outcomes and health system efficiency.
The majority of health innovation ideas do not progress into viable products, services or changes in healthcare delivery; few of those that are successfully developed and pilot tested are implemented effectively, and even fewer scale to their full potential and are institutionalized into common practice. The process of developing a new health intervention (drug or technology) takes on average 14 years and 2 billion USD, and yet fewer than 5 per cent of these innovations reach scale and are sustained (NIH, 2014). The proportion of successes is simply not known for health service delivery or policy changes, but given the relatively low investment in early stage development, there is reason to think that the success rate is even lower.
A good idea, even one supported by empirical science, is not enough on its own; the health system is complex, and good innovations alone will not be effective in real world settings. Successful development, implementation and scale up of health innovations is a multi-stage process that requires appraisal at every stage, and it is a team sport that requires true collaboration among all stakeholders throughout. Uptake of innovations appears to depend on the interests of the critical stakeholders, including the innovators, end users and decision makers. It is also influenced by the broader context, including the social and physical environment, the health system, and the regulatory, political and economic environment. Uptake is strongly influenced by the stage of maturity of the innovation at the point when it is offered to health system decision makers for consideration. All too often, the maturity of an innovation is overoptimistically assessed, a problem that can usually be blamed on the innovators. This staging is based upon the acceptability of the innovation to other stakeholders, the evaluation results up to that stage, and characteristics of the innovation itself, including its disruptiveness. We propose an approach to planning innovation which explicitly judges the maturity of an innovation and consciously undertakes specific work to ensure that it is matured to be ready for ‘prime time’.
Our suggested approach to innovation and its spread into the health system consists of several stages: development, pilot testing, implementation, scaling up and institutionalization. The approach is accompanied at each stage by very careful evaluation, in order to identify potential problems that the innovation will face and facilitate remedies before large investments are made in a potentially flawed solution.
Is there “a” health system?
We often speak of our countries as each having a health system, bringing to mind a large, coherent, rationally designed and managed organization. In such a health system we might well assume that most change is implemented through major policy initiatives led by national, provincial and local government health departments. But healthcare systems are not tightly coordinated or well integrated machines. It would be more accurate to think of health systems as consisting of multiple separate and uncoordinated elements in a spontaneously and rapidly evolving ecosystem, each element with a unique history and well established ways of doing things. These parts intersect, overlap, collaborate and compete against a background of changing patterns of disease, demography and care delivery. In this constant and often contradictory flux, widely varying responses to changes in need, demand, social forces and patterns of illness are implemented, some as policy, but many more simply as ad-hoc decisions on the delivery of care in reaction to one or another currently high profile problem.
Governments are only one of the many groups trying to shape the health system in their own interests. Others include professional organisations, producers and sellers of drugs and technologies, non-profit governance organizations running hospitals or long term care homes, advocacy groups for specific patient and disease issues, and last, but not least, citizens and their families, as individuals, interest groups and communities. These end-users of health care may favour different approaches to care, with dramatically different priorities and proposals for structuring health systems, depending on whether they are young or old, urban or rural, recent migrants or long established, wealthy or poor (and even how poor) and depending on whether or not they are ill, and if so, with what conditions.
It is important for health services researchers to understand that such complex systems are not easy to improve, and that well intended changes to one aspect of care may produce unintended consequences for another part of the health system. With complex patterns of needs, and complex structures for responding to these needs, how do health systems decision-makers decide what care to provide, to whom, and how? Should they prioritize health services for children or for the elderly, chronic or acute infectious illnesses, equity, access or coverage of the population, quality of care or continuity? This is to say nothing of the many other questions arising, such as whether primary care should be delivered by nurses, physicians, some other category of health worker entirely, or inter-professional teams.
Many issues influence health systems decision-making, and scientific evidence is only one element. This evidence might include a randomized trial or systematic review on what intervention works best to deal with a particular health or health care problem, or new survey data on the rapidly rising prevalence of a particular health problem or of a problem with equity, cost, quality or access. It might be focus group data describing the perceptions of a particular group of users of care, or case studies of successful quality improvement initiatives. These kinds of evidence can influence decision making in different ways. Sometimes the evidence is used as a post hoc justification for a decision that has already been taken; this rhetorical use of evidence may ignore contradictory evidence. Evidence may also be used substantively, when a coherent and comprehensive overview of options and evidence leads directly to a decision that the evidence supports. While this substantive use of evidence may sometimes be very influential, especially when supported by prominent, positive media coverage, more often the evidence is one part of the impetus towards action, or towards choosing among options for action.
How do healthcare systems evolve?
Innovation and change in health systems can be at large or small scale (strategic or tactical). Large scale policies have a profound influence on health care systems, determining their overall structure, funding, activities, eligible users and the health conditions they focus on. While these broad outlines determine the context in which care is provided, they do not necessarily determine the detailed daily operations, which result from the tactics chosen to implement each major policy. We propose that it is in influencing these details of how care is provided, irrespective of the broader system context, that our readers’ skills and efforts may have the most impact.
Countries can point to specific changes in laws, like the Canada Health Act of 1984, or the South African Ministry of Health’s commitment in 1995 to a National Health Insurance System, which have enormous impact on how the health systems of those countries are structured, and thus on what their health systems can and cannot do. For example, Canada’s Health Act focussed on physician centred acute care rather than chronic care, and did not require reimbursement of care provided by other professionals such as dentists and physiotherapists. The Act offered little funding support for long term or home care, and funded hospitals but not community based services (aside from family physician care) or ambulatory pharmaceutical provision for the elderly. Even though these choices are federal, they have strongly shaped the structure and functioning of the provincial health systems. Thus, many major features of health care in all provinces are similar, even though provincial health departments have complete constitutional autonomy over how care is delivered. This combination of autonomy and strong central influence means that there has been little coordination between provinces, or learning of lessons from each other, as each has individually tried to adapt its structure to the changing needs of an aging population and to control the costs of intensive, hospital based care.
In South Africa, with a similar national/provincial structure and even greater socio-economic and epidemiological challenges, including both HIV/AIDS and chronic disease, the commitment to a National Health Insurance System has focussed debate and senior decision-maker attention on how to fund care. As a consequence, strategic innovation in national policies, programs and public pronouncements has tended to focus on infrastructure and financing, rather than on the detailed development, implementation and evaluation of delivery mechanisms. As in Canada, this has left the implementation of care delivery in the hands of provincial health departments rather than the national government, with autonomy allowing locally relevant innovation. Although there is more communication and learning between provinces in South Africa than in Canada, the national health policy priorities and structures exert a similar influence on provincial priorities in care delivery, and progress at provincial level in designing large scale policies to deal with priority demographic and disease challenges, such as chronic disease and HIV/AIDS, has been similarly slow. In most countries, irrespective of level of income, strategic innovations in the form of high stakes national policy decisions with huge impact on the structure of healthcare are relatively rare.
Smaller scale, tactical innovation opportunities arise much more frequently. They arise where incremental changes in specific health care delivery mechanisms need to be designed in response to a locally recognised problem; where detailed implementation plans need to be developed for large-scale strategic responses to public pressure arising from a particular health problem (e.g. chronic diseases, HIV/AIDS); where a widely implementable new technology capable of large impacts (immunizations, family planning, tuberculosis drugs) is introduced at scale; or where some combination of these factors applies. Because the tactical level of implementation is less specified, the design of responses to these needs creates opportunities for health systems researchers to use creativity and scientific evidence in ways that are potentially less constrained by political requirements or rhetorical commitments than would be the case with larger, more prominent strategic policy initiatives. We therefore consider these tactical opportunities to be the most fruitful area of work for health systems researchers.
Small changes can have large cumulative impacts
Tactical improvements in healthcare are often built around newly available (or, in low and middle income countries, newly affordable) technical innovations in prevention, diagnosis and treatment. When these technical innovations are combined with carefully designed changes in the organization and delivery of care, and are well evaluated, successfully implemented and scaled up, they can become the basis for improved health systems, whether in high, low or middle income settings.
Whereas national policymakers, or policymakers covering large jurisdictions, need to have bold policies visible to those who elect them, smaller jurisdictions tend to focus on smaller, more operational choices; in other words, tactical approaches to care rather than large scale policies. For health decision makers in such a jurisdiction, it is important to have a pipeline of multiple innovations carefully focussed on their priorities, so that they can incrementally spread and scale up the best of the interventions that improve the delivery of care. As these small (and thus low risk) innovations accumulate, and as the successful changes are evaluated and distinguished from the failures, which are dropped, the scale up and widespread implementation of multiple small improvements in several aspects of the care for a particular group of patients is likely to accumulate in impact, making a large difference to the overall outcomes of care for a particular condition.
As primary care has developed and grown internationally over time, iterative improvements in systems for organising care, delegating functions, sharing care and referring patients for specialized treatment have led to improved coverage, quality and impact of care. This success is an argument for health services research to focus on more tactical questions, often questions of implementation, aimed at modestly ambitious incremental improvements to existing programs of care, using strategies that can be easily implemented in the existing health system without too much disruption.
This tactical approach to incremental health system improvement might be especially appropriate for the more constrained economies of the depressed first half of our current century, where economic concentration and weakened social solidarity lead to shrinking states and public budgets. In spite of this apparent association with economic recession and spending constraints, this ‘low’ road to healthcare improvement can tap into the new knowledge generated by health service and systems researchers in potentially advantageous ways.
New health technologies, treatments and innovative service delivery strategies are often developed in isolation from each other, from the context and from the health system, to address individual health problems. When implemented in this uncoordinated fashion, innovations may be less effective than expected, or have unintended negative consequences outside their target: on the care of other conditions, other groups, or the overall system of care. These consequences undermine scale up and, as a result, many isolated innovations fail to spread to other jurisdictions. This wastes resources: not only money and the scarce efforts of skilled staff, but also the even scarcer resources of social and political capital to promote change, attention and support from the public, senior managers and politicians, and front-line providers’ time.
When deeply integrated into existing health systems, iterative, tactical innovations can result in unexpected positive consequences. When we designed a new training system for nurses in primary care clinics in South Africa to improve their ability to diagnose tuberculosis (TB), we hoped only to improve the reliability with which tuberculosis would be diagnosed and referred for treatment. We also had a vague hope that this would demonstrate the nurses’ capacities as clinicians, and open up a larger and more effective role for them in publicly funded primary care in South Africa. Fifteen years later, over 20,000 nurses are making use of a wide range of newly acquired skills to diagnose and treat not only tuberculosis but a full range of minor acute and major chronic illnesses, including hypertension, asthma and AIDS, using evidence based guidelines and effective and efficient in-service, on-site training systems.
The combination of innovation and health services research, especially implementation research, can help existing health systems evolve to deal with changing health and demographic trends while improving health outcomes, promoting equity and containing expenditure increases. Innovation can also promote the simplification of care, and easier access to effective treatments or preventive interventions. In large part, the fall in child mortality throughout the developing world since the 1960s has been due to the delivery by alternative, non-physician providers of simple and highly effective treatments or preventive interventions such as immunizations. It is easy to forget that smallpox was a world scourge, eliminated by a simple new vaccination technology, the bifurcated needle, delivered through an equally simple but well organized effort to isolate cases and immunize protective perimeters of populations around them. Similar effects have arisen from the development of effective treatments for chronic diseases including hypertension and diabetes, tuberculosis and HIV/AIDS, where innovations in treatment have been delivered alongside refinements and simplifications of care systems, so that access, adherence and quality of care improve and combine to reduce morbidity and mortality from these conditions.
What is the role of health systems researchers in tactical innovation?
Innovation is a long and complex process. Health systems researchers can help ensure that as an innovation is developed, implemented, spread and scaled up to cover entire jurisdictions, it is evaluated carefully at each stage, and that the lessons learned from that evaluation inform either early abandonment, if the innovation is clearly not effective, or improvement, to ensure that a successful innovation can be implemented as easily as possible and achieve its maximum impact.
Health systems research offers many insights into the complicated process of innovation: helping to understand the problem that needs to be solved, to ensure that proposed innovations are acceptable to those who will be affected by their implementation, that the chosen innovation achieves its expected benefits, and that it does not create unexpected harms or costs, either within the part of the system in which it is implemented or in other parts through unexpected links. Researchers provide information and evidence which can assist decision makers in several ways:
- defining the priority problem to be addressed;
- designing and choosing among options for the innovation;
- developing and testing implementation and scale up strategies;
- determining the impact of real world implementation on health and healthcare delivery;
- recognizing areas for improvement in future iterations.
We will discuss the methods and timing of evaluation in incremental innovation. Evaluation is crucial, because it ensures that each of these innovations is indeed an improvement, and not simply added work and cost with no benefit, or, even worse, no benefit plus extra harms and/or costs. If an evaluation shows that the innovation is not effective, it is possible to stop the innovation before wide implementation or scale up, and to rethink, modifying the innovation based on flaws recognized in the evaluation process. This iterative cycle of innovation, implementation, evaluation, improvement, and so on to another round of innovation, is a key approach to making a difference with health services research.
With the complexity of the health system described above, it is best for health systems researchers to collaborate closely with other stakeholders to succeed in tactical healthcare improvement, either in an actual team (within an organisation like a ministry of health or a non-governmental organisation) or in a virtual team (where different stakeholders from different organizations are working together on an initiative). This collaboration goes through several steps (see Table 1 for a summary):
The stages of innovation: from problem to scale up
An innovation’s ability to progress through the stages in Table 1 below is contingent on several factors. It depends on the characteristics of the innovation itself and the interests of the key stakeholders, including:
- innovators (usually researchers) who are involved in developing the innovation;
- end users (the practice community and innovation users) from the health system unit;
- decision makers (government and non-government policy makers) who have policy jurisdiction within the health system unit.
It is also dependent on the broader context including
- the social and physical environment,
- the health system unit where the innovation will be integrated (i.e. organization, clinic, hospital, community, province etc.); and
- the regulatory, political and economic environment.
Not all innovations follow this linear trajectory; many need to loop back and some may appropriately skip certain stages.
It is important to identify barriers early in the innovation process and to accept that some innovations simply may not be able to overcome important barriers; there may then be a need to go back to earlier stages and re-design the innovation, or in some cases abandon the project altogether. Innovations are commonly rushed through stages, and some even skip essential stages altogether. They may be implemented or scaled up prematurely, without evaluations to verify that they are mature enough to advance.
Open and thoughtful (rather than rhetorical) discussion is needed between multiple stakeholders, including health innovators, decision makers and end users, on potential barriers to scale up as they come into view, allowing for innovations to be sequentially adapted before meeting these problems in the “real world” setting. Collective problem solving among stakeholders is an essential element of deliberation, which “allows individuals with different backgrounds, interests and values to listen, understand, potentially persuade and ultimately come to more reasoned, informed and public-spirited decisions” (Abelson, 2003: 241; Arendt, 1958).
It is helpful to be constantly aware of what stage the innovation is at, and to identify which barriers have to be overcome in order to move forward in the process of implementation and scale up. Awareness from the beginning of the whole process, through to the end stages, increases the ability to pre-empt barriers and the likelihood of achieving successful scale up and spread of an innovation. This approach to staging of innovation may be most usefully applied to discrete innovations and to multicomponent interventions, rather than to paradigmatic innovations (Edwards, 2014). Paradigmatic innovations are often attempted as solutions to difficult strategic problems which, as discussed above, may be easier to solve in a piece-by-piece fashion.
Discrete innovations are well defined, such as the scale up of zinc treatment in early childhood (Larson et al., 2012), scale up of ART (Harries et al., 2009) or the use of new technology for diagnosis and treatment of TB (Meyer-Rath et al., 2012). Multicomponent interventions involve several interacting program elements to produce a composite set of innovations targeted at multiple system levels; examples include multilevel initiatives to decrease childhood obesity (de Silva-Sanigorski et al., 2010) and the scale up of post-abortion care services in two countries (Billings et al., 2014). Paradigmatic innovations require a shift in the way we understand health problems and the potential solutions to address them. An example is China’s quality of care reforms to bring its family planning programs in line with the international agenda, which required a system-wide approach and partnerships between international groups and all levels of government in China, including those outside public health (Kaufman et al., 2006).
Table 1: Scaling up innovations
|Stage||Description|
|Problem||Identify priority problems that are susceptible to tactical solutions.|
|Solution||Develop one or a few potential solutions to the point where they can be tested in a small, real-world pilot, or identify plausible solutions developed elsewhere and adapt them to local conditions.|
|Pilot test||If the innovation tests well, prepare for larger implementation; if it needs improvement, adapt and pilot test again; if it seems not to be improvable, abandon it.|
|Implement||In settings similar to the pilot, but at larger scale and under real-world conditions. Evaluate and decide on spread to different problems and settings, and/or on scale up to jurisdiction(s).|
|Spread and/or scale up||Based on evaluation of the implementation stage, adapt the innovation and supporting systems to allow massive growth, and test whether it can be adapted to solve different problems, or the same problem in different settings. Evaluate jurisdiction-wide scale up, especially whether effectiveness has been maintained, with rigorous, often randomized, longer-term evaluations of results and of implications for other parts of the health system.|
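The loop-back logic described above can be sketched as a simple stage-gate loop. This is purely illustrative: the stage names paraphrase table 1, and the rule that an innovation gets one round of adaptation before being abandoned is an assumption for the sketch, not something the text prescribes.

```python
# Illustrative stage-gate sketch of table 1 (an assumption-laden toy, not a
# prescribed algorithm): an innovation advances, loops back for redesign,
# or is abandoned.

STAGES = ["problem", "solution", "pilot", "implement", "scale_up"]

def advance(outcomes):
    """Walk an innovation through the stages.

    `outcomes` maps stage name -> "pass", "adapt" or "abandon" (default
    "pass"). An "adapt" loops back to the solution stage for redesign;
    if a stage still does not pass after one adaptation, the innovation
    is abandoned. Returns (stages visited, final status).
    """
    visited, i, adapted = [], 0, False
    while i < len(STAGES):
        stage = STAGES[i]
        visited.append(stage)
        outcome = outcomes.get(stage, "pass")
        if outcome == "abandon" or (outcome == "adapt" and adapted):
            return visited, "abandoned"
        if outcome == "adapt":
            adapted = True
            i = STAGES.index("solution")  # loop back and redesign
            continue
        i += 1
    return visited, "scaled up"
```

Even this toy version makes the chapter’s point visible: the trajectory is not linear, and knowing when to loop back or stop is part of the design.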
Step 1) Identify the problem to be solved
If a problem is widely discussed, its characteristics understood and its magnitude well measured, and its priority agreed upon by the full range of stakeholders (including those with the health problem, their communities, the professions and organizations providing care, and health care funders and decision makers), it is likely that health systems research skills can help to address it. It is easier to tackle if the problem has a high public profile and a solution is required by new laws (or at least not prevented by any), or is enabled by a newly available technology or change in healthcare delivery.
Difficult problems have more multifactorial origins; are deeply rooted in cultural, social or economic stresses; and involve more polarization, stigmatization or conflicting interest groups. Perhaps a chain of simple innovations can help, building up over time, with each small step addressing one small part and achieving gradually widening support. Often some of these stages require new laws, new financial commitments and complicated political support. These difficult, strategic problems are often the most important problems in health systems, the result of inequitable social situations, but taken as a whole they are hard to solve in one step. We suggest that you work on a mixture of simpler and more complex problems, preferably related to each other, so that the learning you achieve from one helps you to understand, and possibly help with, the others.
Step 2) Find or develop a solution
New innovation development should only proceed if it is clear that there are no existing solutions to the problem. Do a literature search to check whether this problem has been addressed elsewhere, and how. Ask your networks whether they know of existing solutions. If not, start thinking about the innovations needed to solve this problem, using an approach called user-centred design. Consider whether the innovation should be aimed at clinicians, managers, a team, multiple units or facilities, jurisdictions or end users, and which ones. Does it focus on individual awareness, knowledge, motivation, attitudes, engagement, skills, behaviour or work processes? Is it a drug, a technology or a process change? Diagram how you think the innovation will work. Gather feedback from end users and managers to test your assumptions about what is needed and to direct the design. Develop one or a few local innovations: keep them simple, adaptable by end users and compatible with the existing culture, health system and workflow.
Rapidly test alternative ideas with enthusiastic users, starting with simple pictures of the solution and moving to physical mock-ups and/or role-plays of the innovation in use. Go through several cycles of prototyping, feedback and adaptation until ready for pilot testing.
Step 3) Pilot test
Test a real version of the innovation with a few local enthusiasts, who are ordinary end users (patients, communities, providers) in a real setting. Evaluate convincingly, using transparent qualitative and quantitative measures: is it acceptable to all stakeholders, does it work, is it simple, does it integrate into the system easily, is it better than the alternatives, at what cost? If not, abandon, or improve the innovation. After improvements, test again. When there are no major uncertainties, get ready for implementation.
Step 4) Implement
In settings similar to the pilot, but at larger scale, under real-world conditions and with comparative effectiveness evaluation built in. Consider contracting out the implementation, and consider recruiting an independent evaluation team. In any case, build an implementation and evaluation team with buy-in from end users (patients, providers and communities), local respected champions, decision makers from several levels and strong administrative support, as well as advisers with knowledge translation (KT) expertise. Ensure shared implementation decision making between stakeholders, regular communication of progress and an agreed performance measurement framework based on the logic model from previous stages.
Evaluation should be pragmatic, realist and participatory. Effectiveness should be measured both in processes (how healthcare delivery has changed) and in outcomes (how health, or other outcomes relevant to end users, has changed). Designs should include rigorous, preferably randomized trials, with mixed-methods evaluation (trial, qualitative and economic) covering satisfaction, user experience, uptake, quality, effectiveness and economic measures and observations. Look for unintended consequences and system impacts, especially the opportunity costs of implementation, such as internal diversion of resources and performance decline in other areas of function of the delivery organizations involved. Report on social, cultural, geographic and health system effects on the innovation; consider regulatory, legal and financial barriers and potential solutions. Report on external validity/generalizability as well as effectiveness, benefits and harms in different subgroups, and recommend whether the innovation is ready for spread (to different settings, problems or user groups) and/or for scale up (expansion of the innovation to other, similar settings dealing with the same or a similar problem). If not, make explicit whether adaptation is possible and, if so, along what lines, or whether a new direction is preferred.
Step 5) Spread and/or scale up
Assuming the evaluation from the implementation stage is positive and recommends scale up and/or spread, adapt the innovation as suggested and choose new problems or settings, if spread is to be addressed first; or identify a scale up path (similar settings, same problem, minimal adaptation) if the decision is that the innovation can scale but not spread to different settings or problems. Adaptation is based on rethinking the logic model, to see which elements can and need to be changed to match the different situation or problem. Consider the core and adaptable elements, and how to adapt the latter for the different settings or problems while maintaining sufficient fidelity to the original, successfully implemented innovation for it to remain effective. Scale up may require changes in the physical, health system or legal/regulatory/financial context in which the intervention is to be implemented; possible changes include the delivery mechanisms, capacity development and funding, any of which may need to be further developed to take a successful innovation to similar settings on a jurisdiction-wide or multi-jurisdiction scale.
The scale up or spread stage needs to be evaluated as thoroughly and rigorously as the implementation stage itself. This is because, inevitably, the initial implementation, like the pilot much earlier, is led by those most committed to the innovation, implemented in the site most likely to succeed, and reviewed through the most optimistic lens by decision makers whose reputations are built on announcements of successful pilot projects being widely implemented. It is therefore all the more important that the long-term commitments on a massive scale that accompany a decision to scale up, with or without spread beyond the area and problem initially targeted, are based on a rigorous, objective and possibly independent evaluation of whether the expected gains are actually forthcoming. A reliable evaluation of the initial efforts at spread and scale up provides the ability to correct course, maximising the positive and minimising unexpected negative consequences, and gives all stakeholders reassurance, before the innovation is set into the system irreversibly for the foreseeable future, that it deserves to be scaled up and spread. This evaluation at scale must also consider the implications of the spread/scale up efforts for other innovations and other parts of the health system.
Health systems are complicated, and improving them in ways that achieve wide and positive impact depends on a careful understanding of the particular problem you want to solve. This may mean breaking bigger problems down into manageable pieces and developing innovations for each one, rather than trying to solve deep problems all at once. A new idea will not necessarily work and, even if it does so at a small scale, innovation is not self-implementing. Each innovation needs to be tested, and only if it is successful should it pass to the next stage. A large part of successful innovation is knowing when something has failed, and not trying to spread it. If an innovation appears successful as a prototype in pilot studies, it should be tested at a larger scale, using rigorous evaluation tools; with this information, if positive, it is worth adapting the innovation to try to spread it as a solution to other problems, or to the same problem in other settings, and also to scale it up widely across jurisdictions. Even at this stage it remains important to evaluate, to see whether the earlier successes are maintained at scale.
Abelson, J., Forest, P.-G., Eyles, J., Smith, P., Martin, E., and Gauvin, F.-P. (2003). Deliberations About Deliberative Methods: Issues in the Design and Evaluation of Public Consultation Processes, Social Science and Medicine, 57: 239–251.
Arendt, H. (1958). The human condition. Chicago: University of Chicago Press.
Billings DL, Crane BB, Benson J, Solo J, Fetters T. Scaling-up a public health innovation: a comparative study of post-abortion care in Bolivia and Mexico. Social Science and Medicine 64(11):2210-2222.
Edwards, Nancy (2010). Scaling-up Health Innovations and Interventions in Public Health: A Brief Review of the Current State-of-the-Science. Paper commissioned by the conference chairs for delegates of the inaugural Conference to Advance the State of the Science and Practice on Scale-up and Spread of Effective Health Programs, Washington, DC, July 6-8. www.ihi.org/education/Documents/ProgramMaterials/ScaleUpBlog/7a_Commissioned_Paper%202_Public_Health.doc
Harries AD, Zachariah R, Jahn A, Schouten EJ, Kamoto K. (2009). Scaling up antiretroviral therapy in Malawi-implications for managing other chronic diseases in resource-limited countries. J Acquired Immune Deficiency Syndrome 52 Supplement 1:S14-16.
Kaufman J, Erli Z, Zhenming X. (2006). Quality of care in China: scaling up a pilot project into a national reform program. Studies in Family Planning 37(1):17-28.
Larson CP, Koehlmoos TP, Sack DA (2012). Scaling Up of Zinc for Young Children (SUZY) Project Team. Scaling up zinc treatment of childhood diarrhoea in Bangladesh: theoretical and practical considerations guiding the SUZY Project. Health Policy and Planning 27(2):102-114.
Meyer-Rath G, Schnippel K, Long L, MacLeod W, Sanne I, et al. (2012) The Impact and Cost of Scaling up GeneXpert MTB/RIF in South Africa. PLoS ONE 7(5):1-11. www.plosone.org/article/fetchObject.action?uri=info:doi/10.1371/journal.pone.0036966&representation=PDF
National Institutes of Health (NIH). Clinical and Translational Science. Available at: http://www.ncats.nih.gov/research/cts/cts.html. Accessed August 13, 2014.
Oliver, Kathryn; Lorenc, Theo; Innvaer, Simon. (2014). New directions in evidence-based policy research: a critical analysis of the literature, Health Research Policy and Systems 12(34):1-11. http://www.health-policy-systems.com/content/12/1/34
de Silva-Sanigorski AM, Bolton K, Haby M, Kremer P, Gibbs L, Waters E, et al. (2010). Scaling up community-based obesity prevention in Australia: background and evaluation design of the Health Promoting Communities: Being Active Eating Well initiative. BMC Public Health 12(10):1-7. http://www.biomedcentral.com/content/pdf/1471-2458-10-65.pdf
In chapter one it was argued that implementation research should be an ongoing activity, tracking the progress of what will typically be a complex health systems intervention and attempting to build an understanding of both what is happening as the implementation evolves and, an even more difficult task, why it is happening. Interpretation of data clearly involves a reasonable degree of intelligence and an ability to think rationally about the interplay between intervention activities and the context within which they are played out. However, experience can be an equally important guide, both your own and that of the multitude of researchers and others who have gone through similar processes before you. Being able to identify, assess, assimilate and use relevant existing evidence that may provide valuable insights is one of the key attributes of a capable implementation researcher. In this chapter the focus is on the first two activities – locating relevant evidence and assessing its quality.
We can distinguish between two phases of evidence review. Initially, we will need to draw on the existing literature in the design of our research. It will help us both to refine our research questions and to develop the appropriate methodologies for data collection. A selective review of the recent literature will also be essential if, as advocated in chapter one, we seek independent funding for our research. Those offering funding will be expecting us to provide findings which will complement the existing body of knowledge on a given topic. They will need to be convinced that we are very familiar with that knowledge and that our research is targeting areas where evidence is currently lacking. The first part of this chapter describes the basic review process from this perspective.
In the second part we consider what can be seen as a natural extension of this initial phase: the undertaking of a ‘systematic review’. The term is usually dated back to a book by Cochrane (1972), which argued that, with limited resources available in the health sector, clinical judgements should be based on all the available evidence on treatments obtained from rigorously designed evaluations. While that book, and the continuing work of the Cochrane Collaboration in this area, strongly emphasised one particular approach to evaluation – the Randomised Controlled Trial (RCT) – many authors have suggested that, particularly when considering innovations not directly concerned with clinical trials, the range of material considered should be substantially expanded. Two key features of the methodology should nevertheless be retained: the aim of systematically compiling all the relevant literature; and the rigorous quality assessment of each item before incorporating its findings, to the extent warranted by that assessment, into a final overall synthesis. Our suggestion here is that if researchers are going to have a long-term involvement with the implementation of a given intervention, it would be advantageous to allocate some of their time to following the systematic review process, based on what they regard as appropriate selection and assessment criteria, in order to refine their interpretation of the data they are compiling by building on the experience of researchers who have addressed similar issues.
Part 1: Rapid literature reviews
1. What is a literature review?
A literature review should include a selective analysis of existing research relevant to what you have been asked for in the application, showing how it relates to your proposed research. It explains and justifies how your investigation may help answer some of the questions, or fill some of the gaps, in this area of research, and it promotes your application as a necessary area of study. A literature review is not a summary of everything available on a specific topic, nor a chronological description of what has been discovered about a particular area. It is important to be concise, clear and selective, especially when writing a review for a funding application; bear in mind that the people reading the application may not be experts in the issue, so avoid acronyms and very specialised language.
If you are seeking funding, first check the donor’s criteria for support and show how your project fits them. Such is the competition for funds that there is no point in submitting a project, however worthy, if it does not clearly meet donor priorities. There are different types of funding application, and the amount of evidence you will need from your literature search will depend on what is being asked for, so read this carefully before going any further with your search. One common way to structure a literature review is to start by outlining the context and then become more specific, as suggested in figure 1. First, explain the broad issues related to your research proposal; this should not be too long, just enough to set the context. Next, focus on studies in your particular area of research, followed by those directly relevant to that research, and particularly those that identify gaps in the literature.
2. Search strategies
2.1 Identify a research question
Start with a carefully thought out research question which matches what the funder is asking for. A literature search should be focused, and to search efficiently you must be clear from the start what types of evidence will be relevant to addressing that question. There are many guides that can help with this (see Aveyard 2007, Chapter 3). A systematic approach to searching the literature is key: following a structure will allow you to identify the key broad texts and to find the specific studies most relevant to your work. It may help to break the literature search into key themes with different sets of keywords, as suggested in the diagram above, to help organise your search. Make sure you record how you have approached the search; if you have been short of time and had to adapt some of these processes for speed, that is fine, provided you record it.
2.2 Identify keywords
The keywords you choose are central to shaping your search. You will know some of the appropriate words but may need to use a snowball approach, adding keywords as you access the literature and increase your knowledge of the terminology being used. If you are new to the topic, do a very brief general search to help identify your keywords. You should be as creative as possible at this stage, as the keywords will form the basis of your search and restrict what you find. Bear in mind that words can carry different meanings, and that different spellings and terminologies may be used in different countries. Note that keywords need not only relate to terms in your research questions: if your searches identify authors or agencies who have regularly published in the area, you can also search using their names.
Example: Attitudes towards medical abortion in India:
- Overall search (broad, context-setting)
- Keywords: abortion, India, attitudes
- Theme 1: Medical abortion in the South Asian context (relevant studies)
- Keywords: medical abortion, Asia
- Theme 2: Personal characteristics affecting attitudes towards abortion (relevant studies)
- Keywords: education, socioeconomic, parity, abortion, personal characteristics (then add words in a snowball approach as you read through studies and find out what works)
- Theme 3: (specific): Attitudes towards abortion in India
- Keywords: Identify keywords based on the information you have found from the other searches about what terminology is used.
Take some time to get to know the search engines and how they work; for example exploring the use of AND/OR/NOT and * commands can be very useful when conducting your search and can save you time:
- AND ensures you search for two or more specified terms;
- OR looks for any one of them;
- NOT excludes articles with specific terms; and
- * Allows any ending to be searched for, e.g. anthropo* will bring up anthropological, anthropology, anthropologist, etc.
For example, table 1 shows various ways of refining a search on the links between hand washing by staff and hospital acquired infections.
Table 1: Search on links between hand washing and hospital acquired infections
|AND||hospital acquired infection AND hand washing||Retrieves citations with BOTH terms present|
|OR||hospital acquired infection OR cross infection OR nosocomial infection||Retrieves citations with ANY of these terms|
|AND, OR||(cross infection OR nosocomial infection OR hospital acquired infection) AND hand washing||Search sets may be combined. This search locates citations with the word hand washing AND (ANY one of the terms combined with OR)|
|NOT||hand washing NOT masks||Retrieves citations with the term hand washing, but omits records with the term masks. (Caution: the NOT operator should be used sparingly and carefully as it may omit citations relevant to a search. For example, an article about hand washing that includes the word masks might be relevant to a search on hand washing.)|
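The operator combinations in table 1 can be built up mechanically from lists of synonyms. A minimal sketch, using the generic AND/OR form shown above (individual databases vary in their exact syntax, so treat the output as a starting point, not a finished query):

```python
def or_group(terms):
    """Combine synonyms with OR, parenthesised so the group can be
    safely combined with other terms using AND."""
    return "(" + " OR ".join(terms) + ")" if len(terms) > 1 else terms[0]

def and_query(*groups):
    """Combine term groups with AND: every group must be present.
    A group may be a single term (string) or a list of synonyms."""
    return " AND ".join(
        or_group(g) if isinstance(g, (list, tuple)) else g for g in groups
    )

# Rebuilding the combined example from table 1:
infection = ["cross infection", "nosocomial infection",
             "hospital acquired infection"]
query = and_query(infection, "hand washing")
# -> "(cross infection OR nosocomial infection OR hospital acquired infection) AND hand washing"
```

Keeping synonym lists in one place like this also makes it easy to record exactly which keywords each search used.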
2.3 Identify types of literature to include
Next, decide which types of literature you will include. This will help you to narrow down your search and also decide where you might best search for information. For example, you may want to include newspaper reports if you are looking at public opinion, or definitely exclude them if you are looking for an academic evidence base. Examples of types of literature to include may be:
- Peer-reviewed and academic journals using relevant search engines, e.g. PubMed, Medline, Ovid, Google Scholar.
- Working papers published by established research and consultancy agencies.
- International and national policy documents.
- Websites of international organisations, private companies and NGOs, and grey literature - newspapers, magazines, blogs, etc., often identified using Google or other general purpose search engine.
There are many databases and search engines and you should learn which are most relevant to your topic. Table 2 below provides some examples. You need to spend time thinking about the advantages and disadvantages of using different sources. Academic articles and books should have been peer-reviewed, which provides at least some guarantee of quality. However, there are often considerable delays between preparation and publication, so they may not provide the most recent data. Reports produced by an international agency may reflect the specific objectives of that agency or be influenced by political considerations – for example not wishing to provoke a country that is contributing to its budget. This may be an even more important consideration for material produced by private companies and NGOs. Grey literature typically will not have been through a process of peer review and may well be seen by some as biased, subjective and anecdotal – especially if it challenges their own views. However, it can often provide insights or at least suggest alternative interpretations of data or events that are not available elsewhere. Careful consideration of such issues will be a useful starting point to determine your inclusion/exclusion criteria.
2.4 Inclusion/exclusion criteria
When setting up searches you can usually specify inclusion and exclusion criteria so that you do not have to look through material you will not use. Taking time to set appropriate criteria will save time in the long run, though it may also be useful to do a quick general search using Google to ensure that these restrictions have not caused you to miss anything important.
Example: Selection criteria relevant to a health systems intervention in Ghana
Languages: English, French
Publications: Journals, books, dissertations, reports of specified agencies.
Regions: West Africa
2.5 Where to search
Spend a little time researching the most appropriate databases for your research topic. Depending on the time available, once you have used one database, try another and see whether the same information comes up. If it does, you can be more confident that your strategy is well focused and that you are finding the relevant literature. If you only have time to use one search engine, use Google or Google Scholar (depending on your inclusion criteria), as these search most widely. If you use these search engines you may need to limit the literature you search through, for example by only reading the first ten pages of results.
Table 2: Examples of subject-relevant databases
- Allied and Alternative Medicine (AMED)
- Applied Social Sciences Index and Abstracts (ASSIA)
- British Nursing Index
- Cochrane Library
- Cumulative Index to Nursing and Allied Health Literature
- Google Scholar
- Health and Education Advice and Resource Team (HEART)
- Science Direct
- Social Care Online
- System for Information on Grey Literature in Europe
- World Bank Independent Evaluation Group
- UN Evaluation Group
- 3ie Systematic Review Database
- 3ie Database of Impact Evaluations
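The cross-database check described in section 2.5 – running the same search in a second database and seeing whether the same material comes up – can be made concrete by comparing record identifiers such as DOIs or normalised titles. A minimal sketch; the identifiers and database names below are invented purely for illustration:

```python
def overlap(results_a, results_b):
    """Proportion of records found by both searches, relative to the
    smaller result set (1.0 means one set is contained in the other)."""
    a, b = set(results_a), set(results_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

# Invented DOIs standing in for two databases' result lists:
database_1 = {"10.1000/a", "10.1000/b", "10.1000/c"}
database_2 = {"10.1000/b", "10.1000/c", "10.1000/d"}
# overlap(database_1, database_2) -> 2/3: reasonable agreement, but each
# source also surfaces records the other misses.
```

A high overlap suggests the strategy is well focused; a low one suggests either that the keywords need work or that the databases index quite different literatures.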
First, search for the keywords you have selected, and their synonyms, in the databases you have chosen. If you are using the approach suggested earlier, you will undertake a context-setting search, one or two more specific searches relevant to your research, and one very specific search. As you learn more about the topic, widen the search by adding words used often in the research (for example, look at the keywords in the journal articles you retrieve). You may also want to search for more papers from key authors and journals you find, making use of ‘related articles’ features and the bibliographies of relevant research. It is useful to record your search in a table such as that shown below. This will help you assess the extent to which you can feel confident that you have compiled the most important material, and provide others with evidence of the methods you have adopted. Note that some databases allow you to maintain a record of past searches.
Table 3: Search on links between availability of hospital performance data and utilization
|Database||Date||Keywords||Total Number of Hits||Studies Chosen as Relevant|
|Science Direct||02/09/2014||DFID Tier 1 country (list with separator OR) AND Hospitals AND (HMIS OR HIS OR Terms related to performance) AND Utilization||103||19|
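A search record like table 3 is also easy to keep programmatically, which makes it simple to share as evidence of your methods. A minimal sketch: the field names mirror the table’s columns, and the example row is abridged from table 3; nothing here is prescribed.

```python
import csv
import io

FIELDS = ["database", "date", "keywords", "hits", "relevant"]

def log_search(log, database, date, keywords, hits, relevant):
    """Append one search-record row (a dict) to the log (a list)."""
    log.append(dict(zip(FIELDS, [database, date, keywords, hits, relevant])))

def to_csv(log):
    """Render the log as CSV so it can be attached to a proposal or
    shared with collaborators."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(log)
    return buf.getvalue()

log = []
log_search(log, "Science Direct", "02/09/2014",
           "(HMIS OR HIS) AND Hospitals AND Utilization", 103, 19)
```

The same list of dicts can later be extended with the per-study details of table 4 if a fuller record is wanted.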
3. Quality-assessment of studies
The next step is to select, from the items identified in the search, those that you will use, given that there is not time for a systematic review of all the evidence. This part of a literature search is key, as it will ensure you spend your time effectively and read in detail only the research that you may potentially include. There are many ways of doing this, but one way of quickly assessing studies, and ensuring you select the most appropriate, is to use an assessment tool that takes into account a range of factors. The aim of this procedure is to indicate which studies should be seen as contributing most significantly and robustly to understanding the topic; it will also mean the evidence you present is responsibly and judiciously selected. Note that funding agencies place considerable emphasis on the need for robust evidence to inform policy and programming; including suspect or outdated materials will not be helpful if you are seeking their support.
Quality assessment can be problematic. Katrak et al. (2004) identified 121 different critical appraisal tools. They concluded that there is no ‘gold standard’ for appraising studies, as there is little information on the development and validity of these tools and only a few have been seriously evaluated. One interesting example is an approach adapted from a report prepared by UK DFID, the ‘How to Note on Assessing the Strength of Evidence’ (DFID 2014). It suggests a two-part evidence assessment (single-study and evidence-body assessment); here we focus on the first part. Depending on the time available, you could simply apply the general thinking behind this approach without formally writing down the assessments. The procedure outlined below involves reading the abstract and methodology of each study as a basis for including or excluding it. More detail on the methods can be found in Chapter 2 of Aveyard (2007). Many search engines allow you to copy citations into a document as you proceed, so that by the end of this process you have your selected literature. If you have more time and want to include more detail, a table such as that shown below can help you remember key aspects of each study and is a way to organise your results.
Table 4: Findings of a critical assessment process
|Author/Date||Related Theme||Aim of Paper||Type of Information||Main Findings||Strengths and Limitations|
3.1 Assessment of evidence strength
For each individual study, we can consider the research type, research design and methodology to arrive at a quality assessment. Such a procedure can either be used as a rough guide as you select material, or undertaken more formally, with the selection criteria described using descriptive keys. For example, an assessment of (P&E; EXP; H) might mean that a study is primary and empirical, experimental, and of high quality. Table 5 provides one approach to classifying studies by type, table 6 lists questions allowing assessment of various quality dimensions, and table 7 gives an aggregation index based on these dimensions.
Table 5: Classification of research studies by type
|Research Type||Research Design|
|Primary and Empirical (P&E)||Experimental (EXP)|
|||Non-Experimental (NEX)|
|Secondary (S)||Systematic Review (SR)|
|||Non-Systematic Review (NSR)|
|Theoretical or Conceptual (TC)||N/A|
Source: DFID 2014:9
Table 6: Principles for assessing the quality of individual studies
|Principles of Quality||Associated Questions|
|Conceptual Framing||Does the study acknowledge existing research?|
|Does the study construct a conceptual framework?|
|Does the study pose a research question or outline a hypothesis?|
|Openness and transparency||Does the study present or link to the raw data it analyses?|
|What is the geography/context in which the study was conducted?|
|Does the study declare sources of support/funding?|
|Appropriateness and rigour||Does the study identify a research design?|
|Does the study identify a research method?|
|Does the study demonstrate why the chosen design and method are well suited to the research question?|
|Cultural sensitivity||Does the study explicitly consider any context‐specific cultural factors that may bias the analysis/findings?|
|Validity||To what extent does the study demonstrate measurement validity?|
|To what extent is the study internally valid?|
|To what extent is the study externally valid?|
|To what extent is the study ecologically valid?|
|Reliability||To what extent are the measures used in the study stable?|
|To what extent are the measures used in the study internally reliable?|
|To what extent are the findings likely to be sensitive/changeable depending on the analytical technique used?|
|Cogency||Does the author ‘signpost’ the reader throughout?|
|To what extent does the author consider the study’s limitations and/or alternative interpretations of the analysis?|
|Are the conclusions clearly based on the study’s results?|
Source: DFID 2014:14
Table 7: Study quality category definitions
|High (H)||Demonstrates adherence to principles of appropriateness/rigour, validity and reliability; likely to demonstrate principles of conceptual framing, openness/transparency and cogency.|
|Moderate (M)||Some deficiencies in appropriateness/rigour, validity and/or reliability, or difficulty determining these; may or may not demonstrate principles of openness/transparency and cogency.|
|Low (L)||Major and/or numerous deficiencies in appropriateness/rigour, validity and reliability; may or may not demonstrate openness/transparency and cogency.|
Source: DFID 2014a:15
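The three-part code described in section 3.1 can be recorded in a simple structured form as studies are assessed. The sketch below is purely illustrative: the function name and validation rule are our own, and only the type, design and quality keys drawn from Tables 5–7 and the (P&E; EXP; H) example above are taken from the text.

```python
# Illustrative sketch only: compose a single-study assessment code such as
# "(P&E; EXP; H)" from the type, design and quality keys of Tables 5-7.
RESEARCH_TYPES = {"P&E", "S", "TC"}              # Table 5 research types
DESIGNS = {"EXP", "NEX", "SR", "NSR", "N/A"}     # Table 5 research designs
QUALITY = {"H", "M", "L"}                        # Table 7 quality categories

def assessment_code(rtype, design, quality):
    """Return a composite code like '(P&E; EXP; H)' after validating each part."""
    if rtype not in RESEARCH_TYPES or design not in DESIGNS or quality not in QUALITY:
        raise ValueError("unknown assessment component")
    return f"({rtype}; {design}; {quality})"

print(assessment_code("P&E", "EXP", "H"))
```

Recording assessments in this uniform shape makes it easy to sort or filter the selected literature later, for example to retain only high-quality experimental studies.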
4. How to synthesise your findings
The next stage is to summarise the findings of the literature search, providing readers with details of your review methodology and findings. If you have broken your search into the three areas suggested in section 1, and used a table as suggested in section 2 to note key findings as you searched, this process should be fairly simple: you will have three tables summarising the key findings for the different sections of your search. The inverted triangle diagram could be used to structure your review. There are different approaches to this, depending partly on what you have been asked to do. You could include several paragraphs on how you conducted your search and use the inverted triangle diagram to summarise the results of the research. The aim is to interpret the results and consider the differences and similarities between papers, rather than simply summarise them. This gives new meaning to the results and identifies gaps in the literature, which should be outlined to show how your research will add to the existing literature and why it is important to study this area.
If more detail is needed, a meta-ethnographic approach to synthesising information could be used. Developed by Noblit and Hare in 1988, this approach involves determining keywords, phrases, metaphors and ideas that occur in some or all of the studies and interpreting these in the light of those identified in other studies. The aim is to determine the relationship between the studies so that consistencies and differences are identified. If further time is available for research, or if the funder asks how the review could be expanded, a meta-summary could be conducted: codes are assigned to points discussed in each research paper, and further sub-themes can be developed under each section (more detail can be found in Chapter 6 of Aveyard 2007).
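A meta-summary of the kind just described amounts to tallying how often each assigned code appears across the reviewed studies. The sketch below is purely illustrative: the study names and thematic codes are invented for the example.

```python
# Illustrative sketch only: tally thematic codes assigned to each paper
# during a meta-summary. Study names and codes are invented examples.
from collections import Counter

study_codes = {
    "Study A": ["access barriers", "user fees", "trust in providers"],
    "Study B": ["user fees", "distance to facility"],
    "Study C": ["trust in providers", "user fees"],
}

# Count how many studies mention each code.
frequency = Counter(code for codes in study_codes.values() for code in codes)
for code, n in frequency.most_common():
    print(f"{code}: mentioned in {n} studies")
```

Codes with high frequencies suggest themes around which the synthesis can be organised; codes appearing in only one study may point to gaps in the literature.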
Finally, note that a narrative review such as this can lead to misleading conclusions and should be seen as a preliminary step towards the type of systematic literature review discussed in the second part of this chapter. It can be useful to clarify this at the end of your method statement, and to avoid interpreting your findings too widely or making assertions that are not justified by the amount of time you have spent researching the issue. Do not be tempted to bend the data to show the gaps you would like them to show, to improve your argument or to align your review with stakeholder or donor perspectives, as this will cause problems later.
Aveyard, H. (2007) Doing a Literature Review in Health and Social Care: A Practical Guide, New York: Open University Press, www.chc.brookes.ac.uk/dr-helen-aveyard (accessed 12 March 2015)
DFID (2014). How to Note on Assessing the Strength of Evidence. www.gov.uk/government/uploads/system/uploads/attachment_data/file/291982/HTN-strength-evidence-march2014.pdf (accessed 12 March 2015)
DFID (2014a). Secure property rights and development: Economic growth and household welfare. Property rights evidence paper (accessed 12 March 2015).
Katrak, P.; Bialocerkowski, A.E.; Massy-Westropp, N.; Kumar, S. and Grimmer, K.A. (2004) A Systematic Review of the Content of Critical Appraisal Tools, BMC Medical Research Methodology 4.22, www.biomedcentral.com/1471-2288/4/22 (accessed 12 March 2015)
Noblit, G.W. and Hare, R.D. (1988) Meta-Ethnography, Synthesising Qualitative Studies, London: Sage Publications
Reid, M.; Taylor, A.; Turner, J. and Shahabudin, K. (no date) ‘Starting a Literature Review’, University of Reading, www.reading.ac.uk/web/FILES/sta/A5_Literature_Reviews_1_Starting.pdf (accessed 12 March 2015)
The British Library for Development Studies provides a document delivery service, which can be useful if you do not have access to a particular article, http://blds.ids.ac.uk/index.cfm?objectid=D3FBAB71-4D85-11E0-A71C00016C1BDD3E (accessed 12 March 2015)
Part 2: Systematic Review
There has been an explosion in medical, nursing and allied health care professional publishing over the last 50 years. There are perhaps 20,000 journals publishing as many as 2 million articles per year, and the numbers keep expanding, making it impossible to keep up with primary research. Even in a single area, the number of published studies can run to hundreds if not thousands; some give unclear, confusing or contradictory results, and may use research methods that are not compatible with those in other studies. By themselves, single articles will not be very helpful. There has also been a huge expansion of internet access to articles, and a corresponding challenge to build the skills needed to use electronic media that give access to large amounts of information. In addition, health care professionals have a wide range of information needs (Hemingway and Brereton 2009), requiring good-quality information on the effectiveness, relevance, feasibility and appropriateness of a large number of health care interventions.
Traditional reviews of the literature lacked rigour because of self-selection of research studies and subjective interpretation of the evidence, so their recommendations were biased. There was a turnaround in opinion in the 1980s and 1990s, when it was shown that traditional approaches had largely failed to extract useful and unbiased information. What was needed was the same rigour in secondary research (research where the objects of study are other research studies) as is expected in primary research (i.e. original studies). Systematic Reviews (SRs) are an evidence translation mechanism, but must be done in a highly rigorous, transparent and independent manner with full information made available to the reader. A peer-reviewed protocol is involved, with reviewers starting the process with an open mind.
In the health professions, SRs fulfil a number of roles: i) to establish the clinical and cost effectiveness of an intervention, and whether an activity or intervention is feasible; ii) to propose a future research agenda when the way forward is unclear or existing agendas have failed to address a clinical problem; iii) to support authors who wish to secure substantial grant funding for primary health care research; iv) to form part of student dissertations or postgraduate theses; and v) to be central to the National Institute for Health and Clinical Excellence health technology assessment process for multiple and single technology appraisals. SRs are most needed wherever there is a substantive question, several primary studies (perhaps with disparate findings) and substantial uncertainty.
2. Substance of a Systematic Review
An SR is a summary of existing research on a particular topic or research question. Although in essence a literature review, it aims to apply the same principles and rigour that are expected of primary research, using generally accepted approaches and methods. This means that readers can be confident that common, well-accepted methods have been used and that comparisons can legitimately be drawn between SRs. SRs sit within three types of research: i) primary research studies that observe a phenomenon at first hand, collecting, presenting and analysing raw data; ii) secondary review studies that interrogate primary research studies, summarising and analysing their data and findings (this is the SR); and iii) theoretical or conceptual studies, either primary or secondary, that focus mainly on the construction of new theories rather than generating or synthesising empirical data (DFID 2014).
The method involves interrogating multiple databases and searching bibliographies for references, covering both published studies and ‘grey’ material (unpublished but generally available material). SRs screen studies for relevance; appraise them for quality on the basis of the research design and methods and the rigour with which these were applied; and synthesise the findings using formal quantitative or qualitative methods. They are therefore a robust, high-quality technique with pre-specified methods for evidence synthesis.
SRs are in a period of rapid development (Hemingway and Brereton 2009). In the health field, many SRs still look at clinical effectiveness, but methods now exist for reviewers to examine issues of appropriateness, feasibility and meaningfulness. Some SRs have been published without proper attention to clarity and careful oversight. Their findings may mislead, and such reports need to be interrogated by asking a series of questions that can uncover deficiencies.
3. Steps in a Systematic Review
There are eight main steps to a SR process. The first is to identify a health care question clearly and unambiguously. Generally SRs answer specific healthcare questions and assess the effectiveness of particular interventions rather than providing general summaries of the literature on a given topic. With the example of an intervention, the review question would clearly define: the specific population or problem being investigated, the intervention being evaluated, the comparison or control under investigation and the outcome of interest.
Second, a review protocol is developed. This is a detailed description of the scope, aims and methods of the study, stating clearly the review question, how and where studies will be located, selected, appraised and synthesised (Cardiff University, p. 2). This allows any problems of bias to be addressed.
The third step is the search of the literature, with the aim of identifying all relevant studies on the research topic. A comprehensive search strategy is developed and clearly outlined in the review protocol. For researchers in the health sector, an initial search would normally use a combination of relevant keywords and medical subject headings (MeSH) terms in Ovid Medline; for researchers in nursing, the initial search strategy would be developed in Cinahl. For a scoping study, it is best to begin with very general search engines such as Google Scholar, Web of Science and ASSIA, talking to experts in the field, and looking at book reviews. At an early point, however, it is necessary to turn to more specialist databases such as Embase, PsycINFO, HMIC and AMED.
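A documented search strategy of this kind typically combines free-text keywords and subject headings into a Boolean search string. The sketch below is purely illustrative: the search terms are invented examples, not a validated strategy, and real databases each have their own field-tag syntax.

```python
# Illustrative sketch only: build a Boolean search string of the kind
# recorded in a review protocol. Terms below are invented examples.
keywords = ['"health insurance"', "microinsurance", '"community-based insurance"']
subject_headings = ["Insurance, Health[MeSH]", "Developing Countries[MeSH]"]

# OR within each concept block, AND between blocks.
keyword_block = " OR ".join(keywords)
heading_block = " OR ".join(subject_headings)
query = f"({keyword_block}) AND ({heading_block})"
print(query)
```

Writing the strategy down in this explicit form, concept by concept, makes the search reproducible and allows the same blocks to be adapted to the syntax of each database searched.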
Publication bias has to be addressed by seeking out unpublished studies, also known as the ‘grey literature’. In all fields there is a tendency to publish research with positive findings: research that shows ‘no effect’ or nothing positive may not be published, but is just as important for building an overall picture of the effect of an intervention. Finding unpublished work can be very difficult because of the lack of a public record, so it is important to search conference proceedings databases, higher degree dissertations and websites. In addition, ‘English language’ bias should be addressed; if non-English studies are excluded (due to lack of resources for translation), this should be stated.
If possible, the search results should be imported into reference management databases such as Endnote, Reference Manager or Zotero.
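Results retrieved from several databases will overlap, so records are usually de-duplicated before or after import into the reference manager. The sketch below is purely illustrative: the records and the crude title-normalisation rule are invented for the example, and real reference managers apply more sophisticated matching.

```python
# Illustrative sketch only: de-duplicate records retrieved from several
# databases before import. Records and matching rule are invented examples.
def normalise(title):
    """Crude matching key: lower-case, letters and digits only."""
    return "".join(ch for ch in title.lower() if ch.isalnum())

records = [
    {"title": "Paying for performance in LMICs", "source": "Medline"},
    {"title": "Paying for Performance in LMICs.", "source": "Embase"},  # duplicate
    {"title": "Community-based insurance in Laos", "source": "Embase"},
]

seen, unique = set(), []
for rec in records:
    key = normalise(rec["title"])
    if key not in seen:        # keep only the first copy of each title
        seen.add(key)
        unique.append(rec)

print(len(unique), "unique records")
```

Recording how many duplicates were removed is worth doing in any case, as the flow of records through screening is normally reported in the final review.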
The fourth step is to identify relevant studies. Studies are assessed for their actual relevance independently by two or more researchers. The criteria for inclusion should be documented in the review protocol (that is, which population, intervention and outcome measures are of interest). When an SR evaluates the effectiveness of an intervention, randomised controlled trials (RCTs) are considered the most reliable evidence. Pre-specifying inclusion and exclusion criteria protects the review from allegations of investigator bias, where the reviewer for one reason or another becomes attached to one line of investigation.
Step five is to critically appraise the relevant studies. Critical appraisal should be performed independently by two or more researchers to avoid bias, and revolves around the methodology of the research and therefore how reliable its results are. Appraisers should be on the lookout for insufficient rigour in the design and conduct of studies. RCTs may have built-in bias at the following stages of the research: i) selection of participants (randomisation and allocation concealment); ii) treatment provided to the study group; iii) follow-up of participants; and iv) measurement of outcomes. SRs may examine quantitative and qualitative evidence; when two or more types of evidence are examined within one review it is called a mixed-method SR.
Step six is the extraction of data from the individual studies. This should also be performed independently by two or more researchers. Data collection tools carry out the following functions: i) ensure that all relevant data are collected; ii) minimise the risk of transcription errors while data are being collected; iii) allow the accuracy of the data to be checked; and iv) serve as a record of the data collected. This is a difficult phase of the SR, and one at which judgement needs to be applied, with as much clarity and reference to the protocol as possible. It is complicated by issues such as incomplete reporting of study findings, the large range of outcomes commonly used to measure an intervention and the different ways in which data are reported and presented.
The seventh step is to summarise the conclusions of the studies. The aim is to synthesise the individual studies to provide a clear and unambiguous judgement on the effectiveness of the intervention, or a summary of the research studies. It may be possible to combine the data statistically in a meta-analysis to obtain an overall estimate of the effectiveness of an intervention. A meta-analysis can only be undertaken when studies address the same question, use the same population, administer the intervention in a similar manner and measure the same outcomes; where this is not the case it is inappropriate. The results of a meta-analysis can be displayed graphically to facilitate interpretation, allowing a visual comparison of the findings of individual studies.
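The statistical pooling just described can be illustrated with the simplest case, a fixed-effect (inverse-variance) estimate: each study's effect is weighted by the inverse of its squared standard error, so that more precise studies contribute more. The sketch below is purely illustrative; the effect sizes and standard errors are invented, and a real meta-analysis would also examine heterogeneity and consider random-effects models.

```python
# Illustrative sketch only: fixed-effect (inverse-variance) pooling of
# three invented study results.
studies = [
    {"effect": 0.40, "se": 0.10},
    {"effect": 0.25, "se": 0.15},
    {"effect": 0.35, "se": 0.08},
]

weights = [1 / s["se"] ** 2 for s in studies]   # w_i = 1 / SE_i^2
pooled = sum(w * s["effect"] for w, s in zip(weights, studies)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5           # SE of the pooled estimate

print(f"pooled effect = {pooled:.3f} (SE {pooled_se:.3f})")
```

Note how the pooled standard error is smaller than that of any single study: combining compatible studies narrows the uncertainty, which is precisely why the comparability conditions listed above matter so much.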
The results can be boiled down to a simple categorisation of studies that showed the specific intervention was beneficial, and those that indicated that the treatment was not beneficial. A synthesis may also be achieved by a narrative summary of studies using written précis of findings supported by brief descriptions of each study in ‘evidence tables’. Bodies of evidence should be summarised in terms of four characteristics (DFID, 2014). These are: i) the technical quality of the studies constituting the body of evidence and the degree to which risk of bias has been addressed; ii) the size of the body of evidence; iii) the context in which the evidence is set, and iv) the consistency of the findings produced by studies constituting the body of evidence.
The final step is to publicise the review findings. SRs need to be promoted to inform policymakers and practitioners, and are of little use unless they serve this objective. Report production and dissemination are crucial parts of the process, with reports written to a focused structure: introduction; methodology; nature of the evidence identified and detailed findings; conclusions and recommendations. There needs to be a clear description of the methods so that the reader can judge the validity of the techniques employed.
SRs do have some drawbacks. When well conducted they should give the best possible estimate of any true effect, but such confidence may be misplaced on occasion. First, SRs may be done badly, and users should consult an appraisal checklist to determine the level of quality. Second, inappropriate aggregation of studies that differ in terms of the intervention used, the patients included or the types of data collected can drown out important effects: the effects seen in some subgroups may be concealed by a lack of effect (or even reverse effects) in other subgroups.
Finally, the findings of SRs are not always in harmony with the findings from large-scale single trials, and need to be weighed against conflicting evidence from other sources. Hierarchies of evidence for feasibility or appropriateness are available.
4. An Appraisal Framework for SRs
Hemingway and Brereton (2009) suggest that some of the key questions to be addressed in relation to any systematic review are:
- Is the topic well defined?
- Was the search for papers thorough?
- Were the criteria for inclusion of studies clearly described and fairly applied?
- Was study quality assessed by blinded or independent reviewers?
- Was missing information sought from the original study investigators?
- Do the included studies seem to indicate similar effects?
- Were the overall findings assessed for their robustness?
- Was the play of chance assessed?
- Are the recommendations based firmly on the quality of the evidence presented?
5. General Issues and the Future
The key element of SRs is impartiality, hence the requirement for independent assessment. However they are not easy, requiring enormous care and rigour with considerable attention to methodological detail and analysis. The label of ‘systematic review’ is hard earned. There are checklists to help in assessing quality.
There are some changing trends in SRs. Increasingly, health professionals cannot wait a year or so for a full SR to produce its findings (Hemingway and Brereton 2009). Rapid Evidence Assessments (REAs) can therefore provide a summary of what is already known about a topic or intervention, taking about two to six months. REAs use systematic review methods to search and evaluate the literature, but the comprehensiveness of the stages may be limited; there are minimum standards for REAs. Their use depends on the time frame for decisions, on continuing uncertainty about effectiveness despite considerable prior research, or on the need for an evidence map to chart existing evidence and direct future research.
Baker P, J Costello, M Dobbins, E Waters (2014). Cochrane Update: The benefits and challenges of conducting an overview of the benefits of public health: a focus on physical activity, Journal of Public Health, August 1, 2014.
Banbury A, A Roots and S Nancarrow (2014). Rapid Review of applications of e-health and remote monitoring for rural residents. Aust J Rural Health 22:211-222.
Boonstra A, A Verslius and J Vos (2014). Implementing Electronic Health records in Hospitals: A systematic literature review. BMC Health Services Research 14:370.
Berlan D, K Buse, J Shiffman and S Tanaka (2014). The bit in the middle: a synthesis of global health literature on policy formulation and adoption. Health Policy and Planning 29: iii23-iii34.
Cardiff University, Beginning a systematic review of the health care literature. http://www.cardiff.ac.uk/insrv/resources/guides/inf113.pdf Accessed 15 March 2015.
Cochrane, A L (1972). Effectiveness and Efficiency: Random Reflections on Health Services. The Nuffield Provincial Hospitals Trust. www.nuffieldtrust.org.uk/sites/files/nuffield/publication/Effectiveness_and_Efficiency.pdf
DFID, Assessing the Strength of Evidence, March 2014.
Erasmus, E, M Orgill, H Schneider, and L Gilson (2014). Mapping the existing body of health implementation research in lower income settings, what is covered and what are the gaps? Health Policy and Planning 29: iii35-iii50.
Gilson L (2014). Qualitative Research Synthesis for research policy analysis: what does it entail and what does it offer? Health Policy and Planning, 29:iii1-iii5.
Gilson, L, H Schneider and M Orgill (2014). Practice and Power: a review and interpretive synthesis focused on the exercise of discretionary power in policy implementation by front-line providers and managers. Health Policy and Planning 29: iii51-iii69.
Greenhalgh T, G Robert, F MacFarlane, P Bate, O Kyriakidou and R Peacock (2005). Storylines of research in diffusion of innovation, a meta-narrative approach to systematic review, Social Science and Medicine 61:417-430.
Greenhalgh T (2014). Evidence based medicine: a movement in crisis? British Medical Journal 13 June 2014.
Hemingway P and N Brereton (2009). What is a systematic Review, Evidence based medicine April 2009. http://www.medicine.ox.ac.uk/bandolier/painres/download/whatis/Syst-review.pdf
Jagosh J, Macaulay AC, Pluye P, Salsberg J, Bush PL, Henderson J, Sirett E, Wong G, Cargo M, Herbert CP, Seifer SD, Green LW, Greenhalgh T (2012). Uncovering the benefits of participatory research: Implications for a realist review of health research and practice, Milbank Quarterly, 90(2): 311-346.
Li, X, Z Ya, C Yao-long, Y Ke-hu, and Z Zong-diu (2015). The reporting characteristics and methodological quality of Cochrane reviews about health policy research. Health Policy 119: 503-510.
Loevinsohn, M, L Mehta, K Cuming, A Nicol, O Cuming and J Ensink (2014). The cost of a knowledge silo, a systematic re-review of water, sanitation and hygiene interventions. Health Policy and Planning May 29, 2014.
Pawson, R, J Greenhalgh, C Brennan and E Glidewell (2014). Do reviews of health care interventions teach us how to improve health care systems? Social Science and Medicine 114:129-137.
Rockers P, J-A Rottingen, I Shemilt, P Tugwell, and T Barninghausen (2015). Inclusion of quasi- experimental studies in health systems research, Health Policy 119: 511-521.
Thomas, J, A O’Mara-Eves, and G Brunton (2014). Using Qualitative Comparative Analysis (QCA) in systematic reviews of complex interventions: a worked example. Systematic Reviews 3:67.
Walt G and L Gilson (2014). Can frameworks inform knowledge about health policy processes? Reviewing health policy papers on agenda setting and testing them against a specific health priorities framework. Health Policy and Planning 29: iii6-iii22.
Witter, S, A Fretheim, F Kessy and A Lindahl (2012). Paying for Performance to improve the delivery of health interventions in low- and middle-income countries, Cochrane collaboration.
Wong, G, Greenhalgh T, G Westhorp, G Buckingham, and R Pawson (2013). RAMESES publication standards: realist synthesis, BMC Medicine, 11: 21.
Yoong S, T Clinton McHarg, L Wolfendon (2015). Systematic Reviews examining implementation of research into practice and impact on population health are needed, Journal of Clinical Epidemiology Online publication. http://dx.doi.org/10.1016/j.jclinepi.2014.12.008 Accessed March 12 2015.
The Cochrane Library www.cochrane.org
The Joanna Briggs Institute www.joannabriggs.edu.au/pubs/systematic_reviews.php
The Campbell Collaboration www.campbellcollaboration.org
The Centre for Evidence-Based Medicine www.cebm.net
The NHS Centre for Reviews and Dissemination www.york.ac.uk/inst/crd
1. Why is context important?
Traditional scientific experimentation typically involves (a) articulation of a plausible theory and (b) testing that theory under carefully controlled conditions to determine if predicted outcomes are observed. We can define ‘carefully controlled conditions’ as efforts by the experimenter to exclude every factor that could plausibly influence those observed outcomes to an extent that would be of concern, given the objectives of the experiment. For example, in attempting to estimate the acceleration of a falling body due to gravity, an experimenter would have to decide whether to conduct the experiment in a vacuum to eliminate the influence of air friction. Their decision would depend on the type of falling body and the precision with which they needed to measure its acceleration. Similarly, a chemist intending to measure the heat dissipated in a chemical reaction would have to consider the degree of purity of the chemical compounds involved. Would the measurements be significantly affected if they were only guaranteed to be 99.8 per cent pure as compared to 99.9 per cent? Note that there will be a multitude of other factors – for example the colour of the falling object or the age of the laboratory assistant mixing the reagents – that the experimenter may regard as obviously not relevant to the outcome of the experiment and therefore not needing to be controlled, though as scientific knowledge progresses there is always a possibility that one of these assumptions will later be proved incorrect.
The clinical trials of a new pharmaceutical will also typically involve the use of a range of controls that attempt to isolate the association between application of the drug and observed outcomes, in terms of physiological or psychological changes in a patient, from other factors that might ‘confound’ that association. As above, the purity of the drug will be carefully assessed. There will be procedures that aim to ensure that patients take their medicine in the prescribed doses, at the appropriate time and in the manner – for example before or after meals – laid down by those organising the trial. A placebo treatment will often be used to ‘control’ for the potential effects on patients of simply being involved in a trial, often with neither patients nor providers aware of which patients are receiving the placebo and which the drug until the trial is ended (an approach known as ‘double blinding’). Somewhat more controversially, the patients will usually be carefully screened before recruitment. They will generally be within a predetermined age range, have no pre-existing relevant health conditions and not be using other medications that may influence the outcome of the trial. They may also be excluded on the basis of a variety of other factors such as their weight, alcohol consumption, smoking habits or other lifestyle behaviours.
Interestingly, many practising physicians have expressed a concern (Zwarenstein and Treweek 2009) that the vast majority of clinical trials can be described as ‘explanatory’ (designed to test a hypothesis in a highly controlled context), rather than ‘pragmatic’ (designed to identify treatments that are likely to produce beneficial outcomes across the broad spectrum of patients routinely encountered by healthcare providers). They argue that by adopting ‘laboratory’ conditions and excluding patients with attributes that might confound the relationship between treatment and outcome, explanatory trials often produce findings that may be of scientific interest but are of limited practical value to clinicians working ‘in the real world’ and having to make difficult decisions about the best course of treatment for the large number of their patients who do not or will not conform to the rigid guidelines laid down by the drug manufacturers.
2. Context in health systems research
In the health systems interventions with which we are concerned there is almost no possibility of controlling for potentially confounding factors. ‘In laboratories scientists create artificial conditions in which those causal mechanisms which they conjecture to exist will be activated. In the natural world, potential causal mechanisms will only be activated if the conditions are right for them’ (Tilley 2000: 5). Interventions take place within a specific context, and implementation successes and failures can often be linked to uncontrolled and often uncontrollable mediating factors that derive from that context. In a small minority of relatively simple interventions it may be possible to adopt a version of the placebo approach indicated above by randomly allocating individuals in the targeted population to intervention or ‘control’ groups. In other cases ‘cluster randomisation’ may be possible, where whole facilities, villages, health districts, etc. in a targeted region are randomly allocated to receive or not receive the intervention. More often, when random allocation is not seen as a feasible option, the intervention population may be compared with ‘similar’ populations (with similarity based on the values of a range of socioeconomic and other indicators) that have not received the intervention, in what are usually described as ‘quasi-experimental’ implementation designs. In each case the argument (which is often not very convincing) is that the contexts in the intervention and control groups are sufficiently alike that different outcomes can be attributed to the intervention.
Whatever the intervention design, the implementation team will obviously wish for a successful outcome. To increase the likelihood of achieving this, given their inability to control contextual factors, they should: (a) determine what the most important of those factors are and how and to what extent they might influence outcomes; and (b) find ways to embrace those which are supportive and mitigate those which pose a threat to the implementation process. For the implementation researcher, as discussed in Chapter 1, the tasks are similar but even more challenging. In addition to the above, they would have to: (c) review the extent to which similar factors might need to be addressed in scaling up or relocating the intervention; (d) consider the implications for contexts where some positive factors may be less influential, absent or even negative; and (e) assess the possibilities for using approaches similar to those adopted in the current implementation for the mitigation of negative factors in other contexts.
To take a simple example, those implementing a mother and newborn child health (MNCH) intervention might find that a large majority of their target population have mobile phones (a potential positive factor) but also that a substantial number live in areas where road access is much more limited than expected (potential negative factor). The implementation team might decide to modify their operational procedures to make maximum use of mobile phone communications and to substitute motorcycles for ambulances to overcome the lack of road access. The implementation researcher would also need to consider: the extent to which these factors might be important in other regions; whether implementation performance might be less impressive in locations with more limited communications; and whether motorcycles would be a plausible solution to similar road access limitations in other parts of the country.
It is important to recognise that health system interventions are essentially social interventions, and that the diverse range of individuals who make up those societies may respond in very different ways depending both on their specific circumstances and on their perceptions of the intervention. They will often play the most important role in defining the context within which a given implementation takes place. Those contexts will also be strongly influenced by the nature of the communities within which those individuals live. For example, a child health promotion programme may aim to provide information, encourage trust in local services and empower mothers to take healthcare decisions. Programme implementation may trigger different processes depending on the characteristics of targeted individuals and households (age, education, socioeconomic status, family circumstances, etc.), and various community and societal factors (community assets, local power structures, cultural norms, etc.). Across such varied contexts, the same programme components might in some cases result in increased knowledge and improved attitudes and behaviours, and in others promote conflicts between and/or within households that risk impacting adversely on children’s health.
Developing a detailed understanding of the context within which an implementation takes place, and of the actual and potential consequences for implementation progress and outcomes, can thus be seen as one of the defining tasks of the implementation researcher. It may seem a daunting undertaking, given the range of potentially relevant contextual factors that will need to be considered and the limited resources that are typically available. However, remember that the definition of implementation research proposed in Chapter 1 assumes that the researcher will be an active member of the implementation team. This implies: (a) that the work can be shared across a number of individuals, all of whom will (or should) be equally concerned to understand the context within which they are working; and (b) that contextual knowledge can (and should) be acquired over an extended period, not in a ‘one-off’ exercise.
In practice, the problem faced by both researchers and the implementation team is rarely a lack of available data. As discussed in Chapter 3, even a cursory review of the literature or an elementary internet search will typically uncover a wealth of documentary material relating to the remotest regions and apparently most isolated populations. The difficult task is to identify the often small proportion of that material that provides data that is both relevant and trustworthy. There will also be a large number of individuals – colleagues, professional and social contacts, officials, journalists, etc. – who may be willing to provide key insights into areas that are less well addressed in the literature. For example, an anthropologist colleague assisted one of the editors of this volume by explaining that the design of a project could be relatively easily modified to avoid antagonising a local secret society that might otherwise have persuaded its members to hinder the implementation process. Such informal communications can be invaluable and often obtained with minimal effort – if the researcher has the initiative to seek them out and the ability to assess their reliability.
Whatever the available sources of data, it should be remembered that ‘working hard’ is no substitute for ‘working smart’. It is very easy to lose sight of the primary objective, gathering and interpreting contextual information that is likely to be relevant to the implementation process, and to waste valuable time and effort on readily available and interesting, but at best marginally useful, sources. For example, it can be fascinating to investigate the various manuals available in most ministries of health setting out the precise regulations governing the activities of various types of health provider, but if those regulations are routinely disregarded and there is no prospect that they will be monitored or enforced within the lifetime of the implementation, the resources allocated to that investigation should be strictly limited. A useful concept from participatory methodology is that of ‘optimal ignorance’ – a state achieved when it is recognised that the value of the resources required to gather additional information will probably exceed the likely benefits (Longhurst 2013).
2.1 Sensitive information
The ‘secret society’ example mentioned above raises an issue that is rarely addressed in textbooks but is often of critical importance – the extent to which potentially sensitive information on contextual factors should be disseminated. It will often be the case that the context within which an implementation is undertaken includes factors that may be acknowledged in private discussions but that would cause serious offence if made a matter of public record. For example, it might become evident that corrupt practices by providers were being tolerated by health authorities, or that some communities were willing to pay for healthcare for male children but not for girls. It would be a matter for the implementation team as a whole to decide how to address such issues. In most cases, a confrontational approach, proclaiming their concerns and endeavouring to overturn long-established practices within a relatively short time frame, will not be seen as the most effective strategy. Typically, various mechanisms may be introduced into the implementation design, for example modifications to financial control systems or campaigns intended to encourage greater utilisation of services by girl children, that will be described as general project enhancements, without reference to the specific, sensitive problems that they are intended to address.
This situation will often pose a dilemma for the implementation researcher. As a member of the implementation team, it would be entirely inappropriate for them to widely disseminate sensitive information against the wishes of that team. However, given the broader objectives of implementation research as defined in Chapter 1, they clearly cannot ignore evidence that might have serious implications in terms of the potential risks and benefits associated with scaling up the intervention. As will be discussed in Chapter 10, one way to address this dilemma is to move from a focus on ‘dissemination’, which we commonly associate with academic research findings, to one on ‘influencing’, which is more relevant to research that is specifically intended to feed into policy decisions. This involves a recognition that the knowledge we have gained from our research is, to adopt a concept from economics, an intermediate good, of value only to the extent that we use it to influence policy debates in ways that can be expected to improve health systems and ultimately raise the health status of the population.
From this perspective, the use of sensitive information, just as with any other information, should involve: (a) rigorously determining that you really do have valuable evidence that can and should contribute to policy debates – it is always very tempting to believe that you have unique insights; and (b) presenting that evidence to the relevant audiences in ways that are most likely to influence those debates as intended. Again, direct confrontation will generally not be the most effective strategy in terms of persuading key stakeholders as to the value of your evidence and may well have the opposite effect. Remember that senior officials and politicians will often be well aware of the issues you are addressing and are typically very adept at ‘reading between the lines’. In some cases, ‘speaking truth to power’ may be the best and most courageous option. But you have to be very sure that you are taking this line because it offers the best chance of achieving your ultimate goal and not because it offers the greatest personal satisfaction.
3. Frameworks for implementation context analysis
As indicated above, context analysis will often need to consider a wide range of factors, some in considerable detail, others simply to confirm that they are likely to be of at most marginal relevance in terms of their influence on the implementation process. In order to undertake a systematic analysis it is of considerable advantage to work within a predetermined framework. Such a framework is best compiled as a collaborative exercise. This should involve at least all members of the implementation team but can often be improved by working with a range of other stakeholders – health officials, providers, community members, etc. – who have specific knowledge of contextual factors that may otherwise be overlooked.
Frameworks will be intervention-specific. For example, a Situation Analysis Tool developed by the Centre for Public Mental Health at the University of Cape Town focuses on the provision and utilisation of mental health services. A guide to situation analysis produced by the Health Systems Trust in South Africa was intended for use by district officials and is therefore primarily concerned with assessing priority health issues and district-level facilities, human resources and management. More recently, the WHO has produced a Situation Analysis Toolkit for the implementation of interventions on male circumcision, which emphasises the need for detailed assessment of local customs and stakeholder attitudes. However, it is possible to consider a number of areas that should usually be at least considered in any such framework. These would include:
- Politics and history
- Physical environment
- Population characteristics
- Health needs and services
3.1 Politics and history
Over recent years it has been increasingly recognised in the literature that to understand how a health system functions it is essential to know how it, and the context within which it exists, has evolved over time (Bloom 2014, Grundy et al. 2014). In the language of complexity theory, it is necessary to acknowledge the importance of ‘path dependency’ (Paina and Peters 2012). Where there has been a history of projects that promised much and delivered little, perhaps because of weak local governance structures, it may be very difficult to persuade the population that a new intervention will be successful. Where corruption has become endemic, some stakeholders will view such an intervention as a potential new source of funds, while others will be very reluctant to participate, assuming that the benefits will be misappropriated. Where there are long-standing ideological differences between different sections of the population, there will be a risk that any new development will become a cause of dissent between different political factions. On the other hand, in populations that have a history of effective community organisations, such as proactive local women’s groups, it may be much easier to set up, for example, a community-based health insurance scheme (Asaki and Hayes 2011).
As indicated by the above, issues relating to local and national politics and to historical trends and events may well be seen as extremely sensitive and difficult to address within an implementation research setting. However, in many cases they will be among the most important contextual factors. Projects and programmes have come to an abrupt halt when a change of government has removed key political supporters. Others have failed to increase service utilisation because local health officials had lost the trust of a substantial section of the targeted population as a consequence of previous activities. The controversy surrounding the clinical trial in 1996 of a new antibiotic trovafloxacin (Trovan) by the drug company Pfizer in Kano, Northern Nigeria, during an epidemic of meningococcal meningitis, is still raised as an example of the risks of engaging with foreign companies in the health sector and has played a part in the resistance to polio vaccination programmes (Yahya 2006). As discussed above, under the heading ‘sensitive information’, the argument here is not that the implementation researcher should provoke controversy by reopening old wounds or taking sides in any political debate. However, if past events and current political positions are relevant to the potential outcome of the implementation process, they do have to be addressed, analysed and interpreted as an important component of the research findings. Again, the pragmatic use of those findings to influence policy will be discussed in Chapter 10.
3.2 Physical environment
The physical environment within which an intervention takes place should almost always play a major role in determining implementation design. To give an extreme example, implementation of an intervention designed to improve health outcomes for children living in the crowded squatter settlements of Nairobi will clearly pose very different problems from one intended for scattered populations in the highlands of Papua New Guinea or the densely populated islands of the Sundarbans mangrove forest in West Bengal. Even when the targeted regions are relatively limited in size, substantial geographical variations within regions may need to be carefully considered. For example, researchers will typically distinguish between urban and rural locations but not differentiate peri-urban areas, which often have their own very specific environmental characteristics. Similarly, it will often be essential to classify rural areas into those that are easily accessible and those that are more remote from major centres of population, given that it will typically be substantially more difficult to provide services in the latter.
Note that it should be standard practice to explain why different environmental factors are relevant. Researchers often provide descriptions that would be more appropriate for a geography textbook or a travel guide, specifying items such as precise estimates of land area, height above sea level, average annual rainfall or detailed grid references. Often the key issues concern potential physical access barriers – such as long, difficult and/or dangerous journeys to services by those seeking care (Houben et al. 2012) or restrictions on the ability of providers to transport medical supplies or appoint additional staff to facilities when required (Cohen et al. 2010). Another important question that is often overlooked relates to the willingness of providers (and their families) to live and work in ‘difficult’ areas, whether urban shanty towns or remote rural areas (Agyei-Baffour et al. 2011; Sundararaman and Gupta 2011).
Consideration of such issues naturally leads to questions relating to infrastructure and services. We often use some indicator of the overall ‘level of development’ of an area, such as GDP, but where possible such measures should be supplemented by data on specific factors that are considered relevant to the planned intervention. Is the area well served by road, rail or water transport links? Is there access to electricity, clean water and sanitation? Are there local primary/secondary schools? Is there a reasonably effective and trusted law enforcement service? Are there functioning telecommunications networks that could enable access to services via telephone, radio or television? How are these various services affected by seasonal factors such as rainfall, drought, high winds, snow?
3.3 Population characteristics
Just as it is important to distinguish between geographic areas with different characteristics, it is equally important to consider the extent to which distinctions between different population groups will be relevant to the outcome of the intervention. Sometimes there will be a considerable overlap between geographical areas and population characteristics. For example, in Nigeria certain states in the south-eastern region are closely associated with the Igbo ethnic group, the majority of whom identify with the Christian religion. Even in such cases the researcher should be very cautious in assuming that the number of individuals in that area who do not share those characteristics is insignificant. The 2013 Demographic and Health Survey for Nigeria suggests that almost 98 per cent of the population of the south-eastern region are Igbo. That would still imply, however, that around half a million individuals who identify themselves as being in other ethnic groups also live in that region. Given that resources are always constrained, implementers may sometimes reasonably decide that they will tailor an implementation process in ways that they see as most likely to meet with approval from the majority population in a given location, even though this may adversely affect the response from minority groups. Nevertheless, any such decision should be clearly stated, justified and evidence-based.
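The back-of-the-envelope calculation above can be made explicit; the regional population total below is a hypothetical round figure used purely for illustration, not a number taken from the DHS report:

```python
# Hypothetical round figure for the total population of the region;
# the 98 per cent majority share is the survey estimate cited in the text.
region_population = 25_000_000
majority_share = 0.98

minority_population = region_population * (1 - majority_share)
print(f"{minority_population:,.0f}")  # around 500,000
```

Even a 2 per cent minority share translates into hundreds of thousands of people once the population base is large, which is why such groups cannot simply be ignored in implementation design.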
Often it will be evident that there are relevant differences between population groups living in the same geographical area. For example, as in many other countries, most major cities in China have large migrant populations. Those populations have very limited access to the services, including health services, provided for those who have urban resident status (Mou et al. 2013). Any intervention intending to improve the health of the overall population living in a city would have to consider the very different circumstances of these two groups when developing the implementation design. Similar considerations would apply in south-western Nigeria, where there are many long-established settlements entirely composed of members of the Hausa ethnic group in cities that are predominantly populated by the Yoruba people. As the purpose of these settlements was precisely to retain traditional customs and practices, including authority structures, the context within which they live differs substantially from that of the majority population (Omobuwa et al. 2013).
Having identified relevant population groups on the basis of factors such as geographical location, migrant/non-migrant, ethnicity, culture, religion, etc., it will also be essential to consider the extent of variation within these groups. Gender will almost always be a key factor. Where an intervention is intended to improve the health of all members of a population group, for example advocating behavioural change to reduce the risk of chronic illness, it seems obvious that careful thought should be given as to how such an intervention will be perceived by both women and men and how they will respond. However, this should be the case even if the intervention is clearly gender-specific, for example intended to encourage increased use of ante-natal care, given that such perceptions and responses are invariably strongly influenced, positively or negatively, by existing gender relations (e.g. Dworkin et al. 2012; Nikiema et al. 2012). Potential differences between the younger and older members of population groups should also be considered. These may include attitudes to cultural traditions, authority structures or technical innovations. For example, a number of studies have suggested that older people are much less likely to use mobile phone texting services (e.g. Deng et al. 2014), with implications for behaviour change interventions that wish to adopt this approach. Note that it will often be informative to consider age and gender simultaneously. An intervention to encourage facility-based births, for example, might face opposition from older women who trust local traditional birth attendants with whom they have shared life experiences.
Variations in levels of education may also be an important consideration. Some interventions, for example those that provide written instructions and/or warnings to patients in relation to drug treatments, may be premised on the assumption that the great majority of the targeted population are either literate in one of the languages selected for use in the intervention or will be able to rely on support from a literate person. If this is not the case, alternative approaches may have to be adopted (Dowse and Ehlers 2001). There is evidence that adherence to anti-retroviral therapy can be influenced by the level of education of patients, though not always in the expected direction (Emamzadeh-Fard et al. 2012; Radhakrishnan et al. 2012). Interventions that involve some form of written contractual relationship – for example where individuals are invited to join a health insurance scheme – may be more easily comprehended and thus more attractive to those with language and/or numeracy skills beyond those that would be acquired at the primary level of education (Jehu-Appiah et al. 2011). A general understanding of the distribution of the population across different education levels may therefore be useful in predicting the potentially different responses to various components of an intervention, with implications for implementation design.
Finally in this section, we need to consider potential financial barriers to care. Even when an intervention is providing a notionally ‘free’ service we will typically find that utilisation is significantly lower for the poorer members of the population. This may sometimes be because facilities find ways to add indirect fees, for example to register with the facility, or because they encourage patients to take additional services – laboratory tests, scans, supplementary drugs, etc. – and imply that this will greatly increase the efficacy of the basic, free treatment. In some cases, it may be that patients have to bear travel or accommodation costs, or, more recently, need access to a mobile phone to utilise the service. It is also possible for the ‘opportunity costs’, associated with a household member having to take time away from wage employment, household production or other tasks, to be relatively high for poor households. In poor areas of rural China, for example, poverty is often associated with a lack of household labour time because household sizes tend to be limited and many of those of working age leave to seek employment in urban areas.
The aim of contextual analysis in this area would be to gain an understanding as to which sections of the identified population groups might either fail to access treatment due to financial barriers, or experience serious hardship due to expenditures associated with accessing treatment – often described as ‘catastrophic healthcare expenditure’ (Mills et al. 2012). This will typically involve compiling order-of-magnitude per capita annual household income or expenditure estimates, focusing on those within the identified groups most at risk. These might include, for example, small farmers and the landless in rural areas, day labourers and the self-employed in urban areas. These estimates will sometimes be available from sample survey data but it may often be necessary to rely on the judgement of a number of key informants. For interventions where substantial out-of-pocket expenditures might be involved – for example a co-payment scheme for inpatient treatment – it will also be useful to explore the extent to which these groups tend to have disposable assets, outstanding debts and access to sources of credit (including from extended families). In many countries, illnesses that result in substantial costs or loss of income can severely disrupt the livelihoods of households that have to sacrifice productive assets, including agricultural land, or take out loans on highly unfavourable terms that may force asset sales at a later date (de Laiglesia 2011, Kenjiro 2005). Finally, there should be at least some consideration of intra-household financial arrangements. For example, in many countries expenditure on healthcare for children may be influenced by women’s status in decision-making and control over household resources (Richards et al. 2013).
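The ‘catastrophic healthcare expenditure’ concept above is commonly operationalised by flagging households whose out-of-pocket health spending exceeds a fixed share of their capacity to pay. The sketch below uses one widely cited convention (a 40 per cent threshold applied to non-food expenditure) and entirely made-up household figures; neither the threshold nor the data come from this chapter:

```python
def is_catastrophic(oop_health_spending, nonfood_expenditure, threshold=0.40):
    """Flag catastrophic health expenditure: out-of-pocket health spending
    exceeding a fixed share of non-food expenditure (a proxy for capacity
    to pay). The 40% threshold is one common convention, used here only
    for illustration."""
    if nonfood_expenditure <= 0:
        # Any health spending is catastrophic for a household with no
        # expenditure above subsistence
        return oop_health_spending > 0
    return oop_health_spending / nonfood_expenditure > threshold

# Hypothetical household records: (out-of-pocket health, non-food expenditure)
households = [(50, 1000), (450, 1000), (0, 800)]
flags = [is_catastrophic(oop, nf) for oop, nf in households]
print(flags)  # [False, True, False]
```

In a real analysis the threshold, the capacity-to-pay denominator and the treatment of indirect costs (travel, lost labour time) would all need to be justified against the local context.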
3.4 Health needs
An intervention intended to improve the detection and treatment of TB cases should obviously compile as much information as possible about the incidence of TB in the target population, the extent to which those with TB have access to services and the extent to which they utilise those services. This would apply to any disease-specific intervention. Even for such focused interventions, however, it will often be important to compile data on a range of other health issues. Such data can, for example, assist us to understand population awareness and perceptions of the health issues with which we are primarily concerned, which may help explain their attitudes to our intervention. For example, a population in which both adults and children are subject to frequent bouts of fever, cough and diarrhoea may question why those implementing an intervention on chronic conditions such as hypertension or diabetes are failing to address what they, and possibly local health service providers, see as more immediate concerns.
More wide-ranging interventions, for example the introduction of performance-related payments in primary healthcare centres or new insurance schemes to meet the cost of inpatient care, would merit the compilation of detailed information on a range of relevant conditions with high incidence or prevalence rates in the target population. Such data should allow an improved understanding both of the healthcare needs of the population and of the current and potential demands placed on healthcare providers. Thus in the example above, reliable estimates of the likely rates of hypertension, diabetes and other chronic diseases in a population will not only indicate the need for an intervention to address such conditions but also the implications for healthcare services of improvements in their detection and diagnosis that may result from such an intervention. Note that cultural factors may complicate the translation of health needs into demands for health services. An obvious example in many countries relates to mental illnesses, where the stigma attaching to mental illness and the assumption that it should be addressed by religious or traditional healers will often prevent sufferers and their families from seeking care from allopathic providers (Brenman et al. 2014).
In many countries reliable data on incidence/prevalence rates and service utilisation over a recent period will be difficult to obtain, except possibly where there are well-funded major programmes for specific diseases such as HIV/AIDS or TB, or where it is possible to obtain access to detailed facility or health insurance data. In general, researchers will have to rely on evidence from previous national surveys. For example, data on the most common early childhood diseases can be found from the Demographic and Health Surveys undertaken in many countries. It should be noted that such surveys are typically based on reported symptoms rather than formal diagnosis and that they rely heavily on the ability of respondents to provide details as to the type of healthcare accessed. A useful international source of data on disease-specific morbidity and mortality for most countries is provided by the Global Burden of Disease studies. The country profiles compiled under this programme represent systematic attempts to use whatever data are available to derive best estimates of the impact of different diseases based on the number of years of life lost to premature deaths and the number of years lived with a disability. Note that national surveys will aim to provide data for the population as a whole. Disaggregation by location or population group may be possible to some extent, depending on the nature of the survey and whether access to the raw data is possible, but it will often be impossible to derive disease patterns for the specific population targeted by the intervention.
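The Global Burden of Disease measures mentioned above combine mortality and disability into a single metric. A minimal sketch of the basic DALY identity (DALYs = years of life lost to premature death + years lived with disability), using hypothetical figures for a single condition:

```python
def years_of_life_lost(deaths, life_expectancy_at_death):
    # YLL: premature deaths weighted by remaining life expectancy
    return deaths * life_expectancy_at_death

def years_lived_with_disability(cases, disability_weight, mean_duration_years):
    # YLD: cases weighted by severity (disability weight, 0-1) and duration
    return cases * disability_weight * mean_duration_years

# Hypothetical figures for one condition in one population
yll = years_of_life_lost(deaths=100, life_expectancy_at_death=30.0)
yld = years_lived_with_disability(cases=5_000, disability_weight=0.2,
                                  mean_duration_years=2.0)
dalys = yll + yld
print(dalys)  # 5000.0
```

The published GBD estimates involve far more elaborate modelling (age weighting, comorbidity corrections, uncertainty intervals); the identity above simply shows how mortality and morbidity are placed on a common scale.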
3.5 Health systems
In Chapter 1 it was argued that the most useful approach to the study of health systems, especially in situations where there are multiple providers and limited regulation of services (Bloom and Standing 2008), is to require the implementation researcher to define the health system – in terms of agents, units and institutions – that will be the focus of their research. However, in order to undertake that definition, to define the boundaries of the system with which they are concerned, they should first undertake a systematic assessment of the diverse range of providers that are offering to provide health services to the population targeted by the implementation. This may not be a simple task. For example, we often characterise health service providers under headings such as:
- Public healthcare;
- Formal private healthcare;
- Informal private providers (unlicensed practitioners, shops, drug sellers);
- Traditional, religious and faith healers;
- Household healthcare.
However, many of these categories consist of multiple components that have their own distinct characteristics. In China, for example, public hospitals may offer either allopathic (Western) medicine, Traditional Chinese Medicine, or both. Ayurvedic practitioners have recently gained a similar status in India. In the United Kingdom, there are long-standing debates as to whether homeopathic treatments should be available under the public National Health Service. In most countries the ‘formal’ private sector will include individuals from a variety of medical traditions ranging from senior specialists, with qualifications and experience equal to or greater than those in the public sector, to providers with minimal training and titles ranging from Registered Medical Practitioner in India (Das and Hammer 2007) to Village Doctor in Bangladesh (Mahmood et al. 2014) to Community Health Worker in other countries. In terms of the quality of services provided, there may be little to choose between many of these providers and those who practise without any form of licence or simply sell drugs in shops or local markets. In both groups there will be dedicated, principled providers who sincerely believe that they are doing their utmost to help those who seek their services and there will be unprincipled charlatans or ‘quacks’, whose primary aim is to extract money from or exert influence over their patients.
Having identified those components that seem most relevant to the implementation – that is, those which play a substantial role in delivering the services on which the implementation is focused – it will be helpful to undertake a systematic descriptive analysis to identify for each component the various units, decision-making agents and (formal and informal) institutions that govern its operation. A simple outline example is provided below (Tables 5.1 and 5.2), where we adopt the WHO classifications discussed in Chapter 1 to consider different aspects of each type of service provider. As always in health systems research, we will wish to assess the implications of our analysis in terms of access, utilisation and quality of services for different population groups. In general, it will be relatively straightforward to identify the main agents and units but understanding the institutional arrangements will typically be more challenging.
A reasonable understanding of the formal institutions can usually be gained by a review of the available documentation: policy statements, plans, laws, regulations, protocols, procedures, guidelines, etc., though identifying and prioritising such documents will often require guidance from key informants. However, in many cases these documents will describe a system that differs substantially from that which actually exists. During research undertaken in Nigeria in the mid-1990s, it was common to find in some local government areas that the rural primary healthcare system described by the ministry of health simply did not exist. Health workers had not been paid for many months and had moved away to seek other employment, equipment had become unserviceable or disappeared and in some cases buildings had collapsed because of long-term failures in basic maintenance. Similarly, following the economic reforms in China, the rural ‘three-tier healthcare system’, under which County Hospitals supervised Township Healthcare Centres, which supervised Village Health Stations, evolved into what were essentially local competitive markets, with each facility competing for patient ‘out-of-pocket’ fees. It took many years before this situation was officially recognised and action taken to offset the worst characteristics of these markets. In one East African country, basic drugs were at one time primarily available via ‘essential medicine kits’ supplied by an international aid agency. It was general knowledge that many of these kits were diverted to local shops to which patients would be referred even by providers at public health centres. However, formal acknowledgement of this practice would have resulted, under the agreement with the agency, in the withdrawal of the kits until more secure delivery mechanisms could be devised. As this was seen as creating a potentially life-threatening situation for many children, everyone proceeded as if they were unaware of the true situation.
In such situations it will be necessary to understand both the intended and the actual institutional arrangements that govern the activities of a health system. While it may be common knowledge that the system is functioning quite differently from what was intended in the various policy documents and formal operating guidelines, it is rarely possible for those charged with managing the system to radically shift their activities to allow for that fact. Officials, managers, administrators and providers have contracts of employment that assume the intended operating procedures. Data collection systems and reporting forms will have been designed to align with those procedures. In some countries, health officials will take great care in preparing annual budgets that they know will be ignored. In many others, those managing health information databases will use sophisticated techniques to analyse and present data that are well understood to be incomplete and highly unreliable. Institutions are important even when they result in unintended consequences. Implementation research has to understand both how they were intended to function and how they fail.
Table 5.1: Public health sector
| Component | Units | Agents | Institutions |
|---|---|---|---|
| Service delivery | Hospitals | Doctors | Legal system |
| Human resources | Hospitals | Managers | Legal system |
| Supplies and equipment | Facilities | Managers; drug company representatives | Legal system |
| Leadership and governance | Ministries | Ministry officials | Legal system |
| Financing | Patients | Patients | Legal system |
| Information systems | Registers | Administrators | Data quality audit |
Table 5.2: Informal private health sector
| Component | Units | Agents | Institutions |
|---|---|---|---|
| Service delivery | Clinics | Village doctors | Legal system; custom and practice |
| Human resources | Households | Village doctors | Custom and practice |
| Supplies and equipment | Shops | Shopkeepers | Legal system; custom and practice |
| Leadership and governance | Providers | Village doctors | Legal system; custom and practice |
| Finance | Patients | Patients | Markets; custom and practice |
| Information systems | Account books | Shopkeepers | Custom and practice |
4. Stakeholder analysis
Stakeholder analysis (Brugha and Varvasovszky 2000) can be one of the most important activities undertaken by researchers in terms of understanding the context within which the implementation takes place – but only if it is systematic and comprehensive. As discussed in Chapter 1, even apparently simple technical health system innovations typically involve complex social interventions. The extent to which different groups respond with enthusiasm, indifference or hostility to an implementation will often determine its relative success or failure. On occasion, depending on their degree of authority or influence, a single individual can make the difference. Predicting likely responses, determining their potential implications, adapting procedures in line with that analysis and repeating this process over the lifetime of the implementation is a key activity for the implementation team in terms of maximising the likelihood of achieving targeted outcomes. For the implementation researcher, engaging in a rigorous stakeholder analysis can provide valuable insights into the social contextual factors that might promote or obstruct scaling up or relocation of the studied intervention. Too often, however, such analyses are not allocated the resources that they merit, being seen as simply a routine task to be undertaken at the start of the implementation, sometimes simply to meet the requirements of those funding the intervention, and largely disregarded thereafter.
We can outline the aims and objectives of stakeholder analysis as follows (Varvasovszky and Brugha 2000):
- Identify all relevant stakeholders and assess:
  - how they are likely to be affected by the intervention;
  - how they are likely to respond;
  - the implications of their responses, given their capacity to influence implementation outcomes either directly or through their relationships with other stakeholders.
- Where possible, modify the implementation design to (a) encourage collaboration and (b) minimise obstruction by different stakeholders.
- Improve understanding of the underlying causes of implementation successes or failures that are linked to stakeholder behaviour.
A stakeholder can be any individual, group or organisation that may be positively or negatively affected by an intervention or in a position to influence the implementation of that intervention and have a positive or negative effect on intended outcomes. Broadly speaking, the analysis is intended to generate information about these stakeholders that can improve our understanding of their incentives, perceptions, attitudes, and relationships with other stakeholders in order to provide insights into their current and likely future patterns of behaviour in relation to the implementation of a given intervention (Hyder et al. 2010). This requires:
- Identification of potential stakeholders.
- For each potential stakeholder, determination of:
  - The extent to which they are interested in the intervention;
  - How and to what extent they can influence the implementation progress;
  - Their attitudes and actions relating to the intervention objectives;
  - The factors that are most important in determining those attitudes and actions.
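As an illustration only, the register that results from these determinations can be sketched as a simple data structure. The field names and example entries below are our own hypothetical choices, not a standard schema from the stakeholder analysis literature:

```python
from dataclasses import dataclass, field

@dataclass
class Stakeholder:
    """One row in a stakeholder register (illustrative, hypothetical fields)."""
    name: str
    interest: str      # e.g. 'beneficiary', 'directly involved', 'none apparent'
    influence: str     # e.g. 'policymaker', 'gatekeeper', 'opinion leader'
    attitude: str      # e.g. 'supportive', 'opposed', 'unknown'
    evidence: list = field(default_factory=list)  # documents/interviews behind the assessment

# A register is then simply a list that can be filtered and revisited over time.
register = [
    Stakeholder("district health officer", "directly involved", "decision-maker", "supportive"),
    Stakeholder("local drug sellers", "directly affected", "gatekeeper", "unknown"),
]

# Stakeholders with no apparent interest can be set aside, but the record is kept
# because (as noted above) the assessment may change later.
active = [s for s in register if s.interest != "none apparent"]
```

Keeping the full register rather than deleting 'no interest' entries mirrors the advice above that such stakeholders may have to be added back if the assessment changes.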
4.1 Identifying stakeholders
One way to identify stakeholders is to work ‘outward’ from the implementation:
- Start with the project managers or lead implementers and donors/funders;
- Move out to consider those they work with – partners, service providers, regulators and owners of resources (facilities, land, infrastructure, etc.);
- Move out again to those who will either be involved in the activities or who are beneficiaries – NGOs, local authorities, communities, media organisations and other interest groups.
Working outward in this way is intended to assist in identifying ALL stakeholders – those who impact ON, or are impacted BY, the activities.
4.2 Stakeholder groups and subgroups
The limited resources typically allocated to stakeholder analysis often result in a failure to achieve an appropriate level of disaggregation. It is important to distinguish significant stakeholder subgroups that may have very different attitudes and levels of influence.
- Providers should at least be subdivided into categories such as: public/private, qualified/unqualified, allopathic/traditional/faith-based, but some degree of cross-classification (e.g. qualified traditional private providers) may substantially increase the value of the analysis by reducing within-group variation.
- Community members might be classified in terms of: male/female, younger/older, richer/poorer, indigenous/migrant, ethnic group, etc., again keeping open the potential value of cross-classification (e.g. older poorer women).
The large number of potentially interesting groups and subgroups requires a careful process of prioritisation. The criterion should always be the anticipated level of influence that different subgroups may have over implementation outcomes.
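One common way to operationalise this prioritisation, sketched here purely as an illustration (the subgroups and the 1–5 scores are invented), is a simple weighted scoring in which anticipated influence counts for more than declared interest:

```python
# Hypothetical 1-5 scores; in practice these would come from the document
# review and interviews described in the text, and be revisited regularly.
stakeholders = {
    "senior hospital staff": {"influence": 5, "interest": 4},
    "community members":     {"influence": 2, "interest": 5},
    "media organisations":   {"influence": 3, "interest": 1},
}

def priority(scores):
    # Influence is weighted above interest, reflecting the criterion that
    # anticipated influence over outcomes should drive prioritisation.
    return 2 * scores["influence"] + scores["interest"]

ranked = sorted(stakeholders, key=lambda s: priority(stakeholders[s]), reverse=True)
# ranked[0] is the subgroup to analyse first
```

The weighting itself is an assumption made for the sketch; any scheme that ranks subgroups by anticipated influence over outcomes would serve the same purpose.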
The analysis will focus on identifying the following characteristics of each stakeholder or stakeholder group in line with the aims indicated above:
- Their interest in the intervention:
  - Intended beneficiary;
  - Direct involvement in the implementation;
  - Likely to be directly affected by the intervention (positively or negatively);
  - Likely to be influenced by those directly affected;
  - No apparent interest (can be omitted from the current list of stakeholders but may have to be added later if this assessment changes).
- Their potential influence over the implementation process, for instance:
  - Policymaker (capacity to affect implementation strategy or context);
  - Decision-maker (capacity to affect routine implementation activities);
  - Gatekeeper (capacity to control access to resources or take-up of services);
  - Opinion leader (capacity to affect responses of other stakeholders).
- Their attitude to the intervention (evidence to be provided where available):
  - Enthusiastic supporter;
  - Generally supportive;
  - Strongly opposed.
- Factors driving attitudes, possibility of changing attitudes and potential benefits:
  - How likely are they to use their influence to change outcomes and why?
  - What would be the consequences?
  - Are they accessible to the implementation team?
  - Might they be responsive to incentives intended to modify their attitudes?
4.3 Stakeholder networks
Given the complexities of stakeholder analysis it is sometimes tempting to make the simplifying assumption that a given stakeholder or stakeholder group can be considered in isolation, and that they make their decisions in line with their own perceptions and preferences. However, in the real world we know that this is rarely the case and that relationships between stakeholders, especially power relationships (Erasmus and Gilson 2008), play a major role in influencing behaviour. A hospital manager may be convinced that a new payment mechanism would result in a better outcome for patients but act to undermine that mechanism because he wishes to avoid conflicts with senior hospital staff who believe that it will adversely affect their incomes. Local government health officials may see the training, licensing and monitoring of local drug sellers as the most effective way to deliver antimalarials but be pressured into opposing this reform by qualified providers with connections to local politicians, who see it as threatening their control over the supply of prescription drugs.
This indicates the need to identify stakeholder networks (Blanchet and James 2012). The aim will be to map the formal and informal links between stakeholders and assess the underlying nature of those links – in particular do they exist only in theory (e.g. according to regulations) or do they have practical consequences? Relationships may be of many types including: financial support; direct management; oversight/monitoring (in theory and in practice); advice/influence (in theory and in practice), etc. Having identified such relationships, the stakeholder analysis can be revisited to explore the extent to which they provide additional clues to the attitudes and behaviours of particular stakeholders.
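A minimal sketch of such a network map, with hypothetical stakeholders and link types, might record each link together with whether it operates in theory, in practice, or both:

```python
# Each edge records the link type and whether it operates in practice,
# not just on paper (a distinction emphasised in the text). The specific
# stakeholders and links are invented for illustration.
links = [
    ("ministry", "hospital manager", "oversight",
     {"in_theory": True, "in_practice": False}),
    ("senior staff", "hospital manager", "influence",
     {"in_theory": False, "in_practice": True}),
    ("hospital manager", "providers", "direct management",
     {"in_theory": True, "in_practice": True}),
]

def practical_influences_on(stakeholder):
    """Links that actually shape a stakeholder's behaviour."""
    return [(src, kind) for src, dst, kind, status in links
            if dst == stakeholder and status["in_practice"]]

# Reveals that informal pressure from senior staff, not formal ministry
# oversight, is what bears on the hospital manager in practice.
practical_influences_on("hospital manager")  # → [('senior staff', 'influence')]
```

Querying the map in this way supports the step described above: revisiting the stakeholder analysis once the relationships that have practical consequences have been separated from those that exist only in regulation.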
4.4 Data collection
A range of data collection activities will need to be undertaken. The aim will be to seek: responses to a range of specific questions; the reasons underlying these responses; and the extent to which the responses are based on available evidence. The initial activity should involve a detailed document review to assess the stated position (if any) of each stakeholder on issues relevant to the intervention objectives. This will be followed by primary data collection using semi-structured interviews, structured questionnaires (possibly self-administered) and focus groups.
One important consideration is the extent of involvement in data collection, feedback and analysis by the various stakeholders themselves. Such involvement can offer many advantages in terms of the extensive knowledge that individual stakeholders may have, for example in terms of the range of attitudes inside an organisation or internal documents that may be difficult to identify by other means. On the other hand, there are obvious risks that some stakeholders may attempt to drive the analysis in directions that support their own agendas. Decisions should be made on a case-by-case basis depending partly on the sensitivity of the information compiled. Where there are no concerns that the analysis may cause offence, a feedback stage can be included in the process to allow each stakeholder to comment on findings relating to their own position and correct factual inaccuracies where appropriate.
Finally, we would suggest that in gathering the data to perform a stakeholder analysis it is useful to keep in mind the work of sociologist Erving Goffman (Goffman 1959). This codified the commonplace observation that individuals can be compared to actors who play a variety of roles. They will behave very differently depending on whether they are ‘front-of-stage’, ‘offstage’ or ‘backstage’. Front-of-stage is where the actor formally performs and here they will adhere to conventions that align with the expectations of their current audience. For example, a hospital director accompanied by senior staff members responding to questions from an unfamiliar researcher would probably play a very different role from that which they would adopt if called to a formal meeting with the Minister of Health. Offstage is where actors meet individual audience members in an informal environment. Thus, if our hospital director happened to meet the researcher at a social gathering and realised that they were both long-standing friends with the local politician who was the host of that gathering, he or she might well be much more open in discussion of current problems in the health sector. Even so, this would be another role that the director was playing. Only when they are alone ‘backstage’ do actors truly get to be themselves. The lesson to be learned is that it is always wise to be cautious about the extent to which any researcher can truly ‘understand’ the perceptions and attitudes of different stakeholders. Those who proclaim themselves to be strongly in favour of an intervention may act in ways that undermine its implementation and those who are initially most critical may become key players in ensuring its success. Only by updating the stakeholder analysis on a regular basis and comparing actions to stated intentions can the implementation researcher hope to gain at least a working understanding of the factors influencing stakeholder behaviour.
Agyei-Baffour, P.; Kotha, S.R.; Johnson, J.C.; Gyakobo, M.; Asabir, K.; Kwansah, J. et al. (2011) ‘Willingness to Work in Rural Areas and the Role of Intrinsic Versus Extrinsic Professional Motivations – A Survey of Medical Students in Ghana’, BMC Medical Education 11: 56
Asaki, B. and Shannon, H. (2011) ‘Leaders, Not Clients: Grassroots Women's Groups Transforming Social Protection’, Gender and Development 19.2: 241–53
Blanchet, K. and James, P. (2012) ‘How to Do (or Not to Do)... A Social Network Analysis in Health Systems Research’, Health Policy and Planning 27.5: 438–46 (accessed 30 January 2015)
Bloom, G. (2014) ‘History, Complexity and Health Systems Research’, Social Science and Medicine 117: 160–61
Bloom, G. and Standing, H. (2008) ‘Future Health Systems: Why future? Why now?’, Social Science and Medicine 66: 2067–75
Brenman, N.F.; Luitel, N.P.; Mall, S. and Jordans, M.J.D. (2014) ‘Demand and Access to Mental Health Services: A Qualitative Formative Study in Nepal’, BMC International Health and Human Rights 14.22, www.biomedcentral.com/content/pdf/1472-698X-14-22.pdf (accessed 30 January 2015)
Brugha, R. and Varvasovszky, Z. (2000) ‘Stakeholder Analysis: A Review’, Health Policy and Planning 15.3: 239–46
Cohen, J.; Sabot, O.; Sabot, K.; Gordon, M.; Gross, I.; Bishop, D. et al. (2010) ‘A Pharmacy Too Far? Equity and Spatial Distribution of Outcomes in the Delivery of Subsidized Artemisinin-Based Combination Therapies through Private Drug Shops’, BMC Health Services Research 10: S6
Das, J. and Hammer, J. (2007) ‘Location, Location, Location: Residence, Wealth, and the Quality of Medical Care in Delhi, India’, Health Affairs 26.3: w338–w351, http://content.healthaffairs.org/content/26/3/w338.full.pdf+html (accessed 30 January 2015)
de Laiglesia, J.R. (2011) ‘Why Do Farmers Sell their Land? Evidence from Nicaragua’, OECD Development Centre, pagesperso.dial.prd.fr/dial_pagesperso/dial.../204_de%20laiglesia.pdf (accessed 30 January 2015)
Deng, X.; Ye, L.; Wang, W. and Zhu, T. (2014) ‘A Cross-Sectional Study to Assess the Feasibility of a Short Message Service to Improve Adherence of Outpatients Undergoing Sedation Gastrointestinal Endoscopy in the People’s Republic of China’, Patient Preference and Adherence 8: 1293–97
Dowse, R. and Ehlers, M.S. (2001) ‘The Evaluation of Pharmaceutical Pictograms in a Low-literacy South African Population‘, Patient Education and Counselling 45: 87–99
Dworkin, S.L.; Colvin, C.; Hatcher, A. and Peacock, D. (2012) ‘Men’s Perceptions of Women's Rights and Changing Gender Relations in South Africa: Lessons for Working with Men and Boys in HIV and Antiviolence Programs’, Gender and Society 26: 97
Emamzadeh-Fard, S.; Sahar, E. Fard, S. A., Alinaghi, S. and Paydary, K. (2012) ‘Adherence to Anti-Retroviral Therapy and Its Determinants in HIV/AIDS Patients: A Review’, Infectious Disorders – Drug Targets 12: 346–56
Erasmus, E. and Gilson L. (2008) ‘How to Start Thinking about Investigating Power in the Organizational Settings of Policy Implementation’, Health Policy and Planning 23: 361–68
Goffman, E. (1959) The Presentation of Self in Everyday Life, New York: Doubleday
Grundy, J.; Hoban, E.; Allender, S. and Annear, P. (2014) ‘The Intersection of Political History and Health Policy in Asia: The Historical Foundations for Health Policy Analysis’, Social Science and Medicine 117: 150–59
Houben, R.; Van Boeckel, T.; Mwinuka, V.; Mzumara, P.; Branson, K.; Linard, C. et al. (2012) ‘Monitoring the Impact of Decentralised Chronic Care Services on Patient Travel Time in Rural Africa – Methods and Results in Northern Malawi’, International Journal of Health Geographics 11: 49
Hyder, A.; Syed, S.; Puvanachandra, P.; Bloom, G.; Sundaram, S.; Mahmood, S. et al. (2010) ‘Stakeholder Analysis for Health Research: Case Studies from Low- and Middle-income Countries’, Public Health 124: 159–66
Jehu-Appiah, C.; Aryeetey, G.; Spaan, E.; de Hoop, T.; Agyepong, I. and Baltussen, R. (2011) ‘Equity Aspects of the National Health Insurance Scheme in Ghana: Who is Enrolling, Who is Not and Why?’ Social Science and Medicine 72: 157–65
Kenjiro, Y. (2005) ‘Why Illness Causes More Serious Economic Damage than Crop Failure in Rural Cambodia’, Development and Change 36.4: 759–83
Longhurst, R. (2013) ‘Implementing Development Evaluations under Severe Resource Constraints’, Centre for Development Impact Practice Paper 3, Brighton: Institute of Development Studies
Mahmood, S.S.; Iqbal, M.; Hanifi, S.M.A.; Wahed, T. and Bhuiya, A. (2014) ‘Are “Village Doctors” in Bangladesh a Curse or a Blessing?’, BMC International Health and Human Rights 10.18, www.biomedcentral.com/content/pdf/1472-698X-10-18.pdf (accessed 30 January 2015)
Mills, A.; Ataguba, J.E.; Akazili, J.; Borghi, J.; Garshong, B.; Makawia, S.; Mtei, G.; Harris, B.; Macha, J.; Meheus, F. and McIntyre, D. (2012) ‘Equity in Financing and Use of Health Care in Ghana, South Africa, and Tanzania: Implications for Paths to Universal Coverage’, The Lancet 380.9837:126–33, www.sciencedirect.com/science/article/pii/S0140673612603572 (accessed 30 January 2015)
Mou, J.; Sian, M.; Griffiths, H.F. and Dawes, M.G. (2013) ‘Health of China’s Rural–Urban Migrants and their Families: A Review of Literature from 2000 to 2012’, British Medical Bulletin 106: 19–43
Nikiema, B.; Haddad, S. and Potvin, L. (2012) ‘Measuring Women’s Perceived Ability to Overcome Barriers to Healthcare Seeking in Burkina Faso’, BMC Public 12: 147, www.biomedcentral.com/1471-2458/12/147 (accessed 30 January 2015)
Omobuwa, O.; Onayade, A.A.; Olajide, F.O. and Afolabi, O.T. (2013) ‘A Comparative Study of the Nutritional Status of Under-Five Children of Indigenous and Non-Indigenous Parentage in lle-lfe, Nigeria‘, Journal of Community Medicine and Primary Health Care 25.1: 1–11, www.ajol.info/index.php/jcmphc/article/view/95507/84849 (accessed 3 March 2015)
Paina, L. and Peters, D. (2012) ‘Understanding Pathways for Scaling Up Health Services through the Lens of Complex Adaptive Systems’, Health Policy and Planning 5: 365–73
Radhakrishnan, R.; Vidyasagar, S.; Varma, D.M. and Sharma S. (2012) ‘Design and Evaluation of Pictograms for Communicating Information about Adverse Drug Reactions to Antiretroviral Therapy in Indian Human Immunodeficiency Virus Positive Patients’, Journal of Pharmaceutical and Biomedical Sciences 16.16: 1–11, www.jpbms.info/index.php?option=com_docman&task=doc_download&gid=415&Itemid=48 (accessed 3 March 2015)
Richards, E.; Theobald, S.; George, A.; Kim, J.C.; Rudert, C.; Jehan, K. and Tolhurst, R. (2013) ‘Going Beyond the Surface: Gendered Intra-Household Bargaining as a Social Determinant of Child Health and Nutrition in Low and Middle Income Countries’, Social Science and Medicine 95: 24–33
Sundararaman, T. and Gupta, G. (2011) ‘Indian Approaches to Retaining Skilled Health Workers in Rural Areas’, Bulletin of the World Health Organization 89: 73–7
Tilley, N. (2000) ‘Realistic Evaluation: An Overview’, paper presented at the Founding Conference of the Danish Evaluation Society, September, http://evidence-basedmanagement.com/wp-content/uploads/2011/11/nick_tilley.pdf (accessed 3 March 2015)
Varvasovszky, Z. and Brugha, R. (2000) ‘How to Do (or Not to Do)... A Stakeholder Analysis’, Health Policy and Planning 15.3: 338–45
Yahya, M. (2006) ‘Polio Vaccines – Difficult to Swallow: The Story of a Controversy in Northern Nigeria’, IDS Working Paper 261, Brighton: Institute of Development Studies, www.ids.ac.uk/files/Wp261.pdf (accessed 30 January 2015)
Zwarenstein, M. and Treweek, S. (2009) ‘What Kind of Randomized Trials do we Need?’, Journal of Clinical Epidemiology 62: 461–63
Meharry Medical College, School of Medicine
Adnan A. Hyder
Professor & Director, Health Systems Program, Johns Hopkins Bloomberg School of Public Health & Associate Director for Global Bioethics, Johns Hopkins Berman Institute of Bioethics
This chapter assumes a basic (introductory) familiarity with core terms in both health systems research and bioethics; for the latter these include the core principles of respect for persons (autonomy), beneficence, and justice.
A measles immunisation programme for all children in a low-income country is to be created. The population density and local ecology lend themselves to the proliferation of epidemics, and most children have had the infection by age three. It has been noted statistically that immunisations have the greatest impact on high mortality rates if administered by age one. Studies in high-income countries have shown that the immunological response to a measles vaccine is most effective at 15 months of age. Local research is needed to decide the ideal age at which the programme can produce the most impact on measles incidence and mortality.
A rapidly industrialising middle-income country has been expanding its transport and communication networks, with a large growth in healthcare facilities. The newly created hospitals and medical centres are potentially capable of providing complete coverage for the entire population. The Ministry of Health is concerned that, despite the investment in health facilities and services, they remain largely inaccessible to many individuals. A district health officer is appointed to improve the function of the health facilities. She discovers that a serious shortage of drugs and supplies at a medical centre is the result of the hospital siphoning off most of the drugs and supplies. She is to design a study to examine the misallocation of drugs within this system.
As noted in these two slightly modified cases from the World Health Organization (WHO) (Taylor 1984), it is apparent that research that addresses these problems – or health systems research – is different from other types of empirical clinical or public health research, and covers a wide range of subject areas that are focused on common health systems functions, such as stewardship, financing, resource inputs, and delivery of services (WHO 2009). Health systems research (HSR) is defined by the WHO as ‘the purposeful generation of knowledge that enables societies to organise themselves to improve health outcomes and health services’ (WHO 2009).
HSR is not usually research that focuses on the discovery or development of new interventions to improve health; rather, it is research that usually aims to understand how new interventions that are efficacious can be made more widely accessible to potential beneficiaries through policies, organisations and programmes (Gilson et al. 2011). While some HSR adopts the traditional randomised controlled trial (RCT) model, many HSR studies are performed as non-randomised, controlled or non-controlled, prospective or cross-sectional assessments of new or modified health care programmes and strategies (Alliance for Health Policy and Systems Research 2012a). Given the macro focus of HSR, its participants and beneficiaries are often communities, hospitals, and healthcare institutions, as opposed to individuals. Since HSR has its own definitions, methods, and analytic approaches, there is an increasing realisation that HSR raises ethical concerns that differ from those in other types of research; therefore, its ethical review should arguably be tailored to address the features and unique ethical challenges that are particularly salient (though not exclusive) to HSR.
Unfortunately, many (if not most) institutions often use the same review criteria and review processes for HSR studies as for clinical trials, which can potentially create an imprecise application of criteria, confusion on the part of research teams, and unnecessary delays. Currently, it is not clear whether institutions are equipped to adequately address and ethically evaluate HSR in their research ethics committees (REC) or institutional review boards (IRB) (Hyder, Bachani and Rattani 2012a). This is especially true for HSR in low- and middle-income countries (LMICs), where this research often plays a critical role in efforts to strengthen health systems and improve healthcare delivery.
Based on the presumption that certain kinds of ethical issues may be particularly relevant to and salient in HSR, this chapter explores several of these issues. We outline eight areas of ethical relevance that are particularly salient in HSR (though not unique to HSR) that may require special attention during ethics review, especially in LMICs. This set of issues is used to demonstrate only some of the salient features that might arise during HSR – they are not exhaustive and readers are encouraged to add to the list below.
2. Type of research subjects in HSR
A team of engineers in collaboration with public health workers is designing a vehicle crash reduction study on a new technology created to improve driver and passenger safety in auto rickshaws.
The ‘research subjects’ in HSR studies can either be humans or non-humans, each with their own respective ethical challenges. Non-human ‘subjects’ in HSR may include units of interventions, such as motorcycles equipped with safety features in a vehicle crash reduction study, or units of allocation, such as hospitals or schools involved in a study of cost containment budgetary strategies. When reviewing study protocols with these kinds of non-human subjects, it may be challenging for RECs to assess what role various actors should play in the authorisation and implementation of the study. For instance, when the intervention involves safety features for products, how should RECs weigh the interests of the manufacturers as well as consumers of these products? When schools or hospitals are the unit of allocation, how should the teachers and students in these institutions factor into the ethical analysis? For HSR studies, the level of impact goes far beyond individuals, and even research with non-human ‘subjects’ requires consideration of a wide range of stakeholders who may be affected by the investigation. As of yet, there is no standard universal guidance on how to assess and balance the interests of these various parties in HSR studies.
When considering human subjects, HSR studies may target individuals as units of allocation or intervention, though more commonly they are directed at groups of people, as in population-level or cluster-based studies. The emphasis on groups of people as research subjects introduces the ethical challenge of defining the moral status of a group or community as opposed to individual persons. Identifying appropriate representatives or leaders of these groups may also be trying, especially when assessing their legitimacy and source of authority. The principle for respect for communities has been proposed as a means of defining moral worth and protecting the interests of a given community (Weijer and Emanuel 2000). This principle understands the community as a source of values, a social structure that sustains its members and makes decisions for its members. Therefore, RECs concerned with respect will have to think far beyond the typical construction of respect for persons, concerned with individuals (and often focused on consent), and instead adopt the broader interpretation of respect for communities to determine what is required. Their review of such an HSR study will have to be considerate of study community priorities and norms, and determine appropriate levels of engagement with local leadership, which presents further challenges, particularly in pluralistic communities embodying a range of diverse interests. This extension of ‘respect’ from an individual to a population requires further exploration for global health research (Wallwork 2008).
HSR studies involving large populations or groups of people also require a broader interpretation of burdens and benefits and how they may be differentially distributed across various study populations. Similarly, concerns around potential harms need to be reviewed, such as a group reputation potentially affecting individuals – for example, a hospital that is perceived to provide low-quality care may develop a reputation that affects the flow and type of individual patients who visit it. This concept has been described as group harm by which ‘members suffer it by virtue of their identification with or participation in the group’ (Wallwork 2008). This presents complications for RECs in assessing benefits and harms at the community level. The current norm for reviewing research focuses on the individual, but this narrow application of principles at the individual level is not well suited for assessing HSR, in which group-level interventions and impacts require a much broader lens.
3. Informed consent in HSR
With the incidence of malaria growing in a city in Burundi, the Ministry of Health approved an intervention designed by researchers in Sweden to alter standard procedures on malaria prevention and control. Seeking informed consent from individuals would be impractical, and so the research ethics committee and the Ministry of Health opted for group consent from each village.
Consent can be obtained similarly in both HSR and clinical studies in the event of individuals receiving a particular intervention. However, in many HSR studies where interventions are administered to an entire group, the consent process has to involve authorisation at multiple levels, engaging community or institutional leaders as well as affected individuals. In some studies where the intervention can be delivered at an individual level, such as with malaria bed nets, researchers may require consent both from the community leadership and from individuals or households participating in the study. However, other interventions, such as adjustments to standard procedures or drugs offered at public facilities, broadly impact a large number of people for whom obtaining consent would be impracticable. Similarly, individual informed consent may not be possible in HSR studies that focus on area-wide interventions such as spraying for malaria control or building speed bumps for road safety.
In these instances, group consent (or permission) is usually obtained through representatives and often paired with community outreach and education. In some circumstances, participants still have the ability to opt out and can take voluntary actions to exclude themselves from study participation (for example avoiding public facilities or seeking private providers). Since in many types of HSR (and this includes cluster-randomised trials of certain group interventions), individual informed consent may not be obtainable, some have argued that ethics committees have an obligation to ensure that the justification for waiving consent is adequate (Sim and Dawson 2012; Taljaard et al. 2009; Weijer et al. 2011). A trade-off may need to occur in decisions regarding the choice between individualised consent and the ability to conduct valid HSR studies; indeed, if the societal value of the HSR study is high enough, it may allow concerns of greater benefit to outweigh individual autonomy concerns and permit practical studies to move forward (Hutton, Eccles and Grimshaw 2008).
Consent involving groups of people may not be specific to HSR, but it is becoming a ‘norm’ in many HSR studies in LMICs. Questions persist about how groups should be defined and how formal permissions and consent processes are being administered in LMICs. Key to this is addressing the issues of group representation, legitimacy of representatives, authority structures, and coverage of the consent process. For example, concerns have been well documented in the literature around the validity of leaders who give consent for a group; the potential exclusion of vulnerable groups, including women; and the ability of individuals within the groups to opt out (Cassell and Young 2002; Davis 2000; Diallo et al. 2005; Emanuel et al. 2004; Ijsselmuiden and Faden 1992; Weijer and Emanuel 2000). Thus, consent from groups and/or representatives is often necessary, and yet case-by-case discussions are needed to determine whether it is sufficient.
An important issue is that of defining subjects for consent irrespective of whether they are individuals or groups. For example, common requirements for informed consent may not apply to many HSR studies. In the United States (US), informed consent applies to ‘research subjects’ defined as those actively involved in research (Protection of Human Subjects Research 2009). However, in circumstances where no ‘direct’ subjects are identifiable, as is the case in many HSR studies, is such a requirement for consent appropriate? For some HSR projects, the lack of identifiable human subjects and aggregation of data for analysis may lead IRBs and RECs to designate these studies as ‘non-human subjects research’, which would exempt them from the consent requirements specified in the US regulations.
Furthermore, the US federal regulations also waive consent when studies fulfil four conditions: the research is no more than ‘minimal’ risk; the rights/welfare of subjects are not adversely affected; the research cannot be carried out in other ways; and the subjects will be debriefed (when appropriate) (Protection of Human Subjects Research 2009). Applying these conditions to HSR studies would mean that many of them could obtain a waiver of consent. The nature of group interventions, which often lack identifiable direct subjects and are built into health systems responses, makes HSR studies amenable to such waivers. Appropriate ways to handle consent, authorisation, and authentic community engagement for the group-level interventions characteristic of HSR remain a challenging area for investigators and ethical review boards.
4. Units of intervention and observation in HSR
Health systems researchers have designed a study to provide local taxi drivers with incentive payments to transport pregnant women to the clinic for antenatal care and delivery. Although the intervention is administered to the taxi drivers, the outcomes data are being collected on mothers and infants within the intervention community.
A hospital has recently decided to introduce quality assessment activities for infection control by teams of health providers. However, the hospital plans to collect outcome data on hospital-acquired infections among patients admitted to the hospital.
Unlike typical clinical research, in which interventions are often administered to individuals who are then observed for potential effects, HSR often targets a unit of intervention at a more macro level and then assesses its impact at a more micro unit of observation. In other words, the units of intervention and observation are often not the same. In the first example (above), the local taxi drivers were the unit of intervention, while the outcomes data were collected on mothers and infants, which would be the unit of data collection/observation.
The use of different units for intervention and observation creates a new set of challenges for ethical review. One issue is defining and assessing risks and benefits for multiple levels of research participants: the research subjects who are the unit of intervention (sometimes called primary), and other research subjects from whom data are collected (sometimes called secondary). In the second example (above), the teams of health providers are the unit of intervention, while the outcome data on hospital-acquired infections are collected from patients, who constitute the unit of observation. Furthermore, the hospital staff (doctors, nurses) and patients would all be research participants. How should RECs assess the study with appropriate regard for all groups of research subjects whose well-being can be affected by the intervention?
This also raises important questions about the targets and nature of informed consent: who should be involved in the consent process, who should decide when individual consent of some (secondary) participants is impracticable, and what should be the standards for informing them of the study? If data collection involves a measurable burden for some participants, such as additional interviews, does this incremental burden necessitate greater participation in the consent process? It is clear that having different units of intervention and data collection presents unique challenges for how practical matters of risk–benefit analysis and informed consent are carried out in HSR studies.
5. Risk assessment in HSR
To reduce the incidence and prevalence of smoking in Mumbai, India, a health systems research group wants to use social media as its intervention for smoking prevention. A member of the research group is concerned that a social media campaign against smoking could stigmatise current smokers, or that the message could be inaccurately modified somewhere in the communication chain, proliferating harmful misinformation.
A team of researchers at a prominent university in Uganda plan on designing a programme to provide conditional cash transfers as incentives to pregnant women to deliver their children in a hospital, arguing that institutional newborn delivery results in better outcomes. A health economist at the university advises that the incentives for women to deliver in a facility could expose participants to a variety of harms in places where home birthing is the norm, not to mention the potential of the cash transfers to be a more macro threat by distorting local economic markets.
Risk assessment is considered an area with serious practical and ethical challenges for HSR in many contexts (Peters et al. 2009). Traditional risk assessment for clinical research studies focuses on physical risks to participants, with some additional attention to psychological and social risks associated with participation. However, the types of risks associated with HSR studies can be quite different from clinical research, often with the largest risks manifesting as social, financial, or communal harms. While the use of sound and appropriate designs to minimise risk still applies in HSR studies, different approaches might be needed for both assessment and mitigation. As the above examples show, identifying and quantifying risks in an HSR study using social media for smoking prevention in a population, or using financial incentives (conditional cash transfers) to promote institutional newborn deliveries, requires a much more in-depth understanding of the underlying social conditions and system-level factors.
The issue of risks also relates back to appropriate modes of consent. In typical clinical research, participants are directly informed of the potential risks, and by consenting they express their willingness to accept these risks as part of their participation. There are many HSR designs in which individuals may not have this opportunity to directly consent to the exposure to risks associated with the study. Further concerns arise when potential risk levels vary across subsets of the population group, especially when the local leadership granting authorisation for the research may not represent these subgroups. One could imagine communities in which a practice under investigation might go against the norms of a religious or cultural minority, or in which some study objectives may disproportionately burden the extremely poor. When risks are evaluated at an aggregate level across the population and marginalised groups are not represented in decision-making, the potential for undue burden and disregard for these subgroups’ values has clear ethical implications related to distributive justice and respect for persons.
Although many HSR studies are typically classified as low risk, a risk–benefit analysis remains important and requires broader interpretation of how harms may result. Some of these present new challenges in defining ‘minimal risk’, since knowledge of negative group characteristics might pose social concerns in how a health system treats members of that group. Moreover, defining who is at risk, inclusive of all types of research subjects, varies in HSR and may include several stakeholders involved in a study, such as providers, recipients, beneficiaries, observers, institutions, and tribes. Considerations for risk assessment therefore have to go well beyond a simple focus on individual participant concerns in HSR studies. Additionally, monitoring systems would need to be set up to report adverse consequences resulting from the research so that these harms are appropriately captured during implementation.
6. Defining benefits, beneficiaries, and fair benefits in HSR
A researcher in Bangladesh has designed a study that examines health systems issues within his city. His study protocol calls for referring patients/participants to their local facility for receipt of appropriate care, but due to health systems inefficiencies, the quality of these facilities or the standard of care available may not be equivalent. He knows that, although his application to the research ethics committee may make this look like equivalent treatment of patients using different facilities, variation between those facilities will mean that in reality there may be substantial disparities.
Research subjects in LMICs do not always have access to the same standard of care enjoyed by subjects in high-income countries. Establishing a standard of care becomes difficult with varying types of health systems that are often the context (and the object) of HSR studies. Hence, notions of ‘best care available’, which have been promulgated in research ethics guidelines, may not be relevant if they are applied to LMIC health systems. Arguably, the very concept of standard of care continues to remain ambiguous (Hyder and Dawson 2005; London 2000). This ambiguity results in challenges in assessing the implications of opposing standard of care arguments, in recognising important differences in their supporting rationales, and even in identifying the major source of disagreement (London 2000). For example, others have attempted to address the standard of care debate from a health systems perspective, arguing that the structure and efficiency of national health systems have been neglected in arguments about the standard of care in research (Hyder and Dawson 2005).
Addressing this global variability is a challenge, especially in elucidating the benefits, the beneficiaries, and the range of responsibilities for offering benefits to participants in global health systems research, particularly in satisfying the requirement that research provide social value. One ongoing debate is whether the individual participants in a study or the communities from which they are drawn should be counted as the beneficiary, with implications for what is due to each during the course of the study (for example, benefits like capacity building) and after the trial concludes (for example, post-trial access and benefit sharing) (Lairumbi et al. 2011). This conversation reflects the current bias in the research ethics literature towards considering the individuals enrolled in studies as the primary participants and beneficiaries. However, because the goals of HSR are to make improvements at the systems level, and the units of intervention in HSR are groups, with individuals as indirect beneficiaries, this dialogue about what is due to individuals versus broader communities is all the more important for HSR. Few guidelines discussing beneficiaries of research include the ‘larger community/host country’, further highlighting how one of the main beneficiaries of HSR may be under-recognised when these guidelines are applied to the review of HSR studies.
Several international and national ethics guidelines support provision of diverse types of benefits during and after studies, yet many of the benefits in HSR may be left out, such as improvements in healthcare delivery systems, actual provision of treatment, human and material capacity building, and health systems strengthening. It is also important to regard more equitable distribution of existing resources as benefits in HSR; this means that addressing inequities in health provision is another form of benefit often considered in HSR studies, especially those that work on larger communities or countries. As a result, it appears that commonly used international and some national research ethics guidelines might not be addressing the forms and types of benefits in HSR or the beneficiaries of HSR and, thus, their usage by ethics committees poses challenges for review of HSR studies.
7. Nature of interventions in HSR
The most common cause of newborn mortality is preterm birth. A local community in a low-income country has been using the ‘kangaroo mother care’ intervention for preterm infants weighing less than 2kg, which includes skin-to-skin contact, support of the relationship between mother and child, and exclusive and frequent breastfeeding. This form of care has been shown to reduce infant mortality in some hospital-based settings in low- and middle-income countries. A community health worker and researcher wishes to use this population-level experience as proof-of-concept for a large-scale trial testing this new method of delivering care in Zambia.
Ethical challenges that are intervention-specific in HSR vary from concerns around scientific rationale to distribution of benefits to sustainability issues. For instance, in the case of HSR testing new delivery methods (for example for child health), one could question whether there is appropriate evidence to support the testing of a new approach or challenge the need for innovation over continuing with existing delivery systems (such as community health workers versus facility-based delivery). Where there is not much prior evidence on an approach, are theory and hypothesis enough to justify testing the intervention, or is there some population-level experience that should be required to demonstrate proof-of-concept prior to a larger-scale trial of the new method as noted in the example above?
This is of particular concern in LMICs, since their need for novel interventions to deliver services efficiently makes them arguably ideal candidates for testing health systems innovations, and if there is meagre evidence supporting the effectiveness of interventions, these resource-constrained settings may bear a disproportionate burden in the generation of global health systems innovations. Implicit in this concern are (1) the obligation of researchers to not impose undue harm upon populations, which may occur in the absence of sufficient evidence (for example distortion of a local health market), and (2) issues of distributive justice, in which disadvantaged communities assume the risks of research on interventions that will ultimately benefit more advantaged populations – an increasing concern as more high-income nations adopt innovative models from developing country settings (Fry et al. 2011).
Similarly, another ethical concern is the potential for harm with new health delivery methods and associated safety issues. The kinds of harms resulting from HSR tend to be more obscure, downstream, and harder to quantify than those typically associated with a clinical study. In order for RECs to adequately assess the potential harms associated with certain types of HSR, they will have to rely on the existing evidence base of the approach, with a good understanding of the history of a particular delivery method and its success or failure with similar types of health interventions. However, for many novel approaches, there might be insufficient prior evidence available to inform the ethical review process.
A critical concern with HSR is that it can also blur the distinction between research and non-research processes. For example, it is important to distinguish between quality improvement (QI) projects, which are meant to improve service delivery and process performance, and research, which is meant to produce generalisable or transferable knowledge. Though the former is typically exempt from ethical review, pertinent issues may overlap for both QI and HSR, regardless of what may be legally required vis-à-vis regulations. From a practical standpoint, this range and variability can make it difficult for RECs to gain experience reviewing certain kinds of HSR and to apply recommendations consistently. Compared with clinical trials, which often share common features and have more clearly identified areas for ethical consideration established in the literature, HSR may present unique challenges for review committees with each new protocol. This is especially the case when HSR addresses areas in which the REC does not have much experience, which can affect the quality of the review and further strain the limited capacity of RECs to assess study proposals thoroughly and efficiently.
Finally, in LMICs where a lack of access to health interventions exists, ethical concerns around future availability become salient. Will the community involved in the trial continue to have access to beneficial services provided as part of the study? While the issue of post-trial access is not unique to HSR and has been widely discussed in research ethics literature, this issue is of particular import for HSR given the well-documented lag in, or absence of, research-to-policy translation (Grady 2005; Lavery 2004). What impact might the temporary change in health delivery mechanisms or available services have (during a study) on the community, and could this disruption in the status quo have net negative consequences for the population? At the systems level, decisions to adopt new approaches for providing health to the population often weigh costs and benefits at the aggregate level, so even interventions that show improvements for those involved in the study may not be taken up in the end if they do not prove cost-effective. These concerns must play a role in how local and national health sectors analyse and respond to the results of an HSR study and raise questions about what obligations exist for research institutions and funders conducting such work.
In sum, HSR is fundamentally about translating efficacious interventions into effective practice at the population level. As a result, the interventions under investigation in HSR can vary greatly, as can their methods of delivery, resulting in ethical issues quite specific to a given study. These interventions might be health messages, incentives, measurement tools, performance guides, intervention packages, financial subsidies, or delivery systems. Therefore, typical interventions in HSR can involve new methods of delivery or dissemination of existing or proven interventions, novel approaches for creating demand for efficacious interventions, new packaging of two or more interventions for enhanced programme effectiveness, or knowledge generation on costs or cost-effectiveness for policy impact. This diversity in the intensity, invasiveness, and duration of implementation requires a very good understanding of the intervention in each HSR study in order to define relevant ethical issues.
8. Appropriate controls and comparisons in HSR
A study to evaluate the efficacy of a new health safety curriculum in local medical centres is underway in Dodoma, Tanzania. Participants from the intervention group share their knowledge with members of the control group via social networks and staff transfers. The control group’s integrity is effectively compromised; the extent of the ‘contamination’ is difficult to assess and threatens to undermine the interpretation of the magnitude of the findings.
The nature of control groups can vary in HSR studies, and the ways groups are compared are often not consistent with common clinical research study designs, such as placebo-controlled studies, where the ‘gold standard’ involves comparing an ‘intervention’ group with a ‘non-intervention’ group. For instance, if an HSR study is testing a new delivery method for a proven intervention, then the comparison group may have an older delivery method, or if an HSR study is testing a new package of existing interventions (say A and B together), then the comparison group may receive them separately (either A or B alone). The selection of these comparison locations is also often not done randomly, but rather by systematic matching or even geographical or logistical convenience. As a result, comparison groups in HSR studies pose challenges to the ethical review process when these control groups receive different types of interventions; and there is a wide variation of possibilities in what might constitute comparison groups.
HSR presents challenges for establishing appropriate comparison groups. Compared with clinical trials, it is more difficult in HSR studies to control for the variety of extraneous variables that could affect results. This is because HSR often involves interventions that take place within existing, real-world settings, while clinical trials occur in highly controlled experimental settings. Therefore, many (especially low-cost) HSR studies use comparators of convenience, such as data from similar districts or cities, or quasi-experimental pre–post designs, often applying complex statistical techniques in an attempt to account for non-equivalence between groups or temporal confounders. In order to ensure the internal validity of these studies – a necessary ethical requirement of all research – RECs should be equipped to evaluate the techniques used in HSR to determine whether studies have adequately controlled for the challenges of imperfect comparison groups (Emanuel et al. 2004). This will have implications for the future applicability of the study findings and their social value, in addition to ensuring respect for the communities participating in the HSR study.
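One of the simplest of these quasi-experimental adjustments is a difference-in-differences comparison: the change over time in the intervention group is contrasted with the change in the comparison group, so that shared temporal trends are netted out. The sketch below is purely illustrative; the function name and the figures are invented for this example and are not drawn from any study cited here.

```python
# Illustrative sketch of a difference-in-differences adjustment.
# All names and numbers are hypothetical, not from the chapter.

def diff_in_diff(pre_treat, post_treat, pre_ctrl, post_ctrl):
    """Change in the intervention group minus change in the comparison
    group; shared background trends cancel out of the estimate."""
    return (post_treat - pre_treat) - (post_ctrl - pre_ctrl)

# e.g. facility-delivery rates (%) before and after a programme,
# in an intervention district and a matched comparison district
effect = diff_in_diff(pre_treat=40.0, post_treat=55.0,
                      pre_ctrl=42.0, post_ctrl=47.0)
print(effect)  # the comparison district's 5-point rise is netted out
```

Even this minimal arithmetic makes clear what an REC must judge: the estimate is only as credible as the assumption that the comparison group's trend is what the intervention group's trend would have been without the programme.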
In addition to determining who should serve as the control, there is also the question of what should be provided to the control groups. Although ethical debates concerning the appropriate use of placebos versus active controls are not exclusive to HSR and have been ongoing in the literature for many years surrounding both clinical and implementation trials, these concerns are particularly acute in the context of HSR in LMICs (Emanuel, Wendler and Grady 2000; Emanuel and Miller 2001; Freedman 1990; Miller and Brody 2002). If there is little evidence available concerning the effectiveness of current systems of practice, it becomes difficult to choose what to test a new health system approach or combination of approaches against (if anything), whereas in clinical investigations testing equivalency or superiority, there is often a much more robust evidence base about the current standard of practice. Furthermore, where an HSR study seeks to assess packages of multiple beneficial interventions that have potentially synergistic effects, what subset(s) of these interventions should be provided to the control group(s)? If the researchers are seeking to find the most cost-effective package of services to produce the desired health impact, they must balance their obligation to provide existing beneficial interventions to their participants against their aim to produce information for evidence-based policy that will ultimately provide the greatest societal benefit.
Another relevant factor for many cluster-based studies arises when they use a staged introduction (or stepped wedge design), in which the intervention is rolled out sequentially to participating groups or clusters so that even the control groups receive the intervention by the end of the study. While staged roll-out is often considered more ethically acceptable than providing no intervention to control groups, there is still the risk that the control communities will feel unfairly disadvantaged. The design can also pose validity threats, due to varying external conditions over time or contamination from neighbouring clusters via information diffusion, and may raise issues of justice and fairness for the clusters receiving the intervention so much later than their counterparts (Brown and Lilford 2006). These types of specific issues must be understood within the overall aim of HSR – to inform real-world practice and produce social value. In the interest of good science, RECs must be better equipped to evaluate these options and determine whether HSR studies have adequately considered appropriate comparison groups (Emanuel et al. 2004).
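The structure of a stepped wedge roll-out can be made concrete with a small sketch. The function below (a hypothetical illustration, not from the chapter) builds the familiar cluster-by-period schedule: every cluster starts in the control condition and crosses over one period after its predecessor, so all clusters have received the intervention by the final period.

```python
# Hypothetical sketch of a stepped-wedge roll-out schedule:
# rows are clusters, columns are time periods,
# 0 = control condition, 1 = intervention condition.

def stepped_wedge_schedule(n_clusters, n_periods):
    """One row per cluster; cluster i crosses over at period i + 1,
    so period 0 is all-control and the last period is all-intervention."""
    schedule = []
    for cluster in range(n_clusters):
        crossover = cluster + 1
        schedule.append([1 if period >= crossover else 0
                         for period in range(n_periods)])
    return schedule

for row in stepped_wedge_schedule(4, 5):
    print(row)
# the last cluster spends four periods as a control before crossing over
```

The printed ‘staircase’ shows where the fairness concern noted above comes from: the bottom row waits in the control condition for almost the whole study, while the top row is treated from the second period onward.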
9. Inclusion of vulnerable groups in HSR
A Malawi HIV clinic has created a programme to incentivise HIV testing and collection of test results. Recently, self-identifying gay and lesbian individuals, a highly stigmatised and vulnerable group in Malawi, were seen entering the clinic and were later beaten by an unidentified mob.
As the volume of research in LMICs increases, the role (and protection) of highly vulnerable subgroups (for example, women or stigmatised groups) within populations that are already poor or generally vulnerable becomes a serious challenge. These especially disadvantaged groups are often left out of general improvements in healthcare due to lack of access or lack of power, and become further marginalised. For instance, in locales where freedom of movement is restricted for women, their access to basic health services may be limited. Therefore, improvements in the delivery of services at health centres may not translate into benefits for this subgroup. Furthermore, as seen in the above example, interventions aimed at stimulating demand for services may overlook the social or cultural risks to individuals who pursue those services. Thus, attending to particular vulnerable subgroups who face acute risks, and whose position may not be represented in many models of group authorisation, is an important consideration that needs special attention when evaluating the risks and benefits associated with HSR. It is uncertain how well RECs in general are equipped to address the specific concerns that these highly vulnerable subgroups pose in a study. This is an increasing challenge in addressing the ethical issues of conducting much-needed HSR in LMICs and remains largely unexplored.
HSR often involves vulnerable populations, especially in LMICs, where the general population’s impoverished condition may already place them at historical disadvantage. This type of vulnerability raises ethical concerns around risks of exploitation, coercion, and abuse. The International Ethical Guidelines for Biomedical Research Involving Human Subjects of the Council for International Organizations of Medical Sciences (CIOMS) specifically state: ‘Special justification is required for inviting vulnerable individuals to serve as research subjects and, if they are selected, the means of protecting their rights and welfare must be strictly applied (Guideline 13)’ (Council for International Organizations of Medical Sciences 2002).
However, many HSR studies, especially in LMICs, are in fact conducted with the primary aim of reaching vulnerable groups and providing access to existing or proven interventions for those communities. When such groups are the focus, assessing the risks to these vulnerable populations remains a challenge, since it is also ethically important to trial new ways of delivering and accessing care in the same populations for the research to have relevance. Paternalistic protection of vulnerable groups from HSR might compromise the opportunity to find solutions to some of the most important health system challenges. One characteristic challenge in the ethical review process of HSR is identifying when it is acceptable to pilot health systems innovations intended for broad scale-up among particularly vulnerable groups, who may realise the most benefit but who may, conversely, be subject to further harm as systems researchers explore new techniques.
Vulnerable populations bear the worst burden of ill health due to system weaknesses, reinforcing the need to emphasise the larger notion of fairness (Daniels 2006). Fairness is an important consideration in the ethical review of HSR, especially as it relates to communities and populations that may become vulnerable, not because of inherent weakness, but because of the context in which they are operating (Bamford 2014; Hurst 2014; Hyder et al. 2014). Thus, on the one hand, HSR responds to such lack of fairness by trying to identify strategies to reduce inequalities; at the same time, the conduct of HSR can itself affect fairness. Therefore, RECs need to evaluate this potential impact for each proposed study.
There are several limitations to the conceptual exploration above that are worth considering. First, the definition of HSR varies depending on the type of research, location, or source considered, which makes consideration of this field challenging. However, a unified definition is necessary, and global meetings are now focusing on further defining and enhancing the field (Alliance for Health Policy and Systems Research 2012a; 2012b; Global Forum for Health Research 2004). Second, there are activities that often fall into the grey zone between research and non-research and that can be considered part of the HSR agenda in LMICs. Two examples include (1) quality assurance methods, which are used for performance management and research; data may be collected to inform practice only or become part of formal research activities in HSR (Heiby 1993, 1998; Reinke 1995; Zeitz et al. 1993), and (2) public health surveillance activities, which are not traditionally considered research but may be part of HSR, for example where a new disease surveillance system is being pilot tested for the first time (Lee and Thacker 2011; Lee, Heilig and White 2012). These types of approaches, when used in HSR, can add further complexity to ethics considerations. As a result, the mutually exclusive categorisation of HSR as either research or non-research by ethics committees and current guidelines is a source of challenge for the field.
Third, a discussion focused on teasing out differences between HSR and other research tends to downplay the many similarities across all types of health research; in many instances the differences are less stark and the similarities more common. However, a conceptual exploration has to rely on some real-world generalisations that have merit even though specific exceptions can be identified.
Fourth, the diversity of HSR extends beyond the typical examples that have been provided, and can include the conduct of long-term HSR in the same site, such as the use of HSR in demographic surveillance sites across LMICs. Such longitudinal and often long-term HSR (years or decades in the same sites) can lead to different types of ethics issues associated with more dynamic concerns (Hyder et al. 2012b). Important conversations and areas for further exploration for HSR ethics include vulnerable populations, big data, ancillary care obligations, distribution of responsibility, and the potential (and possible moral obligation) of health systems research to help reduce health disparities between and within countries (Bamford 2014; Dereli et al. 2014; Gupta 2014; Hurst 2014; Hyder et al. 2014; Olson 2014; Pratt 2014; Rennie 2014).
Since health systems research, especially in LMICs, is substantively different from other types of research, with its own set of objectives, approaches, methods, and analytic goals, it warrants special or nuanced consideration in its ethical review. Some ethical concerns may be more salient in HSR than in the review of other types of research, such as clinical research (Table 1). An ethics review of HSR that applies exactly the same criteria and ethical analysis as for clinical research may overemphasise features that are not particularly relevant in HSR, and may not adequately capture the distinctive kinds of benefits and risks present in HSR. Untailored review can therefore result not only in practical inefficiencies, but also in unjustified research activities and inadequate protection of participating communities and individuals.
Ethical review of HSR does not always fit the existing review paradigm born of the typical clinical research setting (Hyder et al. 2014; London et al. 2012). Additionally, more exploration is needed to understand the possible breadth of ethics issues that may apply to HSR in various contexts, and there is much to be learned from overlapping disciplines with particular relevance to larger HSR concerns, such as health systems transformations (Daniels 2006). HSR studies ought to reflect fair terms of social cooperation between communities and researchers, be relevant to the health needs of the host communities, and have a favourable risk-benefit ratio (Emanuel et al. 2004). Such responsiveness to host communities helps form collaborative partnerships in which all stakeholders (participants, researchers, brokers) are considered moral equals. These concerns are important for HSR, as research resources themselves can have a direct impact on the distribution of opportunities in a community related to jobs, training, placement of facilities or site selection, with implications for distributive justice and fair equality of opportunity. This discussion can even be extended to include certain public health ethics obligations discussed in the literature, such as social duty, reciprocity, solidarity, stewardship, trust, and accountability (Baum et al. 2007; Swain, Burns and Etkind 2008; Thompson et al. 2006; Upshur 2002).
HSR is necessary to ensure health systems strengthening, quality of care, and evidence-informed public policymaking. HSR researchers must carefully define their intent and goals and openly clarify the values that may influence the premises and design of their protocols. Appropriate ethical review of HSR requires a deeper understanding of how to apply traditional ethics review criteria in ways that are relevant to the features of HSR, and further guidance for researchers and reviewers addressing the broader issues arising in the context of systems-level interventions.
Some questions to promote thinking about ethics issues
Take one of the eight ethical issues identified above and list three reasons why you: (1) agree that it is different for health systems research; and (2) think it is similar to other types of research.
Probe: Then list three ways in which you feel this specific ethics issue can be addressed by health systems researchers.
You are about to start a health systems research study in a district of Uganda with 30 villages. The study will train community health workers in 15 villages on child health in year 1 and then provide the same training to workers in the other 15 villages in year 2. The study will monitor childhood diseases in all villages for two years. Describe three ethical concerns you might have in this study. Who do you think should give consent for the study in the district?
Probe: What risks is the population being exposed to and how would you manage them if you were the study director?
What other ethical concerns (apart from the eight above) can you think of that may be particular to health systems research that differentiate it from clinical research?
Probe: And do you know of other ethical frameworks within public health that might address these other ethical concerns?
What counts as a benefit in health systems research and which benefits are due to communities versus their members?
Probe: How can this guide research designs and research ethics committees’ decisions about the obligations of health systems researchers to participating groups and individuals during and after a trial?
Can you identify actual health systems research studies wherein ethical considerations may have been overlooked by the current review process?
Probe: Do a PubMed search and review some studies.
Table 1: Proposed ethics considerations of special relevance in health systems research in low- and middle-income countries
| Ethics issue | Application to health systems research in low- and middle-income countries |
| --- | --- |
| Nature of interventions | System-based, such as delivery systems, financing, human resources, or policies |
| Type of subjects | Groups of people or communities |
| Units of intervention and observation | Often different, such that the intervention is distributed to one group and measurement is based on another |
| Informed consent | Group consent and permissions needed (in addition to individual consent) |
| Comparison groups | Comparators often receive different interventions or are observed in the real world |
| Risk assessment | Broad range and different types of minimal risk – social, communal |
| Inclusion of vulnerable groups | The focus of HSR and proposed beneficiaries |
| Benefits assessment | Expanded definition, including training, infrastructure, health systems strengthening |
Alliance for Health Policy and Systems Research (2012a) Health Policy and Systems Research: A Methodology Reader (ed. L. Gilson), Geneva, Switzerland: World Health Organization, www.who.int/alliance-hpsr/alliancehpsr.reader.pdf (accessed 12 March 2015)
Alliance for Health Policy and Systems Research (2012b) ‘Global Symposium on Health Systems Research’, World Health Organization, www.who.int/alliance-hpsr/hsr-symposium/en/ (accessed 12 March 2015)
Bamford, R. (2014) ‘Ethical Review of Health Systems Research: Vulnerability and the Need for Philosophy in Research Ethics’, American Journal of Bioethics 14.2: 38–40
Baum, N.M.; Gollust, S.E.; Goold, S.D. and Jacobson, P.D. (2007) ‘Looking Ahead: Addressing Ethical Challenges in Public Health Practice’, Journal of Law, Medicine, and Ethics 35.4: 657–667
Brown, C.A. and Lilford, R.J. (2006) ‘The Stepped Wedge Trial Design: A Systematic Review’, BioMed Central Medical Research Methodology 6: 54
Cassell, J. and Young, A. (2002) ‘Why We Should Not Seek Individual Informed Consent for Participation in Health Services Research’, Journal of Medical Ethics 28: 313–17
Council for International Organizations of Medical Sciences (2002) International Ethical Guidelines for Biomedical Research Involving Human Subjects, Geneva, Switzerland: Council for International Organizations of Medical Sciences, www.cioms.ch/publications/layout_guide2002.pdf (accessed 12 March 2015)
Daniels, N. (2006) ‘Toward Ethical Review of Health System Transformations’, American Journal of Public Health 96.3: 447–51
Davis, D.S. (2000) ‘Groups, Communities, and Contested Identities in Genetic Research’, Hastings Center Report 30.6: 38–45
Dereli, T.; Coşkun, Y.; Kolker, E.; Güner, O.; Ağırbaşlı, M. and Ozdemir, V. (2014) ‘Big Data and Ethics Review for Health Systems Research in LMICs: Understanding Risk, Uncertainty and Ignorance—and Catching the Black Swans?’, American Journal of Bioethics 14.2: 48–50
Diallo, D.A.; Doumbo, O.K.; Plowe, C.V.; Wellems, T.E.; Emanuel, E.J. and Hurst, S.A. (2005) ‘Community Permission for Medical Research in Developing Countries’, Clinical Infectious Diseases 41.2: 255–59
Emanuel, E.J. and Miller, F.G. (2001) ‘The Ethics of Placebo Controlled Trials—A Middle Ground’, New England Journal of Medicine 345.12: 915–19
Emanuel, E.J.; Wendler, D. and Grady, C. (2000) ‘What Makes Clinical Research Ethical?’, Journal of the American Medical Association 283.20: 2701–11
Emanuel, E.J.; Wendler, D.; Killen, J. and Grady, C. (2004) ‘What Makes Clinical Research in Developing Countries Ethical? The Benchmarks of Ethical Research’, Journal of Infectious Diseases 189.5: 930–37
Freedman, B. (1990) ‘Placebo-controlled Trials and the Logic of Clinical Purpose’, IRB: Ethics and Human Research 12.6: 1–6
Fry, C.V.; Marjanovic, S.; Yaqub, O. and Chataway, J. (2011) Health Innovation Transfer from South to North, Santa Monica: RAND Corporation, www.rand.org/pubs/documented_briefings/DB616.html (accessed 12 March 2015)
Gilson, L.; Hanson, K.; Sheikh, K.; Agyepong, I.A.; Ssengooba, F. and Bennett, S. (2011) ‘Building the Field of Health Policy and Systems Research: Social Science Matters’, PLoS Medicine 8.8: e1001079
Global Forum for Health Research (2004) Strengthening Health Systems: The Role and Promise of Policy and Systems Research, Geneva, Switzerland: Alliance for Health Policy and Systems Research
Grady, C. (2005) ‘The Challenge of Assuring Continued Post-Trial Access to Beneficial Treatment’, Yale Journal of Health Policy, Law, and Ethics 5.1: 425–35
Gupta, S. (2014) ‘Ethical Review of Health Systems Research in Low- and Middle-income Countries: Research Treatment Distinction and Intercultural Issues’, American Journal of Bioethics 14.2: 44–46
Heiby, J. (1998) ‘Quality Assurance and Supervision Systems’, Quality Assurance Brief 7.1: 1–3
Heiby, J. (1993) ‘Project Officer’s Perspective: Quality Assurance as a Management Tool’, Quality Assurance Brief 2.1: 1–4
Hurst, S.A. (2014) ‘Simplicity as Progress: Implications for Fairness in Research with Human Participants’, American Journal of Bioethics 14.2: 40–41
Hutton, J.L.; Eccles, M.P. and Grimshaw, J.M. (2008) ‘Ethical Issues in Implementation Research: A Discussion of the Problems in Achieving Informed Consent’, Implementation Science 3: 52
Hyder, A.A. and Dawson, L. (2005) ‘Defining Standard of Care in the Developing World: The Intersection of International Research Ethics and Health Systems Analysis’, Developing World Bioethics 5.2: 142–52
Hyder, A.A.; Bachani, A. and Rattani, A. (2012a) ‘Ethics of Health Systems Research—A Scoping Study’, Working Paper (available from authors), Geneva: Alliance for Health Policy & Systems Research
Hyder, A.A.; Krubiner, C.B.; Bloom, G. and Bhuiya, A. (2012b) ‘Exploring the Ethics of Long-Term Research Engagement with Communities in Low- and Middle-income Countries’, Public Health Ethics 5.3: 252–62
Hyder, A.A.; Rattani, A.; Krubiner, C.; Bachani, A.M. and Tran, N.T. (2014) ‘Ethical Review of Health Systems Research in Low- and Middle-income Countries: A Conceptual Exploration’, American Journal of Bioethics 14.2: 28–37
Ijsselmuiden, C.B. and Faden, R.R. (1992) ‘Research and Informed Consent in Africa: Another Look’, New England Journal of Medicine 326.12: 830–33
Lairumbi, G.M.; Michael, P.; Fitzpatrick, R. and English, M.C. (2011) ‘Ethics in Practice: The State of the Debate on Promoting the Social Value of Global Health Research in Resource Poor Settings Particularly Africa’, BioMed Central Medical Ethics 12: 22
Lavery, J.V. (2004) ‘Putting International Research Ethics Guidelines to Work for the Benefit of Developing Countries’, Yale Journal of Health Policy, Law, and Ethics 4.2: 319–36
Lee, L.M. and Thacker, S.B. (2011) ‘Public Health Surveillance and Knowing about Health in the Context of Growing Sources of Health Data’, American Journal of Preventative Medicine 41.6: 636–40
Lee, L.M.; Heilig, C.M. and White, A. (2012) ‘Ethical Justification for Conducting Public Health Surveillance without Patient Consent’, American Journal of Public Health 102.1: 38–44
London, A.J. (2000) ‘The Ambiguity and the Exigency: Clarifying “Standard Of Care” Arguments in International Research’, Journal of Medicine and Philosophy 25.4: 379–97
London, A.J.; Borasky, D.A. Jr.; Bhan, A. and Ethics Working Group of the HIV Prevention Trials Network (2012) ‘Improving Ethical Review of Research Involving Incentives for Health Promotion’, PLoS Medicine 9.3: e1001193
Miller, F.G. and Brody, H. (2002) ‘What Makes Placebo-controlled Trials Unethical?’, American Journal of Bioethics 2.2: 3–9
Olson, N.W. (2014) ‘Conceptualizing Ancillary Care Obligations in Health Systems Research’, American Journal of Bioethics 14.2: 46–47
Peters, D.H.; El-Saharty, S.; Siadat, B.; Janovsky, K. and Vujicic, M. (2009) Improving Health Service Delivery in Developing Countries, Washington, DC: International Bank for Reconstruction and Development/The World Bank
Pratt, B. (2014) ‘Connecting Health Systems Research Ethics to a Broader Health Equity Agenda’, American Journal of Bioethics 14.2: 1–3
Protection of Human Subjects Research (2009) Title 45 Code of Federal Regulations Part 46, Washington, DC: US Department of Health and Human Services, www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.html#46.102 (accessed 12 March 2015)
Reinke, W.A. (1995) ‘Quality Management in Managed Care’, Health Care Manager 2.1: 79–88
Rennie, S. (2014) ‘Tinkering with the Poor’, American Journal of Bioethics 14.2: 43–44
Sim, J. and Dawson, A. (2012) ‘Informed Consent and Cluster-randomized Trials’, American Journal of Public Health 102.3: 480–85
Swain, G.R.; Burns, K.A. and Etkind, P. (2008) ‘Preparedness: Medical Ethics Versus Public Health Ethics’, Journal of Public Health Management and Practice 14.4: 354–57
Taljaard, M.; Weijer, C.; Grimshaw, J.M.; Belle Brown, J.; Binik, A.; Boruch, R.; Brehaut, J.C.; Chaudhry, S.H.; Eccles, M.P.; McRae, A.; Saginur, R.; Zwarenstein, M. and Donner, A. (2009) ‘Ethical and Policy Issues in Cluster Randomized Trials: Rationale and Design of a Mixed Methods Research Study’, Trials 10: 61
Taylor, C.E. (1984) The Uses of Health Systems Research, Geneva: World Health Organization, http://whqlibdoc.who.int/php/WHO_PHP_78.pdf (accessed 12 March 2015)
Thompson, A.K.; Faith, K.; Gibson, J.L. and Upshur, R.E.G. (2006) ‘Pandemic Influenza Preparedness: An Ethical Framework to Guide Decision-Making’, BioMed Central Medical Ethics 7: E12
Upshur, R.E.G. (2002) ‘Principles for the Justification of Public Health Intervention’, Canadian Journal of Public Health 93.2: 101–03
Wallwork, E. (2008) ‘Ethical Analysis of Research Partnerships with Communities’, Kennedy Institute of Ethics Journal 18.1: 57–85
Weijer, C. and Emanuel, E.J. (2000) ‘Ethics: Protecting Communities in Biomedical Research’, Science 289.5482: 1142–44
Weijer, C.; Grimshaw, J.M.; Taljaard, M.; Binik, A.; Boruch, R.; Brehaut, J.C.; Donner, A.; Eccles, M.P.; Gallo, A.; McRae, A.D.; Saginur, R. and Zwarenstein, M. (2011) ‘Ethical Issues Posed by Cluster Randomized Trials in Health Research’, Trials 12: 100
World Health Organization (2009) ‘Scaling up Research and Learning for Health Systems: Now is the Time’, report of a High Level Task Force (presented and endorsed at the Global Ministerial Forum on Research for Health 2008), Bamako, Mali, 17–19 November, www.who.int/rpc/publications/scaling_up_research/en/index.html (accessed 12 March 2015)
Zeitz, P.S.; Salami, C.G.; Burnham, G.; Goings, S.A.; Tijani, K. and Morrow, R.H. (1993) ‘Quality Assurance Management Methods Applied to a Local-Level Primary Health Care System in Rural Nigeria’, International Journal of Health Planning and Management 8.3: 235–44
1. Data collection and resource allocation
In Chapter 1 it was argued that, where possible, implementation researchers should be embedded as members of the implementation team, equally committed to ensuring a successful outcome and fully engaged in decision-making processes. One important area of decision-making concerns the allocation of scarce resources – which could alternatively be used to improve the scope or quality of the intervention – to data collection activities. Following the proposed definitions of Chapter 1, these activities can be seen as directed towards three identifiable objectives:
- Intervention management and accountability;
- Operations research intended to improve the current implementation;
- Implementation research to learn lessons from the current implementation that can be used in scale-up or relocation to a new context.
However, these objectives can be seen as highly interrelated, each involving the need to track implementation progress against the original intervention design, identify potential weaknesses in that design and test the initial assumptions on which it was based.
Every intervention should have budget lines intended to address objective 1, covering the costs of collecting the data required for intervention management and for demonstrating to those providing funds that resources are being allocated appropriately and outputs produced as intended. For convenience we can regard all such planned expenditures as falling under the general heading of intervention ‘monitoring and evaluation’. Often this will include a separate item to meet the cost of operations research studies to be undertaken in pursuit of objective 2, for example testing alternative approaches to service delivery or behaviour modification in order to determine which would best serve the needs of the intervention. As discussed earlier, funding of the implementation research activities, including researcher time and data collection costs, will typically not be included in the intervention budget and would often be provided by another agency. However, much of the data required by the implementation researcher can be derived from that collected under the intervention budget, though the analysis of those data may well differ. This provides an opportunity to establish a mutually beneficial arrangement, with implementation researchers providing support, and possibly additional resources, to the intervention monitoring and evaluation system in return for full engagement in the design of that system.
From an implementation research perspective, the design of monitoring and evaluation systems requires not only an understanding of how the data generated will allow rigorous analysis of the implementation process and the interaction of that process with key contextual factors, but also an awareness of the types of evidence that will be acceptable to different stakeholders and audiences (Murray 2007). Apart from the other members of the implementation team, these might include: national/local policymakers and officials; health workers; NGOs; donor and other international agencies; beneficiary communities and the general population. For example, donor agencies may demand ‘objective’ quantitative outcome indicators, while communities may be more impressed by qualitative evidence that reflects their own perceptions and concerns. A further consideration is that traditional monitoring and evaluation systems tend to follow a routine reporting and analysis plan that is insufficiently responsive to rapidly developing potential opportunities and threats to the implementation process, especially in a CAS context. Both operations research and implementation research activities may often benefit from ad hoc, ‘real-time’ exercises, possibly undertaken in collaboration with service providers or intended beneficiaries, which can be effective in providing rapid feedback on access barriers and process bottlenecks as they arise. Such activities can combine information-gathering with exercises that explore ways in which these barriers or bottlenecks may be overcome, both in the existing and in future implementations.
Choice of research methods
There is a tendency for researchers to think in terms of undertaking studies that can be easily categorised, using labels such as quantitative, qualitative, participatory, action research, desk research, etc. A better approach is to start from a careful review of the various research questions that need to be addressed and then to assess which research methods might be able to deliver the required information on each of these questions. Resources should then be allocated in such a way as to best meet the overall research objectives, which will usually involve prioritising some questions over others, within whatever ‘budget constraints’ apply – which may relate not only to financial limitations but also to the limited availability of time, skilled/experienced personnel, access to data sources, etc. Such an approach may often involve unpalatable compromises relating to the scope, depth or precision of the intended research findings, but setting unattainable goals or attempting to ‘make do’ with inadequate resources will almost certainly degrade the quality of those findings.
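To make such trade-offs explicit, one simple (and deliberately crude) device is to score each candidate research question and fund questions greedily in order of expected value per unit of cost until the budget is exhausted. The sketch below is purely illustrative: the questions, costs, value scores and budget are all invented, and a greedy rule is only a heuristic, not an optimal allocation.

```python
# Toy illustration of prioritising research questions under a fixed budget.
# Labels, costs and value scores below are hypothetical.
questions = [
    # (label, estimated cost, expected value towards research objectives)
    ("Household care-seeking survey", 40, 8),
    ("Facility record review", 10, 6),
    ("Key informant interviews", 5, 4),
    ("Provider time-and-motion study", 30, 5),
]

def prioritise(questions, budget):
    """Greedily fund questions in descending order of value per unit cost."""
    funded, spent = [], 0
    for label, cost, value in sorted(questions, key=lambda q: q[2] / q[1], reverse=True):
        if spent + cost <= budget:
            funded.append(label)
            spent += cost
    return funded, spent

funded, spent = prioritise(questions, budget=50)
# With these invented numbers, the large household survey is dropped in
# favour of three cheaper studies with better value-for-cost ratios.
```

The point of such an exercise is not the arithmetic but forcing the team to state, and then debate, the relative value of each question before resources are committed.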
One complicating factor in adopting such an approach is that most researchers have a strong preference for primary data collection. They identify what they see as their requirements for specific data items and then assume that those requirements can only be met by the careful design and application of data collection instruments that are intended to deliver those items. However, before deciding to invest in any substantial data collection exercise, which will almost inevitably be costly and will typically prove substantially more costly than anticipated, it is almost always worthwhile to undertake a systematic inventory of relevant, accessible, secondary sources. These will almost certainly not provide precisely the data you want but may well provide data that can meet at least some of your underlying needs.
For example, reports and/or data from previous income or expenditure surveys may provide a reasonably adequate guide to current distributional questions if there is no reason to suspect that these may have changed radically since those surveys were undertaken. Even poorly maintained hospital financial records may provide better data for the estimation of inpatient treatment costs for a given condition than can be obtained from a survey that relies on the memories of former patients. Careful study of official reports, even if you have doubts about their reliability, will often enable you to be much more efficient in undertaking key informant interviews with senior policymakers, allowing you to focus on questions that test the veracity of the information and opinions in those documents.
The general proposition here is that all potential sources of relevant data should be explored and their availability, accessibility, cost and potential value assessed before deciding on your research strategy. These would include:
- Documents: official reports, academic journals, media articles, internet blogs, etc.;
- Routine data systems (RDS): financial data, personnel data, clinic records, etc.;
- Existing survey data: national surveys, Demographic and Health Surveys, World Health Surveys, Multiple Indicator Cluster Surveys, etc.;
- Implementation RDS: from the implementation monitoring system;
- New sample surveys: of patients, facilities, providers, community members, etc.;
- Qualitative studies;
- Rapid appraisal and/or participatory exercises.
2. Secondary data
Document review should involve the systematic compilation and analysis of relevant printed and electronic material. In terms of health systems interventions, probably the most important sources will be legislative documents and policy statements that set out the basic frameworks within which health systems function. There will also be a wide range of regulations, guidelines, manuals, protocols, etc., issued by ministries, other official agencies or by facilities themselves, which define the detailed operational procedures that should be followed in the management, administration and delivery of health services. These documents can be important even if the researcher is fully aware that they are widely disregarded, in that they can indicate what individuals perceive to be appropriate behaviour in terms of health service provision or at least what they perceive as being acceptable to the general population.
Organisations and individuals will often try to behave as set out in such documents even in the most difficult and chaotic circumstances, following procedures they know to be irrelevant simply because they have no well-defined alternative mode of operation. Working in Nigeria in the mid-1990s, when public health services were almost non-existent in many rural areas, I had to work around a legal prohibition on the use of alternative forms for the collection of data on public facilities. This was often cited by providers even though it was clear to all those concerned that the official health information system had ceased to function. Similarly, state government officials would expend considerable efforts on the careful preparation of annual budgets, even though they knew that these would have limited effect in terms of controlling actual expenditures. Analysis of the gulf between what is contained in such official documents and the reality on the ground is often key to understanding the context within which interventions are undertaken.
A systematic document review should aim to at least consider, if not analyse, all those materials which may be relevant, from whatever sources. This will be time-consuming and should not be seen as an activity that takes place only at the start of the implementation process but one that can be conducted at a steady pace over the research period. Increasing use of media outlets, and in particular of the internet, has dramatically increased the volume of information that is relatively easily accessible to the researcher. For example, reports from the international Demographic and Health Surveys, World Health Surveys and Multiple Indicator Cluster Surveys indicated above can be inspected or downloaded from their websites. In many countries census and survey reports are often made available in a reasonably timely fashion through the internet sites of national statistical agencies or ministries. Reports from earlier periods, possibly useful in considering trends over time, may also have been archived on the International Household Survey Network website. Expectations should be limited. Survey reports tend to provide relatively simple statistics, often at a high degree of aggregation, using variables that will almost certainly not have been defined as you might wish. Nonetheless, they can often provide a limited number of apparently relatively reliable indicators that may be very useful in terms of confirming or challenging information received from other sources.
Interesting insights into the concerns and intentions of relevant organisations can often be gained by examining their press releases, which again are now often made available via the internet. Given that they are almost always intended to present the organisation in a favourable light, these need to be subjected to careful analysis and interpretation, but can be extremely useful in determining the most effective strategy for exploring their underlying aims and objectives, for example in the design of key informant interviews. Media articles – in newspapers/magazines or on television or radio – provide another relevant source, which in this case will need to be assessed in terms of an informed judgement as to whether the author can be seen as independent or biased in one direction or another. Such biases do not render the information useless – as long as their implications can be fully incorporated into your analysis. Articles based on the opinions of those critical of your intervention can be of particular interest in terms of understanding the arguments that an implementation may have to address and in revising your stakeholder analysis.
A related and under-utilised source is the internet blogs that may be written by individuals within a healthcare agency or community-based organisation, either on their own websites or on social media sites such as Facebook. The author gained valuable insights into the problems faced by an agency concerned with providing health advice from the activity on one such site, where it is easy for contributors to forget that their discussions are open to public view. Other sources of interest include the many advertisements for health providers and products, which may play a major role in influencing the attitudes of the local population about the availability of treatments for a range of conditions. These may be found in the local media or on the internet but are also widely displayed on posters, either positioned by roads or in shop windows. These advertisements can now be easily captured with digital cameras and analysed in the same way as other materials; cameras can also be used to incorporate a range of relevant maps, charts and photographs that the researcher may encounter.
It should be emphasised that document review is a research activity and as such should be fully described in the research report. Details must be provided about how different sources were explored and relevant materials identified, accessed and analysed. Stage one in such an analysis, as indicated above, is to understand the origins of the document and the reasons for its production. This should allow you to make an intelligent assessment of how it may be interpreted. Was it an uncontentious attempt to codify existing practices to make sure that all providers followed a common approach or a highly contested regulation that imposed unwelcome constraints on income-generating activities? Was it intended to demonstrate how careful a government agency had been in managing a social insurance fund or to attack the profligacy of political opponents? Stage two would involve seeking ways to verify any of the claims or estimates contained in the document. Can they be compared with those from any other source? Is the methodology adopted described in the document and if so does it seem appropriate? Is it possible to discuss the findings with those who had published them? Stage three involves the more difficult tasks of extracting relevant excerpts from each document, summarising these without losing essential content and then combining these summaries under various themes and sub-themes. This is most frequently undertaken on a relatively pragmatic basis, relying on the experience and skills of the researchers. However, there are more rigorous methodological approaches that are usually described under the heading of ‘content analysis’ (Hsieh and Shannon 2005) and recently these have often been undertaken using a range of specialised computer software packages.
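At its simplest, the thematic coding described in stage three can be sketched without specialised software. In the illustrative fragment below the coding frame (theme names and keyword lists) and the document excerpts are invented; real content analysis would use a carefully developed and validated coding frame.

```python
import re
from collections import Counter

# Hypothetical coding frame: each theme is mapped to indicator keywords.
themes = {
    "financing": ["budget", "fee", "insurance", "payment"],
    "workforce": ["staff", "nurse", "training", "salary"],
}

def code_excerpts(excerpts, themes):
    """Count how often each theme's keywords occur across document excerpts."""
    counts = Counter()
    for text in excerpts:
        words = re.findall(r"[a-z]+", text.lower())  # crude tokenisation
        for theme, keywords in themes.items():
            counts[theme] += sum(words.count(k) for k in keywords)
    return counts

# Invented excerpts standing in for extracted document passages.
excerpts = [
    "User fee revenue is retained at facility level and added to the budget.",
    "Nurse training was suspended when staff salaries went unpaid.",
]
counts = code_excerpts(excerpts, themes)
```

Keyword counting of this kind captures only the most mechanical layer of content analysis; interpreting excerpts in context, as Hsieh and Shannon (2005) describe, still requires researcher judgement.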
Routine data systems
Routine (administrative) data systems (RDS) have the great potential advantage that they can deliver disaggregated, time series data by geographical area (region, state, district, sub-district, etc.) (Lagarde 2012). Facility records, for example attendance registers, patient records, disease registers, prescriptions, insurance payments, financial accounts, etc., can be an important source of quantitative data, if they can be accessed by the researcher. They may be of immediate value or of use after further processing. For example, they may require reorganisation, aggregation, disaggregation or other manipulation. In this case it is necessary to ensure that the nature of the data in terms of such aspects as definitions and collection procedures is thoroughly understood, as the possibilities for misinterpretation are considerable. It may in some cases be cost-effective to invest resources in measures that support improvement of the RDS. For example, in some countries many primary facilities still lack simple electronic calculators and may have to spend considerable time adding up many columns of figures, often making mistakes. In one study the author found that the simple provision of higher-quality attendance registers and prescription pads (the only documents available at this level) and a supply of pens, pencils and erasers, dramatically improved the recording of patient and treatment details over the course of the research. Recently, the provision of mobile phones does appear to offer considerable potential (Neupane et al. 2014).
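To illustrate the kind of reorganisation and aggregation referred to above, the sketch below (facility names and register rows are invented) rolls raw attendance-register entries up into monthly visit counts per facility, a common first step before any time-series analysis of RDS data:

```python
from collections import defaultdict

# Hypothetical rows transcribed from facility attendance registers:
# (facility, date "YYYY-MM-DD", diagnosis)
register = [
    ("Clinic A", "2014-03-02", "malaria"),
    ("Clinic A", "2014-03-15", "ARI"),
    ("Clinic A", "2014-04-01", "malaria"),
    ("Clinic B", "2014-03-09", "diarrhoea"),
]

def monthly_attendance(register):
    """Aggregate register rows into visit counts per (facility, month)."""
    counts = defaultdict(int)
    for facility, date, _diagnosis in register:
        month = date[:7]  # keep "YYYY-MM"
        counts[(facility, month)] += 1
    return dict(counts)

counts = monthly_attendance(register)
```

Even an aggregation this simple depends on understanding how the register was kept: whether repeat visits, referrals or outreach contacts were recorded, and under which definitions, determines what the monthly counts actually measure.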
Because they can provide data disaggregated to the particular location in which an intervention is undertaken, analysing RDS data can be more useful than relying on existing survey data. However, in most countries routine data are subject to a number of well-known limitations, often despite many attempts at improvement. There are three major issues:
- coverage – focusing only on those who use services can be extremely misleading – we know that it is generally the poor and vulnerable who are most often excluded;
- general poor quality (accuracy, timeliness) – often reflecting indifference on the part of staff who have come to believe that senior health service managers, officials and politicians rarely make use of, or even consult, the data they provide;
- incentives to misreport (e.g. performance-related pay) – and an absence of effective audit systems that might detect misreporting.
The poor quality of the data may be improved to a limited extent by measures such as those indicated above, but it typically relates to a widespread culture of indifference to reliable reporting that is not easily amenable to change, given the resources available to any specific intervention. Some financial data (e.g. payroll data, payments by health insurance agencies) may be more reliable because they are subject to audit procedures. One potentially useful activity is to explore the possibilities for combining routine data with other sources, such as surveys, to generate 'best estimates'. This implies that expectations need to be limited and second-best options explored. For example, combined estimates of such basic indicators as service utilisation, access and cost, while not ideal, may provide a reasonable basis for context analysis and for verification of data from other sources.
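One simple way to combine a routine-data figure with a survey estimate into a single 'best estimate' is an inverse-variance weighted average. The following is a minimal sketch in which all figures, including the variance attached to the routine data, are invented for illustration:

```python
# Illustrative 'best estimate' combining a survey estimate with a figure
# implied by routine records, weighting each by the inverse of its variance.
# All numbers, including the variance assumed for the routine data, are invented.
survey_est, survey_var = 0.42, 0.0004    # utilisation rate from a sample survey
routine_est, routine_var = 0.55, 0.0025  # rate implied by routine records

w_survey = 1 / survey_var
w_routine = 1 / routine_var

combined = (w_survey * survey_est + w_routine * routine_est) / (w_survey + w_routine)
print(f"combined 'best estimate': {combined:.3f}")
```

In practice the variance attached to the routine-data figure is largely a matter of judgement, which is one reason why expectations should remain limited.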
RDS data quality is a particular concern when disaggregation within the intervention area is required. As a general rule, administrative data quality depends on the quality of administrators, and both tend to be correlated with the overall level of development. The poorest areas and facilities typically have the least reliable data. This is of particular concern in terms of indicators derived from information systems that are subject to the pressures associated with the provision of marketable goods and services. For example, rural health workers in poorer areas (given that their government salaries are sometimes barely sufficient to purchase basic food and clothing) have become very adept at providing information that satisfies higher levels of administration while not limiting their alternative income-generating activities. It should be noted that variations in the quality of data, particularly administrative data, between areas and facilities may also influence aggregate estimates, as these are often based on partial coverage. Facilities in less developed areas not only tend to provide less reliable data, they often fail to provide data on time. As overall estimates are often derived by ‘grossing up’ the information available when estimates are required – that is, information from better resourced facilities – biases that tend to overestimate service utilisation, staffing levels, drug availability, etc. may be introduced.
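The 'grossing up' bias described above can be illustrated with a toy calculation, using invented facility figures, in which the national total is extrapolated from the facilities that report on time:

```python
# Toy illustration of 'grossing up' bias: the overall total is extrapolated
# from the facilities that report on time, which tend to be better-resourced
# and to record higher utilisation. All figures are invented.
facilities = [
    # (reports_on_time, monthly_visits)
    (True, 900), (True, 850), (True, 800),    # well-resourced, timely reporters
    (False, 400), (False, 350), (False, 300), # poorer facilities, data late or missing
]

true_total = sum(visits for _, visits in facilities)

# Gross up: mean of the timely reporters, multiplied across all facilities
timely = [visits for on_time, visits in facilities if on_time]
grossed_up = sum(timely) / len(timely) * len(facilities)

bias_pct = 100 * (grossed_up - true_total) / true_total
print(f"true total: {true_total}, grossed-up estimate: {grossed_up:.0f} "
      f"(+{bias_pct:.0f}%)")
```

With these invented figures the grossed-up estimate overstates true utilisation by more than 40 per cent, in the direction the text describes.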
Finally, note that many of the most important health indicators require the combination of service data from the RDS with overall or age-specific population estimates. These will reflect the ‘denominator problem’ of indicator construction – the fact that these estimates are typically crude estimates and/or outdated. The influence of changing population sizes and distributions, often due to internal migration, on access and utilisation measures can be substantial and will often need to be considered in the interpretation of trends over time. Again, poor regions may be particularly affected by both push and pull migration factors. The use of population estimates also raises issues of data availability. Population estimates in years removed from that in which the census is taken will be derived from demographic models, often based on parameters estimated from DHS data. These models may be reasonably reliable at the national level but are not intended for sub-national estimation and typically do not allow for the effects of possibly large-scale internal migration.
Existing survey data
Anyone who has undertaken a reasonably large-scale sample survey will appreciate that it can be a daunting task. It primarily requires a range of managerial and administrative skills that are often lacking even in some of the most talented and experienced social science researchers. In particular, surveys usually involve the hiring, training and management of a substantial number of enumerators, supervisors and data entry staff who may have little interest in the survey objectives and who need constant encouragement and oversight to ensure the quality of the data produced. Surveys also typically involve a considerable investment in terms of both time and money. If it seems at all possible that relevant research questions can be addressed by secondary analysis of an existing survey data set that is known to be of reasonable quality, it would be a mistake not to at least seriously consider this option (Boslaugh 2007).
One key question to be addressed is how the quality of the existing data set is likely to compare to that from any new survey. Where surveys have been conducted on a regular basis for a number of years by permanently employed staff members, for example from a national statistics office, their accumulated experience may well imply that the quality of the final product is likely to be considerably in advance of that from a newly designed survey conducted by a team of recent recruits employed on short-term contracts. The sampling expertise and resources (for example, computerised sampling frames) available within the agency that designed the existing survey may also have been far in advance of that available within the implementation research team. This may imply that there will be greater uncertainty as to the validity of estimating population parameters using sample statistics derived from the new survey.
In addition, the existing survey may have included questions on topics, for example incomes or expenditures, that would be of considerable value in any analysis but which could not realistically be included in a new survey given budget constraints. If it had been undertaken on a national or sub-regional basis, it could also provide an opportunity for direct comparison of data from the implementation sites with that from other areas – an important consideration when exploring the opportunities and challenges involved in scaling up or relocating the intervention. Similarly, if the same questions have been asked in successive rounds of the survey over previous years, it may allow analysis of trends over time, which provides insights that would not be available from a cross-sectional survey.
The above qualities are of course irrelevant if the survey data cannot be used to explore the questions that the research needs to address. An initial problem may be that it is difficult simply to gain access. For example, national statistical agencies will usually argue that survey data are collected on the basis that they will only be used for a specific purpose and that the respondents have been assured that the data will not be shared with other organisations. Versions of the data from which any variables that can be used to identify respondents have been removed may be made available, but often only after a considerable delay that reduces the value of the data. Agencies may also require researchers to make a formal request for the data that involves a detailed explanation of the types of analysis to be performed and the intended uses of any findings, possibly requiring any resulting reports to be submitted to them before dissemination. In some cases the agencies may also demand a substantial payment for use of the data. Note that the international agencies indicated above usually do make data freely available to researchers with minimal formality, and it is also worthwhile to explore the International Household Survey Network website, which holds selected survey data sets, though many of these will be some years out of date.
The researcher will not only need to gain access to the data themselves but also to the ‘meta-data’, which provides a detailed description as to how the data may be analysed and interpreted. As a minimum this must include the questionnaires and coding manuals, but it will often be very useful also to have copies of the enumerator and supervisor manuals. For example, if respondents were asked if they visited a public or private clinic the answers may well differ substantially depending on the guidance (if any) provided by enumerators about how to distinguish between these two types of facility. Having considered the precise nature of any variables of potential interest in the data, the researcher will then have to make a considered decision as to whether they can be used to at least provide insights into the original research questions. It is in the nature of secondary data analysis that the variables available are rarely those that the researcher would have chosen to analyse. The original question may not have been worded as you would have wished. The instructions to the enumerators may have resulted in an excessive number of missing responses. The coding system adopted may have lost information that would have been extremely useful. Even if these problems can be overcome, you may find that the sample size is too small to allow disaggregation to the extent necessary to provide relevant estimates for the intervention population. Nonetheless, all these potential limitations should not prevent you from exploring this option. The costs are often minimal and the potential benefits considerable.
3. Primary data collection
Qualitative or quantitative?
A somewhat simplistic view of the appropriate uses of alternative approaches to primary data collection is shown in the following table.
Table 1: Alternative methodologies for alternative objectives

| Objective | Suggested methodology |
| --- | --- |
| Quantitative estimates representative of population parameters; knowledge of sampling errors | Formal sample survey |
| Quantitative data with some understanding of processes; repeatable for trend assessment | Quantitative rapid appraisal |
| In-depth knowledge of behaviours, perceptions, attitudes, etc.; interpretation of existing quantitative data | In-depth qualitative study |
| Very limited contextual knowledge | Qualitative rapid appraisal |
At one extreme, we might be concerned to produce estimates of specific population parameters, for example utilisation rates for health facilities or the frequency of given symptoms in young children over a previous period, together with the associated ‘errors of estimation’, which allow us to specify how confident we can be that those estimates fall within a given range. If we wish these estimates and confidence limits to be widely accepted as valid, we would be well advised to use formal surveys that follow the accepted principles of statistical inference. If we are less concerned about the precision of such estimates we may decide that we can derive them to an acceptable degree of accuracy using alternative and less resource-intensive methods, for example by extrapolation from facility records, questioning key informants or focus groups, or techniques such as participatory ranking or mapping (Chambers 2007, Rifkin 1996). In the above, such approaches are described as ‘quantitative rapid appraisal’. Using standard procedures can allow comparison between areas and over time, but the extent to which such estimates are accepted will in this case depend on our ability to persuade others of their reliability.
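As a minimal sketch of such 'errors of estimation', the following computes a 95 per cent confidence interval for a utilisation rate, under the simplifying assumption of simple random sampling; the figures are invented, and a clustered survey design would require a design-effect adjustment:

```python
import math

# Sketch: 95% confidence interval for a facility utilisation rate estimated
# from a simple random sample. All figures are invented for illustration.
n = 400        # completed interviews
users = 130    # respondents reporting facility use in the recall period

p_hat = users / n                        # point estimate of the rate
se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error (SRS assumption)
z = 1.96                                 # normal quantile for 95% confidence
ci = (p_hat - z * se, p_hat + z * se)

print(f"estimated rate: {p_hat:.3f}, 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

It is this ability to attach a defensible interval to the estimate that distinguishes the formal survey from the rapid-appraisal alternatives discussed above.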
If we need to understand not only how individuals and organisations behave but why, we may decide that some form of detailed qualitative study is required. This may involve long-term engagement with the study population, using a range of observational and interview techniques to formulate and then test alternative explanatory theories. Finally, if we know very little about the context within which we are working, a common situation at the start of any research activity, we might adopt an approach that can for convenience be labelled ‘qualitative rapid appraisal’, mainly using key informant interviews to enable us to at least frame relevant research questions.
However, it is often worthwhile to think ‘outside the box’ when considering which methodologies and methods might be the most appropriate (or cost-effective) to meet data requirements in a specific context. Kanbur (2003) suggests that we usually categorise qualitative and quantitative methods as having the following characteristics, locating them at the opposite ends of five ‘dimensions’ (table 2).
Table 2: Kanbur's 'five dimensions'

| Qualitative | Quantitative |
| --- | --- |
| Non-numerical information | Numerical information |
| Specific and narrow target groups | Large general target population |
| Active engagement with respondents | Passive involvement of target population |
| Inductive methods of inference | Deductive/statistical inference |
| Description/generalisation/theory construction | Hypothesis testing/econometric modelling |
But there are no 'rules' that force you to accept this dichotomy. Given that researchers are always constrained by limited budgets, they should try to assess the costs and benefits of locating at different points along each of these dimensions in a specific research context, and consider how they would justify their decisions if challenged. For example:
- Traditional household surveys can be used to gather non-numerical information using ‘open-ended’ questions (Rog et al. 2011);
- Participatory methods can be used to generate numerical data – e.g. ranking of providers, estimated travel times to different facilities, etc. (Chambers 2007);
- Qualitative studies can use probability sampling and large sample sizes to gain credibility (Barahona and Levy 2002);
- Qualitative studies may rely primarily on observational data, involving limited interaction with members of the targeted population (Walshe, Ewing and Griffiths 2012);
- Qualitative studies of social networks can use statistical methods and mathematical modelling techniques to generate network maps (Bishai et al. 2014).
Potential advantages and disadvantages of qualitative studies
One great attraction of qualitative approaches to many researchers is the extent to which they feel in control of the process. Sample sizes are typically relatively limited, allowing a small number of skilled, experienced researchers to take the time required to fully engage with those who are providing information. There can be considerable flexibility, with those researchers being trusted to make decisions as the research proceeds, for example selecting additional or alternative respondents, adapting questions or participatory methods as their knowledge of a situation increases and possibly opening up unplanned lines of enquiry if unexpected responses or observations suggest that these may be of importance.
Given sufficient expertise, researchers can undertake detailed investigations not only about the knowledge of respondents but also their perceptions, attitudes and motivations. If they can gain their trust, they may be able to explore sensitive issues and assess emotional responses. Interviews that take place in homes or facilities will often allow valuable insights into relationships, processes and contexts simply by careful and prolonged observations. Of particular importance when there is limited knowledge at the start of a research activity about which are the most relevant issues, qualitative studies can allow the gradual elaboration of concepts and theories as the research proceeds, delaying the often very difficult task of formulating precise definitions of variables and the expected relationships between them until the researcher has had an opportunity to experience the ground realities (Kuznetsov et al. 2013).
To some extent, the disadvantages associated with the archetypal qualitative study can be seen as the mirror image of the advantages. The flexibility that is so attractive to many researchers tends to place great weight on the regard in which the members of the research team are held by those whom they might wish to persuade of the value of their findings. The central issue is that of subjectivity, that given the extent of their control over the process of data collection it is likely that the research findings will be at least partly determined by the preconceptions of the researchers – that is, they will tend, quite possibly unconsciously, to gather information that reinforces their personal perceptions about how the world works. While it can reasonably be argued that quantitative research also has to contend with this issue, the use of predetermined instruments and procedures – questionnaires, manuals, sampling designs, etc. – provides those who wish to determine the extent to which findings have been influenced by the decisions of the researchers with the documentary evidence they require. This indicates the way in which qualitative researchers can guard against their findings being dismissed as ‘too subjective’, by ensuring that every step in the data collection process is carefully documented, providing detailed descriptions not only of what was done but why. This should be a central component of an activity usually described as ‘reflexivity’ (Finlay 2002, Mruck and Breuer 2003) – ongoing assessments by each researcher of the extent to which their activities might be driven by personal factors and attempts to counteract that tendency.
One related common criticism of qualitative studies is that of sample selection bias, for example tending to gather information more from those in favour of, or those against, the intervention, neither group being representative of the overall population. Researchers will usually try to avoid obvious potential biases, such as relying on local officials or 'community leaders' to select their subjects, but it is easy to overlook other potential pitfalls; for example, limited resources may result in a failure to seek out stakeholders who are harder to reach, such as those who live in remote or less accessible areas. Sample sizes are often limited in qualitative studies. The essential need to use only capable, experienced researchers, because the quality of the findings is so dependent on their abilities, generally implies that the cost per respondent will be substantially higher than that for quantitative studies using enumerators to complete standardised questionnaires.
Small samples can raise difficult problems in terms of analysis and interpretation, given that we are often interested in the relationship between the diverse circumstances and characteristics of our respondents and their perceptions, attitudes, etc. We would often see it as essential to distinguish between respondents in terms of a range of attributes including gender, age group, income/wealth, rural/urban, etc. Even if we adopt a policy of stratification, such that we have respondents in each cell of the implied multi-way table, the numbers in each cell will be so small that we may be reluctant to infer that they can be extrapolated to other 'similar' individuals in the study population. One common challenge to qualitative findings is that they are anecdotal, interesting as descriptions of individual cases but unrepresentative and therefore of limited use in terms of reaching general conclusions and hence in terms of policymaking. A similar complaint may arise with respect to comparisons between the various groups, for example differences in attitudes between men and women. If there were relevant differences in the nature of the information-gathering process between groups – for example different researchers choosing to vary the type or sequence of questions, or the use of male researchers to interview men and female researchers to interview women – it might be argued that at least part of the observed differences may simply reflect inter-interviewer variation. A final practical disadvantage of qualitative studies is the sheer volume of information, mainly textual, that they almost always generate, posing substantial problems in terms of analysis and interpretation, even with the use of computer software packages.
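The multi-way table problem is easy to quantify: the number of cells grows multiplicatively with each stratification attribute, as this small sketch (with illustrative attributes and an invented sample size) shows:

```python
# Sketch of the multi-way table problem: the number of cells implied by a set
# of stratification attributes grows multiplicatively. Attribute levels here
# are illustrative only.
attributes = {
    "gender": 2,
    "age group": 3,
    "income/wealth tercile": 3,
    "rural/urban": 2,
}

cells = 1
for levels in attributes.values():
    cells *= levels

sample_size = 40  # a fairly large sample for a qualitative study
print(f"{cells} cells implied; {sample_size} respondents gives about "
      f"{sample_size / cells:.1f} per cell")
```

Even four modest attributes imply 36 cells, so a 40-respondent study averages barely one respondent per cell, which is why extrapolation from such samples is so readily challenged.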
Potential advantages and disadvantages of quantitative studies
A well-designed and implemented probability sample survey has the unique advantage of being able to provide reliable, bounded estimates of key population parameters, for example immunisation rates, illness prevalence rates, utilisation of services, average length of stay in hospital, median cost of an outpatient visit, etc. Unlike any other methodology, it allows the researcher not only to generate such estimates but to specify how 'confident' they are that each estimate falls within a stated range (the 'precision' of the estimation). These estimates are derived using the area of mathematics known as statistical inference, which allows a researcher who can show that they have 'followed the rules' of probability surveys to present such estimates without the need for further justification. While other approaches, for example market research surveys or political opinion polls, may make similar claims, they almost never follow those rules and therefore cannot legitimately use the language of statistical inference to support their claims.
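At the design stage, the link between sample size and precision can be made concrete with the standard calculation for estimating a proportion under simple random sampling; the planning figures below are invented for illustration:

```python
import math

# Rough sample-size calculation for estimating a proportion to a desired
# precision under simple random sampling. Planning figures are invented.
p = 0.5    # most conservative assumption about the unknown proportion
e = 0.05   # desired half-width of the 95% confidence interval
z = 1.96   # normal quantile for 95% confidence

n = math.ceil(z ** 2 * p * (1 - p) / e ** 2)
print(f"required sample size: {n}")
```

This yields the familiar figure of 385 respondents for a five-percentage-point margin; clustered designs would need to inflate this by a design effect.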
This ability to generate reliable estimates to a given level of precision can be very attractive to policymakers because it allows them to assess the potential quantitative impact of a given intervention. For example, China has recently started to introduce policies that provide improved health insurance coverage for the poorest members of rural populations. Such policies had been recommended by health researchers for many years but became much more acceptable to government when the costs of such changes could be reliably estimated from probability sample survey data. Again very useful from a policy perspective, the adoption of predetermined and standardised instruments for data collection in most quantitative studies enhances the credibility of making comparisons between different subgroups of the target population. Given that precisely the same questions are asked in what should be precisely the same manner to such subgroups, for example the heads of richer and poorer households, it will often seem plausible to directly compare their responses, for example in terms of the proportion of children under two vaccinated against polio. Quantitative studies generally try to minimise any variations in behaviour between those collecting data from different subgroups, which might otherwise be misinterpreted as between-subgroup variation. Similar considerations apply to comparisons over time, for example estimation of trends in childhood malnutrition rates using DHS data for different years.
The desire to make comparisons between subgroups or over time is related to one of the main disadvantages of the typical quantitative approach – the difficulty of developing simple, uniformly applicable definitions of key concepts that are well understood and have a common interpretation across all subgroups of the population. For example, in one pilot exercise I conducted, a standard question about whether anyone in a household had suffered an acute sickness in the previous two weeks produced incidence rates for those in the poorest rural area surveyed that were far too low to be believable. A follow-up qualitative study found that fevers were so common that many people did not consider them worth reporting. Similar issues arise with respect to many of the covariates on which we often try to collect data in such surveys. The distinction between rural and urban areas, for example, is often problematic, as is that between public, not-for-profit and for-profit facilities (given that all may be charging for services to a greater or lesser extent). Particular difficulties arise with studies that are concerned with equity. Measures of income, expenditure, wealth, indebtedness, vulnerability, etc. are notoriously difficult to define in ways that can be confidently expected to produce comparable findings across subgroups.
The above indicates the need for a profound understanding of both the topics addressed and the population targeted at the design stage of any quantitative study. Such studies should certainly not be used to explore issues about which the researchers have very limited understanding. That will almost always result in a substantial expenditure of resources to little purpose. In-depth knowledge is essential if the study is to be well designed and the design phase is often the key to a successful outcome. The implementation of a large-scale quantitative study is primarily an exercise in human resource management and logistics. Once launched it is very difficult to change course or rectify any major design defects that may become apparent. It is essential to ensure: (a) that the research team has the necessary management skills required and that those with these skills are willing to take a leadership role – along with the responsibility for ensuring that the exercise proceeds with as little divergence from the original intention as possible; and (b) that the resources are sufficient to allow for unexpected problems – bureaucratic delays, equipment failures, illness, bad weather, etc. – which will almost inevitably be encountered. Attempting to stretch an inadequate budget and ‘hoping for the best’ is a recipe for failure. Finally, it should be taken into account that those who most strongly favour quantitative studies often have a tendency to pay insufficient attention to likely data quality issues, preferring to make heroic assumptions about the reliability of the findings derived from this data, often substituting technical expertise for considered analysis and claiming general validity for what are typically very simplistic models of causality.
In practice, it would be very unusual, and almost certainly a mistake, not to use both quantitative and qualitative approaches in any implementation research exercise. While there is a very long history of researchers combining quantitative and qualitative methodologies, the mid-1990s saw a more formal discussion of the opportunities and potential pitfalls of using 'combined methods' (sometimes described as qual/quant) (Palinkas et al. 2011). Most attention has focused on the potential advantages of using qualitative studies to complement and support large-scale surveys (Kanbur and Shaffer 2005). These include:
- Using qualitative studies to improve survey design;
- Interpreting counterintuitive or surprising findings from surveys;
- Explaining the reasons behind observed survey outcomes;
- Exploring the motivations underlying observed behaviour;
- Suggesting the direction of causality;
- Assessing the validity of quantitative results;
- Understanding conceptual categories such as ill health, household, etc.;
- Interpreting local categories of social differentiation, e.g. poor/non-poor;
- Providing a dynamic dimension to cross-sectional household survey data.
However, there are multiple pathways by which qualitative and quantitative studies might be linked. Marsland et al. (1998) categorise these pathways under three broad headings:
A: Swapping tools and attitudes: ‘Merging’
- Adopting standard sampling techniques in qualitative studies (Barahona and Levy 2002).
- Coding responses to open-ended questions using qualitative enquiries.
- Using statistical techniques to analyse quantitative data obtained from qualitative studies, for example:
- Creating frequency tables from coded responses to open-ended questions;
- Constructing models based on binary and categorical data from ranking and scoring exercises.
B: 'Sequencing' tools and methods from the different traditions
- Using participatory mapping to create sampling frames for questionnaire surveys.
- Using findings from qualitative studies to reduce the non-sampling error (e.g. misunderstandings, offensive questions) in questionnaire surveys.
- Using exploratory techniques to establish hypotheses that can be tested through questionnaire surveys.
- Using a questionnaire survey to gather responses to a few key questions from a probability sample of respondents and then undertaking a qualitative follow-up study of respondents who appear to be of particular interest.
C: ‘Concurrent use’ of tools and methods from the different traditions
- Using a questionnaire survey to determine quantitative indicators (for example, Likert scales) on perceptions and attitudes relating to public and private health services.
- Qualitative exercises (key informant interviews, focus group discussions, participatory exercises) to address the same issues with the aim of gaining greater understanding.
These possibilities are reflected in the following diagram:
For example, Lucas, Ding and Bloom (2009) used large-scale sample surveys in Cambodia, China and Lao PDR to identify households where at least one member had suffered from a serious illness over the course of the previous year. A limited number of geographical case studies, based on purposively selected counties in China and health districts in Cambodia and Lao PDR were undertaken. In each of these areas households affected by major illness were identified and studied using a two-stage approach:
- A rapid and reasonably large-scale household questionnaire survey was undertaken using cluster sampling of households within the selected study areas. This aimed to identify households substantially affected by different categories of serious health problems and to estimate the proportions of such households in the population.
- The sampled households were analysed and classified into a number of strata based on the information provided by the questionnaire survey (the choice of stratification variables is indicated below). In-depth studies, typically requiring one to two person days, of a probability sample of the households in purposively selected strata were then undertaken by a team of social scientists.
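The two-stage selection described above can be sketched in outline as follows; the strata labels, household counts and sample sizes are all hypothetical:

```python
import random

# Outline of the two-stage approach described above: households identified by
# the rapid survey are classified into strata, and a probability sample is
# drawn from purposively selected strata for in-depth study. Strata labels,
# household counts and sample sizes are hypothetical.
random.seed(42)

categories = ["chronic illness", "acute illness", "injury"]
households = [{"id": i, "stratum": random.choice(categories)} for i in range(200)]

selected_strata = ["chronic illness", "acute illness"]  # purposive selection
per_stratum = 10                                        # in-depth sample per stratum

in_depth_sample = {}
for stratum in selected_strata:
    members = [h for h in households if h["stratum"] == stratum]
    in_depth_sample[stratum] = random.sample(members, per_stratum)  # probability sample

for stratum in selected_strata:
    print(stratum, "->", len(in_depth_sample[stratum]), "households for in-depth study")
```

The purposive element enters only in the choice of strata; within each selected stratum the households are drawn by probability sampling, preserving a basis for extrapolation to similar households in that stratum.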
References

Barahona, C. and Levy, S. (2002) 'How to Generate Statistics and Influence Policy Using Participatory Methods in Research', Statistical Services Centre Working Paper, University of Reading
Bishai, D.; Paina, L.; Li, Q.; Peters, D.H. and Hyder, A.A. (2014) ‘Advancing the Application of Systems Thinking in Health: Why Cure Crowds Out Prevention’, Health Research Policy and Systems 12: 28, www.biomedcentral.com/content/pdf/1478-4505-12-28.pdf (accessed 15 January 2015)
Boslaugh, S. (2007) Secondary Data Sources for Public Health: A Practical Guide. Cambridge: Cambridge University Press, www.academia.edu/1630213/An_introduction_to_secondary_data_analysis (accessed 15 January 2015)
Chambers, R. (2007) ‘Who Counts? The Quiet Revolution of Participation and Numbers’, Working Paper 296, Brighton: IDS
Finlay, L. (2002) ‘Negotiating the Swamp: The Opportunity and Challenge of Reflexivity in Research Practice’, Qualitative Research 2: 209, www.uk.sagepub.com/braunandclarke/study/SAGE Journal Articles/Ch 2. Finlay.pdf (accessed 15 January 2015)
Hsieh, H-F. and Shannon, S.E. (2005) ‘Three Approaches to Qualitative Content Analysis’, Qualitative Health Research 15: 1277–88
Kanbur, R. (2003) ‘Q-squared? A Commentary on Qualitative and Quantitative Poverty Appraisal’, in R. Kanbur (ed.), Q-squared: Qualitative and Quantitative Poverty Appraisal, Delhi: Permanent Black
Kanbur, R. and Shaffer, P. (2005) ‘Epistemology, Normative Theory and Poverty Analysis: Implications for Q-Squared in Practice’, Q-Squared Working Paper No. 2, University of Toronto
Kuznetsov, V.N.; Grjibovski, A.M.; Mariandyshev, A.O.; Johansson, E.; Enarson, D.A. and Bjune, G.A. (2013) ‘Hopelessness as a Basis for Tuberculosis Diagnostic Delay in the Arkhangelsk Region: A Grounded Theory Study’, BMC Public Health 13: 712, www.biomedcentral.com/content/pdf/1471-2458-13-712.pdf (accessed 15 January 2015)
Lagarde, M. (2012) ‘How to Do (Or Not to Do) ... Assessing the Impact of a Policy Change with Routine Longitudinal Data’, Health Policy and Planning 27: 76–83
Lucas, H.; Ding, S. and Bloom, G. (2009) ‘What do we Mean by “Major Illness”? The Need for New Approaches to Research on the Impact of Ill-health on Poverty’, in B. Meessen, X. Pei, B. Criel and G. Bloom (eds), Health and Social Protection: Experiences from Cambodia, China and Lao PDR, Antwerp: ITGPress
Marsland, N.; Wilson, I.; Abeyasekera, S. and Kleih, U. (1998) ‘A Methodological Framework for Combining Quantitative and Qualitative Survey Methods – Background Paper: Types of Combinations’, report written for DFID Research Project R7033
Mruck, K. and Breuer, F. (2003) ‘Subjectivity and Reflexivity in Qualitative Research—The FQS Issues’, Forum Qualitative Sozialforschung/Forum: Qualitative Social Research 4.2, www.qualitative-research.net/index.php/fqs/article/view/696/1504 (accessed 15 January 2015)
Murray, C.J.L. (2007) ‘Towards Good Practice for Health Statistics: Lessons from the Millennium Development Goal Health Indicators’, The Lancet, 369: 862–73
Neupane, S.; Odendaal, W.; Friedman, I.; Jassat, W.; Schneider, H. and Doherty, T. (2014) ‘Comparing a Paper-based Monitoring and Evaluation System to a Health System to Support the National Community Health Worker Programme, South Africa: An Evaluation’, BMC Medical Informatics and Decision Making 14: 69
Palinkas, L.A.; Aarons, G.A.; Horwitz, S.; Chamberlain, P.; Hurlburt, M. and Landsverk, J. (2011) ‘Mixed Method Designs in Implementation Research’, Administration and Policy in Mental Health and Mental Health Services Research 38: 44–53
Rifkin, S.B. (1996) ‘Rapid Rural Appraisal: Its Use and Value for Health Planners and Managers’, Public Administration 74: 509–26
Rog, M.; Swenor, B.; Cajas-Monson, L.C.; Mchiwe, W.; Kiboko, S.; Mkocha, H. and West, S. (2011) ‘A Cross-sectional Survey of Water and Clean Faces in Trachoma Endemic Communities in Tanzania’, BMC Public Health 11: 495, www.biomedcentral.com/content/pdf/1471-2458-11-495.pdf (accessed 19 February 2015)
Walshe, C.; Ewing, G. and Griffiths, J. (2012) ‘Using Observation as a Data Collection Method to Help Understand Patient and Professional Roles and Actions in Palliative Care Settings’, Palliative Medicine 26.8: 1048–54
Qualitative methodology has always been part of the set of tools used by health researchers. It has become more important with the rise of lifestyle illnesses and of epidemics in which human behavioural choices influence risk. There are many pressing research questions about health behaviour and risk-taking that cannot be answered with numbers alone – for example, why have AIDS interventions not worked, and why do some people not take their medication? So while public health and health research remain predominantly quantitative, qualitative methods allow increased understanding because they generate in-depth information by talking to and observing people. Qualitative research is an important addition to health workers’ set of skills.
Qualitative methodology is part of interpretive research, which is of key importance because the analysis goes beyond description. Interpretive analysis has been of vital importance in world history, as it has allowed people to move beyond the obvious and immediate constraints of their environment. Examples include Steve Biko and Martin Luther King, who commented on racism, and the suffragettes, who criticised sexism and patriarchy. These people looked at their contexts, placed themselves within those contexts, made interpretations and came to conclusions. Qualitative methods draw on this capacity for interpretation and apply critical and scientific approaches: we make our systems of enquiry explicit and systematic.
Uses of qualitative research
Qualitative research is useful in many applications.
- May be used in the early phase of a study, on first entry into the field, to explore an area or to clarify hypotheses. For example, researchers and activists could not agree on what constitutes an OVC (orphans and vulnerable children); we felt it important to get a community perspective before we even started other research approaches (Skinner et al. 2006).
- Can be used along with other types of research in order to get an additional perspective on the problem. For example, looking at the risks for women in shebeens: here the risk is well known, but to develop interventions a much better understanding was needed, so multiple methods were required (Sikkema et al. 2011; Watt et al. 2012).
- To clarify unexpected or very significant connections made in a quantitative study. For example, the idea of concurrent partners, and the discordant numbers of partners reported by men and women, needed an in-depth understanding.
- When the aim is to get an in-depth sense of what people think of a particular object, event or construct. For example, evaluation work done with the South African Truth and Reconciliation Commission on its role and application (Skinner 2000); or getting a real understanding of stigma or the cultural impact of male medical circumcision.
- Standalone research on a difficult topic or with a hard-to-reach population. For example, research on sex workers or with drug users.
- Process evaluation of implementations can be incorporated to broaden the nature of the evaluations done. For example, it can be used to observe workshop interventions to assess how they are received.
- Can form part of a social action model, well used in South America and other contexts during periods of social change (for example, Freire 1970).
- The methods can and are used every day by us all to examine our own context. What does that interesting person over there really think about me? What do I need to do to get this job?
Theory of qualitative research
Different models exist for science and the development of knowledge over time. The classical model is that of falsification, by which incorrect theories are rejected on the basis of empirical evidence. Medical science tends to follow the hypothetico-deductive approach – a circular process of reasoning in which, based on a problem such as a new disease, hypotheses are drawn up and then tested, either directly in observation or by experimentation. The methods used for observation and experimentation have become progressively more sophisticated, culminating in designs such as randomised controlled trials (RCTs).
The social sciences offer a different approach, which led to the development of, amongst other approaches to knowledge, qualitative research methodologies. First we need to look at what is different about the social world and why it requires a different approach. If you do an experiment by mixing two chemicals in controlled conditions, they should always react in the same way. Likewise, if someone has an infection and the correct treatment is applied, the person should get better; if they do not, the reasons can usually be clearly determined, such as incorrect dosage or the presence of resistance.
In the social world there is generally much less predictability. Take, for example, responses to an anti-smoking campaign. The information is clear, but responses vary widely. Explanations for these different responses will vary even more widely – between educators, different community members, and smokers versus non-smokers. Explanations from a single person may also vary depending on the context in which they are asked. Research within the social world has to cope with these variations in response. Multiple theories have been developed around behaviour and what influences it:
- Theory of reasoned action;
- Health beliefs model;
- Intention motivation behaviour model;
- Lay beliefs theory;
- Social representations.
One particular approach is to look at behaviours and decisions through the context in which the decisions are made. Remember that context strongly influences how we think and act. This is not a simple matter: context contains within it multiple considerations, including culture, language, access to resources, gender, education, age, time, knowledge of health issues, social norms and much else.
The research task is therefore to understand this social world and its context in order to make sense of the different responses. The contrast can now be seen: there is the traditional world of scientific experimentation, which assumes a single empirical reality, and there is the social world, which can only be examined through the contexts in which that reality is known, and through all the filters used to understand and respond to that world.
Assumptions behind the qualitative approach
A philosophy of behaviour and thought directs the qualitative approach. Each behaviour and belief carries meaning. People are shaped by their contexts and society, which require a complex understanding, and they in turn influence that context. These systems of social and personal meaning cannot be explored as statistical variables. Qualitative research takes an interpretive and subjective approach: researchers are part of the research. So we as participating researchers attempt to understand people and events in terms of the full complexity of their context and make interpretations from there.
Paradigms and theories
A paradigm is the collective understanding that we have of our world, including our culture, ideology, assumptions of power and perceptions of ourselves in the world.
Theories are statements about the rules and systems that direct how events happen in the world. These include biological and chemical processes, but our focus is on human behaviour. Grand theories include self-efficacy, supply and demand in economics, and the impact of addiction. In qualitative research we draw out theories that are grounded in our data, interviews and observations. These are normally only applicable to the research subjects but, due to the complexity of the data gathered, inferences can be made about the broader world.
Many theorists have spoken about the importance of context in shaping our behaviour. Data themselves take on new meaning. Traditionally ‘data’ refers to pieces of information collected in a spreadsheet, each divorced from its context. Data in qualitative research remain integrated as far as possible in the context from which they are drawn. Even the research process is essentially part of context and data.
What does context mean? It includes the environmental, physical, political, social, religious, ideological, cultural and economic setting – a natural rather than an experimental one. It is not possible to take everything into account, but try to retain as much of what is important as possible. Ideally you collect data while interrupting the context as little as possible. A system of comparing notes amongst members of a team will help to identify those aspects of context that will influence results.
Subjectivity is the focal point of qualitative research: it is perceptions that are important. In its extreme form, ‘there is no such thing as objective reality because everything is understood and interpreted through the eyes, ears and brain of the individual’. The goal is to capture the respondent’s perspective, which means removing your own as far as possible from the interaction. Bear in mind that two respondents in the same situation may have very different perceptions of what is happening – for instance, men and women in bars. You have to make sense of this, knowing that you will never have access to the total reality of the respondent.
Everything that we as researchers perceive is affected by who we are and by our context. Likewise, the information we are given will relate to that context, as will who we are able to talk to and what we may observe. Compare interviews in a South African township conducted by a white male academic with those conducted by a black female local community member. This has to be factored in at all times during the research: writing the protocol, preparing to enter the field, interviewing and observing, doing the analysis, and writing reports.
Meaning and interpretation
Pursuit of meaning, or trying to obtain a full understanding, is the core focus. What lies behind the observed behaviour or the words? What do OVC, concurrent partnerships or stigmatising attitudes really mean? Rather than counting and measuring, the core task is to translate this meaning into a report so that the understanding can be shared. Remember that meaning may differ between people even when they appear to share a common context.
Interpretation is the process of finding meaning. We do this every day, and you have probably done it several times today. Analysis is the methodology of looking at what is shown in our data, understanding it and then offering an interpretation. We set up ideas – ‘hypotheses’ – as we go through the data and then check up on them, as in the examples of OVC, concurrent partners and stigma. As meaning may vary by context, so interpretation may vary by context.
In most research we try to minimise the human component; in qualitative methods it is essential. You are the research instrument. Much of what you do is an extension of what you do every day in social interactions. However, perceiving this use of self as ‘making things easy’ is an error: it requires considerable development that continues throughout life. Over time you will feel real development, both in your research and in yourself.
This development includes learning to listen and observe; taking the self into account, along with the role of your own context and subjectivity; and learning to cope with what you perceive and how it impacts on you. There is a contradiction in having to be present as yourself while withdrawing yourself from the situation. At all times, be aware of how you are perceived and who you are seen as, e.g. in terms of race, gender and class.
The interpretive approach incorporates the analytic approaches of grounded theory, thematic content analysis, phenomenology and hermeneutics. The emphasis is on staying close to the data and interpreting the material from a position of deep empathic understanding. The researcher aims to provide a thick description of the characteristics, processes, transactions and contexts that constitute the phenomenon being studied.
2. Methodological issues
There are specific issues that need to be considered in developing a qualitative research proposal. A number of the key elements of the classical research proposal require a different approach in qualitative research. These differences arise out of the specific philosophical background and the realities of the methodology. Issues of sampling, development of research instruments and ethics require particular attention.
Given the nature of the methodology, in terms of both data collection and analysis, the sample size is usually around 10 to 30 respondents, and rarely more than 100. Random sampling is thus inappropriate in qualitative studies: statistical inference is not sought. The depth of the data gathered and the quality of the explanations provided by respondents are the primary goals. A small sample size requires particular assumptions and approaches. Decisions in sampling are based on the requirements of your study. There are few fixed rules, but the researcher must set out within the methodology what decisions were made, and be able to argue why. Ultimately the researcher still wishes to say something about the community from which the sample was drawn, so, as with other research, convenience sampling is discouraged.
Purposive or strategic sampling
Greater emphasis is placed on purposive or strategic sampling. Deliberate choices of respondents or settings are made to ensure coverage of the full range of characteristics of interest. This should ideally capture both the typical cases and the diversity of the population being researched. You can sample by:
- Personal characteristics, e.g. gender, age, race;
- Group membership, e.g. profession, club, sports participation;
- Social characteristics, e.g. geographical area, class, educational level;
- Experience level, e.g. how long they have participated, how regularly.
For example, if you are investigating the experience of nurses providing treatment for multidrug-resistant TB (MDR TB), you will look for variation by type of treatment site, level of training of nursing staff, years of experience in working with MDR TB patients, the size in both area and population of the community served by the facility, and the race, age and gender of the nursing staff. Each nurse sampled will cover a number of these characteristics at once; for instance, a young black female staff nurse with five years of experience with MDR TB patients, working in a tertiary hospital, who has attended an internal two-week course on the MDR TB treatment regimen. You need to try to cover as many combinations as possible. Usually you start with a range that would be typical of the group, then interview those on the fringe to see how perspectives differ as you move away from the core group. Other approaches include:
- Extreme sampling;
- Intensity sampling;
- Homogeneous samples;
- Heterogeneous samples;
- Typical cases;
- Snowball sampling;
- Opportunistic sampling.
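When planning purposive coverage of combinations of characteristics, as in the MDR TB example above, it can help to enumerate the cells of the sampling matrix and tick them off as respondents are recruited. The sketch below is purely illustrative; the characteristic names and values are hypothetical.

```python
from itertools import product

# Hypothetical sampling characteristics, loosely based on the MDR TB example.
characteristics = {
    "site": ["clinic", "district hospital", "tertiary hospital"],
    "experience": ["<2 years", "2-5 years", ">5 years"],
    "gender": ["female", "male"],
}

# Every combination of characteristic values defines one cell of the matrix.
cells = [dict(zip(characteristics, values))
         for values in product(*characteristics.values())]

def uncovered(cells, recruited):
    """Return the cells not yet matched by any recruited respondent."""
    return [c for c in cells
            if not any(all(r.get(k) == v for k, v in c.items())
                       for r in recruited)]

recruited = [{"site": "clinic", "experience": "<2 years", "gender": "female"}]
print(len(cells))                        # 18 cells in total (3 x 3 x 2)
print(len(uncovered(cells, recruited)))  # 17 still to cover
```

In practice not every cell will be fillable or worth filling; the point is simply to make gaps in coverage visible so that later recruitment can be targeted at them.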
Sampling as you go
You can adapt your sample as you progress and learn about your sample and subject, reflecting on your aims and on your stated approach to sampling. It helps if you are able to analyse your data as you collect it, so that you can develop hypotheses and/or theories to test. If a common set of views is found amongst a core group, you can assess whether these views are held further from the core: do all men in the shebeens have negative attitudes to women patrons? You can also draw in additional people to ask specific questions that test an idea: do women in the shebeens change their drinking behaviour when pregnant? Focus on collecting a sample that will suit your needs, and avoid bias. There will be a temptation to take those who are easier to talk to or who are more friendly and accessible. Where access is difficult you take who you can get, but you still have to be careful, using the approaches defined above. You must honour the reason for doing the research: there is a reason why you selected your methodology, including the sampling definition, so do not change it without a scientific reason.
Sample size is often influenced by resources, but there are two theoretical systems used, both of which are set in the ideal of letting the research process play out. The first, data saturation, means continuing until no further new information is being uncovered. So observations or interviews are producing the same results, even when you look at new categories of respondents.
The second involves forming hypotheses during analysis that become part of later interviews, continuing until all, or enough, hypotheses become stable. Given the relatively high cost of qualitative research, resource constraints – cost, time, access and the fieldworkers available – often limit the sample size even before other considerations. The size will also depend on the length of the interviews: if interviews are an hour or longer, do fewer; if 20–30 minutes, do more. Assuming interviews of 45–60 minutes, it is seldom necessary to go beyond a sample of around 30 participants, and many studies can be done with 10–15 respondents. A sample larger than 60–70 becomes very difficult to manage. A master’s thesis should not need more than 15 respondents, and a PhD about 40. As stated above, this will vary with the length of the interviews and the quality of the discussions. Ultimately these are just guidelines: as the researcher, you have to decide whether your research questions have been answered.
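The data-saturation rule can be made concrete: after each interview, record which codes (themes) it produced, and stop once a run of consecutive interviews yields nothing new. The sketch below is a hypothetical illustration of that bookkeeping, not a prescription; the stopping window and the example codes are assumptions.

```python
def saturated(interview_codes, window=3):
    """True if the last `window` interviews added no code not seen before."""
    seen = set()
    new_counts = []
    for codes in interview_codes:
        fresh = set(codes) - seen   # codes first appearing in this interview
        new_counts.append(len(fresh))
        seen |= fresh
    return len(new_counts) >= window and all(n == 0 for n in new_counts[-window:])

# Codes emerging from a made-up sequence of five interviews:
history = [{"stigma", "cost"}, {"cost", "family"},
           {"stigma"}, {"family"}, {"cost"}]
print(saturated(history))  # True: the last three interviews added nothing new
```

A tally like this is only an aid to judgement; whether the absence of new codes reflects genuine saturation or simply a narrow sample remains a substantive decision for the researcher.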
Other methods of data-gathering such as the use of focus groups, observation and use of existing documents have their own approaches to sample selection. The key philosophies remain similar and guide decision-making.
3. Preparing a discussion schedule
There is a range of research documents that can be used in the research process, including discussion schedules, observation schedules, workshop agendas and search lists for secondary sources. This section focuses on the development of a discussion schedule, as it can be used in both individual interviews and focus groups.
The purpose of the discussion schedule is to help the interviewer maintain focus and to ensure coverage of all the issues felt to be important. The content of the schedule should be clearly derived from the research question being addressed and from the stated aims and objectives. Remember that you, as the researcher and interviewer, are the key research instrument, not the discussion schedule. So while considerable effort needs to be put into developing the schedule, it should guide the interview, not control it. For the same reason, if you are using other interviewers to collect the data, they need to be trained in the whole study and not just the discussion schedule. The interviewer should be thoroughly familiar with all the contents in advance.
As a guideline, the schedule should be no longer than one page, or at most two, and should be easy to scan as a reference. The interview does not need to proceed in the order of the schedule, not everything on the schedule needs to be covered, and the interview can go beyond the points on the schedule if the respondent chooses to go into new but relevant areas. Ideally an interview would flow from a single opening question for the entire session, but in practice it is better to start with an initial very open question and then have a limited number of follow-up questions that cover all the components of interest. Each sub-question should have a checklist of the topics that are important, and all the sub-questions should be follow-ups to the initial question.
A particular interviewee may, due to some characteristic, have additional issues that need to be addressed. If so, these additional points should be noted, particularly if this person was added to cover gaps identified in the sampling. There is a range of considerations for deciding on the level of detail in the schedule:
- Information being sought;
- Current levels of knowledge on the subject;
- Purpose of the information;
- Experience of the interviewer;
- Likely openness and ease of the respondent in the interview.
Emphasis should remain on keeping it as brief and clear as possible given these constraints. Some interviews have particular functions so schedules may differ within a study, for instance interviewing sex workers as against police. The discussion schedule can be adapted during the research if new questions arise that are seen to be important or new groups emerge that need to be interviewed. These do need to tie back into the original aims and objectives, or these objectives need to be adapted. It is important to maintain this consistency. Sometimes specialised schedules can be developed for particular purposes such as: directed accounts of behaviour or thought processes; commentary on interventions; or accounts of specific events.
Other qualitative methods such as observation or expert review have different instruments and these will be outlined in those sections.
4. Ethical considerations
Three central principles of research ethics are respect for participants, justice and beneficence. These participant rights need to be supported, and participants deserve protection from researchers because of the power imbalance between them. Particular concerns in qualitative research include:
- Publishing their stories in a way that exposes them;
- Undermining their credibility or that of a grouping that they represent;
- Breaking confidentiality;
- Generating false information about them;
- Hurting or undermining them in any way during the research process.
Before research fieldwork begins, access should be negotiated with all relevant gatekeepers. This part of ethics is often overlooked or seen in purely functional terms. Working with the community being researched and its representatives is important because they are gatekeepers, for the protection of the research team, and because it is morally correct. Full access negotiation is particularly important in qualitative research because of the more intensive level of contact and the depth of information obtained. Access negotiation is also an ongoing process: while general access is obtained initially, it needs to be repeated at different levels with each new context entered within the site and for each new interview. Clear and honest accounts of the research must be given and an explanation provided of how the data will be used.
Confidentiality is a right of all respondents, especially here, where a great deal of detailed information is obtained from them. Interview sound files and transcripts must be password-protected. Remove all personal identifiers – name, address – from the interview records, keep the list of subjects separate from the interview data, and refer to interview data only by code. All recordings should be destroyed at the end of the project unless storage in an archive has been negotiated. Take care when writing up the analysis to protect respondents’ identities.
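One common way to keep identifiers separate from interview data is to assign each respondent an anonymous code and store the name-to-code key apart from (and more securely than) the transcripts. A minimal illustrative sketch, in which the names, code format and file handling are all hypothetical:

```python
import csv
import io

def assign_codes(names, prefix="R"):
    """Map each respondent name to an anonymous code, e.g. 'R01', 'R02'."""
    return {name: f"{prefix}{i:02d}" for i, name in enumerate(names, start=1)}

names = ["Thandi M.", "Sipho K."]
key = assign_codes(names)

# The key would be written to its own file, stored separately and
# password-protected in practice; transcripts carry only the code.
key_file = io.StringIO()
csv.writer(key_file).writerows(key.items())

print(key["Thandi M."])  # R01
```

The essential point is structural: nothing in the transcript files should allow a reader to recover a name without also having the separately held key.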
There are situations where confidentiality may be difficult to maintain. This is primarily where people are interviewed because they hold a particular position, even more so with high-profile people. Examples include nurses providing specialised services or interviews with university professors in particular disciplines. On these occasions it needs to be made clear that while you will try to protect respondents’ identity and avoid them being recognised there are risks that some readers of the final reports or papers will be able to recognise them. Care needs to be taken that respondents are not inadvertently compromised through the research.
We also have to be aware that we collect sensitive information, and as such we are responsible for acting on some of the knowledge that we obtain. There may be times when we have to break confidentiality, for example where respondents are in compromised situations:
- Depression to point of suicide;
- Child being abused;
- Where the respondent states that they intend to hurt others;
- Planning to commit violence.
You must warn respondents in advance that you will break confidentiality under these conditions.
Unintentional possible negative implications
On sensitive topics you need to warn participants that the interview may evoke an emotional response. Due care and sensitivity are needed when such a situation arises and, if necessary, the interview should be closed. You need to have a referral list for respondents who need it, especially where a respondent asks for help directly in the interview. In certain situations it may be possible, or necessary, to provide these backup services within the study. Make sure the respondent is not vulnerable at the end of the interview – for instance, that they will not expose themselves to risk behaviour or be suicidal.
Avoid getting into a counselling relationship with your interviewee, and be careful of giving advice. This is tempting and may be requested, but it increases your power and potentially gives you access to information you should not have. It implies a longer, more ongoing relationship, which is not what was consented to. It is also bad research. Refer people on.
You need to be able to refer respondents who are in need to an appropriate service. Resources need to be local and affordable. Use state services, NGOs and community-based organisations (CBOs). Include private structures, but you may need to warn that these will lead to charges. Bear in mind that the easiest service may not be appropriate if for example it may lead to stigmatisation – for instance, being referred to a workplace service that may lead to others finding out.
Full informed consent is required, as with any research interaction, including a full explanation of the research. Separate consent is needed to audio- or video-record the interaction, or at least very specific and clear mention of the recording. Backup counselling needs to be available for sensitive issues, e.g. HIV, violence, sexual abuse. With a child respondent, consent must be obtained from the parents, but must also be obtained from the child. Particular concerns attach to populations who are vulnerable due to their circumstances, such as prisoners and long-stay hospital patients.
The respondent is entitled to ask for the recorder to be turned off at any point, and may request that the recording be deleted and any notes destroyed. This must be respected. A power relationship exists, and you need to be careful not to exploit it to gain additional information; doing so will generally count against you, as people tend to alter what they say when facing a more powerful person.
The process of leaving also has to be negotiated, especially if closer connections have developed. Ensure that promises of follow-up are kept. Deal with any outstanding issues from the interview, such as any painful emotions that may have been released or requests for information that could not be dealt with during the interview. If you wish to correct any misconceptions that arose during the interview this is the time to do it.
Offering incentives for participation
Offering incentives for doing an interview has created some ethical conflict. Some argue that an incentive can be coercive, depending on its size. Others, including some respondents, have felt that simply taking information without offering anything in return is exploitative, especially when researchers are seen as obviously wealthier than their respondents and as profiting from the information gathered – the benefits to researchers include obtaining research funding. The concern often depends on the size of the incentive. At minimum, the incentive should cover the respondent for any costs associated with participation, such as transport to the venue.
Incentives for children are often smaller and we have found it better to offer gifts such as toys, educational equipment or air time for their cell phones rather than money. On some occasions we have observed parents appropriating money given to the child, so again gifts are better. If a parent has to accompany the child to the interview they may also require compensation.
Issues for observation research
This is a different situation, as a respondent may unintentionally give away key personal information through their behaviour. You still have to negotiate access with the key gatekeepers, especially for private venues. Observation must be open, with no covert research. Public spaces are easier to get ethical approval for; you can try to blend into the background, but be aware that this remains a position of power. Those being observed may try to confide in the researcher, which is not necessarily a problem, but you need to be careful how you use this information.
5. Writing a qualitative protocol
While more flexible in their approach, qualitative methods still require considerable rigour. Going back to the opening section, it is this systematic approach that makes it research, and this rigour is reflected in the protocol. A clear plan is required before you go into the field and should guide operations there. Of course, a first encounter may change the entire project, and that is acceptable. Fundamental to a qualitative proposal is a clear, developing narrative that flows from the opening statement or research question right through.
Outline and structure
As with any other research approach, the following form the basis and should constitute the headings.
- Literature review;
- Statement of question;
- Aims and objectives;
- Research design;
- Discussion or observation schedule.
The proposal must constitute a continuous flow. The literature review must be directed at the question, not just be a collection of as many references as possible. The aims and objectives must flow from the statement of question; you can have sub-aims and sub-objectives if there are related areas that you want to investigate. The sample and instruments must reflect the aims and objectives. If you find yourself adding questions to the instrument that are not in the aims and objectives, check whether these are important and, if so, go back and adjust, or add a new sub-objective.
Literature review
A focused account of the literature that covers the subject matter is required. As with any other literature review, this should cover relevant content material in both international and local settings and reflect on theory relevant to the question. While all literature is relevant, including quantitative data, the focus will be on qualitative data, and the context behind the studies needs to be considered. Remember that a literature review is part of the analysis, and so a qualitative analytical approach should be used. This forms the start of the narrative of your study.
Statement of question, aims and objectives
The statement of question must include the broad area of work that you want to investigate. Within that, what is the specific area that you want to look at? Give information on the context in which you wish to do the work and the relevance of the question to that context. Also state how you negotiated access to the community.
Be very clear about the information that you want to seek. This can be very specific or very broad; just know what you want to find out about. The aims should sketch out the area within the research question that you wish to address. The objectives should specify more directly what information you are seeking within that aim. Objectives are formulated in open or descriptive terms, not in terms of measurable variables or hypotheses. This should not put limits on what you could find out about the subject; clear aims and objectives do not block lateral thinking. You can adapt aims and objectives as you go, but it must be a reflective process, and for any adaptation you must go back to your proposal. Generally, adaptations mean adding a new aspect to the objectives as information on new connections arises. These should generally fit into the broad original question.
For example, disclosure of HIV status to children is difficult because parents often assume that their children are not ready to hear this information. How and when to tell children that they are HIV+ has caused considerable debate, as parents fear the negative impact of this on the child, including depression, negative self-worth or even self-destructive behaviour. But there are also concerns about risky behaviour, about whether the child will be able to adhere to medication, and about the child working it out for themselves without assistance and support. Developmental age can make a difference. Parents are the key figures here, so how they make this decision needs to be understood. The aims and objectives might be stated as follows:
- To assess how parents make the decision to disclose HIV+ status to their child.
- To understand parents' knowledge and attitudes about HIV;
- To understand what they feel about their child being HIV+;
- To understand what they see as the risks involved in disclosing to their child;
- To understand what they understand by disclosure;
- To ascertain how they talk to their child about the medication;
- To understand how the parent understands childhood;
- To identify signals that they and the child are ready to go through the process of disclosure.
Research design
A brief description of what you want to do is required, including the basic methodology and some specifics about the context of the work. You may also include more details on some of the specific issues that you wish to look at within the broader aims. This serves to set the framework for the technical components that follow. In the above example the design might be described as:
Individual in-depth interviews will be done with parents who have had different experiences of disclosure or who are preparing to go through a disclosure process with their child. These interviews will be conducted by the researcher who is trained in the methodology and is familiar with core issues around disclosure from reading the literature and from discussions with health staff. All the interviews will be recorded for later transcription. The focus for this study is on parents whose children attend services at a tertiary hospital. The study can later be expanded to include other sites.
Sample
The information required here typically includes the sampling frame, the intended sample size, the sampling procedures to be adopted and details of the types of person you are seeking. Note that in observational studies, for example of health facilities, the sample would be specified in terms of the sites that you will observe and the times when you will be observing them. Thus, in the above example we might decide to select 12 sets of parents and to sample them purposively with the assistance of the health staff who provide services to their children. The sample could then be split into those who have already disclosed to their children and those who have not. Subjects might be selected in order to obtain substantial variation as to: age of child; race/language of parents; education level of parents; level of illness in the child; and, for those who have disclosed, the age of the child at the point of disclosure.
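The maximum-variation logic described above can be thought of as a simple coverage audit of the proposed sample. The sketch below illustrates this under stated assumptions: the attribute names and the candidate records are entirely hypothetical, and the point is only to show how a purposive sample can be checked against the variation criteria before fieldwork begins.

```python
# Hypothetical candidate families described by the variation attributes
# mentioned in the text (disclosure status, child's age, language,
# parents' education level). In practice these would come from the
# health staff assisting with recruitment.
candidates = [
    {"id": 1, "disclosed": True,  "child_age": 8,  "language": "isiXhosa",  "parent_education": "primary"},
    {"id": 2, "disclosed": False, "child_age": 11, "language": "English",   "parent_education": "secondary"},
    {"id": 3, "disclosed": True,  "child_age": 13, "language": "Afrikaans", "parent_education": "tertiary"},
    {"id": 4, "disclosed": False, "child_age": 9,  "language": "isiXhosa",  "parent_education": "secondary"},
]

def coverage(sample, attribute):
    """Return the set of distinct values of one attribute in the sample."""
    return {person[attribute] for person in sample}

# Split the sample as the text suggests: disclosed vs not yet disclosed.
disclosed = [c for c in candidates if c["disclosed"]]
not_disclosed = [c for c in candidates if not c["disclosed"]]

# Audit: is each stratum non-empty, and is each attribute value represented?
print(len(disclosed), len(not_disclosed))
print(sorted(coverage(candidates, "language")))
print(sorted(coverage(candidates, "parent_education")))
```

The same check could be run on any attribute; a value missing from the output would flag a gap in the purposive sample to discuss with the recruiting staff.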
Discussion or observation schedule
Give a broad description of the specific content that you are looking for and of the nature of the instrument that you will use. Also motivate for the different components of the schedule. The provisional discussion schedule, or alternative instrument if you are using another methodology, should be attached. Of course there may be some variation from this original instrument.
Analysis
A description of your analysis process is needed. Include the preparation of the documents for analysis, for instance transcription and translation where necessary; the development of themes; the coding process; analytic notes; and how validity and reliability will be addressed. State your analytic approach where relevant, such as interpretive, grounded theory, phenomenology or feminist.
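As a minimal illustration of the coding step, each transcript segment can be tagged with one or more codes, and codes then grouped into broader themes for the analytic notes. The excerpts, code labels and theme groupings below are invented for the example; in real analysis they would be developed iteratively from the data.

```python
from collections import Counter

# Hypothetical coded transcript segments: (excerpt, list of codes).
segments = [
    ("I kept waiting for the right moment", ["timing", "readiness"]),
    ("I was scared she would blame me",     ["fear", "blame"]),
    ("The nurse said he was old enough",    ["timing", "health_worker_advice"]),
]

# Group codes into broader themes (in practice revised as analysis proceeds).
themes = {
    "deciding_when": {"timing", "readiness"},
    "emotional_barriers": {"fear", "blame"},
}

# Tally how many segments touch on each theme: a segment counts towards a
# theme if any of its codes belongs to that theme's code set.
theme_counts = Counter(
    theme
    for _, codes in segments
    for theme, members in themes.items()
    if members & set(codes)
)
print(theme_counts)
```

A tally like this is only a starting point; the qualitative weight of a theme comes from the content of the excerpts grouped under it, not the counts alone.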
Ethical considerations
Be clear on all ethical aspects of the project. Key issues include: consent letters, with special mention of recording the interview, including additional consents for children; confidentiality protections; protection of data; referral plans; a statement on the destruction of sound files; the potential need to break confidentiality, if there is a risk of this; and the right of the respondent to cancel the interview at any time.
Ongoing revisions of protocols
Remember that in qualitative research the protocol is a living document. It can be adapted as you identify new questions, or identify clear answers to some of your questions and want to focus on specific issues in new interviews. New target groups can also be identified. If there are substantial changes to your protocol, these have to be taken to your IRB. If you are uncertain about this, rather err on the side of caution.
6. Individual in-depth interviews
Individual in-depth interviews are the most common form of qualitative data collection. The interview takes the form of a discussion between the interviewer and interviewee, very similar to a conversation but with rigour and a clear initial focus. The interviewer also has to take their own views out of the situation. The interviewer sets the question and directs the discussion to make sure the respondent stays on track, but the respondent has the freedom to cover the information in their own terms.
It is similar but not identical to a clinical interview. The clinical interview can produce qualitative data, but that is a specific type of investigation. You can learn from the clinical interview in terms of the use of questions and flow of ideas, as well as interactive skills, but you have to remove the categorical focuses, such as diagnosis. You are interested in the interviewee's world as they construct it, not as you construct it using diagnostic and medical systems.
The point is to explore the interviewee's reality and meaning system from their own perspective. This is useful for respondents, as it allows them space for personal exploration and detailed investigation of their own understandings of the world. If this occurs it is great for the interviewer, as the space can be used for greater exploration. There is a power relationship in the interview; be aware of it and try to balance it.
Set a fixed appointment at a suitable place and time. The place should be considered safe and acceptable. Give a clear explanation of the research and the use of the material. The interview should preferably last not much more than an hour, though this varies by age and other demographics. Make sure they have transport. If you can find a stable place to do all of your interviews it helps both you and the respondents. Be prepared yourself: if using incentives, make sure these are ready, and make sure that you have all equipment with you and that it works. It may be an idea to have water, juice or a snack available depending on the person; this is especially important in contexts of extreme poverty.
Privacy and quiet are very important. There should be NO interruptions, so no phones, and the venue should be out of any thoroughfare. Warn those around not to interrupt. No TV or computer screens; these are focal points for eyes. Use comfortable upright chairs. Arrange chairs at about a 70 degree angle so you do not look directly at each other, and avoid having a table or desk in between you. The recorder should be placed near to both of you. Ensure a friendly environment, so avoid places that have meaning already loaded into them, e.g. doctors' or nurses' offices, academic offices. Be careful about posters, décor, etc. on the walls. Even innocuous images like a family photo, religious message or HIV education poster can place restrictions on the interviewee. But you never know where you will end up, especially if you give the respondent the choice.
Get good equipment, as your research depends on this. There is nothing worse than doing an interview and then not being able to transcribe it. Speak to someone experienced in the field for advice. Make sure the microphone is good; you can use external or lapel mikes. Always check equipment before the interview, including batteries and recording space. Do occasional checks during the interview, quietly and as surreptitiously as possible. If the interviewee notices, state what you are doing and reinforce that what they are saying is valuable and you do not want to lose it.
Introduction and preparation
Respect any concerns expressed. Make sure the consent form is signed and use that to establish clarity. Make it clear that you are recording the interview. Explain the right of the interviewee to turn off and delete the recording. Set an open and cordial atmosphere; the initial phase is important for establishing trust. Make sure that the interview process is understood. Explain what will happen to the data and how it will be used, including confidentiality. Introduce yourself as is appropriate to the interview setting.
Researchers need to prepare both in terms of the content area and personal preparation. You will usually enter with a discussion schedule of key points. This should not impose discussion, but act as points to initiate discussion and to ensure coverage of important areas. Less structure generally means more openness and free thinking. Examine your understanding of the subject matter so that this does not interfere. If the subject matter is potentially sensitive or shocking then you need to prepare yourself for that. You need to be able to control your own emotions and to contain your respondent's emotions. Interviewees may cry or get angry; you yourself may feel the urge to cry or get angry. Be aware that even a seemingly innocuous question like 'what did you have for lunch?' can produce an emotional response depending on the situation.
Your own presentation will impact on the research process, including your gender, race and class; your language, the words you use, your dress, your non-verbals and your emotions. Be yourself, but indicate acceptance. Remember that the interviewee is observing you and looking for your response to guide them; this happens in any conversation. You have to take them back into their own reality. You can elicit a very powerful interview when this happens.
When leading a team, match your interviewers to respondents as far as possible on gender, race, language and community background, but do not do this simplistically. Getting too close can also be a problem; a respondent will not want to share personal secrets with a neighbour. Access negotiation is very important here and will also provide a level of trust. It is useful to meet respondents before the interview, when setting up, so you can assess each other. Skill and experience can get around these blocks, but it does take time to get there.
Language and behaviour in the interview
Try to match language and dress style to your respondent. Dress down if the interviewee is not formally dressed. Use the locally correct language, e.g. 'tik' rather than 'methamphetamine'. But where some term is unclear, ask; if trust exists people are very willing to help, and asking for help balances power and often assists in opening up. Neutrality and acceptability are watchwords here. Be yourself; acting a role very seldom works, especially in extreme and fringe groups. Acknowledge commonality and difference. Be very careful around issues of physical contact. Entry into personal space generates its own dynamics, and you cannot always be sure of the meaning attached. Check before you offer reassurance such as a hug or even a tap on the knee. This also applies to the physical distance you keep during the interview. Similarly, be aware of the impact of sympathy: empathy is fine, but sympathy can cause problems.
Some writers talk about presenting yourself as naïve to the subject matter. This is a tactic, but remember you are always naïve as to the interviewee’s reality. Even if the material is something that you know well, such as a parent talking about children’s first day at school. You may guess at some aspects of commonality, but you need to check this all the time. A tactic is to treat this as a new perspective on an old reality, like the latest movie version of Romeo and Juliet. Or to acknowledge in your own mind that they are the experts on their own life.
There will be times when it is hard to maintain a positive attitude, especially when interviewees are offensive, have made bad choices, are abusive or violent, or will test you with offensive language or actions. You need to be understanding, avoid judgement and know your own limits.
Power in the relationship
As in any relationship, power differentials exist. You get your best results when power is more evenly distributed. Generally you as researcher will have more power. This can generate distrust, so you need to be open. The more power you hand over, the more open the respondent will be, but as with language and dress, be genuine; especially in first interviews, be aware of your own nervousness. Power can go the other way: interviewees can try to dominate, either from common practice or from fear. Business leaders, gang members, politicians and lecturers are used to dominance. Don't push, but be clear on your control and your need to direct the interview where necessary. Power can be used to avoid sensitive topics. Power issues can also arise from unsafe situations. Always try to make sure that you are safe, and again know what you are prepared to live with. Examples of where you need to take particular care are with people on the run, drug users, sex workers and unfamiliar areas.
7. Qualitative interview process and techniques
This section explores some of the key tools and processes that you can use or explore during the interview. All play an important role in doing a successful interview and together constitute a strong technique. These are presented as simply as possible, but can in reality be complex and difficult to implement, and take much practice. This practice becomes the focus of the practical sessions. The material will be presented in conjunction with direct stories and examples from research that I have done or read.
The interview process can be facilitated by a number of functions. Usually start with a general open question. This can be followed up with other questions to cover the broader areas of interest. The role of discussion schedules will be looked at later. Essentially you want to ensure a smooth progression of information to recreate the respondent's world, or this aspect of it, in words. Avoid, whenever possible, jumping between topics and from one question to another. Three key techniques for drawing out responses are summarising, clarification questions and the why question.
Summarising involves reflecting back what the respondent says to provide an immediate check on the interviewer's understanding, and to allow the interviewee to further develop his or her thoughts on the question. You can think of it as reflective listening. It keeps you present in the interview, maintains the flow of the interview and shows understanding. It also allows you constantly to check the validity of your understanding of what they are saying. In an ideal interview this would constitute almost all of what you would say.
Clarification questions serve to get explanations on issues raised by the interviewee about which the researcher is unclear; especially in new areas of work, unfamiliar terms or acronyms may be used. It is better to try to keep the conversation going, but often you need to know what something means to understand the interviewee; for example, 'I apologise for interrupting your flow of thinking; you keep mentioning the term "stop" in reference to dagga, can you explain what it means?' Once you have the response, thank the person and then put them back on track with a reflective summary.
Why questions can be used to push the respondent to think about an issue at a deeper level. A powerful and incisive tool in the interview process. It pushes the person to look behind their beliefs to what these are based on; for example, ‘you said that you felt uncomfortable going into the tik house for the first time, why was that?’ It is dangerous as it can meet with a rebuttal if used inappropriately; and even if you get a positive response it can cut you off from other information that is more broadly descriptive but important. It should only be used very selectively and not in all interviews.
Models of the mind: the onion, the mind map and the narrative
The onion model is used to signify the different levels of information and meaning of a person/respondent. The outer layers hold the more superficial information that is easily shared and generally shown to the world. The next set of layers holds more personal information, including some of the background and reasons for those superficial presentations; here some information may even appear to contradict what is expressed more superficially. As you go deeper, more inherent understandings and deeper values are expressed, until at the core are found the basic values that the person uses to define themselves.
The mind map is useful as a model of how different items of information, belief, emotion, etc. link together in the mind. All of our ideas are in one way or another linked. So in an interview we may follow one path to get an understanding of how thinking in that area is structured. Then return to another strand of information to explore that. The additional complexity that is more difficult to show is how these links develop interconnections between them. But it still provides a useful model when exploring how a person constructs their information and their narrative on the question(s) that you are posing.
The person will generally present their information in the form of a story or narrative. While this is generally not a story that unfolds over time, as in the written accounts we read of people's lives, it takes on a process, with some of these experiences intertwined. The narrative is the connecting theme or themes that they use to mesh it all together. When these themes are identified and followed, it strongly enhances the interview. They can be broad constructs such as justice or love, direct sets of experiences such as abuse or conflict, a search for pleasure or power, or any other of a wide range of constructs.
Writing notes during the interview
Writing notes during the interview should usually be kept to a minimum. It moves your attention away from the interviewee, causes breaks in the dialogue, can interrupt the interviewee and is often interpreted by the interviewee. But minimal short notes can be useful if there is something you want to remember to come back to. Keep these short, avoid interruptions by multitasking, and keep them organised in your notebook. Have a notebook and pen ready in advance, and explain at the beginning to the respondent that you may take notes and why. Especially explain that these notes do not indicate that something is more important.
The interviewee’s experience
The interview is often a unique experience for the interviewee: an opportunity to reflect without judgement or outside purpose on an aspect of their lives; to be listened to with rapt attention for an hour; to be accepted no matter what you say; to be treated as important for what you are saying. You do not have to reciprocate, there are no clever return comments, and if there are incentives you get paid to do it. It can be better than therapy or a lover. It can also be an undiluted re-experiencing of deep pain and trauma. If the respondent becomes very upset, offer support and maybe take a break. Check when the person is ready to continue and whether they are able to carry on. It is often important for them to go through their story in an accepting environment. Avoid counselling in the interview; it is unethical and bad research. Have a referral list of options where the person can go for assistance. Remember you are a researcher here, not a counsellor or social worker.
Some respondents want to ask questions during the interview. If these are about yourself, be honest but brief, though not if this will impact on the interview. Also avoid answering questions about the interview content; rather reflect back that you want to know what the interviewee thinks. Do not correct an interviewee during the interview, even if you feel their perceptions are dangerous to them, such as a belief that HIV does not exist. You can defer answers or the giving of correct information until after the interview. Never lie, unless you feel threatened. If you do feel threatened, then get out as quickly as possible.
Ending the interview
Conclude interviews with care and with respect for the emotions and ideas raised in the interview. Any unresolved emotions need to be dealt with, with referral to outside agencies where necessary. Clarify again how the data is to be used and offer access to reports or papers. You can chat informally afterwards, but still be careful about self-exposure and try to minimise talking about the effect of the interview on you.
8. Focus group interviewing
Focus groups, like in-depth interviews and observation, are tools in the qualitative toolbox; that is, like a hammer, screwdriver and drill, each has its specific purpose, application and place in the research plan. The key issue for focus groups is the use of group dynamics. Group discussions stimulate dynamic conversations, which lead to discovery, exploration, direction and depth on topics.
Definition and history
A focus group is a group of individuals selected and assembled by researchers to discuss and comment on, from personal experience, the topic that is the subject of the research (Powell and Single 1996). The focus group concept is about 50 years old and, like many modern innovations, its roots date back to the Second World War, when a group of sociologists were asked to investigate how the military's propaganda films were being received by their audiences. They learned that, with proper prodding, people can identify the exact reason certain scenes, lines or phrases make them think or act in a certain way. The typical focus group consists of five to 13 participants who gather for a period of 40 to 90 minutes to talk about a prearranged topic under the guidance of a facilitator. It mimics in many respects the dinner party or pub conversation, but with the discussion more focused and a facilitator maintaining the process.
The power of the focus group is its capacity to draw out shared knowledge and social norms around the key issues being discussed. The discussion format produces information on group or community norms and practices, and will often draw out the extreme behaviours and sensationalism (usually associated with people outside the group). The nature of the sharing process may encourage people to share where they may not have shared in another environment.
Specific roles and use of groups
The focus group produces shared knowledge, with the group building the narrative and information system together, as against the individual interview, which is done alone. It allows members to confront one another and exchange ideas, building a more coherent story. It is often not good for collecting sensitive information (depending on context), unless the group is well structured for this. It cannot be used to produce a set of stories or to draw out individual information and stories from all the individuals; one group is not equivalent to six to ten individual interviews, despite what is commonly stated in methodology discussions.
Focus group structure
There are standard features, but as indicated above they may vary by the demands of the situation. Size should be between five and 15 members, ideally about seven to 12. One main facilitator should work with a co-facilitator, who assists by picking up on issues missed by the facilitator and looking after the functional issues. The session should last about 40–90 minutes depending on the group members. You can break the session in the middle with a toilet break or for tea; this can allow the group to be extended in overall time. The group should be seated in a circle with no furniture in between, except maybe a small table for the recorder.
Set a fixed appointment at a suitable place and time. As described for individual interviews above, the place should be considered safe and acceptable. Again, privacy and quiet are fundamental, with no interruptions, no phones and out of thoroughfare. Warn those around not to interrupt. There should be no TV or computer screens, these are focal points for eyes. Use comfortable upright chairs, no sofas. The recorder should be in a place near to the (co-)facilitator in the middle of the group.
Get good equipment, particularly for focus groups as there are more people talking and most will be further from the mikes. Speak to someone experienced in the field for advice. Make sure the microphone is good. You can use external or lapel mikes. Always check equipment before the interview, including batteries and recording space. Keep checking throughout the session.
Selection of members
Sampling and selection of members are important. You can select naturally existing groups, such as a group of women traders or a group of school friends, or alternatively select from a predefined population. Selection of members is strategic, based on the aims and objectives of the study and on the study context. Consider homogeneous vs heterogeneous groups. Sensitivity of content can influence decisions, for example single-sex groups when discussing sex. Again, it is important not to just use convenience sampling, and all decisions need to be clearly motivated. Be clear about what questions you feel the group members will be able to answer.
Before the session
Prepare a clear interview schedule as described in the earlier section, but take note of the particular function and structure of the focus group. If you are using others as facilitators make sure they are thoroughly trained in the content and approach. Access negotiation is very important here and that will also indicate a level of trust. Sort out any logistics such as securing the venue and transport for the participants.
Be prepared yourself. If using incentives make sure these are ready. Make sure that you have all equipment with you and that it works. May be an idea to have water, juice or a snack available depending on the group, either for the break or before the group starts. Make sure that consent forms and other necessary documents are ready.
At the start of the session
Make sure the consent process is dealt with thoroughly and the consent forms are signed. Make it clear that you are recording the interview. Explain the right of the interviewee to turn off and delete the recording. Make sure that the interview process is understood. Explain what will happen to the data and how it will be used, including confidentiality. Introduce yourself as is appropriate to the interview setting. Establish a group contract which includes protection of everyone’s confidentiality.
During the group discussion
Building rapport is your main job as a facilitator. Prepare well. A well-prepared facilitator calms everyone. Relax. Participants will pick up any anxiety. Follow the discussion schedule in your head. Start off easy to settle nerves. Use a warm-up question if necessary to get the group started. Be clear and neutral. Ask, listen and ask. Encourage interaction, including engaging with each participant informally before starting the session. Check for any problems in the group that could hinder interaction.
The facilitator's role will vary: depending on the nature of the group, the facilitator may intervene more than in an in-depth interview.
The characteristics of a good facilitator include: curious, with a desire to learn, enjoying asking questions and listening to the answers; outgoing; flexible but persistent; open-minded; able to direct conversations; and analytical and sceptical, not accepting answers at face value. While most of us would like to believe that we possess many of these qualities, in fact good facilitators are hard to find. Reflect seriously on the extent to which you fit this description and consider whether the research might benefit from allocating this role to a colleague.
For those acting as facilitators it is important to remember that their characteristics will impact on the research process. These will include their gender, race, social class, language, the words they use, their dress, non-verbal communications and emotions. It will be useful to match the characteristics of interviewers to respondents as far as possible but it can be a serious mistake for them to try to pretend to be someone that they are not. Much better if they are honest but indicate openness and respect for others.
Facilitators must be very aware of the power relationship that exists in the focus group. Their task is to hand over leadership to others temporarily, not to relinquish it entirely. There are some standard procedures that should be observed. It is important to allow ample time for responses after posing a question. You will probably need to curb dominant participants, draw in quiet ones and act politely to prevent multiple conversations. All contributions should be carefully acknowledged. If pivotal points are raised, take time to make sure that all participants have understood what has been said. Try to avoid any expression of your own opinions, including body language which seems to welcome or reject a contribution. You will need to keep careful track of the process and be aware of both alliances and tensions within the group. Be prepared to break strategically if necessary to avoid disruption. Remember that:
- People sometimes cannot explain why they behave the way they do. Many behaviours are automatic. People don’t think about them.
- Attitudes are complex. They consist of knowledge, perceptions, beliefs, feelings, desires, and opinions buried deep in the subconscious mind. People cannot explain their subconscious.
- Emotions influence behaviour. But again, most people can’t explain their emotions and many prefer to keep sensitive emotions secret.
- Issues of culture are often not directly explainable, and respondents may not even be aware of what they base their beliefs or behaviour on.
- Participants will tend to serve up socially acceptable opinions. They don’t want to reveal their inner secrets about sensitive matters particularly in a group format.
- Members are affected by one another and so will respond in terms of the group norms, which may not reflect their own views.
- Most often participants will talk about the behaviour of others, rather than their own.
- You may get sensationalist responses. These do need to be treated with care in the analysis.
- The more sensitive the topic, the more on guard the facilitator should be.
The role of the facilitator can be made much easier by an efficient and sensitive co-facilitator, whose role includes: monitoring for people or issues that the facilitator may not be picking up on or giving enough focus to; keeping a check on the group atmosphere, particularly any changes in mood; and writing key ideas down, using the words of participants or paraphrasing for brevity. The co-facilitator should introduce themselves and explain their role at the start of the focus group. While the group is taking place they should take responsibility for preventing interruptions from outside and keeping a check on the recording equipment. They should debrief with the facilitator immediately after the session.
Hidden helpers may also exist. In many groups there will be a person who acts as an unscripted supporter in maintaining the discussion. If you can identify this person and work with them they can make the process considerably easier, but do not expose them. They will not be aware that they are playing this role; they are simply more sensitive to the issues and to the group process, and so try to help keep the group on track.
Use the same set of approaches as in the individual interview, namely general questions, summaries, clarification questions, why questions, and prompts. Be aware that there are often group members who create problems or who need special attention. These include difficult people who complain and disrupt: dominators; perpetual commentators; silent members; mockers; and those who drain the group's energy, for example by making it clear that they cannot wait for the end, or by focusing outside the group. Problematic interruptions include cell phones going off, toilet breaks, interpersonal interactions, going over time – at least in some group members' minds – and interference from outside that is picked up by group members.
When ending the session, conclude the process with care and with respect for the emotions and ideas raised. Any unresolved emotions need to be dealt with, with referral to outside agencies where necessary. Clarify again how the data are to be used, and offer access to reports or papers. It can often be useful to chat informally with respondents afterwards.
Limitations of the method:
Focus groups can be of little value when respondents aim to please the facilitator or other members of the group rather than offering their own opinions or evaluations. Data are often ‘cherry picked’ by researchers to support a foregone conclusion. The disastrous introduction of ‘New Coke’ in the 1980s is a vivid example of focus group analysis gone bad. Groups will often not draw out individual information or stories; these may appear, but generally as part of the group narrative, so individual stories need to be used carefully in the analysis.
9. Participant observation
Observation has always been fundamental to health research: it is part of cohort design, patient observation and the observation of epidemics, all of which involve systematic processes of information collection that vary in form. Qualitative research uses systematic, detailed, descriptive observation of behaviour and conversation. The researcher can either observe from a distance or become directly involved in respondents’ lives and experience their reality. Historically this was also one of the earliest forms of qualitative research, linked to ethnography and the early anthropological investigations of different cultures, which involved living in and interacting on an ongoing basis with a community – immersion. The approach then began to be applied to different groups in the urban environment, notably by the Chicago School, and ethnographic research remains central to the tradition. Among the first applications to health was observing TB patients’ progress through treatment.
Interview-based methods are more efficient in that respondents comment directly on their lives and influences, but we cannot be certain of the truth of what they say and there are limits to what people will talk about; observation is the next step. It is based on the assumption that understanding can only be achieved by participating in and observing the subject’s world. The ultimate immersion – actually living full-time in your subjects’ world – is possible, but whether to do so is a strategic and practical decision.
Power of the methodology
By observing over an extended period and becoming part of the life of the observed, you get access to considerably more information. The key point is that observation is done in the natural setting. In their own setting people are much more open to showing the unexpected and offer broader insights. Because those observed remain in their own reality, they are more likely to act and speak naturally and in a way that reflects their context, rather than being constrained by a research context. It is therefore potentially a very powerful tool.
The pure observational methodology is not commonly used in health research, but aspects of it are widely used, usually as part of a mixture of methods. Applications include the operation of health systems, examining how decision-making is done, disease history, the spread of epidemics, health risks and even risky sexual behaviours, patient observation and cohort studies, understanding sites conducive to high risk, the development of research questions, intervention development and intervention evaluation, particularly of how interventions are applied.
Observation can also be useful as part of note-taking and descriptions of the context of other data collection. Everything that you see could be important, and research diaries can be valuable in all research (their use is covered in a later section), but you need to decide what is important for any given project. Observation is often ignored in general projects because it is perceived as too subjective, or because the observed information is felt to be too difficult to draw into analysis, but this often leads to vital information being missed.
Observation data come from various sources: observations of behaviour, conversations with participants, chance interactions, even secondary sources. Data usually take the form of written descriptions, with some additional sources that may include numbers, e.g. the number of drinks sold in a bar. All of this is recorded from the perception of the observer/researcher/fieldworker. Even if observation is not a core focus of the research, it is important to do it initially to become familiar with the site, especially if the research site is unfamiliar. In reality, observation should be part of every research project, as it allows additional detail to be noted, even in quantitative research.
Data in observational research
What counts as data in observational research? Everything that you observe and hear. The other senses can also be useful: smell, touch, even taste. You also need to note what you do not see: record both what is in front of you and what people try to hide.
10. Workshops
Workshops are another potential source of information. These are facilitated, directed forums for discussion. Workshops extend over a longer period than a focus group – up to several days – so there is more space to explore issues, and people can move in and out. They have a more structured discussion process, including an agenda and a chairperson. They can be much larger in size – up to 50 or 60 people – and still be manageable. They are not always audio-recorded, so data may rely on notes or on documents developed in the process. They usually have an action purpose as well as a research one. Workshops are often set up for purposes other than research – planning, forums for discussion of principles or interventions, meetings between larger groups of people, negotiation, education and training, among others – but these can still be used for research. If the subject is of interest to its participants, a workshop tends to produce enormous amounts of information; if it can be constructed for a research purpose, the data collection becomes more systematic.
Workshops can be appropriate in multiple contexts. They are useful for policy critique, evaluations, intervention development, intervention planning and assessments of context prior to beginning work in a new area. Workshops often form a part of a broader research process. Outputs may have multiple forms – output documents, transcribed notes, behavioural observation. Each of these can contribute to a triangulated analysis or can produce its own set of outputs. It is a methodology familiar to NGOs and CBOs and has been used extensively.
Specific practical arrangements are required. Workshops are very different from the focus group in that respondents need to come prepared to discuss the material towards a specific goal. The venue needs to be large and suitable for large group discussion, preferably with breakaway rooms for smaller group discussions. A strong chair(s) and scribes need to be included to facilitate and record the discussions. A clear agenda is needed, preferably sent out in advance. It is often useful to get participants to prepare inputs in advance. Catering is often required, especially if the meeting continues over a full day or more.
Selection of participants
The nature of the workshop will often define the participants. Usually these are people and/or organisational representatives who are involved in or knowledgeable about the issue or place under discussion. Selection is directly purposive, aiming to find the best possible people to participate. Usually only one or at most two workshops will be held, so there is a focus on getting the right people there. Care must be taken over the mix so that conflict does not occur and no group inhibits another.
Given the length of time that a workshop takes and the comparative size, it is not always practical to record and transcribe a full workshop. This may depend on the relative importance of the event. The most common data are the workshop minutes, sheets of newsprint, jointly created documents, copies of the direct prepared inputs from speakers and notes from small groups. The scribes should also take more detailed notes of discussions. Recordings can be kept, but these may be of less importance in this approach. A separate scribe may be introduced to note group dynamics and group process separate from the content. Discussions with participants during breaks may also give information on the process and on content that cannot be raised publicly. Validity of the outcomes can be confirmed by constantly checking back with the members about resolutions taken. Minutes can also be distributed for members to confirm.
Again, specific considerations apply: setting clear aims and objectives that are agreed by all (a more power-balanced approach); selecting who should participate; and setting the agenda and process. Facilitating these meetings requires strong skills and benefits from experience. Be clear on the potential need for confidentiality and on the use of final documents.
The workshop series uses this approach in a developmental manner. The first workshop takes discussion to a point where some level of clarity has emerged. This discussion is summarised by facilitators, distributed prior to the next workshop and used as a basis for discussion there. Additional investigations may be undertaken in the interim to further inform discussion. This process can continue until a full understanding or solution is obtained, or there could be a specified number of meetings based on resources. For example, a situation review of a clinic in which there is high patient dissatisfaction may go through several cycles to identify all the problems. A policy review may also need to focus on different issues, such as content, implementation and resources. Additional information, or the need for additional participants to adequately address an issue, may be identified in an earlier workshop and then included later. Sometimes the complexity of the issues at hand requires periods of contemplation.
An adaptation of the Delphi technique is one type of workshop series. It arises in a highly specialised situation where the researcher has to find out ‘the truth’, or at least some shared account, between two or more groups of informants. Sessions are repeated with the divergent groups, either separately or together, where conflicting ideas are confronted and some shared position is found, or some evidence is sought. This process continues until there is agreement or sufficient shared clarity.
Workshops can be conceived in terms of a social action model. The social action model implies a process of community development or social change. At its simplest level, a problem is identified and the community is brought together to find a solution. Ideas are tested and then at a later fixed time, the community meets again to decide the next steps until a solution is found. The workshop methodology is crucial as a way of drawing different relevant parties together. The advantage the workshop holds here is the openness of the method to allow multiple participants to contribute, providing it is well facilitated.
11. Additional methods of data collection
Qualitative research methodology is plastic and open to adaptation, so you can use your imagination to develop ways to collect relevant information. You need to do whatever you can to understand the reality of your target populations, while operating within the overall philosophy and within ethical practice. These are some examples that are found in the literature or that I have developed for use, but they are not the full universe of options. Use the material from the other sections to set your base, and then work with combinations of these.
Case studies
The case study is more an approach to organising data around the ‘object’ being examined than a specific method. It has a long history in health research. There are some famous case studies of individual patients, such as those reported at length by Sigmund Freud, but case studies can also be done on a community and even a regional or national basis. They consist mainly of observations of patients with key biological, symptom and other data. The case study is very often the first step in research, from which later research programmes develop. It can also be used to give life to the description of a larger experimental design, and to describe interventions, treatment centres, communities and issues of social concern, among other phenomena. It usually involves the integration of data from multiple sources: interviews, groups, observation, existing documentation, and analysis of ongoing reports that may have been designed for other purposes. It will often require interaction with quantitative data, cost measures and health information. It is very useful for initial implementation processes for interventions, particularly when applied to a real-world setting. Set aims based on context and methods, and collect information specific to these. The analytic phase often requires triangulation of results.
Example: implementation of a prevention of mother-to-child transmission (PMTCT) programme in a new clinic. The aims are to ascertain the use and quality of services and to identify problems or blocks to service. Methods include observation of prospective mothers at information meetings and in the clinics during return visits; interviews with mothers, nurses and counsellors; checking documentation on adherence to the antenatal care regime; and following a cohort of women over time from testing through delivery to completion of treatment.
This approach can quickly expose problems such as: areas of risk, e.g. not entering programme due to lack of understanding; areas of potential breakdown, e.g. queues for service and attitude of staff; lack of clarity about interventions in contexts, e.g. informed consent to test; circumstantial problems, e.g. stigma; and additional concerns, e.g. information and lifestyle problems. It is especially useful for complex interventions or contexts where it is difficult to take all potential influences into account. It is an often criticised approach, primarily as there is no comparator and so findings cannot be easily generalised. But that is not the only role of research. Description is important and can be highly enlightening. This is where the case study is vital. Very little is generally written about this approach.
Advantages of the case study
It is possible to use any information relating to the subject in a case study, and to look at the material in whatever depth is required. There are fewer restrictions than are implied in comparison-based research. The case study offers a lot of space to explore new ideas as they arise in the research process and is ideal for describing processes and setting up protocols for work or treatment.
Case studies can be made more rigorous. Take a range of measures over time, rather than using memory. Use standardised measures. Check validity and reliability of the data used. Get some independent person(s) to collect or check data. Use accepted technical approaches from other methodologies, e.g. recording and transcribing. Check for data sources that may have been missed, especially those that may show a different outcome. Again be clear on the limitations and present these. Closeness to the data can create a false sense of certainty. Get another person to read your report critically.
Analysis of existing documents and objects
Existing documents and visual media have a long history as a basis for interpretation. This has dominated schools of history, literary studies and media studies amongst others. The idea is to draw out information on those who produced them, or those that are represented in them, or about the time or context that they lived in. This can be the distant or recent past, even the current period where it is hard to contact people directly involved.
Sources of information include regularly produced documents like minutes, disciplinary hearings, school reports; media articles including news reports, reviews, cultural or fashion assessments; event records like formal or evaluation reports, media records or photos/videos; fictional stories including novels, short stories, poetry, movies; photos or videos of events; tweets and blogs; SMS-based counselling sessions on HIV; plus anything else that you choose.
You can get access to groups or events that you could not reach otherwise, including in other countries or long in the past, and there is often a lot of material available. Some of the disadvantages are that you have to deal with the subjectivity of the person writing the reports or taking the pictures, which may be hard to get around – for example, history is written by the winners of conflicts; often the right material is not available; and you have no control over the production or the nature of the information. Key methodological issues still apply. Aims and objectives have to be clear, and often an initial objective is needed to identify how much information is available. You sample by selecting which material to use, such as choosing only one day of newspapers a week. If the items you are looking at are more current you may wish to interview people involved in the event. The analytic method may be similar to what is described in the later section, or a more innovative interpretive approach may be required; photos certainly need a more interpretive approach.
Projective assessment, role play and visualisation exercises
Projective techniques, role play and visualisation exercises are all techniques that allow the researcher to get beyond the mask or publicly acceptable face that we all present to the world. These are tools that can be added to specific studies where needed. They are sophisticated approaches and should only be used with proper training and with great care. Projective testing is usually the domain of specifically trained mental health professionals. The most recognised projective tests are the Rorschach, the Thematic Apperception Test (TAT) and drawings, but you can use a broad range of images and get people to reflect on them. Interpretations are then made about what they see and how this makes them feel.
Role play involves creating a situation in the research space where the subject(s) can play out a real-life situation. Visualisation exercises create a similar effect, but without the acting out. With young children, observation of play and interpretation of drawings can elicit information. Again, a high level of skill and knowledge is required of the researcher here. The person needs to be asked about their experience once the exercise is complete. Data recording is similar to that for observation in the role plays, but digital recordings should also be made. These approaches are useful where the information you seek may be suppressed, or where the data are sensitive and the person will find talking uncomfortable: for example, looking at the impact of trauma such as rape, child abuse or torture, at the motivations of perpetrators of crimes or abuse, or at unconscious factors that may influence behaviours. These are intensive exercises that require additional time and particular consents, as you are seeking information that the person may be hiding intentionally or unintentionally. They require a mixed analytic approach to take different elements of the data into account.
Expert reviews
This method was developed for policy analysis and the review of technical documentation, such as diagnostic scales; it is especially useful if you are not highly skilled in the area of the review. Send a copy of the document to a set of experts with terms of reference stating exactly how they are meant to comment on it. Each reviews it and writes a brief report for a small fee; the data are the reports from these experts. Define your experts according to the task that has to be done, and sample across them to get variation and quality. The sample of experts is usually small, four to ten people, as it is expensive. Set tight terms of reference and pay in accordance with the task – professional fees of 50–500 euros or more. The analysis is a modified content analysis, cutting and pasting across the different reports, with you adding a linking discussion.
Life histories
The telling of life stories is becoming increasingly important in health research: for institutional development and background, histories of illness and exposure, and histories of communities and health concerns. Research on trauma and its impacts has made strong use of this method. The person is asked to recount their life story as a whole, or to talk about particular periods, usually in the context of their full life story. Oral history is a particularly powerful tradition in this area. It is usually done through interviews, often more than one, even up to five or more. Stories are often backed up by pictures, newspaper stories, other published accounts, friends’ stories and objects of importance from a particular time. The District Six Museum, and other similar museums, places or groups of people, are great examples of such collections of information. Confidentiality will vary according to the purpose of the interview. Data are tape-recorded. The discussion follows the person’s life, but may focus on particular times, events or issues. Multiple people who share a common experience will often be interviewed to find a shared understanding of the event. The analysis aims to clarify a period of time or event in the context of this person’s or multiple people’s lives.
Sometimes, where it is difficult to access stories or memories, special techniques can be used: body scans; ropes and stones; memory boxes; even scrapbooking and photo essays or posters.
Body scans: can be used with trauma survivors or people with eating disorders. The outline of the body is traced on a large sheet of paper. The person then writes on the paper in different parts of the body about pains that they experience there, associations with that part, things that happened to that body part, why they love or dislike that part of themselves. Putting it on paper provides a distance allowing the person to talk about their life and experiences or body with less anxiety and trauma.
Ropes and stones: one of a range of approaches used to reconstruct life stories. This was very successfully used with refugee children in West Africa. Children had often travelled for more than a year, and even before migration their lives had been unsettled; trauma was confused between what they had experienced in their home country, on their travels and in their current destination. A rope is laid down on the ground which represents the child’s life. Stones are bad events, sticks are other important changes and buttons are positive events. The ordering is often a shared task, with the child leading and others – the researcher, other children or family – assisting. It allows events to be ordered and gives both the child and the researcher a sense of order. The child can then tell the story.
Memory boxes: both a therapeutic approach for grief and a way of gathering information. Developed around the HIV epidemic for families where parents with young children are dying. A memory box would store the parent’s life story and messages from the parent to the child. It would usually include a book with a written story, photographs, print media where available, favourite objects, books, etc. Most often used as a family tool for passing on memories of parents, but has been used as a research tool to really understand the lives of those families affected by AIDS. At a simpler level, scrapbooks could form a similar function in a different context.
Photo essays or posters: involve respondents being given a camera and being asked to go out and take photographs that hold meaning for them around a particular theme. The photos are organised into a narrative or used to create a poster around the theme, for example HIV prevention, domestic violence, nutrition, etc. These outputs form the basis of analysis. The researcher can also interview the respondents on these outputs.
12. Final thoughts on data collection
Be creative in your approach. Qualitative methods are plastic and directed at finding key information. Remember the key rules of inquiry. Respect your participants and stay within ethical rules and guidelines. Be aware of the impact of different subjectivities on the data being gathered. Be aware that some new methodologies require additional levels of skill and/or new technologies and equipment.
Keeping journals of observation and of your own experience
The power of observation has already been noted. This can become part of all research, but it does still need to be systematised. Research journals kept by the researcher can facilitate this. These need to be used continuously. Everything that you see could be important, but you also need to make decisions about what is important for the project. Note not just what is seen or heard, but your own reflections on the study as well. Observation is often ignored as it is perceived as too subjective or too difficult to draw into analysis, but this often leads to vital information being missed. It involves similar skills and focus as in the participant observation methodology.
Remember that the researcher is the research instrument, documenting the world as it is seen. Subjectivity, and awareness of what is shown and not shown, are therefore key. The capacity to read, understand and where necessary interpret is important – and even more complicated when working with fieldworkers.
Three diaries need to be kept, especially by a researcher in training. These are less formal than the notes kept in the formal observation research and are mostly just recorded on the basis of date. Their impact on the analysis varies from being directly incorporated into the data set, to developing an understanding of context, to being a part of the analytic process. Each diary has a different focus and role.
The first diary records observations in the field. This literally means recording anything of interest that you see, including the physical context, people, events, notes from conversations or even the impact of the other senses. It covers not only what you see but, if you are working with fieldworkers or know others familiar with the context, their insights as well. It may be used as data, but is mostly used to enhance your memory and insight into the context, and can lead to the development of additional questions or research areas.
The second diary is the record of your ongoing analysis and the development of ideas: interpretations and hypotheses about the subject matter. It is less rigorous and not kept strictly by date, but it is useful to maintain a chronological record of how your ideas developed. It can include crossings out, side notes and personal reflections. Keep it even while not in the field; this diary should run from the start of the project until reports or papers are complete.
The third diary covers the researcher’s self-insights, emotions raised in the field and lessons learned. It is especially useful for researchers in the early period of training to keep a personal diary. This is used to record lessons learned; the impact of interviews or experiences – emotional, intellectual, spiritual, even physical; ideas for self-improvement; technical notes for future research or interviews; and questions or concerns that need to be reflected on in the future. It supports the personal development of the self as an instrument and encourages an active approach to reflection.
13. Preparing for analysis
Transcribing and preparing for formal analysis
Taped material should be transcribed and, where necessary, translated. The task of transcribing should not be underestimated: estimate six hours of transcribing for one hour of recording, nine hours if translation is required. This is where you start to value the extra money you spent on a good-quality recorder. Use the transcription process to reflect on your own interviewing technique and skill. Pages of transcription need to be prepared for analysis. They should be in a standard word-processed format, not in a table. Use 1½ line spacing and a wide margin, at least 7cm on the right.
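The rule-of-thumb ratios above make it easy to budget transcription time for a whole study before fieldwork begins. The sketch below is a minimal illustration of that arithmetic; the function name and figures used in the example are illustrative, with the 6:1 and 9:1 ratios taken from the text.

```python
# Rough planner for transcription workload, using the rule-of-thumb
# ratios above: 6 hours per recorded hour, 9 if translation is needed.

def transcription_hours(recorded_hours, needs_translation=False):
    """Estimate person-hours needed to transcribe (and translate) audio."""
    ratio = 9 if needs_translation else 6
    return recorded_hours * ratio

# Example: ten 1.5-hour interviews, half of which need translation
plain = transcription_hours(5 * 1.5)             # 45.0 hours
translated = transcription_hours(5 * 1.5, True)  # 67.5 hours
total = plain + translated                       # 112.5 hours
```

Even this crude estimate makes the point of the paragraph: transcription is usually the largest single time cost in a qualitative project and should be planned for explicitly.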
Decide on the level of detail for the transcription. You need to make sure at a minimum that all words are included. Decide whether to add: notes on body language; interruptions; time taken in pauses; speed of speech; and other expressed emotion, such as crying, laughing, anger. Make sure that it reads easily and correctly, especially if translated. If transcribed in the original language do not correct grammar.
Add a brief introduction to each transcript to give background to the interview. This can assist in contextualising the interview during analysis. Include: demographics of respondent; some background and reason for their selection if purposive; context of the interview, place and time; notes on other events in the interview, e.g. interruptions, noises around, interference; any specific notes on the interview, e.g. emotion, body language, anxiety; and notes on self, e.g. if very tired, was emotionally affected, felt angry, felt excited about the information.
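The introductory block described above can be standardised so that every transcript carries the same contextual fields. The following is a minimal sketch of such a template; the field names are my own labels for the items listed in the text, not a prescribed format.

```python
# Standard front-matter fields for each transcript, covering the
# contextual items listed above. Field names are illustrative.

TRANSCRIPT_HEADER_FIELDS = [
    "respondent_demographics",  # age, gender, occupation, etc.
    "selection_reason",         # background / why purposively selected
    "interview_context",        # place and time of the interview
    "interview_events",         # interruptions, noise, interference
    "interview_notes",          # emotion, body language, anxiety
    "interviewer_self_notes",   # tiredness, emotional reactions, etc.
]

def header_block(values):
    """Render the front matter as labelled lines for the transcript file."""
    return "\n".join(f"{field}: {values.get(field, '')}"
                     for field in TRANSCRIPT_HEADER_FIELDS)
```

Using a fixed field list like this makes it obvious during analysis when contextual information was never recorded, because the empty field still appears in the header.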
Validity checks can be done on the transcription. It is better if the person who did the interview also does the transcription, but someone else needs to read through it. This is where the summarising described in the interview process becomes important again: it offers a clear validity check on understanding. Even where summarising is done, the voices of both the interviewer and the respondent need to be transcribed.
Where you have to translate as well as transcribe, it is generally better to do both at the same time, allowing extra time for the combined task. Doing both together allows for a meaningful rather than a direct, word-for-word translation. Bear in mind that words, and even more so sentences, can have more than one meaning. If you are working with a translator, everything said by all three participants has to be translated and transcribed. This is important because translators working in the field do not always give accurate translations.
Transcription programs can be downloaded from the web at no cost and are usually adequate for the work. Specialist machines operated with a foot pedal exist but are not essential. Express Scribe is one program that I have used: the free version works extremely well, or a more sophisticated version with a foot pedal can be purchased. The media player in Microsoft Office can also be used.
Preparing for analysis is important
The emphasis here is on working slowly and carefully. If you are feeling frustrated and irritated, it is better to take a break. Remember that these are your data, so the work needs to be done with care. Checking means listening to everything again. Do not change sentences even if the grammar is bad; however, when translating, do not simply reproduce bad grammar in English (or whichever language you are using). This is part of analysis and of immersing yourself in the data. It is particularly useful for early-stage qualitative researchers to transcribe their own interviews, as it provides a direct point of reflection on their interviewing skills.
14. Analysis and interpretation of qualitative data
I am going to outline a general basic method that I have termed contextualised interpretive content analysis. The process of analysis is that of generating meaning from the texts you have; for qualitative analysis this needs to take the form of a narrative. A range of approaches to analysis has been developed. These provide the basis for analysis, but with experience they can be adapted to the demands of the situation. There are no absolute rules for analysis, but there are some key processes and checks. As described above, taped material should be transcribed and the pages of transcription prepared for analysis, making sure that the descriptive context is included. Data analysis is not a set of distinct steps: points of analysis overlap and you can constantly backtrack on earlier decisions, even to reset aims and objectives. But a constant, systematic process must be maintained from the aims through to the final conclusions.
Process of data analysis
Analysis starts at the point of the conceptualisation of the project, but the focus increases over time, especially as the data come in, with a gradual movement towards greater attention to analysing and producing conclusions. The focus sharpens once formal fieldwork ends, although in an open-ended project you can always go back into the field. Many important insights arise in these early phases and should not be lost, so keep your research diaries going.
Immersion in the data
Immersion means becoming completely familiar with your data. Especially if you do your own interviews and transcription, you should already know the content well. Read and reread the transcripts so that you can get beyond the linear account. Also read the notes you made in the field and during development, and keep the background of each interview in mind as you read. This allows the material to be seen as a cohesive unit. Keep making notes.
From your knowledge of the material you can develop your themes. It is better if these are induced, arising out of the material, but there will also be themes you want to impose based on your research question. So use a mixture, but tend towards induced themes. Remember these are the building blocks for extracting meaning. Use the language and context of the interviewees to name the themes. Imposed themes will draw on theory, but you can still represent them using interviewees' words. Go beyond obvious descriptive terms to look at processes in the data. Look for relationships between issues in the data; power dynamics that lie behind behaviour or choices; and concepts that have been inferred or induced from the material and that appear consistently, such as gender power relationships, controlling forces such as landowners or religion, drives such as desire for pleasure or power, and examples of incorrect knowledge.
Spend time on this exercise, as it will guide the rest of your analysis: you do not want to have to code all your interviews again. Test your system of themes against a few interviews and, if necessary, redo your themes. Keep your aims and objectives in focus; the themes need to reflect back to these. Remember you can still adjust aims and objectives to take account of new findings. Keep theme names short, ideally a single word or a few words. This makes the coding and analysis easier, and usually widens the range of meaning that can be assigned to a code, but themes must remain descriptive. Add definitions to each theme to ensure accurate coding. Remember that this is just an initial phase of the analytic process (you still have to interpret the data), but it sets the constructs on which you will do the analysis. It is useful to have a separate Word document in which you develop your themes; the development of these themes is part of the analytic writing.
You now need to organise your themes. Ideally you need to end up with about 40 to 60 themes. These should be divided into about five to eight groupings. Each grouping is headed by a major theme with a number of sub-themes included below this.
Example from nutrition: rather than listing every option, a theme list needs to link common ideas. In a cluster on healthy eating, for example, you might list the following themes:
- Healthy foods;
- Unhealthy foods;
- Balanced diet;
- Correct food preparation;
- Uncertainty about foods.
You would then set a definition for each; for instance, 'healthy foods' might be defined as fruit, vegetables, and certain meat and dairy in limited quantities.
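A codebook of this kind can also be kept in a small structured file alongside your analysis document. The sketch below is a hedged illustration in Python; the grouping, theme names and definitions are hypothetical examples in the spirit of the nutrition cluster above, not drawn from any particular study:

```python
# A minimal sketch of a theme codebook: a major theme heads a grouping,
# and each sub-theme is paired with a short working definition.
# All names and definitions here are illustrative only.
codebook = {
    "healthy eating": {
        "healthy foods": "fruit, vegetables, certain meat and dairy in limited quantities",
        "unhealthy foods": "foods respondents identify as harmful, e.g. fried or sugary items",
        "balanced diet": "talk about mixing food types across meals",
        "correct food preparation": "hygiene and cooking practices",
        "uncertainty about foods": "expressed doubt about what counts as healthy",
    },
}

def print_codebook(cb):
    """List each grouping with its sub-themes and working definitions."""
    for major, subs in cb.items():
        print(major.upper())
        for theme, definition in subs.items():
            print(f"  {theme}: {definition}")

print_codebook(codebook)
```

Keeping the definitions next to the theme names in this way makes it easy to share the codebook with a second coder and to revise definitions as coding proceeds.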
Eventually you need to establish a reasonably stable set of themes. Use these to code the data: literally reading the full text and attaching a theme to pieces of text. A piece of text can be a line, sentence, paragraph or a string of paragraphs. A piece can, and often does, carry multiple codes, as any piece of text can carry multiple meanings. Pieces can overlap, so you can code a paragraph with one theme and then code a single sentence within it with a different theme.
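The mechanics of attaching possibly overlapping codes to spans of text can be sketched as follows. This is a toy illustration only (it is not how packages such as Atlas.ti or NVivo store data internally), and the transcript, spans and code names are invented:

```python
# Each coded segment records a character span in the transcript and its codes.
# Spans may overlap, and a single span may carry several codes.
transcript = ("We try to eat vegetables every day, but the children "
              "always want sweets after school.")

coded_segments = [
    {"span": (0, 35), "codes": ["healthy foods"]},
    {"span": (36, 85), "codes": ["unhealthy foods", "family dynamics"]},
    {"span": (48, 85), "codes": ["children's preferences"]},  # overlaps the previous span
]

def quotes_for(code, segments, text):
    """Return all quoted passages coded with a given theme."""
    return [text[s:e] for seg in segments
            for (s, e) in [seg["span"]] if code in seg["codes"]]

for quote in quotes_for("unhealthy foods", coded_segments, transcript):
    print(repr(quote))
```

The point of the sketch is simply that codes attach to spans, not to whole documents, so overlapping and multiple coding fall out naturally.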
As you are coding, new themes or areas of interest may arise that you need to add to your list. Or new meanings may be attached to existing themes. This may necessitate you going back to recode earlier interviews. There are different systems of coding:
- Computer-based coding (provided in the analysis software discussed below);
- Writing in the margins (similar but less control and functionality, also can get messy);
- Colour coding (problem of having too few colours and overlapping themes);
- Cut and paste functions (again problem of overlapping themes and taking pieces out of context).
Working with a computer program is strongly recommended. Software makes coding simple and easy to change, overlaps are clear, and it allows for cutting and pasting (in this case the interview name is listed with each quote, so context is maintained).
Research memos are notes that you make to yourself. These are a component of the notes that you have kept throughout. In the analysis these become more intense. Memos can be notes on points of interest, possible alternative meanings of a section of text in the transcript, ideas for further analyses, or hypotheses that you may want to test at a later stage in the analysis. In the computer programs these can be added easily as general memos or attached to specific pieces of text.
Elaborating is the process of making connections across time within an interview and across interviews. As an interview progresses and the conversation deepens, key themes will be returned to several times. Connections need to be made between these points, as each return offers a new perspective. Connections across interviews also allow for different perspectives, but context must be taken into account. Elaboration occurs once you have immersed yourself in your data and are looking at quotes across interviews, out of the sequence in which they originally appeared. This is a point for reflection and, where necessary, for expanding your analysis. Here researchers need to establish a distance between themselves and the text and comment on it, while remaining aware of their own connections to the material.
Interpretation and establishing conclusions
Now all the blocks are in place to do the deeper interpretation. Hypotheses are inductions that you have developed as you went through your analyses. These need to be checked against the data and then either accepted, modified or rejected. If your theme development was done well and reflected your original aims then these themes will become the focus for interpretation, but not exclusively. So the section titles of your reports will reflect these main themes, or whatever themes you have selected for this output.
Interpretation has to happen at multiple levels. To illustrate, each of your section headings will present a discussion of how that area of the content was addressed, and interpretations have to be made on that content. Try to take the reader through a narrative account of the material: usually start with how the main theme is understood, and then look at variations and developments around it, with quotes used to illustrate key points. In the final discussion and conclusions these points have to be drawn together into a more coherent single analysis. This process comes down to skill, experience and knowledge of the content area. Again, core to the analysis is establishing a distance between the researcher and the text, so as to be able to reflect on the material. The researcher does need to acknowledge limitations and their possible impact on the research and analytic process. Then, depending on the paper, recommendations can be added. These can be for policy, activism, further research or adaptations to practice. They need to draw down from the aims through the research and analytic process, and be clearly connected to the final conclusions drawn.
Although it is particularly emphasised within social constructivism, it is generally agreed that context is important in understanding what people say. Context works on multiple levels: what else the person says at that point in the interview; the general subject matter of the interview; the specific setting of the interview, for instance a patient will respond differently if interviewed in a clinic or at home; and the respondent's own background, including gender, age, race, educational background, etc. All of these could lead to a different interpretation of what a person says in an interview, as could their emotional state: were they drunk or on drugs, or had they recently had a traumatic experience? The interviewer also has an impact, through characteristics such as race and gender, level of skill, state of mind, etc. Finally, how did the subject matter influence how respondents constructed their ideas? For example, people may talk about sex differently if it is framed as part of love, pleasure, pornography or HIV.
Verifying interpretation and testing validity
Qualitative methods have produced new challenges for demonstrating validity, requiring tools different from those of quantitative approaches. Some tools, such as inter-rater reliability, are overvalued and outdated for this purpose, as is the use of numbers to verify findings. Qualitative methods do not fit with these validity checks because of their open and adaptable procedures and the freedom of thinking and interpretation they encourage, so new approaches are needed. One is to look to the data for verification: researchers can set up hypotheses or theories about the subject and then search the data for contradictions. When contradictions occur, use that information to develop the theory or hypothesis further, and continue looking until these stabilise.
Having an audit trail is important so that the process of getting to a conclusion is described. Triangulation of the data against other methods represents a more sophisticated approach of looking at the results from different methodologies. Validity can be enhanced by drawing in additional people, either by getting other researchers to look at the same material and see if they come up with similar results, or by getting the community from which the respondents were drawn to comment on the material.
These two approaches do, however, have to take into account the role of subjectivity. Looking at whether the theory developed, or the deeper interpretations made, are useful in explaining the data in the study, and whether they have value in explaining the broader world, is more a check of external validity. So is seeing how others in the field respond to the results, at both a scientific and a lay level.
A further question is how the results translate into social action, if that is part of the focus of the research, or into implementation if the research is done in advance of introducing an intervention. The quality of the description of the context and research process is key to assessing the potential for applying the knowledge gained to other settings.
Involvement of self
It is important to recognise your own impact on the research. Researchers influence the results throughout the process, from the setting of the research question to the final conclusions, no matter what the methodology; this impact is increased in qualitative research. The inclusion of self into the text can be done in multiple ways, from detailed statements to a brief mention in either the introduction or the analysis. It is generally important to state upfront the position from which you approach the research and then, in the conclusion, to state again what influences you may have had on the findings.
Using computer programs for analysis
Computer programs are a great tool and advance for qualitative analysis; the major packages are Atlas.ti and NVivo. They provide much greater control over your text, themes and memos for analysis. Memos can be attached to individual pieces of text or kept as general notes to self. The software allows you to change and develop themes and coding as you go along, and facilitates cutting and pasting while retaining links to the original documents. It is very useful for validity checks, as it can produce all quotes on selected themes. However, it cannot do your analysis for you: you still have to read the material, define your themes, do the coding and draw out the analysis. You cannot get the software to do your work, or hide poor work behind it. These packages are expensive and, like all software, are constantly being developed and improved; be aware of bugs and problems with them.
To be avoided
Numbers are useful only when describing the context of the interviewees, such as in more fully describing your sample: for example, of six undergraduate students doing research projects, three were male and two were in first year. Remember that you are not doing bad quantitative research. Because coding is itself open to interpretation, inter-rater reliability is of little value even if you have more than one person coding the data. Even statements like 'most felt this' and 'a smaller number felt that' are inapplicable.
The other crutch is to find a methodology of analysis and apply it slavishly. This will give you a narrow perspective; the analytic process is creative and flexible, and must be treated as such. Nor can the analysis consist only of summarising, which is lazy. Even when the material is organised into themes, the role of the researcher is to provide insight. Stating the obvious, such as 'many people like to eat unhealthy food', is also not sufficient for analysis.
At the same time, make sure that any interpretation you make can be grounded in the data. You can make further postulations, but then state that these are assumptions and will need further research. Also avoid individualising a text by getting too involved with a particular individual, anthologising or psychologising the person too deeply; remember that we are all products of our context. Avoid, too, getting personally involved with the text, such as by attacking or idolising the person. You can critique a position taken, for example from an interview with a person who advocates children eating lots of sweets, but avoid personal conflicts.
Production of papers and reports
You need to decide on the focus of the document you are writing. One project may thus produce several outputs: even a small project of ten to 15 interviews could produce two or three papers, while a large project of 100 or so interviews could produce ten or 12 papers. These will all draw on the same coded set of data, so each paper may need a modified set of aims and objectives. Most effective writing is done for publication, either in academic journals or for community access. Here we will focus on the core paper arising from a small project. The stated aims of the research become the focus of the paper, and this is where the ordered flow of your analysis really facilitates the writing and final paper production. Each of your major themes can act as a section heading in the analysis, with the sub-themes under each becoming the major discussion points. The conclusions draw these together to make a final statement.
Balance of discussion and quotes
Finding the right balance between discussion and quotes is sometimes hard. Especially when the material is interesting, there is a tendency to add more quotes and for the paper to be dominated by them. There is also the false belief that using quotes alone removes bias. The researcher has the responsibility to lead the analysis and present the findings; the quotes are there to illustrate the conclusions and to contribute to the narrative. As a rule of thumb, there should be at least twice as much discussion text as quotes.
Quotes are used in two ways: either as separate paragraphs with a narrower column width and possibly different line spacing, generally used for longer quotes; or as short quotes in inverted commas within the text. You can use both, but it is generally better for one to dominate, otherwise the paper can look disorganised.
As discussed in Chapter 7, in order to ensure a degree of independence, implementation research will typically not be funded from the intervention budget and the level of funding will almost always be relatively limited. Implementation researchers are therefore not in a position to insist that the data they need to meet their specific objectives should be made available within the general monitoring and evaluation (M&E) system. However, where possible, they should aim to be involved in the design of that system and may be able to negotiate modifications that serve their purposes. This may be possible either because those modifications are also seen as valuable to those with overall responsibility for the implementation – for example supporting operations research activities – or because additional resources from the implementation research budget are made available to complement those allocated to M&E within the intervention.
The above implies that implementation researchers will rarely be able to embark on independently managed and funded large-scale primary data collection activities but will have to rely mainly on the intervention M&E system, special studies using the ‘qualitative methods’ described in Chapter 8 and secondary sources. The key responsibility in this case is to adopt a systematic approach to determine the quality of the data to be analysed. This will be an ongoing challenge, given the tendency for data quality to vary over time, possibly improving initially as innovative systems for data collection are introduced and the enthusiasm of those involved is stimulated by access to new equipment and training workshops, but often deteriorating as that enthusiasm declines and systems fail to work as intended.
An obvious starting point when assessing data quality is the existence and completeness of those data. Missing information on facilities, providers, patients, etc. not only limits the scope of the analysis that can be undertaken, it typically biases the findings of that analysis. As a rule it will be the less well-resourced, less well-managed, most remote locations that are most at risk of providing incomplete data. Failure to recognise this trend can lead to a seriously over-optimistic view of intervention progress. Given that data are available, the statistical agency of the European Commission (Eurostat 2007) defines data quality as having five additional desirable attributes:
- Accuracy;
- Timeliness;
- Comparability;
- Coherence;
- Accessibility and clarity.
Accuracy – essentially whether data reflect the true value of a given quantity – is obviously very difficult to test. However, we can check for obvious outliers – values that are almost certainly too large or too small. This can be a very important check, as many statistical procedures are highly sensitive to outliers, which can seriously distort the findings of any analysis. If we have time series data, excessive changes between one period and the next may also indicate measurement or recording problems. In some cases, issues may become apparent if we calculate rate or ratio indicators. For example, is the number of patients seen per day per doctor plausible? Are the recorded financial data compatible with patient numbers? In addition, by examining the frequency distributions of selected data items, as discussed below, it may be possible to determine if initial assumptions about those items have proved valid. For example, attempts to assess patient satisfaction using a scale often result in a distribution in which almost no patients chose the lowest points on that scale. This should raise questions as to whether the scale we are using is an accurate reflection of patient attitudes.
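Simple plausibility checks of this kind are easy to automate. The sketch below is a hedged illustration: the facility records, field names and the threshold of 80 patients per doctor per day are all invented for demonstration, and any real screen would need thresholds justified for the setting:

```python
# Illustrative data-quality screen: flag facilities whose
# patients-seen-per-doctor-per-day rate is implausibly high,
# or where the rate cannot be computed at all.
records = [
    {"facility": "A", "patients_per_day": 120, "doctors": 4},
    {"facility": "B", "patients_per_day": 95,  "doctors": 3},
    {"facility": "C", "patients_per_day": 900, "doctors": 2},  # suspicious outlier
    {"facility": "D", "patients_per_day": 60,  "doctors": 0},  # missing/zero staffing
]

def flag_implausible(recs, max_per_doctor=80):
    """Return (facility, reason) pairs for records failing basic checks."""
    flags = []
    for r in recs:
        if r["doctors"] <= 0:
            flags.append((r["facility"], "no doctors recorded"))
            continue
        rate = r["patients_per_day"] / r["doctors"]
        if rate > max_per_doctor:
            flags.append((r["facility"], f"{rate:.0f} patients/doctor/day"))
    return flags

for facility, reason in flag_implausible(records):
    print(facility, "->", reason)
```

Flagged records are candidates for follow-up, not automatic exclusion: an extreme value may reflect a recording error, but it may also be genuine.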
Timeliness reflects the delay between the occurrence of an event or phenomenon and the availability of the associated data items. It is relevant in terms of both routine data systems and the intervention M&E system, which should be providing time series data that allow us to track implementation progress and link intervention inputs to outputs and outcomes. For example, a training workshop at a given point in time may be intended to result in improved staff performance, which is expected to produce better health outcomes by some future date. Assessment of the extent to which this sequence of events has taken place may be complicated by excessive delays in the availability of data from some sources, given that, as with missing data, such delays will often be associated with facilities or agencies that are performing less well.
Comparability relates to differences in concepts and measurement tools/procedures between sources – e.g. facilities, geographical locations, etc. – or over time. This can be a particularly serious problem in research on health systems, where different providers often choose to specify their own diagnostic and treatment protocols. Some unqualified providers may record all patients with a fever as suffering from malaria, while others rely on a range of rapid diagnostic tests. Some hospitals may record a diagnosis of tuberculosis based only on a chest X-ray while others require identification of Mycobacterium tuberculosis from clinical specimens. Additional issues arise if definitions are modified over time, perhaps as a result of the intervention itself. For example, faced with restrictions on outpatient costs by health insurance schemes, providers may simply vary their accounting procedures such that costs are transferred to inpatient departments. Such possibilities need to be carefully considered to avoid misinterpretation of apparent trends over time.
Where analysis involves the combination of data items from different sources it will be necessary to assess the extent to which there is coherence between those items. For example, to determine the extent to which some conditions remain untreated in the population it may be necessary to combine aggregates calculated from facility routine data systems with estimates of the prevalence of those conditions based on data from existing surveys. Those responsible for compiling these two sources will typically have used very different concepts, methodologies and instruments because they had distinct objectives. To the extent possible, any analysis must evaluate and try to address the implications of these differences.
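A stylised worked example of such a combination, with all figures invented: the untreated burden of a condition can be approximated by applying a survey-based prevalence estimate to the catchment population and subtracting the cases treated according to facility records. The coherence problem is that the two sources may define the condition differently, so the result should be treated as indicative only:

```python
# Combine a survey-based prevalence estimate with facility routine data
# to approximate the number of untreated cases.
# All numbers are invented for illustration.
catchment_population = 50_000
survey_prevalence = 0.04          # 4% of the population, from a household survey
treated_cases = 1_200             # cases recorded by facilities over the same period

estimated_cases = catchment_population * survey_prevalence
untreated = max(0, estimated_cases - treated_cases)

print(f"Estimated cases: {estimated_cases:.0f}, untreated: {untreated:.0f}")
```

Even this simple calculation inherits the definitional and measurement differences between the two sources, which is precisely why coherence needs to be assessed before results are reported.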
Accessibility and clarity are perhaps most usefully understood as denoting the extent to which a researcher has the required level of understanding about the true nature of the data they intend to analyse. At a minimum this implies a careful review of the necessary ‘metadata’ – documentation that describes how the data are intended to be collected, compiled and stored – assuming these are available. However, it will often be clear that this documentation has limited relevance in terms of how those responsible for these activities proceed in practice. For example, health staff tasked with introducing new procedures for collecting and recording patient data will typically soon find ways to reduce the time and effort required for an activity that they will often regard as a pointless addition to their workload. They may make assumptions as to the personal characteristics of the patient rather than ask the appropriate questions, or decide to enter data into the computer at the end of the day rather than ‘waste’ time after each patient visit as intended. The potential for misinterpretation if such issues are not addressed is evident. It may be frustrating to discover that an intended analysis cannot be undertaken as planned because the required data are not as originally assumed, but, as emphasised in Chapter 7, implementation research demands the highest standards of integrity and that includes ensuring that data limitations are thoroughly examined and addressed.
2. Rapid surveys
One additional methodological tool that can prove very useful in implementation research is the rapid survey. This description is usually applied to relatively small-scale surveys, typically of around 200 subjects or less, which aim to collect a very limited number of items of quantitative data over a short time period, say 5–10 days, that can be analysed and interpreted within at most a few weeks. Rapid surveys can target a variety of populations including facility records, health staff, patients, households and individuals. They adopt the probability sampling approach described in Chapter 7, and can therefore be analysed using statistical inference procedures that provide unbiased estimates of population ‘parameters’ (quantitative information) and reliable estimates of error bounds on those parameters. They should not be used to address more complex questions, for example the detailed operation of new incentive schemes or implications of new mechanisms for reimbursement of user fees. It is often assumed that surveys can be used to 'find out about' a policy question. In fact, the successful planning and implementation of a rapid survey typically requires that a great deal is already known, both about the population to be surveyed, and about the subject matter under investigation.
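As an illustration of the kind of inference such a survey supports, suppose a simple random sample of 200 patients from a population of 2,000 finds 58 reporting a given problem. The standard normal-approximation estimate and 95% confidence interval, with the finite population correction, can be computed as below (all figures invented):

```python
import math

# Proportion estimate with a 95% confidence interval for a simple random
# sample, including the finite population correction (fpc), which matters
# when the sample is a noticeable fraction of the population.
n, successes = 200, 58          # sample size and positive responses
N = 2_000                       # size of the sampled population

p = successes / n                               # point estimate
fpc = (N - n) / (N - 1)                         # finite population correction
se = math.sqrt(fpc * p * (1 - p) / n)           # standard error of the proportion
margin = 1.96 * se                              # 95% normal-approximation bound

print(f"p = {p:.3f}, 95% CI = ({p - margin:.3f}, {p + margin:.3f})")
```

The normal approximation is the textbook approach for samples of this size; for very small samples or proportions near 0 or 1, an exact or adjusted interval would be more appropriate.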
Rapid surveys are usually cross-sectional but may also be used to track changes over time. The small sample size and limited number of data items greatly reduce the administrative and logistic burdens associated with large-scale, multi-topic sample surveys, particularly those associated with the recruitment and training of field staff. This does not imply that such surveys should be undertaken without due consideration of the implications in terms of resource allocation. Though they are sometimes described as ‘lightweight’, it is essential that they are designed and implemented with all the rigour that should be applied to any study that intends to claim the respect that is reserved by many policymakers for findings derived from traditional statistical surveys. The range of tasks to be undertaken is identical to that required for a large-scale survey, even if the content of each is much more limited:
- Questionnaire design;
- Sample design;
- Mapping/listing to create the sampling frame;
- Preparation of fieldwork manuals;
- Recruitment of field staff;
- Training of field staff;
- Field enumeration and supervision;
- Transport and communications;
- Data preparation and processing;
- Computer analysis.
A number of the above require human resource management skills that some researchers either lack or are reluctant to practise. Again as with large-scale surveys, a key point to bear in mind is that not all of those involved in a survey can be assumed to have a direct personal stake in achieving a successful outcome. Without effective management and supervision, supported by a system of incentives and penalties, many will not perform to the standard required. Apart from such practical issues, it is also necessary to give some thought to the legal and administrative context within which surveys are undertaken. Are there laws that prohibit data being taken from a patient record or doctors providing information on the health status of an individual? If these activities are legal, is it necessary to obtain permission from a relevant administrative agency before undertaking them? Even if we have such permission, perhaps only because that agency wishes to encourage the intervention with which we are associated, we should still consider if we are abiding by the ethical criteria described in Chapter 6. As this is intended as a practical guide, we would also advise consideration of any potential political implications of our survey. Are we addressing a sensitive issue? Might some stakeholders be concerned that we are gathering information that might be used to their disadvantage? What are the possible implications in terms of our overall research activities?
Another set of potential constraints that may impact on the quality of the survey are those relating to the targeted respondents. Can we assume, given that we have the appropriate permissions, that they will be cooperative? If you have undertaken the detailed stakeholder analysis discussed in Chapter 5, that should provide information that will help you make such a judgement. Does what you know of your intended respondents indicate that they may have reasons – guilt, embarrassment, suspicions about your motives – to be concerned about providing you with data? Might they be irritated by what they see as an interruption to their normal activities? For example, in many countries busy frontline health workers often tend to regard all record-keeping as a largely pointless chore that takes time away from patient care. On the other hand, might their desire to be helpful – perhaps because they regard you as a high-status individual, or simply out of a natural tendency to be polite – lead them to provide data that they know to be unreliable and/or incomplete, rather than risk disappointing you?
This raises another issue. Even if respondents are cooperative, do they have access to the data you require? Do those data relate to current knowledge that they almost certainly possess or memories that may have become less reliable over time? Will they need to consult records? If so, do such records exist and are they complete and reliable? Finally, it is important to remember that one major disadvantage of questionnaire surveys is that it is very difficult in practice to ensure that every question will be interpreted in precisely the same way by all respondents. As a first priority, we should, if possible, try to ensure that the enumerator and respondent share fluency in a common language and that the questionnaire has been translated into that language (standard practice is to translate the questionnaire and then translate back for comparison with the original). Obviously, every effort should be made to avoid ambiguity and complexity in language. One useful approach is to deliberately try to identify any remotely possible way in which questions might be open to misinterpretation. Health-related surveys raise particular issues, in that researchers sometimes casually use technical terms that seem commonplace to them but that may be interpreted quite differently by some sections of the surveyed population. For example, the term ‘inpatient care’ is usually taken to imply at least one night spent in hospital, but could be seen as applying to any individual who has received treatment in a hospital inpatient department, for example reclining on a bed to receive a saline drip. Similarly, to a researcher the word ‘doctor’ may signify a qualified, licensed professional. In a remote village the same word may be used for an unqualified traditional healer.
The sampling designs used in rapid surveys, as in the great majority of large-scale surveys, are based on the combination of a relatively limited number of elements:
- Simple random sampling;
- Systematic (list) sampling;
- Stratified sampling;
- Sampling with probability proportional to size;
- Cluster sampling.
The differences between these procedures can best be understood by considering a simple example. Suppose we wish to estimate the proportion of hospital clinical staff who have understood the basic principles of an innovative procedure following a one-week training course. If we have a list of all the staff in the hospitals, we could take a random sample and calculate the proportion of our sample who can answer a few simple questions about the procedure. That could be used as an unbiased estimate of the proportion of all staff that would have been able to answer those questions, which we could interpret as the proportion with a good knowledge of the procedure. There are two slightly different types of random sample. In simple random sampling (SRS), we would select members of staff sequentially from the full list, which allows for the possibility that we may select the same member more than once. In simple random sampling without replacement (SRSWOR), we again sample sequentially, but this time excluding any member previously selected.
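The distinction between the two procedures can be sketched in a few lines of Python; the staff list of 500 members is hypothetical:

```python
import random

random.seed(1)                  # fixed seed so the illustration is reproducible
staff = list(range(1, 501))     # hypothetical list of 500 staff members

# SRS: each draw is made from the full list, so the same member
# may be selected more than once.
srs = [random.choice(staff) for _ in range(20)]

# SRSWOR: members already selected are excluded from later draws,
# so all 20 selections are distinct.
srswor = random.sample(staff, 20)
```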
SRS is used as a standard sample design to which others are compared. It allows calculation of simple estimates of the required sample size for a given level of precision (the size of the lower and upper error bounds for the estimate). Thus:
A sample of size 100, selected using SRS allows estimation of a proportion to a precision of +/-10 per cent with 95 per cent confidence;
A sample of size 400, selected using SRS allows estimation of a proportion to a precision of +/-5 per cent with 95 per cent confidence (note that improving precision by a factor of two requires increasing the sample size by a factor of four).
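The two rules of thumb above follow from the standard formula for the margin of error of an estimated proportion under SRS. A minimal sketch (the default p = 0.5 gives the conservative, worst-case bound used in such calculations):

```python
import math

def srs_margin_of_error(n, p=0.5, z=1.96):
    """95 per cent margin of error for a proportion estimated from an SRS of size n.

    p = 0.5 maximises p * (1 - p) and so gives the worst-case (widest) bound.
    """
    return z * math.sqrt(p * (1 - p) / n)

print(round(srs_margin_of_error(100), 3))   # 0.098, i.e. roughly +/-10 per cent
print(round(srs_margin_of_error(400), 3))   # 0.049, i.e. roughly +/-5 per cent
```

Note that quadrupling the sample size from 100 to 400 halves the margin of error, as stated above.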
We can interpret ‘95 per cent confidence’ as implying that only 5 per cent (1 in 20) of such samples would be so misleading as to result in an estimated proportion further from the true population parameter than the stated precision. We simply assume that we have not been so unlucky as to have chosen one of those samples.
In theory, SRS can be used as the reference to calculate the efficiency of any given sample design:
Efficiency = precision of SRS/precision using alternative design and same sample size
However, typically we do not have sufficient information to estimate efficiency but may simply be aware that one design is almost certainly more efficient than another. This is important because even though we may not be able to calculate the cost of achieving a given level of precision, a more efficient design can deliver increased precision for the same cost – i.e. it will probably result in a better estimate.
As indicated in Chapter 7, taking a random sample requires repeated use of a set of random number tables, a computer program or more recently a mobile phone app. Systematic sampling is a simpler approach that involves selecting a starting point on the list at random and then sampling every kth entry, where k is equal to the total number of entries divided by the required sample size (k=N/n). When the end of the list is reached, the process continues from the first entry. Except in rare cases where the list happens to follow a pattern that increases the risk of a biased sample, it can be shown that this procedure produces unbiased estimates with sampling errors that are at least as small as those from a random sample. It can in theory be more efficient than SRS if the list is ordered by a variable that is related to staff performance, for example if the names are listed by hospital, because there is less risk of selecting an unrepresentative sample – that is, one that does not include staff from all hospitals. However, as indicated above, it will typically be impossible to estimate the extent of the gain in efficiency.
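The procedure described above, including the wrap-around when the end of the list is reached, can be sketched as follows (the list of 200 staff is hypothetical):

```python
import random

def systematic_sample(entries, n):
    """Select every k-th entry (k = N / n) from a random starting point,
    continuing from the top of the list when the end is reached."""
    N = len(entries)
    k = N // n                          # sampling interval
    start = random.randrange(N)         # random starting point on the list
    return [entries[(start + i * k) % N] for i in range(n)]

sample = systematic_sample(list(range(200)), 20)    # here k = 10
```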
If we suspect that knowledge of the procedure among staff may differ substantially between hospitals, we can increase the precision of estimation – i.e. narrow the gap between the lower and upper bounds on our estimate – by ensuring that our sample must include staff from every hospital. For example, in each hospital we might take a constant proportion (k=n/N) of staff members. This would be a stratified sample, with each hospital being a separate stratum. This sampling design results in a reduced sampling error compared to random sampling because we have excluded the risk of selecting samples that were mainly from under- or over-performing hospitals, samples which would have under- or over-estimated the proportion of knowledgeable staff in the total population. The unbiased estimator of this parameter is calculated as a weighted sum of the proportions in each hospital (pi), where the weights are equal to the number of staff in each hospital (Ni) divided by the total number of staff (N), i.e. P = Σ Ni/N x pi.
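The weighted estimator P = Σ Ni/N x pi translates directly into code; the three hospitals and their figures below are purely hypothetical:

```python
def stratified_proportion(strata):
    """Weighted estimate P = sum(Ni / N * pi) over strata.

    strata: list of (Ni, pi) pairs, where Ni is the number of staff in
    hospital i and pi the sample proportion with good knowledge there.
    """
    N = sum(Ni for Ni, _ in strata)
    return sum(Ni / N * pi for Ni, pi in strata)

# Hypothetical: hospitals of 300, 120 and 80 staff.
estimate = stratified_proportion([(300, 0.7), (120, 0.5), (80, 0.4)])
print(round(estimate, 3))   # 0.604
```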
Stratification by a range of other variables, for example gender, age or grade of staff, etc., might similarly be used to reduce the sampling error, if it were suspected that they were also associated with differences in staff knowledge. A basic principle is that the more information we have about our target population, the easier it will be to develop an efficient sampling design. It is important to understand that a stratified sampling design is intended to provide better estimates of overall population parameters. In the above example, we would be calculating statistics for individual hospitals and it would be tempting to compare, for example, average performance levels between those hospitals. However, that was not the purpose of the sample design and we will usually find that we simply do not have sufficient observations in each hospital to make such comparisons reliably.
For the estimation of a range of key population parameters, including totals, averages and rates, large entities – villages or urban districts with large populations, hospitals with a large number of inpatient beds, diseases with a high prevalence rate – are obviously very important in terms of their contribution to those parameters. In the above example, failing to obtain data from a small district hospital would have little impact on our overall estimate of the proportion of staff with adequate knowledge. However, failing to include staff from the largest national hospital could easily make a substantial difference to our estimate. One way to address this issue would be to stratify staff by size of hospital. As an alternative, the probability P of including a staff member of a given hospital in our sample could be made proportional to the total number of staff in the hospital, for example for all staff members in hospital i, the probability of being selected is:
P(i) = number of staff in hospital(i) / number of staff in all hospitals
This approach, called sampling with ‘probability proportional to size’ (PPS), increases efficiency by increasing the probability of inclusion in the sample for staff from large hospitals, thus decreasing the risk of taking a sample that excludes staff from these hospitals.
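The selection probabilities can be computed directly from the staff counts; the three hospitals below are hypothetical:

```python
def pps_probabilities(staff_counts):
    """P(i) = staff in hospital i / staff in all hospitals."""
    total = sum(staff_counts.values())
    return {hospital: n / total for hospital, n in staff_counts.items()}

probs = pps_probabilities({'national': 500, 'district': 80, 'rural': 20})
# Staff in the large national hospital are 25 times more likely to be
# selected than staff in the small rural facility.
```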
The PPS design is most commonly used in cluster sampling (Bennett et al. 1991), also called two-stage sampling. In our example the hospitals can be regarded as ‘clusters’ of staff. If we decided that it would be too expensive, for example in terms of travel and accommodation costs, to send enumerators to every hospital within our study area, we might decide to (1) select a sample of hospitals and then (2) select a sample of staff within each of those hospitals. A common sampling design would be to use PPS to sample hospitals and systematic sampling to sample staff within each hospital. Cluster sampling almost always involves a loss of efficiency for a given overall sample size. As not all hospitals will be surveyed, if there are differences between them this design introduces a risk of selecting a sample of hospitals that is not representative. This risk increases as the number of hospitals in the sample decreases. Cluster samples typically need to be two to ten times as large as an SRS to achieve the same precision.
Estimation of sampling errors
In each of the above designs, the sample selected is simply one of the many that might have been selected using the same design and with the same sample size. The sampling error of an estimate is essentially a measure of the variability between all possible values of that estimate that might have been obtained from different samples. One way to reduce that variability is to increase the sample size but that will imply a higher cost. The other is to improve the sample design, adopting sampling procedures that attempt to maximise, for a given sample size, the proportion of possible samples that will provide estimates that are close to the population parameter. We can never ensure that the sample that we do obtain meets this requirement, but using probability sampling we can make a reasonable estimate of the sampling error, which determines the risk of a ‘bad’ sample, and use this to modify the sample design or increase the sample size to ensure that it is less than some designated level – typically 1 in 20 or 1 in 100.
For a simple random sample (SRS) the sampling error can be estimated using:
se_srs = √[ Σ(xᵢ − x̄)² / (n(n − 1)) ]
where xᵢ (i=1..n) are the sample values, x̄ is the arithmetic average or mean of those values and n is the sample size. If we are willing to take a risk of 1 in 20, a remarkable mathematical result called the Central Limit Theorem, which can be derived from the basic definitions of probability theory, allows us to construct a 95 per cent confidence interval for the mean of the sampling population:
Population mean = x̄ ± 1.96 se_srs
Or a 99 per cent confidence interval:
Population mean = x̄ ± 2.58 se_srs
Note that the more confident we wish to be, the wider must our interval be. A similar formula can also be applied to confidence limits for proportions, as in the example discussed above.
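As a sketch, the sampling error and the two confidence intervals can be computed from a set of sample values (the eight values below are purely illustrative):

```python
import math

def srs_confidence_interval(values, z=1.96):
    """Mean +/- z * se, where se = sqrt(sum((x - mean)^2) / (n * (n - 1)))."""
    n = len(values)
    mean = sum(values) / n
    se = math.sqrt(sum((x - mean) ** 2 for x in values) / (n * (n - 1)))
    return mean - z * se, mean + z * se

low, high = srs_confidence_interval([4, 5, 6, 5, 4, 6, 5, 5])            # 95 per cent
low99, high99 = srs_confidence_interval([4, 5, 6, 5, 4, 6, 5, 5], z=2.58)  # 99 per cent
# As noted above, the 99 per cent interval is wider than the 95 per cent one.
```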
Example: A study was designed to evaluate the effect of integrating ITN (insecticide treated bednet) distribution on measles vaccination campaign coverage in Madagascar (Goodson et al. 2012). A national cross-sectional survey was undertaken to estimate measles vaccination coverage, nationally, and in districts with and without ITN integration. To evaluate the effect of ITN integration, propensity score matching was used to create comparable samples in ITN and non-ITN districts. Relative risks (RR) and 95 per cent confidence intervals (CI) were estimated via log-binomial models. Equity ratios, defined as the coverage ratio between the lowest and highest household wealth quintile (Q), were used to assess equity in measles vaccination coverage.
National measles vaccination coverage during the campaign was 66.9 per cent (95 per cent CI 63.0–70.7). Among the propensity score subset, vaccination campaign coverage was higher in ITN districts (70.8 per cent) than non-ITN districts (59.1 per cent) (RR = 1.3, 95 per cent CI 1.1–1.6). Among children in the poorest wealth quintile, vaccination coverage was higher in ITN than in non-ITN districts (Q1; RR = 2.4, 95 per cent CI 1.2–4.8) and equity for measles vaccination was greater in ITN districts (equity ratio = 1.0, 95 per cent CI 0.8–1.3) than in non-ITN districts (equity ratio = 0.4, 95 per cent CI 0.2–0.8).
It should be emphasised that the above formula is only appropriate where the sample is selected using simple random sampling. There is a tendency for researchers to ignore this requirement and use the formula whatever the sample design adopted. As indicated in earlier chapters, the argument in this volume is that implementation research findings are too important for such disregard of established analytical procedures to be considered acceptable. To illustrate the problem, consider that rapid surveys will almost always adopt some form of cluster sampling. This implies that the above formula has to be modified to include a ‘design effect’, which measures the ratio of the sampling error of the cluster sampling design to that which would have resulted if an SRS design had been used.
Population mean = x̄ ± 1.96 se_cluster = x̄ ± 1.96 x design effect x se_srs
The design effect will vary depending on the extent to which the clusters differ from each other. If this is large compared to the variability between the individuals within each cluster, the risk of sampling clusters that are unrepresentative is large and the design effect is large. In the above example, if the staff in some hospitals had all been well trained in the new procedure while in others training had been minimal, taking a cluster sample of a small number of hospitals would run a substantial risk of over- or under-estimating the overall level of staff proficiency.
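The adjustment itself is a single multiplication; a sketch using illustrative numbers (a mean of 0.5, an SRS standard error of 0.01 and a design effect of 2):

```python
def cluster_interval(mean, se_srs, design_effect, z=1.96):
    """95 per cent interval widened by the design effect:
    se_cluster = design effect * se_srs."""
    se_cluster = design_effect * se_srs
    return mean - z * se_cluster, mean + z * se_cluster

low, high = cluster_interval(0.5, 0.01, 2.0)
# Doubling the design effect doubles the width of the confidence interval.
```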
The implications of a large design effect for the appropriate confidence bounds can be substantial. Table 1 below compares standard errors using the SRS formula with the appropriate standard errors for a clustered design where the clusters were villages. Note that the design effect varies considerably, from 1.13 for primary completion rates (limited between-village variation because pupils travel to school) to 4.07 for improved drinking water (a high proportion of variation between villages because this relates to a village-level facility).
Table 1: Comparison of SRS and cluster sampling errors
|Item||Mean||SRS s.e.||Design effect||Cluster s.e.||Lower 95% bound||Upper 95% bound|
|Availability of ITN*||0.05||0.01||1.54||0.01||0.03||0.07|
|Iodised salt consumption||0.82||0.01||1.77||0.02||0.78||0.87|
|Improved drinking water||0.75||0.01||4.07||0.06||0.64||0.87|
|Primary completion rate||0.86||0.03||1.13||0.03||0.80||0.93|
|Attends secondary school||0.81||0.01||1.48||0.02||0.77||0.85|
*Insecticide treated bednets
One reason for the inappropriate use of the SRS formula by researchers was the difficulty of calculating the correct sampling error, which often requires a considerable familiarity with the methods of theoretical statistics. However, such calculations can now be undertaken using well-established software packages such as STATA and SPSS, which require only that the researcher provide a detailed description of the sample design adopted.
The WHO Expanded Programme on Immunisation (EPI) surveys
The origin of the ‘rapid survey’ concept is often dated to the WHO ‘30 by 7’ cluster surveys that were introduced in 1978 to obtain rapid, inexpensive but reasonably reliable estimates of child immunisation coverage (Lemeshow and Robinson 1985). The target population is subdivided into a complete set of non-overlapping ‘clusters’, usually defined by geographic boundaries (typically villages or urban districts). A sample of 30 of these clusters is taken with probability proportional to size (PPS) and then a ‘quasi-random’ sample of seven households with children in the relevant age range is selected within each of these clusters. Following this procedure, coverage estimates can be obtained that can be confidently assumed to be within ±10 per cent of the true value. The basic immunisation coverage survey instrument (the seven children sampled per cluster are typically recorded on a one-page document) usually records simply the cluster location, the age and sex of the selected child and their immunisation status. A similar methodology has been applied in rapid nutrition surveys, which have often been applied in emergency situations (Prudhon and Spiegel 2007). In this case it is usually recommended that the second-stage sample size should be increased to 30 children (SMART 2005).
The approach has attracted some criticism. Turner, Magnani and Shuaib (1996) focus on the lack of formal probability sampling of households within clusters. For example, one popular technique involves selecting a random direction from a central location within a village or urban district (traditionally by spinning a bottle). The households from the central point to the edge of the community in the chosen direction are listed, one is selected at random and then that household and its nearest neighbours are visited until the required seven children have been enumerated. None of the commonly used methods meets the basic requirement of probability sampling, that every eligible member of the target population has a known, non-zero chance of being selected. Simulation exercises suggest that the risk of sampling bias is substantially higher than in a conventional cluster sample. The paper suggests that a relatively simple modification can retain the advantages of the ‘30 by 7’ design while ensuring a true probability sample. This involves: the production of a simple sketch map of each selected cluster; dividing this into segments of roughly equal size; selecting one segment at random; and interviewing all eligible members of the target population(s). This approach also addresses the common situation where surveyors attempt to gather information on multiple indicators (e.g. vaccination and childhood illness incidence rates) from the same sample.
Myatt et al. (2005) argue that while the PPS approach used in the ‘30 by 7’ surveys may result in improved estimates overall, the associated tendency to sample areas of high population density may lead to a judgement that reasonable coverage has been achieved even where more remote, low-density areas have been severely neglected. They argue that this is of special concern in the case of feeding programmes, where a priority objective may be to identify such areas before children become severely malnourished. They describe an alternative approach which was first trialled in 2002 in the Mchinji district of Malawi where a district-wide feeding programme had been implemented. A 10km by 10km grid was overlaid on a map of the district. All those squares (quadrats) with more than 50 per cent of their area within the district were sampled. Communities nearest the centre of each quadrat were then sampled, with the sample size determined as the number that could reasonably be surveyed in a single day, based on the size of each community and the distance between them. All children in a community were screened to identify those suffering from malnutrition using a standard anthropometric criterion. Coverage in each quadrat was calculated as the proportion of malnourished children included in the feeding programme and overall coverage estimated by treating the quadrats as a stratum in a stratified sample. The survey was reported as proving simple, inexpensive and rapid, providing results within just ten days.
3. Quantitative analysis
As discussed in Chapter 1, implementation research has two broad aims:
- Understanding implementation processes, focusing on mechanisms that support or constrain those processes;
- Communicating that understanding to the multiple stakeholders who may contribute to the integration of findings into current and/or future implementations.
Those stakeholders may include:
- The implementation team;
- Providers and other actors in the health sector;
- National and local policymakers/officials;
- NGOs and CBOs;
- Donor and other international agencies;
- Beneficiary communities;
- The general population.
A key issue is that very few of these stakeholders will have specialist knowledge of quantitative or qualitative methods. It is therefore of central importance that analysis and, most importantly, presentation of findings must be carefully considered to avoid potential misinterpretations that could lead to inappropriate responses. Emphasis needs to be placed on simplicity and interpretability – stakeholders need to both understand the information provided and interpret it correctly (Walker, Bryce and Black 2007). In terms of quantitative analysis, this implies an emphasis on simple summary statistics such as:
- Counts, means, medians, ranges, percentiles;
- Rates, ratios and (for some stakeholders) risks;
- Frequency distributions, proportions and percentages.
This does not imply that complex analytical techniques are never appropriate; only that final communication of the analytical findings should meet the above criteria.
Designing analysis by purpose
A second important preliminary consideration is to clearly assess the primary objectives of any analysis – what specific issues are you trying to address? Implementation research is by nature intended not to simply describe specific implementations but to improve the process of implementation. For example, we might focus on:
Effectiveness: Research that aims to modify implementation procedures in order to improve the flow of benefits that result from a given level of resources. This is typically the primary aim of implementation research. It should also assess ‘how effective’ and ‘for whom’?
Efficiency: Analysis that attempts to assess the implications of possible modifications to the implementation process in terms of the value of benefit flows relative to resource costs. The aim will be to improve the benefit/cost ratio.
Equity: Analysis of distributional issues, i.e. how are benefits and resource costs distributed, typically relating to population subgroups?
Sustainability: Focus on identification of essential inputs, potential constraints on their availability and other possible barriers to medium- and long-term sustainability.
The aim in this section is not to teach statistical methods but to consider, given the objectives described above, the most appropriate choice of methods in the context of implementation research. Five main areas are addressed:
- Frequency distribution and summary statistics
- Relationships and confounding variables
- Subgroup analysis
- Statistical models
- Generalising from samples to populations.
A note on levels of measurement in quantitative studies
Variables are usually classified by their ‘level of measurement’:
- Ratio, e.g. weight of child, number of vaccinations;
- Interval, e.g. temperature, some disability measures;
- Ordinal, e.g. facility levels, quality of life indices;
- Nominal, e.g. district names.
The level of measurement should determine the appropriate type of analysis – for example, using an ordinal dependent variable in a regression contravenes one of the assumptions of such models. Researchers often ignore such restrictions. However, as previously indicated, because the findings are explicitly intended to influence important implementation processes and to be interpreted and used by a wide variety of stakeholders, it is probably reasonable to set a higher standard in implementation research.
Distributions and summary measures
Implementation research data can be seen as distributions of the values of study variables over selected study populations. For example, we may consider the distribution of white blood cell counts across patients, the numbers of children under five across households or outpatient attendances on a given day across primary facilities. Analysis can be seen as the use of techniques intended to summarise those distributions and estimate the extent to which they are related. For example, in a sample of newborn children we might summarise the distribution of birth weights by calculating the frequency of low, normal and high weight births, classifying as ‘normal’ those in some standard range. If we also calculated the frequency of different education levels for the mothers of those children, we could estimate the strength of a possible relationship between these two variables.
Using frequency distributions, which show the number of values of a given variable that fall into each of several non-overlapping (mutually exclusive) groups, has a number of advantages for this purpose (table 2). They are useful for all types of variable, are easy to explain and interpret for audiences without specialist knowledge, and can be presented graphically (figure 1) and/or in different formats to aid interpretation.
Table 2: Provider education frequency distribution
|Level of education of private providers||Frequency|
|Primary school certificate||57|
|Secondary school certificate||11|
Frequency distributions provide an extremely useful approach to the presentation of large volumes of data. In the above example, information relating to 250 people has been used to construct one small table and, very importantly, no information has been lost in the process – that is, it would be possible to regenerate the original list of data values given the table. There are a number of interesting alternative ways of presenting the above data. We often, for example, calculate the ‘relative frequency’ (proportion, percentage) of data items that fall into a specific class. Again, to provide a slightly different perspective, we can ‘cumulate’ these percentages to show, for example, that 94.8 per cent of our population have at most a primary school certificate as in table 3.
Table 3: Alternative presentation of a frequency distribution
|Level of education||Proportion||Percentage||Cumulative percentage|
|Primary school certificate||0.228||22.8||94.8|
|Secondary school certificate||0.044||4.4||99.2|
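The conversion from frequencies to percentages and cumulative percentages is easily automated; a sketch using only the two classes reproduced above (the full table includes further classes not shown here, so these running totals differ from the table's cumulative column):

```python
def relative_and_cumulative(counts, total=None):
    """Return (percentage, cumulative percentage) for each class, in order."""
    total = total or sum(counts.values())
    result, running = {}, 0.0
    for level, freq in counts.items():
        pct = 100 * freq / total
        running += pct
        result[level] = (round(pct, 1), round(running, 1))
    return result

# The two classes shown in table 2, out of the 250 respondents surveyed.
summary = relative_and_cumulative({'Primary school certificate': 57,
                                   'Secondary school certificate': 11},
                                  total=250)
```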
Similarly we can experiment with different graphical displays. Figure 2 below shows the percentages as the segments of a pie chart. Note that the percentages are rounded to whole numbers. As a general rule, it makes sense to present data only to the degree of accuracy that (a) it warrants (estimates are almost always based on data that contain errors); and (b) makes the point we wish to make. Excessive precision (for example expressing numbers to more than one or two decimal places) confuses the eye of the reader and reduces impact.
Defining groups for frequency distributions
A key decision in constructing a frequency distribution relates to the choice of groups. In the above examples, the educational attainment groups were predefined. However, we often have to decide how to specify such groups in order to best summarise a given data set. For example, incomes will need to be grouped into ‘income bands’ and age data into ‘age bands’. The way in which this is done will depend on the aims of the analysis. Demographic analysis, for example, will often aggregate ages into fixed five- or ten-year age bands, such as 0-4, 5-9, 10-14, etc., with a final open-ended class such as ‘75 and over’ (note that these classes are defined such that there is no overlap – the second, for example, relates to ‘all children of five years or older but under ten years’). An educationalist, on the other hand, might use classes such as 0-5, 6-12, 13-15, 16-18, 19+, where the classes are defined in line with the official age ranges for specific levels of education, for example pre-school, primary, lower secondary, etc.
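Assigning ages to the demographer's fixed bands is a simple calculation; a minimal sketch (the five-year width and the ‘75 and over’ cut-off follow the example above):

```python
def five_year_band(age, top=75):
    """Demographic-style bands: 0-4, 5-9, ..., 70-74, plus an open-ended
    final class for ages of 75 and over."""
    if age >= top:
        return f'{top} and over'
    lower = (age // 5) * 5
    return f'{lower}-{lower + 4}'

print(five_year_band(7))     # 5-9
print(five_year_band(80))    # 75 and over
```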
Just as above, we can construct a frequency distribution based on these classes, showing the number of people falling into each age band. However, here the definition of groups does involve a loss of information. Given the number of children in the 5-9 age band, we cannot deduce the ages of the individual children in this class. More frustratingly, we cannot, in the above example, derive the frequency distribution preferred by the educationalist if we are presented with that derived by the demographer. This can be a major problem because we often wish to combine distributions from more than one source. For example, we might know the number of children in primary school and wish to express this as a proportion of all children in the 6-12 age band. If we only know the numbers in the 5-9 and 10-14 age bands, we cannot directly calculate the number we require and have to resort to a complicated weighting procedure based on more or less plausible assumptions.
How should classes be defined? In some cases, such as types of facility or staff salary ranges, official definitions may be most appropriate. If such obvious classifications do not exist or do not serve our purposes, we usually try to balance two conflicting objectives – limiting the loss of information (by using a relatively large number of classes) and providing a simple summary (by using a relatively small number of classes). In general, we would also prefer to make all the group intervals of equal width, because this simplifies comparisons between one group and another. In table 4, for example, a much higher percentage of the studied school-age population are in the second age band than are in the third. However, this is obviously at least partly because this class covers a greater range of ages: seven years as compared to three.
Table 4: Percentage distribution by level of schooling age-bands
Note that the column chart below, which is derived from these data, does not reflect the differences in class width. The age bands are used simply as labels for the columns, which are all of equal width. It is the height of the column that shows the percentage falling into each age-band.
Joint frequency distributions
One of the simplest and yet most powerful techniques for analysing and presenting data involves comparing the frequency distributions of two groups within the study population. Table 5 takes the data used above and disaggregates by the gender of the respondent.
Table 5: Joint frequency distributions for two or more variables
|Level of education||Men||Women||Total|
|Primary school certificate||32||25||57|
|Secondary school certificate||8||3||11|
Doing this reveals interesting new information. Although almost the same number of men and women were asked (128 and 122), it would appear from our sample that educational achievement is much higher for the former. We can make the comparison clearer by using the relative frequencies or percentages based on the total number of individuals in each group. Table 6 shows, for example, that 52.5 per cent of women are reported to be illiterate as compared to 32.8 per cent of men. Obviously the conversion to percentages would be even more useful if the numbers in the two groups differed more substantially.
Table 6: Joint distribution using column percentages
|Level of education||Men||Women||Total|
|Primary school certificate||25.0||20.5||22.8|
|Secondary school certificate||6.3||2.5||4.4|
The above table can again be presented graphically in a column chart, as in figure 4.
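The column percentages in table 6 can be derived mechanically from the counts in table 5; a sketch (note that Python's round() uses banker's rounding, so isolated figures may differ by 0.1 from hand-rounded tables):

```python
def column_percentages(table, column_totals):
    """Express each cell as a percentage of its column total."""
    return {row: [round(100 * v / t, 1) for v, t in zip(values, column_totals)]
            for row, values in table.items()}

# Counts from table 5: men, women, total (128 + 122 = 250 respondents).
joint = {'Primary school certificate': [32, 25, 57],
         'Secondary school certificate': [8, 3, 11]}
percentages = column_percentages(joint, [128, 122, 250])
```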
An alternative presentation, that can be useful if we wish to focus on the composition of each class, is obtained by calculating row percentages based on the number of individuals in each education group (table 7).
Table 7: Joint distribution using row percentages
|                              | Men  | Women | Total |
|------------------------------|------|-------|-------|
| Primary school certificate   | 56.1 | 43.9  | 100.0 |
| Secondary school certificate | 72.7 | 27.3  | 100.0 |
When interpreting percentage distributions it is always important to check on the absolute size of the denominator on which they are based. For example, the above table shows that 50 per cent of those with a higher-level qualification are men and 50 per cent women. Before getting too excited about this apparent example of gender equality, we should note that only one man and one woman are in this class!
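The derivation of Tables 5–7 can be reproduced in a few lines of Python, using only the two certificate rows shown above (the column order men, women is an assumption based on the row percentages in Table 7):

```python
# Counts surviving from Table 5; column order men, women is assumed.
counts = {
    "Primary school certificate":   {"men": 32, "women": 25},
    "Secondary school certificate": {"men": 8,  "women": 3},
}
group_sizes = {"men": 128, "women": 122}   # respondents per group, from the text

column_pct = {}   # as in Table 6: percentage of each gender group
row_pct = {}      # as in Table 7: composition of each education class
for level, c in counts.items():
    row_total = sum(c.values())
    column_pct[level] = {g: round(100 * c[g] / group_sizes[g], 1) for g in c}
    row_pct[level] = {g: round(100 * c[g] / row_total, 1) for g in c}

print(column_pct)
print(row_pct)
```

Column percentages use the group sizes as denominators; row percentages use the row totals, which is why each row of Table 7 sums to 100.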
Summary statistics and frequency distributions
Careful examination of the frequency distribution of a variable can be an extremely powerful and robust form of analysis. Unfortunately it is often bypassed. There is a tendency to move too quickly to the calculation of simpler ‘summary statistics’ that are intended – but often fail – to capture the essential features of the distribution. These usually focus on the derivation of measures:
- to indicate the overall ‘location’ of a distribution – how sick, poor, educated is a study population ‘on average’?
- to indicate the extent of ‘variation’ within that population.
However, the reasons for selecting a particular summary statistic should obviously relate to the purpose for which it is intended. For example, if we ask the question ‘Has the recently implemented intervention reduced the problem of malnutrition among five-year-olds in this village?’, there is no doubt as to which of the following possible summary statistics would be more useful:
- Change in mean daily calorie intake of all five-year-olds in the village, or
- Change in proportion of five-year-olds in the village falling below a predetermined minimum calorie requirement.
Bearing in mind the above discussion about the need to present research findings in ways that are appropriate to the various stakeholder groups, appropriate criteria for the selection of summary statistics might be: (1) is the statistic clearly relevant to the specific concern we wish to address; (2) will stakeholders understand how it was derived; and (3) will stakeholders interpret it as intended – that is, are they taking what we would regard as the right message from the information we are providing? We can consider how to apply these criteria by considering some simple examples.
Mean or median?
There is a tendency for quantitative analysis of continuous variables to start by comparing mean values over time – for example by how much has the mean cost of treatment increased – or for different sections of the population, such as how does the mean length of stay in hospital vary between urban and rural populations? The mean is the most commonly used statistic, often seen as the ‘natural’ measure of central location and used without much thought. This is largely because it is simple to calculate and manipulate. In the days before analysis was done by computer, it was relatively easy to calculate means either by hand or using a calculator. Moreover, given the means for two population groups (for example, two health districts) it was very easy to calculate the mean of the combination as:
combined population mean = (n1 x population mean1 + n2 x population mean2) / (n1+n2)
where n1 and n2 are the number of observations in the two populations.
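A minimal sketch of the pooled-mean calculation, with invented district sizes and means:

```python
def combined_mean(n1, mean1, n2, mean2):
    """Pooled mean of two groups from their sizes and means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

# Two hypothetical health districts: 200 patients averaging 3.5 visits
# and 50 patients averaging 5.0 visits.
overall = combined_mean(200, 3.5, 50, 5.0)
print(overall)  # 3.8 – pulled towards the larger district
```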
On the other hand, we know that most people tend to misinterpret the mean. They assume that it can always be seen as representing the ‘typical’ value in a population, for example interpreting GDP/capita as the income of a typical person in a given country. In practice this is only a valid interpretation where the underlying frequency distribution is symmetric, for example the so-called ‘normal’ distributions that tend to occur for physical measures such as age-specific heights and weights. For example, in Figure 5 the mean birth weight is 7.5 lbs, which can be seen as providing a reasonable idea of the typical birth weight of a baby in this population.
When the distribution is ‘skewed’, as in Figure 6, the mean can be seriously misleading as an indicator of the situation of most members of the population. It is pulled to the right by the limited number of individuals with high values. Such distributions are very common for variables such as expenditures, income, wealth, lengths of stay in hospital, etc.
Where the distribution is skewed in this way, the median value may be a better guide. It has the additional advantages of being easy to define and interpret – ‘line up the population in order and identify the one in the middle’ is relatively easy to explain to all stakeholders. The use of medians may be particularly important in analysis of data sets liable to errors that may include extreme outlier values (it is not unusual, for example, for an individual to accidentally add a zero to a number). Including these outliers in the calculation of the mean, which as indicated above is sensitive to large values, can give rise to biased results. The median is not affected. An alternative approach sometimes used to deal with outliers is the ‘trimmed mean’. For example, a 5 per cent trimmed mean removes the smallest and the largest 5 per cent of data values from the studied population and re-computes the mean using the reduced sample. This can be a useful approach but has the major disadvantage that it often appears somewhat arbitrary and increases the difficulty of explaining results to stakeholders.
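The contrast between mean and median under a single mis-keyed value can be illustrated with invented payment data:

```python
from statistics import mean, median

# Invented payments for treatment; one value is mis-keyed with an extra zero.
payments   = [120, 135, 110, 150, 125, 140, 130]
with_error = [120, 135, 110, 150, 125, 1400, 130]

print(mean(payments), median(payments))       # 130 130
print(mean(with_error), median(with_error))   # 310 130 – the median is unaffected
```

A single outlier more than doubles the mean while leaving the median exactly where it was, which is the property exploited when medians are preferred for error-prone data sets.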
Even the median is not much help in more complex distributions, such as the ‘bi-modal’ in figure 7. This type of distribution is often found where two subgroups are combined, for example patients in urban and rural hospitals. The most useful analysis in such cases involves identifying and separating the subgroups. This again emphasises the need to understand how variables are distributed in order to summarise them in ways that are helpful rather than misleading.
Measures of variation
In a population that has relatively limited variability in terms of the variable in which we are interested, a measure of location can be seen as reasonably ‘representative’ of the overall population and there is limited loss of information if we use this as a summary measure. If all those receiving treatment for malaria pay roughly the same amount, we lose little by describing the median or mean payment as ‘the cost of malaria treatment’. On the other hand, if the amount paid for treatment of tuberculosis varies substantially across cases, use of the location measure alone would not be an appropriate summary of the data. We would be losing valuable information. Essentially, high variability implies that we have something to explain. Is the variability between urban and rural areas, between facilities within those areas, or between patients who are insured and those who are not?
The variance is a measure of variability that takes account of all the data values relating to a study population. It asks the question ‘how far away, on average, are the data values from the mean?’ If we were considering length of stay, for example, and for most patients the stay in hospital was close to the mean, we would say that the distribution was relatively homogeneous – with limited variation ‘about the mean’. To calculate the variance we first determine the differences between each value and the mean, the ‘deviations from the mean’, square each of these differences, find their sum and divide by the number of values:
variance = ∑(xi – m)² / n
Note that the size of the variance can often be determined by a limited number of deviations that are much larger than the rest. For example, if we have 100 inpatients and 48 stay in hospital for two days, 50 for three days and two for 20 days, the mean length of stay would be 2.86 days and the variance 6.24. Without the long-stay patients the variance would be 0.25. Simply using the mean and variance to summarise these data would lead to the incorrect interpretation that length of stay varied considerably, when it would be much more useful to report that it was almost constant but for a few exceptional cases.
This effect results from the squaring of the deviations – squaring a large number produces a very large number. We saw above that the mean was influenced by outliers, but this effect is much more pronounced for the variance. The earliest use of the variance as an indicator of dispersion was in the field of scientific measurement and here it was considered an advantage that it was so influenced by outliers. These were either errors of measurement or extremely interesting data points – both of which required explanation. However, in social research it may often be an undesirable characteristic, first because the errors are typically of less interest (for instance caused simply by poor reporting), and second because it tends to focus analysis on attempts to explain the behaviour or experiences of a small number of individuals in what may be a fairly homogeneous population. Analysis of the differences in length of stay between small rural hospitals and the main teaching hospitals may be interesting, but from a policy perspective it would probably be differences between the rural hospitals, if they were similar in most other respects, that would be more relevant.
The standard deviation is the square root of the variance. It has similar characteristics but also the advantage that it is expressed in the same unit of measurement as the original data. In the example above the standard deviation for all patients would be √6.24 = 2.5 days and without the outliers it would be √0.25 = 0.5 days. The standard deviation is a very important measure when we consider sampling from a population.
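The length-of-stay example can be checked directly; Python’s statistics.pvariance and statistics.pstdev use the population formulae (dividing by n) given above:

```python
from statistics import mean, pvariance, pstdev

stays = [2] * 48 + [3] * 50 + [20] * 2   # 100 inpatients, length of stay in days

print(round(mean(stays), 2))       # 2.86
print(round(pvariance(stays), 2))  # 6.24
print(round(pstdev(stays), 2))     # 2.5

without_outliers = [2] * 48 + [3] * 50
print(round(pvariance(without_outliers), 2))  # 0.25
```

Dropping just two long-stay patients cuts the variance by a factor of about 25, which is the sensitivity to outliers discussed above.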
Another commonly used measure derived from the variance is the coefficient of variation. This is defined as:
Coefficient of variation = 100 x standard deviation / mean
and provides a measure of variation relative to the mean. This is a useful statistic when comparing the variations of data sets that have substantially different means. For example, if we compared the variation in incomes of a population of hospital doctors with that of a population of nurses, we would probably find, using any of the measures described above, that the former was considerably larger. However, this would be at least partly due to the generally higher incomes of doctors: essentially, the higher the incomes the more scope there is for variation. The coefficient of variation is not affected by this issue. Another advantage is that it is a pure ratio with no unit of measurement (because both the numerator and denominator have the same measurement unit). Thus, for example, we could directly compare the variation of incomes that are expressed in different currencies using this measure. It is also unaffected by inflation (as both numerator and denominator are equally affected), so we can consider whether income variability has increased over time without worrying about the need to adjust using price indices.
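A short sketch of the coefficient of variation, with invented incomes; note that scaling all values by a constant (a currency conversion, say) leaves it unchanged:

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """100 x standard deviation / mean."""
    return 100 * pstdev(values) / mean(values)

nurses  = [300, 320, 340, 360, 380]       # invented monthly incomes
doctors = [900, 960, 1020, 1080, 1140]    # three times higher, same pattern

print(round(coefficient_of_variation(nurses), 1))
print(round(coefficient_of_variation(doctors), 1))  # identical relative variation
```

The doctors’ incomes have three times the standard deviation of the nurses’, but the same coefficient of variation, illustrating why the CV is preferred for such comparisons.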
Variances, standard deviations and coefficients of variation are widely used in statistical analysis. As with the mean, this is not because they are always the ‘best’ measures of variability (they can be easily interpreted for normally distributed variables but not for other distributions) but mainly because they can be readily calculated and manipulated. For example, given the variances of two population subgroups it is easy to combine them to calculate the overall population variance. However, while they may have technical advantages, all these measures have serious limitations in terms of policy application, given that there is no way to provide a simple explanation of their derivation that would be understood by the great majority of stakeholders.
Alternative, more easily interpreted, measures of variation
Just as the median divides a data set into two halves, with 50 per cent above and 50 per cent below, quartiles can be used to divide it into four quarters with 25 per cent of the study population in each. There are three quartile values, usually denoted Q1, Q2 and Q3. If the data are listed in ascending order, Q2 is simply the median, Q1 is the median of the data points below the median and Q3 is the median of the points above it. A useful measure of variation that is relatively easy to interpret and explain is Q3 – Q1, the inter-quartile range, which covers the ‘middle 50 per cent’ of a population.
When we have data on a reasonably large population (at least 100 members) we can extend the above to calculate percentiles. The pth percentile divides the data into two parts with approximately p per cent having values less than the pth percentile and (100 – p) per cent having values greater. Thus the 50th percentile is the median, the 25th percentile is the first quartile, etc. Other common percentiles, often used with incomes and expenditures, are the 20th (which defines the first ‘quintile group’) and the 10th (which defines the first ‘decile group’). In describing inequality in income of doctors, for example, we might estimate the proportion of total incomes paid to the bottom and top decile groups.
Precise formulae for calculating percentiles are available and used in computer statistical packages. However, when the number of data points is large, an approximation is usually perfectly adequate. For example, if there were 513 data points, it would be reasonable to locate the four quintile cut-points at approximately the following positions in the ordered data:
first cut-point ≈ 513/5 ≈ 103
second cut-point ≈ 2 x 513/5 ≈ 205
third cut-point ≈ 3 x 513/5 ≈ 308
fourth cut-point ≈ 4 x 513/5 ≈ 410
(rounding to the nearest integer) and use the values at these positions as the four cut-points that divide the population into five quintile groups.
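Python’s statistics.quantiles returns these cut-points directly (here for the integers 1 to 100; the package’s default interpolation method will differ slightly from the rough approximation above):

```python
from statistics import median, quantiles

data = list(range(1, 101))   # 100 data points: 1, 2, ..., 100

q1, q2, q3 = quantiles(data, n=4)   # quartile cut-points
iqr = q3 - q1                       # inter-quartile range
deciles = quantiles(data, n=10)     # nine cut-points defining ten decile groups

print(q1, q2, q3, iqr)
print(len(deciles), deciles[0], deciles[-1])
```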
Lorenz curves and Gini coefficients
A Lorenz curve provides an alternative approach to measuring dispersion based on the cumulative distribution of a variable. The approach is often used for income or wealth distributions: for example, ‘what share of the total income received by a given population goes to the 20 per cent who receive the lowest incomes?’, ‘what share goes to the lowest 40 per cent?’, etc. By definition, the shares of each income group will increase as we move up through the income quintiles. However, the approach can also be used to analyse access to services. For example, we can ask ‘what percentage of vaccinated children 12-23 months come from the subdistrict with the lowest vaccination rate?’, ‘what percentage from the two subdistricts with the lowest rates?’ etc. If we plot those percentages against the total percentage of children 12-23 months, cumulating over subdistricts, we obtain a Lorenz curve illustrating variation in vaccination rates (figure 8).
The Gini coefficient, the ratio of area A to area (A+B), provides an alternative summary measure of variability that is often used when equity is a priority concern. If there is complete equality, area A and the Gini coefficient equal 0. As inequality increases, area A grows (and B shrinks) and the Gini coefficient approaches 1. For any population the Gini coefficient will therefore lie between 0 (complete equality) and 1 (complete inequality), but there is no simple interpretation of intermediate values. It is typically more useful, and certainly easier to communicate with stakeholders, if findings focus on the overall distribution illustrated by the Lorenz curve rather than exclusively on the Gini coefficient.
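For those computing Gini coefficients directly from data, a short sketch using the ‘mean absolute difference’ form of the coefficient (which is algebraically equivalent to the ratio of areas) and invented values:

```python
def gini(values):
    """Gini coefficient via the mean absolute difference form:
    sum of all pairwise absolute differences / (2 * n^2 * mean)."""
    n = len(values)
    mean_value = sum(values) / n
    total_abs_diff = sum(abs(x - y) for x in values for y in values)
    return total_abs_diff / (2 * n * n * mean_value)

print(gini([10, 10, 10, 10]))    # 0.0 – complete equality
print(gini([1, 2, 3, 4, 10]))    # one value holds half the total
```

The quadratic loop is fine for illustration; for large data sets a sorted, cumulative (Lorenz-curve) computation is the usual approach.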
Concentration curves can be seen as an extension of the Lorenz curve approach to include relationships between two variables. Typically, they show the cumulative percentage of a health status variable plotted against the cumulative percentage of a population ranked by socioeconomic status. For example, figure 9 shows the cumulative percentage of under-five deaths plotted against the cumulative percentage of births, ranked by the wealth status of the households in which those births occurred (O’Donnell et al. 2008). As might be expected, the curves both lie above the line of equality because under-five mortality rates decrease with increases in wealth. The fact that the line for India is above that for Mali indicates that inequality in death rates was uniformly higher in India. The interpretation would have been more complicated if the lines for the two countries had crossed at some point. As with the Gini coefficient, it is possible to calculate a concentration index if a simple measure of inequality is desired.
Source: Derived from O’Donnell et al. (2008), Supplementary material
Risk measures: Handle with care
Finally in this section, we can consider measures of ‘risk’. These are widely used in health research but again are not well understood by the general population. For example, if the risk of contracting typhoid in an urban area over a one-year period is one in 10,000 and an intervention claimed to have reduced this to one in 20,000, this would probably be reported in local media as ‘halving the risk of contracting typhoid’. There might then be a popular call for the intervention to be introduced at scale. However, this would disregard (a) the low risk prior to the intervention and (b) the likely estimation (sampling and non-sampling) errors when attempting to measure such rare events.
As another example, ‘risk’ and ‘odds’ are often confused. If we denote the risk of an event as P, then
Risk (P) of an event = number experiencing an event / population at risk.
Relative risk (P(A)/P(B)) = risk in group A / risk in group B.
Odds of an event = number experiencing / number not experiencing = P / (1-P)
Odds ratio = [PA/(1- PA)] / [PB/(1-PB)]
This distinction is particularly important when we consider reductions in risk, which are not equal to reductions in odds. For example:
Risk of malaria before intervention = P(B) = 0.5
Risk of malaria after intervention = P(A) = 0.1
Relative risk = P(A)/P(B) = 0.1/0.5 = 0.2 (an 80 per cent reduction in risk)
Odds ratio = (0.1 / 0.9) / (0.5 / 0.5) = 0.11 (an 89 per cent reduction in odds)
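The malaria example can be reproduced with a few helper functions (the function names are ours, for illustration):

```python
def relative_risk(p_after, p_before):
    """Risk ratio: risk after / risk before."""
    return p_after / p_before

def odds(p):
    """Odds corresponding to a risk p: p / (1 - p)."""
    return p / (1 - p)

def odds_ratio(p_after, p_before):
    return odds(p_after) / odds(p_before)

p_before, p_after = 0.5, 0.1   # malaria risk before and after the intervention
print(round(relative_risk(p_after, p_before), 2))  # 0.2
print(round(odds_ratio(p_after, p_before), 2))     # 0.11 – a different quantity
```

Reporting one quantity while a reader assumes the other is a common source of confusion, particularly for risks near 0.5, where odds and risks diverge most.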
The denominator problem
For the above calculation it is necessary to know the overall size of the population ‘at risk’. Similarly, in clinical research one common summary statistic is the proportion or percentage of patients in the intervention and control groups whose condition improves. Calculating such proportions also requires data on the total membership and number improving in each group. In implementation studies, it is often very difficult to calculate or even reliably estimate these summary statistics because the denominator is not reliably known.
For example, we often have only a rough estimate of the number of children who should be immunised or could be sleeping under a net in a given district. Similarly, the catchment population of a facility or actual number of births over a period of time are often unknown. Because of this uncertainty, it is good practice to provide the estimates of both the numerator and denominator alongside any proportion, percentage or risk estimate and to indicate the sources used in the calculation.
As indicated above, we can regard analysis as essentially concerned with the explanation of variability. For example, why do the costs of care for a given condition vary between patients and/or between facilities? Can this be explained by variations in the severity of the condition or do other factors – patient gender or age, type of hospital, diverse treatment protocols, urban/rural location, etc. – play some role? In general terms, is variation in one variable associated with variation in another and does that association imply some causal relationship? As indicated above, this is an enormous topic to which we can only provide an introduction in this chapter. One excellent online course for those who wish to gain an in-depth knowledge of modelling techniques is that provided by the University of Bristol.
During analysis we will often find that the outcomes of an intervention vary substantially between different subgroups of the target population. It would then seem natural to explore the possibility that the variables that define that subgroup may in some sense have caused the variation or alternatively been caused by it. However, subgroup analysis can be a contentious issue if the subgroups are not predefined. Large data sets containing multiple variables, whether from routine data systems or sample surveys, will often tend to exhibit patterns that may arise purely by chance. The term ‘data mining’ is often used to describe the process of exploring data sets to discover apparent relationships that may be of interest. It is generally regarded as useful when used to formulate new hypotheses but requiring great caution to avoid being misled – if you search through all possible combinations of variables in a data set containing perhaps 50 or more, there is a high probability that you will stumble across a number of apparent relationships purely by chance.
This is a particular issue in implementation research, where the emphasis is on detailed understanding of processes and an acceptance that relationships between inputs and outcomes may often be mediated by other variables. For example, suppose we find that the prevalence of chronic illness varies by age group and sex as in Table 8. If we obtained these findings using a rapid survey as defined above, we must first consider whether the sample size was sufficient to provide reasonably reliable estimates of prevalence in each cell of the table.
Table 8: Prevalence of chronic Illness by sex and age group
Percentage reporting at least one chronic illness
One of these relationships, between chronic illness and age group, is long established and well understood – as bodies age they tend to accumulate defects that are linked to various types of chronic illness. The other, the higher prevalence of chronic illness among women in all age groups, is less easily explained. It would not be correct to leap to the conclusion that women are naturally more prone to chronic illness than men. We might consider a range of possible hypotheses exploring, for example, the influence of childbirth, the activities mainly undertaken by women and by men, whether women were more likely to report illness than men, or whether they were less likely to receive treatment for acute illnesses that then became chronic conditions. We might be able to examine some of these hypotheses by further analysis of this or other data sources, or by undertaking qualitative studies such as those described in Chapter 8. The key requirement for researchers is to ensure that they have convincing evidence before advancing one or other of these theories to explain such observations.
Controlled and confounding variables
We sometimes describe such an analysis as one that assesses the relationship between inputs and outcomes controlling for other factors. Typically we know that in practice a very large number of other factors may influence this relationship, for example occupation, level of education, socioeconomic status, household size, type of dwelling, rural/urban location, etc. As indicated in Chapter 5, random allocation of subjects to subgroups would allow us to argue that the potentially ‘confounding’ effects of such variables average out. That will almost never be possible in the type of interventions we are considering and we therefore need to find some way to allow for these effects.
Cross-tabulating by all such factors, even with an apparently large data set, would almost always result in the numbers in most cells being too small to permit analysis. One alternative is to construct a model of the relationship between outcome and inputs that takes into account the effects of other confounding variables. This typically involves very strong assumptions both as to the nature of the multiple relationships between these variables and their individual distributions – assumptions that are often not adequately recognised or tested. As above, it can be argued that the explicit intention to change implementation practice and influence a wide range of stakeholders requires implementation researchers to set higher standards than those conducting more exploratory research.
Models and presentation of findings
Models are typically very simplified versions of reality and we should be very cautious in their interpretation. In particular we should recognise that most stakeholders will typically have little understanding of the assumptions underlying those models. Modelling may be useful to explore our data but should be seen as an intermediary stage in the generation of findings that can be readily comprehended and interpreted. As with the step from distributions to summary measures, we should proceed cautiously and try to ensure that we understand the underlying form of the relations that we are trying to model.
Just as we can understand a great deal about individual variables by examining frequency distributions, much can be learned about two-way relationships from simple scatter diagrams. Figure 10 illustrates that such relationships can take a great variety of forms. Our common, often unspoken, assumption of linear relationships between variables is often not merely incorrect but irrational. The number of cases of tuberculosis found cannot increase linearly with expenditure on case finding, because finding cases will become increasingly difficult once the ‘low-hanging fruit’ have been identified. Hospital net revenues cannot increase linearly with the number of inpatients, because the marginal cost of an inpatient will decline as the number increases.
The linear regression model
By far the most common approach to model building is the use of some form of linear model and we can use this to illustrate modelling possibilities and limitations. The simple linear regression model is illustrated in figure 11. It is usually expressed by an equation of the form:
yi = α + β xi + εi
yi is the value of a response (outcome) variable for the ith observation.
xi is the value of an explanatory (input) variable for the ith observation.
εi is the value of a random error term for the ith observation.
This model is the equation of a straight line where:
α is the intercept and
β is the slope.
The following strong assumptions, which many researchers choose to ignore, are required in order to argue that a regression model is appropriate:
- The relationship between X and Y is linear;
- The values of the independent variable X are assumed fixed (not random) – the only randomness in the values of Y comes from the error term ε;
- The errors εi are uncorrelated (independent) in successive observations;
- The errors εi are normally distributed with mean 0 and variance σ² [ε ~ N(0, σ²)].
We choose α and β such that the sum of squares of deviations from the regression line [Σ(observed value of yi at xi – predicted value of yi at xi)²] is minimised. This is known as the error sum of squares (ESS) about the regression. ESS/(n–2), where n is the number of observations, provides an unbiased estimate of σ².
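A sketch of the least-squares calculation with invented data, including the unbiased estimate of σ² described above:

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = alpha + beta * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta

xs = [1, 2, 3, 4, 5]                 # invented explanatory values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]      # invented responses, roughly y = 2x
alpha, beta = fit_line(xs, ys)

residuals = [y - (alpha + beta * x) for x, y in zip(xs, ys)]
ess = sum(r * r for r in residuals)          # error sum of squares
sigma2_hat = ess / (len(xs) - 2)             # unbiased estimate of sigma^2
print(round(alpha, 2), round(beta, 2), round(sigma2_hat, 3))
```

The residuals computed here are exactly the quantities plotted in residual diagnostics such as the one discussed below; with an intercept in the model they always sum to zero.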
Variance components and the coefficient of determination
The error sum of squares can be compared to the sum of squared deviations about the mean (TSS) to see how much of this can be ‘explained’ by fitting the regression line. The division of the sum of squares about the mean (TSS) into two ‘components’, a regression sum of squares (RSS) and an error or residual sum of squares (ESS), is the simplest example of the ‘variance components’ approach to model building, which plays a central role in multilevel modelling.
The use of the phrase ‘explained by the regression line’ should not be taken literally. It refers simply to the above ratio, which is interesting only if all the assumptions made in defining our model are correct – this is rarely the case. One way of exploring the value of our model is to look at the deviations of observations from the value ‘predicted’ by our regression line – the ‘residuals’. We do this using a scatter plot with the explanatory variable (X) on the horizontal axis and the residuals on the vertical axis as in figure 12.
As indicated above, we use models to allow for the effects of a variety of potentially confounding variables. To do this we construct a multiple regression model:
y = α + β1x1 + β2x2 + β3x3 + . . . + δ1z1 + δ2z2+ . . . + ε
y is the response variable
xi are known explanatory variables
zi are known confounding variables
However, one often intractable issue is that there is typically a range of factors that we have either ignored or cannot measure. The true model should be written:
y = α + β1x1 + β2x2 + β3x3 + . . . + δ1z1 + δ2z2+ . . . + γ1c1 + γ2c2 . . . + ε
y is the response variable
xi are known explanatory variables
zi are known confounding variables
ci are unknown confounding variables
ε is an error term.
This is described as a ‘specification’ error. In general, omitting such variables from the model has serious implications in terms of undermining the basic assumptions identified above.
Statistical inference in regression models
As discussed above, with the widespread availability of statistical software, it is expected that, where data have been collected using probability sampling, all estimates will be accompanied by estimated error margins. For example, in the simplest case of a random sample of size n we know that we can estimate 95 per cent confidence limits for a population mean as:
sample mean ± 2 s/√n
where the term s/√n is the ‘standard error’ of estimation (s.e.). It was also indicated above that for other probability sampling designs the formula for the standard error will vary, but the general form of the confidence interval remains the same and can be extended to other statistics:
estimated value ± 2 s.e.
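A minimal sketch of this interval for a simple random sample (invented measurements):

```python
from math import sqrt
from statistics import mean, stdev

sample = [12, 15, 11, 14, 13, 16, 10, 13, 14, 12]   # invented measurements
n = len(sample)
se = stdev(sample) / sqrt(n)                        # standard error of the mean
low, high = mean(sample) - 2 * se, mean(sample) + 2 * se
print(f"mean = {mean(sample)}, 95% limits approx ({low:.2f}, {high:.2f})")
```

For a cluster sample the same ± 2 s.e. form applies, but s.e. must come from a design-based formula rather than s/√n.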
Given that regression estimates will also need to be accompanied by error margins, we again have to address the issue that most surveys will use a sample design that involves cluster sampling at one or more levels. For example, the DHS surveys typically involve:
- stratification by states/provinces and then by urban and rural areas;
- a PPS cluster sample of enumeration areas within each stratum;
- a systematic sample of 30 households per cluster.
As discussed above, failure to allow for the larger sampling errors associated with cluster sampling can result in confidence limits that are too narrow and in incorrect assessments of tests of statistical significance. With a cluster sample the sum of squares about the mean has two components:
SS(about mean) = SS(between clusters) + SS(within clusters)
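The decomposition can be verified numerically; the clusters below are invented:

```python
from statistics import mean

# Invented clustered data: three clusters (e.g. enumeration areas).
clusters = [[4, 5, 6], [8, 9, 10], [1, 2, 3]]
values = [v for cluster in clusters for v in cluster]
grand_mean = mean(values)

total_ss   = sum((v - grand_mean) ** 2 for v in values)
within_ss  = sum((v - mean(c)) ** 2 for c in clusters for v in c)
between_ss = sum(len(c) * (mean(c) - grand_mean) ** 2 for c in clusters)

print(round(total_ss, 6), round(between_ss + within_ss, 6))  # components sum to the total
```

Here most of the variation lies between clusters, which is exactly the situation in which simple-random-sample formulae understate the true sampling error.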
If the random sample formula for the error sum of squares is used, estimates of model parameters may be unbiased but estimated confidence limits will typically be far too narrow – that is, we will be substantially exaggerating the precision of our estimates. Multilevel modelling (Goldstein 1999) explicitly builds the variation between clusters into the model and estimates the between-cluster variation.
Random intercept model
We can allow for between-cluster variation by formulating the model:
yij = β0 + β1 xij + [uj + εij]
yij is the value of y for individual i in cluster j
xij is the value of x for individual i in cluster j
uj is the deviation of the mean of cluster j from the global mean
We then have two random variables in the equation:
uj ~ N(0, σu²)
εij ~ N(0, σε²)
and can obtain correct estimates for β0, β1, σu² and σε².
Note that sometimes the variation between clusters may itself be of interest. For example, if we have clustered by health facility, we can estimate the proportion of total variability ‘explained’ by between-facility variation (Lopez-Cevallos and Chi 2009).
Sample survey software
Multilevel modelling can similarly be used in a wide range of other contexts where relationships exist between different ‘levels’ of a health system (district, facility, doctor, patient, etc.). Using modern survey analysis software it is relatively straightforward to describe even complex sampling designs and obtain appropriate parameter estimates. These packages include the ‘usual suspects’ – SAS, Stata and SPSS – and more specialist software such as MLwiN. They can all readily address the most common health survey designs involving cluster sampling and unequal sampling probabilities.
Example: A multilevel analysis of self-reported tuberculosis disease in a nationally representative sample of South Africans was undertaken based on the 1998 DHS (Harling, Ehrlich and Myer 2008). Individual and household-level demographic, behavioural and socioeconomic risk factors were taken from the DHS; data on community-level socioeconomic status (including measures of absolute wealth and income inequality) were derived from the 1996 national census.
Of the 13,043 DHS respondents, 0.5 per cent reported having been diagnosed with tuberculosis disease in the past 12 months and 2.8 per cent reported having been diagnosed with tuberculosis disease in their lifetime. In a multivariate model adjusting for demographic and behavioural risk factors, tuberculosis diagnosis was associated with cigarette smoking, alcohol consumption and low body mass index, as well as a lower level of personal education, unemployment and lower household wealth. In a model including individual- and household-level risk factors, high levels of community income inequality were independently associated with increased prevalence of tuberculosis.
The multilevel analytic approach was seen as allowing for the differentiation between community- and individual-level mechanisms in the relationship between socioeconomic status and tuberculosis. Furthermore, these data allow strong inferences to be drawn regarding risk factors for tuberculosis disease across the country: a nationally representative cross-sectional survey provided evidence on individual and household characteristics, while South African census data provided robust estimates of the true community-level socioeconomic characteristics across the nation.
Bennett, S.; Woods, T.; Liyanage, W.M. and Smith D.B. (1991) ‘A Simplified General Method For Cluster-Sample Surveys of Health in Developing Countries’, World Health Statistics Quarterly 44: 98–106, http://apps.who.int/iris/bitstream/10665/47585/1/WHSQ_1991_44%283%29_98-106_eng.pdf?ua=1 (accessed 30 March 2015)
Eurostat (2007) Handbook on Data Quality Assessment Methods and Tools. Wiesbaden: European Commission, http://unstats.un.org/unsd/dnss/docs-nqaf/Eurostat-HANDBOOK ON DATA QUALITY ASSESSMENT METHODS AND TOOLS I.pdf (accessed 30 March 2015).
Goldstein, H. (1999) Multi-level Statistical Models, London: Arnold
Goodson, J.L.; Kulkarni, M.A.; Eng, J.L.V.; Wannemuehler, K.A.; Cotte, A.H.; Desrochers, R.E.; Randriamanalina, B. and Luman, E.T. (2012) ‘Improved Equity in Measles Vaccination from Integrating Insecticide-treated Bednets in a Vaccination Campaign, Madagascar’, Tropical Medicine and International Health 17.4: 430–37
Harling, G.; Ehrlich, R. and Myer, L. (2008) ‘The Social Epidemiology of Tuberculosis in South Africa: A Multilevel Analysis’, Social Science and Medicine 66: 492–505
Lemeshow, S. and Robinson, D. (1985) ‘Surveys to Measure Programme Coverage and Impact: A Review of the Methodology Used by the Expanded Programme on Immunization’, World Health Statistics Quarterly 38.1: 65–75
Lopez-Cevallos, D.F. and Chi, C. (2009) ‘Health Care Utilization in Ecuador: A Multilevel Analysis of Socio-Economic Determinants and Inequality Issues’, Health Policy and Planning 25.3: 209–18
Myatt, M.; Feleke, T.; Sadler, K. and Collins, S. (2005) ‘Field Trial of a Survey Method for Estimating the Coverage of Selective Feeding Programmes’, Bulletin of the World Health Organisation 83.1: 20–6, www.who.int/bulletin/volumes/83/1/20arabic.pdf (accessed 30 March 2015)
O’Donnell, O.; van Doorslaer, E.; Wagstaff, A. and Lindelow, M. (2008) ‘Analyzing Health Equity Using Household Survey Data’, Washington, DC: The World Bank, www.worldbank.org/analyzinghealthequity (accessed 30 March 2015)
Prudhon, C. and Spiegel, P.B. (2007) ‘A Review of Methodology and Analysis of Nutrition and Mortality Surveys Conducted in Humanitarian Emergencies from October 1993 to April 2004’, Emerging Themes in Epidemiology, www.ete-online.com/content/pdf/1742-7622-4-10.pdf (accessed 30 March 2015)
SMART (2005) ‘Measuring Mortality, Nutritional Status, and Food Security in Crisis Situations: SMART Methodology’, The SMART Initiative, www.smartindicators.org/SMART_Protocol_01-27-05.pdf (accessed 30 March 2015)
Turner, A.G.; Magnani, R.J. and Shuaib, M. (1996) ‘A Not Quite as Quick but Much Cleaner Alternative to the Expanded Programme on Immunization (EPI) Cluster Survey Design’, International Journal of Epidemiology 25.1: 198–203
Walker, N.; Bryce, J. and Black, R.E. (2007) ‘Interpreting Health Statistics for Policymaking: the Story Behind the Headlines’, Lancet 369: 956–63
Fatema Rajabali and Hannah Corbett
Institute of Development Studies
“It is clear that better use of research in development policy and practice can help save lives, reduce poverty and improve quality of life” (Court and Young 2003: 1).
1. Research knowledge driving change
There has been growing awareness and acknowledgement of the value of research uptake and communications as an integral part of the research process. Bringing together good, credible research is important, but there is increasing demand to see it come alive rather than just sit in a database somewhere. The World Health Organization (WHO) has also recognised the need to use rigorous processes to ensure that health recommendations are informed by the best available research evidence (Al-Riyami 2010).
Seeing relevant research make a difference requires putting the evidence into use through a variety of knowledge management and communication mechanisms – an area this chapter will explore. A more detailed focus will be presented on the use of policy briefs, an evidence-based product that places strong emphasis on clear recommendations for policy-related professionals. But beyond communicating credible evidence, researchers also need to be more engaged with, and aware of, the complexity of decision-making processes, as this directly influences the likelihood that their research will be taken up. This is another area that we will briefly examine in this chapter.
There are a number of terms that we will be using that are explained below.
Knowledge is a well-researched concept. Perkin and Court (2005) define knowledge as 'information that has been evaluated and organised so that it can be used purposefully', and knowledge management refers to the process of ensuring that such knowledge is available and accessible. Porter (2010) reflects that the term 'knowledge' is now often used in place of evidence as it encourages discussion on how evidence is utilised and processed, and includes tacit and informal sources.
Evidence: in the context that we are discussing, evidence is information generated through research, whether scientific or social and is generally communicated through research-related formats – data, statistics, indicators, scientific studies, technical briefings and reviews, among others. In this chapter, we will focus on research evidence and the process by which it can be communicated and utilised by decision-makers including the policy community.
Evidence-informed policy can be considered as the use of a broad range of evidence – research evidence; evidence from citizens and other stakeholders; and evidence from practice and policy implementation – as part of a process that also considers other factors such as political realities and current public debates (Newman, Fisher and Shaxson 2012). Evidence-informed policy is generally not a linear transition of research findings into policy decisions; research can inform policy discourses in multiple and sometimes subtle ways. But it is valuable to understand the complex nature of decision-making processes and where your research may fit in, whether it is about influencing language or creating awareness on an issue (Weyrauch and Diaz Langou 2011).
Example: “ . . . initial outcomes of the study which wound up at the end of 2011 showed increased community awareness about benefits of delivering in health facilities, and phenomenal increases in facility births, with an average of 1,336 deliveries per month in the intervention area compared to an average of 461 deliveries per month in the control area”.
2. Literature around the research–policy nexus
The evidence-based policy concept has been well explored in the international development sector with multiple sources of literature and the development of useful models and frameworks that have helped to improve research-policy integration and research uptake by policy actors. Table 10.1 below outlines some selected bodies of work that address the use of research evidence in policy.
Table 10.1: Selected bodies of work on research evidence
1. Approaches and frameworks for connecting research and policy
Research questions:
- What processes mediate and facilitate the use of evidence and knowledge in policymaking?
- How does evidence contribute to effective policymaking?
Examples of research actors:
- Overseas Development Institute RAPID framework of research–policy linkages (Court and Young 2003). Key influences: (1) political context and institutions; (2) credibility and communication of the evidence; (3) links, influence and legitimacy; (4) external influences.
- The British Government Cabinet Office views the use of evidence as one of eight core competencies of professional policymaking (Cabinet Office 1999b).

2. Assessing the impact of research and research communication
Research questions:
- What factors seem to matter (or not to matter) for increasing the impact of development research on policy?
- What role does the communication of research play?
Examples of research actors:
- The International Development Research Centre (IDRC) analyses different methodologies for assessing the impact of research on policy and examines the challenges of assessing impact.
- The UK Department for International Development (DFID) has commissioned working papers and developed strategies and guidance notes in the area of research communications and research uptake (DFID 2008a; 2008b; 2008c; Yaron and Shaxson 2008).

3. Theories of policy influence and models of policy change and policy processes
Research questions:
- What are the processes by which policy decisions are made?
- How do political processes determine decisions?
- What assumptions do models of the policy process contain about how evidence is used in policymaking?
Examples of research actors:
- Kingdon (2003): decision-making is an incremental process; the politics of agenda-setting is very important.
- Lindquist (2001): the nature of decision-making can vary considerably.
- Weiss (1979): how policymakers engage with and 'use' evidence – the enlightenment and problem-solving models.
- Stone (1996): bargaining and coalition formation lead to policy formulation.
- Roe (1991): dominant narratives can shape problem-definition and open or close off political space.

4. Models and guidance for research utilisation
Research questions:
- How is research consumed by policymakers?
- What are the different factors that influence how policymakers demand and utilise research?
Examples of research actors:
- Carol Weiss (1979): six models that explain different types of research utilisation – knowledge-driven, problem-solving, interactive, enlightenment, political and tactical.
- Diane Stone (2002): outlines 12 perspectives for improving research utilisation, which can be summarised into three categories of explanation – supply-side, demand-led and policy currents.
- Nathan Caplan (1979): the 'Two Communities' theory of the under-utilisation of research focuses on the cultural gap between researchers and policymakers, and proposes two types of research use – instrumental and conceptual.
Source: Adapted from Porter (2010: tables 1 and 3).
Although important strides have been taken in raising expectations and pushing boundaries, progress is not uniform. A recent UK/Dutch workshop in London identified that while some research organisations have led the way in exploring research uptake and knowledge management by strengthening capacity, creating alliances and pioneering new ways of working, many others still struggle. There is still much to be done in mastering the 'art' of research uptake and making sure it happens more consistently.
3. Do we understand enough about real world policy processes?
“There is no such thing as context-free evidence” (Davies 1999).
Politics shapes how evidence is used at many decision-making levels – and this means that researchers should understand politics and the process of decision-making. Power relations can crowd out certain types of evidence and perspectives (Fisher and Vogel 2008) so engaging with these different actors and understanding how decisions are made is useful. Porter's paper, 'What Shapes the Influence Evidence Has on Policy? The Role of Politics in Research Utilisation' provides some useful guidelines for integrating political economy analysis into different stages of the research and communication process in order to negotiate the political context.
Achieving and attributing influence and change is a complex and difficult process with no quick wins. As Dagenais (2015) argues, 'despite efforts expended over recent decades, there is a persistent gap between the production of scientific evidence and its use'. It is also important to take account of how definitions of 'successful' influence and impact are shaped, and by whom. Scott-Villiers highlights this in her case study of research undertaken in Karamoja, Uganda, which was deemed 'not influential' by a donor but had a significant effect on the local community (Scott-Villiers 2012). Debates around influence and impact often focus narrowly on 'policy influence' and 'policymakers' (in itself a very broad term that encompasses a whole range of different actors and requires further definition in relation to specific contexts). However, as Benequista argues in a recent blog on research communications and politics, policymakers are not the only drivers of change; in fact, they can obstruct positive change. An analysis of the power dynamics and relations in the context in which researchers are seeking to engage and achieve influence is essential (Gaventa 2006) in order to realise change and avoid simply reinforcing or legitimising existing power imbalances and the status quo.
This is further analysed by Jonathan Lomas (2006), who asks how much researchers should compromise in their conception of 'evidence' – and how much decision-makers should compromise in theirs. Researchers and decision-makers can be equal partners in a co-production of the research throughout the process (Lomas 2005), or it can be a more research-dominated process (Pope, Mays and Popay 2005). Porter (2010) suggests that where researchers seek to have their evidence taken up by governments, practice to date shows that involving governments in the process of establishing research priorities increases the likelihood of uptake. It also makes it more likely that recommendations are not out of line with government thinking and are mindful of the realities of policymaking.
In a Health Care Policy article, Rick Roger talks about confronting the gap between the idealised use of research in policy development and current realities. He highlights that healthcare managers and decision-makers do not function solely within the simple world of 'What works?' The policymaking environment is more a function of 'What combination of interventions works where, for which sub-populations, in which environmental circumstances, in which combinations?' As Porter (2010) argues, policymakers often want research that shows how impacts can take place. They are seeking evidence that demonstrates how things should be done differently or that offer practical guidance. How policymakers define 'useful' research will often depend on whether the evidence helps them solve a policy problem. This shows how decision-makers and researchers need to negotiate ways of meeting halfway in this process, which Greenhalgh and Russell (2005) describe as 'a new rationality of policy-making'.
But there is also a need to go beyond engaging with policy actors themselves. Ajay Dutta (2012) argues that researchers are knowledge producers and communicators, and 'if they view their role [as relevant] to policy, they should be prepared to engage with stakeholders affected by policy issues and expose their findings to human interaction, review and scrutiny by others'.
4. Your research needs to have impact! What can this mean?
Realising this kind of positive transformative change is for many researchers and research institutions at the heart of what they do. In recent years, the demand on the research community to ensure the impact of research findings on the decisions, actions and behaviours of policymakers and practitioners has become an increasing priority. This has been driven in part by growing funder requirements and expectations (Sumner, Ishmael-Perkins and Lindstrom 2009), and contested value for money agendas (Chambers 2014).
Louise Shaxson's blog (Shaxson 2012) discusses how researchers are being put under increasing pressure to demonstrate impact and examines what that impact could look like, highlighting four very useful points:
- Clear research evidence doesn't necessarily lead to clear policy messages. It can be better to focus on the quality of the evidence produced and on getting it into debates than on trying to demonstrate impact in terms of concrete outcomes on policy.
- Be careful how you define 'policy relevance'. Although the term is used quite a lot, it has quite a few different dimensions based on the research being conducted. Quite often now, as Shaxson notes, policy relevance isn't an either/or situation; it's a multidimensional and constantly shifting challenge.
- Be realistic about what can be achieved – think breadth of impact rather than depth. Yes, research can influence policy development in ways that can be cited, but it is important to recognise that it also has real impacts on how policies are designed and implemented, and through the relationships research builds along the way. Justin Pankhurst (Pankhurst 2012) goes further by arguing that it is probably less important to ensure research leads to policy change than it is to support local systems for reviewing and evaluating evidence, which may be better able to set that evidence in the context of what needs to be done.
- Be clear whether you are practising research communication or advocacy. As shown in the diagram below, there is a line between research communications and advocacy. Researchers and organisations need to decide where they sit or how far to the right they want to operate, but being clear about it beforehand is important.
Source: Louise Shaxson's blog, adapted from Morton et al. (2012: 98–99)
5. I have completed my research: So, now what?
As presented earlier, the generation of credible research is in itself insufficient if you are looking to influence change. Your research methods and results need to be intelligible to non-researchers and sufficiently digestible, with results clearly interpreted and translated for the target audience. Many decision-makers, especially policy-related professionals, have little time to engage with research; if the results are ambiguous and do not offer a clear interpretation of the findings, they will be discarded, ignored or misinterpreted (Porter 2010). Research usually needs to be packaged in a variety of products to make it more accessible. One can use working papers, briefing papers, policy briefs, talking-head videos, etc. as ways of making a much larger, more detailed research report more useful for the audience at hand.
5.1 Using policy briefs to communicate your research
Built on the assumption that policy informed by evidence is more likely to lead to better development outcomes (Newman et al. 2012), we would like to explore the use of the policy brief as one research communication approach. A policy brief is a concise summary of a particular issue, the policy options to deal with it, and some recommendations on the best option. It is aimed at government policymakers and others who are interested in formulating or influencing policy as they take decisions in complex policy processes (Beynon et al. 2012).
There are a few things to think about when you are developing policy briefs:
- Who is your target audience? Are they policy-related advisors? If not, do you need to develop a policy brief or is another knowledge product more suitable?
- What are your intended policy impacts? What kind of change are you hoping to see as a result of your policy brief? Are you simply looking to create awareness on a policy issue, influence decisions around programming and funding, or looking for a change in behaviour?
- Audience may not be health and development specialists. Language needs to be well-thought through without overly simplifying your key messages. Use of jargon and technical terminology needs to be well explained and where possible avoided.
Table 10.2 Policy brief structure
Executive summary
Designed to give an overview of the content of the brief with an emphasis on capturing the attention of the reader.
Introduction
Explains the importance and urgency of the issue and creates curiosity about the rest of the brief. It outlines the structure of the brief and gives an overview of the conclusions or the direction of the rest of the brief.
Methodology
Aims to strengthen the credibility of the brief by explaining how the findings and recommendations were arrived at, including:
- Description of the issue and context of the investigation;
- Description of the research and analysis activities;
- What methods were used to conduct the study?
- Who undertook the data collection and analysis?
Results and conclusions
Provides an overview of the findings/facts constructed around the line(s) of argument behind the policy recommendations, moving from general to specific information. Base the conclusions on evidence, data and findings with clear, balanced and defensible assertions.
Implications
Attempts to explore what policy changes or actions the results point to, based on the evidence provided. This is less direct than recommendations, and useful to include if direct advice is not requested or welcome.
Recommendations
Highlights what you as a researcher think should happen, based on the evidence presented. It is useful to make the recommendations actionable, with some precise steps of what should happen next, so the key target audience has some guidance on steps forward.
References and useful resources
It is important to cite and clearly signpost the sources in your briefing to provide credibility in your analysis and recommendations. This gives authority and weight to your product. Readers generally find additional resources useful to have if they would like to undertake more detailed reading around the subject area.
Additional points to consider:
- Policy briefs are two, four or a maximum of eight pages (1200, 2200 or 4000 words);
- Policymakers often spend around 30-60 minutes reading information on an issue;
- Design can help highlight key facts or concepts;
- The policy brief should be focused and succinct to make appropriate recommendations;
- In addition to having solid content, policy briefs should also be visually engaging;
- Accessibility is important – the style should be professional, not academic. Remember your target audience. Accessibility includes making sure that technical language is well explained.
6. What tools can one use in achieving influence and driving change?
There are a range of research uptake and communication tools and approaches that can be used to better support you in getting your evidence and your knowledge products (like the policy brief) out into the health and development practice and policy world. As noted later in the chapter, any approach used should be developed as part of a wider action plan. There are different functions that a researcher's communication of the research evidence can serve:
- Share information: tends to be one-directional, with less need to capture feedback;
- Engagement: this is where you might be looking for feedback on something or you are looking to stimulate conversation on an issue;
- Collaboration: seeking to construct or co-produce something new together;
- Storing/capturing information: how to store information for easy future access.
Multiple tools and a mixture of facilitation techniques can help to foster effective knowledge-sharing of research evidence. Table 10.3 outlines a range of tools with examples from the health sector that can be used by researchers in developing a research uptake strategy.
Table 10.3: Tools for sharing and communicating research evidence
|Share information|Engagement|Collaboration|Storing/capturing information|
|Face to face|Face to face|Face to face|Repositories: e.g. HighWire Press – largest repository of free full-text life science articles in the world|
|Wikis|Skype| |Databases: e.g. InfoSci-Medical Database|
|Email lists: e.g. WHO email listserve, GIZ HESP news|Google docs|Teleconferencing|Social bookmarking|
|Discussion lists: e.g. Healthcare for all by 2015: www.hifa2015.org/|MS Word|Blogs|Websites (see other useful health resources section)|
|Online communities/communities of practice (CoP): e.g. www.globalhealthlearning.org/course/online-communities-practice-cops-global-health|Writeshop (face to face)|Online communities|Minutes|
|Twitter (#health, #GlobalDev)|Confluence|Twitter: e.g. Eldis learning paper on using social media| |
|Web gathering|Webinar| |Tagging: e.g. Delicious, Old Reader|
|Webinars|Google hangout| |Learning modules: e.g. Canadian Network on Health in Development modules on maternal health, newborn health, and family planning: http://reprolineplus.org/learning-opportunities|
|Websites (see other useful health resources section)| | | |
7. Setting engagement and influencing goals and objectives
Despite the complexity, effective action is possible (Jones et al.) and needs to involve potential beneficiaries and other actors such as government officials, health practitioners and community leaders alongside project or programme researchers (Participatory Impact Pathways Analysis; Douthwaite 2009). This section of the chapter will outline practical approaches, tools and tactics that can help researchers design effective influencing and engagement strategies in relation to their work, with a particular focus on how to identify, understand, prioritise and target key audiences.
Defining a clear and measurable engagement and influencing goal, underscored by some narrower objectives, is a critical first stage of the design of any influencing and engagement plan. These objectives need to be consistent and flow from the intervention theory of change, as discussed in chapter 4. The goals and objectives could relate to an institutional or organisational priority, a research programme or project or a specific output such as a research report. They will help inform decisions around which particular organisations or individuals such as activists, NGOs or government officials you seek to engage with and influence, and how, in order to achieve the change you want to see.
8. Understanding your audiences and their operating environment
If researchers aim to engage in dialogue through structured processes, experience has shown that careful planning is required to clarify intentions, select who to engage with, when to engage, and how best to do so. (Dutta 2012: 915)
Once you have your engagement and influencing goal and objectives, you can start to think about which individuals and organisations will be critical in helping you achieve them. It is important to think beyond individuals and organisations in isolation: take into account the broader international and national political, social, economic and cultural environments they are operating in, the power dynamics, the relationships between you and your audiences, and their relationships with each other (Court et al. 2004). Demonstrating this understanding in your engagement and influencing tools and outputs will be essential to establishing credibility amongst your key audiences and ensuring the uptake of your research findings.
8.1 The stakeholder mapping process
There are various stakeholder mapping tools that can be used to help researchers identify and prioritise the audiences that they will need to engage with and influence in order to achieve their policy-influencing and engagement goals and objectives. The following diagram outlines the general stages of the mapping process that need to be undertaken.
In relation to your influencing goals and objectives:
- Identify stakeholders (individuals, organisations);
- Categorise stakeholders by type (government, media, donors);
- Map relationships and links between stakeholders;
- Rank stakeholders (by influence, power, alignment, interest, attitude);
- Analyse stakeholders’ positions, perspectives, links and relationships, how you might want them to change, and what this might mean for your strategies to engage audiences;
- Prioritise your key audiences.
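The ranking and prioritisation steps above are often operationalised as a simple influence–interest scoring grid. The sketch below is purely illustrative – the stakeholders, scales and cut-offs are invented, not drawn from the chapter:

```python
# Rank hypothetical stakeholders by influence and interest (1-5 scales),
# one common way of operationalising the prioritisation step.
stakeholders = [
    # (name, type, influence, interest) -- all values are illustrative
    ("Ministry of Health", "government", 5, 3),
    ("District health officers", "government", 3, 5),
    ("National newspaper", "media", 4, 2),
    ("Donor agency", "donor", 4, 4),
]

# Sort by combined score so high-influence, high-interest actors come first.
ranked = sorted(stakeholders, key=lambda s: s[2] + s[3], reverse=True)
for name, kind, influence, interest in ranked:
    priority = "engage closely" if influence >= 4 and interest >= 4 else "keep informed"
    print(f"{name} ({kind}): influence={influence}, interest={interest} -> {priority}")
```

In practice the scores would come out of the mapping workshop itself and would be revisited as relationships and power balances shift.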
As discussed in chapter 5, stakeholder mapping is an ongoing process throughout the implementation period. The process needs to reflect the dynamic spaces where relationships and power balances are constantly shifting, and needs to be reviewed and updated on a regular basis.
Example: Participatory Impact Pathways Analysis
The Participatory Impact Pathways Analysis (PIPA) offers researchers the opportunity to come together with partners, beneficiaries of their research and other key stakeholders such as government officials and practitioners to define and visualise how they are going to achieve their policy engagement goal. In 2013, the Institute of Development Studies, working with Practical Action, and funded as part of the International Development Research Centre's (IDRC) Think Tank Initiative's Policy Engagement and Communications Programme, hosted a workshop for South Asian think tanks. As part of this workshop the think tanks, some of whom were more research-focused and some of whom were more advocacy- and policy-orientated, worked to define their own institutional engagement and influencing goal and undertook a PIPA exercise in relation to this goal. The results, including the impact stories of pathways to change that the groups came up with, were hugely varied and offer an interesting insight into the PIPA process. The discussions they had as part of the process of listing, grouping and analysing their stakeholders were extremely valuable.
8.2 Stakeholder mapping tools
There are a number of matrix and network mapping tools that can be used to undertake the stakeholder mapping process. The figure below outlines four different tools, with links to more detailed explanations of how to use them.
9. Developing your engagement and influencing plan
As part of the stakeholder mapping process you will have already started to think about how you can reach some of your audiences in terms of their relationships to you and your organisation, and also in terms of their relationships with each other. An effective engagement and influencing strategy will need to include a number of essential questions in relation to a specific audience group including:
- HOW do they access information and who or what influences them?
Different audiences access information and evidence in a variety of ways. A recent University of Manchester study of how UK civil servants engage with academic research and expertise found that they preferred research information that had been 'pre-digested' in the form of a briefing or a media report. However, it also found that just over half of respondents were accessing more traditional academic outputs such as peer-reviewed journals.
- WHO in the organisation or programme is best placed to communicate with them?
This question helps to focus thinking on the relationships and leverage that exist within your organisation in relation to your target audiences, and on the capacity and resources you have to act upon these. A recent International Initiative for Impact Evaluation study looked at the effect of a policy brief that included an opinion piece from a sector-recognised expert on changing behaviours and prompting actions. The study found that including the opinion piece led to an increase in actions such as sharing the brief more widely, but not necessarily to changes in behaviour.
It's important to note that others outside your organisation, such as research partners, might be better placed to act as knowledge brokers or knowledge intermediaries to communicate with a specific audience. This may lead to questions about whether they have capacity to act in this way, irrespective of how well placed they are in terms of physical location and access. There is an interesting case study of a Knowledge Broker programme implemented in Burkina Faso to strengthen the way that scientific knowledge was made available to health practitioners and policymakers (Dagenais et al. 2015).
- WHEN is the best time to engage with them?
The response to this question should be shaped by internal factors, such as the research programme timeline or moments in an institutional strategy, and by debates and policy windows in the external environment. The Overseas Development Institute (ODI) ROMA guide to policy influence and policy engagement outlines some useful steps to help map the external environment in relation to national government-driven policy formulation and change. However, it is important to note that national governments and the formal policymaking process are only one part of how change happens, and you will need to look to debates and activities being led by multilateral organisations, community activists, local governments and NGOs to develop a fuller picture to guide your own timeline of engagement.
- WHAT do you think the best tool or tactic will be to reach them?
A wide range of tools and tactics, used in combination, will be required to target and reach different audiences. In trying to ensure effective capturing and sharing of learning from a UNICEF and IDS social protection research programme, researchers found that 'multiple media was required' (Batchelor and Perkins 2011).
- WHAT is the expected outcome – i.e. a change in behaviour or policy that you wish to see?
It is important to ensure that your activities remain focused and that you can measure how successful they are. It could be useful, where possible, to include some broad measures of success, such as a percentage increase in spending, the Minister of Health attending and speaking at an event, or citations in media or parliamentary debates.
The following figure sets out these questions and an example of how you might respond to them in relation to a particular audience that you have identified.
Figure 10.5: Policy-influencing and engagement plan example
Audience: Special Advisor to the Minister of Health

HOW do they access information and who or what influences them?
- Policy briefings
- Weekly MOH internal emails
- National newspapers; influenced by particular journalists
- University to which they have an affiliation
- Influenced by colleagues at the Finance/Treasury Department

WHO in the organisation or programme should communicate with them?
- Director, who has an existing relationship with them
- Research partner based at the university where they have an affiliation

WHEN is the best time to engage?
- Over the next six months, in the run-up to the national budget
- Prior to attendance at an international meeting on nutrition

WHAT do you think the best tool or tactic will be to reach them?
- Cost analysis
- Briefing note in advance of meeting
- Face-to-face meeting
- Invitation to speak at the country launch of a new hunger and nutrition commitment index

WHAT is the expected outcome, i.e. a change in behaviour or policy that you wish to see?
- Greater political will to tackle under-nutrition
- Increase in Ministry of Health direct spending on under-nutrition
10. Assessing the success of your engagement activities
As the Overseas Development Institute (ODI) ROMA guide to policy influence and policy engagement highlights, traditional monitoring and evaluation approaches 'which rely on a simple feedback model with predefined indicators, collecting data and assessing progress towards pre-set objectives – are simply not adequate in the context of policy-influencing interventions'. This is something that Benyon et al. reflect upon in their article 'Passing on the Hot Potato: Lessons from a Policy Brief Experiment' (Benyon et al. 2012). In their study of the effectiveness of policy briefs, they highlight that the simple linear model, in which actors such as government officials receive policy-relevant messages and act upon them, ultimately leading to improved lives, very rarely plays out in real life. Changes in behaviour and attitudes are also difficult to capture, especially through quantitative data. This is where impact stories and narratives can help illustrate your reach, influence and impact.
However, practical steps towards measuring the success and impact of influencing and engagement activities are possible and could include the following:
- Review your stakeholder map: have the positions of stakeholders, in terms of their relationship to the research, your organisation or to each other, changed, and can this be attributed to your policy-influencing and engagement objectives?
- Measure your success against your defined indicators: how far have you been successful in some of the broad indicators of success outlined in the expected outcomes section of your engagement plan? Scott and Munslow highlight some useful approaches to tracking research and policy conversations in online spaces (Scott and Munslow 2015).
- Capture the results of your activities in impact stories: attribution in relation to policy-influencing is complex and difficult. Quantitative measures are one part of this process. However, using narrative in the form of the written word or multimedia content, as exemplified by the Future Health Systems research programme, can help bring your stories of change to life.
Other useful health-related resources on the internet
- Cochrane collaboration database, which focuses on health reviews, www.cochranelibrary.com
- Evidence-informed Policy Network (with WHO), http://global.evipnet.org
- Exploring the Impact of Research Communications – What Difference Does a Policy Brief Make?, IDS project funded by the International Initiative for Impact Evaluation
- Healthcare Online Resources 2015, www.zillman.us/subject-tracers/healthcare-resources/
- Loewenson, T. (2014) Annotated Bibliography of E-Platforms Used in Participatory and Peer Exchange and Learning, Harare: Equinet, www.equinetafrica.org/bibl/docs/Ann%20bib%20of%20e-%20platforms%20%20Dec2014.pdf
- McMaster Health Forum, www.mcmasterhealthforum.org/hse/
- Overseas Development Institute (ODI) ROMA guide to policy influence and policy engagement
- Participatory Impact Pathway Analysis: A Practical Method for Project Planning and Evaluation, Boru Douthwaite, CGIAR
- ResUp MeetUp Symposium and Training Exchange 2015
- Research Communicators: Let’s Talk Politics, Shall We?
- Research Gate: www.researchgate.net/
- K4Health: www.k4health.org/toolkits/research-utilization
References
Al-Riyami, A. (2010) 'Health Researchers and Policy Makers: A Need to Strengthen Relationships', Oman Medical Journal 25.4: 251–52, www.ncbi.nlm.nih.gov/pmc/articles/PMC3191660/ (accessed 23 March 2015)
Benyon, P.; Gaarder, M.; Chapoy, C. and Masset, E. (2012) 'Passing on the Hot Potato: Lessons from a Policy Brief Experiment', IDS Bulletin 43.5, Brighton: IDS
Chambers, R. (2014) 'Perverse Payment by Results: Frogs in a Pot and Straitjackets for Obstacle Courses', blog, https://participationpower.wordpress.com/2014/09/03/perverse-payment-by-results-frogs-in-a-pot-and-straitjackets-for-obstacle-courses/ (accessed 7 May 2015)
Court, J.; Hovland, I. and Young, J. (2004) Bridging Research and Policy in International Development: Evidence and the Change Process, ITDG Briefing Paper based on work conducted in the RAPID Programme at ODI
Court, J. and Young, J. (2003) Bridging Research and Policy: Insights from 50 Case Studies, ODI Working Paper 213, London: Overseas Development Institute, www.odi.org/sites/odi.org.uk/files/odi-assets/publications-opinion-files/180.pdf
Dagenais, C. et al. (2015) 'Collaborative Development and Implementation of a Knowledge Brokering Program to Promote Research Use in Burkina Faso, West Africa', Global Health Action 8.S1, www.globalhealthaction.net/index.php/gha/article/view/26004
Davies, P. (1999) 'What is Evidence-based Education?', British Journal of Educational Studies 47.2: 108–12
Douthwaite, B. (2013) Addressing Water, Food and Poverty Problems Together: Methods, Tools and Lessons. A Sourcebook from the CGIAR Challenge Program on Water and Food, Colombo: CGIAR Challenge Program on Water and Food (CPWF)
Dutta, A. (2012) 'Deliberation, Dialogue and Debate: Why Researchers Need to Engage with Others to Address Complex Issues', IDS Bulletin 43.5, Brighton: IDS
Fisher, C. and Vogel, I. (2008) Locating the Power of In-Between: How Research Brokers and Intermediaries Support Evidence-Based Pro-poor Policy and Practice. Brighton: IDS Knowledge Services, www.ids.ac.uk/files/dmfile/intconfpaper28Novwebsiteedit.pdf (accessed 7 May 2015)
Gaventa, J. (2006) 'Finding the Spaces for Change: A Power Analysis', IDS Bulletin 37.6: 23–33
Greenhalgh, T. and Russell, J. (2006) 'Reframing Evidence Synthesis as Rhetorical Action in the Policy Making Drama', Health Care Policy 1.2: 34–42, www.ncbi.nlm.nih.gov/pmc/articles/PMC2585323/ (accessed 7 May 2015)
Jones, H.; Jones, N; Shaxson, L. and Walker, D. (2012) Knowledge, Policy and Power in International Development, Bristol: The Policy Press
Knezovich, J. (2014) 'ARCADE Presentation What is a Policy Brief?', presentation delivered at the Institute of Development Studies. Open access resource.
Lomas, J. (2006) 'Commentary: Whose Views Count in Evidence Synthesis? And When Do They Count?', Health Care Policy 1.2: 55–57, www.ncbi.nlm.nih.gov/pmc/articles/PMC2585327/ (accessed 7 May 2015)
Lomas, J. (2005) 'Using Research to Inform Healthcare Managers' and Policy Makers' Questions: From Summative to Interpretive Synthesis', Health Care Policy 1.1: 55–71
Morton, J.; Shaxson, L. and Greenland, J. (2012) Process Evaluation of the International Initiative for Impact Evaluation (2008–11): Final Report, 27 September 2012, London: Overseas Development Institute and Triple Line Consulting, www.3ieimpact.org/media/filer_public/2013/01/07/3ie_proces_evaluation_2012_full_report_1.pdf
Newman, K.; Fisher, C. and Shaxson, L. (2012) '', IDS Bulletin 43.5, Brighton: IDS
Pankhurst, J. (2011) 'Support Local Government to Get Research into Policy', SciDev.Net, www.scidev.net/global/communication/opinion/support-local-governance-to-get-research-into-policy-1.html (accessed 23 March 2015)
Perkin, E. and Court, J. (2005) Networks and Policy Processes in International Development: A Literature Review, Working Paper 252, London: Overseas Development Institute
Pope, C.; Mays, N. and Popay, J. (2005) 'Informing Policy Making and Management in Healthcare: The Place for Synthesis', Health Care Policy 1.2: 43–8
Porter, C. (2010) What Shapes the Influence Evidence Has on Policy? The Role of Politics in Research Utilisation, Working Paper No. 62, Young Lives, Oxford: Young Lives
Roger, R. (2006) 'A Decision-Maker's Perspective on Lavis and Lomas', Health Care Policy 1.2: 49–54, www.ncbi.nlm.nih.gov/pmc/articles/PMC2585335/ (accessed 23 March 2015)
Scott, A. and Munslow, T. (2015) , IDS Evidence Report 122, Brighton: IDS
Scott-Villiers, P. (2012) 'This Research does not Influence Policy', IDS Bulletin 43.5, Brighton: IDS
Shaxson, L. (2012) '', London School of Economics Impact Blog, http://blogs.lse.ac.uk/impactofsocialsciences/2013/01/24/8864/ (accessed 23 March 2015)
Sumner, A.; Ishmael-Perkins, N. and Lindstrom, J. (2009) , IDS Working Paper 335, Brighton: IDS
Weyrauch, V. and Diaz Langou, G. (2011) Sound Expectations: From Impact Evaluations to Policy Change, New Delhi: The International Initiative for Impact Evaluation (3ie)
COVER IMAGE CREDIT: CARSTEN WIERTLEWSKI
What is ARCADE?
The ARCADE (African/Asian Regional Capacity Development) projects use innovative educational technologies to strengthen health research across Africa and Asia. Focusing on post-graduate, doctoral and post-doctoral training, partner institutions are developing cutting-edge online courses, blended learning modules and joint programmes that will enable training of researchers in low- and middle-income countries who might not otherwise have access to such material.
Additionally, the ARCADE projects work at an institutional level to strengthen education services, financial and administrative research management, research uptake capacity, and network building.
The projects are coordinated by Karolinska Institutet in Sweden, and together involve sixteen partners across Europe, Africa and Asia. These four-year projects (2011-2015) are funded by the European Commission’s 7th Framework Programme.
You can also download our ARCADE brochure here.
ARCADE HSSR: A focus on Health Systems Research
In Africa, ARCADE's focus is on building capacity for research into health systems and services (HSSR). HSSR covers a range of topics, from health financing to health workforce availability, service delivery and appropriate medical products and technologies. Research in this area is critical in Africa, where many health systems have chronically underperformed.
All of the material in this module has either been sourced from an open access source or permission has been granted from the author. The materials are published under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported license. This means that you are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
But the following terms apply:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
All course materials are published by the ARCADE Projects under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported license.