
Complexity Evaluation... How Do We?

Updated: Feb 1, 2022

I have spent nearly twenty years working as an evaluation practitioner in many different contexts. For about the first ten years of my career, all the evaluation work I was involved in fitted the dominant model of evaluation, one that ‘grew up in the projects’: SMART goals; theories of change that fit on one page and show clear causal pathways to be ‘tested’; contracts that stipulate one-way progress reporting, often based on a predetermined timeline; payment milestones weighted to a final report due several months or years in the future; and so on.

Over the second decade, that way of working has remained pretty dominant, whether I was involved in commissioning or delivering evaluation. Even when the context called for something different – for example, working with developmental evaluation to support evolving innovations at Nesta, or evaluability assessment as part of early thinking about new programme design at The Health Foundation – the work often had an ‘under the radar’ feel about it, structured to meet expectations about what evaluation ‘looks like’, or to fit with procurement requirements.

Even now, when the dramatic pivots in ways of working required by the Covid pandemic have brought the value of developmental evaluation to the fore, and the language of complexity and systems thinking has become increasingly prevalent, the nuts and bolts of what is asked for and expected in evaluation specifications are still dominated by the traditional project-based approach we all know (and love?).

But we know this isn’t fit for purpose. We can feel it in our bones as we are doing the work.

And the subjects of evaluation – the policy, service and programme ‘evaluands’ – are more often explicitly recognised as complex, and less suited to rigidly linear models of evaluation, which focus on predetermined outcomes and artificially isolate the evaluand from wider contexts and perspectives. So the need for complexity-focused evaluation is coming into sharper focus for all parties: evaluation practitioners, programme leads, funders and wider stakeholder communities.

Take, for example, the National Lottery Community Fund’s Growing Great Ideas programme. Spanning a decade, with multiple strands of activity, broad communities of stakeholders and participants, and a deliberate focus on how different elements of the programme can influence each other and the programme as a whole as they evolve together, this programme is

“looking to invest in different combinations of people, communities, networks and organisations that demonstrate an ability to seed and grow alternative systems, accelerate the deep transition of 21st-century civil society, and to learn and adapt as they go”.

This programme cannot be understood as a ‘project’ for the purposes of evaluation and learning.

Making the transition from project-based models of evaluation to those that are more complexity-aware raises a whole host of issues in relation to the design and management of evaluation and to new methods, along with less tangible aspects like mindset and cultures of practice.

What we perhaps lack are resources to help us tackle some more specific, practical questions about the practice of complexity evaluation.

Here I share some of the questions I have been carrying around for a while, in the hope that a) I am not the only one who doesn’t know the answers to these things, and b) we can work together, as commissioners and practitioners of evaluation, to find answers, ask new questions and swap resources and examples from our practice.

How do we…

1. ...know we are at the ‘evaluation ready’ version of the theory of change?

This helpful framework on the relationship between monitoring, evaluation and learning shows how the ‘Theory of change’ box on the far left-hand side is revised through processes of reflection on, and learning from, data (the red box on the far right, looping back to the theory of change on the left).

How do we choose the appropriate time to pause the revisions to evaluate for outcomes? What version of the theory of change is the right one to base an evaluation on? Or is it about accepting that whatever version we use, it is likely to change during the life of the evaluation, and we must be ready to adapt to the changes?

There are other questions about how to do theory of change in a way that is sensitive to complexity.

Matt Baumann and Caroline Hattam provide a short example from their recent work for DEFRA on what a complexity-appropriate evaluation framework for England’s action on nature over the next 10 years would look like, emphasising the need to

“…incorporate the contribution of emerging polices and funding streams and shifts in the framing and narratives proposed in the strategy as well as the development of ‘nested’ ToCs for individual policies and programmes or for clusters of policies and programmes. Furthermore, ToCs will need to be regularly reviewed in response to strategy evolution – or to guide such evolution.”

But how do we do this practically? Do we leave blanks so theory of change visuals show what is not yet known? Where do we put the narrative that explains how and why visuals change over time? How best to visualise the detail of interactions with a wider system? Do we need multiple versions for what might be very different perspectives, from different stakeholders? How do we involve stakeholders in participatory systems mapping?

And how do we take all of this into account without theory of change diagrams becoming too complicated and unclear, or unhelpfully simplistic? What other formats for communicating about theory of change would be helpful here?

2. ...assess the quality of bids for complexity appropriate evaluation, when we haven’t worked this way before?

To this end, it would be useful to share information about experts who can help with assessment, and examples of high-quality bids. How do we spot the skills and experience we need for high-quality complexity evaluation? What traits, and what combination of skills on a team? Should we look for a specific section on how bidders will approach adapting methods, and what examples do we have of how to handle this well? What should we look for in proposed approaches to project management and client communication, to know they will support adaptation and ongoing learning?

3. …approach the preparatory work necessary before the ‘real’ evaluation begins?

Clarifying what is to be evaluated, describing the theory of change, identifying the primary intended users and uses for the evaluation, and so on, are often neglected or incomplete tasks before an invitation to tender is prepared and advertised. This can lead to an evaluation budget and timeline that do not accurately reflect the amount of foundation work to be done before the evaluation itself can proceed.

Guidance on complexity evaluation adds to this clarification work, with its emphasis on the importance of stakeholder engagement, mapping interactions with wider systems, and so on. Combined with the fact that some interventions are themselves large and complex, we can see how this preparatory work becomes a big job.

How do we build understanding of the need to resource and engage with proper scoping work? And that evaluation design can’t happen without engaging in deep thinking about assumptions and meaning? What examples do we have of evaluations where there has been adequate time for and emphasis on reflective thinking, adapting practice, and translating data and ‘results’ into useful learning and outputs?

4. …manage the need for lots of stakeholder involvement?

Working in a complexity-informed way emphasises stakeholder involvement, to see what is going on in a more nuanced and rounded way, from as many perspectives as possible: for example, how the policy or programme is evolving and adapting, how it interacts with other contexts, and how to create space to understand and modify expected outcomes together.

What are the best ways to do that, particularly when you have a high number of stakeholders? Which perspectives get to set direction, where does power sit? What are the processes for deciding how that happens? What role can technology play here? How is involvement balanced with participant fatigue and the pressures of the timeline and budget?

5. …redesign and redeploy resources when the evaluation needs to adapt to changing circumstances?

As the CECAN Toolkit recognises, if outcomes are determined before the full complexity of the intervention and its context is understood, then what is measured may become redundant as the intervention adapts, or is better understood.

“…it is important to remain flexible about how one goes about the process, as requirements from the evaluation may change over time and detailed methodological requirements may emerge as understanding improves”

This raises practical contractual issues: for example, who has responsibility if additional resourcing is needed to meet changing needs? Should the commissioner or the provider maintain an ‘adaptation’ contingency fund? And how do we manage communication between all parties so that necessary changes can be identified as early as possible?

6. …do progress reporting in a way that makes sense?

If we need to be on the watch for how an evaluation needs to adapt, so we can respond in a timely way, does this become the focus of progress reporting? As the Toolkit suggests,

“Detailed methodological requirements may only emerge over time. Flexibility is therefore required, with evaluators and commissioners regularly reviewing the design to determine how well it is working and whether it should be modified.”

Might we need a new style of progress reporting, with information shared from both sides, to work through the best way forward at different points in time? As evaluation pioneer Michael Quinn Patton describes (in relation to developmental evaluation), timely feedback becomes “feedback to inform ongoing adaptation as needs, findings and insights emerge, rather than only at predetermined times (like quarterly, or mid-term or end of project)”.

7. …define outputs from complexity evaluation?

If I have one criticism of the CECAN Toolkit, it is of the first visualisation, of the commissioning process for an external contractor: a linear figure that ends very firmly with ‘receive final report’ as the only output.

Reports can of course be useful and necessary: for contractual and methodological accountability, as a ‘master’ document to draw from, and as a way of holding all the relevant information together in one place (rather than in bits and pieces distributed across various hard drives and shared folders).

But how useful is a final report as a communication tool, and as a way of supporting the cycles of continuous learning necessary for complexity evaluation? As the Toolkit goes on to say,

“Commissioners should be prepared to explore not just how well the intervention is working and how it can be improved, but also question whether quite different approaches may have produced better results. In this way, appraisal and evaluation will merge into a continuous process of learning and policy evolution.”

“Ensuring that there is capacity, and capability, in the system to commission, undertake and use the findings from a complex evaluation” is about more than reading a final report.

How can outputs, and the payment milestones attached to them, be refocused to put effort and energy into timely, continuous learning? What kinds of skills and experience does an evaluation team need to support continuous learning, and what do we do when lack of progress is due to issues with capacity, capability (and willingness) on the commissioner side, beyond the control of the evaluation? Can we share examples of contracts that are set up to support complexity-appropriate approaches to evaluation?

What next?

The Toolkit recognises that “there is currently no capability framework developed specifically for evaluations and commissioners working in complex settings”, which makes me feel a bit better about having so many unanswered questions!

But we do know, from some of the guidance already out there, that it is about moving away from traditional notions of appraisal towards complexity thinking and evaluative practice, combined with adaptive or agile management techniques – and always with the vision in mind for evaluation to be

"an integrative , continuous process not a one-off exercise at the end or a series of self-contained steps, it needs to become a way of working” (Giorgi 2017 in CECAN’s Complexity Evaluation Toolkit).

How do we, as practitioners and commissioners of evaluation, come together to share what we are learning, answer these questions, ask new ones and develop ourselves and our practice in new directions?

Kerry McCarthy is an expert in using evaluation and data for strategy, decision making and understanding impact. Kerry aims for work that is rigorous, engaging and useful. Visit to find out more.


