
4.0 Approaching Evaluation
from Measuring Wellbeing. Gauging (Mental) Wellbeing Benefits of Arts & Cultural Participation
by ccse_uws
The preceding discussion has examined creative health, wellbeing and the challenges of – and potential approaches to – measurement with specific reference to mental wellbeing. The following surveys some of the literature in which arts and culture-based approaches have been undertaken with target groups of participants for whom improved wellbeing or increased resilience (or similar) are desired aims and/or for whom an admission criterion to the participant group included a mental health challenge (e.g. anxiety, depression, dementia).
The aim of this segment is to provide some insight into the approach(es) taken to evaluation. As previously mentioned, evaluation per se is not always central to the analysis and discussion being presented (this is particularly so for grey literature). In their work to develop an appropriate evaluation approach for projects operating in the realm of mental health, social inclusion and arts, Secker et al. (2007; also Hacking et al., 2006) found that only ‘two of the 102 projects included in the analysis were using validated outcome measures’ and that ‘[m]ost projects were evaluating their work at only one point in time, precluding the measurement of change over time’ (Secker et al., 2007). Yet their investigation demonstrated that gaining understanding of ‘distance travelled’ over the course of participation in a project was of particular importance. This team of researchers developed a baseline study16 to be administered at the start and end (or post-end) of a project, and which was recommended for use in tandem with CORE17 as a suite of accessible tools that would be relatively straightforward to administer. Nevertheless, it is interesting to note that ‘of 51 projects that initially expressed interest in assisting with the study [to test the tools], 22 recruited participants. The main reasons for projects deciding not to take part included doubts about their own capacity to help for funding or staffing reasons’ (Secker et al., 2007).
The Centre for Cultural Value reflected briefly on how the value of cultural participation is researched or evaluated, noting that qualitative approaches were generally preferred ‘to understand the value of cultural participation for older people’s sense of community, connection and wellbeing’. Predominantly, data were gathered using focus group, interview and observation techniques, with a few of the studies reviewed drawing upon more participatory approaches (e.g. photo elicitation or documentary making).
16 Social inclusion is the focus of the baseline tool; it comprises scales to measure social isolation, social relations and social acceptance.
17 Clinical Outcomes in Routine Evaluation: a ‘brief, user friendly, questionnaire measure intended to be used at the beginning of therapy to indicate the differences in the severity of problems people may have and to be used at intervals thereafter to measure change’. It comprises 34 statements, ‘scored by tick box completion on the same five response levels on all items’ (Evans et al., 2000).
At the same time, the authors observed that the literature under review was of variable quality, as the researcher/participant relationship had been inadequately accounted for, ethical issues were ill-considered and/or the interview or focus group had been far too brief to be certain that ‘the depth of older people’s experiences has truly been captured’ (The Centre for Cultural Value, 2022). The same review noted the use of a plethora of standardised measures in other work similarly seeking to illuminate ‘the value of cultural participation for older people’s sense of community connection and wellbeing’ (the authors list: Geriatric Depression Scale, WHO Quality of Life assessment (WHOQOL-BREF), Positive and Negative Affect Schedule, Multidimensional Perceived Social Support Scale, UCLA Loneliness Scale) (Centre for Cultural Value, 2022).
4.1 Qualitative Approaches
The collection and analysis of narrative-type data is common. In their work on the use of participatory theatre for mental health recovery, Torrissen and Stickley rely on a narrative inquiry approach which prioritises the experiences of, and the stories told by, actors participating in activities of the Teater Vildenvei, a ‘semi-professional theatre company open to mental health service users and their allies’ (Torrissen & Stickley, 2017). Analysis of these narratives revealed that involvement with the theatre had enhanced wellbeing through the provision of ‘something meaningful to do, social contact, peer support and improved self-esteem’ (Torrissen & Stickley, 2017). Further, the authors underscore that their findings could be analysed against the mental health recovery framework established by Leamy et al. (2011), which identifies five areas for progress in terms of improved mental health and wellbeing: connectedness; hope and optimism; identity; meaning in life; and empowerment. Carey and Sutton’s evaluation of a community arts project in Speke, Liverpool, reports respondents’ ‘appreciation of the opportunity to discuss their experiences in interview’, though interviewees also noted that they would have welcomed the chance to discuss their views at an earlier point in the project to provide ‘constructive, ongoing feedback’ (Carey & Sutton, 2004). Methodologically, such an approach would also provide the opportunity for comparative/longitudinal analysis. These findings give credence to Oman’s (2020) argument that people often prefer describing wellbeing in their own words rather than assigning a point on a scale.18
In their research examining sectoral approaches to, and understanding of, programme evaluation, Daykin et al. found that ‘respondents reported the use of a wide range of evaluation methods, with extensive use of informal ‘anecdotal’ methods such as comment slips, feedback from artists and participants, practitioner diaries and ad hoc case studies.’ Other narrative-type approaches – such as interview and focus group – were also employed, though ‘quantitative methods and the use of validated assessment tools such as the WEMWBS […] [were] reported less frequently’ (Daykin et al., 2017a).
Evaluating the ‘Making for Change’ project, which provided female prisoners with the opportunity to develop skills for shortage occupations in the fashion industry, Caulfield et al. note that the project did not set out with the specific intention of addressing the participants’ wellbeing needs; nevertheless, many of the women did experience benefits to their health and wellbeing. This emerged during the evaluation, which relied upon ‘observational, focus group, and interview data’ gathered from both women prisoners and project staff members (Caulfield et al., 2018).
4.2 Validated Approaches
Attempts to develop standardised tools to capture subjective wellbeing are no guarantee of successful evaluation. For example, in designing a Wellbeing Measures Toolkit for use in museums, Thomson and Chatterjee (2015) trialled their measures among older adults who exhibited mild-to-moderate symptoms of dementia. Some of these respondents found the wellbeing questions posed to be inappropriate, ‘one commenting that the questionnaire was ‘superficial’, and ‘about being happy’ and did not ‘reflect the vast range of older people at very different ages and stages of engagement’ (Thomson & Chatterjee, 2015). In their formative evaluation of ‘Arts for Wellbeing’ programmes delivered in County Durham, White and Salamon (2010) uncovered instances of music group participants finding the administration of a validated measure to track progression (in this case the SF-36)19 to be ‘intrusive, stress-inducing and at odds with the relaxation aim of the activity’ (White & Salamon, 2010). Nevertheless, the wider literature shows that there is significant utility for validated scales in the field.20
18 Although her investigation focussed on effective partnership working for delivery of a music project, Currie (n.d.) suggests that ‘if participants’ musical experiences in the singing groups [comprising the mainstay of project activity], in relation to wellbeing, was to be the focus of the enquiry, than longer-term, imbedded ethnographic or practice-based approaches may be more appropriate’.
19 Available: https://tinyurl.com/2p8uxfts
20 It is worth noting that the Office for National Statistics (ONS) has developed a set of four questions (known as the ONS4) to capture what they call personal (subjective) wellbeing in all its dimensions. These questions are: Life satisfaction – Overall, how satisfied are you with your life nowadays?; Worthwhile – Overall, to what extent do you feel that the things you do in your life are worthwhile?; Happiness – Overall, how happy did you feel yesterday?; and Anxiety – On a scale where 0 is “not at all anxious” and 10 is “completely anxious”, overall, how anxious did you feel yesterday? Responses are given on a scale of 0-10. The ONS utilises these questions as part of the Measuring National Wellbeing (MNW) Programme, which also includes a range of additional standard, objective measures (e.g. income, health). The ONS observes that ‘[t]hese questions represent a harmonised standard for measuring personal well-being, and therefore are used in many surveys across the UK.’ The ONS questions were not commonly used in the studies in this review; however, they could be a useful tool, particularly for large-scale surveying or any situation for which concision and brevity are advantageous. See: https://tinyurl.com/yc82ruwm. Indeed, the ONS4’s compactness could make them appropriate tools for garnering feedback on a session-by-session basis through the life-course of an intervention.
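The session-by-session use of the ONS4 suggested above can be sketched in a few lines. This is a minimal, illustrative sketch only: the question keys and helper functions are hypothetical names, not part of any ONS specification; only the four question wordings and the 0-10 response scale come from the ONS material cited above.

```python
# Illustrative sketch: logging ONS4 responses (0-10) session by session.
# Question keys and helper names are hypothetical, not ONS-specified.

ONS4_QUESTIONS = {
    "life_satisfaction": "Overall, how satisfied are you with your life nowadays?",
    "worthwhile": "Overall, to what extent do you feel that the things you do in your life are worthwhile?",
    "happiness": "Overall, how happy did you feel yesterday?",
    "anxiety": "Overall, how anxious did you feel yesterday?",
}

def record_session(log, session, responses):
    """Store one session's four 0-10 ratings after range-checking them."""
    for key, score in responses.items():
        if key not in ONS4_QUESTIONS:
            raise KeyError(f"Unknown ONS4 item: {key}")
        if not 0 <= score <= 10:
            raise ValueError(f"{key} must be on the 0-10 scale, got {score}")
    log[session] = dict(responses)

def summarise(log):
    """Mean score per item across all logged sessions."""
    sessions = list(log.values())
    return {
        key: sum(s[key] for s in sessions) / len(sessions)
        for key in ONS4_QUESTIONS
    }

log = {}
record_session(log, 1, {"life_satisfaction": 6, "worthwhile": 7, "happiness": 5, "anxiety": 4})
record_session(log, 2, {"life_satisfaction": 7, "worthwhile": 7, "happiness": 6, "anxiety": 3})
print(summarise(log))  # per-item means across the two sessions
```

A fuller implementation would also track per-participant trajectories rather than pooled means, but even this degree of structure avoids the loose, anecdotal feedback formats criticised elsewhere in this review.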
This ranges from the use of the Edinburgh Postnatal Depression Scale,21 administered twice over the period of the intervention and analysed statistically to gauge the impact of participation in a singing group on new mothers’ experience of postnatal depression (Fancourt & Perkins, 2018), to the use of the CORE questionnaire for clinical assessment, which provides ‘an overall measure of ‘mental distress’ with an established … cut off point and clinically significant change scores’ (Clift, 2012; also see Clift & Morrison, 2011). Elsewhere, Clift and Hancox make use of the WHO Quality of Life Questionnaire (WHOQOL-BREF) in tandem with a twelve-item ‘effects of choral singing scale’ to assess the physical, social, environmental and psychological wellbeing accrued from taking part in the named activity (Clift & Hancox, 2010).
In recognition of the challenges encountered when attempting to measure wellbeing, and in seeking to create an element of standardisation to assess and compare impact cross-sectorally, Thomson and Chatterjee (2015) describe their efforts to codesign a wellbeing measures toolkit for use in the museum sector. Working alongside industry stakeholders, the authors developed a toolkit to ‘measure psychological or subjective well-being as an indicator of the mental state of an individual and to be flexible in its approach in order to evaluate the impact of a one-off activity or whole programme of events’ (Thomson & Chatterjee, 2015). To develop the kit, the authors drew upon elements of a number of pre-existing tools.22
Tools based on a similar principle to CORE and WHOQOL-BREF – using statement questions against which respondents estimate the extent to which they agree or disagree (a Likert scale) – are quite widespread. The What Works Centre for Wellbeing provides a Wellbeing Measures Bank comprising a wide variety of scales suitable for a range of circumstances, alongside a simple signposting system to assist evaluation planners in selecting appropriate and accessible tools.23 Among the available validated tools, WEMWBS is used with some consistency. Daykin et al. (2017) used WEMWBS in tandem with the GHQ12, a 12-item general health questionnaire; Smith et al. (2012) used the scale in tandem with the Hospital Anxiety and Depression Scale (HADS) and the Moods and Feelings Questionnaire (MFQ24) to measure wellbeing, mental health and social support factors in a project spanning multiple indices of family circumstances; and McElroy et al. (2021) used the short WEMWBS. Also see Blodgett et al. (2022), who identified examples of WEMWBS’s use to evaluate interventions focussed on art, culture and environment in their wider review of wellbeing evaluation research using WEMWBS.
21 The EPDS is a self-report, 10-item measure scored from 0 to 30, ‘with a ≥10 indicative of possible depression and higher scores indicating more severe depression’ (Fancourt & Perkins, 2018).
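The EPDS cutoff described in the footnote above can be expressed very simply. The sketch below is illustrative: the function names are hypothetical, and only the 10-item structure, 0-30 range and ≥10 cutoff come from the source (each item being scored 0-3 is a standard property of the instrument, not stated in the text).

```python
# Illustrative sketch of the EPDS cutoff reported by Fancourt & Perkins (2018):
# a total score of 10 or more (range 0-30) flags possible depression.
# Function names are hypothetical.

def epds_total(item_scores):
    """Sum the ten self-report items (each scored 0-3)."""
    if len(item_scores) != 10:
        raise ValueError("EPDS has exactly 10 items")
    if any(not 0 <= s <= 3 for s in item_scores):
        raise ValueError("Each EPDS item is scored 0-3")
    return sum(item_scores)

def possible_depression(total):
    """Apply the >=10 cutoff for possible depression."""
    return total >= 10

scores = [1, 2, 1, 0, 2, 1, 1, 2, 0, 1]  # hypothetical respondent
total = epds_total(scores)
print(total, possible_depression(total))  # 11 True
```

Administering the measure twice, as Fancourt and Perkins did, then allows pre/post totals to be compared statistically across the participant group.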
22 Warwick-Edinburgh Mental Wellbeing Scale, Mental Wellbeing Impact Assessment, and People Assessing their Mental Health (PATH II). The scale developed was then distributed for initial testing alongside the already validated VAS (Visual Analogue Scale) and PANAS (Positive and Negative Affect Schedule) tools (Thomson & Chatterjee, 2015).
23 Available at: https://measure.whatworkswellbeing.org/measures-bank/
24 The MFQ is a 32-item questionnaire for depressive symptoms based on the DSM-III-R criteria for depression (Smith et al., 2012).
4.3 A Closer Look at WEMWBS
Designed with both general population mental wellbeing measurement and project evaluation uses in mind,25 WEMWBS comprises a positively worded, 14-item scale ‘designed to measure positive mental health or mental well-being.’26 It includes both hedonic and eudaimonic factors (Taggart et al., 2013). Though validated prior to its general usage, and for use among ethnic minorities (Stewart-Brown et al., 2011; Taggart et al., 2013) and younger age groups (e.g. McElroy et al., 2021; Stewart-Brown, 2011; Clarke et al., 2011),27 researchers have continued to evaluate its responsiveness in a variety of settings. Maheswaran et al. found that the combination of hedonic and eudaimonic factors was likely the foremost reason for the scale’s efficacy at both individual and group levels (Maheswaran et al., 2012; also see Tennant et al., 2007). In earlier work, Tennant et al. (2007) concluded that WEMWBS had features in common with a number of other scales,28 though its lower correlations with the Emotional Intelligence Scale and the single-item measure of life satisfaction indicated that WEMWBS ‘may be measuring a different concept’ (Tennant et al., 2007).
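For readers planning an evaluation, WEMWBS scoring is mechanically straightforward: each of the 14 items is rated on a five-point scale from 1 (‘none of the time’) to 5 (‘all of the time’) and the ratings are summed, giving a total of 14-70. The minimal sketch below illustrates this; the helper name is ours, and the 1-5/14-70 detail is standard WEMWBS practice rather than something stated in the text above.

```python
# Minimal sketch of WEMWBS total scoring: 14 positively worded items,
# each rated 1 ("none of the time") to 5 ("all of the time"), summed to 14-70.
# Helper name is illustrative.

def wemwbs_total(item_ratings):
    """Sum 14 Likert ratings into a single wellbeing score."""
    if len(item_ratings) != 14:
        raise ValueError("WEMWBS has 14 items")
    if any(not 1 <= r <= 5 for r in item_ratings):
        raise ValueError("Each item is rated on a 1-5 scale")
    return sum(item_ratings)

baseline = wemwbs_total([3] * 14)   # 42
followup = wemwbs_total([4] * 14)   # 56
print(followup - baseline)          # change score of 14
```

It is this simple total, captured at two or more timepoints, that underpins the ‘distance travelled’ analyses discussed earlier in this section.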
It is also important to note that, in their investigation of the appropriateness of WEMWBS among ethnic minority groups,29 Taggart et al. concluded that although their findings suggested that ‘the WEMWBS is acceptable across different cultural groups and sufficiently sound from a psychometric perspective to be valid in general population surveys’, there were areas where the tool’s appropriateness was less clear (Taggart et al., 2013). For example, these authors report difficulty in translating some of the concepts used in WEMWBS cross-culturally: the groups participating in this study discussed the significance of spirituality as important to wellbeing, as well as the concept of responsibility for one’s own mental welfare, both of which are unrepresented in the WEMWBS instrument (Taggart et al., 2013). Chinese respondents tended to be ‘dismissive of depression believing that it was over diagnosed in England’, while there is no direct translation for the word ‘optimistic’ in Pashto. Thus, although most Pakistani study participants ‘felt they understood the word, and some understood it as ‘happy for the future’’, it was not clear that the expectation that things would work out well was understood. In a similar way, young men across both study groups ‘interpreted the item ‘feeling interested in other people’ in a sexual context’ (Taggart et al., 2013). In their study, undertaken in a criminal justice setting, Daykin et al. (2017) note that using an approach – such as WEMWBS – which relies upon respondents reflecting upon and self-reporting their mental health can be challenging. In this particular research setting, security concerns meant that questionnaires were completed in a group setting where ‘some participants engaged in banter, conferring and joking about answers’ (Daykin et al., 2017).
25 https://tinyurl.com/49rtsspk
26 The short version – SWEMWBS – comprises 7 items.
27 MacLennan et al. (2021) signpost some resources specifically designed for measuring child and young person wellbeing. These ‘complement the national measures of wellbeing… Some questions match those asked in the adult measuring national wellbeing programme […] others are specifically designed to reflect themes important for these age groups (e.g. talking to parents about things that matter, quarrelling with parents).’ The authors also note that WEMWBS and SWEMWBS are validated for use with children ages 13+ and 11+ respectively.
28 WHO-5, the Short-Depression Happiness Scale, Satisfaction with Life Scale and Scales of Psychological Wellbeing.
29 In this case, English-speaking Pakistani and Chinese participants.
White and Salamon (2010) also report some specific issues encountered with the administration of WEMWBS. In their evaluation of the social prescription programme Arts for Wellbeing, they found it difficult to track which of the WEMWBS responses had been submitted first (baseline) and last (endpoint/follow-up), leading to problems when attempting to track change across time.30 They also came up against very low return rates, as few participants completed and returned both the baseline and follow-up WEMWBS forms. There were also problems with ‘getting forms to artists on time’ for onward distribution. Participant responses were also found to be problematic on occasion: ‘some filled [WEMWBS] out very quickly and with not much thought, others didn’t understand why the NHS wanted to know if they were happy (suggesting association of the NHS with ill health, not positive emotions?)’. The whole process was time consuming and, ‘if these forms were filled out in session 6,31 you do not find out the longevity of the effects’ (Salamon & White, 2010). While some of these complications arise from insufficient planning or readiness on the part of those delivering the programme evaluation, their occurrence does underscore the need to plan for evaluation before a project begins. Yet, even when planning is sufficient, a high response rate – comprising both start and endpoint responses – is not guaranteed.32
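One concrete planning step that addresses the tracking problem White and Salamon describe is to tag every returned form with a participant identifier and a timepoint, and then to compute change only for participants with both a baseline and a follow-up. The sketch below illustrates that idea under our own assumed data layout; nothing here reflects how the Arts for Wellbeing evaluation actually stored its data.

```python
# Sketch of one planning fix for baseline/follow-up tracking: tag each returned
# form with (participant_id, timepoint, score) and report change only for
# complete pairs. Data structures are hypothetical.

def change_scores(forms):
    """forms: list of (participant_id, timepoint, score), timepoint being
    'baseline' or 'followup'. Returns {participant_id: followup - baseline}
    for participants who returned both forms."""
    by_participant = {}
    for pid, timepoint, score in forms:
        by_participant.setdefault(pid, {})[timepoint] = score
    return {
        pid: t["followup"] - t["baseline"]
        for pid, t in by_participant.items()
        if "baseline" in t and "followup" in t
    }

forms = [
    ("P01", "baseline", 41), ("P01", "followup", 48),
    ("P02", "baseline", 39),                # no follow-up returned
    ("P03", "followup", 52), ("P03", "baseline", 44),
]
print(change_scores(forms))  # {'P01': 7, 'P03': 8}
```

Explicit pairing of this kind also makes the attrition visible: the difference between the number of forms returned and the number of complete pairs is exactly the low-return-rate problem reported above.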
It is also worth bearing in mind that, although they acknowledged WEMWBS’s usefulness and uptake in the arts and health space, the authors of the All-Party Parliamentary Group on Arts, Health and Wellbeing Inquiry Report note that ‘critics of WEMWBS point to its relentlessly upbeat nature and its failure to capture other factors impacting upon wellbeing, including socio-economic inequalities, the vagaries of daily life and the imminent end of enjoyable arts activities’ (APPG, 2017). This shortcoming might well be mitigated through the appropriate provision of narrative feedback options, which could then be analysed in tandem with data deriving from any validated scale employed for evaluative purposes.
30 Late enrolment for some programme participants may also have contributed here with a missed first session resulting in no completion of a baseline form.
31 At the time of the evaluation, the programme under scrutiny ran for a maximum of 6 sessions for each participant.
32 In their rapid review of wellbeing evaluation research using WEMWBS, Blodgett et al. (2022) record a range of administration failures for which projects were excluded from consideration, e.g. administering WEMWBS only once, or at the start and end points of a two-week or single-event intervention for which the tool is not intended. On the other hand, ‘[a]nother series of interventions assessed WEMWBS before and after a 5-day multi-activity course, however as scores were also collected at follow-up points beyond two weeks, it was included’ (Blodgett et al., 2022).