Shifts in subjective well-being scales?

By Milan Griffes @ 2020-08-18T18:27 (+31)

Imagine that someone has an experience that causes them to re-conceptualize the scale they've been using to measure their well-being.

e.g. They used to assess themselves as an 8 out of 10 on some psychometric scale. Then they have this experience, and they think "Oh wow, now it seems that the scale I was using was wrong. I had been thinking that I was close to as good as I could be (8 out of 10), but now it feels like 'as good as I can be' is actually way higher than I thought..."

So they bump up the top end of the scale they're using. Now they assess themselves as 8 out of 50, not because they think their life has gotten worse, but because they're conceptualizing "as good as I can be" differently.

But this shift isn't reflected on the psychometric instrument they've been using, so when they assess themselves again, they have to normalize their new view to match the instrument's scale:

"I used to feel like I was 8 out of 10, and that matched up pretty well with this 10-point scale, so I recorded myself as 8 out of 10"
"Then I had an experience that caused me to believe that the potential upside can be way, way higher, so now I feel like I'm 8 out of 50"
"I have to fill out another one of these 10-point scales – should I still say I'm 8 out of 10? Or should I say I'm 2 out of 10, because that's proportional to my new internal assessment? But recording 2 out of 10 makes it seem like my life has gotten way worse since I last filled out an assessment, and that's not true..."
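
To make the arithmetic in that last quote concrete, here is a minimal sketch of the proportional mapping. The function name `renormalize` and its parameters are made up for illustration and aren't part of any psychometric instrument; the sketch simply assumes the person converts linearly from their internal scale to the instrument's scale.

```python
def renormalize(score: float, internal_max: float, instrument_max: float) -> float:
    """Map a score on a person's internal scale onto an instrument's scale.

    Assumes (hypothetically) a purely proportional conversion, with both
    scales starting at 0.
    """
    if internal_max <= 0 or instrument_max <= 0:
        raise ValueError("scale maxima must be positive")
    return score * instrument_max / internal_max


# Before the experience: internal scale and instrument agree (8 out of 10).
before = renormalize(8, internal_max=10, instrument_max=10)

# After the experience: same felt level (8), but the internal top end is now 50.
after = renormalize(8, internal_max=50, instrument_max=10)

print(before, after, round(after))
```

Under this assumption, 8 out of 50 maps to 1.6 out of 10, which rounds to the 2 mentioned above, even though nothing about the person's life has changed.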

This issue (how to translate a person's internal state into a standard measure, then track that measure over time) seems pretty central to assessing any intervention aimed at improving subjective well-being, given that we're in a world where every person starts by bootstrapping their own internal scale / their own sense of what's possible.

Has anyone seen discussion of how to handle shifts like this in the SWB measurement literature?

MichaelPlant @ 2020-08-19T15:28 (+28)

TL;DR Evidence suggests there aren't shifts in SWB scales over time. This topic isn't well understood. I've got a paper on this area in the works.

The question you're asking here - do individuals rescale, that is, alter what the end-points of their scales refer to? - is one component of a broader concern.

The broader question is whether subjective scales, those where individuals give numerical ratings of subjective phenomena, are cardinally comparable, that is, whether a one-point change on a given scale represents the same size change in subjective experience for different people and at different times. For instance, if I say my happiness has gone from a 4 to a 5 out of 10, and you say your happiness has gone from a 3 to a 4, can we conclude we each had the same increase in happiness?

Given how fundamental the concern is - it applies to all subjective data, not just SWB data - I've been surprised to find the topic hasn't been looked into a great deal. Two leading SWB researchers, Stone and Krueger, said this in a 2018 review article:

one of the most important issues inadequately addressed by current [SWB] research is that of systematic differences in question interpretation and response styles between population groups. Is there conclusive evidence that this is a problem? And, if so, are there ways to adjust for it? Information is needed about which types of group comparisons are affected, about the magnitude of the problem, and about the psychological mechanisms underlying these systematic differences

I've been looking at the cardinality of subjective scales. I've got a working paper that I'm not quite ready to put online - this should only be another couple of months. The paper is an evolution of work I had in my DPhil thesis (pp. 135), where I broke cardinal comparability into a number of components, reviewed the evidence for each, and concluded that SWB data are probably best interpreted as cardinally comparable.

The topic is pretty complicated and addressing all of it would take too long here. I'll just provide a 'quick and dirty' answer to the specific concern you raise about rescaling (aka 'intertemporal cardinality'). Prati and Senik (2020) compare remembered SWB (how satisfied individuals recall being) with observed past SWB (how satisfied individuals said they were at the time). They use German panel data in which individuals were given 9 different pictures of changes in life satisfaction over time (e.g. staying flat, going up, going up then going down, etc.) and asked to pick the one that best represented their own life.

There turns out to be an (I think) pretty amazing match between the patterns of observed past and remembered SWB. This is only possible if either (A) individuals both use the same scale over time and have good memories or (B) individuals change their scale use and have bad memories. If individuals used the same scales and had bad memories, or used different scales and had good memories, there would be an inconsistency between the recalled and past observed patterns. Of the two options, (A) seems far more probable than (B). It's hard to believe individuals really can't remember how their lives have gone. Further, we might expect that individuals will try not to rescale, so that their answers are comparable over time.*

Hence, there doesn't seem to be rescaling at the population level. Further research into whether there are some individuals who rescale, and what causes this to happen, would be good. I'm not aware of any.

*In fact, (B) requires quite specific and implausible patterns of memory failure. To illustrate, suppose your experienced satisfaction has been flat but, because your scale has been shrinking, your reported 0-10 level of satisfaction has been rising over time. To make your observed past satisfaction and your recalled satisfaction consistent, given this scale shrinkage, you would need to falsely recall that your satisfaction has increased. If you instead erroneously recalled that your satisfaction had decreased, then there would be an inconsistency between observation and recall.
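
The footnote's case can be laid out as a toy model. This is only a sketch under strong simplifying assumptions (trends are reduced to up / flat / down, and scale drift and memory error are treated as simple additive pushes on the trend); the names `scale_drift` and `memory_error` are hypothetical and are not taken from Prati and Senik (2020).

```python
def sign(x: float) -> int:
    """Return -1, 0, or +1 for a falling, flat, or rising trend."""
    return (x > 0) - (x < 0)


def observed_trend(experienced: float, scale_drift: float) -> int:
    # Reported scores move with true experience plus any drift in scale use
    # (a shrinking scale pushes reports upward, so scale_drift > 0).
    return sign(experienced + scale_drift)


def recalled_trend(experienced: float, memory_error: float) -> int:
    # Recall tracks true experience, distorted by any memory error.
    return sign(experienced + memory_error)


def consistent(experienced: float, scale_drift: float, memory_error: float) -> bool:
    """True if the recalled pattern matches the pattern in past reports."""
    return observed_trend(experienced, scale_drift) == recalled_trend(
        experienced, memory_error
    )


flat = 0.0  # experienced satisfaction has been flat

print("(A) same scale, good memory:     ", consistent(flat, 0, 0))
print("    same scale, bad memory:      ", consistent(flat, 0, +1))
print("    shrinking scale, good memory:", consistent(flat, +1, 0))
print("(B) shrinking scale, recall up:  ", consistent(flat, +1, +1))
print("(B) shrinking scale, recall down:", consistent(flat, +1, -1))
```

In this toy setup, observation and recall agree only when the scale is stable and memory is accurate, or when the memory error happens to point the same way as the scale drift, which is the specific pattern of memory failure that (B) needs.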

Milan_Griffes @ 2020-08-19T17:10 (+4)

Thanks for this!

How do you think the potential consistency over time (A) squares with the inconsistency between scales & sub-scales that Kaj pointed out?

MichaelPlant @ 2020-08-21T08:49 (+4)

It's not clear to me what relationship one should expect between the cardinality (or not) of subjective scales and the relationship between ratings of overall SWB and ratings of sub-domains (and thus what one could infer about one from results on the other).

As a separate point, I'm not sure how to make sense of the putative inconsistency Kaj notes. I haven't looked into the relationships between overall ratings and ratings of sub-domains; it's not something that I've heard SWB researchers discuss much either. The most obvious explanations, in addition to those mentioned below, are to appeal to missing domains and/or different temporal foci (i.e. you just think about sub-domains as they are now, but for your life as a whole you also think about the future).

Kaj_Sotala @ 2020-08-19T12:38 (+6)

I don't know, but I get the impression that SWB questions are susceptible to framing effects in general: for example, Biswas-Diener & Diener (2001) found that when people in Calcutta were asked for their life satisfaction in general, and also for their satisfaction in 12 subdomains (material resources, friendship, morality, intelligence, food, romantic relationship, family, physical appearance, self, income, housing, and social life), they gave on average a slightly negative rating for the global satisfaction, while also giving positive ratings for all the subdomains. (This result was replicated at least by Cox 2011 in Nicaragua.)

Biswas-Diener & Diener 2001 (scale of 1-3):

The mean score for the three groups on global life satisfaction was 1.93 (on the negative side just under the neutral point of 2). [...] The mean ratings for all twelve ratings of domain satisfaction fell on the positive (satisfied) side, with morality being the highest (2.58) and the lowest being satisfaction with income (2.12).

Cox 2011 (scale of 1-7):

The sample level mean on global life satisfaction was 3.8 (SD = 1.7). Four is the mid-point of the scale and has been interpreted as a neutral score. Thus this sample had an overall mean just below neutral. [...] The specific domain satisfactions (housing, family, income, physical appearance, intelligence, friends, romantic relationships, morality, and food) have means ranging from 3.9 to 5.8, and a total mean of 4.9. Thus all nine specific domains are higher than global life satisfaction. For satisfaction with the broader domains (self, possessions, and social life) the means ranged from 4.4 to 5.2, with a mean of 4.8. Again, all broader domain satisfactions are higher than global life satisfaction. It is thought that global judgments of life satisfaction are more susceptible to positivity bias and that domain satisfaction might be more constrained by the concrete realities of an individual’s life (Diener et al. 2000)

Julia_Wise @ 2020-08-19T15:34 (+5)

A friend of mine described this happening between high school and university - he felt his life was pretty good in high school, and in university he thought "oh wow, there are so many options out here in adult life, my life could be way better than I thought."

G Gordon Worley III @ 2020-08-18T18:56 (+5)

Might help to see how this is handled, if at all, with pain scales. For example, I can imagine someone thinking they're having 9/10 or 10/10 pain, say from an injury, but then after something much worse happens, say a cluster headache or a kidney stone, they realize their injury pain was only a 6/10 or 7/10 and the cluster headache or kidney stone was the actual 10/10.

I know there is already some stuff about how the pain scale has cross-cultural issues, with people from different cultures reporting, and possibly even experiencing, their pain as more or less severe than people from other cultures, so that might be an entry point to this line of investigation.