Class 3

The Trouble with Traffic Lights

Why traditional risk assessment falls short when hazards interact, cascade, and hide their causes

A safety manager at a regional hospital sits down with her risk register. She has just received the results of a staff wellbeing survey: elevated workload complaints, low scores on supervisor support, and a spike in voluntary turnover among early-career nurses. She opens the organisation's risk matrix - a familiar 5×5 grid of likelihood and consequence, colour-coded in greens, ambers, and reds. She rates "high job demands" as likely and moderate in consequence: amber. She rates "low supervisor support" as possible and moderate: also amber. Two amber ratings. No red flags. The register goes to the executive team, who note the results, approve the existing controls, and move to the next agenda item. Three months later, the ward loses a third of its nursing staff to burnout-related resignations. How did two amber ratings miss a crisis?

How Organisations Actually Assess Psychosocial Risk

If you have spent any time in a workplace health and safety function, you will recognise the tools. Most organisations assess risk - including psychosocial risk - using some combination of three approaches: risk registers that catalogue known hazards with owner names and review dates, risk matrices that plot each hazard on a likelihood-by-consequence grid, and survey-based approaches that collect self-report data from workers and compare scores against benchmarks or norms (Taibi et al., 2022). These tools are not arbitrary. They descend from decades of occupational health and safety practice, and for physical and chemical hazards - an unguarded machine, a toxic gas exposure - they work reasonably well. The hazard is discrete. The exposure pathway is clear. The consequence is measurable.

Psychosocial hazards, however, are a different animal. As Schulte et al. (2024) document in their call to action from the US National Institute for Occupational Safety and Health, work-related psychosocial hazards are "on the verge of surpassing many other occupational hazards" in their contribution to ill-health and disability - yet the methods for assessing them remain borrowed, often without adaptation, from the physical hazard playbook. The standard approach asks practitioners to take each psychosocial hazard, treat it as a standalone line item, estimate its likelihood and consequence separately, multiply or cross-reference to produce a risk rating, and assign a colour: green, amber, or red. A traffic light.

This chapter is not about dismissing these tools. They provide structure, create accountability, and force conversations that might otherwise never happen. But it is about understanding their limits - limits that become severe when the hazards in question interact, accumulate, and produce outcomes through causal chains that no single colour code can capture.

The Risk Matrix: A Closer Look

The risk matrix is elegant in its simplicity. One axis represents likelihood (from rare to almost certain), the other represents consequence (from negligible to catastrophic). Each hazard occupies a single cell. The cell's colour - determined by some version of the formula Risk = Likelihood × Consequence - tells the decision-maker how urgently to act. Cox (2008), in a landmark analysis, showed that this simplicity comes at a steep mathematical cost. His paper demonstrated that a typical risk matrix can correctly and unambiguously compare only a small fraction (for example, fewer than 10%) of randomly selected hazard pairs - a limitation he called poor resolution. One source of this is range compression: hazards with very different quantitative risk levels are assigned the same colour category, making them appear equivalent when they are not. Worse, the matrix can produce risk reversals, assigning a higher rating to a hazard that is quantitatively less dangerous than another rated lower (Cox, 2008).
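A small sketch can make range compression and risk reversal concrete. The bin edges, colour thresholds, and the four hazards below are illustrative assumptions chosen for this example; they are not taken from Cox (2008) or from any standard matrix.

```python
# Toy 5x5 risk matrix. Bin edges, thresholds, and hazards are hypothetical.

def to_bin(value: float, scale_max: float) -> int:
    """Map a quantity in [0, scale_max] onto an ordinal score from 1 to 5."""
    return min(5, int(value / scale_max * 5) + 1)

def colour(probability: float, severity: float) -> str:
    """Colour a hazard from the product of its two ordinal scores."""
    score = to_bin(probability, 1.0) * to_bin(severity, 100.0)
    if score >= 15:
        return "red"
    if score >= 5:
        return "amber"
    return "green"

def expected_loss(probability: float, severity: float) -> float:
    """Quantitative risk: probability times severity (severity on 0-100)."""
    return probability * severity

# (probability, severity) pairs, invented for illustration only.
hazards = {
    "A": (0.41, 41.0),
    "B": (0.59, 59.0),
    "C": (0.39, 99.0),
    "D": (0.61, 61.0),
}

for name, (p, s) in hazards.items():
    print(name, colour(p, s), round(expected_loss(p, s), 2))
```

Under these assumptions, hazards A and B land in the same amber cell even though B's expected loss is roughly double A's (range compression). Hazard D is rated red while C is rated amber, although C has the higher expected loss (a risk reversal).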

Duijm (2015) extended these concerns by examining the practical design choices that organisations make: how many rows and columns, where to draw the colour boundaries, how to define "likely" versus "possible." Each choice embeds a judgement. Different users, presented with the same scenario, can produce opposite ratings - a problem that is especially pronounced for psychosocial hazards, where the "likelihood" and "consequence" of something like role ambiguity resist the kind of concrete estimation that a chemical spill permits (Dettmers & Stempel, 2021).

Think About It

Imagine you are asked to rate the "likelihood" that a team experiences low job control. Likelihood of what, exactly - that the condition exists, that it causes harm, or that the harm becomes severe? How would two different managers interpret this question differently?

These are serious problems, but they are problems of precision. The three limitations we turn to next are problems of architecture - they concern not how well the matrix executes its logic, but what that logic is fundamentally unable to represent.

Failure Mode 1: The Independence Assumption

A risk register lists hazards as separate line items. High job demands: row 14. Low job control: row 15. Low supervisor support: row 16. Each receives its own likelihood rating, its own consequence rating, and its own traffic light. The implicit assumption is that these hazards are independent - that the risk posed by high demands can be assessed without knowing anything about the level of control or support available to the worker.

This assumption collapses the moment you consult the research. Karasek's (1979) demand-control model demonstrated that psychological strain is not produced by high demands alone, nor by low control alone, but by their interaction. A nurse managing a heavy patient load with autonomy over scheduling and clinical decisions may thrive. The same nurse with the same load but no say in how the work is organised may develop exhaustion and depersonalisation. The critical quantity is not the sum of two independent ratings but their joint effect: how much harm demands do depends on how much control the worker has.

Longitudinal evidence confirms this pattern. De Jonge et al. (2000), tracking healthcare workers over two years, found that the relationship between job demands and job satisfaction actually reversed direction depending on the level of control: positive under high control, negative under low control. This is not a subtle statistical nuance. It means that a risk matrix rating demands as "moderate risk" and control as "moderate risk" could be describing either a healthy workplace or a dangerous one - and the matrix has no way to distinguish the two.
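The sign reversal described above is what an interaction term looks like in practice. The following is a minimal sketch with made-up coefficients, not the model estimated by De Jonge et al. (2000): the function `satisfaction`, its baseline, and its slope are hypothetical and serve only to show how the effect of demands can flip with the level of control.

```python
def satisfaction(demands: float, control: float,
                 baseline: float = 5.0, slope: float = 2.0) -> float:
    """Toy interaction model (hypothetical coefficients).

    demands and control are both scaled to the range 0..1.
    The effect of demands is slope * (control - 0.5): positive when
    control is above 0.5, negative when it is below.
    """
    return baseline + slope * (control - 0.5) * demands

for control in (0.9, 0.1):
    low = satisfaction(demands=0.2, control=control)
    high = satisfaction(demands=0.8, control=control)
    trend = "rises" if high > low else "falls"
    print(f"control={control}: satisfaction {trend} as demands increase")
```

A matrix that scores demands on one row and control on another records the two inputs but has nowhere to store the product term that determines which way the outcome moves.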

Bakker and Demerouti's (2007) Job Demands-Resources model generalised this insight across dozens of occupational settings, showing that job resources buffer the impact of job demands on burnout. Across more than 12,000 employees, 88% of demand-resource interactions followed this pattern. The interaction effect - not the additive combination - is the explanatory mechanism. Yet risk matrices have no cell for interactions. They offer rows. Reality offers networks.

Failure Mode 2: The Snapshot Problem

A risk matrix captures a moment in time. It says: right now, given what we know, this hazard sits in this cell. But psychosocial hazards do not sit still. They propagate. High job demands, sustained over months, erode recovery. Eroded recovery reduces cognitive performance. Reduced performance generates errors. Errors trigger blame cultures. Blame cultures suppress reporting. Suppressed reporting means the next risk assessment shows fewer hazards, not more. The matrix at time two looks better than the matrix at time one - even as the actual situation deteriorates.

This is the snapshot problem: risk matrices cannot model how hazard levels flow through causal pathways over time. They provide a static photograph when what practitioners need is a motion picture. Taibi et al. (2022) make this point explicitly, noting that existing psychosocial risk assessment approaches "only consider one parameter of the risk definition" and that cross-sectional self-report data - the typical input to a psychosocial risk matrix - "cannot support causal conclusions." If the matrix tells you that demands are high and burnout is elevated, it cannot tell you whether the demands caused the burnout, whether burnout made demands feel higher than they are, or whether both are driven by a third factor the register never captured.
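The feedback loop in which suppressed reporting makes the register look healthier can be sketched as a toy simulation. Everything here is an assumption for illustration: the starting strain, the growth rate per quarter, and the rule that the reporting rate falls one-for-one as strain (and with it blame) rises.

```python
def simulate(quarters: int = 5, start_strain: float = 0.5,
             growth: float = 0.1):
    """Toy model: true strain grows while willingness to report shrinks.

    Returns a list of (quarter, true_strain, reported_strain) tuples.
    All parameters are hypothetical.
    """
    rows = []
    strain = start_strain
    for quarter in range(1, quarters + 1):
        reporting_rate = 1.0 - strain       # blame culture suppresses reports
        reported = strain * reporting_rate  # what the assessment sees
        rows.append((quarter, strain, reported))
        strain = min(1.0, strain + growth)  # sustained demands erode recovery
    return rows

for quarter, true_strain, reported in simulate():
    print(f"Q{quarter}: true={true_strain:.2f} reported={reported:.2f}")
```

In this sketch the true strain climbs every quarter while the reported figure falls, so a matrix fed only the reported figure would show improvement across successive snapshots.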

Think About It

If a team's risk matrix shows "role conflict" as amber this quarter and amber last quarter, does that mean the situation is stable? What causal processes might be changing underneath an unchanging colour code?

Failure Mode 3: The Direction Problem

Risk matrices reason in one direction: from hazard to rating. You identify a hazard, estimate its parameters, and compute forward to a conclusion. This is forward reasoning, and it answers the question: "Given this hazard, how bad could things get?"

But practitioners also need backward reasoning - the ability to start from an observed outcome and work back to its likely causes. When a ward experiences a sudden spike in sick leave, the manager needs to ask: "What combination of conditions most likely produced this?" Was it demands? Support? A scheduling change? Some interaction among them? A risk matrix cannot answer this question. It was not designed to. It has no mechanism for diagnostic inference - for updating beliefs about causes in light of observed effects.
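To see what backward reasoning involves, consider a minimal sketch that applies Bayes' rule to the sick-leave example. The three candidate explanations are treated as mutually exclusive and exhaustive, and every prior and likelihood is an invented number; none comes from the studies cited in this chapter.

```python
def posterior(candidates: dict) -> dict:
    """Update beliefs about causes after observing an outcome.

    candidates maps each cause to (prior, likelihood), where likelihood
    is P(outcome | cause). Causes are assumed mutually exclusive and
    exhaustive, which is a simplification.
    """
    joint = {cause: prior * lik for cause, (prior, lik) in candidates.items()}
    total = sum(joint.values())  # P(outcome)
    return {cause: value / total for cause, value in joint.items()}

# Hypothetical numbers: (prior probability, P(sick-leave spike | cause)).
candidates = {
    "high demands only": (0.5, 0.1),
    "low support only": (0.3, 0.2),
    "high demands with low support": (0.2, 0.7),
}

for cause, prob in posterior(candidates).items():
    print(f"{cause}: {prob:.2f}")
```

With these assumed numbers, the combined condition starts as the least likely explanation and becomes the most likely one once the spike is observed. A matrix cell stores a rating and offers no operation for this kind of update.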

This directional limitation matters profoundly in practice. Dettmers and Stempel (2021) showed that converting psychosocial questionnaire scores into risk values already requires assumptions about causal direction that the tools themselves cannot verify. The procedures organisations use to determine whether a score is "critical" differ significantly and often rest on arbitrary rules of thumb rather than empirical evidence. Without a tool that reasons in both directions, practitioners are left guessing at causes - and interventions based on guesses are interventions that may target the wrong lever entirely.

What Would a Better Tool Need?

Let us gather the threads. The three failure modes - independence, propagation, and directionality - are not minor inconveniences. They are architectural gaps. A tool adequate to the complexity of psychosocial hazards would need three corresponding capabilities:

The ability to model interactions: It must represent the joint behaviour of hazards that combine non-additively - where the effect of A depends on the level of B.
The ability to trace causal propagation: It must show how conditions flow through pathways, from upstream causes through intermediate states to downstream outcomes, allowing "what-if" reasoning about changed conditions.
The ability to reason in both directions: It must support forward inference (from causes to consequences) and backward inference (from observed outcomes to probable causes).

No spreadsheet, no colour-coded grid, and no standalone survey instrument offers all three. But such a tool exists. It has been used for decades in medical diagnosis, engineering reliability, and artificial intelligence. It has a name.

We are not ready to say it yet. Not quite. First, in Chapter 4, we need to learn the language of causal thinking - to understand what it means for one thing to cause another, and how we represent that relationship formally. But hold the question in mind as you move forward: What if we could build a model of the workplace that shows how hazards connect to each other and flow into outcomes - a model we could run forward and backward, asking both "what-if" and "why" questions?

Key Takeaways

Traditional psychosocial risk assessment relies on risk registers, risk matrices, and survey-based approaches - tools adapted from physical hazard management that provide useful starting points but embed significant limitations.
Risk matrices suffer from range compression and risk reversal, meaning they can assign identical ratings to very different risks and occasionally rank less dangerous hazards above more dangerous ones.
The independence assumption treats interacting hazards as separate line items, missing the combinatorial effects that research (Karasek's demand-control model, the JD-R model) identifies as the primary drivers of strain.
The snapshot problem means risk matrices capture a static moment and cannot model how hazards propagate through causal pathways over time.
The direction problem means traditional tools can only reason forward (hazard → rating) but cannot reason backward from observed outcomes to likely causes - the diagnostic inference practitioners urgently need.
A tool adequate for psychosocial risk must support three capabilities: modelling interactions, tracing causal propagation, and reasoning bidirectionally.

Looking Ahead

We have identified what a better tool needs to do. In Chapter 4, we lay the foundation for building one - learning the language of causation, distinguishing correlation from causal influence, and discovering how directed acyclic graphs give us a precise way to draw the relationships that risk matrices can only ignore.

References

Bakker, A. B., & Demerouti, E. (2007). The Job Demands-Resources model: State of the art. Journal of Managerial Psychology, 22(3), 309–328. https://doi.org/10.1108/02683940710733115

Cox, L. A. (Tony). (2008). What's wrong with risk matrices? Risk Analysis, 28(2), 497–512. https://doi.org/10.1111/j.1539-6924.2008.01030.x

de Jonge, J., Dollard, M. F., Dormann, C., Le Blanc, P. M., & Houtman, I. L. D. (2000). A longitudinal test of the demand–control model using specific job demands and specific job control. International Journal of Behavioral Medicine, 7(4), 307–321. https://pmc.ncbi.nlm.nih.gov/articles/PMC2862948/

Dettmers, J., & Stempel, C. R. (2021). How to use questionnaire results in psychosocial risk assessment: Calculating risks for health impairment in psychosocial work risk assessment. International Journal of Environmental Research and Public Health, 18(13), 7107. https://doi.org/10.3390/ijerph18137107

Duijm, N. J. (2015). Recommendations on the use and design of risk matrices. Safety Science, 76, 21–31. https://doi.org/10.1016/j.ssci.2015.02.014

Karasek, R. A., Jr. (1979). Job demands, job decision latitude, and mental strain: Implications for job redesign. Administrative Science Quarterly, 24(2), 285–308. https://doi.org/10.2307/2392498

Schulte, P. A., Sauter, S. L., Pandalai, S. P., Tiesman, H. M., Chosewood, L. C., Cunningham, T. R., Wurzelbacher, S. J., Pana-Cryan, R., Swanson, N. G., Chang, C.-C., Nigam, J. A. S., Reissman, D. B., Ray, T. K., & Howard, J. (2024). An urgent call to address work-related psychosocial hazards and improve worker well-being. American Journal of Industrial Medicine, 67(6), 499–514. https://doi.org/10.1002/ajim.23583

Taibi, Y., Metzler, Y. A., Bellingrath, S., Neuhaus, C. A., & Müller, A. (2022). Applying risk matrices for assessing the risk of psychosocial hazards at work. Frontiers in Public Health, 10, 965262. https://doi.org/10.3389/fpubh.2022.965262