A Practical Standard for Measuring Customer Sentiment and Service Quality in HHS
Health and Human Services interactions are often high-stakes, emotionally charged, and constrained by policy—yet the quality of those interactions shapes whether clients follow through, return repeatedly, escalate, or disengage. At the same time, the strain of repeated conflict and ambiguity affects workforce morale and retention. Despite these stakes, agencies typically measure service quality unevenly—through local surveys, informal observation, or narrow operational metrics—without a consistent way to understand what clients experienced in the moment and what workers actually did to influence that experience.
This whitepaper proposes a practical starting standard that agencies can adapt while still enabling comparison and improvement over time. GTN and Tandem developed the approach by synthesizing measurement practices from sectors that treat sentiment measurement as operational infrastructure—then translating those practices into the realities of HHS service channels and constraints. The design emphasizes fairness and diagnostic value by separating “experience” from “worker behavior,” keeping the criteria stable across programs and channels, and ensuring scores remain interpretable and evidence-based (useful for coaching rather than punitive surveillance).
The proposed standard draws lessons from contact centers (disciplined QA, calibration, closed-loop coaching), retail/hospitality (rapid feedback and service recovery), financial services (separating empathy from compliance), and digital support environments (measuring experience across journeys, not just single touchpoints). A key takeaway across these sectors is that the most useful insight often isn’t an absolute sentiment label—it’s the trajectory: when sentiment shifts, what triggered it, and what behaviors helped stabilize the interaction. That trajectory framing is especially relevant in HHS, where customers may begin with frustration or fear and “success” can mean clarity, dignity, and stabilization even when outcomes are constrained.
At the core is a two-layer model with one shared language: an Experience Layer (what the customer is likely feeling and perceiving) and a Worker Behavior Layer (what the worker did in observable, teachable terms). The paper also provides a reusable criteria library: ten experience dimensions (e.g., trust, comprehension, effort burden, perceived fairness, respect/dignity, and de-escalation success) and nine worker behavior dimensions (e.g., expectation-setting, plain language, active listening, transparent constraint explanation, bounded choices, confirming understanding, and actionable closing). Each criterion is scored on a 1–5 scale, producing an Experience total (max 50) and a Worker Behavior total (max 45) that can be applied consistently across programs and channels.
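The layer arithmetic above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the dimension names past the six the summary lists are placeholders (`criterion_7` … `criterion_10`), and the validation rules are assumptions about how a rubric tool might enforce the 1–5 scale.

```python
def layer_total(scores: dict[str, int], expected_count: int) -> int:
    """Sum 1-5 criterion scores for one layer, enforcing the rubric's scale.

    expected_count is 10 for the Experience Layer (max total 50)
    and 9 for the Worker Behavior Layer (max total 45).
    """
    if len(scores) != expected_count:
        raise ValueError(f"expected {expected_count} criteria, got {len(scores)}")
    for name, score in scores.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{name}: score {score} is outside the 1-5 scale")
    return sum(scores.values())

# Six experience dimensions named in the paper; the last four are placeholders.
experience_scores = {
    "trust": 4, "comprehension": 3, "effort_burden": 2,
    "perceived_fairness": 4, "respect_dignity": 5, "de_escalation_success": 3,
    "criterion_7": 3, "criterion_8": 4, "criterion_9": 3, "criterion_10": 4,
}
print(layer_total(experience_scores, expected_count=10))  # prints 35
```

Keeping each layer's total separate (rather than blending them into one number) preserves the paper's distinction between what the client experienced and what the worker did.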
To make reporting actionable without overclaiming precision, the framework maps totals into three tiers—At-risk, Stable, and Strong—and recommends pairing scores with the specific moments (short excerpts or timestamps) that most influenced the result. Finally, the paper outlines staged implementation: start with calibrated QA using the worker behavior rubric, add experience scoring as a trajectory, and integrate outcome measures where feasible. Strong governance is essential—defining how scores will be used, who can see them, and safeguards to prevent misuse—so measurement becomes a tool for learning, coaching, and process improvement, while supporting workforce fairness and public accountability.
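The tier mapping and evidence pairing could look like the sketch below. The percentage cut points (below 50% of the maximum is At-risk, 80% and above is Strong) are illustrative assumptions, not thresholds from the paper, and `ScoredInteraction` is a hypothetical structure for pairing totals with the excerpts or timestamps that drove them.

```python
from dataclasses import dataclass, field

def tier(total: int, max_total: int,
         at_risk_below: float = 0.50, strong_at: float = 0.80) -> str:
    """Map a layer total into the three reporting tiers.

    The cut points are illustrative defaults; an agency would calibrate
    its own thresholds before using tiers in reporting.
    """
    share = total / max_total
    if share < at_risk_below:
        return "At-risk"
    if share >= strong_at:
        return "Strong"
    return "Stable"

@dataclass
class ScoredInteraction:
    experience_total: int          # out of 50
    behavior_total: int            # out of 45
    evidence: list[str] = field(default_factory=list)  # excerpts/timestamps

interaction = ScoredInteraction(
    experience_total=35,
    behavior_total=38,
    evidence=["03:12 - worker restated the denial reason in plain language"],
)
print(tier(interaction.experience_total, 50))  # prints Stable
print(tier(interaction.behavior_total, 45))    # prints Strong
```

Attaching the evidence list to the score, rather than reporting the tier alone, matches the paper's recommendation to pair every result with the specific moments that most influenced it.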