Whitepaper
The Stori Trait Assessment: A Dual-Output Framework for Candidate Intelligence
Fraser Hill / Stori Labs — 2026
For decades, hiring has relied on proxies — resumes, behavioral interviews, and intuition. These tools have persisted not because they are accurate, but because they produce outcomes that appear acceptable. Most hires do not fail catastrophically. Most employees are “good enough.” And so the system has remained largely unchallenged.
Artificial intelligence has not created this problem. It has exposed it. Resumes are now generated and optimized by machines. Candidates rehearse with AI coaches. The signals that hiring processes were designed to read have become indistinguishable from noise.
This paper describes the measurement system designed to replace those proxies: a structured trait assessment and narrative interview framework that captures how people actually think, work, and perform — grounded in behavioral reality rather than self-reported history.
1. The problem with hiring today
The tools companies use to evaluate candidates have barely changed in fifty years. Resumes list credentials. Behavioral interviews ask candidates to recall polished highlights. Reference checks confirm employment dates. Each of these is a proxy — an indirect signal that correlates weakly, and often misleadingly, with actual job performance.
Schmidt and Hunter's (1998) meta-analysis put the validity of unstructured interviews for predicting job performance at r = 0.38. Structured interviews improve to r = 0.51: better, but a correlation of that size still leaves roughly three quarters of the variance in performance unexplained (0.51² ≈ 0.26). In the same meta-analysis, cognitive ability tests alone reach r = 0.51, and pairing them with a second measure such as an integrity test raises combined validity to about r = 0.65. Conscientiousness alone sits at r = 0.31 and contributes more in combination. Yet most companies still treat the interview as their primary evaluation tool.
The arrival of generative AI has made this worse. Resumes are now written by machines and read alike, and candidates can rehearse with AI coaches. The 400-word resume, already a poor signal, has become noise.
2. Eight years of original research
Between 2012 and 2020, Stori founder Fraser Hill conducted over 1,700 leadership interviews across banking, technology, and professional services. The research, published in The CEO's Greatest Asset (2020), examined what differentiates high performers from the rest — not on paper, but in how they think, decide, communicate, and respond to pressure.
The core finding: the traits that predict success are observable in conversation, but traditional interview formats are not designed to elicit them. Behavioral questions (“Tell me about a time when...”) invite rehearsed performance. They measure interview skill, not the underlying cognitive and behavioral patterns that drive results.
This research identified a set of twelve behavioral facets — observable, measurable dimensions of how people work — organized into four meta-traits. These became the foundation of the Stori Trait Assessment.
3. The Stori Trait Assessment (STA)
The STA is a dual-output psychometric system. From a single, six-minute experience, it produces two complementary views of a candidate:
Stori Traits
Four meta-traits and twelve facets describing behavioral and cognitive tendencies. Applied, narrative, immediately useful for hiring decisions.
Big Five Personality Profile
Openness, Conscientiousness, Extraversion, Agreeableness, Emotional Stability. The most widely studied personality framework in psychology, enabling cross-system benchmarking.
Both outputs derive from the same 40 triad rankings of 120 adjectives. The Stori Traits tell you how someone works. The Big Five overlay places the same person on a framework shared with other instruments and studies. Together, they provide both practical utility and research-grounded context.
4. Assessment design: forced-choice triads
The STA uses a forced-choice triad model. Candidates see 40 blocks of three adjectives and rank each block: most like me (+2), more like me (+1), less like me (-1). Each of the 120 adjectives appears exactly once across the entire assessment.
This design is not arbitrary. Forced-choice formats are built to reduce three critical biases that undermine traditional self-report questionnaires:
- Acquiescence bias — the tendency to agree with statements regardless of content. Forced ranking makes “agree with everything” impossible.
- Social desirability — the tendency to present oneself favorably. When all three options are positive, there is no “right answer” to game.
- Central tendency — the tendency to choose middle options. Triads force differentiation.
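Each triad draws from three different meta-traits, with each meta-trait omitted from exactly ten triads. Every meta-trait therefore appears in 30 of the 40 triads, once for each of its 30 adjectives (three facets of ten adjectives each). This rotation keeps measurement balanced across meta-traits while holding completion time to approximately six minutes.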
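The rotation and point values described in this section can be sketched in a few lines of Python. This is an illustration only: `build_rotation` and `score_triads` are hypothetical names, the round-robin omission order is an assumption (the paper states only that each meta-trait is omitted ten times), and summing points per meta-trait is a naive raw score, not the STA's scoring method.

```python
from collections import Counter

META_TRAITS = ["Thinking", "Discipline", "Execution", "Communication"]
POINTS = (2, 1, -1)  # most like me, more like me, less like me

def build_rotation(n_triads=40):
    """Return, for each triad, the three meta-traits it draws from.

    Assumption: the omitted meta-trait simply cycles round-robin.
    """
    rotation = []
    for i in range(n_triads):
        omitted = META_TRAITS[i % len(META_TRAITS)]
        rotation.append(tuple(m for m in META_TRAITS if m != omitted))
    return rotation

def score_triads(rotation, rankings):
    """Sum ranking points per meta-trait (naive raw scoring).

    rankings[i] lists the positions (0-2) of the adjectives in triad i,
    ordered from "most like me" to "less like me".
    """
    totals = Counter()
    for traits, order in zip(rotation, rankings):
        for points, position in zip(POINTS, order):
            totals[traits[position]] += points
    return dict(totals)

rotation = build_rotation()

# Check the balance property: ten omissions and thirty appearances each.
omissions = Counter(m for triad in rotation for m in META_TRAITS if m not in triad)
appearances = Counter(m for triad in rotation for m in triad)

# A respondent who always ranks the adjectives in the order shown.
raw = score_triads(rotation, [(0, 1, 2)] * 40)
print(dict(omissions), dict(appearances), raw)
```

A real item bank would also attach each adjective to its facet, so the same loop could accumulate twelve facet totals instead of four meta-trait totals.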
5. Twelve facets of human performance
Each facet is anchored by ten adjectives drawn from the International Personality Item Pool (IPIP) — a public-domain lexical base that underpins decades of peer-reviewed personality research. The adjectives were selected and refined using insights from the 1,700-interview research corpus.
| Facet | Meta-Trait | Anchor Adjectives |
|---|---|---|
| Curious | Thinking | Inquisitive, Analytical, Investigative, Exploratory, Probing, Reflective, Questioning, Studious, Insightful, Curious |
| Imaginative | Thinking | Creative, Visionary, Inventive, Vivid, Abstract, Conceptual, Dreaming, Innovative, Artistic, Imaginative |
| Intuitive | Thinking | Instinctive, Discerning, Sensitive, Intuitive, Holistic, Foresighted, Subtle, Prescient, Speculative, Strategic |
| Driven | Discipline | Motivated, Purposeful, Ambitious, Persistent, Determined, Industrious, Self-starting, Diligent, Competitive, Goal-oriented |
| Principled | Discipline | Honest, Ethical, Trustworthy, Principled, Reliable, Genuine, Transparent, Moral, Fair, Authentic |
| Consistent | Discipline | Organized, Structured, Steady, Predictable, Consistent, Systematic, Dependable, Thorough, Regular, Focused |
| Courageous | Execution | Brave, Bold, Confident, Decisive, Daring, Resilient, Assertive, Fearless, Steadfast, Courageous |
| Adaptable | Execution | Flexible, Resourceful, Versatile, Calm, Composed, Open-minded, Easy-going, Agile, Adaptable, Patient |
| Accountable | Execution | Responsible, Dependable, Dutiful, Loyal, Reliable, Committed, Conscientious, Answerable, Accountable, Faithful |
| Articulate | Communication | Well-spoken, Clear, Coherent, Concise, Verbal, Precise, Fluent, Lucid, Eloquent, Articulate |
| Influential | Communication | Persuasive, Charismatic, Inspiring, Confident, Assertive, Convincing, Poised, Engaging, Energetic, Motivating |
| Perceptive | Communication | Perceptive, Observant, Attentive, Attuned, Receptive, Tactful, Astute, Responsive, Empathetic, Aware |
6. Scoring: from ipsative to normative
Forced-choice data is inherently ipsative — it tells you which traits are strongest relative to each other within one person, but not how that person compares to others. This is a well-known limitation of forced-choice formats.
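A short simulation makes the ipsative constraint concrete. It assumes the point values from section 4 (+2, +1, -1) and simple raw summation at the meta-trait level, which is an illustrative simplification and not the STA's scoring. It also picks the three scales in each triad at random rather than following the actual rotation. Under those assumptions every respondent's points total 40 × (2 + 1 - 1) = 80 however they answer, so a higher score on one scale must come at the expense of another.

```python
import random

POINTS = [2, 1, -1]   # most like me, more like me, less like me
N_TRIADS = 40
N_SCALES = 4          # hypothetical: scoring at the meta-trait level

def simulate_respondent(rng):
    """Rank every triad at random and return raw totals per scale."""
    totals = [0] * N_SCALES
    for _ in range(N_TRIADS):
        scales = rng.sample(range(N_SCALES), 3)  # three distinct scales per triad
        ranks = POINTS[:]
        rng.shuffle(ranks)
        for scale, pts in zip(scales, ranks):
            totals[scale] += pts
    return totals

rng = random.Random(0)
profiles = [simulate_respondent(rng) for _ in range(5)]
grand_totals = [sum(profile) for profile in profiles]

for profile, total in zip(profiles, grand_totals):
    print(profile, "sum =", total)
```

Because the grand total is fixed, raw totals can rank traits within a person but cannot by themselves show whether one respondent is higher than another on a given trait.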
The STA resolves this using established psychometric methods developed for forced-choice instruments, including ipsative-to-normative conversion techniques. The process:
- Estimate latent trait scores per facet from the forced-choice rankings.
- Compute z-scores relative to a normative distribution.
- Transform to T-scores (mean = 50, SD = 10) for interpretability.
- Generate percentile equivalents for benchmarking.
- Assign banding: High (≥65), Moderate-High (55-64), Average (45-54), Low (≤44).
The result is interval-level data suitable for radar charts, statistical models, and cross-candidate comparison — not just “this person is more curious than they are organized,” but “this person is in the 82nd percentile for curiosity.”
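The last four steps of the process above can be sketched as follows. The latent-score estimation step is not shown, and several things here are assumptions: the function names, the normative mean and SD in the example, the use of a normal distribution to derive percentiles, and treating the band boundaries as continuous thresholds at 65, 55, and 45.

```python
from statistics import NormalDist

def t_score(z):
    """Convert a z-score to a T-score (mean 50, SD 10)."""
    return 50 + 10 * z

def percentile(z):
    """Percentile equivalent, assuming normally distributed scores."""
    return 100 * NormalDist().cdf(z)

def band(t):
    """Map a T-score to the four bands listed above."""
    if t >= 65:
        return "High"
    if t >= 55:
        return "Moderate-High"
    if t >= 45:
        return "Average"
    return "Low"

def report(latent, norm_mean, norm_sd):
    """Turn one facet's latent score into z, T, percentile, and band."""
    z = (latent - norm_mean) / norm_sd
    t = t_score(z)
    return {
        "z": round(z, 2),
        "t": round(t, 1),
        "percentile": round(percentile(z)),
        "band": band(t),
    }

# Hypothetical norms: mean 10, SD 25 on the latent scale.
curious = report(33, norm_mean=10, norm_sd=25)
print(curious)
```

Under these assumptions the 82nd-percentile example corresponds to a T-score of about 59, which falls in the Moderate-High band. The High band (T ≥ 65) begins at roughly the 93rd percentile.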
7. Dual-output mapping
Each Stori facet carries a primary Big Five loading and, where the literature supports it, a secondary loading. This allows the same 120 data points to produce both the Stori profile and a Big Five Personality Profile.
| Big Five Dimension | Stori Meta-Trait Link | Contributing Facets |
|---|---|---|
| Openness | Thinking | Curious, Imaginative, Intuitive |
| Conscientiousness | Discipline + Execution | Driven, Principled, Consistent, Accountable |
| Extraversion | Communication | Articulate, Influential |
| Agreeableness | Communication | Perceptive (+ secondary loadings) |
| Emotional Stability | Execution | Courageous, Adaptable |
The Stori meta-traits are designed to align with their Big Five anchors while preserving the distinctiveness of the Stori framework. Formal construct validity studies are planned as assessment data scales.
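One simple way to read the mapping table is as a lookup from each Big Five dimension to its contributing facets. The sketch below averages facet T-scores with equal weights. That weighting is an assumption, since the paper does not publish loading values, and secondary loadings are left out for the same reason. The facet scores in the example are made up for illustration.

```python
from statistics import mean

# Primary facet assignments taken from the mapping table.
BIG_FIVE_FACETS = {
    "Openness": ["Curious", "Imaginative", "Intuitive"],
    "Conscientiousness": ["Driven", "Principled", "Consistent", "Accountable"],
    "Extraversion": ["Articulate", "Influential"],
    "Agreeableness": ["Perceptive"],
    "Emotional Stability": ["Courageous", "Adaptable"],
}

def big_five_profile(facet_t):
    """Equal-weight average of contributing facet T-scores per dimension."""
    return {
        dimension: round(mean(facet_t[f] for f in facets), 1)
        for dimension, facets in BIG_FIVE_FACETS.items()
    }

# Illustrative facet T-scores (not real data).
facet_t = {
    "Curious": 62, "Imaginative": 55, "Intuitive": 51,
    "Driven": 58, "Principled": 49, "Consistent": 45, "Accountable": 52,
    "Courageous": 60, "Adaptable": 47,
    "Articulate": 53, "Influential": 57, "Perceptive": 44,
}

profile = big_five_profile(facet_t)
print(profile)
```

An average of several T-scores generally has a standard deviation below 10, so a production version would likely re-standardize each composite against its own norms before reporting it.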
8. Interview intelligence: the evidence layer
A trait score tells you what someone is like. An interview tells you what they have done. Neither is complete alone. The STA framework is designed so that both speak the same language.
When a candidate completes a Stori interview, the full transcript is analyzed by AI to extract structured intelligence across the same facet dimensions measured by the trait assessment. The system identifies timestamped moments where the candidate demonstrates specific facets — a moment of Courageous when they describe a high-stakes decision, evidence of Driven when they talk about pursuing an aggressive target, Accountable when they own a mistake.
These highlights are surfaced as tagged, seekable moments in the interview player. A hiring manager can click “Driven” and watch the 45 seconds where it showed up. This is not a summary or a score — it is the primary evidence, timestamped and accessible.
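A tagged, seekable moment can be modeled as a small record keyed by facet and time range. The sketch below is a hypothetical data shape: the `Highlight` fields, the `highlights_for` helper, and the sample quotes are all illustrative assumptions rather than the Stori schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Highlight:
    facet: str       # one of the twelve Stori facets
    start_s: float   # offset into the recording, in seconds
    end_s: float
    quote: str       # transcript excerpt supporting the tag

    @property
    def duration_s(self):
        return self.end_s - self.start_s

def highlights_for(highlights, facet):
    """Return the moments tagged with a facet, in playback order."""
    matches = (h for h in highlights if h.facet == facet)
    return sorted(matches, key=lambda h: h.start_s)

# Illustrative moments from one interview.
moments = [
    Highlight("Accountable", 905.0, 940.0, "That miss was mine, so I rebuilt the plan."),
    Highlight("Driven", 312.0, 357.0, "We set a target nobody thought we could hit."),
    Highlight("Driven", 120.0, 150.0, "I asked for the hardest territory."),
]

driven = highlights_for(moments, "Driven")
for h in driven:
    print(f"{h.start_s:>6.0f}s  {h.duration_s:.0f}s  {h.quote}")
```

Clicking a facet in the player would then amount to calling `highlights_for` and seeking to each `start_s`.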
9. The Narrative Method: AI-resistant by design
The Stori interview uses what we call the Narrative Method. Rather than asking behavioral questions that invite rehearsed answers, the interview asks candidates to tell their story in the context of specific, verifiable details:
- Names and relationships — “Who was your manager? Tell me about them.” AI cannot fabricate authentic relationship dynamics.
- Rankings and comparisons — “Rank your last three roles by how much you learned.” These force genuine reflection, not rehearsed narratives.
- Context and consequence — “What happened after that?” Follow-up probes test depth. Surface-level answers are flagged automatically.
The AI interviewer has a probe-or-proceed protocol: if an answer lacks specifics, it asks follow-up questions. If the candidate provides concrete detail, it moves on. This ensures every transcript contains genuine behavioral data, not rehearsed highlights.
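The probe-or-proceed decision can be sketched as a small loop. Everything specific here is an assumption made for illustration: the `has_specifics` heuristic (a digit, or a capitalized word in mid-sentence as a stand-in for a name), the cap of two probes per question, and the wording of the follow-up. The paper does not describe how the interviewer actually judges specificity.

```python
import re

FOLLOW_UP = "Can you give a specific example?"  # hypothetical probe wording

def has_specifics(answer):
    """Crude stand-in for a specificity check: a number or a mid-sentence name."""
    has_number = re.search(r"\d", answer)
    has_name = re.search(r"(?<=[a-z,] )[A-Z][a-z]+", answer)
    return bool(has_number or has_name)

def next_action(answer, probes_used, max_probes=2):
    """Proceed on concrete detail, or once the probe budget is spent."""
    if has_specifics(answer) or probes_used >= max_probes:
        return "proceed"
    return "probe"

def run_question(question, answers, max_probes=2):
    """Walk scripted answers and return the (prompt, answer) turns taken."""
    transcript = []
    prompt = question
    for probes_used, answer in enumerate(answers):
        transcript.append((prompt, answer))
        if next_action(answer, probes_used, max_probes) == "proceed":
            break
        prompt = FOLLOW_UP
    return transcript

turns = run_question(
    "Who was your manager? Tell me about them.",
    [
        "We worked hard and it went well.",
        "My manager was Priya, and together we cut churn by 12 percent.",
        "This answer is never reached.",
    ],
)
for prompt, answer in turns:
    print(prompt, "->", answer)
```

In this run the first answer triggers a probe and the second ends the question, so the third scripted answer is never used.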
10. Psychometric reliability
The STA is designed to meet reliability benchmarks consistent with established forced-choice personality instruments.
These benchmarks will be refined as assessment data scales and norming populations are established. The AI narrative layer operates under strict constraints: it can polish phrasing but cannot modify any numeric score. All computations are deterministic, logged, and auditable.
11. Operational guardrails
The STA enforces a strict separation between data and narrative:
- Scores are deterministic. The AI cannot change, round, or reinterpret any numeric output. T-scores, percentiles, and banding are computed by fixed algorithms.
- Narrative generation follows templates. Each facet-band combination has a pre-written deterministic narrative. The AI polishes language but cannot alter meaning.
- PII is masked. No personally identifiable information is sent to AI providers for narrative generation.
- Every run is logged. Timestamps, model versions, and run IDs create a complete audit trail.
- Fallback is always deterministic. If the AI fails validation twice, the system falls back to raw deterministic text with no AI involvement.
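The validate-then-fall-back behavior described in these guardrails can be sketched as follows. The names are hypothetical: `polish` stands in for the call to an AI provider, and `numbers_preserved` is a deliberately narrow check that only confirms the numeric tokens are unchanged. The paper does not specify what the real validation covers.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")

def numbers_preserved(template_text, polished_text):
    """True if the polished text carries the same numbers in the same order."""
    return NUMBER.findall(template_text) == NUMBER.findall(polished_text)

def render_narrative(template_text, polish, max_attempts=2, log=None):
    """Try AI polishing; fall back to the deterministic template on failure."""
    for attempt in range(1, max_attempts + 1):
        candidate = polish(template_text)
        valid = numbers_preserved(template_text, candidate)
        if log is not None:
            log.append({"attempt": attempt, "valid": valid})
        if valid:
            return candidate
    if log is not None:
        log.append({"fallback": True})
    return template_text  # raw deterministic text, no AI involvement

template = "Curiosity: T-score 59, 82nd percentile (Moderate-High)."

def good_polish(text):
    return text.replace("Curiosity:", "On curiosity,")

def bad_polish(text):
    return text.replace("82", "85")  # alters a score, so it must be rejected

audit = []
accepted = render_narrative(template, good_polish, log=audit)
rejected = render_narrative(template, bad_polish, log=audit)
print(accepted)
print(rejected)
print(audit)
```

A fuller version would also record the model version and a run ID with each log entry, as the audit-trail guardrail describes.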
12. A unified language for understanding people
The Stori Trait Assessment is not another hiring tool bolted onto the same broken process. It is a new measurement system designed from first principles:
- A forced-choice assessment that reduces the biases of traditional self-report.
- A dual-output framework that produces both applied Stori Traits and Big Five personality scores from one experience.
- A Narrative Method interview designed to surface authentic behavior, not rehearsed performance.
- Interview highlights that tag behavioral evidence to the same facet language as the trait assessment.
- A unified report that cross-references personality data with demonstrated behavior, so every score has evidence behind it.
The result is a candidate intelligence layer where traits, interviews, and evidence speak the same language — and where hiring decisions are based on measurement, not proxies.
Lexical base from the International Personality Item Pool (IPIP) — public domain. Scoring, dual-output mapping, interview intelligence, and visualization architecture are patent pending. © 2025 Stori Labs / Fraser Hill. All rights reserved.
This assessment is an interpretive tool grounded in established psychometric frameworks. It is not a clinical or diagnostic instrument.
