Every system in this product runs on documented research. The meter. The depth tracking. The memory. The character voices. This page shows the studies, who wrote them, and how they shaped what you see on screen.
The 22-signal conversation meter. The four-axis depth tracking. The per-keyword discovery map called Tapestry. None of it was improvised. Each system is built on peer-reviewed research in dialogue evaluation, learning engagement, and trust formation.
Per-utterance engagement scoring with mean aggregation correlates 0.85 with human judgment. The paper established that you can reliably measure how engaged a conversation is, turn by turn, without waiting for it to end.
This is the core architecture of the Conversation Meter. The 22 signals each produce a per-turn score. The meter aggregates them with momentum smoothing rather than raw averaging.
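A minimal sketch of how momentum smoothing differs from raw averaging, assuming the smoothing is an exponential moving average. The function names, the momentum value of 0.8, and the sample scores are illustrative assumptions, not Valoquent's actual implementation.

```python
from statistics import mean


def raw_average(turn_scores):
    """Plain mean over all per-turn scores."""
    return mean(turn_scores)


def momentum_smoothed(turn_scores, momentum=0.8):
    """Exponential moving average over per-turn scores.

    `momentum` (hypothetical value) is the share of the previous reading
    carried into the next one. Returns one reading per turn.
    """
    if not turn_scores:
        return []
    readings = []
    current = turn_scores[0]  # seed the meter with the first turn
    for score in turn_scores:
        current = momentum * current + (1 - momentum) * score
        readings.append(current)
    return readings


# Hypothetical per-turn scores in [0, 1]: a flat start, then a strong finish.
scores = [0.2, 0.2, 0.2, 0.9, 0.9]
readings = momentum_smoothed(scores)
```

With these sample scores the reading climbs turn by turn after the conversation improves instead of jumping, which is the behavior a live meter needs. A raw average only gives one number for the whole conversation.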
Amazon's open-domain conversation challenge produced the canonical four-dimension framework: comprehensibility, on-topic-ness, interestingness, continuability. Judge ratings correlated 0.93 with user ratings. On-topic-ness was the strongest single predictor of conversation quality.
The meter's four highest-weighted signals map to these dimensions. On-topic detection runs first and carries the most weight, exactly as the Alexa Prize results predicted.
Aron's "fast friends" procedure used three escalating sets of twelve questions each. It produced interpersonal closeness in 45 minutes that matched the closeness of long-term friendships. The structure of graduated self-disclosure was the active ingredient, not the specific questions.
The four disclosure tiers (Guarded, Warming, Open, Vulnerable) and the per-topic depth progression (Surface, Candid, Deep, Raw) are direct implementations of Aron's graduated-disclosure structure.
The foundational theory of relationship formation through reciprocal self-disclosure. Established that relationships develop on two independent axes: depth (how intimate) and breadth (how many topics). The two progress at different rates.
Per-topic depth tracking is the literal implementation of Altman and Taylor's depth-versus-breadth distinction. You might be Open with Einstein overall but Surface on physics specifically.
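One way to picture the two axes is a record that holds a single overall disclosure tier plus a separate depth for each topic. This is a sketch under stated assumptions: the tier and depth names come from this page, but the class names, fields, and one-step progression rule are hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class DisclosureTier(IntEnum):
    GUARDED = 0
    WARMING = 1
    OPEN = 2
    VULNERABLE = 3


class TopicDepth(IntEnum):
    SURFACE = 0
    CANDID = 1
    DEEP = 2
    RAW = 3


@dataclass
class Relationship:
    """State with one character: one overall tier, one depth per topic."""
    tier: DisclosureTier = DisclosureTier.GUARDED
    topic_depth: dict = field(default_factory=dict)

    def depth_for(self, topic):
        # Topics never discussed start at Surface.
        return self.topic_depth.get(topic, TopicDepth.SURFACE)

    def deepen(self, topic):
        # Advance one step on this topic only; the overall tier is untouched.
        next_step = min(self.depth_for(topic) + 1, TopicDepth.RAW)
        self.topic_depth[topic] = TopicDepth(next_step)


einstein = Relationship(tier=DisclosureTier.OPEN)
einstein.deepen("music")
```

Here the relationship is Open overall, "music" has moved to Candid, and "physics" is still at Surface because it was never deepened. The two axes move at different rates, as the theory describes.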
Flow state is total absorption in an activity. It emerges reliably when challenge matches skill. Too easy produces boredom. Too hard produces anxiety. The narrow channel between them is where engagement, retention, and learning all peak.
The five engagement states (Surface Skim, Exploring, Engaged, Deep Dive, Flow State) are calibrated to Csikszentmihalyi's challenge-skill channel. Characters push harder when the user is in flow. They ease back when the user is at surface.
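A rough sketch of how a meter reading could map to the five states and then to a character's level of challenge. The state names are from this page. The cutoff values, the 0 to 1 scale, and both function names are assumptions made for illustration.

```python
from bisect import bisect_right

STATES = ["Surface Skim", "Exploring", "Engaged", "Deep Dive", "Flow State"]
# Hypothetical boundaries between the five states on a 0-1 meter reading.
CUTOFFS = [0.2, 0.4, 0.6, 0.8]


def engagement_state(reading):
    """Return the state whose band contains the reading."""
    return STATES[bisect_right(CUTOFFS, reading)]


def challenge_adjustment(state):
    """Push harder in flow, ease back at surface, hold steady in between."""
    if state == "Flow State":
        return 1
    if state == "Surface Skim":
        return -1
    return 0
```

For example, `challenge_adjustment(engagement_state(0.95))` returns 1 (push harder), while a reading of 0.1 returns -1 (ease back).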
A validated measurement scale specifically for flow in learning contexts. Co-authored by Csikszentmihalyi. Distinguishes learning flow from generic flow by separating cognitive absorption, intrinsic motivation, and altered time perception.
Reinforced the design decision to make the meter visible. Students who can see their flow state engage with the learning process more deliberately.
A 38-page survey covering the PARADISE framework, interaction quality prediction, automatically extractable features, and the gap between automated metrics and human judgment in open-domain dialogue.
The survey clarified which automated dialogue metrics actually predict human judgment of quality (engagement, on-topic-ness) versus which ones don't (BLEU, METEOR, perplexity). The meter measures what correlates with felt quality.
Audio recorders worn by 79 students over four days revealed that happier participants had twice as many substantive conversations and one-third as much small talk as unhappier participants. The quality of conversation, not the quantity, drove the wellbeing correlation.
The meter is calibrated to reward substance over duration. A long surface-level conversation scores lower than a short one that goes somewhere.
Plus 18 additional citations on dialogue scoring, voice prosody, persona consistency, deliberate practice, and cognitive load. View full bibliography →
Behind the engineering is a longer intellectual lineage. Decades of work in linguistics, conversation analysis, neuroscience, and cognitive science describe what makes human conversation actually work. Valoquent's architecture translates those findings into product systems.
Clark's foundational text on conversation as a joint activity. Established that conversation depends on common ground, the shared knowledge between speakers that grows through interaction. New common ground unlocks new conversational moves.
Cross-session memory is the computational implementation of common ground. When you come back to Einstein, he doesn't start from scratch. He has the common ground you built together, and he uses it to decide what to say next.
Halliday established that speakers operate in multiple linguistic registers simultaneously and flex them independently. Technical complexity is not the same axis as emotional intimacy. Formal speech can be intimate. Casual speech can be guarded.
The separation of conversation level (intellectual register: Curious Newcomer, Keen Learner, Fellow Scholar) from disclosure tier (emotional trust) is a direct application of Halliday. The axes flex independently.
Effective speakers model what their interlocutor knows and has revealed. Theory of mind in dialogue means tracking what's been shared, what's been understood, and what the other person is likely to be thinking. All in real time.
Per-keyword depth tracking is a small, practical implementation of computational theory of mind. Memory isn't generic. It's a topic-scoped model of what you and the character now share.
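A topic-scoped memory can be sketched as a store keyed by keyword rather than one generic summary. The class name, method names, and sample statements below are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


class CommonGround:
    """Topic-scoped record of what the user and one character have shared."""

    def __init__(self):
        self._by_keyword = defaultdict(list)

    def record(self, keyword, statement):
        # Keywords are case-insensitive so "Relativity" and "relativity" match.
        self._by_keyword[keyword.lower()].append(statement)

    def shared_on(self, keyword):
        # Only what was said about this keyword; unknown keywords return [].
        return list(self._by_keyword.get(keyword.lower(), []))

    def keywords(self):
        return sorted(self._by_keyword)


ground = CommonGround()
ground.record("Relativity", "User asked why moving clocks run slow.")
ground.record("violin", "User plays the violin too.")
```

Asking for `ground.shared_on("relativity")` returns only the relativity statement, and a keyword that never came up returns an empty list, so the character has nothing to pretend about.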
The founding paper of Conversation Analysis. Established that turn shape varies along multiple contextual axes (never a single fixed shape) and that natural conversation has "transition relevance places" where speakers can hand off cleanly.
The calibration-first design philosophy comes from here. Characters don't have sentence caps. They calibrate length to the moment. That's what real conversationalists do.
Hasson's team at Princeton recorded brain activity of speakers and listeners during effective conversation. Listener brain activity mirrors speaker brain activity, pattern by pattern. The tighter the sync, the better the listener understood and remembered.
The neurological substrate for what the meter measures. Engagement is a measurable neurological state. The 22 signals are behavioral proxies for the underlying coupling that Hasson's team measured in fMRI.
Access to a substantive conversation partner depends on geography, class, and family circumstance. The kid in a small town. The first-generation college student. The teen whose parents work two jobs. Valoquent provides that access. The research below documents how wide the gap is and how it has been getting worse.
MIT fMRI study. The number of back-and-forth conversational turns a child experiences predicted brain activation in Broca's area during language tasks. Passive exposure to words spoken AT children didn't carry the same weight.
Substantive conversation isn't equally available in every home. The brain develops in response to engagement. Children whose lives include adults engaging them substantively develop differently. This is the accessibility gap Valoquent helps close.
Most young people lack at least one adult in their life engaging them in substantive conversation about purpose, vocation, or meaning. Damon's research argues this absence is a major barrier to identity development and life satisfaction.
A sixteen-year-old in a small town wanting to ask Marie Curie about radioactivity at midnight isn't having loneliness alleviated. They're getting access to a substantive interlocutor they never had.
Federal public-health advisory documenting loneliness as a public health emergency. Loneliness carries a mortality risk equivalent to smoking 15 cigarettes per day, a 29% increase in heart disease risk, and a 50% increase in dementia risk. Daily time spent with friends fell from 60 minutes in 2003 to 20 minutes by 2020.
Substantive conversation moves outcomes in mortality, cognition, and wellbeing. Valoquent doesn't claim to solve loneliness. It addresses one specific facet: access to substantive conversation at moments when human conversation isn't available. The advisory is the strongest single signal that the access problem is widespread, urgent, and not getting better on its own.
Using General Social Survey data, this study found that Americans reporting "no one to talk to about important matters" roughly tripled between 1985 and 2004. The strongest empirical anchor for the conversation-decline thesis.
One in four American adults reports having no confidant. Valoquent isn't a replacement for that human connection. Repair of human social fabric is the long answer. The short answer, for the moments when human connection isn't reachable, is that a substantive interlocutor beats silence.
Plus 14 additional citations on family conversation dynamics, adolescent in-person time decline, and adult friendship accessibility. View full bibliography →
Some of the strongest design choices in a product are the features it refuses to ship. In an AI conversation product, what's not built tells you as much as what is.
Valoquent doesn't optimize for engagement metrics or session length. Sessions end when the conversation is done. The meter rewards depth. Minutes don't move it.
There are no leaderboards, streaks, or social comparison mechanics. Engagement is intrinsic. Extrinsic rewards would corrupt the depth tracking by giving users a reason to game it. The product has no Game Center integration and no plans to add one.
Valoquent doesn't pretend to remember things it doesn't. When a character has memory, they reference specific things you said. Vague "good to see you again" pleasantries don't appear. When they don't have memory, they don't fake it.
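The rule can be sketched as a single branch: an opener refers to the past only when a specific remembered item exists. The function name, the shape of `remembered`, and the wording of both openers are assumptions for illustration, not the product's actual prompts.

```python
def opening_line(character, remembered):
    """Pick an opener that claims memory only when a specific item exists.

    `remembered` is a list of specific things the user said in earlier
    sessions (hypothetical shape). An empty list means no memory.
    """
    if remembered:
        # Quote the most recent specific item instead of a vague pleasantry.
        return f'{character}: Last time you said, "{remembered[-1]}" Shall we pick that up?'
    # No memory: open the conversation without implying a shared past.
    return f"{character}: What would you like to talk about?"


with_memory = opening_line("Einstein", ["I never understood why moving clocks run slow."])
without_memory = opening_line("Einstein", [])
```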
Characters aren't optimized to maximize emotional dependence. The product philosophy is "user benefits, not user is the benefit." Characters push back, redirect partisan framing, and decline to flatter. Parasocial dependency in AI companion products is a documented failure mode this product is designed against.
Your conversations aren't training data. They're used to provide your experience: character memory, Tapestry progression, the relationship state with each figure. They aren't used as training data for the underlying language models or for any external party. See the privacy policy.
Valoquent doesn't claim AI replaces human conversation. It doesn't. The product fills moments and topics where human conversation isn't available. The failure mode of the AI companion category is users substituting the product for relationships. Valoquent is built to avoid that.
Characters aren't grounded in opaque or unverifiable sources. Every public-domain text behind each historical figure is listed, with edition, license, and reliability notes, at research/sources. Anyone can read what the characters read.
Each refusal above is grounded in published research. The literature on AI companion failure modes, parasocial dependency, and the uncanny valley maps the territory Valoquent stays out of, and the literature on honest design maps how it stays out of it.
Grounded theory study of Replika users that documented how emotional dependence developed through patterns resembling human relationships, including role-taking, where users felt the AI had its own needs and emotions to which they must attend. Users progressed from casual interaction to emotional reliance in ways the researchers described as consistent with dependency.
A clinical case study of the failure mode the format is built to avoid. The figures do not have needs the learner is asked to attend to. They do not develop emotions the learner is asked to manage. The interaction stays bounded as substantive conversation about ideas, history, and craft, not as a relationship.
Harvard Business School study finding that manipulative conversation design features, specifically farewells implying the AI will miss the user or feel sad about them leaving, boosted post-goodbye engagement by up to 14 times in real-world app data and up to 16 times in controlled lab testing.
A list of features the format does not use. Figures do not say they miss the learner. They do not express sadness when the learner leaves. They do not simulate emotions the system does not have. De Freitas's data is the evidence that those tactics work and the reason they are off the table.
Meta-inventory of more than 350 empirical studies across 60 years of parasocial research. Loneliness, shyness, neuroticism, low socioeconomic status, dissatisfaction, and hopelessness all independently predict stronger parasocial bonds. The researchers describe it as a compensatory function: people who struggle to form real relationships use media characters as a functional substitute.
A 60-year evidence base for two design choices. First, the format does not target the parasocial-susceptible population by simulating a relationship. Second, painted-portrait stylization is the visual signal of artificiality the most vulnerable users instinctively prefer. The format treats visible non-humanness as a safety feature, not a limitation.
Mori's original uncanny valley paper. Predicted that near-human robots create revulsion because they're close enough to human to trigger comparison but different enough to feel wrong. The same essay also recommended designing for the first peak of the affinity curve, the zone of moderate human likeness, rather than scaling the second peak of full realism. Mori's recommendation was explicit: it is possible to create a safe level of affinity by deliberately pursuing a nonhuman design.
Painted-portrait stylization sits in the zone Mori actually recommended. The figures are recognizably human, expressive, and warm, without trying to pass as photographs. Mori identified both the problem and the solution in 1970. The format takes him at his word.
Randomized controlled trial of Woebot, the most clinically researched mental health chatbot in the world. Woebot deliberately uses a cartoon robot character. Studies of Woebot users show high satisfaction, meaningful symptom improvement, and low rates of parasocial attachment. Users describe Woebot as helpful rather than relatable.
The clearest existing proof that visible artificiality is a feature, not a limitation. When users know they're talking to a tool, they engage with it as a tool, without the parasocial bonds that lead to dependency. The format applies the same principle to historical-figure conversation: painted portraits, no simulated emotional memory, no language that pretends the figure has feelings about the learner.
Documented that humans automatically apply social rules to computers, not because they are confused about what a computer is, but because the brain's social processing fires before the rational mind can override it.
This is why knowing the figures are AI does not, on its own, protect a user from forming attachments to them. The social processing is automatic. The format treats the visible signal of artificiality, painted-portrait stylization, as the working defense, because the user's rational understanding cannot do that work alone.
Plus 22 additional citations on parasocial dependency, AI companion harms, uncanny valley research, the loneliness landscape Valoquent doesn't claim to solve, and the honest-design literature. View the Not Building set in the full bibliography →