All of our work, research included, focuses on building digital twins—AI companions that faithfully mirror each person’s individuality and can represent them accurately and authentically at scale. Below are the research efforts we are currently undertaking:

1. Continuous learning for digital twins

Continuous learning is the lever that turns a static chat interface into a living twin. Every interaction—whether the twin books a flight, drafts a tweet, or nudges its human toward a long‑term goal—produces feedback that should refine future behaviour. Yet the feedback landscape is messy: some rewards arrive instantly and objectively, others are subjective and judge‑mediated, and the most valuable ones may not materialise for months. Designing a learning loop that thrives across all three regimes is therefore central to self‑evolving twins.

Reward regimes

| Regime | Description | Example digital‑twin tasks | Open challenges |
| --- | --- | --- | --- |
| Clear rewards | Success is binary, verifiable, and arrives within seconds or minutes. Classic RL settings already handle these well. | Completing a ticket purchase once the confirmation code matches airline records. | On‑device, low‑latency fine‑tuning without drifting from safe defaults; constraining the action space to prevent catastrophic exploration. |
| Fuzzy rewards | Outcomes are subjective; the signal often comes from an LLM judge or an explicit human rating. | Writing a tweet in the user’s voice or composing an empathic apology. | Aligning LLM‑judge scores with true user preferences (RePalm); preventing reward hacking and style drift over time (paper); fusing dense judge feedback with sparse likes or emoji reactions (paper). |
| Extremely slow rewards | Pay‑offs unfold over weeks or months and may conflict with short‑term comfort. | Guiding weight loss, improving savings, or nurturing relationships. | Reconciling local (immediate) vs. global (long‑term) value functions (paper); learning safely when successful trajectories are rare. |
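Concretely, a learning loop needs to know which regime a feedback event belongs to before it can route the event to the right learner. The sketch below tags events with a regime from their latency and verifiability; the field names, thresholds, and `classify` heuristic are illustrative assumptions, not a fixed design.

```python
# Minimal sketch: tag feedback events by reward regime so the learning loop can
# route each one to a different learner. Regime names mirror the table above;
# the fields and thresholds are illustrative assumptions.
from dataclasses import dataclass
from enum import Enum, auto

class RewardRegime(Enum):
    CLEAR = auto()   # binary, verifiable, arrives in seconds or minutes
    FUZZY = auto()   # subjective, judge- or rating-mediated
    SLOW = auto()    # pays off over weeks or months

@dataclass
class FeedbackEvent:
    task: str        # e.g., "book_flight", "draft_tweet", "weight_loss_nudge"
    reward: float
    latency_s: float # seconds between action and reward arrival
    verifiable: bool # can the outcome be checked against ground truth?

def classify(event: FeedbackEvent) -> RewardRegime:
    # Illustrative routing heuristic: verifiable and near-instant -> CLEAR;
    # judge- or rating-mediated within days -> FUZZY; everything slower -> SLOW.
    if event.verifiable and event.latency_s < 600:
        return RewardRegime.CLEAR
    if event.latency_s < 7 * 24 * 3600:
        return RewardRegime.FUZZY
    return RewardRegime.SLOW

print(classify(FeedbackEvent("book_flight", 1.0, latency_s=12.0, verifiable=True)))
```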

Research directions

We are in the early stages of developing a continuous learning pipeline that would need to encompass:

  1. Fast in‑session adaptation: Use tiny adapter/LoRA heads that update from clear rewards in real time, letting the twin refine routine skills on‑device without full retraining (see the first sketch after this list).
  2. LLM‑judge preference loops: Batch fuzzy‑reward data and run offline RLHF with calibrated LLM evaluators, extending mixture‑of‑judges work to detect bias and drift (see the second sketch below).
  3. Hierarchical, multi‑timescale RL: Combine daily proxies (step count, calendar compliance) with monthly targets (BMI delta, net‑worth change) under a multiscale controller, with Bayesian credit‑assignment techniques tracking which micro‑actions move the global needle (see the third sketch below).
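To make direction 1 concrete, the sketch below pairs a frozen base layer with a small trainable LoRA‑style adapter and applies one REINFORCE step whenever a clear binary reward arrives. The module names, sizes, and learning rate are illustrative assumptions, not a production recipe.

```python
# Minimal sketch: a frozen base linear layer with a trainable low-rank (LoRA-style)
# adapter, updated online from a binary "clear" reward via REINFORCE.
# All names and hyperparameters here are illustrative, not a production API.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, rank: int = 4):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)  # base weights stay frozen
        self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # low-rank factors
        self.B = nn.Parameter(torch.zeros(d_out, rank))  # zero init: no drift at start

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ self.A.T @ self.B.T

policy = LoRALinear(d_in=16, d_out=4)  # 4 routine actions, toy sizes
opt = torch.optim.SGD([policy.A, policy.B], lr=1e-2)

def online_update(state: torch.Tensor, clear_reward: float) -> None:
    """One in-session REINFORCE step from a binary, immediately verifiable reward."""
    logits = policy(state)
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()
    loss = -clear_reward * dist.log_prob(action)  # reward in {0, 1}: reinforce successes only
    opt.zero_grad()
    loss.backward()
    opt.step()

online_update(torch.randn(16), clear_reward=1.0)  # e.g., ticket purchase confirmed
```

Freezing the base weights and updating only the low-rank factors keeps the update cheap enough for on-device use and bounds how far the policy can drift from its safe defaults.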
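For direction 2, a minimal calibration sketch: fit an affine map from LLM‑judge scores to the sparse explicit user ratings that accompany some of them, then use the calibrated scores as offline reward labels. The closed‑form least‑squares fit and toy data are illustrative; a real pipeline would also model per‑judge bias and drift over time.

```python
# Minimal sketch: calibrate LLM-judge scores against sparse explicit user ratings
# with a closed-form least-squares affine fit. Toy data; illustrative only.
def calibrate(judge_scores: list[float], user_ratings: list[float]) -> tuple[float, float]:
    """Least-squares fit of user_rating ~ a * judge_score + b."""
    n = len(judge_scores)
    mx = sum(judge_scores) / n
    my = sum(user_ratings) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(judge_scores, user_ratings))
    var = sum((x - mx) ** 2 for x in judge_scores)
    a = cov / var
    b = my - a * mx
    return a, b

# Paired examples where both a judge score and an explicit user rating exist.
a, b = calibrate([0.9, 0.7, 0.4, 0.8], [0.6, 0.5, 0.2, 0.7])
print(f"calibrated reward for judge score 0.75: {a * 0.75 + b:.2f}")
```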
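For direction 3, the sketch below blends dense daily proxies with a sparse monthly outcome under two discount factors, one fast and one slow. The weights, discount rates, and 30‑day horizon are illustrative assumptions that a real controller would learn rather than hard‑code.

```python
# Minimal sketch of a multi-timescale return: dense daily proxies (step count,
# calendar compliance) are discounted quickly, while a sparse monthly outcome
# (e.g., BMI delta) is discounted slowly. All weights are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class DailyProxy:
    day: int
    reward: float  # e.g., normalized step count for that day

def multiscale_return(proxies: list[DailyProxy],
                      monthly_outcome: float,    # e.g., -0.5 BMI delta => +0.5 reward
                      gamma_fast: float = 0.9,   # proxy credit fades within days
                      gamma_slow: float = 0.995, # long-horizon credit decays slowly
                      w_proxy: float = 0.3,
                      w_outcome: float = 1.0) -> float:
    """Blend fast proxy credit with slow outcome credit into one scalar return."""
    fast = sum(gamma_fast ** p.day * p.reward for p in proxies)
    slow = gamma_slow ** 30 * monthly_outcome  # outcome lands at day 30
    return w_proxy * fast + w_outcome * slow

days = [DailyProxy(day=d, reward=0.8) for d in range(30)]
print(multiscale_return(days, monthly_outcome=0.5))
```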

Continuous learning across these regimes remains a frontier. Solving it will let digital twins not just react to their humans but grow with them—improving competence, style alignment, and long‑term stewardship with every interaction.

2. Memory architecture for digital twins

Foundation models supply general intelligence, yet every human carries a rich, idiosyncratic history; a twin must ingest, store, and reason over that lifetime of memories at least as well as the human does. Unlike standard QA tasks that explicitly ask for a fact, a human-like AI twin should recall details naturally over the course of interactions. Memory is arguably the foundation upon which a convincing digital twin must be built—without it, the twin cannot maintain consistency with past interactions or demonstrate the rich personal history that defines a human individual.

Research directions

Building a good digital twin would require developing a memory architecture that addresses multiple dimensions of human-like memory functionality. We are in the early stages of exploring a multi-layered memory framework that would need to encompass:

  1. Multiple query types: A foundational layer ensuring the twin maintains basic factual consistency about the individual it represents, accurately recalling specific facts shared by or about the human at various levels of granularity:
    1. Local query handling: Support for answering queries scoped to conversations with a single entity.
    2. Global query handling: Support for answering queries that require reasoning over trends across conversations and memories involving multiple entities.
    3. Episodic query support: Support for answering time-, space-, entity-, and content-based queries, remembering personal events with their temporal and spatial context, drawing inspiration from the Episodic Memories Benchmark [Hwang et al., 2024]. This requires both cue-based recall (retrieving details from partial information) and chronological reasoning (understanding the order of and relationships between events); a minimal sketch follows this list.
  2. Iterative data ingestion: A mechanism for rapidly ingesting new data incrementally, without restructuring a large fraction of the existing structured memory (also covered in the sketch after this list).
  3. Proactive memory application: Perhaps the most challenging aspect—developing systems that volunteer relevant past information at contextually appropriate moments and weave it naturally into current interactions, mimicking human memory's associative retrieval patterns (see the second sketch below).
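As a concrete starting point for episodic queries and iterative ingestion (items 1.3 and 2 above), here is a minimal sketch of an episodic store indexed by entity and place. It supports cue-based recall from partial cues, returns results in chronological order, and ingests new records by touching only the affected index buckets. The record fields and class names are illustrative assumptions.

```python
# Minimal sketch of an episodic memory store with time, place, entity, and content
# fields. Supports cue-based recall, chronological ordering, and incremental
# ingestion. All names are illustrative assumptions.
from dataclasses import dataclass
from collections import defaultdict
from datetime import datetime
from typing import Optional

@dataclass
class Episode:
    when: datetime
    where: str
    entities: frozenset[str]
    content: str

class EpisodicStore:
    def __init__(self):
        self.episodes: list[Episode] = []
        self.by_entity: dict[str, list[int]] = defaultdict(list)
        self.by_place: dict[str, list[int]] = defaultdict(list)

    def ingest(self, ep: Episode) -> None:
        """Incremental ingestion: append and update only the relevant index buckets."""
        idx = len(self.episodes)
        self.episodes.append(ep)
        for e in ep.entities:
            self.by_entity[e].append(idx)
        self.by_place[ep.where].append(idx)

    def recall(self, entity: Optional[str] = None, place: Optional[str] = None) -> list[Episode]:
        """Cue-based recall: retrieve episodes matching whatever partial cues are given."""
        idxs = set(range(len(self.episodes)))
        if entity is not None:
            idxs &= set(self.by_entity.get(entity, []))
        if place is not None:
            idxs &= set(self.by_place.get(place, []))
        return sorted((self.episodes[i] for i in idxs), key=lambda ep: ep.when)

store = EpisodicStore()
store.ingest(Episode(datetime(2024, 5, 1), "Lisbon", frozenset({"Ana"}), "dinner at the harbor"))
store.ingest(Episode(datetime(2024, 6, 3), "Lisbon", frozenset({"Ben"}), "conference keynote"))
print([ep.content for ep in store.recall(place="Lisbon")])  # chronological order
```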
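For proactive memory application (item 3), the sketch below scores stored memories against the current conversational context and volunteers anything above a similarity threshold. A production system would use learned embeddings and a learned trigger; the bag-of-words cosine and the 0.3 threshold are placeholder assumptions.

```python
# Minimal sketch of proactive memory surfacing: score stored memories against the
# current context with a bag-of-words cosine similarity and volunteer any memory
# above a threshold. Scoring scheme and threshold are illustrative assumptions.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def proactive_recall(context: str, memories: list[str], threshold: float = 0.3) -> list[str]:
    """Volunteer memories whose similarity to the current context exceeds the threshold."""
    ctx = Counter(context.lower().split())
    return [m for m in memories if cosine(ctx, Counter(m.lower().split())) >= threshold]

memories = ["user ran a marathon in Berlin last fall", "user has a severe peanut allergy"]
print(proactive_recall("planning a dinner menu with peanut sauce", memories))
```

Here the allergy memory surfaces unprompted because it overlaps with the dinner-planning context, while the marathon memory stays below threshold, which is the associative, unasked-for retrieval pattern item 3 describes.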

Understanding and implementing these memory capabilities represents a key frontier in the development of authentic digital twins—one that must be addressed before twins can effectively coordinate resources, inform decision-making, and participate in governance at scale.

3. Verifiable MCP servers