Mars Genesis vs MiroFish: Two Approaches to Multi-Agent Simulation

12 min read
Johnny Dunn
Creator of AgentOS

Multi-agent simulation splits into two schools: predict the real world, or generate emergent worlds that never existed. MiroFish (54k GitHub stars) builds parallel digital worlds from real-world seed data to forecast outcomes. Mars Genesis (built with AgentOS) creates a deterministic Mars colony where two AI commanders with distinct personalities face emergent crises, forge computational tools at runtime, and produce measurably different civilizations from identical starting conditions.

This post breaks down how each system works at the architecture level, where they diverge in design philosophy, and what builders can learn from both.

Figure: Mars Genesis dashboard. Two AI commanders run side by side and face different emergent crises. Left column (Visionary) chose a risky exterior repair during a dust storm. Right column (Engineer) landed conservatively at Arcadia Planitia. Department pills, forged tools, colonist voice reactions, and the divergence rail are visible.


The Core Question Each System Answers

MiroFish asks: "Given this real-world information, what happens next?"

You upload a news article, policy draft, or financial signal. MiroFish extracts entities and relationships into a knowledge graph (Zep Cloud GraphRAG), generates agent profiles with personalities and backstories, then runs a social media simulation on Twitter and Reddit replicas powered by OASIS. Thousands of agents post, reply, like, and argue. A ReportAgent with retrieval tools synthesizes the results into a prediction report.

Mars Genesis asks: "How does leadership personality shape civilization?"

You configure two commanders with HEXACO personality profiles (six-factor model from psychology research). Both start with the same 100 colonists, same resources, same deterministic seed. An AI Crisis Director generates unique crises per timeline based on colony state. Department agents analyze each crisis, forge computational tools (radiation dose calculators, food security models), and the commander decides. The deterministic kernel applies bounded numerical effects. Five turns later, the two colonies have diverged in population, morale, infrastructure, and political structure.

The difference is not cosmetic. It reflects a fundamental architectural split in how each system handles truth, agency, and emergence.

Architecture: Who Owns Truth?

MiroFish: The Graph Owns Truth

MiroFish's architecture has five stages:

  1. Graph Building: Seed text is chunked and fed to Zep Cloud to build a knowledge graph. Entities (people, organizations, events) and their relationships become the simulation's ground truth.

  2. Environment Setup: An OasisProfileGenerator converts graph entities into agent profiles with personality (MBTI-based), demographics, social metrics (karma, followers), and behavioral instructions. Each agent gets a persona field: a paragraph-length character description generated by an LLM from the entity's graph context.

  3. Simulation: OASIS runs parallel Twitter and Reddit simulations. Agents take actions each round: create posts, like, reply, retweet, follow. Activity levels vary by simulated time of day (the codebase includes a CHINA_TIMEZONE_CONFIG with hourly activity multipliers). A SimulationIPCClient uses file-system IPC (commands written to disk, polled by the simulation process) to coordinate between the Flask backend and the OASIS subprocess.

  4. Report Generation: A ReportAgent using ReAct-style reasoning queries the post-simulation graph with three retrieval tools: InsightForge (deep multi-query retrieval), PanoramaSearch (breadth search including expired content), and QuickSearch. It generates a structured prediction report.

  5. Deep Interaction: Users can chat with any agent in the simulated world or interrogate the ReportAgent.

The knowledge graph is the single source of truth. Agents' actions update the graph. The simulation runner reads from it. Reports query it.
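
The file-system IPC used in stage 3 can be illustrated with a minimal sketch. This is not MiroFish's SimulationIPCClient: the file naming scheme, the JSON fields, and every function name below are assumptions made for illustration. It only shows the general pattern of writing a command to disk, having the other side pick it up, and polling for a response.

```python
import json
import tempfile
import time
import uuid
from pathlib import Path


def _write_atomically(path: Path, data: dict) -> None:
    """Write to a temp file, then rename, so readers never see partial JSON."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(data))
    tmp.replace(path)


def send_command(ipc_dir: Path, action: str, payload: dict) -> str:
    """Backend side: drop a command file for the simulation process."""
    command_id = uuid.uuid4().hex
    command = {"id": command_id, "action": action, "payload": payload}
    _write_atomically(ipc_dir / f"{command_id}.cmd.json", command)
    return command_id


def process_pending(ipc_dir: Path) -> int:
    """Simulation side: handle each pending command and write a response."""
    handled = 0
    for cmd_path in sorted(ipc_dir.glob("*.cmd.json")):
        command = json.loads(cmd_path.read_text())
        response = {"id": command["id"], "status": "ok", "action": command["action"]}
        _write_atomically(ipc_dir / f"{command['id']}.resp.json", response)
        cmd_path.unlink()
        handled += 1
    return handled


def wait_for_response(ipc_dir: Path, command_id: str,
                      timeout: float = 2.0, interval: float = 0.05) -> dict:
    """Backend side: poll the directory until the response file appears."""
    resp_path = ipc_dir / f"{command_id}.resp.json"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if resp_path.exists():
            return json.loads(resp_path.read_text())
        time.sleep(interval)
    raise TimeoutError(f"no response for command {command_id}")


def demo() -> dict:
    """Round trip inside one process; in practice the two sides are separate processes."""
    with tempfile.TemporaryDirectory() as tmp:
        ipc_dir = Path(tmp)
        command_id = send_command(ipc_dir, "pause", {"round": 3})
        process_pending(ipc_dir)
        return wait_for_response(ipc_dir, command_id)
```

The appeal of this pattern is that it needs no sockets or shared memory, at the cost of polling latency and manual cleanup of stale files.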

Mars Genesis: The Kernel Owns Truth

Mars Genesis has a different separation:

Director crisis → Department analyses → Commander choice
→ Typed policy effect → Kernel progression → Outcome
→ Personality drift → Colonist reactions → Next turn

The deterministic kernel (no LLM calls) owns canonical state: population, births, deaths, aging, bone density loss, radiation accumulation, resource production, and career progression. All of it is driven by a seeded RNG (Mulberry32), so the same seed always produces the same colonist roster and, for identical decisions, the same downstream event stream.
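
To make the seeding idea concrete, here is a Python port of the widely circulated Mulberry32 generator, plus a toy roster built from it. Mars Genesis itself is TypeScript, so this is a sketch of the technique and not the project's code. The make_roster function and its fields are hypothetical.

```python
from typing import Callable, Dict, List

MASK32 = 0xFFFFFFFF  # emulate JavaScript's 32-bit integer arithmetic


def mulberry32(seed: int) -> Callable[[], float]:
    """Return a function producing floats in [0, 1) from a 32-bit seed."""
    state = seed & MASK32

    def next_float() -> float:
        nonlocal state
        state = (state + 0x6D2B79F5) & MASK32
        t = state
        t = ((t ^ (t >> 15)) * (t | 1)) & MASK32
        t ^= (t + ((t ^ (t >> 7)) * (t | 61))) & MASK32
        return (t ^ (t >> 14)) / 4294967296

    return next_float


def make_roster(seed: int, size: int = 5) -> List[Dict[str, int]]:
    """Hypothetical roster: each colonist gets an age in [20, 50)."""
    rand = mulberry32(seed)
    return [{"id": i, "age": 20 + int(rand() * 30)} for i in range(size)]


if __name__ == "__main__":
    # The same seed reproduces the same roster on every run.
    print(make_roster(42) == make_roster(42))
```

Because every draw comes from one seeded stream, replaying a run only requires storing the seed and the decisions, not the full state history.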

AI agents own interpretation: the Crisis Director generates crises from colony state, department agents analyze them and forge tools, the commander decides strategy. Their decisions feed into the kernel as bounded numerical effects (morale shifts, power changes, food reserve adjustments). The kernel applies these deterministically.
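
One way to picture "bounded numerical effects" is a kernel-side function that clamps whatever the LLM layer proposes before touching state. The sketch below is an assumption-heavy illustration in Python: the field names, the bounds in EFFECT_BOUNDS, and the apply_effects function are all hypothetical and do not come from the Mars Genesis codebase.

```python
import math
from typing import Dict

# Hypothetical per-turn limits on how far one decision may move each field.
EFFECT_BOUNDS = {
    "morale": (-0.2, 0.2),
    "power_kw": (-50.0, 50.0),
    "food_months": (-3.0, 3.0),
}


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def apply_effects(state: Dict[str, float], proposed: Dict[str, float]) -> Dict[str, float]:
    """Return a new state with each proposed delta clamped to its bounds."""
    new_state = dict(state)
    for key, delta in proposed.items():
        if key not in EFFECT_BOUNDS:
            continue  # fields the kernel does not know about are ignored
        if not isinstance(delta, (int, float)) or not math.isfinite(delta):
            continue  # malformed values never reach canonical state
        low, high = EFFECT_BOUNDS[key]
        new_state[key] = state[key] + clamp(float(delta), low, high)
    # Morale is additionally kept inside its valid range.
    new_state["morale"] = clamp(new_state["morale"], 0.0, 1.0)
    return new_state
```

With this shape, an agent that proposes a morale change of +5.0 or invents an unknown field cannot corrupt state: the first is clamped to the bound and the second is dropped.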

The LLM-as-judge reviews forged tool code for safety, correctness, determinism, and bounded execution. It does not determine simulation outcomes.

The principle: the host runtime owns truth, the agents own interpretation.

Agent Architecture: Personality at Scale

MiroFish: MBTI + Social Graph Personas

MiroFish agents get their personality from two sources:

  1. Entity extraction: Zep's knowledge graph identifies entities from the seed text. Each entity becomes a potential agent.

  2. LLM-generated personas: The OasisProfileGenerator calls an LLM to generate a detailed character description from the entity's graph context. The persona includes MBTI type, age, gender, profession, interested topics, and a narrative backstory.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class OasisAgentProfile:
    user_id: int
    user_name: str
    name: str
    bio: str
    persona: str  # LLM-generated character description
    age: Optional[int]
    gender: Optional[str]
    mbti: Optional[str]
    country: Optional[str]
    profession: Optional[str]
    interested_topics: List[str]

Agent behavior emerges from the persona prompt combined with OASIS's social media action space (post, reply, like, retweet, follow). Personality does not evolve over time: an agent's MBTI and persona remain fixed throughout the simulation.

Mars Genesis: HEXACO + Drift Forces

Mars Genesis uses the HEXACO model from psychology research, which describes personality as six continuous traits scored from 0 to 1 instead of categorical types:

  • Openness: creativity, willingness to experiment
  • Conscientiousness: discipline, thoroughness
  • Extraversion: sociability, assertiveness
  • Agreeableness: cooperation, trust
  • Emotionality: anxiety, empathy
  • Honesty-Humility: sincerity, fairness

Personality is not static. Three forces cause trait drift each turn:

  1. Leader pull (0.02/turn): promoted colonists' traits converge toward their commander's profile. Grounded in leader-follower alignment research (Van Iddekinge 2023).

  2. Role pull (0.01/turn): department roles activate specific traits. Engineering activates conscientiousness. Psychology activates agreeableness and emotionality. Based on trait activation theory (Tett & Burnett 2003).

  3. Outcome pull (event-driven): successful risks boost openness. Failed risks boost conscientiousness. Consistent with the social investment principle (Roberts 2005).
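
The three drift forces can be sketched as a single per-turn update. The rates 0.02 and 0.01 come from the list above; everything else is an assumption. In particular, this sketch interprets each rate as the fraction of the remaining gap closed per turn, treats an activated trait as being pulled toward 1.0, and takes outcome bumps as caller-supplied values because the post does not give their size. It is written in Python for illustration; the project itself is TypeScript.

```python
from typing import Dict, Iterable, Optional

TRAITS = (
    "openness", "conscientiousness", "extraversion",
    "agreeableness", "emotionality", "honesty_humility",
)
LEADER_PULL = 0.02  # per turn, from the post
ROLE_PULL = 0.01    # per turn, from the post


def drift(
    traits: Dict[str, float],
    leader: Dict[str, float],
    role_traits: Iterable[str] = (),
    promoted: bool = True,
    outcome_bumps: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Return the colonist's traits after one turn of drift (assumed model)."""
    activated = set(role_traits)
    bumps = outcome_bumps or {}
    updated = {}
    for name in TRAITS:
        value = traits[name]
        if promoted:
            # Leader pull: move a fixed fraction of the way to the leader.
            value += LEADER_PULL * (leader[name] - value)
        if name in activated:
            # Role pull: activated traits creep upward toward 1.0.
            value += ROLE_PULL * (1.0 - value)
        # Outcome pull: event-driven, size supplied by the caller.
        value += bumps.get(name, 0.0)
        updated[name] = min(1.0, max(0.0, value))
    return updated
```

For example, a promoted engineer at 0.5 openness under a commander at 0.9 would move to 0.508 after one turn under these assumptions, so drift is slow and only becomes visible over many turns.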

After each turn, all ~100 living colonists generate individual reactions via lightweight LLM calls. Each colonist's HEXACO profile, health stats, social ties, and the crisis context shape their 1-2 sentence reaction. A high-openness Mars-born teenager reacts differently to a governance crisis than a high-conscientiousness Earth-born engineer.

Figure: Colonist reactions after a crisis outcome. Each of 100+ colonists generates an individual reaction shaped by their HEXACO personality, health, and social context. The mood distribution bar shows 63% negative, 38% anxious. Hovering any colonist reveals their full profile.

Emergent Capabilities: Prediction vs Tool Forging

MiroFish: Emergent Social Dynamics

MiroFish's emergence happens through agent interactions on simulated social platforms. Thousands of agents posting, replying, and influencing each other produce:

  • Information cascading: how news spreads through a social network
  • Opinion polarization: echo chambers forming around contentious topics
  • Herd behavior: agents following trending content
  • Sentiment shifts: collective mood changes over simulation rounds

The emergence is social: individual agents acting on their personas and the content they see produce macro-level patterns that no single agent intended. OASIS supports simulations of up to one million agents, enabling studies at real-world platform scale.

Mars Genesis: Emergent Tool Forging

Mars Genesis's emergence happens through runtime capability creation. Department agents can invent new computational tools that never existed before:

  1. Agent identifies need: The Medical agent facing a radiation crisis decides it needs a "cumulative dose risk calculator."
  2. Agent writes code: The agent specifies a tool name, description, input/output schemas, a sandboxed JavaScript implementation, and test cases.
  3. Judge reviews: An LLM-as-judge scores safety, correctness, determinism, bounded execution, and input validation.
  4. Tool executes: Approved tools run in an isolated V8 sandbox with colony data as input, producing numerical results.
  5. Output informs decisions: The tool's computed output (risk scores, projections) appears in the department report to the commander.

This uses AgentOS's EmergentCapabilityEngine, EmergentJudge, and SandboxedToolForge. Tools do not directly change colony state. They produce analysis that informs decisions. The commander's selected policy effects change state through the kernel.

Tools forged in one turn persist and can be reused. Over a 12-turn simulation, department agents accumulate a growing toolkit of specialized analytical instruments, each reviewed by the judge and sandboxed for safe execution.
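
The forge, review, reuse loop can be sketched in a few lines. This is a deliberately simplified stand-in written in Python and does not use the AgentOS API. The ForgedTool and ToolRegistry classes are hypothetical, the LLM judge is replaced by a plain function that runs the tool's own test cases twice to check correctness and determinism, and no real sandboxing is performed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ForgedTool:
    name: str
    description: str
    run: Callable[[dict], float]
    test_cases: List[Tuple[dict, float]]  # (inputs, expected output)


def review(tool: ForgedTool) -> bool:
    """Stand-in for the judge: tests must pass and repeat runs must agree."""
    if not tool.test_cases:
        return False  # a tool with no tests cannot be verified
    for inputs, expected in tool.test_cases:
        try:
            first = tool.run(dict(inputs))
            second = tool.run(dict(inputs))
        except Exception:
            return False
        if first != second or abs(first - expected) > 1e-9:
            return False
    return True


class ToolRegistry:
    """Approved tools persist here and can be reused in later turns."""

    def __init__(self) -> None:
        self._tools: Dict[str, ForgedTool] = {}

    def forge(self, tool: ForgedTool) -> bool:
        if not review(tool):
            return False
        self._tools[tool.name] = tool
        return True

    def call(self, name: str, inputs: dict) -> float:
        # The result is analysis only; it never mutates colony state.
        return self._tools[name].run(inputs)

    def names(self) -> List[str]:
        return sorted(self._tools)


dose_tool = ForgedTool(
    name="cumulative_dose",
    description="Total radiation dose in mSv over a number of days",
    run=lambda d: d["daily_msv"] * d["days"],
    test_cases=[({"daily_msv": 0.5, "days": 100}, 50.0)],
)
```

A tool whose implementation fails its own test cases is rejected at forge time, so only reviewed tools ever accumulate in the registry.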

Determinism: Reproducibility vs Exploration

MiroFish: Stochastic Exploration

MiroFish simulations are inherently non-deterministic. Each run produces different agent interactions, different post timings, different content cascades. This is by design: the value is in running many simulations and aggregating patterns. The SimulationConfigGenerator uses LLM calls to auto-generate simulation parameters (time schedule, event injection, agent activity levels), which introduces additional variance.
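
The "run many, aggregate patterns" workflow looks roughly like the sketch below. It does not call MiroFish or OASIS. The run_once function is a toy voter model standing in for one full simulation, and all names and parameters are assumptions chosen for illustration.

```python
import random
from collections import Counter
from typing import Dict


def run_once(rng: random.Random, agents: int = 51, rounds: int = 500,
             initial_support: float = 0.55) -> str:
    """Toy stand-in for one stochastic run: agents copy each other's stance."""
    stances = [rng.random() < initial_support for _ in range(agents)]
    for _ in range(rounds):
        listener = rng.randrange(agents)
        speaker = rng.randrange(agents)
        stances[listener] = stances[speaker]
    # With an odd number of agents there is always a strict majority.
    return "support" if sum(stances) * 2 > agents else "oppose"


def aggregate(runs: int, base_seed: int = 0) -> Dict[str, float]:
    """Run many independent simulations and report outcome frequencies."""
    outcomes = Counter(
        run_once(random.Random(base_seed + i)) for i in range(runs)
    )
    return {label: count / runs for label, count in outcomes.items()}


if __name__ == "__main__":
    # Individual runs disagree; the distribution is the result.
    print(aggregate(200))
```

A single run of a stochastic simulation is an anecdote. The frequency of each outcome across runs is what supports a forecast.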

Mars Genesis: Deterministic Kernel + Stochastic Agents

Mars Genesis separates determinism from agency:

  • Deterministic: colonist roster generation, births, deaths, aging, bone density, radiation, career progression, outcome classification (all seeded RNG)
  • Non-deterministic: crisis generation, department analysis, tool forging, commander decisions, colonist reactions (all LLM-driven)

The same seed guarantees the same starting conditions and the same mechanical outcomes for identical decisions. Because the kernel adds no randomness of its own beyond the seed, divergence between the two timelines traces back to the LLM-driven layer, above all to different leadership personalities making different choices. This makes the simulation's central claim ("different personalities create different civilizations") empirically testable.
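
That testability claim can be expressed as a replay check: feed the same seed and the same decisions through the kernel twice and compare. The sketch below is a toy kernel in Python, with Python's random.Random standing in for the seeded generator. The birth and death rates and the run_timeline function are invented for illustration.

```python
import random
from typing import List, Sequence, Tuple


def run_timeline(seed: int, decisions: Sequence[float]) -> List[Tuple[int, float]]:
    """Toy kernel: each decision is a morale delta; mechanics use seeded RNG."""
    rng = random.Random(seed)  # stand-in for the kernel's seeded generator
    population, morale = 100, 0.5
    history = []
    for morale_delta in decisions:
        morale = min(1.0, max(0.0, morale + morale_delta))
        # Mechanical events depend only on the seed and current state.
        births = sum(rng.random() < 0.02 * morale for _ in range(population))
        deaths = sum(rng.random() < 0.01 for _ in range(population))
        population += births - deaths
        history.append((population, round(morale, 3)))
    return history


if __name__ == "__main__":
    cautious = [0.05, 0.05, 0.05]
    risky = [0.2, -0.3, 0.2]
    # Same seed, same decisions: identical histories.
    print(run_timeline(7, cautious) == run_timeline(7, cautious))
    # Same seed, different decisions: the histories differ.
    print(run_timeline(7, cautious) == run_timeline(7, risky))
```

If a replay with identical inputs ever produced a different history, that would indicate hidden nondeterminism in the kernel, which is exactly what this design is meant to rule out.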

Technology Stack Comparison

| Dimension | MiroFish | Mars Genesis |
| --- | --- | --- |
| Language | Python backend + Vue.js frontend | TypeScript (full stack) |
| Agent runtime | OASIS (CAMEL-AI) | AgentOS |
| Knowledge store | Zep Cloud GraphRAG | DOI-linked knowledge base + live web search |
| Personality model | MBTI (categorical, static) | HEXACO (continuous, evolving) |
| Agent scale | Thousands to millions | ~107 per turn (1 commander + 5 dept heads + 1 director + ~100 colonists) |
| Simulation type | Social media platform replica | Colony management with crisis response |
| Emergence mechanism | Social dynamics (posts, likes, follows) | Tool forging + personality drift + crisis generation |
| Determinism | Non-deterministic | Seeded RNG kernel + non-deterministic agents |
| Output | Prediction reports | Side-by-side civilization comparison |
| IPC | File-system command/response | In-process SSE streaming |
| Deployment | Docker + Flask + Node.js | Single npx tsx process |

What Builders Can Take From Each

From MiroFish:

  • GraphRAG as simulation ground truth is powerful for prediction use cases. The Zep integration shows how entity extraction can bootstrap agent populations from unstructured text.
  • Dual-platform simulation (Twitter + Reddit) captures different interaction modalities. Agents behave differently with character limits vs. threaded discussions.
  • ReAct-style report generation with retrieval tools produces richer analysis than simple summarization.

From Mars Genesis:

  • Separating the deterministic kernel from AI interpretation makes claims testable. You can check that divergence came from the agents' decisions and not from randomness in the mechanics.
  • Runtime tool forging gives agents genuine problem-solving capability beyond their initial training. The judge pipeline (build, test, review, sandbox) makes it safe.
  • Continuous personality evolution produces more nuanced behavioral change than static personality types. HEXACO drift grounded in psychology research adds scientific credibility.
  • Typed policy effects with bounded numerical ranges prevent LLM hallucination from corrupting simulation state.

Running Mars Genesis

Mars Genesis runs as a single process with a live dashboard:

npm install
cp .env.example .env
# Add your OPENAI_API_KEY or ANTHROPIC_API_KEY

# Launch with dashboard
npm run dashboard

# Or quick 3-turn smoke test
npm run dashboard:smoke

Figure: Settings panel. Configure two leaders with HEXACO sliders, starting resources, key personnel, department toggles, custom event injection, API keys, LLM model selection, and sandbox execution parameters.

The Settings tab lets you configure leaders, HEXACO personality sliders, custom events, starting resources, API keys, model selection, and department activation. Presets include "Balanced Founders," "High Risk vs Ultra Cautious," and custom configurations shareable via URL.

Figure: Reports panel. Side-by-side turn-by-turn comparison with crisis titles, commander decisions, outcome badges, department citation and tool counts, colonist quotes, and a replay scrubber.

The simulation streams events via SSE to the browser. Both timelines run in parallel. Department cards show citations, forged tools, and risk assessments. Decision cards expand to show full commander reasoning and selected policies. Colonist quotes show individual reactions with personality-driven mood analysis.

Try Both

Both are open source. Both prove that multi-agent simulation has moved past chatbot demos into systems that generate genuine emergent behavior. The question is no longer whether AI agents can simulate complex social and organizational dynamics. It's which dynamics matter for your use case.


Mars Genesis is built with AgentOS, an open-source TypeScript runtime for autonomous AI agents. Install: npm i @framers/agentos

Built by Manic Agency / Frame.dev. Contact: team@frame.dev