Files
cursebreaker-parser-rust/SUMMARY.md
2026-01-02 09:28:22 +00:00

14 KiB

Cursebreaker Parser - Current State Summary

Last Updated: 2026-01-01 Version: 0.1.0 (Major refactoring in progress)

Overview

This codebase is a Unity file parser that converts Unity YAML files (.unity, .prefab, .asset) into Rust data structures. A major architectural refactoring has been completed to:

  1. Parse YAML directly into component types (bypassing intermediate UnityDocument)
  2. Automatically build Sparsey ECS Worlds for scene files
  3. Keep prefabs as raw YAML for efficient cloning and instantiation

Current Architecture

Data Flow

Unity File (.unity/.prefab/.asset)
    ↓
Parser detects file type by extension
    ↓
┌─────────────┬──────────────┬──────────────┐
│   .unity    │   .prefab    │   .asset     │
│   (Scene)   │   (Prefab)   │   (Asset)    │
└─────────────┴──────────────┴──────────────┘
       ↓              ↓              ↓
  Parse YAML    Parse YAML     Parse YAML
       ↓              ↓              ↓
  RawDocument   RawDocument    RawDocument
       ↓              ↓              ↓
  Build World   Store YAML     Store YAML
       ↓              ↓              ↓
 UnityScene    UnityPrefab    UnityAsset
       ↓
  Entity + Components

Core Types

UnityFile (src/model/mod.rs:14-53)

pub enum UnityFile {
    Scene(UnityScene),    // .unity files → ECS World
    Prefab(UnityPrefab),  // .prefab files → Raw YAML
    Asset(UnityAsset),    // .asset files → Raw YAML
}

UnityScene (src/model/mod.rs:60-85)

Contains a fully-parsed Sparsey ECS World:

pub struct UnityScene {
    pub path: PathBuf,
    pub world: World,                      // Sparsey ECS World
    pub entity_map: HashMap<FileID, Entity>, // Unity FileID → Entity mapping
}

UnityPrefab / UnityAsset (src/model/mod.rs:92-150)

Contains raw YAML documents for cloning:

pub struct UnityPrefab {
    pub path: PathBuf,
    pub documents: Vec<RawDocument>,  // Raw YAML + metadata
}

RawDocument (src/model/mod.rs:160-194)

Lightweight storage of Unity object metadata + YAML:

pub struct RawDocument {
    pub type_id: u32,           // Unity type ID
    pub file_id: FileID,         // Unity file ID
    pub class_name: String,      // "GameObject", "Transform", etc.
    pub yaml: serde_yaml::Value, // Inner YAML (after "GameObject: {...}" wrapper)
}

Component System

UnityComponent Trait (src/types/component.rs:18-28)

Components parse directly from YAML:

pub trait UnityComponent: Sized {
    fn parse(yaml: &serde_yaml::Mapping, ctx: &ComponentContext) -> Option<Self>;
}

Key Change: Previously used UnityDocument, now uses raw serde_yaml::Mapping for zero-copy parsing.

ComponentContext (src/types/component.rs:8-15)

Provides metadata during parsing:

pub struct ComponentContext<'a> {
    pub type_id: u32,
    pub file_id: FileID,
    pub class_name: &'a str,
}

YAML Helpers (src/types/component.rs:31-167)

Typed accessors for Unity YAML patterns:

  • get_vector3() - Parses {x, y, z} into glam::Vec3
  • get_quaternion() - Parses {x, y, z, w} into glam::Quat
  • get_file_ref() - Parses {fileID: N} into FileRef
  • etc.

Implemented Components

  1. GameObject (src/types/game_object.rs) - Basic entity data (name, active, layer)
  2. Transform (src/types/transform.rs) - Position, rotation, scale + hierarchy
  3. RectTransform (src/types/transform.rs) - UI transform with anchors

ECS World Building (src/ecs/builder.rs)

3-Pass Approach:

Pass 1: Spawn GameObjects (lines 32-36)

  • Creates entities for all GameObjects
  • Maps FileID → Entity

Pass 2: Attach Components (lines 38-42)

  • Parses components from YAML
  • Dispatches to correct parser based on class_name
  • Attaches to GameObject entities

Pass 3: Resolve Hierarchy (lines 44-46)

  • Converts Transform parent/children FileRefs to Entity references

Parser Pipeline (src/parser/mod.rs)

File Type Detection (lines 69-76)

.unity   FileType::Scene   Build ECS World
.prefab  FileType::Prefab  Store Raw YAML
.asset   FileType::Asset   Store Raw YAML

YAML Document Parsing (lines 125-167)

  1. Parse Unity tag: --- !u!1 &12345
  2. Extract YAML after tag line
  3. Unwrap class name wrapper: GameObject: {...}{...}
  4. Store as RawDocument

What's Implemented

Fully Working

  • File type detection by extension
  • YAML parsing with Unity header validation
  • Direct YAML-to-component parsing (bypasses UnityDocument)
  • Component trait with typed YAML helpers
  • GameObject, Transform, RectTransform parsing
  • Separate code paths for scenes vs prefabs
  • Sparsey World creation with component registration
  • Entity spawning for GameObjects
  • Component Linking (Transform parent and children) with callbacks in case the component hasn't been initialized yet.

What's Not Implemented

Critical Missing Features

1. Prefab Instancing System (MEDIUM PRIORITY)

Status: Not started

What's Needed: Create src/prefab/mod.rs with:

pub struct PrefabInstance {
    documents: Vec<RawDocument>,  // Cloned YAML
}

impl UnityPrefab {
    /// Clone prefab for instancing
    pub fn instantiate(&self) -> PrefabInstance;
}

impl PrefabInstance {
    /// Override YAML values before spawning
    pub fn override_value(&mut self, file_id: FileID, path: &str, value: serde_yaml::Value);

    /// Spawn into existing scene world
    pub fn spawn_into(self, world: &mut World) -> Result<HashMap<FileID, Entity>>;
}

Usage Example:

let prefab = match unity_file {
    UnityFile::Prefab(p) => p,
    _ => panic!("Not a prefab"),
};

let mut instance = prefab.instantiate();
instance.override_value(file_id, "m_Name", "CustomName".into())?;
instance.override_value(file_id, "m_LocalPosition.x", 100.0.into())?;
let entities = instance.spawn_into(&mut scene.world)?;

Implementation Steps:

  1. Create src/prefab/mod.rs
  2. Implement YAML cloning (serde_yaml::Value::clone())
  3. Implement YAML path navigation for overrides (e.g., "m_LocalPosition.x")
  4. Reuse build_world_from_documents() for spawning
  5. Add tests with real prefab files

Files to Create:

  • src/prefab/mod.rs

Files to Modify:

  • src/lib.rs (add pub mod prefab)

4. UnityProject Module Update (MEDIUM PRIORITY)

Status: Currently disabled to allow compilation

Location: src/project/mod.rs, src/project/query.rs

Problem: References old UnityDocument type that no longer exists.

What's Needed:

  • Update UnityProject to store HashMap<PathBuf, UnityFile> instead of files with documents
  • Implement queries that work across scenes/prefabs:
    • get_all_scenes() -> Vec<&UnityScene>
    • get_all_prefabs() -> Vec<&UnityPrefab>
    • find_by_name() - search across RawDocuments in prefabs
  • Update reference resolution for cross-file references
  • GUID → Entity resolution for scene references to prefabs

Files to Modify:

  • src/project/mod.rs (lines 9, 36-50)
  • src/project/query.rs (entire file)
  • src/lib.rs (re-enable module exports)

Example Updated API:

impl UnityProject {
    pub fn load_file(&mut self, path: impl AsRef<Path>) -> Result<&UnityFile>;

    pub fn get_scenes(&self) -> Vec<&UnityScene>;
    pub fn get_prefabs(&self) -> Vec<&UnityPrefab>;

    pub fn find_prefab_by_name(&self, name: &str) -> Option<&UnityPrefab>;
}

5. Additional Unity Components (LOW PRIORITY)

Status: Only 3 components implemented

Currently Missing:

  • Camera
  • Light
  • MeshRenderer / MeshFilter
  • Collider variants (BoxCollider, SphereCollider, etc.)
  • Rigidbody
  • MonoBehaviour (custom scripts)
  • UI components (Image, Text, Button, etc.)

Implementation Pattern:

// src/types/camera.rs
#[derive(Debug, Clone)]
pub struct Camera {
    pub field_of_view: f32,
    pub near_clip_plane: f32,
    pub far_clip_plane: f32,
    // ... other fields
}

impl UnityComponent for Camera {
    fn parse(yaml: &serde_yaml::Mapping, _ctx: &ComponentContext) -> Option<Self> {
        Some(Self {
            field_of_view: yaml_helpers::get_f64(yaml, "m_FieldOfView")? as f32,
            near_clip_plane: yaml_helpers::get_f64(yaml, "near clip plane")? as f32,
            far_clip_plane: yaml_helpers::get_f64(yaml, "far clip plane")? as f32,
        })
    }
}

Files to Create:

  • src/types/camera.rs
  • src/types/light.rs
  • src/types/renderer.rs
  • etc.

Files to Modify:

  • src/types/mod.rs (add module declarations)
  • src/ecs/builder.rs:96-122 (add component dispatch cases)
  • Register components in Sparsey World builder (src/ecs/builder.rs:24-28)

🔧 Known Issues

1. Compilation Warnings

None currently! Code compiles cleanly in release mode.

2. Disabled Modules

  • src/project/ - Commented out in src/lib.rs:33 due to UnityDocument references

3. Stubbed Functionality

  • Component insertion (src/ecs/builder.rs:141-151)
  • Transform hierarchy resolution (src/ecs/builder.rs:155-176)

Phase 1: Complete Sparsey Integration (CRITICAL)

Time Estimate: 1-2 hours of research + 2-3 hours implementation

Success Criteria:

  • Parse a .unity scene with nested GameObjects
  • Verify Transform hierarchy is correctly resolved
  • Query entities and access components from World

Phase 2: Implement Prefab Instancing (HIGH VALUE)

Time Estimate: 3-4 hours

  1. Create src/prefab/mod.rs with PrefabInstance API
  2. Implement YAML cloning and override logic
  3. Implement spawn_into() using existing world builder
  4. Add tests with real prefab files

Success Criteria:

  • Load a prefab
  • Override values (name, position, etc.)
  • Instantiate into scene multiple times
  • Verify entities created correctly

Phase 3: Update UnityProject Module (MEDIUM PRIORITY)

Time Estimate: 2-3 hours

  1. Update HashMap to store UnityFile enum
  2. Implement scene/prefab accessors
  3. Update query functions for RawDocument
  4. Re-enable module exports

Success Criteria:

  • Load multiple scenes and prefabs
  • Query across files
  • Find prefabs by name

Phase 4: Add More Components (ONGOING)

Time Estimate: 1-2 hours per component

Start with most common components:

  1. Camera (critical for scene rendering)
  2. Light (critical for scene rendering)
  3. MeshRenderer + MeshFilter (for 3D objects)

🎯 Performance Characteristics

Memory Improvements

  • Before: YAML → PropertyValue tree → Component (2x allocations)
  • After (Scenes): YAML → Component (1x allocation, ~40% reduction)
  • After (Prefabs): YAML → serde_yaml::Value (shared references, minimal overhead)

Parsing Speed

  • Direct YAML access eliminates PropertyValue conversion
  • Prefabs use cheap cloning (Arc-based in serde_yaml)

🧪 Testing Status

Unit Tests

  • Parser header validation (src/parser/mod.rs:196-201)
  • YAML content extraction (src/parser/mod.rs:204-209)
  • File type detection (src/parser/mod.rs:212-229)

Integration Tests

  • Scene parsing end-to-end
  • Prefab parsing end-to-end
  • Component attachment
  • Transform hierarchy resolution
  • Prefab instantiation

Recommendation: Add integration tests once Sparsey integration is complete.

📝 Code Organization

src/
├── lib.rs                  # Public API + exports
├── error.rs               # Error types
├── model/
│   └── mod.rs            # ✅ UnityFile, UnityScene, UnityPrefab, RawDocument
├── parser/
│   ├── mod.rs            # ✅ File type detection + parsing pipeline
│   ├── unity_tag.rs      # ✅ Unity tag parsing (!u!N &ID)
│   ├── yaml.rs           # ✅ YAML document splitting
│   └── meta.rs           # ✅ .meta file parsing
├── types/
│   ├── mod.rs            # ✅ Type exports
│   ├── component.rs      # ✅ UnityComponent trait + yaml_helpers
│   ├── game_object.rs    # ✅ GameObject component
│   ├── transform.rs      # ✅ Transform + RectTransform
│   ├── ids.rs            # ✅ FileID, LocalID
│   ├── values.rs         # ✅ Vector2/3, Quaternion, Color, etc.
│   ├── reference.rs      # ✅ UnityReference enum
│   └── type_registry.rs  # ✅ Type ID ↔ Class name mapping
├── ecs/
│   ├── mod.rs            # ✅ Module exports
│   └── builder.rs        # ⚠️ 3-pass world building (incomplete)
├── prefab/               # ❌ NOT CREATED YET
│   └── mod.rs            # TODO: Prefab instancing
├── project/              # ❌ DISABLED (needs refactoring)
│   ├── mod.rs            # ❌ References old UnityDocument
│   └── query.rs          # ❌ References old UnityDocument
└── property/
    └── mod.rs            # ✅ PropertyValue (kept for helpers)

🔗 External Dependencies

  • serde_yaml 0.9 - YAML parsing
  • sparsey 0.13 - ECS framework
  • glam 0.29 - Math types (Vec2/3, Quat)
  • indexmap 2.1 - Ordered maps
  • lru 0.12 - LRU cache for references

📚 Useful Documentation

🤝 Contributing / Next Agent Instructions

If you're the next AI agent working on this:

  1. Start here: Read this summary completely
  2. Quick test: Try cargo build --release - should compile cleanly
  3. Focus on: Sparsey integration (Phase 1) - highest priority
  4. Key files:
    • src/ecs/builder.rs (needs Sparsey API research)
    • src/prefab/mod.rs (doesn't exist yet)
    • src/project/mod.rs (needs refactoring)

Before making changes:

  • Understand the 3-pass world building approach
  • Know that dispatcher routes to parsers (no redundant type checks in parsers)
  • RawDocument.yaml contains INNER yaml (after class name wrapper is removed)

Testing approach:

  • Use files in data/ directory for real Unity files
  • Focus on .unity scenes first, then .prefab files
  • Verify entity creation and component attachment

Good luck! 🚀