Unity Parser – Design Document
Overview
Unity Parser is a Rust library for parsing local Unity projects (scenes and prefabs) from their YAML representation (.unity and .prefab files) and loading the resulting data into an ECS world.
The primary goal is to enable users to:
- Selectively extract only the data they care about (minimal memory footprint).
- Mirror Unity MonoBehaviour types in Rust with minimal boilerplate.
- Query the fully instantiated scene (including all nested prefabs) using ECS queries.
Use cases include:
- Modding tools
- Static analysis
- Database generation
- Asset inspection / reporting
- Custom exporters
The library is offline-only – it works exclusively on exported Unity project files (YAML + assets). No runtime or in-engine integration is planned.
Core Principles
- Minimal memory usage: Only parse and store components explicitly requested by the user.
- Fast setup: Users declare desired types via a single procedural macro.
- Full prefab instantiation: All prefabs (including nested/variant) are fully expanded into the scene.
- Simple querying: Users work directly with the ECS world (Sparsey) or optional helper methods.
Architecture
ECS Backend
- Sparsey is used as the ECS implementation.
- Rationale: Lightweight, excellent insertion performance, no archetype overhead.
- Query performance trade-off is acceptable because queries are infrequent (typically once or a few times per tool run, not per-frame like in games).
- Each loaded scene gets its own World (Sparsey terminology).
- The ECS world is exposed directly to users for maximum flexibility.
- Optional ergonomic helpers may be added later (e.g., scene.foreach::<(GameObject, Transform, Interactable)>(|...|)).
Data Flow
- User configures which component types to parse (via macro).
- Library scans the project for relevant .unity, .prefab, and .meta files.
- Scenes and prefabs are stream-parsed (YAML).
- Only declared components are deserialized and inserted.
- Prefab instances are recursively instantiated (new fileID mapping per nesting level).
- After all objects are created, world transforms are computed in a post-process pass.
- Resulting World is returned (or cached).
User Configuration
Users declare all desired types with a single procedural macro:
#[unity_parser(
    // Built-in Unity components (non-script)
    unity_types(Transform, MeshFilter, MeshRenderer, Collider /* ... */),
    // Custom MonoBehaviour components
    custom_types(Interactable, Harvestable, LootContainer, EnemyAI),
    // Asset types beyond scenes/prefabs (future extension)
    asset_types(/* Material, Texture2D */)
)]
struct MyProjectConfig;
Rules
- unity_types: Built-in Unity components (no associated script).
- custom_types: User-defined structs that mirror MonoBehaviour scripts.
  - Struct name must exactly match the C# class name.
  - The parser will automatically locate the corresponding .cs file and read its GUID (stored in the accompanying .meta file) to match YAML entries.
- Users must explicitly list every component they want. Nothing is parsed by default.
- Examples and common sets will be provided in documentation.
Component Definition
Components are plain Rust structs mirroring Unity’s serialized fields.
#[derive(Component)]
struct Transform {
    local_position: Vec3,
    local_rotation: Quat,
    local_scale: Vec3,
    world_matrix: Mat4, // Computed in post-process
    parent: Option<Entity>,
    children: Vec<Entity>,
}

#[derive(Component)]
struct Interactable {
    interaction_prompt: String,
    radius: f32,
}
- Users can implement custom parsing logic if needed.
- Derive macros will offer automatic field parsing for common cases.
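As an illustration of what custom parsing logic for a single field might look like, the sketch below parses Unity's inline vector form (for example `{x: 1, y: 2.5, z: -3}`) into a vector type. The `Vec3` struct and `parse_vec3` function are hypothetical stand-ins defined here for the example; they are not part of the library's API.

```rust
/// Hypothetical stand-in for the vector type used by components.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Vec3 {
    x: f32,
    y: f32,
    z: f32,
}

/// Parses Unity's inline mapping form, e.g. `{x: 1, y: 2.5, z: -3}`.
/// Returns `None` if a key is unknown, a value is not a number,
/// or one of the three axes is missing.
fn parse_vec3(raw: &str) -> Option<Vec3> {
    let inner = raw.trim().strip_prefix('{')?.strip_suffix('}')?;
    let (mut x, mut y, mut z) = (None, None, None);
    for pair in inner.split(',') {
        let (key, value) = pair.split_once(':')?;
        let value: f32 = value.trim().parse().ok()?;
        match key.trim() {
            "x" => x = Some(value),
            "y" => y = Some(value),
            "z" => z = Some(value),
            _ => return None,
        }
    }
    Some(Vec3 { x: x?, y: y?, z: z? })
}
```

A derive macro could generate calls to helpers of this kind for common field types, leaving hand-written parsing for unusual cases.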
Special Cases
- GameObject: Not a true component, but stored as a component containing:
  - name: String
  - layer: u32
  - active: bool
Prefab Instantiation
- Full support for nested prefabs (modern Unity prefab workflow).
- Strategy:
  - Prefabs are parsed exactly like scenes.
  - When a PrefabInstance is encountered, the referenced prefab is loaded recursively.
  - A new HashMap<fileID → Entity> mapping is created for each nesting level.
  - Overrides are applied only to property values (via propertyPath).
- Current scope: only property overrides are applied.
- TODO: Support added/removed components, reordered children, removed GameObjects.
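The per-level mapping can be sketched as follows. This is a minimal model under stated assumptions, not the library's implementation: `Entity` is a plain integer rather than Sparsey's entity type, `ParsedFile` and `World` are hypothetical simplifications, overrides are omitted, and there is no guard against cyclic prefab references.

```rust
use std::collections::HashMap;

/// Stand-in for an ECS entity handle (hypothetical; not Sparsey's type).
type Entity = u32;

/// Simplified, already-parsed scene or prefab file (hypothetical).
struct ParsedFile {
    /// fileIDs of the objects defined directly in this file.
    object_ids: Vec<i64>,
    /// GUIDs of prefabs referenced by PrefabInstance entries in this file.
    nested_prefabs: Vec<String>,
}

/// Minimal world that only hands out fresh entity IDs.
#[derive(Default)]
struct World {
    next: Entity,
}

impl World {
    fn spawn(&mut self) -> Entity {
        let entity = self.next;
        self.next += 1;
        entity
    }
}

/// Instantiates the file identified by `guid` and everything nested in it.
/// Returns the fileID -> Entity map for this nesting level only;
/// each nested prefab instance builds its own map.
fn instantiate(
    guid: &str,
    files: &HashMap<String, ParsedFile>,
    world: &mut World,
) -> Option<HashMap<i64, Entity>> {
    let file = files.get(guid)?;
    let mut local: HashMap<i64, Entity> = HashMap::new();
    for &file_id in &file.object_ids {
        local.insert(file_id, world.spawn());
    }
    for nested in &file.nested_prefabs {
        // The same fileIDs can reappear in a nested prefab, or in a second
        // instance of the same prefab. A fresh map per call keeps them
        // from colliding with the entries in `local`.
        instantiate(nested, files, world)?;
    }
    Some(local)
}
```

Because every call owns its map, two instances of the same prefab produce distinct entities even though their fileIDs are identical.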
Asset Handling
All parsable assets implement a trait:
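// `Sized` is required because `parse` returns `Self` by value inside `Result`.
trait AssetParser: Sized {
    /// File extensions handled by this parser.
    fn extensions() -> &'static [&'static str];
    /// Builds the asset from a parsed YAML node.
    fn parse(yaml: &YamlNode, context: &ParseContext) -> Result<Self>;
}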
- Built-in: .unity (scenes), .prefab (prefabs).
- .meta files are parsed to build GUID ↔ path mappings.
- Future extension possible for other YAML assets (e.g., ScriptableObjects).
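A minimal sketch of the GUID ↔ path index, assuming each .meta file carries a top-level `guid:` line with a 32-character hexadecimal value and that the asset path is the .meta path without its suffix. `GuidIndex`, `guid_from_meta`, and `insert_meta` are hypothetical names chosen for this example.

```rust
use std::collections::HashMap;

/// Extracts the value of the top-level `guid:` line from .meta file text.
fn guid_from_meta(meta_text: &str) -> Option<&str> {
    meta_text
        .lines()
        .find_map(|line| line.strip_prefix("guid:"))
        .map(str::trim)
        .filter(|guid| guid.len() == 32 && guid.chars().all(|c| c.is_ascii_hexdigit()))
}

/// Bidirectional GUID <-> asset path index (hypothetical).
#[derive(Default)]
struct GuidIndex {
    path_by_guid: HashMap<String, String>,
    guid_by_path: HashMap<String, String>,
}

impl GuidIndex {
    /// Registers one .meta file. `meta_path` is e.g.
    /// `Assets/Scripts/Interactable.cs.meta`; the asset path is the same
    /// path without the `.meta` suffix. Returns `false` if no valid GUID
    /// was found or the path does not end in `.meta`.
    fn insert_meta(&mut self, meta_path: &str, meta_text: &str) -> bool {
        match (meta_path.strip_suffix(".meta"), guid_from_meta(meta_text)) {
            (Some(asset_path), Some(guid)) => {
                self.path_by_guid.insert(guid.to_string(), asset_path.to_string());
                self.guid_by_path.insert(asset_path.to_string(), guid.to_string());
                true
            }
            _ => false,
        }
    }
}
```

The same index would serve both prefab lookups (GUID to .prefab path) and script matching (C# file path to GUID).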
Selective Parsing & Memory
- Only components listed in the config macro are parsed.
- During YAML streaming, unknown component types (!u!XXX) are completely skipped: no allocation, no temporary structures.
- Goal: Load even very large scenes (hundreds of thousands of objects) into moderate RAM when only a subset of components is requested.
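One way the skipping could work is to inspect only the document header line (`--- !u!<classID> &<fileID>`) and ignore every following line until the next header when the class ID was not requested. The sketch below assumes that header layout; `parse_header` and `wanted_documents` are hypothetical names. Document bodies are returned as borrowed slices of the input, so skipped documents cost nothing beyond the line scan.

```rust
use std::collections::HashSet;

/// Parses a document header such as `--- !u!4 &6543210`.
/// Returns `(class_id, file_id)`, or `None` if the line is not a header.
fn parse_header(line: &str) -> Option<(u32, i64)> {
    let rest = line.strip_prefix("--- !u!")?;
    let mut parts = rest.split_whitespace();
    let class_id = parts.next()?.parse().ok()?;
    let file_id = parts.next()?.strip_prefix('&')?.parse().ok()?;
    Some((class_id, file_id))
}

/// Returns `(class_id, file_id, body)` for documents whose class ID is in
/// `wanted`. Bodies borrow from `yaml`; unwanted documents are never copied.
fn wanted_documents<'a>(yaml: &'a str, wanted: &HashSet<u32>) -> Vec<(u32, i64, &'a str)> {
    let mut out = Vec::new();
    // Class ID, fileID, and body start offset of the document being kept, if any.
    let mut current: Option<(u32, i64, usize)> = None;
    let mut offset = 0;
    for line in yaml.split_inclusive('\n') {
        if line.starts_with("--- ") {
            // A new document begins: close the previous one if it was kept.
            if let Some((class_id, file_id, start)) = current.take() {
                out.push((class_id, file_id, &yaml[start..offset]));
            }
            current = parse_header(line.trim_end())
                .filter(|(class_id, _)| wanted.contains(class_id))
                .map(|(class_id, file_id)| (class_id, file_id, offset + line.len()));
        }
        offset += line.len();
    }
    if let Some((class_id, file_id, start)) = current {
        out.push((class_id, file_id, &yaml[start..]));
    }
    out
}
```

With `wanted` containing only class ID 4 (Transform), GameObject and MeshRenderer documents in the same file are passed over without being deserialized.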
Transform Hierarchy
- Local transforms are parsed immediately.
- Parent/child relationships are recorded.
- World matrices and full hierarchy are computed in a single post-process pass after all entities exist.
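The post-process pass can be sketched as a traversal that visits parents before children and sets each world matrix to the parent's world matrix times the node's local matrix. The example makes several simplifying assumptions: nodes are addressed by index instead of `Entity`, matrices are row-major `[[f32; 4]; 4]` arrays with translation in the last column, parent indices are valid, and the hierarchy contains no cycles. `Node`, `mul`, `translation`, and `compute_world_matrices` are hypothetical names.

```rust
type Mat4 = [[f32; 4]; 4];

/// Standard 4x4 matrix product `a * b`.
fn mul(a: &Mat4, b: &Mat4) -> Mat4 {
    let mut out = [[0.0; 4]; 4];
    for i in 0..4 {
        for j in 0..4 {
            for k in 0..4 {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

/// Builds a pure translation matrix.
fn translation(x: f32, y: f32, z: f32) -> Mat4 {
    [
        [1.0, 0.0, 0.0, x],
        [0.0, 1.0, 0.0, y],
        [0.0, 0.0, 1.0, z],
        [0.0, 0.0, 0.0, 1.0],
    ]
}

struct Node {
    local: Mat4,
    parent: Option<usize>,
    world: Mat4,
}

/// Fills in `world` for every node reachable from a root.
/// Child lists are built first so each node is visited once,
/// always after its parent.
fn compute_world_matrices(nodes: &mut [Node]) {
    let mut children: Vec<Vec<usize>> = vec![Vec::new(); nodes.len()];
    let mut stack = Vec::new();
    for (i, node) in nodes.iter().enumerate() {
        match node.parent {
            Some(p) => children[p].push(i),
            None => stack.push(i),
        }
    }
    while let Some(i) = stack.pop() {
        let world = match nodes[i].parent {
            Some(p) => mul(&nodes[p].world, &nodes[i].local),
            None => nodes[i].local,
        };
        nodes[i].world = world;
        stack.extend(children[i].iter().copied());
    }
}
```

Running the pass after all entities exist means the order in which objects appear in the YAML does not matter; a child listed before its parent is still resolved correctly.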
Caching
- Optional caching to SQLite.
- Single database file containing all scenes.
- Tables:
  - scenes(scene_path PRIMARY KEY, hash, timestamp)
  - entities(entity_id, scene_path, gameobject_name, layer, active)
  - One table per component type (e.g., transform, interactable)
- Cache contains only final ECS data (post-instantiation, post-transform pass).
- No sophisticated invalidation: user controls caching via flag/option.
  - parse(..., use_cache: bool)
  - CLI: --cache / --no-cache
- When caching is enabled, the cache is regenerated completely if it is missing or if any source file is newer than it.
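The regeneration rule can be expressed as a small predicate over modification times. This is a sketch under the assumption that file modification times are the comparison basis; the hash column in the scenes table is not used here, and `needs_rebuild` and `mtime` are hypothetical helper names.

```rust
use std::path::Path;
use std::time::SystemTime;

/// Returns the modification time of `path`, or `None` if the file is
/// missing or its metadata cannot be read.
fn mtime(path: &Path) -> Option<SystemTime> {
    std::fs::metadata(path).and_then(|meta| meta.modified()).ok()
}

/// Decides whether the cache must be rebuilt.
/// `cache_mtime` is `None` when the cache file does not exist.
fn needs_rebuild(cache_mtime: Option<SystemTime>, source_mtimes: &[SystemTime]) -> bool {
    match cache_mtime {
        // No cache yet: always build.
        None => true,
        // Rebuild if any source file was modified after the cache was written.
        Some(cache) => source_mtimes.iter().any(|&source| source > cache),
    }
}
```

Since there is no finer-grained invalidation, a single newer source file triggers a full rebuild rather than a per-scene update.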
API Sketch
let world = unity_parser::parse::<MyProjectConfig>(
    "/path/to/unity/project",            // project_root
    vec!["Assets/Scenes/Level1.unity"],  // scenes
    true,                                // use_cache
    Some(4),                             // max_parallel
)?;
- ParserBuilder may be added later for more configuration.
- Parallel parsing of independent scenes/prefabs is supported (rayon, limited to 4 jobs by default to control memory).
Error Handling
- Malformed YAML or missing references: log warning/error, continue parsing.
- Missing expected component fields: log, insert default/None where possible.
- Critical failures (e.g., corrupted scene file): return Err.
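The split between recoverable and critical problems might look like the sketch below. It assumes that a file not starting with the `%YAML` directive counts as corrupted and that warnings are collected in a plain `Vec<String>` rather than sent to a logger; `ParseError`, `check_header`, and `read_f32` are hypothetical names.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum ParseError {
    /// The file cannot be a Unity YAML file at all.
    CorruptedFile,
}

/// Critical check: the file is rejected if it does not begin with
/// a `%YAML` directive.
fn check_header(text: &str) -> Result<(), ParseError> {
    if text.starts_with("%YAML") {
        Ok(())
    } else {
        Err(ParseError::CorruptedFile)
    }
}

/// Recoverable case: a missing or unparsable field yields the default
/// value and records a warning, and parsing continues.
fn read_f32(fields: &HashMap<&str, &str>, key: &str, warnings: &mut Vec<String>) -> f32 {
    match fields.get(key).map(|raw| raw.parse::<f32>()) {
        Some(Ok(value)) => value,
        _ => {
            warnings.push(format!("field `{}` missing or invalid, using default", key));
            f32::default()
        }
    }
}
```

Only `check_header` can abort the load; field-level problems surface as warnings alongside a usable result.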
Future Considerations / TODOs
- ParserBuilder API
- Automatic derive for common component parsing
- Support for added/removed components in prefab overrides
- Component serialization versioning
- More asset types (Materials, Animators, etc.)
- Binary cache format for faster loading
- Helper query methods on top of raw Sparsey API
Testing
Testing uses a separate project in the same repository directory. It loads the "Cursebreaker" game from a path configured in the .env file.
Summary
Unity Parser aims to be the fastest, most memory-efficient way to extract structured data from Unity YAML projects in Rust, with a focus on user-defined components and full prefab instantiation. By leveraging Sparsey and aggressive selective parsing, it enables tools that process massive Unity scenes on ordinary hardware.