Cursebreaker Unity Parser - Design Document
Project Overview
A high-performance Rust library for parsing and querying Unity project files (.unity scenes, .prefab prefabs, and .asset ScriptableObjects).
Goals
- Parse Unity YAML Format: Handle Unity's YAML 1.1 format with custom tags (!u!) and file ID references
- Extract Structure: Parse GameObjects, Components, and their properties into queryable data structures
- High Performance: Optimized for large Unity projects with minimal memory footprint
- Type Safety: Strong typing for Unity's component system
- Library-First: Designed as a reusable SDK for other Rust tools
Target File Formats
- .unity - Unity scene files
- .prefab - Unity prefab files
- .asset - Unity ScriptableObject and other asset files
All three formats share the same underlying YAML structure with Unity-specific extensions.
Unity File Format Structure
Unity files use YAML 1.1 with special conventions:
%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!1 &1866116814460599870
GameObject:
  m_ObjectHideFlags: 0
  m_Component:
  - component: {fileID: 8151827567463220614}
  - component: {fileID: 8755205353704683373}
  m_Name: CardGrabber
--- !u!224 &8151827567463220614
RectTransform:
  m_GameObject: {fileID: 1866116814460599870}
  m_LocalPosition: {x: 0, y: 0, z: 0}
Key Concepts
- Documents: Each --- starts a new YAML document representing a Unity object
- Type Tags: !u!N indicates the Unity type (e.g., !u!1 = GameObject, !u!224 = RectTransform)
- Anchors: &ID defines a local file ID for the object
- File References: {fileID: N} references objects by their ID (local or external)
- GUID References: {guid: ...} references external assets
- Properties: All Unity objects have serialized fields (usually prefixed with m_)
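To make the type tag and anchor conventions concrete, here is a minimal sketch of reading a document header line such as `--- !u!1 &1866116814460599870`. The names `DocHeader` and `parse_header` are hypothetical and not part of the planned API. The sketch uses only the standard library and ignores any tokens after the anchor.

```rust
/// Type tag and anchor taken from a document header line such as
/// `--- !u!1 &1866116814460599870`. (Hypothetical type, for illustration.)
#[derive(Debug, PartialEq)]
pub struct DocHeader {
    pub type_id: u32, // the N in `!u!N`
    pub file_id: i64, // the ID in `&ID`
}

/// Hypothetical helper: returns `None` when the line is not a Unity
/// document header.
pub fn parse_header(line: &str) -> Option<DocHeader> {
    let rest = line.strip_prefix("---")?;
    let mut tokens = rest.split_whitespace();
    let type_id = tokens.next()?.strip_prefix("!u!")?.parse::<u32>().ok()?;
    let file_id = tokens.next()?.strip_prefix('&')?.parse::<i64>().ok()?;
    // Any trailing tokens are ignored in this sketch.
    Some(DocHeader { type_id, file_id })
}
```

Applied to the two header lines in the example above, this would yield type IDs 1 and 224 together with their anchors.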
Architecture
Core Components
cursebreaker-parser/
├── src/
│   ├── lib.rs                  # Public API exports
│   ├── parser/                 # YAML parsing layer
│   │   ├── mod.rs
│   │   ├── yaml.rs             # YAML document parser
│   │   ├── unity_tag.rs        # Unity type tag handler (!u!)
│   │   └── reference.rs        # FileID/GUID reference parser
│   ├── model/                  # Data model
│   │   ├── mod.rs
│   │   ├── document.rs         # UnityDocument struct
│   │   ├── object.rs           # UnityObject base
│   │   ├── gameobject.rs       # GameObject type
│   │   ├── component.rs        # Component types
│   │   └── property.rs         # Property value types
│   ├── types/                  # Unity type system
│   │   ├── mod.rs
│   │   ├── type_id.rs          # Unity type ID -> name mapping
│   │   └── component_types.rs
│   ├── query/                  # Query API
│   │   ├── mod.rs
│   │   ├── project.rs          # UnityProject (multi-file)
│   │   ├── find.rs             # Find objects/components
│   │   └── filter.rs           # Filter/search utilities
│   └── error.rs                # Error types
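As an illustration of what the type ID to name mapping in types/type_id.rs might look like, the sketch below uses a plain `match`. The function name `class_name` is an assumption. IDs 1 and 224 come from the example file earlier in this document; 4 (Transform) and 114 (MonoBehaviour) are added from Unity's published class ID list. A real table would cover many more IDs.

```rust
/// Maps a Unity class ID (the N in `!u!N`) to its class name.
/// Hypothetical function; only a handful of IDs are listed in this sketch.
pub fn class_name(type_id: u32) -> Option<&'static str> {
    match type_id {
        1 => Some("GameObject"),
        4 => Some("Transform"),
        114 => Some("MonoBehaviour"),
        224 => Some("RectTransform"),
        _ => None, // unknown or not yet listed
    }
}
```

Returning `Option` rather than panicking lets callers keep documents whose type is not in the table, for example by storing the raw numeric ID.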
Data Model
use std::collections::HashMap;
use std::path::PathBuf;

// PropertyMap is not defined in this document; an ordered map from field
// name to PropertyValue (for example via indexmap, see Dependencies) is
// one option.

// Core types
pub struct UnityFile {
    pub path: PathBuf,
    pub documents: Vec<UnityDocument>,
}

pub struct UnityDocument {
    pub type_id: u32,       // From !u!N
    pub file_id: i64,       // From &ID
    pub class_name: String, // E.g., "GameObject"
    pub properties: PropertyMap,
}

pub struct UnityProject {
    pub files: HashMap<PathBuf, UnityFile>,
    // Reference resolution cache
}

// Property values (simplified)
pub enum PropertyValue {
    Integer(i64),
    Float(f64),
    String(String),
    Boolean(bool),
    FileRef { file_id: i64, guid: Option<String> },
    Vector3 { x: f64, y: f64, z: f64 },
    Color { r: f64, g: f64, b: f64, a: f64 },
    Array(Vec<PropertyValue>),
    Object(PropertyMap),
}
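The `FileRef` variant corresponds to inline mappings such as `{fileID: 8151827567463220614}`. The following is a rough sketch of how such a mapping could be read by hand, assuming the value arrives as a single-line string. The standalone `FileRef` struct and the `parse_file_ref` helper are hypothetical, and the `type` key used in the usage note is an assumption about how external references are written, since this document shows none.

```rust
/// Mirrors the `FileRef` variant of `PropertyValue` as a standalone struct.
/// (Hypothetical type, for illustration.)
#[derive(Debug, PartialEq)]
pub struct FileRef {
    pub file_id: i64,
    pub guid: Option<String>,
}

/// Hypothetical helper: parses `{fileID: N}` or `{fileID: N, guid: G, ...}`.
/// Returns `None` if the braces or the `fileID` key are missing.
pub fn parse_file_ref(text: &str) -> Option<FileRef> {
    let inner = text.trim().strip_prefix('{')?.strip_suffix('}')?;
    let mut file_id = None;
    let mut guid = None;
    for pair in inner.split(',') {
        let (key, value) = pair.split_once(':')?;
        match key.trim() {
            "fileID" => file_id = Some(value.trim().parse::<i64>().ok()?),
            "guid" => guid = Some(value.trim().to_string()),
            _ => {} // other keys are ignored in this sketch
        }
    }
    Some(FileRef { file_id: file_id?, guid })
}
```

A mapping without a `fileID` key, such as `{x: 0, y: 0, z: 0}`, yields `None`, so the same entry point can be tried before falling back to `Vector3` or `Object`. A production parser would more likely get these mappings from the YAML library than from string splitting.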
Performance Considerations
- Streaming Parser: Parse YAML incrementally rather than loading the entire file into memory
- Lazy Loading: Only parse files when accessed
- Reference Caching: Cache resolved references to avoid repeated lookups
- Zero-Copy Where Possible: Use string slices and borrowed data where feasible
- Parallel Parsing: Support parsing multiple files concurrently
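To illustrate the zero-copy idea, the sketch below splits a file's text into documents as borrowed slices, so no document text is copied. `RawDocument` and `split_documents` are hypothetical names. The sketch assumes the whole file is already in memory as one `&str`, which is a simplification relative to the streaming goal listed above.

```rust
/// One document as borrowed slices of the source text; nothing is copied.
/// (Hypothetical type, for illustration.)
#[derive(Debug, PartialEq)]
pub struct RawDocument<'a> {
    pub header: &'a str, // the `--- !u!N &ID` line, without the newline
    pub body: &'a str,   // everything up to the next header line
}

/// Splits `source` at lines starting with `--- `. Lines before the first
/// header (such as `%YAML` and `%TAG` directives) are skipped.
pub fn split_documents(source: &str) -> Vec<RawDocument<'_>> {
    let mut docs = Vec::new();
    // Header line and byte offset where its body starts.
    let mut current: Option<(&str, usize)> = None;
    let mut offset = 0;
    for line in source.split_inclusive('\n') {
        if line.starts_with("--- ") {
            if let Some((header, start)) = current.take() {
                docs.push(RawDocument { header, body: &source[start..offset] });
            }
            current = Some((line.trim_end(), offset + line.len()));
        }
        offset += line.len();
    }
    if let Some((header, start)) = current {
        docs.push(RawDocument { header, body: &source[start..] });
    }
    docs
}
```

Because each `RawDocument` only borrows from `source`, the bodies can later be handed to separate parsing tasks without cloning, which fits the parallel parsing item as well.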
Dependencies
- yaml-rust2 or serde_yaml - YAML parsing (evaluate both)
- serde - Serialization/deserialization
- rayon - Parallel processing (optional, for multi-file parsing)
- thiserror - Error handling
- indexmap - Ordered maps for properties
Testing Strategy
- Unit Tests: Each parser component tested independently
- Integration Tests: Full file parsing with real Unity files
- Sample Data: Use PiratePanic project as test corpus
- Benchmarks: Performance tests on large Unity projects
- Fuzzing: Fuzz testing for parser robustness (future)
API Design Goals
Simple File Parsing
let file = UnityFile::from_path("Scene.unity")?;
for doc in &file.documents {
    println!("{}: {}", doc.class_name, doc.file_id);
}
Query API
let project = UnityProject::from_directory("Assets/")?;
// Find all GameObjects
let objects = project.find_all_by_type("GameObject");
// Find by name
let player = project.find_by_name("Player")?;
// Get components
let transform = player.get_component("Transform")?;
let position = transform.get_vector3("m_LocalPosition")?;
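One way the lookups above could work internally is a linear scan over parsed documents. The sketch below assumes a reduced `Doc` struct standing in for `UnityDocument` and free functions in place of `UnityProject` methods. The signatures are illustrative only and do not define the final API.

```rust
/// Reduced stand-in for `UnityDocument`, holding only what the queries need.
/// (Hypothetical type, for illustration.)
#[derive(Debug, PartialEq)]
pub struct Doc {
    pub file_id: i64,
    pub class_name: String,
    pub name: Option<String>, // value of `m_Name`, when present
}

/// Returns every document whose class name matches exactly.
pub fn find_all_by_type<'a>(docs: &'a [Doc], class_name: &str) -> Vec<&'a Doc> {
    docs.iter().filter(|d| d.class_name == class_name).collect()
}

/// Returns the first document whose `m_Name` matches exactly.
pub fn find_by_name<'a>(docs: &'a [Doc], name: &str) -> Option<&'a Doc> {
    docs.iter().find(|d| d.name.as_deref() == Some(name))
}
```

A scan is linear in the number of documents. If name lookups turn out to be frequent, a prebuilt name index would be a natural follow-up.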
Reference Resolution
// Follow references automatically
let gameobject = project.get_object(file_id)?;
let transform_ref = gameobject.get_file_ref("m_Component[0].component")?;
let transform = project.resolve_reference(transform_ref)?;
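Resolving a local reference amounts to a lookup from file ID to object. Below is a minimal sketch of an index that could sit behind `resolve_reference` for a single file. `Object` and `FileIndex` are hypothetical names, and cross-file (GUID) resolution is left out.

```rust
use std::collections::HashMap;

/// Reduced object model: just the anchor, class name, and component IDs.
/// (Hypothetical type, for illustration.)
pub struct Object {
    pub file_id: i64,
    pub class_name: String,
    pub components: Vec<i64>, // file IDs listed under `m_Component`
}

/// Index from file ID to position in `objects`, built once per file.
pub struct FileIndex {
    objects: Vec<Object>,
    by_id: HashMap<i64, usize>,
}

impl FileIndex {
    pub fn new(objects: Vec<Object>) -> Self {
        let by_id: HashMap<i64, usize> = objects
            .iter()
            .enumerate()
            .map(|(i, o)| (o.file_id, i))
            .collect();
        FileIndex { objects, by_id }
    }

    /// Returns `None` for IDs that are not defined in this file.
    pub fn resolve(&self, file_id: i64) -> Option<&Object> {
        self.by_id.get(&file_id).map(|&i| &self.objects[i])
    }
}
```

With the example file from earlier, resolving the first component ID of the CardGrabber GameObject would return the RectTransform object. An ID that is absent from the map simply yields `None`.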
Future Enhancements (Out of Scope for v1)
- Unity YAML serialization (writing files)
- C# script parsing
- Asset dependency graphs
- Unity version detection and compatibility
- Binary .unity format support (older Unity versions)
- Meta file parsing (.meta files)
Success Criteria
- Successfully parse all files in PiratePanic sample project
- Extract all GameObjects and Components with properties
- Resolve all internal file references correctly
- Parse large scene files (>10MB) in <100ms
- Memory usage scales linearly with file size
- Clean, documented public API