# Cursebreaker Unity Parser - Design Document

## Project Overview

A high-performance Rust library for parsing and querying Unity project files (`.unity` scenes, `.prefab` prefabs, and `.asset` ScriptableObjects).

## Goals

1. **Parse Unity YAML Format**: Handle Unity's YAML 1.1 format with custom tags (`!u!`) and file ID references
2. **Extract Structure**: Parse GameObjects, Components, and their properties into queryable data structures
3. **High Performance**: Optimized for large Unity projects with minimal memory footprint
4. **Type Safety**: Strong typing for Unity's component system
5. **Library-First**: Designed as a reusable SDK for other Rust tools

## Target File Formats

- `.unity` - Unity scene files
- `.prefab` - Unity prefab files
- `.asset` - Unity ScriptableObject and other asset files

All three formats share the same underlying YAML structure with Unity-specific extensions.

## Unity File Format Structure

Unity files use YAML 1.1 with special conventions:

```yaml
%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!1 &1866116814460599870
GameObject:
  m_ObjectHideFlags: 0
  m_Component:
  - component: {fileID: 8151827567463220614}
  - component: {fileID: 8755205353704683373}
  m_Name: CardGrabber
--- !u!224 &8151827567463220614
RectTransform:
  m_GameObject: {fileID: 1866116814460599870}
  m_LocalPosition: {x: 0, y: 0, z: 0}
```

### Key Concepts

1. **Documents**: Each `---` starts a new YAML document representing a Unity object
2. **Type Tags**: `!u!N` indicates the Unity type (e.g., `!u!1` = GameObject, `!u!224` = RectTransform)
3. **Anchors**: `&ID` defines the object's file ID, which is local to the file
4. **File References**: `{fileID: N}` references objects by their ID (local or external)
5. **GUID References**: `{guid: ...}` references an external asset; it appears alongside a `fileID` in the same mapping
6. **Properties**: All Unity objects have serialized fields (usually prefixed with `m_`)
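
The type tag and anchor from the list above both live on the document separator line, so they can be read without a full YAML parser. The following is a minimal sketch under that assumption; `DocHeader` and `parse_doc_header` are hypothetical names, not part of the planned API.

```rust
/// Header of one Unity YAML document, e.g. `--- !u!1 &1866116814460599870`.
/// (Hypothetical helper type for illustration.)
#[derive(Debug, PartialEq, Eq)]
pub struct DocHeader {
    pub type_id: u32, // from `!u!N`
    pub file_id: i64, // from `&ID`
}

/// Parses a document separator line; returns `None` for any other line.
pub fn parse_doc_header(line: &str) -> Option<DocHeader> {
    let rest = line.strip_prefix("--- !u!")?;
    // Tokens after the tag prefix: the numeric type ID, then `&ID`.
    // Any further token (Unity may append `stripped`) is ignored here.
    let mut parts = rest.split_whitespace();
    let type_id: u32 = parts.next()?.parse().ok()?;
    let file_id: i64 = parts.next()?.strip_prefix('&')?.parse().ok()?;
    Some(DocHeader { type_id, file_id })
}
```

Lines that are not separators, such as `GameObject:` or a property line, yield `None`, so the same function can double as a cheap document-boundary test.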

## Architecture

### Core Components

```
cursebreaker-parser/
├── src/
│   ├── lib.rs              # Public API exports
│   ├── parser/             # YAML parsing layer
│   │   ├── mod.rs
│   │   ├── yaml.rs         # YAML document parser
│   │   ├── unity_tag.rs    # Unity type tag handler (!u!)
│   │   └── reference.rs    # FileID/GUID reference parser
│   ├── model/              # Data model
│   │   ├── mod.rs
│   │   ├── document.rs     # UnityDocument struct
│   │   ├── object.rs       # UnityObject base
│   │   ├── gameobject.rs   # GameObject type
│   │   ├── component.rs    # Component types
│   │   └── property.rs     # Property value types
│   ├── types/              # Unity type system
│   │   ├── mod.rs
│   │   ├── type_id.rs      # Unity type ID -> name mapping
│   │   └── component_types.rs
│   ├── query/              # Query API
│   │   ├── mod.rs
│   │   ├── project.rs      # UnityProject (multi-file)
│   │   ├── find.rs         # Find objects/components
│   │   └── filter.rs       # Filter/search utilities
│   └── error.rs            # Error types
```
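
As an illustration of what `types/type_id.rs` might contain, the sketch below maps a numeric type ID to a class name. The function name `class_name` is hypothetical. IDs 1 and 224 come from the example in this document; 4 and 114 are taken from Unity's published class ID list, and a real table would be much longer.

```rust
/// Maps a Unity class ID (the `N` in `!u!N`) to its class name.
/// Only a handful of IDs are listed; unknown IDs return `None`.
pub fn class_name(type_id: u32) -> Option<&'static str> {
    match type_id {
        1 => Some("GameObject"),
        4 => Some("Transform"),
        114 => Some("MonoBehaviour"),
        224 => Some("RectTransform"),
        _ => None,
    }
}
```

Returning `Option` rather than panicking keeps the parser usable on files that contain types the table does not know yet.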

### Data Model

```rust
use std::collections::HashMap;
use std::path::PathBuf;

// Core types
pub struct UnityFile {
    pub path: PathBuf,
    pub documents: Vec<UnityDocument>,
}

pub struct UnityDocument {
    pub type_id: u32,       // From !u!N
    pub file_id: i64,       // From &ID
    pub class_name: String, // E.g., "GameObject"
    pub properties: PropertyMap,
}

pub struct UnityProject {
    pub files: HashMap<PathBuf, UnityFile>,
    // Reference resolution cache
}

// Property values (simplified).
// `PropertyMap` is not defined in this sketch; it is expected to be an
// ordered map from property name to `PropertyValue`.
pub enum PropertyValue {
    Integer(i64),
    Float(f64),
    String(String),
    Boolean(bool),
    FileRef { file_id: i64, guid: Option<String> },
    Vector3 { x: f64, y: f64, z: f64 },
    Color { r: f64, g: f64, b: f64, a: f64 },
    Array(Vec<PropertyValue>),
    Object(PropertyMap),
}
```
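
To make the `FileRef` variant concrete, here is a small sketch that turns an inline mapping such as `{fileID: 8151827567463220614}` into a reference value. It is hand-rolled for illustration only; in the actual library the YAML layer would most likely do this work. The standalone `FileRef` struct and `parse_file_ref` are hypothetical names.

```rust
/// Standalone mirror of the `FileRef` variant, for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub struct FileRef {
    pub file_id: i64,
    pub guid: Option<String>,
}

/// Parses an inline mapping like `{fileID: 123, guid: abc, type: 3}`.
/// Returns `None` if the text is not a mapping or has no `fileID` key.
pub fn parse_file_ref(text: &str) -> Option<FileRef> {
    let inner = text.trim().strip_prefix('{')?.strip_suffix('}')?;
    let mut file_id = None;
    let mut guid = None;
    for pair in inner.split(',') {
        let (key, value) = pair.split_once(':')?;
        match key.trim() {
            "fileID" => file_id = Some(value.trim().parse::<i64>().ok()?),
            "guid" => guid = Some(value.trim().to_string()),
            _ => {} // other keys (e.g. `type`) are ignored in this sketch
        }
    }
    Some(FileRef { file_id: file_id?, guid })
}
```

A mapping without a `fileID` key, such as `{x: 0, y: 0, z: 0}`, is rejected, which lets a caller fall through to the `Vector3` or `Object` cases.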

## Performance Considerations

1. **Streaming Parser**: Parse YAML incrementally rather than loading the entire file into memory
2. **Lazy Loading**: Only parse files when they are accessed
3. **Reference Caching**: Cache resolved references to avoid repeated lookups
4. **Zero-Copy Where Possible**: Use string slices and borrowed data where feasible
5. **Parallel Parsing**: Support parsing multiple files concurrently
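
One way to picture the zero-copy point above is to split a file into per-object chunks that borrow from the input instead of allocating new strings. The sketch below assumes the text is already in memory, so it is not a streaming parser, and `split_documents` is a hypothetical helper name.

```rust
/// Splits Unity YAML text into per-object chunks without copying.
/// Each returned slice starts at a `--- ` separator line and borrows
/// from `text`; anything before the first separator (the `%YAML` and
/// `%TAG` directives) is skipped.
pub fn split_documents(text: &str) -> Vec<&str> {
    // First pass: record the byte offset of every separator line.
    let mut starts = Vec::new();
    let mut offset = 0;
    for line in text.split_inclusive('\n') {
        if line.starts_with("--- ") {
            starts.push(offset);
        }
        offset += line.len();
    }

    // Second pass: slice between consecutive separator offsets.
    let mut docs = Vec::with_capacity(starts.len());
    for (i, &start) in starts.iter().enumerate() {
        let end = starts.get(i + 1).copied().unwrap_or(text.len());
        docs.push(&text[start..end]);
    }
    docs
}
```

Because each chunk is an independent `&str`, chunks could later be parsed lazily or handed to separate worker threads without further copying.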

## Dependencies

- `yaml-rust2` or `serde_yaml` - YAML parsing (evaluate both)
- `serde` - Serialization/deserialization
- `rayon` - Parallel processing (optional, for multi-file parsing)
- `thiserror` - Error handling
- `indexmap` - Ordered maps for properties

## Testing Strategy

1. **Unit Tests**: Each parser component is tested independently
2. **Integration Tests**: Full file parsing with real Unity files
3. **Sample Data**: Use the PiratePanic project as the test corpus
4. **Benchmarks**: Performance tests on large Unity projects
5. **Fuzzing**: Fuzz testing for parser robustness (future)

## API Design Goals

### Simple File Parsing

```rust
let file = UnityFile::from_path("Scene.unity")?;
for doc in &file.documents {
    println!("{}: {}", doc.class_name, doc.file_id);
}
```

### Query API

```rust
let project = UnityProject::from_directory("Assets/")?;

// Find all GameObjects
let objects = project.find_all_by_type("GameObject");

// Find by name
let player = project.find_by_name("Player")?;

// Get components
let transform = player.get_component("Transform")?;
let position = transform.get_vector3("m_LocalPosition")?;
```

### Reference Resolution

```rust
// Follow references automatically
let gameobject = project.get_object(file_id)?;
let transform_ref = gameobject.get_file_ref("m_Component[0].component")?;
let transform = project.resolve_reference(transform_ref)?;
```
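
Resolving a local reference amounts to looking up a file ID among the documents of one file. The sketch below shows one possible shape for that lookup, using a `HashMap` built once per file. `Doc` and `FileIndex` are hypothetical stand-ins, not the `UnityDocument` and `UnityProject` types described in this document, and cross-file (GUID) resolution is left out.

```rust
use std::collections::HashMap;

/// Reduced stand-in for a parsed document (hypothetical).
pub struct Doc {
    pub file_id: i64,
    pub class_name: String,
}

/// Index over the documents of a single file, keyed by file ID.
pub struct FileIndex {
    docs: Vec<Doc>,
    by_id: HashMap<i64, usize>,
}

impl FileIndex {
    /// Builds the lookup table once, so each later resolution is O(1).
    pub fn new(docs: Vec<Doc>) -> Self {
        let by_id = docs
            .iter()
            .enumerate()
            .map(|(i, d)| (d.file_id, i))
            .collect();
        FileIndex { docs, by_id }
    }

    /// Resolves a local `fileID`; `None` if no such object exists.
    pub fn resolve(&self, file_id: i64) -> Option<&Doc> {
        // Unity writes `{fileID: 0}` for a null reference.
        if file_id == 0 {
            return None;
        }
        self.by_id.get(&file_id).map(|&i| &self.docs[i])
    }
}
```

Storing indices rather than references in the map avoids self-referential borrows, at the cost of one extra bounds-checked access per lookup.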

## Future Enhancements (Out of Scope for v1)

- Unity YAML serialization (writing files)
- C# script parsing
- Asset dependency graphs
- Unity version detection and compatibility
- Binary .unity format support (older Unity versions)
- Meta file parsing (.meta files)

## Success Criteria

1. Successfully parse all files in the PiratePanic sample project
2. Extract all GameObjects and Components with their properties
3. Resolve all internal file references correctly
4. Parse large scene files (>10 MB) in <100 ms
5. Memory usage scales linearly with file size
6. Clean, documented public API