# XML Parser Documentation

This document explains the XML parsing system used to load game data from Cursebreaker's XML files and populate the SQLite database.

## Overview

The XML parser system is responsible for:

1. Reading game data from XML files (items, NPCs, quests, etc.)
2. Parsing the XML into Rust structs
3. Storing the parsed data in a SQLite database

## Architecture

### File Structure

```
cursebreaker-parser/src/
├── xml_parsers/              # XML parsing module
│   ├── mod.rs                # Shared utilities and re-exports
│   ├── items.rs              # Item parser
│   ├── npcs.rs               # NPC parser
│   ├── quests.rs             # Quest parser
│   ├── harvestables.rs       # Harvestable resource parser
│   ├── loot.rs               # Loot table parser
│   ├── maps.rs               # Map/scene parser
│   ├── fast_travel.rs        # Fast travel location parser
│   ├── player_houses.rs      # Player house parser
│   ├── traits.rs             # Character trait parser
│   └── shops.rs              # Shop/vendor parser
├── databases/                # Database abstraction layer
│   ├── item_database.rs
│   ├── npc_database.rs
│   └── ...
├── types/                    # Data structures
│   └── cursebreaker/
│       ├── item.rs
│       ├── npc.rs
│       └── ...
└── bin/
    └── xml-parser.rs         # CLI binary
```

### Data Flow

```
XML Files (CBAssets/Data/XMLs/)
        │
        ▼
XML Parsers (xml_parsers/*.rs)
        │
        ▼
Rust Structs (types/cursebreaker/*.rs)
        │
        ▼
Database Layer (databases/*.rs)
        │
        ▼
SQLite Database (cursebreaker.db)
```

## Parser Components

### Shared Utilities (`xml_parsers/mod.rs`)

The module provides common functionality used by all parsers:

```rust
/// Error types for XML parsing
pub enum XmlParseError {
    XmlError(quick_xml::Error),    // XML syntax errors
    IoError(std::io::Error),       // File read errors
    AttrError(AttrError),          // Attribute parsing errors
    MissingAttribute(String),      // Required attribute not found
    InvalidAttribute(String),      // Attribute value invalid
}

/// Parse XML element attributes into a HashMap
fn parse_attributes(element: &BytesStart) -> Result<HashMap<String, String>, XmlParseError>

/// Parse health range strings like "3-5" or "3" into (min, max)
fn parse_health_range(health_str: &str) -> (i32, i32)
```

### Individual Parsers

Each parser follows a similar pattern:

1. **Open and read the XML file** using `quick_xml::Reader`
2. **Iterate through XML events** (Start, Empty, End, Text, Eof)
3. **Match element names** and extract attributes
4. **Build Rust structs** from the parsed data
5. **Return a Vec** of parsed objects

#### Example: Item Parser Flow

```rust
use std::fs::File;
use std::io::BufReader;
use std::path::Path;

use quick_xml::events::Event;
use quick_xml::Reader;

pub fn parse_items_xml<P: AsRef<Path>>(path: P) -> Result<Vec<Item>, XmlParseError> {
    // 1. Open file and create reader
    let file = File::open(path)?;
    let mut reader = Reader::from_reader(BufReader::new(file));

    // Parser state: event buffer, finished items, and the item being built
    let mut buf = Vec::new();
    let mut items = Vec::new();
    let mut current_item: Option<Item> = None;

    // 2. Process XML events
    loop {
        match reader.read_event_into(&mut buf) {
            Ok(Event::Start(e)) | Ok(Event::Empty(e)) => {
                match e.name().as_ref() {
                    b"item" => {
                        // 3. Parse attributes
                        let attrs = parse_attributes(&e)?;
                        let id = attrs.get("id")...;
                        let name = attrs.get("name")...;

                        // 4. Create struct
                        let item = Item::new(id, name);
                        current_item = Some(item);
                    }
                    b"stat" => { /* Parse nested stat element */ }
                    _ => {}
                }
            }
            Ok(Event::End(e)) => {
                if e.name().as_ref() == b"item" {
                    // 5. Add completed item to results
                    if let Some(item) = current_item.take() {
                        items.push(item);
                    }
                }
            }
            Ok(Event::Eof) => break,
            Err(e) => return Err(XmlParseError::XmlError(e)),
            _ => {}
        }
        // Reuse the buffer for the next event
        buf.clear();
    }

    Ok(items)
}
```

## Supported Data Types

| Parser | XML Source | Description |
|--------|-----------|-------------|
| `items` | `Items/Items.xml` | Game items (weapons, armor, consumables, etc.) |
| `npcs` | `Npcs/NPCInfo.xml` | Non-player characters (enemies, vendors, quest givers) |
| `quests` | `Quests/Quests.xml` | Quest definitions with phases and rewards |
| `harvestables` | `Harvestables/HarvestableInfo.xml` | Gatherable resources (trees, rocks, fishing spots) |
| `loot` | `Loot/Loot.xml` | NPC drop tables |
| `maps` | `Maps/Maps.xml` | Game scenes/areas with lighting and fog settings |
| `fast_travel` | `FastTravel*.xml` | Teleport locations, canoe routes, portals |
| `player_houses` | `PlayerHouses/PlayerHouses.xml` | Purchasable player housing |
| `traits` | `Traits/Traits.xml` | Character traits/perks |
| `shops` | `Shops/Shops.xml` | Vendor inventories and pricing |

## CLI Usage

The `xml-parser` binary provides command-line control over which parsers to run:

```bash
# Parse all data types
xml-parser --all
xml-parser -a

# Parse specific data types
xml-parser --items          # or -i
xml-parser --npcs           # or -n
xml-parser --quests         # or -q
xml-parser --harvestables   # or -r
xml-parser --loot           # or -l
xml-parser --maps           # or -m
xml-parser --fast-travel    # or -f
xml-parser --houses         # or -p
xml-parser --traits         # or -t
xml-parser --shops          # or -s

# Combine multiple parsers
xml-parser --items --npcs --quests
xml-parser -i -n -q

# View help
xml-parser --help
```

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `CB_ASSETS_PATH` | `/home/connor/repos/CBAssets` | Path to game assets directory |
| `DATABASE_URL` | `cursebreaker.db` | SQLite database file path |

## Database Integration

Each parser has a corresponding database module that handles:

1. **Loading from XML** - Wraps the parser and creates a queryable database
2. **Querying** - Methods like `get_by_id()`, `get_by_name()`, `get_all()`
3. **Saving to SQLite** - Serializes data and inserts into database tables

### Example: ItemDatabase

```rust
// Load items from XML
let item_db = ItemDatabase::load_from_xml("path/to/Items.xml")?;

// Query items
let sword = item_db.get_by_id(150);
let bows = item_db.get_by_category("bow");

// Save to database (includes icon processing)
item_db.save_to_db_with_images(&mut conn, "path/to/icons")?;
```

## XML Format Examples

### Item XML

```xml
```

### NPC XML

```xml
```

### Quest XML

```xml
```

## Error Handling

The parser uses a custom `XmlParseError` enum to handle various failure modes:

- **MissingAttribute**: Required XML attribute not found (e.g., missing `id`)
- **InvalidAttribute**: Attribute value cannot be parsed (e.g., non-numeric ID)
- **XmlError**: Malformed XML syntax
- **IoError**: File not found or permission denied

Parsers fail fast on required attributes but use defaults for optional ones:

```rust
// Required - returns error if missing
let id = attrs.get("id")
    .ok_or_else(|| XmlParseError::MissingAttribute("id".to_string()))?;

// Optional - uses default if missing
let level = attrs.get("level")
    .and_then(|v| v.parse().ok())
    .unwrap_or(1);
```

## Performance Considerations

- **Streaming parser**: Uses `quick_xml`, which processes XML as a stream, keeping memory usage low
- **Single-pass parsing**: Each file is read once and parsed in a single pass
- **Batch database inserts**: Data is collected into vectors before database insertion
- **Selective parsing**: The CLI allows parsing only the needed data types, reducing processing time

## Adding a New Parser

To add support for a new XML data type:

1. **Create the type** in `types/cursebreaker/new_type.rs`
2. **Create the parser** in `xml_parsers/new_type.rs`
3. **Export from mod.rs**: Add `mod new_type;` and `pub use new_type::parse_new_type_xml;`
4. **Create database module** in `databases/new_type_database.rs`
5. **Add CLI flag** in `bin/xml-parser.rs`
6. **Update this documentation**
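
As a rough illustration of steps 1 and 2 above, the sketch below defines a data type and builds it from an attribute map, following the required/optional attribute pattern from the Error Handling section. Everything in it is hypothetical and not taken from the codebase: the `Recipe` type, the `recipe_from_attrs` helper, and the `AttrParseError` enum, which stands in for the two attribute-related variants of `XmlParseError` so that the example compiles with the standard library alone. It assumes attributes have already been collected into a `HashMap<String, String>`, as `parse_attributes` is described as doing.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the attribute-related variants of `XmlParseError`
/// (the real enum also wraps quick_xml and I/O errors).
#[derive(Debug, PartialEq)]
pub enum AttrParseError {
    MissingAttribute(String),
    InvalidAttribute(String),
}

/// Hypothetical data type; in the layout described above it would live in
/// `types/cursebreaker/new_type.rs`.
#[derive(Debug, PartialEq)]
pub struct Recipe {
    pub id: i32,
    pub name: String,
    pub level: i32,
}

/// Hypothetical helper: build a `Recipe` from an already-parsed attribute map.
pub fn recipe_from_attrs(attrs: &HashMap<String, String>) -> Result<Recipe, AttrParseError> {
    // Required and numeric: error if the attribute is missing or not an integer.
    let id = attrs
        .get("id")
        .ok_or_else(|| AttrParseError::MissingAttribute("id".to_string()))?
        .parse::<i32>()
        .map_err(|_| AttrParseError::InvalidAttribute("id".to_string()))?;

    // Required string: error if the attribute is missing.
    let name = attrs
        .get("name")
        .ok_or_else(|| AttrParseError::MissingAttribute("name".to_string()))?
        .clone();

    // Optional: fall back to 1 if the attribute is missing or unparsable.
    let level = attrs
        .get("level")
        .and_then(|v| v.parse::<i32>().ok())
        .unwrap_or(1);

    Ok(Recipe { id, name, level })
}
```

In a real parser, a helper like this would be called from the `Event::Start` / `Event::Empty` branch once the element name matches, with the resulting struct pushed onto the output `Vec`.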