XML Generator CLI Tool
This Python CLI tool processes semicolon-separated CSV files to generate XML files with <pack_content> sections.
Features
- Reads semicolon-separated CSV files (Excel format)
- Handles BOM (Byte Order Mark) characters automatically
- Groups individual CIS codes by SET CIS values
- Generates XML using a template file
- Supports column name override for CIS codes
- Provides dry-run mode for testing
- Escapes special characters in the generated XML
- NEW: Set composition validation using dictionary rules
- NEW: Auto-generation of document ID and operation time
- NEW: Custom document parameters (ID, number, operation time)
- NEW: Validation-only mode for checking data integrity
- NEW: Comprehensive error and warning reporting
Installation
pip install -r requirements.txt
Usage
Basic Usage
python xml_generator.py input.csv template.xml -o output.xml
Options
- --output/-o FILE: Output XML file path (prints to stdout if not specified)
- --cis-column/-c TEXT: Column name for CIS codes (default: "Код")
- --encoding/-e TEXT: CSV file encoding (default: utf-8)
- --dry-run: Show what would be processed without generating output
- --set-dict FILE: Path to set dictionary CSV file for validation
- --document-id TEXT: Document ID to use in XML (auto-generated if not provided)
- --document-number TEXT: Document number to use in XML
- --operation-time TEXT: Operation time in ISO format (auto-generated if not provided)
- --validate-only: Only validate composition without generating XML
Examples
Basic XML generation:
python xml_generator.py set_distributed.csv sets_creation.xml -o output.xml
With validation and custom parameters:
python xml_generator.py --set-dict set_dict.csv --document-id "DOC_123" --document-number "NUM_456" --operation-time "2024-01-15T10:30:00+03:00" set_distributed.csv sets_creation.xml -o output.xml
Validation only (no XML generation):
python xml_generator.py --validate-only --set-dict set_dict.csv set_distributed.csv sets_creation.xml
Dry run with validation:
python xml_generator.py --dry-run --set-dict set_dict.csv set_distributed.csv sets_creation.xml
Override CIS column:
python xml_generator.py --cis-column "MyColumn" input.csv template.xml -o output.xml
Auto-generated parameters (document ID and operation time are generated automatically):
python xml_generator.py --document-number "MY_DOC_001" set_distributed.csv sets_creation.xml -o output.xml
CSV File Format
Distributed CSV File (set_distributed.csv)
The main CSV file should be semicolon-separated with columns:
- SET CIS: Pack codes that become <pack_code> elements
- Код (or specified column): Individual CIS codes that become <cis> elements
- SET GTIN: SET GTIN codes for validation (optional)
- GTIN: Individual GTIN codes for validation (optional)
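The grouping step described above can be sketched roughly as follows. This is an illustration only, not the tool's actual code: the function name `group_cis_by_set` is hypothetical, and skipping rows with an empty SET CIS or CIS value is an assumption.

```python
import csv
import io


def group_cis_by_set(csv_text, cis_column="Код"):
    """Group individual CIS codes under their SET CIS value (hypothetical helper)."""
    # Drop a leading BOM so the first header is read as "SET CIS", not "\ufeffSET CIS".
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("\ufeff")), delimiter=";")
    groups = {}
    for row in reader:
        set_cis = (row.get("SET CIS") or "").strip()
        cis = (row.get(cis_column) or "").strip()
        if not set_cis or not cis:
            continue  # assumption: incomplete rows are ignored
        groups.setdefault(set_cis, []).append(cis)
    return groups
```

When reading from a file rather than a string, opening it with `open(path, encoding="utf-8-sig", newline="")` is a common way to have Python strip the BOM for you.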
Set Dictionary CSV File (set_dict.csv)
For validation, provide a semicolon-separated dictionary file with:
- GTIN SET: SET GTIN codes (matches SET GTIN in distributed file)
- GTIN ITEM: Individual GTIN codes (matches GTIN in distributed file)
- COUNT: Expected count of each GTIN in the set
- SET NAME: Descriptive name of the set (optional)
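One plausible way to load such a dictionary is to build a nested mapping from each set GTIN to its expected item counts. The sketch below assumes the column names listed above and an integer COUNT; `load_set_dictionary` is a hypothetical name, not necessarily what the tool uses.

```python
import csv
import io


def load_set_dictionary(csv_text):
    """Return {GTIN SET: {GTIN ITEM: expected count}} (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("\ufeff")), delimiter=";")
    rules = {}
    for row in reader:
        set_gtin = (row.get("GTIN SET") or "").strip()
        item_gtin = (row.get("GTIN ITEM") or "").strip()
        if not set_gtin or not item_gtin:
            continue  # assumption: rows without both GTINs are skipped
        # SET NAME is descriptive only, so it is not needed for the rules.
        rules.setdefault(set_gtin, {})[item_gtin] = int(row["COUNT"])
    return rules
```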
Validation Features
When using --set-dict, the tool validates that each SET CIS contains the correct composition:
- Missing GTINs: Checks if required GTIN items are missing from sets
- Wrong counts: Verifies that GTIN item counts match expectations
- Extra GTINs: Identifies unexpected GTIN items in sets
Validation modes:
- --validate-only: Only validate without generating XML
- --dry-run: Show validation results and processing preview
- Normal mode: Validate and generate XML (errors prevent generation)
Validation Output
- ✅ OK: Set composition is valid
- ⚠️ WARNING: Set has unexpected items but required items are present
- ❌ ERROR: Set is missing required items or has wrong counts
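The checks and statuses above could be expressed along these lines. This is a minimal sketch under stated assumptions: `validate_set` is a hypothetical function, `expected` maps each required GTIN to its count, and `actual_gtins` is the list of GTIN values found for one SET CIS.

```python
from collections import Counter


def validate_set(expected, actual_gtins):
    """Compare one set's GTINs with its expected composition (hypothetical helper)."""
    actual = Counter(actual_gtins)
    errors, warnings = [], []
    for gtin, count in expected.items():
        found = actual.get(gtin, 0)
        if found == 0:
            errors.append(f"missing GTIN {gtin} (expected {count})")
        elif found != count:
            errors.append(f"wrong count for GTIN {gtin}: expected {count}, found {found}")
    for gtin in actual:
        if gtin not in expected:
            warnings.append(f"unexpected GTIN {gtin}")
    # Errors take priority over warnings when choosing the overall status.
    if errors:
        status = "ERROR"
    elif warnings:
        status = "WARNING"
    else:
        status = "OK"
    return status, errors, warnings
```

For example, `validate_set({"G1": 2}, ["G1", "G1", "G3"])` yields a `"WARNING"` status, because the required items are present but `G3` is unexpected.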
XML Template
The tool uses an XML template file and replaces existing <pack_content> sections with generated data. The generated sections are inserted before the </Document> tag.
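As a rough illustration of that behavior, the sketch below removes any existing <pack_content> sections with a regular expression and places the generated text before the last </Document> tag. The function name `insert_pack_content` and the regex-based approach are assumptions; the tool itself may handle the template differently.

```python
import re


def insert_pack_content(template, generated):
    """Swap a template's <pack_content> sections for generated ones (hypothetical helper)."""
    # Remove existing sections together with the whitespace that precedes them.
    stripped = re.sub(
        r"\s*<pack_content>.*?</pack_content>", "", template, flags=re.DOTALL
    )
    idx = stripped.rfind("</Document>")
    if idx == -1:
        raise ValueError("template has no </Document> closing tag")
    return stripped[:idx] + generated + "\n" + stripped[idx:]
```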
Template Parameters
The tool can automatically replace these parameters in your template:
- document_id="..." - Replaced with --document-id value
- document_number="..." - Replaced with --document-number value
- operation_date_time="..." - Replaced with --operation-time value
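A replacement of this kind might look like the sketch below. It is not the tool's implementation: `fill_template_parameters` is a hypothetical name, and the use of a UUID for the auto-generated document ID and of the local time zone for the auto-generated operation time are assumptions, since the format the tool actually produces is not documented here.

```python
import re
import uuid
from datetime import datetime
from xml.sax.saxutils import escape


def fill_template_parameters(template, document_id=None, document_number=None,
                             operation_time=None):
    """Replace document attributes in the template text (hypothetical helper)."""
    if document_id is None:
        document_id = str(uuid.uuid4())  # assumption: real ID format may differ
    if operation_time is None:
        # ISO 8601 timestamp with the local UTC offset, e.g. 2024-01-15T10:30:00+03:00
        operation_time = datetime.now().astimezone().isoformat(timespec="seconds")
    values = {
        "document_id": document_id,
        "document_number": document_number,
        "operation_date_time": operation_time,
    }
    for name, value in values.items():
        if value is None:
            continue  # leave the template's own value untouched
        quoted = escape(value, {'"': "&quot;"})
        # A function replacement avoids backslash interpretation in the value.
        template = re.sub(
            rf'\b{name}="[^"]*"',
            lambda match, n=name, q=quoted: f'{n}="{q}"',
            template,
        )
    return template
```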
Example Output
<pack_content>
<pack_code><![CDATA[0104639970975627215!rYq<zP+sPBY]]></pack_code>
<cis><![CDATA[0104639970975245215!lv).%ApB'P>]]></cis>
<cis><![CDATA[0104639970975306215"ZqzWoJbOFB4]]></cis>
<cis><![CDATA[0104639970975276215%fTQ*tVRESUU]]></cis>
</pack_content>
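Output of this shape can be produced with a few lines of string building. The sketch below is an assumption about how such sections might be rendered, with hypothetical names `cdata` and `render_pack_content`; it also handles the one sequence that cannot appear inside a CDATA section, `]]>`, by splitting it across two sections.

```python
def cdata(value):
    """Wrap a value in CDATA, splitting any "]]>" so the section stays valid."""
    return "<![CDATA[" + value.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def render_pack_content(pack_code, cis_codes):
    """Render one <pack_content> section for a pack code and its CIS codes."""
    lines = ["<pack_content>", f"<pack_code>{cdata(pack_code)}</pack_code>"]
    lines += [f"<cis>{cdata(code)}</cis>" for code in cis_codes]
    lines.append("</pack_content>")
    return "\n".join(lines)
```

Because CDATA content is taken literally by XML parsers, characters such as `<`, `&`, and `"` in CIS codes need no further escaping.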
Requirements
- Python 3.6+
- click>=8.0.0