XML Generator CLI Tool

This Python CLI tool processes semicolon-separated CSV files to generate XML files with <pack_content> sections.

Features

  • Reads semicolon-separated CSV files (Excel format)
  • Handles BOM (Byte Order Mark) characters automatically
  • Groups individual CIS codes by SET CIS values
  • Generates XML using a template file
  • Supports column name override for CIS codes
  • Provides dry-run mode for testing
  • Escapes special characters in the generated XML
  • NEW: Set composition validation using dictionary rules
  • NEW: Auto-generation of document ID and operation time
  • NEW: Custom document parameters (ID, number, operation time)
  • NEW: Validation-only mode for checking data integrity
  • NEW: Comprehensive error and warning reporting

Installation

pip install -r requirements.txt

Usage

Basic Usage

python xml_generator.py input.csv template.xml -o output.xml

Options

  • --output/-o FILE: Output XML file path (prints to stdout if not specified)
  • --cis-column/-c TEXT: Column name for CIS codes (default: "Код")
  • --encoding/-e TEXT: CSV file encoding (default: utf-8)
  • --dry-run: Show what would be processed without generating output
  • --set-dict FILE: Path to set dictionary CSV file for validation
  • --document-id TEXT: Document ID to use in XML (auto-generated if not provided)
  • --document-number TEXT: Document number to use in XML
  • --operation-time TEXT: Operation time in ISO 8601 format, e.g. 2024-01-15T10:30:00+03:00 (auto-generated if not provided)
  • --validate-only: Only validate composition without generating XML

Examples

  1. Basic XML generation:

    python xml_generator.py set_distributed.csv sets_creation.xml -o output.xml
    
  2. With validation and custom parameters:

    python xml_generator.py --set-dict set_dict.csv --document-id "DOC_123" --document-number "NUM_456" --operation-time "2024-01-15T10:30:00+03:00" set_distributed.csv sets_creation.xml -o output.xml
    
  3. Validation only (no XML generation):

    python xml_generator.py --validate-only --set-dict set_dict.csv set_distributed.csv sets_creation.xml
    
  4. Dry run with validation:

    python xml_generator.py --dry-run --set-dict set_dict.csv set_distributed.csv sets_creation.xml
    
  5. Override CIS column:

    python xml_generator.py --cis-column "MyColumn" input.csv template.xml -o output.xml
    
  6. Auto-generated parameters:

    python xml_generator.py --document-number "MY_DOC_001" set_distributed.csv sets_creation.xml -o output.xml
    # Document ID and operation time will be auto-generated
    

CSV File Format

Distributed CSV File (set_distributed.csv)

The main CSV file should be semicolon-separated and contain these columns:

  • SET CIS: Pack codes that become <pack_code> elements
  • Код (or specified column): Individual CIS codes that become <cis> elements
  • SET GTIN: SET GTIN codes for validation (optional)
  • GTIN: Individual GTIN codes for validation (optional)
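
The grouping described above can be sketched with the standard csv module. This is only an illustration, not the tool's actual code: the function name group_cis_by_set and the sample values are hypothetical, and rows with an empty SET CIS or CIS value are skipped as an assumption.

```python
import csv
import io


def group_cis_by_set(csv_text, cis_column="Код"):
    """Group individual CIS codes by SET CIS, keeping the row order."""
    # Drop a leading BOM so the first header is "SET CIS", not "\ufeffSET CIS".
    csv_text = csv_text.lstrip("\ufeff")
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    groups = {}
    for row in reader:
        set_cis = (row.get("SET CIS") or "").strip()
        cis = (row.get(cis_column) or "").strip()
        if not set_cis or not cis:
            continue  # assumption: incomplete rows are ignored
        groups.setdefault(set_cis, []).append(cis)
    return groups


sample = (
    "\ufeffSET CIS;Код;SET GTIN;GTIN\n"
    "PACK1;CIS_A;0463;0461\n"
    "PACK1;CIS_B;0463;0462\n"
    "PACK2;CIS_C;0463;0461\n"
)
print(group_cis_by_set(sample))
```

When reading from disk, opening the file with encoding="utf-8-sig" removes the BOM in the same way.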

Set Dictionary CSV File (set_dict.csv)

For validation, provide a semicolon-separated dictionary file with:

  • GTIN SET: SET GTIN codes (matches SET GTIN in distributed file)
  • GTIN ITEM: Individual GTIN codes (matches GTIN in distributed file)
  • COUNT: Expected count of each GTIN in the set
  • SET NAME: Descriptive name of the set (optional)
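
As a rough sketch, the dictionary file could be loaded into a nested mapping of expected counts. The helper name load_set_dict and the sample GTIN values are hypothetical; the column names are the ones listed above.

```python
import csv
import io


def load_set_dict(csv_text):
    """Return {set_gtin: {item_gtin: expected_count}} from dictionary CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("\ufeff")), delimiter=";")
    rules = {}
    for row in reader:
        set_gtin = row["GTIN SET"].strip()
        item_gtin = row["GTIN ITEM"].strip()
        # COUNT is read as an integer; SET NAME is optional and ignored here.
        rules.setdefault(set_gtin, {})[item_gtin] = int(row["COUNT"])
    return rules


sample = (
    "GTIN SET;GTIN ITEM;COUNT;SET NAME\n"
    "0463;0461;2;Gift set\n"
    "0463;0462;1;Gift set\n"
)
print(load_set_dict(sample))
```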

Validation Features

When --set-dict is given, the tool validates the composition of each SET CIS against the dictionary:

  • Missing GTINs: Checks if required GTIN items are missing from sets
  • Wrong counts: Verifies that GTIN item counts match expectations
  • Extra GTINs: Identifies unexpected GTIN items in sets
  • Validation modes:
    • --validate-only: Only validate without generating XML
    • --dry-run: Show validation results and processing preview
    • Normal mode: Validate and generate XML (errors prevent generation)

Validation Output

  • OK: Set composition is valid
  • ⚠️ WARNING: Set has unexpected items but required items are present
  • ERROR: Set is missing required items or has wrong counts
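
The three outcomes above can be illustrated with a small check for a single set. This is a sketch under stated assumptions: check_set is a hypothetical helper, expected is the per-set mapping of GTIN to required count, and the message wording is invented for the example.

```python
from collections import Counter


def check_set(expected, actual_gtins):
    """Compare the GTINs found in one set with the expected counts."""
    counts = Counter(actual_gtins)
    errors, warnings = [], []
    for gtin, want in expected.items():
        got = counts.get(gtin, 0)
        if got == 0:
            errors.append(f"missing GTIN {gtin}")
        elif got != want:
            errors.append(f"GTIN {gtin}: expected {want}, found {got}")
    for gtin in counts:
        if gtin not in expected:
            warnings.append(f"unexpected GTIN {gtin}")
    # Errors take priority over warnings when choosing the status.
    status = "ERROR" if errors else "WARNING" if warnings else "OK"
    return status, errors, warnings


expected = {"0461": 2, "0462": 1}
print(check_set(expected, ["0461", "0461", "0462"]))
print(check_set(expected, ["0461", "0461", "0462", "0999"]))
print(check_set(expected, ["0461", "0462"]))
```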

XML Template

The tool takes an XML template file, removes any <pack_content> sections already present in it, and inserts the generated sections before the closing </Document> tag.
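
One way to picture this step is plain text substitution. The sketch below is an assumption about the approach, not the tool's implementation: replace_pack_content is a hypothetical name, and it expects a literal </Document> closing tag in the template.

```python
import re

# Matches a whole <pack_content> section together with its trailing newline.
PACK_RE = re.compile(r"[ \t]*<pack_content>.*?</pack_content>[ \t]*\n?", re.DOTALL)


def replace_pack_content(template, new_sections):
    """Drop existing <pack_content> sections and insert new ones before </Document>."""
    stripped = PACK_RE.sub("", template)
    idx = stripped.rfind("</Document>")
    if idx == -1:
        raise ValueError("template has no </Document> closing tag")
    return stripped[:idx] + new_sections + stripped[idx:]


template = (
    "<Document>\n"
    "  <pack_content>\n"
    "    <cis>old</cis>\n"
    "  </pack_content>\n"
    "</Document>\n"
)
print(replace_pack_content(template, "<pack_content><cis>new</cis></pack_content>\n"))
```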

Template Parameters

The tool can replace the values of these attributes in your template:

  • document_id="..." - Replaced with the --document-id value (auto-generated if not provided)
  • document_number="..." - Replaced with the --document-number value
  • operation_date_time="..." - Replaced with the --operation-time value (auto-generated if not provided)
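
A minimal sketch of such a replacement is shown below. Several details are assumptions rather than documented behavior: the helper name fill_parameters, the use of a random UUID as the generated document ID, the use of the current local time as the generated operation time, and leaving document_number untouched when no value is given.

```python
import re
import uuid
from datetime import datetime, timezone
from xml.sax.saxutils import escape


def fill_parameters(template, document_id=None, document_number=None,
                    operation_time=None):
    """Replace document attributes in the template text (illustrative helper)."""
    if document_id is None:
        # Assumption: a random UUID; the tool's real ID format is not documented.
        document_id = str(uuid.uuid4())
    if operation_time is None:
        # Current local time with UTC offset, e.g. 2024-01-15T10:30:00+03:00.
        operation_time = datetime.now(timezone.utc).astimezone().isoformat(
            timespec="seconds")
    values = {"document_id": document_id,
              "operation_date_time": operation_time}
    if document_number is not None:
        values["document_number"] = document_number
    for name, value in values.items():
        new_attr = '%s="%s"' % (name, escape(value, {'"': "&quot;"}))
        pattern = re.compile(r'\b%s="[^"]*"' % re.escape(name))
        # A function replacement avoids backslash processing in the new value.
        template = pattern.sub(lambda match: new_attr, template)
    return template


# Hypothetical template fragment; the real element carrying these attributes may differ.
fragment = '<Document document_id="x" document_number="y" operation_date_time="z">'
print(fill_parameters(fragment, document_number="MY_DOC_001"))
```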

Example Output

<pack_content>
    <pack_code><![CDATA[0104639970975627215!rYq<zP+sPBY]]></pack_code>
    <cis><![CDATA[0104639970975245215!lv).%ApB'P>]]></cis>
    <cis><![CDATA[0104639970975306215"ZqzWoJbOFB4]]></cis>
    <cis><![CDATA[0104639970975276215%fTQ*tVRESUU]]></cis>
</pack_content>
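
Output of this shape could be produced roughly as follows. The helper names cdata and build_pack_content are hypothetical, and splitting a literal ]]> across two CDATA sections is a standard XML technique shown here as an assumption about how such codes can be kept intact, not as the tool's confirmed behavior.

```python
import xml.etree.ElementTree as ET


def cdata(text):
    """Wrap text in a CDATA section, splitting any "]]>" so it cannot end early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def build_pack_content(pack_code, cis_codes, indent="    "):
    """Render one <pack_content> section for a pack code and its CIS codes."""
    lines = ["<pack_content>",
             f"{indent}<pack_code>{cdata(pack_code)}</pack_code>"]
    lines += [f"{indent}<cis>{cdata(code)}</cis>" for code in cis_codes]
    lines.append("</pack_content>")
    return "\n".join(lines)


section = build_pack_content("0104639970975627215!rYq<zP+sPBY",
                             ["0104639970975245215!lv).%ApB'P>"])
print(section)

# Round trip: a parser recovers the original codes, including "<" and ">".
parsed = ET.fromstring(section)
print(parsed.find("pack_code").text)
```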

Requirements

  • Python 3.6+
  • click>=8.0.0