Skip to content

PascalVOC Object Detection Format

Overview

PascalVOC (Visual Object Classes) is a widely used format for object detection tasks, introduced in the seminal paper "The PASCAL Visual Object Classes (VOC) Challenge" by Everingham et al. It stores annotations in XML files, with one XML file per image containing bounding box coordinates and class labels. The complete format specification is available in the PascalVOC development kit.

Specification

Each XML annotation file contains: - Image metadata (filename, size, etc.) - List of objects, each with: - Class name (string, allows spaces, e.g., "traffic light" or "stop sign") - Bounding box coordinates as integer pixel values: - xmin: left-most pixel coordinate - ymin: top-most pixel coordinate - xmax: right-most pixel coordinate - ymax: bottom-most pixel coordinate - Optional attributes (difficult, truncated, occluded)

Directory Structure

dataset/
├── images/
│   ├── image1.jpg
│   └── image2.jpg
└── annotations/
    ├── image1.xml
    └── image2.xml

Example Annotation

<annotation>
    <folder>images</folder>
    <filename>image1.jpg</filename>
    <size>
        <width>640</width>
        <height>480</height>
        <depth>3</depth>
    </size>
    <object>
        <name>cat</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>100</xmin>
            <ymin>200</ymin>
            <xmax>300</xmax>
            <ymax>400</ymax>
        </bndbox>
    </object>
</annotation>

Format Details

  • Coordinates are in absolute pixel values (not normalized)
  • Bounding boxes use XYXY format (xmin, ymin, xmax, ymax)
  • Each object can have optional attributes:
  • difficult: Indicates hard to recognize objects
  • truncated: Indicates objects partially outside the image
  • occluded: Indicates partially obscured objects

Converting with Labelformat

COCO to PascalVOC

labelformat convert \
    --task object-detection \
    --input-format coco \
    --input-file coco-labels/annotations.json \
    --output-format pascalvoc \
    --output-folder pascalvoc-labels

PascalVOC to COCO

labelformat convert \
    --task object-detection \
    --input-format pascalvoc \
    --input-folder pascalvoc-labels \
    --category-names cat,dog,fish \
    --output-format coco \
    --output-file coco-labels/annotations.json

Required Arguments

  • For input:
  • --input-folder: Directory containing PascalVOC XML files
  • --category-names: Comma-separated list of category names (e.g., 'dog,cat')
  • For output:
  • --output-folder: Directory to save generated XML files

References