videoflow.processors.vision package

The videoflow.processors.vision package contains processors used in the computer vision domain, such as detectors, classifiers, trackers, pose estimators, etc.

Submodules

videoflow.processors.vision.annotators module

class videoflow.processors.vision.annotators.BoundingBoxAnnotator(class_labels_path=None, class_labels_dataset='coco', box_color=(255, 225, 0), box_thickness=2, text_color=(255, 255, 0), nb_tasks=1)

Bases: videoflow.processors.vision.annotators.ImageAnnotator

Draws bounding boxes on images.

  • Arguments:
    • class_labels_path: path to the pbtxt file that defines the label indexes

    • class_labels_dataset: If class_labels_path is None, this attribute determines which label file to download from the releases folder. Currently supported datasets are: coco, oidv4, pascal and kitti.

    • box_color: color to use to draw the boxes

    • box_thickness: thickness of boxes to draw

    • text_color: color of text to draw

supported_datasets = ['coco', 'oidv4', 'pascal', 'kitti']
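
The drawing step can be sketched with plain NumPy. The `draw_box` helper below is hypothetical (not part of videoflow's API); it mimics what a BoundingBoxAnnotator does for a single box, returning a copy of the image as the ImageAnnotator interface requires.

```python
import numpy as np

def draw_box(im, ymin, xmin, ymax, xmax, color=(255, 225, 0), thickness=2):
    """Draw a rectangle outline on a copy of ``im``.
    Hypothetical helper for illustration, not videoflow's implementation."""
    out = im.copy()
    out[ymin:ymin + thickness, xmin:xmax] = color   # top edge
    out[ymax - thickness:ymax, xmin:xmax] = color   # bottom edge
    out[ymin:ymax, xmin:xmin + thickness] = color   # left edge
    out[ymin:ymax, xmax - thickness:xmax] = color   # right edge
    return out

im = np.zeros((100, 100, 3), dtype=np.uint8)
annotated = draw_box(im, 10, 20, 60, 80)
```

Note that the original image is left untouched; annotators return an annotated copy.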
class videoflow.processors.vision.annotators.ImageAnnotator(nb_tasks=1)

Bases: videoflow.core.node.ProcessorNode

Interface for all image annotators. All image annotators receive as input an image and annotation metadata, and return as output a copy of the image with the drawings representing the metadata.

process(im: numpy.core.multiarray.array, annotations: any) → numpy.core.multiarray.array

Returns a copy of im visually annotated with the annotations defined in annotations

class videoflow.processors.vision.annotators.SegmenterAnnotator(class_labels_path=None, class_labels_dataset='coco', transparency=0.5, nb_tasks=1)

Bases: videoflow.processors.vision.annotators.ImageAnnotator

Draws segmentation masks on images.

  • Arguments:
    • class_labels_path: path to the pbtxt file that defines the label indexes

    • class_labels_dataset: If class_labels_path is None, this attribute determines which label file to download from the releases folder. Currently supported datasets are: coco, oidv4, pascal and kitti.

    • transparency: A value between 0 and 1 for the transparency of the mask

colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0), (255, 0, 255), (0, 255, 255), (255, 127.5, 0), (255, 0, 127.5), (127.5, 255, 0), (0, 255, 127.5), (127.5, 0, 255), (0, 127.5, 255)]
supported_datasets = ['coco', 'oidv4', 'pascal', 'kitti']
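
The transparency parameter suggests a simple alpha blend. The sketch below is an assumption about the semantics, using a hypothetical `overlay_mask` helper: masked pixels are blended toward the overlay color at the given transparency.

```python
import numpy as np

def overlay_mask(im, mask, color=(255, 0, 0), transparency=0.5):
    """Blend a color onto the masked pixels of an image.
    Illustrative sketch, not videoflow's actual implementation."""
    out = im.astype(np.float32).copy()
    color = np.array(color, dtype=np.float32)
    # Alpha blend only where the boolean mask is True
    out[mask] = (1.0 - transparency) * out[mask] + transparency * color
    return out.astype(np.uint8)

im = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
result = overlay_mask(im, mask)
```
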
class videoflow.processors.vision.annotators.TrackerAnnotator(box_color=(255, 225, 0), box_thickness=2, text_color=(255, 255, 255), nb_tasks=1)

Bases: videoflow.processors.vision.annotators.ImageAnnotator

Draws bounding boxes with their track ids on images.

videoflow.processors.vision.counters module

videoflow.processors.vision.detectors module

Collection of object detection processors

class videoflow.processors.vision.detectors.ObjectDetector(nb_tasks: int = 1, device_type='cpu', **kwargs)

Bases: videoflow.core.node.ProcessorNode

Abstract class that defines the interface of object detectors

process(im: numpy.core.multiarray.array) → numpy.core.multiarray.array
  • Arguments:
    • im (np.array): (h, w, 3)

  • Returns:
    • dets: np.array of shape (nb_boxes, 6) Specifically (nb_boxes, [ymin, xmin, ymax, xmax, class_index, score])

      The box coordinates are returned unnormalized (values are NOT between 0 and 1; they use the original pixel dimensions of the image)
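
The documented (nb_boxes, 6) layout makes downstream filtering a one-liner. With toy data (not real detector output), keeping only confident detections looks like:

```python
import numpy as np

# A toy detections array in the documented (nb_boxes, 6) layout:
# [ymin, xmin, ymax, xmax, class_index, score], unnormalized coordinates
dets = np.array([
    [ 10.0,  20.0,  50.0,  80.0, 0.0, 0.92],
    [100.0, 110.0, 180.0, 200.0, 2.0, 0.35],
    [ 30.0,  40.0,  90.0, 120.0, 1.0, 0.71],
])

# Keep only detections whose score (column 5) exceeds a threshold
confident = dets[dets[:, 5] > 0.5]
```
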

videoflow.processors.vision.pose module

videoflow.processors.vision.segmentation module

class videoflow.processors.vision.segmentation.Segmenter(nb_tasks: int = 1, device_type='cpu', **kwargs)

Bases: videoflow.core.node.ProcessorNode

Abstract class that defines the interface for image segmentation

process(im: numpy.core.multiarray.array) → numpy.core.multiarray.array
  • Arguments:
    • im (np.array): (h, w, 3)

  • Returns:
    • masks: np.array of shape (nb_masks, h, w)

    • classes: np.array of shape (nb_masks,)

    • scores: np.array of shape (nb_masks,)
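
The three outputs share the same leading dimension: one entry per mask. With toy data, the caller can combine them into a single (h, w) label map, for instance by painting in ascending score order so higher-scoring instances win overlapping pixels (this combination step is the caller's choice, not part of Segmenter):

```python
import numpy as np

# Toy Segmenter outputs with the documented shapes (h=4, w=5, 3 instances)
h, w = 4, 5
masks = np.zeros((3, h, w), dtype=np.uint8)
masks[0, 0:2, :] = 1          # instance 0 covers rows 0-1
masks[1, 1:3, :] = 1          # instance 1 covers rows 1-2 (overlaps row 1)
classes = np.array([1, 2, 3])
scores = np.array([0.9, 0.8, 0.6])

# Paint lowest-scoring instances first, so higher scores overwrite overlaps
label_map = np.zeros((h, w), dtype=np.int64)
for i in np.argsort(scores):
    label_map[masks[i] > 0] = classes[i]
```
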

videoflow.processors.vision.trackers module

class videoflow.processors.vision.trackers.BoundingBoxTracker(*args, **kwargs)

Bases: videoflow.core.node.OneTaskProcessorNode

Tracks bounding boxes from one frame to another. It keeps an internal state representation that allows it to track across frames.

process(dets: numpy.core.multiarray.array) → numpy.core.multiarray.array
  • Arguments:
    • dets: np.array of shape (nb_boxes, 6) Specifically (nb_boxes, [ymin, xmin, ymax, xmax, class_index, score])

  • Returns:
    • tracks: np.array of shape (nb_boxes, 5) Specifically (nb_boxes, [ymin, xmin, ymax, xmax, track_id])
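
The shape change from (nb_boxes, 6) detections to (nb_boxes, 5) tracks can be illustrated with a toy function. `toy_track` below simply assigns fresh ids; it is NOT how BoundingBoxTracker works, which matches boxes across frames using its internal state.

```python
import numpy as np

def toy_track(dets, next_id=0):
    """Illustrative only: drop class_index and score, append a track id,
    producing the documented (nb_boxes, 5) tracks layout."""
    ids = np.arange(next_id, next_id + len(dets), dtype=dets.dtype)
    return np.column_stack([dets[:, :4], ids])

dets = np.array([
    [10.0, 20.0, 50.0,  80.0, 0.0, 0.92],
    [30.0, 40.0, 90.0, 120.0, 1.0, 0.71],
])
tracks = toy_track(dets)
```
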

videoflow.processors.vision.transformers module

class videoflow.processors.vision.transformers.CropImageTransformer(crop_dimensions: Optional[numpy.core.multiarray.array] = None)

Bases: videoflow.core.node.ProcessorNode

  • Arguments:
    • crop_dimensions: np.array of shape (nb_boxes, 4), where each row is [ymin, xmin, ymax, xmax], or None

  • Raises:
    • ValueError:
      • If any entry of crop_dimensions is less than 0

      • If ymin > ymax or xmin > xmax

process(im: numpy.core.multiarray.array, crop_dimensions: Optional[numpy.core.multiarray.array]) → List[numpy.core.multiarray.array]

Crops the image according to the coordinates in crop_dimensions. If any coordinate is out of bounds, a ValueError is raised

  • Arguments:
    • im (np.array): shape of (h, w, 3)

    • crop_dimensions: np.array of shape (nb_boxes, 4), where each row is [ymin, xmin, ymax, xmax], or None

  • Raises:
    • ValueError:
      • If any entry of crop_dimensions is less than 0

      • If any entry of crop_dimensions is out of bounds

      • If ymin > ymax or xmin > xmax

  • Returns:
    • list of np.arrays: a list with one cropped image per row of crop_dimensions
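
A minimal sketch of the documented semantics, using a hypothetical `crop` helper that performs the same validation and returns one crop per row:

```python
import numpy as np

def crop(im, crop_dimensions):
    """Validate each [ymin, xmin, ymax, xmax] row, then slice out one crop
    per row. Sketch of the documented behavior, not videoflow's code."""
    h, w = im.shape[:2]
    crops = []
    for ymin, xmin, ymax, xmax in crop_dimensions:
        if min(ymin, xmin, ymax, xmax) < 0:
            raise ValueError("crop_dimensions entries must be non-negative")
        if ymin > ymax or xmin > xmax:
            raise ValueError("ymin > ymax or xmin > xmax")
        if ymax > h or xmax > w:
            raise ValueError("crop_dimensions out of bounds")
        crops.append(im[ymin:ymax, xmin:xmax])
    return crops

im = np.arange(10 * 10 * 3).reshape(10, 10, 3)
crops = crop(im, np.array([[0, 0, 5, 5], [2, 3, 8, 9]]))
```
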

class videoflow.processors.vision.transformers.MaskImageTransformer

Bases: videoflow.core.node.ProcessorNode

process(im: numpy.core.multiarray.array, mask: numpy.core.multiarray.array) → numpy.core.multiarray.array

Masks an image according to the given mask

  • Arguments:
    • im (np.array): shape of (h, w, 3)

    • mask (np.array): (h, w) of type np.float32, with values between zero and one

  • Raises:
    • ValueError:
      • If mask does not have the same height and width as im
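
The masking operation can be sketched as a per-channel multiply by the float mask. This is an assumed semantics for illustration; the actual implementation may differ.

```python
import numpy as np

im = np.full((2, 2, 3), 100, dtype=np.uint8)
mask = np.array([[1.0, 0.5],
                 [0.0, 1.0]], dtype=np.float32)

# The documented precondition: mask must match the image's height and width
if mask.shape != im.shape[:2]:
    raise ValueError("mask must have the same height and width as im")

# Broadcast the (h, w) mask across the 3 color channels
masked = (im.astype(np.float32) * mask[..., None]).astype(np.uint8)
```
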

class videoflow.processors.vision.transformers.ResizeImageTransformer(maintain_ratio=False)

Bases: videoflow.core.node.ProcessorNode

process(im: numpy.core.multiarray.array, new_size) → numpy.core.multiarray.array

Resizes the image to the dimensions given in new_size

  • Arguments:
    • im (np.array): shape of (h, w, 3)

    • new_size (tuple): (new_height, new_width)

  • Raises:
    • ValueError:
      • If new_height or new_width is negative
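
A nearest-neighbour sketch of this interface, including a guess at what maintain_ratio does (scale both dimensions by the smaller factor so the aspect ratio is preserved). videoflow's actual implementation likely delegates to an image library and may behave differently.

```python
import numpy as np

def resize_nearest(im, new_size, maintain_ratio=False):
    """Nearest-neighbour resize sketch; `maintain_ratio` semantics are an
    assumption, not taken from videoflow's source."""
    new_h, new_w = new_size
    if new_h < 0 or new_w < 0:
        raise ValueError("new_height and new_width must be non-negative")
    h, w = im.shape[:2]
    if maintain_ratio:
        scale = min(new_h / h, new_w / w)
        new_h, new_w = int(h * scale), int(w * scale)
    # Map each output pixel to its nearest source pixel
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return im[rows][:, cols]

im = np.zeros((100, 50, 3), dtype=np.uint8)
resized = resize_nearest(im, (20, 20), maintain_ratio=True)
```

With maintain_ratio=True, a (100, 50) image requested at (20, 20) is scaled by the smaller factor (0.2), yielding (20, 10) and keeping the 2:1 aspect ratio.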