Visualizer¶
The DepthAI SDK visualizer is a tool for visualizing the output of a DepthAI pipeline.
It can be used to visualize the output of the camera, neural network, depth and disparity maps, the rectified streams, the spatial locations of detected objects, and more.
Getting Started¶
A Visualizer is created by calling OakCamera.visualize(), which returns a Visualizer instance.
Once it is created, the visualizer configs can be modified using the output(), stereo(), text(), detections(), and tracking() methods.
Example of how a Visualizer can be created:
from depthai_sdk import OakCamera

with OakCamera() as oak:
    cam = oak.create_camera('color')
    visualizer = oak.visualize(cam.out.main)
    oak.start(blocking=True)
Visualizer is primarily used alongside Packets from the depthai_sdk.oak_outputs module.
Configs¶
Visualizer is configurable via VisConfig, which consists of five auxiliary configs: OutputConfig, StereoConfig, TextConfig, DetectionConfig, and TrackingConfig.
Each config type has its own set of parameters, which affect how the corresponding object is visualized.
The default configuration can be modified with the following methods: output(), stereo(), text(), detections(), and tracking().
The arguments should be passed as keyword arguments with the same signature as the corresponding config, e.g., Visualizer.text(font_size=2, font_color=(255, 123, 200)).
The modified configuration is applied to every created object. The methods support a fluent interface and can be chained, e.g., Visualizer.text(font_size=2).detections(color=(255, 0, 0)).
Example of how to configure the visualizer:
visualizer = oak.visualize(camera.out.main)
visualizer.detections(
    bbox_style=BboxStyle.RECTANGLE,
    label_position=TextPosition.MID,
).text(
    auto_scale=True
)
Objects¶
Visualizer operates with objects, which form a hierarchical structure.
The root object is self, and its children are the list of created objects.
add_child should be used to add an object to the children list.
The parent object shares its config and frame shape with all children.
All objects must be derived from GenericObject.
Implemented objects: VisDetections, VisText, VisLine, VisCircle, and VisTrail.
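The parent/child idea above can be sketched in plain Python. This is a standalone illustration of the hierarchy, not the SDK's actual GenericObject implementation; the class and attribute names are placeholders:

```python
class Node:
    """Standalone sketch of the object hierarchy: a parent shares its
    config and frame shape with every child added via add_child."""

    def __init__(self, config=None, frame_shape=None):
        self.config = config
        self.frame_shape = frame_shape
        self.children = []

    def add_child(self, child):
        # The child inherits the parent's config and frame shape.
        child.config = self.config
        child.frame_shape = self.frame_shape
        self.children.append(child)
        return self

# The root holds the shared state; children pick it up when attached.
root = Node(config={'font_scale': 1.0}, frame_shape=(720, 1280))
text = Node()
root.add_child(text)
```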
Objects can be added to the visualizer using the following methods: add_text(), add_detections(), add_trail(), add_circle(), and add_line().
Create your own object¶
If the provided functionality is not enough, you can create your own object. To do so, create a class derived from GenericObject and implement the prepare, serialize, and draw methods.
The draw method should draw the object on the passed frame argument.
class YourOwnObject(GenericObject):
    def __init__(self, ...):
        ...

    def prepare(self) -> None:
        ...

    def serialize(self) -> str:
        ...

    def draw(self, frame) -> None:
        ...
with OakCamera() as oak:
    cam = oak.create_camera(...)
    visualizer = oak.visualize(cam.out.main)
    visualizer.add_object(YourOwnObject(...))
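For illustration, here is a self-contained sketch of such an object: a hypothetical crosshair marker. It writes pixels into a NumPy frame directly instead of using the SDK's drawing helpers, and in real code it would derive from GenericObject:

```python
import json

import numpy as np


class CrosshairObject:
    """Hypothetical object drawing a small crosshair at a fixed point.

    Implements the prepare/serialize/draw interface described above;
    a real implementation would subclass GenericObject instead.
    """

    def __init__(self, center, color=(0, 255, 0), size=5):
        self.center = center  # (x, y) in pixels
        self.color = color
        self.size = size

    def prepare(self) -> None:
        pass  # nothing to precompute for this simple object

    def serialize(self) -> str:
        return json.dumps({'type': 'crosshair', 'center': list(self.center)})

    def draw(self, frame) -> None:
        x, y = self.center
        # Paint the horizontal and vertical arms directly into the frame.
        frame[y, max(x - self.size, 0):x + self.size + 1] = self.color
        frame[max(y - self.size, 0):y + self.size + 1, x] = self.color


frame = np.zeros((100, 100, 3), dtype=np.uint8)
obj = CrosshairObject(center=(50, 50))
obj.prepare()
obj.draw(frame)
```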
Example usage¶
The following script visualizes the output of a face detection model:
from depthai_sdk import OakCamera
from depthai_sdk.visualize.configs import BboxStyle, TextPosition

with OakCamera() as oak:
    camera = oak.create_camera('color')
    det = oak.create_nn('face-detection-retail-0004', camera)

    visualizer = oak.visualize(det.out.main, fps=True)
    visualizer.detections(
        color=(0, 255, 0),
        thickness=2,
        bbox_style=BboxStyle.RECTANGLE,
        label_position=TextPosition.MID,
    ).text(
        font_color=(255, 255, 0),
        auto_scale=True
    ).tracking(
        line_thickness=5
    )

    oak.start(blocking=True)
Serialization¶
The visualizer provides a way to serialize the output objects to JSON, which can be used for further processing.
JSON schemas¶
General config¶
{
    "frame_shape": {
        "type": "array",
        "items": {
            "type": "integer"
        },
        "description": "Frame shape in (height, width) format."
    },
    "config": {
        "type": "object",
        "output": {
            "img_scale": {
                "type": "number",
                "minimum": 0.0,
                "maximum": 1.0,
                "default": 1.0,
                "description": "Scale of output image."
            },
            "show_fps": {
                "type": "boolean",
                "default": false,
                "description": "Show FPS on output image."
            },
            "clickable": {
                "type": "boolean",
                "default": false,
                "description": "Show disparity or depth value on mouse hover."
            }
        },
        "stereo": {
            "type": "object",
            "colorize": {
                "type": "integer",
                "default": 2,
                "description": "0 - gray, 1 - color, 2 - blended color and depth."
            },
            "colormap": {
                "type": "integer",
                "default": 2,
                "description": "cv2 colormap."
            },
            "wls_filter": {
                "type": "boolean",
                "default": false
            },
            "wls_lambda": {
                "type": "number",
                "default": 8000.0
            },
            "wls_sigma": {
                "type": "number",
                "default": 1.5
            }
        },
        "detection": {
            "type": "object",
            "thickness": {
                "type": "integer",
                "default": 1
            },
            "fill_transparency": {
                "type": "number",
                "default": 0.15,
                "minimum": 0.0,
                "maximum": 1.0,
                "description": "Transparency of bbox fill."
            },
            "box_roundness": {
                "type": "integer",
                "default": 0,
                "description": "Roundness of bbox corners, used only when bbox_style is set to BboxStyle.ROUNDED_*."
            },
            "color": {
                "type": "array",
                "items": {
                    "type": "integer"
                },
                "default": [0, 255, 0],
                "description": "Default bbox color in RGB format."
            },
            "bbox_style": {
                "type": "integer",
                "default": 0,
                "description": "``depthai_sdk.visualize.configs.BboxStyle`` enum value."
            },
            "line_width": {
                "type": "number",
                "default": 0.5,
                "minimum": 0.0,
                "maximum": 1.0,
                "description": "Horizontal line width of bbox."
            },
            "line_height": {
                "type": "number",
                "default": 0.5,
                "minimum": 0.0,
                "maximum": 1.0,
                "description": "Vertical line height of bbox."
            },
            "hide_label": {
                "type": "boolean",
                "default": false,
                "description": "Hide class label on output image."
            },
            "label_position": {
                "type": "integer",
                "default": 0,
                "description": "``depthai_sdk.visualize.configs.TextPosition`` enum value."
            },
            "label_padding": {
                "type": "integer",
                "default": 10,
                "description": "Padding between label and bbox."
            }
        },
        "text": {
            "font_face": {
                "type": "integer",
                "default": 0,
                "description": "cv2 font face."
            },
            "font_color": {
                "type": "array",
                "items": {
                    "type": "integer"
                },
                "default": [255, 255, 255],
                "description": "Font color in RGB format."
            },
            "font_transparency": {
                "type": "number",
                "default": 0.5,
                "minimum": 0.0,
                "maximum": 1.0
            },
            "font_scale": {
                "type": "number",
                "default": 1.0
            },
            "font_thickness": {
                "type": "integer",
                "default": 2
            },
            "font_position": {
                "type": "integer",
                "default": 0,
                "description": "``depthai_sdk.visualize.configs.TextPosition`` enum value."
            },
            "bg_transparency": {
                "type": "number",
                "default": 0.5,
                "minimum": 0.0,
                "maximum": 1.0,
                "description": "Text outline transparency."
            },
            "bg_color": {
                "type": "array",
                "items": {
                    "type": "integer"
                },
                "default": [0, 0, 0],
                "description": "Text outline color in RGB format."
            },
            "line_type": {
                "type": "integer",
                "default": 16,
                "description": "cv2 line type."
            },
            "auto_scale": {
                "type": "boolean",
                "default": true,
                "description": "Automatically scale font size based on bbox size."
            }
        },
        "tracking": {
            "max_length": {
                "type": "integer",
                "default": -1,
                "description": "Maximum length of tracking line, -1 for infinite."
            },
            "deletion_lost_threshold": {
                "type": "integer",
                "default": 5,
                "description": "Number of frames after which a lost track is deleted."
            },
            "line_thickness": {
                "type": "integer",
                "default": 1
            },
            "fading_tails": {
                "type": "boolean",
                "default": false,
                "description": "Enable fading tails - reduces line thickness over time."
            },
            "line_color": {
                "type": "array",
                "items": {
                    "type": "integer"
                },
                "default": [255, 255, 255],
                "description": "Tracking line color in RGB format."
            },
            "line_type": {
                "type": "integer",
                "default": 16,
                "description": "cv2 line type."
            }
        },
        "circle": {
            "thickness": {
                "type": "integer",
                "default": 1
            },
            "color": {
                "type": "array",
                "items": {
                    "type": "integer"
                },
                "default": [0, 255, 0],
                "description": "Circle color in RGB format."
            },
            "line_type": {
                "type": "integer",
                "default": 16,
                "description": "cv2 line type."
            }
        }
    },
    "objects": {
        "type": "array",
        "items": {
            "type": "object"
        },
        "description": "Array of objects (e.g. detection, text, line).",
        "default": []
    }
}
Objects¶
Detection:
{
    "type": "detections",
    "detections": {
        "type": "array",
        "items": {
            "type": "object",
            "bbox": {
                "type": "array",
                "items": {
                    "type": "number"
                },
                "description": "bbox absolute coordinates in format [x1, y1, x2, y2]"
            },
            "label": {
                "type": "string",
                "description": "class label"
            },
            "color": {
                "type": "array",
                "items": {
                    "type": "integer"
                },
                "description": "bbox color in RGB format"
            }
        }
    },
    "children": {
        "type": "array",
        "items": {
            "type": "object"
        },
        "description": "array of child objects (e.g. detection, text, line)",
        "default": []
    }
}
Text:
{
    "type": "text",
    "text": {
        "type": "plain_text"
    },
    "coords": {
        "type": "array",
        "items": {
            "type": "number"
        },
        "description": "The absolute coordinates of the text in the format (x1, y1)."
    }
}
Line:
{
    "type": "line",
    "pt1": {
        "type": "array",
        "items": {
            "type": "number"
        },
        "description": "Absolute (x, y) coordinates of the first point."
    },
    "pt2": {
        "type": "array",
        "items": {
            "type": "number"
        },
        "description": "Absolute (x, y) coordinates of the second point."
    },
    "children": {
        "type": "array",
        "items": {
            "type": "object"
        },
        "description": "array of child objects (e.g. detection, text, line).",
        "default": []
    }
}
Example JSON output¶
{
    "frame_shape": [720, 1280],
    "config": {
        "output": {
            "img_scale": 1.0,
            "show_fps": false,
            "clickable": true
        },
        "stereo": {
            "colorize": 2,
            "colormap": 2,
            "wls_filter": false,
            "wls_lambda": 8000,
            "wls_sigma": 1.5
        },
        "detection": {
            "thickness": 1,
            "fill_transparency": 0.15,
            "box_roundness": 0,
            "color": [0, 255, 0],
            "bbox_style": 0,
            "line_width": 0.5,
            "line_height": 0.5,
            "hide_label": false,
            "label_position": 0,
            "label_padding": 10
        },
        "text": {
            "font_face": 0,
            "font_color": [255, 255, 255],
            "font_transparency": 0.5,
            "font_scale": 1.0,
            "font_thickness": 2,
            "font_position": 0,
            "bg_transparency": 0.5,
            "bg_color": [0, 0, 0],
            "line_type": 16,
            "auto_scale": true
        },
        "tracking": {
            "max_length": -1,
            "deletion_lost_threshold": 5,
            "line_thickness": 1,
            "fading_tails": false,
            "line_color": [255, 255, 255],
            "line_type": 16
        },
        "circle": {
            "thickness": 1,
            "color": [255, 255, 255],
            "line_type": 16
        }
    },
    "objects": [
        {
            "type": "detections",
            "detections": [
                {
                    "bbox": [101, 437, 661, 712],
                    "label": "laptop",
                    "color": [210, 167, 218]
                }
            ],
            "children": [
                {
                    "type": "text",
                    "text": "Laptop",
                    "coords": [111, 469]
                }
            ]
        }
    ]
}
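Since the serialized output is plain JSON, it can be post-processed with standard tools. The sketch below, using only the standard library, pulls the detections out of an output shaped like the example above and normalizes their pixel coordinates to the [0, 1] range:

```python
import json

# A trimmed-down serialized output, shaped like the example above.
serialized = '''
{
    "frame_shape": [720, 1280],
    "objects": [
        {
            "type": "detections",
            "detections": [
                {"bbox": [101, 437, 661, 712], "label": "laptop", "color": [210, 167, 218]}
            ],
            "children": [
                {"type": "text", "text": "Laptop", "coords": [111, 469]}
            ]
        }
    ]
}
'''

data = json.loads(serialized)
height, width = data['frame_shape']  # frame shape is (height, width)

detections = []
for obj in data['objects']:
    if obj['type'] != 'detections':
        continue
    for det in obj['detections']:
        x1, y1, x2, y2 = det['bbox']
        # Normalize absolute pixel coordinates to the [0, 1] range.
        detections.append((det['label'], (x1 / width, y1 / height, x2 / width, y2 / height)))
```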