ColorCamera¶

ColorCamera node is a source of image frames. You can control in at runtime with the InputControl and InputConfig.

How to place it¶

pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)

dai::Pipeline pipeline;
auto cam = pipeline.create<dai::node::ColorCamera>();

Inputs and Outputs¶

                       ColorCamera node
               ┌──────────────────────────────┐
               │   ┌─────────────┐            │
               │   │    Image    │ raw        │     raw
               │   │    Sensor   │---┬--------├────────►
               │   └────▲────────┘   |        │
               │        │   ┌--------┘        │
               │      ┌─┴───▼─┐               │     isp
inputControl   │      │       │-------┬-------├────────►
──────────────►│------│  ISP  │ ┌─────▼────┐  │   video
               │      │       │ |          |--├────────►
               │      └───────┘ │   Image  │  │   still
inputConfig    │                │   Post-  │--├────────►
──────────────►│----------------|Processing│  │ preview
               │                │          │--├────────►
               │                └──────────┘  │
               └──────────────────────────────┘

Message types

inputConfig - ImageManipConfig
inputControl - CameraControl
raw - ImgFrame - RAW10 bayer data. Demo code for unpacking here
isp - ImgFrame - YUV420 planar (same as YU12/IYUV/I420)
still - ImgFrame - NV12, suitable for bigger size frames. The image gets created when a capture event is sent to the ColorCamera, so it’s like taking a photo
preview - ImgFrame - RGB (or BGR planar/interleaved if configured), mostly suited for small size previews and to feed the image into NeuralNetwork
video - ImgFrame - NV12, suitable for bigger size frames

ISP (image signal processor) is used for bayer transformation, demosaicing, noise reduction, and other image enhancements. It interacts with the 3A algorithms: auto-focus, auto-exposure, and auto-white-balance, which are handling image sensor adjustments such as exposure time, sensitivity (ISO), and lens position (if the camera module has a motorized lens) at runtime. Click here for more information.

Image Post-Processing converts YUV420 planar frames from the ISP into video/ preview/ still frames.

still (when a capture is triggered) and isp work at the max camera resolution, while video and preview are limited to max 4K (3840 x 2160) resolution, which is cropped from isp. For IMX378 (12MP), the post-processing works like this:

┌─────┐   Cropping to   ┌─────────┐  Downscaling   ┌──────────┐
│ ISP ├────────────────►│  video  ├───────────────►│ preview  │
└─────┘  max 3840x2160  └─────────┘  and cropping  └──────────┘

If the resolution is set to 12MP and video mode is used, a 4K frame (3840x2160) will be cropped from the center of the 12MP frame.

Full FOV¶

Some sensors, such as the IXM378, default to a 1080P resolution, which is a crop from the full sensor resolution. You can print sensor features to see how the field of view (FOV) is affected by the selected sensor resolution:

import depthai as dai

with dai.Device() as device:
    for cam in dev.getConnectedCameraFeatures():
        print(cam)
        #continue  # uncomment for less verbosity
        for cfg in cam.configs:
            print("   ", cfg)

# Running on OAK-D-S2 will print:
# {socket: CAM_A, sensorName: IMX378, width: 4056, height: 3040, orientation: AUTO, supportedTypes: [COLOR], hasAutofocus: 1, hasAutofocusIC: 1, name: color}
#     {width: 1920, height: 1080, minFps: 2.03, maxFps: 60, type: COLOR, fov: {x:108, y: 440, width: 3840, height: 2160}}
#     {width: 3840, height: 2160, minFps: 1.42, maxFps: 42, type: COLOR, fov: {x:108, y: 440, width: 3840, height: 2160}}
#     {width: 4056, height: 3040, minFps: 1.42, maxFps: 30, type: COLOR, fov: {x:0, y: 0, width: 4056, height: 3040}}
#     {width: 1352, height: 1012, minFps: 1.25, maxFps: 52, type: COLOR, fov: {x:0, y: 0, width: 4056, height: 3036}}
#     {width: 2024, height: 1520, minFps: 2.03, maxFps: 85, type: COLOR, fov: {x:4, y: 0, width: 4048, height: 3040}}
# {socket: CAM_B, sensorName: OV9282, width: 1280, height: 800, orientation: AUTO, supportedTypes: [MONO], hasAutofocus: 0, hasAutofocusIC: 0, name: left}
#     {width: 1280, height: 720, minFps: 1.687, maxFps: 143.1, type: MONO, fov: {x:0, y: 40, width: 1280, height: 720}}
#     {width: 1280, height: 800, minFps: 1.687, maxFps: 129.6, type: MONO, fov: {x:0, y: 0, width: 1280, height: 800}}
#     {width: 640, height: 400, minFps: 1.687, maxFps: 255.7, type: MONO, fov: {x:0, y: 0, width: 1280, height: 800}}
# {socket: CAM_C, sensorName: OV9282, width: 1280, height: 800, orientation: AUTO, supportedTypes: [MONO], hasAutofocus: 0, hasAutofocusIC: 0, name: right}
#     {width: 1280, height: 720, minFps: 1.687, maxFps: 143.1, type: MONO, fov: {x:0, y: 40, width: 1280, height: 720}}
#     {width: 1280, height: 800, minFps: 1.687, maxFps: 129.6, type: MONO, fov: {x:0, y: 0, width: 1280, height: 800}}
#     {width: 640, height: 400, minFps: 1.687, maxFps: 255.7, type: MONO, fov: {x:0, y: 0, width: 1280, height: 800}}

For the IMX378 sensor, selecting a 4K or 1080P resolution results in approximately 5% horizontal and 29% vertical FOV cropping. To utilize the full sensor FOV, you should choose one of the following resolutions: THE_12_MP, THE_1352X1012, or THE_2024X1520.

Usage¶

pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(300, 300)
cam.setBoardSocket(dai.CameraBoardSocket.CAM_A)
cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
cam.setInterleaved(False)
cam.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)

dai::Pipeline pipeline;
auto cam = pipeline.create<dai::node::ColorCamera>();
cam->setPreviewSize(300, 300);
cam->setBoardSocket(dai::CameraBoardSocket::CAM_A);
cam->setResolution(dai::ColorCameraProperties::SensorResolution::THE_1080_P);
cam->setInterleaved(false);
cam->setColorOrder(dai::ColorCameraProperties::ColorOrder::RGB);

Limitations¶

Here are known camera limitations for the RVC2:

ISP can process about 600 MP/s, and about 500 MP/s when the pipeline is also running NNs and video encoder in parallel
3A algorithms can process about 200..250 FPS overall (for all camera streams). This is a current limitation of our implementation, and we have plans for a workaround to run 3A algorithms on every Xth frame, no ETA yet
ISP Scaling numerator value can be 1..16 and denominator value 1..32 for both vertical and horizontal scaling. So you can downscale eg. 12MP (4056x3040) only to resolutions calculated here

Examples of functionality¶

Reference¶

class depthai.node.ColorCamera

class Id: Node identificator. Unique for every node on a single Pipeline

getAssetManager(*args, **kwargs)

Overloaded function.

getAssetManager(self: depthai.Node) -> depthai.AssetManager
getAssetManager(self: depthai.Node) -> depthai.AssetManager

getBoardSocket(self: depthai.node.ColorCamera) → depthai.CameraBoardSocket

getCamId(self: depthai.node.ColorCamera) → int

getCamera(self: depthai.node.ColorCamera) → str

getColorOrder(self: depthai.node.ColorCamera) → depthai.ColorCameraProperties.ColorOrder

getFp16(self: depthai.node.ColorCamera) → bool

getFps(self: depthai.node.ColorCamera) → float

getFrameEventFilter(self: depthai.node.ColorCamera) → list[depthai.FrameEvent]

getImageOrientation(self: depthai.node.ColorCamera) → depthai.CameraImageOrientation

getInputRefs(*args, **kwargs)

Overloaded function.

getInputRefs(self: depthai.Node) -> list[depthai.Node.Input]
getInputRefs(self: depthai.Node) -> list[depthai.Node.Input]

getInputs(self: depthai.Node) → list[depthai.Node.Input]

getInterleaved(self: depthai.node.ColorCamera) → bool

getIspHeight(self: depthai.node.ColorCamera) → int

getIspNumFramesPool(self: depthai.node.ColorCamera) → int

getIspSize(self: depthai.node.ColorCamera) → tuple[int, int]

getIspWidth(self: depthai.node.ColorCamera) → int

getName(self: depthai.Node) → str

getOutputRefs(*args, **kwargs)

Overloaded function.

getOutputRefs(self: depthai.Node) -> list[depthai.Node.Output]
getOutputRefs(self: depthai.Node) -> list[depthai.Node.Output]

getOutputs(self: depthai.Node) → list[depthai.Node.Output]

getParentPipeline(*args, **kwargs)

Overloaded function.

getParentPipeline(self: depthai.Node) -> depthai.Pipeline
getParentPipeline(self: depthai.Node) -> depthai.Pipeline

getPreviewHeight(self: depthai.node.ColorCamera) → int

getPreviewKeepAspectRatio(self: depthai.node.ColorCamera) → bool

getPreviewNumFramesPool(self: depthai.node.ColorCamera) → int

getPreviewSize(self: depthai.node.ColorCamera) → tuple[int, int]

getPreviewWidth(self: depthai.node.ColorCamera) → int

getRawNumFramesPool(self: depthai.node.ColorCamera) → int

getResolution(self: depthai.node.ColorCamera) → depthai.ColorCameraProperties.SensorResolution

getResolutionHeight(self: depthai.node.ColorCamera) → int

getResolutionSize(self: depthai.node.ColorCamera) → tuple[int, int]

getResolutionWidth(self: depthai.node.ColorCamera) → int

getSensorCrop(self: depthai.node.ColorCamera) → tuple[float, float]

getSensorCropX(self: depthai.node.ColorCamera) → float

getSensorCropY(self: depthai.node.ColorCamera) → float

getStillHeight(self: depthai.node.ColorCamera) → int

getStillNumFramesPool(self: depthai.node.ColorCamera) → int

getStillSize(self: depthai.node.ColorCamera) → tuple[int, int]

getStillWidth(self: depthai.node.ColorCamera) → int

getVideoHeight(self: depthai.node.ColorCamera) → int

getVideoNumFramesPool(self: depthai.node.ColorCamera) → int

getVideoSize(self: depthai.node.ColorCamera) → tuple[int, int]

getVideoWidth(self: depthai.node.ColorCamera) → int

getWaitForConfigInput(self: depthai.node.ColorCamera) → bool

sensorCenterCrop(self: depthai.node.ColorCamera) → None

setBoardSocket(self: depthai.node.ColorCamera, boardSocket: depthai.CameraBoardSocket) → None

setCamId(self: depthai.node.ColorCamera, arg0: int) → None

setCamera(self: depthai.node.ColorCamera, name: str) → None

setColorOrder(self: depthai.node.ColorCamera, colorOrder: depthai.ColorCameraProperties.ColorOrder) → None

setFp16(self: depthai.node.ColorCamera, fp16: bool) → None

setFps(self: depthai.node.ColorCamera, fps: float) → None

setFrameEventFilter(self: depthai.node.ColorCamera, events: list[depthai.FrameEvent]) → None

setImageOrientation(self: depthai.node.ColorCamera, imageOrientation: depthai.CameraImageOrientation) → None

setInterleaved(self: depthai.node.ColorCamera, interleaved: bool) → None

setIsp3aFps(self: depthai.node.ColorCamera, isp3aFps: int) → None

setIspNumFramesPool(self: depthai.node.ColorCamera, arg0: int) → None

setIspScale(*args, **kwargs)

Overloaded function.

setIspScale(self: depthai.node.ColorCamera, numerator: int, denominator: int) -> None
setIspScale(self: depthai.node.ColorCamera, scale: tuple[int, int]) -> None
setIspScale(self: depthai.node.ColorCamera, horizNum: int, horizDenom: int, vertNum: int, vertDenom: int) -> None
setIspScale(self: depthai.node.ColorCamera, horizScale: tuple[int, int], vertScale: tuple[int, int]) -> None

setNumFramesPool(self: depthai.node.ColorCamera, raw: int, isp: int, preview: int, video: int, still: int) → None

setPreviewKeepAspectRatio(self: depthai.node.ColorCamera, keep: bool) → None

setPreviewNumFramesPool(self: depthai.node.ColorCamera, arg0: int) → None

setPreviewSize(*args, **kwargs)

Overloaded function.

setPreviewSize(self: depthai.node.ColorCamera, width: int, height: int) -> None
setPreviewSize(self: depthai.node.ColorCamera, size: tuple[int, int]) -> None

setRawNumFramesPool(self: depthai.node.ColorCamera, arg0: int) → None

setRawOutputPacked(self: depthai.node.ColorCamera, packed: bool) → None

setResolution(self: depthai.node.ColorCamera, resolution: depthai.ColorCameraProperties.SensorResolution) → None

setSensorCrop(self: depthai.node.ColorCamera, x: float, y: float) → None

setStillNumFramesPool(self: depthai.node.ColorCamera, arg0: int) → None

setStillSize(*args, **kwargs)

Overloaded function.

setStillSize(self: depthai.node.ColorCamera, width: int, height: int) -> None
setStillSize(self: depthai.node.ColorCamera, size: tuple[int, int]) -> None

setVideoNumFramesPool(self: depthai.node.ColorCamera, arg0: int) → None

setVideoSize(*args, **kwargs)

Overloaded function.

setVideoSize(self: depthai.node.ColorCamera, width: int, height: int) -> None
setVideoSize(self: depthai.node.ColorCamera, size: tuple[int, int]) -> None

setWaitForConfigInput(self: depthai.node.ColorCamera, wait: bool) → None

class dai::node::ColorCamera : public dai::NodeCRTP<Node, ColorCamera, ColorCameraProperties>¶

ColorCamera node. For use with color sensors.

Public Functions

ColorCamera(const std::shared_ptr<PipelineImpl> &par, int64_t nodeId)¶: Constructs ColorCamera node.

ColorCamera(const std::shared_ptr<PipelineImpl> &par, int64_t nodeId, std::unique_ptr<Properties> props)¶

int getScaledSize(int input, int num, int denom) const¶: Computes the scaled size given numerator and denominator

void setBoardSocket(CameraBoardSocket boardSocket)¶

Specify which board socket to use

Parameters

boardSocket: Board socket to use

CameraBoardSocket getBoardSocket() const¶

Retrieves which board socket to use

Return: Board socket to use

void setCamera(std::string name)¶

Specify which camera to use by name

Parameters

name: Name of the camera to use

std::string getCamera() const¶

Retrieves which camera to use by name

Return: Name of the camera to use

void setCamId(int64_t id)¶: Set which color camera to use.

int64_t getCamId() const¶: Get which color camera to use.

void setImageOrientation(CameraImageOrientation imageOrientation)¶: Set camera image orientation.

CameraImageOrientation getImageOrientation() const¶: Get camera image orientation.

void setColorOrder(ColorCameraProperties::ColorOrder colorOrder)¶: Set color order of preview output images. RGB or BGR.

ColorCameraProperties::ColorOrder getColorOrder() const¶: Get color order of preview output frames. RGB or BGR.

void setInterleaved(bool interleaved)¶: Set planar or interleaved data of preview output frames.

bool getInterleaved() const¶: Get planar or interleaved data of preview output frames.

void setFp16(bool fp16)¶: Set fp16 (0..255) data type of preview output frames.

bool getFp16() const¶: Get fp16 (0..255) data of preview output frames.

void setPreviewSize(int width, int height)¶: Set preview output size.

void setPreviewSize(std::tuple<int, int> size)¶: Set preview output size, as a tuple <width, height>

void setPreviewNumFramesPool(int num)¶: Set number of frames in preview pool.

void setVideoSize(int width, int height)¶: Set video output size.

void setVideoSize(std::tuple<int, int> size)¶: Set video output size, as a tuple <width, height>

void setVideoNumFramesPool(int num)¶: Set number of frames in preview pool.

void setStillSize(int width, int height)¶: Set still output size.

void setStillSize(std::tuple<int, int> size)¶: Set still output size, as a tuple <width, height>

void setStillNumFramesPool(int num)¶: Set number of frames in preview pool.

void setResolution(Properties::SensorResolution resolution)¶: Set sensor resolution.

Properties::SensorResolution getResolution() const¶: Get sensor resolution.

void setRawNumFramesPool(int num)¶: Set number of frames in raw pool.

void setIspNumFramesPool(int num)¶: Set number of frames in isp pool.

void setNumFramesPool(int raw, int isp, int preview, int video, int still)¶: Set number of frames in all pools.

void setIspScale(int numerator, int denominator)¶: Set ‘isp’ output scaling (numerator/denominator), preserving the aspect ratio. The fraction numerator/denominator is simplified first to a irreducible form, then a set of hardware scaler constraints applies: max numerator = 16, max denominator = 63

void setIspScale(std::tuple<int, int> scale)¶: Set ‘isp’ output scaling, as a tuple <numerator, denominator>

void setIspScale(int horizNum, int horizDenom, int vertNum, int vertDenom)¶: Set ‘isp’ output scaling, per each direction. If the horizontal scaling factor (horizNum/horizDen) is different than the vertical scaling factor (vertNum/vertDen), a distorted (stretched or squished) image is generated

void setIspScale(std::tuple<int, int> horizScale, std::tuple<int, int> vertScale)¶: Set ‘isp’ output scaling, per each direction, as <numerator, denominator> tuples.

void setFps(float fps)¶

Set rate at which camera should produce frames

Parameters

fps: Rate in frames per second

void setIsp3aFps(int isp3aFps)¶: Isp 3A rate (auto focus, auto exposure, auto white balance, camera controls etc.). Default (0) matches the camera FPS, meaning that 3A is running on each frame. Reducing the rate of 3A reduces the CPU usage on CSS, but also increases the convergence rate of 3A. Note that camera controls will be processed at this rate. E.g. if camera is running at 30 fps, and camera control is sent at every frame, but 3A fps is set to 15, the camera control messages will be processed at 15 fps rate, which will lead to queueing.

void setFrameEventFilter(const std::vector<dai::FrameEvent> &events)¶

std::vector<dai::FrameEvent> getFrameEventFilter() const¶

float getFps() const¶

Get rate at which camera should produce frames

Return: Rate in frames per second

std::tuple<int, int> getPreviewSize() const¶: Get preview size as tuple.

int getPreviewWidth() const¶: Get preview width.

int getPreviewHeight() const¶: Get preview height.

std::tuple<int, int> getVideoSize() const¶: Get video size as tuple.

int getVideoWidth() const¶: Get video width.

int getVideoHeight() const¶: Get video height.

std::tuple<int, int> getStillSize() const¶: Get still size as tuple.

int getStillWidth() const¶: Get still width.

int getStillHeight() const¶: Get still height.

std::tuple<int, int> getResolutionSize() const¶: Get sensor resolution as size.

int getResolutionWidth() const¶: Get sensor resolution width.

int getResolutionHeight() const¶: Get sensor resolution height.

std::tuple<int, int> getIspSize() const¶: Get ‘isp’ output resolution as size, after scaling.

int getIspWidth() const¶: Get ‘isp’ output width.

int getIspHeight() const¶: Get ‘isp’ output height.

void sensorCenterCrop()¶: Specify sensor center crop. Resolution size / video size

void setSensorCrop(float x, float y)¶

Specifies the cropping that happens when converting ISP to video output. By default, video will be center cropped from the ISP output. Note that this doesn’t actually do on-sensor cropping (and MIPI-stream only that region), but it does postprocessing on the ISP (on RVC).

Parameters

x: Top left X coordinate
y: Top left Y coordinate

std::tuple<float, float> getSensorCrop() const¶

Return: Sensor top left crop coordinates

float getSensorCropX() const¶: Get sensor top left x crop coordinate.

float getSensorCropY() const¶: Get sensor top left y crop coordinate.

void setWaitForConfigInput(bool wait)¶

Specify to wait until inputConfig receives a configuration message, before sending out a frame.

Parameters

wait: True to wait for inputConfig message, false otherwise

bool getWaitForConfigInput() const¶

See: setWaitForConfigInput
Return: True if wait for inputConfig message, false otherwise

void setPreviewKeepAspectRatio(bool keep)¶

Specifies whether preview output should preserve aspect ratio, after downscaling from video size or not.

Parameters

keep: If true, a larger crop region will be considered to still be able to create the final image in the specified aspect ratio. Otherwise video size is resized to fit preview size

bool getPreviewKeepAspectRatio()¶

See: setPreviewKeepAspectRatio
Return: Preview keep aspect ratio option

int getPreviewNumFramesPool()¶: Get number of frames in preview pool.

int getVideoNumFramesPool()¶: Get number of frames in video pool.

int getStillNumFramesPool()¶: Get number of frames in still pool.

int getRawNumFramesPool()¶: Get number of frames in raw pool.

int getIspNumFramesPool()¶: Get number of frames in isp pool.

void setRawOutputPacked(bool packed)¶: Configures whether the camera raw frames are saved as MIPI-packed to memory. The packed format is more efficient, consuming less memory on device, and less data to send to host: RAW10: 4 pixels saved on 5 bytes, RAW12: 2 pixels saved on 3 bytes. When packing is disabled (false), data is saved lsb-aligned, e.g. a RAW10 pixel will be stored as uint16, on bits 9..0: 0b0000’00pp’pppp’pppp. Default is auto: enabled for standard color/monochrome cameras where ISP can work with both packed/unpacked, but disabled for other cameras like ToF.

Public Members

CameraControl initialControl¶: Initial control options to apply to sensor

Input inputConfig = {*this, "inputConfig", Input::Type::SReceiver, false, 8, {{DatatypeEnum::ImageManipConfig, false}}}¶

Input for ImageManipConfig message, which can modify crop parameters in runtime

Default queue is non-blocking with size 8

Input inputControl = {*this, "inputControl", Input::Type::SReceiver, true, 8, {{DatatypeEnum::CameraControl, false}}}¶

Input for CameraControl message, which can modify camera parameters in runtime

Default queue is blocking with size 8

Output video = {*this, "video", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}¶

Outputs ImgFrame message that carries NV12 encoded (YUV420, UV plane interleaved) frame data.

Suitable for use with VideoEncoder node

Output preview = {*this, "preview", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}¶

Outputs ImgFrame message that carries BGR/RGB planar/interleaved encoded frame data.

Suitable for use with NeuralNetwork node

Output still = {*this, "still", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}¶

Outputs ImgFrame message that carries NV12 encoded (YUV420, UV plane interleaved) frame data.

The message is sent only when a CameraControl message arrives to inputControl with captureStill command set.

Output isp = {*this, "isp", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}¶

Outputs ImgFrame message that carries YUV420 planar (I420/IYUV) frame data.

Generated by the ISP engine, and the source for the ‘video’, ‘preview’ and ‘still’ outputs

Output raw = {*this, "raw", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}¶

Outputs ImgFrame message that carries RAW10-packed (MIPI CSI-2 format) frame data.

Captured directly from the camera sensor, and the source for the ‘isp’ output.

Output frameEvent = {*this, "frameEvent", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}¶

Outputs metadata-only ImgFrame message as an early indicator of an incoming frame.

It’s sent on the MIPI SoF (start-of-frame) event, just after the exposure of the current frame has finished and before the exposure for next frame starts. Could be used to synchronize various processes with camera capture. Fields populated: camera id, sequence number, timestamp

Public Static Attributes

static constexpr const char *NAME = "ColorCamera"¶

Private Members

std::shared_ptr<RawCameraControl> rawControl¶

Got questions?

Head over to Discussion Forum for technical support or any other questions you might have.