ColorCamera

ColorCamera node is a source of image frames. You can control in at runtime with the InputControl and InputConfig.

How to place it

pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)
dai::Pipeline pipeline;
auto cam = pipeline.create<dai::node::ColorCamera>();

Inputs and Outputs

                       ColorCamera node
               ┌──────────────────────────────┐
               │   ┌─────────────┐            │
               │   │    Image    │ raw        │     raw
               │   │    Sensor   │---┬--------├────────►
               │   └────▲────────┘   |        │
               │        │   ┌--------┘        │
               │      ┌─┴───▼─┐               │     isp
inputControl   │      │       │-------┬-------├────────►
──────────────►│------│  ISP  │ ┌─────▼────┐  │   video
               │      │       │ |          |--├────────►
               │      └───────┘ │   Image  │  │   still
inputConfig    │                │   Post-  │--├────────►
──────────────►│----------------|Processing│  │ preview
               │                │          │--├────────►
               │                └──────────┘  │
               └──────────────────────────────┘

Message types

  • inputConfig - ImageManipConfig

  • inputControl - CameraControl

  • raw - ImgFrame - RAW10 bayer data. Demo code for unpacking here

  • isp - ImgFrame - YUV420 planar (same as YU12/IYUV/I420)

  • still - ImgFrame - NV12, suitable for bigger size frames. The image gets created when a capture event is sent to the ColorCamera, so it’s like taking a photo

  • preview - ImgFrame - RGB (or BGR planar/interleaved if configured), mostly suited for small size previews and to feed the image into NeuralNetwork

  • video - ImgFrame - NV12, suitable for bigger size frames

ISP (image signal processor) is used for bayer transformation, demosaicing, noise reduction, and other image enhancements. It interacts with the 3A algorithms: auto-focus, auto-exposure, and auto-white-balance, which are handling image sensor adjustments such as exposure time, sensitivity (ISO), and lens position (if the camera module has a motorized lens) at runtime. Click here for more information.

Image Post-Processing converts YUV420 planar frames from the ISP into video/ preview/ still frames.

still (when a capture is triggered) and isp work at the max camera resolution, while video and preview are limited to max 4K (3840 x 2160) resolution, which is cropped from isp. For IMX378 (12MP), the post-processing works like this:

┌─────┐   Cropping to   ┌─────────┐  Downscaling   ┌──────────┐
│ ISP ├────────────────►│  video  ├───────────────►│ preview  │
└─────┘  max 3840x2160  └─────────┘  and cropping  └──────────┘

If the resolution is set to 12MP and video mode is used, a 4K frame (3840x2160) will be cropped from the center of the 12MP frame.

Full FOV

Some sensors, such as the IXM378, default to a 1080P resolution, which is a crop from the full sensor resolution. You can print sensor features to see how the field of view (FOV) is affected by the selected sensor resolution:

import depthai as dai

with dai.Device() as device:
    for cam in dev.getConnectedCameraFeatures():
        print(cam)
        #continue  # uncomment for less verbosity
        for cfg in cam.configs:
            print("   ", cfg)

# Running on OAK-D-S2 will print:
# {socket: CAM_A, sensorName: IMX378, width: 4056, height: 3040, orientation: AUTO, supportedTypes: [COLOR], hasAutofocus: 1, hasAutofocusIC: 1, name: color}
#     {width: 1920, height: 1080, minFps: 2.03, maxFps: 60, type: COLOR, fov: {x:108, y: 440, width: 3840, height: 2160}}
#     {width: 3840, height: 2160, minFps: 1.42, maxFps: 42, type: COLOR, fov: {x:108, y: 440, width: 3840, height: 2160}}
#     {width: 4056, height: 3040, minFps: 1.42, maxFps: 30, type: COLOR, fov: {x:0, y: 0, width: 4056, height: 3040}}
#     {width: 1352, height: 1012, minFps: 1.25, maxFps: 52, type: COLOR, fov: {x:0, y: 0, width: 4056, height: 3036}}
#     {width: 2024, height: 1520, minFps: 2.03, maxFps: 85, type: COLOR, fov: {x:4, y: 0, width: 4048, height: 3040}}
# {socket: CAM_B, sensorName: OV9282, width: 1280, height: 800, orientation: AUTO, supportedTypes: [MONO], hasAutofocus: 0, hasAutofocusIC: 0, name: left}
#     {width: 1280, height: 720, minFps: 1.687, maxFps: 143.1, type: MONO, fov: {x:0, y: 40, width: 1280, height: 720}}
#     {width: 1280, height: 800, minFps: 1.687, maxFps: 129.6, type: MONO, fov: {x:0, y: 0, width: 1280, height: 800}}
#     {width: 640, height: 400, minFps: 1.687, maxFps: 255.7, type: MONO, fov: {x:0, y: 0, width: 1280, height: 800}}
# {socket: CAM_C, sensorName: OV9282, width: 1280, height: 800, orientation: AUTO, supportedTypes: [MONO], hasAutofocus: 0, hasAutofocusIC: 0, name: right}
#     {width: 1280, height: 720, minFps: 1.687, maxFps: 143.1, type: MONO, fov: {x:0, y: 40, width: 1280, height: 720}}
#     {width: 1280, height: 800, minFps: 1.687, maxFps: 129.6, type: MONO, fov: {x:0, y: 0, width: 1280, height: 800}}
#     {width: 640, height: 400, minFps: 1.687, maxFps: 255.7, type: MONO, fov: {x:0, y: 0, width: 1280, height: 800}}

For the IMX378 sensor, selecting a 4K or 1080P resolution results in approximately 5% horizontal and 29% vertical FOV cropping. To utilize the full sensor FOV, you should choose one of the following resolutions: THE_12_MP, THE_1352X1012, or THE_2024X1520.

Usage

pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(300, 300)
cam.setBoardSocket(dai.CameraBoardSocket.CAM_A)
cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
cam.setInterleaved(False)
cam.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)
dai::Pipeline pipeline;
auto cam = pipeline.create<dai::node::ColorCamera>();
cam->setPreviewSize(300, 300);
cam->setBoardSocket(dai::CameraBoardSocket::CAM_A);
cam->setResolution(dai::ColorCameraProperties::SensorResolution::THE_1080_P);
cam->setInterleaved(false);
cam->setColorOrder(dai::ColorCameraProperties::ColorOrder::RGB);

Limitations

Here are known camera limitations for the RVC2:

  • ISP can process about 600 MP/s, and about 500 MP/s when the pipeline is also running NNs and video encoder in parallel

  • 3A algorithms can process about 200..250 FPS overall (for all camera streams). This is a current limitation of our implementation, and we have plans for a workaround to run 3A algorithms on every Xth frame, no ETA yet

  • ISP Scaling numerator value can be 1..16 and denominator value 1..32 for both vertical and horizontal scaling. So you can downscale eg. 12MP (4056x3040) only to resolutions calculated here

Examples of functionality

Reference

class depthai.node.ColorCamera
class Id

Node identificator. Unique for every node on a single Pipeline

getAssetManager(*args, **kwargs)

Overloaded function.

  1. getAssetManager(self: depthai.Node) -> depthai.AssetManager

  2. getAssetManager(self: depthai.Node) -> depthai.AssetManager

getBoardSocket(self: depthai.node.ColorCamera)depthai.CameraBoardSocket
getCamId(self: depthai.node.ColorCamera)int
getCamera(self: depthai.node.ColorCamera)str
getColorOrder(self: depthai.node.ColorCamera)depthai.ColorCameraProperties.ColorOrder
getFp16(self: depthai.node.ColorCamera)bool
getFps(self: depthai.node.ColorCamera)float
getFrameEventFilter(self: depthai.node.ColorCamera)list[depthai.FrameEvent]
getImageOrientation(self: depthai.node.ColorCamera)depthai.CameraImageOrientation
getInputRefs(*args, **kwargs)

Overloaded function.

  1. getInputRefs(self: depthai.Node) -> list[depthai.Node.Input]

  2. getInputRefs(self: depthai.Node) -> list[depthai.Node.Input]

getInputs(self: depthai.Node)list[depthai.Node.Input]
getInterleaved(self: depthai.node.ColorCamera)bool
getIspHeight(self: depthai.node.ColorCamera)int
getIspNumFramesPool(self: depthai.node.ColorCamera)int
getIspSize(self: depthai.node.ColorCamera)tuple[int, int]
getIspWidth(self: depthai.node.ColorCamera)int
getName(self: depthai.Node)str
getOutputRefs(*args, **kwargs)

Overloaded function.

  1. getOutputRefs(self: depthai.Node) -> list[depthai.Node.Output]

  2. getOutputRefs(self: depthai.Node) -> list[depthai.Node.Output]

getOutputs(self: depthai.Node)list[depthai.Node.Output]
getParentPipeline(*args, **kwargs)

Overloaded function.

  1. getParentPipeline(self: depthai.Node) -> depthai.Pipeline

  2. getParentPipeline(self: depthai.Node) -> depthai.Pipeline

getPreviewHeight(self: depthai.node.ColorCamera)int
getPreviewKeepAspectRatio(self: depthai.node.ColorCamera)bool
getPreviewNumFramesPool(self: depthai.node.ColorCamera)int
getPreviewSize(self: depthai.node.ColorCamera)tuple[int, int]
getPreviewWidth(self: depthai.node.ColorCamera)int
getRawNumFramesPool(self: depthai.node.ColorCamera)int
getResolution(self: depthai.node.ColorCamera)depthai.ColorCameraProperties.SensorResolution
getResolutionHeight(self: depthai.node.ColorCamera)int
getResolutionSize(self: depthai.node.ColorCamera)tuple[int, int]
getResolutionWidth(self: depthai.node.ColorCamera)int
getSensorCrop(self: depthai.node.ColorCamera)tuple[float, float]
getSensorCropX(self: depthai.node.ColorCamera)float
getSensorCropY(self: depthai.node.ColorCamera)float
getStillHeight(self: depthai.node.ColorCamera)int
getStillNumFramesPool(self: depthai.node.ColorCamera)int
getStillSize(self: depthai.node.ColorCamera)tuple[int, int]
getStillWidth(self: depthai.node.ColorCamera)int
getVideoHeight(self: depthai.node.ColorCamera)int
getVideoNumFramesPool(self: depthai.node.ColorCamera)int
getVideoSize(self: depthai.node.ColorCamera)tuple[int, int]
getVideoWidth(self: depthai.node.ColorCamera)int
getWaitForConfigInput(self: depthai.node.ColorCamera)bool
sensorCenterCrop(self: depthai.node.ColorCamera)None
setBoardSocket(self: depthai.node.ColorCamera, boardSocket: depthai.CameraBoardSocket)None
setCamId(self: depthai.node.ColorCamera, arg0: int)None
setCamera(self: depthai.node.ColorCamera, name: str)None
setColorOrder(self: depthai.node.ColorCamera, colorOrder: depthai.ColorCameraProperties.ColorOrder)None
setFp16(self: depthai.node.ColorCamera, fp16: bool)None
setFps(self: depthai.node.ColorCamera, fps: float)None
setFrameEventFilter(self: depthai.node.ColorCamera, events: list[depthai.FrameEvent])None
setImageOrientation(self: depthai.node.ColorCamera, imageOrientation: depthai.CameraImageOrientation)None
setInterleaved(self: depthai.node.ColorCamera, interleaved: bool)None
setIsp3aFps(self: depthai.node.ColorCamera, isp3aFps: int)None
setIspNumFramesPool(self: depthai.node.ColorCamera, arg0: int)None
setIspScale(*args, **kwargs)

Overloaded function.

  1. setIspScale(self: depthai.node.ColorCamera, numerator: int, denominator: int) -> None

  2. setIspScale(self: depthai.node.ColorCamera, scale: tuple[int, int]) -> None

  3. setIspScale(self: depthai.node.ColorCamera, horizNum: int, horizDenom: int, vertNum: int, vertDenom: int) -> None

  4. setIspScale(self: depthai.node.ColorCamera, horizScale: tuple[int, int], vertScale: tuple[int, int]) -> None

setNumFramesPool(self: depthai.node.ColorCamera, raw: int, isp: int, preview: int, video: int, still: int)None
setPreviewKeepAspectRatio(self: depthai.node.ColorCamera, keep: bool)None
setPreviewNumFramesPool(self: depthai.node.ColorCamera, arg0: int)None
setPreviewSize(*args, **kwargs)

Overloaded function.

  1. setPreviewSize(self: depthai.node.ColorCamera, width: int, height: int) -> None

  2. setPreviewSize(self: depthai.node.ColorCamera, size: tuple[int, int]) -> None

setRawNumFramesPool(self: depthai.node.ColorCamera, arg0: int)None
setRawOutputPacked(self: depthai.node.ColorCamera, packed: bool)None
setResolution(self: depthai.node.ColorCamera, resolution: depthai.ColorCameraProperties.SensorResolution)None
setSensorCrop(self: depthai.node.ColorCamera, x: float, y: float)None
setStillNumFramesPool(self: depthai.node.ColorCamera, arg0: int)None
setStillSize(*args, **kwargs)

Overloaded function.

  1. setStillSize(self: depthai.node.ColorCamera, width: int, height: int) -> None

  2. setStillSize(self: depthai.node.ColorCamera, size: tuple[int, int]) -> None

setVideoNumFramesPool(self: depthai.node.ColorCamera, arg0: int)None
setVideoSize(*args, **kwargs)

Overloaded function.

  1. setVideoSize(self: depthai.node.ColorCamera, width: int, height: int) -> None

  2. setVideoSize(self: depthai.node.ColorCamera, size: tuple[int, int]) -> None

setWaitForConfigInput(self: depthai.node.ColorCamera, wait: bool)None
class dai::node::ColorCamera : public dai::NodeCRTP<Node, ColorCamera, ColorCameraProperties>

ColorCamera node. For use with color sensors.

Public Functions

ColorCamera(const std::shared_ptr<PipelineImpl> &par, int64_t nodeId)

Constructs ColorCamera node.

ColorCamera(const std::shared_ptr<PipelineImpl> &par, int64_t nodeId, std::unique_ptr<Properties> props)
int getScaledSize(int input, int num, int denom) const

Computes the scaled size given numerator and denominator

void setBoardSocket(CameraBoardSocket boardSocket)

Specify which board socket to use

Parameters
  • boardSocket: Board socket to use

CameraBoardSocket getBoardSocket() const

Retrieves which board socket to use

Return

Board socket to use

void setCamera(std::string name)

Specify which camera to use by name

Parameters
  • name: Name of the camera to use

std::string getCamera() const

Retrieves which camera to use by name

Return

Name of the camera to use

void setCamId(int64_t id)

Set which color camera to use.

int64_t getCamId() const

Get which color camera to use.

void setImageOrientation(CameraImageOrientation imageOrientation)

Set camera image orientation.

CameraImageOrientation getImageOrientation() const

Get camera image orientation.

void setColorOrder(ColorCameraProperties::ColorOrder colorOrder)

Set color order of preview output images. RGB or BGR.

ColorCameraProperties::ColorOrder getColorOrder() const

Get color order of preview output frames. RGB or BGR.

void setInterleaved(bool interleaved)

Set planar or interleaved data of preview output frames.

bool getInterleaved() const

Get planar or interleaved data of preview output frames.

void setFp16(bool fp16)

Set fp16 (0..255) data type of preview output frames.

bool getFp16() const

Get fp16 (0..255) data of preview output frames.

void setPreviewSize(int width, int height)

Set preview output size.

void setPreviewSize(std::tuple<int, int> size)

Set preview output size, as a tuple <width, height>

void setPreviewNumFramesPool(int num)

Set number of frames in preview pool.

void setVideoSize(int width, int height)

Set video output size.

void setVideoSize(std::tuple<int, int> size)

Set video output size, as a tuple <width, height>

void setVideoNumFramesPool(int num)

Set number of frames in preview pool.

void setStillSize(int width, int height)

Set still output size.

void setStillSize(std::tuple<int, int> size)

Set still output size, as a tuple <width, height>

void setStillNumFramesPool(int num)

Set number of frames in preview pool.

void setResolution(Properties::SensorResolution resolution)

Set sensor resolution.

Properties::SensorResolution getResolution() const

Get sensor resolution.

void setRawNumFramesPool(int num)

Set number of frames in raw pool.

void setIspNumFramesPool(int num)

Set number of frames in isp pool.

void setNumFramesPool(int raw, int isp, int preview, int video, int still)

Set number of frames in all pools.

void setIspScale(int numerator, int denominator)

Set ‘isp’ output scaling (numerator/denominator), preserving the aspect ratio. The fraction numerator/denominator is simplified first to a irreducible form, then a set of hardware scaler constraints applies: max numerator = 16, max denominator = 63

void setIspScale(std::tuple<int, int> scale)

Set ‘isp’ output scaling, as a tuple <numerator, denominator>

void setIspScale(int horizNum, int horizDenom, int vertNum, int vertDenom)

Set ‘isp’ output scaling, per each direction. If the horizontal scaling factor (horizNum/horizDen) is different than the vertical scaling factor (vertNum/vertDen), a distorted (stretched or squished) image is generated

void setIspScale(std::tuple<int, int> horizScale, std::tuple<int, int> vertScale)

Set ‘isp’ output scaling, per each direction, as <numerator, denominator> tuples.

void setFps(float fps)

Set rate at which camera should produce frames

Parameters
  • fps: Rate in frames per second

void setIsp3aFps(int isp3aFps)

Isp 3A rate (auto focus, auto exposure, auto white balance, camera controls etc.). Default (0) matches the camera FPS, meaning that 3A is running on each frame. Reducing the rate of 3A reduces the CPU usage on CSS, but also increases the convergence rate of 3A. Note that camera controls will be processed at this rate. E.g. if camera is running at 30 fps, and camera control is sent at every frame, but 3A fps is set to 15, the camera control messages will be processed at 15 fps rate, which will lead to queueing.

void setFrameEventFilter(const std::vector<dai::FrameEvent> &events)
std::vector<dai::FrameEvent> getFrameEventFilter() const
float getFps() const

Get rate at which camera should produce frames

Return

Rate in frames per second

std::tuple<int, int> getPreviewSize() const

Get preview size as tuple.

int getPreviewWidth() const

Get preview width.

int getPreviewHeight() const

Get preview height.

std::tuple<int, int> getVideoSize() const

Get video size as tuple.

int getVideoWidth() const

Get video width.

int getVideoHeight() const

Get video height.

std::tuple<int, int> getStillSize() const

Get still size as tuple.

int getStillWidth() const

Get still width.

int getStillHeight() const

Get still height.

std::tuple<int, int> getResolutionSize() const

Get sensor resolution as size.

int getResolutionWidth() const

Get sensor resolution width.

int getResolutionHeight() const

Get sensor resolution height.

std::tuple<int, int> getIspSize() const

Get ‘isp’ output resolution as size, after scaling.

int getIspWidth() const

Get ‘isp’ output width.

int getIspHeight() const

Get ‘isp’ output height.

void sensorCenterCrop()

Specify sensor center crop. Resolution size / video size

void setSensorCrop(float x, float y)

Specifies the cropping that happens when converting ISP to video output. By default, video will be center cropped from the ISP output. Note that this doesn’t actually do on-sensor cropping (and MIPI-stream only that region), but it does postprocessing on the ISP (on RVC).

Parameters
  • x: Top left X coordinate

  • y: Top left Y coordinate

std::tuple<float, float> getSensorCrop() const

Return

Sensor top left crop coordinates

float getSensorCropX() const

Get sensor top left x crop coordinate.

float getSensorCropY() const

Get sensor top left y crop coordinate.

void setWaitForConfigInput(bool wait)

Specify to wait until inputConfig receives a configuration message, before sending out a frame.

Parameters
  • wait: True to wait for inputConfig message, false otherwise

bool getWaitForConfigInput() const

See

setWaitForConfigInput

Return

True if wait for inputConfig message, false otherwise

void setPreviewKeepAspectRatio(bool keep)

Specifies whether preview output should preserve aspect ratio, after downscaling from video size or not.

Parameters
  • keep: If true, a larger crop region will be considered to still be able to create the final image in the specified aspect ratio. Otherwise video size is resized to fit preview size

bool getPreviewKeepAspectRatio()

See

setPreviewKeepAspectRatio

Return

Preview keep aspect ratio option

int getPreviewNumFramesPool()

Get number of frames in preview pool.

int getVideoNumFramesPool()

Get number of frames in video pool.

int getStillNumFramesPool()

Get number of frames in still pool.

int getRawNumFramesPool()

Get number of frames in raw pool.

int getIspNumFramesPool()

Get number of frames in isp pool.

void setRawOutputPacked(bool packed)

Configures whether the camera raw frames are saved as MIPI-packed to memory. The packed format is more efficient, consuming less memory on device, and less data to send to host: RAW10: 4 pixels saved on 5 bytes, RAW12: 2 pixels saved on 3 bytes. When packing is disabled (false), data is saved lsb-aligned, e.g. a RAW10 pixel will be stored as uint16, on bits 9..0: 0b0000’00pp’pppp’pppp. Default is auto: enabled for standard color/monochrome cameras where ISP can work with both packed/unpacked, but disabled for other cameras like ToF.

Public Members

CameraControl initialControl

Initial control options to apply to sensor

Input inputConfig = {*this, "inputConfig", Input::Type::SReceiver, false, 8, {{DatatypeEnum::ImageManipConfig, false}}}

Input for ImageManipConfig message, which can modify crop parameters in runtime

Default queue is non-blocking with size 8

Input inputControl = {*this, "inputControl", Input::Type::SReceiver, true, 8, {{DatatypeEnum::CameraControl, false}}}

Input for CameraControl message, which can modify camera parameters in runtime

Default queue is blocking with size 8

Output video = {*this, "video", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}

Outputs ImgFrame message that carries NV12 encoded (YUV420, UV plane interleaved) frame data.

Suitable for use with VideoEncoder node

Output preview = {*this, "preview", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}

Outputs ImgFrame message that carries BGR/RGB planar/interleaved encoded frame data.

Suitable for use with NeuralNetwork node

Output still = {*this, "still", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}

Outputs ImgFrame message that carries NV12 encoded (YUV420, UV plane interleaved) frame data.

The message is sent only when a CameraControl message arrives to inputControl with captureStill command set.

Output isp = {*this, "isp", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}

Outputs ImgFrame message that carries YUV420 planar (I420/IYUV) frame data.

Generated by the ISP engine, and the source for the ‘video’, ‘preview’ and ‘still’ outputs

Output raw = {*this, "raw", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}

Outputs ImgFrame message that carries RAW10-packed (MIPI CSI-2 format) frame data.

Captured directly from the camera sensor, and the source for the ‘isp’ output.

Output frameEvent = {*this, "frameEvent", Output::Type::MSender, {{DatatypeEnum::ImgFrame, false}}}

Outputs metadata-only ImgFrame message as an early indicator of an incoming frame.

It’s sent on the MIPI SoF (start-of-frame) event, just after the exposure of the current frame has finished and before the exposure for next frame starts. Could be used to synchronize various processes with camera capture. Fields populated: camera id, sequence number, timestamp

Public Static Attributes

static constexpr const char *NAME = "ColorCamera"

Private Members

std::shared_ptr<RawCameraControl> rawControl

Got questions?

Head over to Discussion Forum for technical support or any other questions you might have.