Camera and object tracking are critical components of advanced virtual production workflows. Yet, the industry faces a major challenge: countless tracking devices use different communication protocols, creating complexity for hardware manufacturers, software developers, and production teams.
The Comprehensive Tracking Protocol (C-Tracking) solves this problem. Created by industry experts, C-Tracking is a free, unified, and future-proof protocol designed to streamline tracking across all platforms and devices. It is accurate, flexible, simple to implement, and free to use without restrictions, making it an ideal standard for the entire industry.
Whether you're building hardware, developing virtual production platforms, or managing large-scale productions, C-Tracking is built to meet your needs, without licensing fees or usage restrictions.
Revision 43
This protocol is used to transmit tracking data about devices across a network. C-Tracking is designed for sending real physical values; see appendix D for details. In addition to tracking information, it can include data about other measurements.
Messages of this protocol are sent over UDP. The maximum length of a message, not including the IP and UDP headers, is 1400 bytes.
Messages are generated by the source (“sender”) and are sent to a pre-configured destination IP address-UDP port pair (“receiver”). The messages may be generated at regular intervals or, at the sender’s discretion, as soon as the data becomes available. The destination IP address may be unicast, broadcast or multicast. There is no communication in the other direction (from the receiver to the sender).
A UDP packet should contain information about a single logical device, and all UDP packets sent to the same port should contain information about the same device. If the device is a camera, all information about it (lens information, position, etc.) is expected to be sent in a single packet. A device can also represent only the measurement of the position and/or orientation of a single object in space, in which case the packet will probably only contain position/orientation information.
Any destination port number may be used, although port numbers below 1024 are not recommended for default values because they may cause problems if the receiver system requires special permissions to listen on such ports. The recommended default destination port is 2001 for the first device, and subsequent port numbers for any additional devices.
The packet format is flexible, with all the transmitted data (except for a short header) contained in elements, each of which are optional. This allows the sender to omit data which is unknown, or which is not applicable to the device whose data is being reported (for example, lens distortion information for a tracking device). However, every packet must contain all the available data for the device, even if it did not change since the last packet.
All numerical fields in a packet spanning multiple bytes are stored in big endian byte order. Signed integer fields are stored in two’s complement notation. Floating point values are encoded according to IEEE 754-2008 in binary32 (“single precision”) format. Infinite and NaN values are only allowed where explicitly stated. All NaN values sent must be quiet (non-signaling), with the specified payload, or with a payload of 0 where none is specified.
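As an illustration, here is a minimal sketch of encoding these field types in big endian order; the helper names (PutUint16, PutSingle) are hypothetical:

#include <cstdint>
#include <cstring>

// Hypothetical helper: write a 16-bit unsigned integer in big endian order.
void PutUint16(uint8_t* dst, uint16_t value)
{
    dst[0] = (uint8_t)(value >> 8);   // most significant byte first
    dst[1] = (uint8_t)(value & 0xFF);
}

// Hypothetical helper: write an IEEE 754 binary32 value in big endian order.
void PutSingle(uint8_t* dst, float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof(bits)); // reinterpret the float's raw bits
    dst[0] = (uint8_t)(bits >> 24);
    dst[1] = (uint8_t)(bits >> 16);
    dst[2] = (uint8_t)(bits >> 8);
    dst[3] = (uint8_t)(bits);
}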
Description of the types used in the tables below:

uint8 | Unsigned 8-bit integer
uint16 | Unsigned 16-bit integer
uint32 | Unsigned 32-bit integer
single | IEEE 754 binary32 (“single precision”) floating point value
The header starts at address 0 in the packet.
The receiver discards packets with an unexpected protocol identifier.
The receiver discards packets with a header length less than 6. The receiver silently ignores the contents of the header beyond the first 6 bytes if the header length is greater than 6.
ProtocolIdentifier | 4 bytes | “CTrk” (the value 0x4354726B in big endian byte order). |
HeaderLength | uint16 | 6 |
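A minimal receiver-side sketch of these header rules (the helper name is hypothetical; it returns the header length, from which element parsing starts, or -1 if the packet is discarded):

#include <cstdint>
#include <cstddef>

int ParseHeader(const uint8_t* packet, size_t packetLength)
{
    if (packetLength < 6) return -1; // too short to contain a header
    // ProtocolIdentifier must be "CTrk" (0x4354726B in big endian byte order).
    if (packet[0] != 'C' || packet[1] != 'T' || packet[2] != 'r' || packet[3] != 'k')
        return -1; // unexpected protocol identifier: discard
    uint16_t headerLength = (uint16_t)((packet[4] << 8) | packet[5]);
    if (headerLength < 6) return -1; // discard
    if (headerLength > packetLength) return -1; // assumption: a truncated header is invalid
    // Any header contents beyond the first 6 bytes are silently ignored.
    return headerLength;
}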
All elements are optional. However, it is recommended that the sender include the Frame rate element in the packets it sends. In the absence of this element, the receiver must assume that packets may arrive from the sender irregularly. Each element may only be present once in a packet unless otherwise noted in the element description.
All elements start at the first address aligned to a multiple of 4 after the previous element or (in the case of the first element) after the header. Padding bytes (no more than 3) are inserted by the sender after the header, between elements, and optionally after the last element to enforce this alignment. The sender sets the padding bytes to 0; the receiver ignores the contents of the padding bytes. The packet may contain no data after the last element apart from optionally at most 3 padding bytes.
The receiver discards packets containing an element with an element length less than 4.
The receiver silently ignores any elements with an unknown element type.
If the receiver encounters an element with a length less than it expects, it silently ignores the element.
If the receiver encounters an element with a length greater than it expects, it parses the element as if it were of the expected length, ignoring the excess data. The address of the next element is still calculated based on the value of the ElementLength field.
ElementType | uint16 | Element type |
ElementLength | uint16 | Length of the element, including the ElementType and ElementLength fields, excluding any padding bytes. |
Payload | ElementLength − 4 bytes | Element type-specific payload |
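Putting the alignment and length rules together, element iteration might look like the following sketch. HandleElement is a hypothetical dispatch function; elements with unknown types are silently ignored inside it:

#include <cstdint>
#include <cstddef>

void HandleElement(uint16_t type, const uint8_t* element, uint16_t length); // hypothetical

void ParseElements(const uint8_t* packet, size_t packetLength, size_t headerLength)
{
    // The first element starts at the first multiple of 4 after the header.
    size_t offset = (headerLength + 3) & ~(size_t)3;
    // Anything shorter than 4 bytes at the tail can only be padding.
    while (offset + 4 <= packetLength)
    {
        uint16_t elementType = (uint16_t)((packet[offset] << 8) | packet[offset + 1]);
        uint16_t elementLength = (uint16_t)((packet[offset + 2] << 8) | packet[offset + 3]);
        if (elementLength < 4) return; // discard the packet
        if (offset + elementLength > packetLength) return; // assumption: discard truncated elements
        HandleElement(elementType, packet + offset, elementLength);
        // The next element starts at the next multiple of 4 after this one.
        offset = (offset + elementLength + 3) & ~(size_t)3;
    }
}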
This element contains the timecode for the packet.
The four parts of the HH:MM:SS:FF timecode are sent in separate fields, with an additional optional field for a subframe index as described below.
The timecode base is normally equal to the frame rate of the sending system. The protocol allows any integral timecode base up to 255. However, only timecode bases up to 30 are compliant with the SMPTE standard. To send an SMPTE-compliant timecode for higher frame rates, set Base to the highest divisor of the frame rate that is less than or equal to 30. In this case, multiple consecutive frames will have the same HH:MM:SS:FF timecode. To differentiate between these, increase Subframe by one for every subsequent frame with the same frame number, and reset Subframe to zero when the frame number changes.
For example, to represent 50 frames per second, Base can simply be set to 50. In this case, Frame will increase from 0 to 49 within a single second, and Subframe will always contain zero. Alternatively, for SMPTE compliance, Base can be set to 25. In this case, Frame will only increase from 0 to 24 before jumping back to zero at the end of a second. Each HH:MM:SS:FF timecode will be repeated in two consecutive frames. Subframe should be set to 0 in the first one, and to 1 in the second.
In order to specify a timecode base for an NTSC-based frame rate (23.976, 29.97, 59.94, 119.88, 239.76), set Base to 24, 30, 60, 120 or 240 respectively, and set the NTSC bit (the least significant bit) in Flags. In an SMPTE-compliant representation, the maximum frame rate representable without subframes is 29.97. To specify, for example, 239.76 frames per second in SMPTE-compliant format, set Base to 30, set the NTSC bit, and use Subframe values from 0 to 7 cyclically to distinguish between the same HH:MM:SS:FF values.
Also note that all NTSC-based timecodes, except for 23.976, are assumed to use drop-frame numbering.
ElementType | uint16 | 0 |
ElementLength | uint16 | 11 |
Hours | uint8 | The hours part of the timecode |
Minutes | uint8 | The minutes part of the timecode |
Seconds | uint8 | The seconds part of the timecode |
Frames | uint8 | The frames part of the timecode |
Subframe | uint8 | The subframe index
Base | uint8 | The timecode base |
Flags | uint8 | The least significant bit is set to 1 for NTSC frame rates, and to 0 otherwise. The other bits are currently unassigned and should be set to 0 when sending. |
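As a worked sketch of the subframe rule above, the following hypothetical helper splits a frame index within the current second into SMPTE-compliant Frame and Subframe values (assuming the frame rate is an integer multiple of Base):

#include <cstdint>

void SplitTimecodeFrame(uint32_t frameInSecond, uint32_t fps, uint32_t base,
                        uint8_t& frame, uint8_t& subframe)
{
    uint32_t framesPerTick = fps / base;              // frames sharing one HH:MM:SS:FF value
    frame = (uint8_t)(frameInSecond / framesPerTick); // 0 .. base-1
    subframe = (uint8_t)(frameInSecond % framesPerTick);
}

For example, at 240 fps with Base = 30, frameInSecond = 17 yields Frame = 2 and Subframe = 1.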
This element contains the field of view and the frame aspect ratio for a camera.
ElementType | uint16 | 1 |
ElementLength | uint16 | 12 |
HorizontalFieldOfView | single | Horizontal field of view in degrees |
FrameAspectRatio | single | Frame aspect ratio, width / height |
This element contains the K1/K2-based lens distortion parameters for a camera.
This element is only used to describe radial distortion in general-purpose lenses. In particular, it is not suitable to describe the distortion of fisheye, anamorphic, or tilt-shift lenses.
A packet containing this element should also contain a “Field of view” element. Without field of view information, the contents of this element are not useful to the receiver.
The fields of this element contain the coefficients according to the distortion model used by OpenCV. A description of the distortion model is provided in appendix B, with the assumption that all coefficients other than K1 and K2 are zero.
Fields with unknown (not measured/calibrated) values should contain zero.
See appendix B.2 on choosing between the basic and extended parameter sets.
ElementType | uint16 | 2 |
ElementLength | uint16 | 20 |
CenterX | single | X coordinate of the principal point, relative to the image center. The left edge of the image is -0.5, the right edge is 0.5. |
CenterY | single | Y coordinate of the principal point, relative to the image center. The top edge of the image is -0.5, the bottom edge is 0.5. |
K1 | single | First radial distortion coefficient |
K2 | single | Second radial distortion coefficient |
This element contains an extended set of lens distortion parameters for a camera.
This element is only used to describe distortion in general-purpose lenses. In particular, it is not suitable to describe the distortion of fisheye, anamorphic, or tilt-shift lenses.
A packet containing this element should also contain a “Field of view” element. Without field of view information, the contents of this element are not useful to the receiver.
The fields of this element contain the coefficients according to the distortion model used by OpenCV. A description of the distortion model is provided in appendix B.
Fields with unknown (not measured/calibrated) values should contain zero.
See appendix B.2 on choosing between the basic and extended parameter sets.
ElementType | uint16 | 3 |
ElementLength | uint16 | 60 |
CenterX | single | X coordinate of the principal point, relative to the image center. The left edge of the image is -0.5, the right edge is 0.5. |
CenterY | single | Y coordinate of the principal point, relative to the image center. The top edge of the image is -0.5, the bottom edge is 0.5. |
K1 | single | First radial distortion coefficient |
K2 | single | Second radial distortion coefficient |
K3 | single | Third radial distortion coefficient |
K4 | single | Fourth radial distortion coefficient |
K5 | single | Fifth radial distortion coefficient |
K6 | single | Sixth radial distortion coefficient |
P1 | single | First tangential distortion coefficient |
P2 | single | Second tangential distortion coefficient |
S1 | single | First thin prism distortion coefficient |
S2 | single | Second thin prism distortion coefficient |
S3 | single | Third thin prism distortion coefficient |
S4 | single | Fourth thin prism distortion coefficient |
This element contains the distance to the object in focus for a camera. The distance is measured from the entrance pupil as described in appendix B, and may therefore differ from the physical distance measured from the camera sensor. In most practical cases, however, the difference between these two values is negligible.
ElementType | uint16 | 4 |
ElementLength | uint16 | 8 |
FocusDistance | single | The focus distance in meters |
This element contains information about the camera’s sensor.
ElementType | uint16 | 5 |
ElementLength | uint16 | 16 |
ActiveSensorWidth | single | Width of the part of the camera sensor used to produce the output image, in millimeters. May be zero if unknown. |
ActiveSensorHeight | single | Height of the part of the camera sensor used to produce the output image, in millimeters. May be zero if unknown. |
ActiveHorizontalResolution | uint16 | Horizontal resolution of the part of the camera sensor used to produce the output image, in pixels. May be zero if unknown. |
ActiveVerticalResolution | uint16 | Vertical resolution of the part of the camera sensor used to produce the output image, in pixels. May be zero if unknown. |
This element contains the width of the aperture, represented as an f-number, for a camera.
ElementType | uint16 | 6 |
ElementLength | uint16 | 8 |
FNumber | single | The f-number. For example, an aperture of f/5.6 should be represented by the value 5.6. |
This element contains information about the gradual reduction of the image’s brightness towards the periphery of the image.
This element is only used to describe vignetting in general-purpose lenses with a circular vignetting effect. In particular, it may not be suitable to describe the effect in fisheye, anamorphic or tilt-shift lenses.
This element contains an arbitrary number N (≥ 1) of data fields. Each of these fields contains the ratio of brightness lost at a specific distance from the image center. Each field must contain a value between 0.0 and 1.0, with 0.0 representing no brightness loss and 1.0 representing a complete loss of brightness (a pixel that is always black).
The values are ordered from the image center to the corners and are at equal intervals. The first value represents the brightness loss at 1/N the way from the image center to a corner, the second value at 2/N the way, and so on. The last value is the brightness loss at the corners. The value for the center of the image is not transported and is always assumed to be zero (no loss of brightness).
ElementType | uint16 | 7 |
ElementLength | uint16 | Length of the element. Equal to 8 + 4⋅RatioCount. |
RatioCount | uint16 | The number of ratio fields in the packet (N). |
Reserved | uint16 | Zero |
Ratio1 | single | Brightness loss at 1/N the way from the image center to a corner |
Ratio2 | single | Brightness loss at 2/N the way from the image center to a corner |
... | ... | ... |
RatioN | single | Brightness loss at the image corners |
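As an illustration, here is a hedged receiver-side sketch that evaluates the brightness loss at an arbitrary distance from the image center (normalized so that 1.0 is at a corner). Linear interpolation between the transmitted samples is an assumption; the element does not prescribe an interpolation method:

#include <vector>

// ratios holds the N transmitted values at distances 1/N, 2/N, ..., N/N.
// The value at the center is implicitly zero.
double BrightnessLoss(const std::vector<double>& ratios, double distance)
{
    const size_t n = ratios.size();
    if (n == 0 || distance <= 0.0) return 0.0; // center: no brightness loss
    if (distance >= 1.0) return ratios[n - 1]; // at or beyond the corners
    double pos = distance * n;   // sample i (0-based) sits at position i + 1
    size_t i = (size_t)pos;
    double frac = pos - i;
    double lower = (i == 0) ? 0.0 : ratios[i - 1]; // implicit zero at the center
    return lower + (ratios[i] - lower) * frac;
}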
This element contains information about the position of a camera or a standalone tracking device.
If the device is a camera, the position reported is that of the entrance pupil as described in appendix B.
The position is represented as the combination of a translation and a rotation. These are applied in the usual order (rotation first, then translation) to get the transformation representing the position.
The translation is specified in meters in a right-handed coordinate system. The axes of the coordinate system represent the right (X), up (Y) and backward (Z) directions of the tracking system. If the tracking system is capable of ensuring it, Y = 0 should correspond to the floor level, with Y > 0 representing objects above the floor.
The rotation is represented as a unit quaternion (using Hamilton’s definition for multiplying basis elements) in the form (x, y, z, w), where x, y and z are the components corresponding to the respective coordinates, and w is the real component.
The error of the translation value is described in the TranslationError field. This field contains the maximum magnitude of the difference between the reported translation and the real translation vector, with a confidence of 99.73% (3σ, assuming a measurement error with normal distribution).
If the translation is unknown, the TranslationError field should be set to positive infinity. If the error of the translation is unknown, the TranslationError field should contain zero.
The error of the rotation value is described in the RotationError field. It contains the maximum angle of the rotation between the reported rotation (described by the RotationX, RotationY, RotationZ and RotationW fields) and the real rotation, with a confidence of 99.73%.
If the rotation is unknown, the rotation error field should be set to positive infinity. If the error of the rotation is unknown, the rotation error field should contain zero.
ElementType | uint16 | 8 |
ElementLength | uint16 | 40 |
TranslationX | single | X component of the translation vector in meters |
TranslationY | single | Y component of the translation vector in meters |
TranslationZ | single | Z component of the translation vector in meters |
RotationX | single | X component of the rotation quaternion |
RotationY | single | Y component of the rotation quaternion |
RotationZ | single | Z component of the rotation quaternion |
RotationW | single | Real component of the rotation quaternion |
TranslationError | single | Error of the translation vector in meters. +∞ if the translation vector is unknown; 0.0 if the error is unknown. |
RotationError | single | Error of the rotation component in radians. +∞ if the rotation is unknown; 0.0 if the error is unknown. |
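As an illustration of the “rotation first, then translation” rule, the following hypothetical sketch transforms a device-local point into tracking-system coordinates using the transmitted quaternion and translation:

#include <cmath>

struct Vec3 { double x, y, z; };
struct Quat { double x, y, z, w; }; // Hamilton convention, w is the real part

// Rotate vector v by unit quaternion q (v' = q v q*), using the standard
// expansion: t = 2 (u x v), v' = v + w t + (u x t), where u = (q.x, q.y, q.z).
Vec3 Rotate(const Quat& q, const Vec3& v)
{
    double tx = 2.0 * (q.y * v.z - q.z * v.y);
    double ty = 2.0 * (q.z * v.x - q.x * v.z);
    double tz = 2.0 * (q.x * v.y - q.y * v.x);
    return { v.x + q.w * tx + (q.y * tz - q.z * ty),
             v.y + q.w * ty + (q.z * tx - q.x * tz),
             v.z + q.w * tz + (q.x * ty - q.y * tx) };
}

// Hypothetical helper: rotation is applied first, then translation.
Vec3 TransformPoint(const Quat& rotation, const Vec3& translation, const Vec3& p)
{
    Vec3 r = Rotate(rotation, p);
    return { r.x + translation.x, r.y + translation.y, r.z + translation.z };
}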
This element contains information about the velocity and angular velocity of a camera or standalone tracking device. It can help the receiver approximate the device’s position between packets.
The velocity is specified in meters per second in the same coordinate system as the position; the angular velocity (per second) is represented as a unit quaternion, like the rotation.
If either the velocity or the angular velocity is unknown, the component fields of the unknown quantity should contain NaN.
ElementType | uint16 | 9 |
ElementLength | uint16 | 32 |
VelocityX | single | X component of the velocity vector in meters per second; NaN if the velocity is unknown |
VelocityY | single | Y component of the velocity vector in meters per second; NaN if the velocity is unknown |
VelocityZ | single | Z component of the velocity vector in meters per second; NaN if the velocity is unknown |
AngularVelocityX | single | X component of the angular velocity quaternion (per second); NaN if the angular velocity is unknown |
AngularVelocityY | single | Y component of the angular velocity quaternion (per second); NaN if the angular velocity is unknown |
AngularVelocityZ | single | Z component of the angular velocity quaternion (per second); NaN if the angular velocity is unknown |
AngularVelocityW | single | Real component of the angular velocity quaternion (per second); NaN if the angular velocity is unknown |
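A hedged sketch of extrapolating a pose dt seconds past the last packet, reusing the Vec3/Quat types from the sketch above. The per-second angular velocity quaternion is raised to the power dt by scaling its rotation angle; composing it on the left of the rotation is an assumption of this sketch:

#include <cmath>

// Raise a unit quaternion to the power t by scaling its rotation angle.
Quat QuatPow(const Quat& q, double t)
{
    double halfAngle = acos(fmin(fmax(q.w, -1.0), 1.0));
    double s = sin(halfAngle);
    if (s < 1e-9) return { 0.0, 0.0, 0.0, 1.0 }; // negligible rotation
    double scaledHalf = halfAngle * t;
    double k = sin(scaledHalf) / s;
    return { q.x * k, q.y * k, q.z * k, cos(scaledHalf) };
}

// Hamilton product a * b.
Quat Multiply(const Quat& a, const Quat& b)
{
    return { a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
             a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
             a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w,
             a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z };
}

void Extrapolate(Vec3& translation, Quat& rotation,
                 const Vec3& velocity, const Quat& angularVelocity, double dt)
{
    translation.x += velocity.x * dt;
    translation.y += velocity.y * dt;
    translation.z += velocity.z * dt;
    rotation = Multiply(QuatPow(angularVelocity, dt), rotation);
}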
This element contains the frame rate of the tracking device.
The presence of this element indicates that the sender generates packets at regular intervals, with one packet generated every 1/framerate seconds. If this element is present, the sender must send packets as close to this schedule as possible.
Frame rate is expressed in the form of a rational number (quotient of two integers). This is necessary to express NTSC-based frame rates precisely.
In the non-NTSC case, when the frame rate is an integer, we recommend using 1 as the denominator; for example, 50 fps should be expressed as 50 / 1. For a fractional frame rate, use a suitable denominator; for example, 12.5 fps would be 25 / 2.
In the NTSC case it is mandatory to use 1001 as the denominator. For example, 23.976 fps must be expressed as 24000 / 1001, 59.94 fps as 60000 / 1001, and so on.
ElementType | uint16 | 10 |
ElementLength | uint16 | 12 |
FrameRateNumerator | uint32 | The numerator part of the frames per second value |
FrameRateDenominator | uint32 | The denominator part of the frames per second value |
This element can be used to report the value of a measurement. The measurement can be one of the usual encoder values like zoom, focus, or aperture, or any custom measurement value, for example a temperature. Multiple elements of this type can be present in a packet, but each instance has to contain a different value in the MeasurementType field.
It is important to note that sending raw zoom, focus, and aperture values should only be supplementary, for example as debugging data. It is mandatory to also include the Field of view element in the packet when the raw zoom encoder value is present, the Focus distance element when the raw focus encoder value is present, and the Aperture element when the raw aperture encoder value is present.
ElementType | uint16 | 11 |
ElementLength | uint16 | 20 |
MeasurementType | uint32 | An identifier describing the type of value contained in the element. 0 for zoom encoder, 1 for focus encoder, 2 for aperture encoder; any other value designates a custom measurement. |
Value | single | The value of the measurement |
MinValue | single | The lowest possible value of the measurement or NaN if unknown. The value may be negative infinity if there is no lower bound. |
MaxValue | single | The highest possible value of the measurement or NaN if unknown. The value may be positive infinity if there is no upper bound. |
The receiver can calculate the depth of field from the focal length of the lens, the focus distance, and the aperture width. The focal length itself is calculated from the field of view and the size of the part of the camera sensor used to produce the output image.
This means that a packet needs to contain the “Field of view”, “Camera sensor”, “Focus distance”, and “Aperture” elements for the receiver to be able to calculate depth of field information.
The transmitted data contains the “Horizontal Field of View” and “Frame Aspect Ratio” values. If you also need the Vertical Field of View value, you can calculate it this way:
\[ \textit{FOV}_y = 2 \cdot \textit{arctan}\left( \textit{tan}\left( \frac{\textit{FOV}_x}{2} \right) \cdot \frac{1}{\textit{AR}} \right) \]
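As an illustration of the depth of field calculation outlined above, here is a hedged sketch using standard thin-lens formulas. The circle of confusion is the receiver's own choice and is not transmitted, and the entrance-pupil subtlety described in appendix B is ignored here:

#include <cmath>

void DepthOfField(double fovXDegrees, double sensorWidthMm, double fNumber,
                  double focusDistanceM, double cocMm,
                  double& nearM, double& farM)
{
    const double pi = 3.14159265358979323846;
    double fovX = fovXDegrees * pi / 180.0;
    double f = sensorWidthMm / (2.0 * tan(fovX / 2.0)); // focal length in mm
    double H = f * f / (fNumber * cocMm) + f;           // hyperfocal distance in mm
    double s = focusDistanceM * 1000.0;                 // focus distance in mm
    nearM = H * s / (H + (s - f)) / 1000.0;
    farM = (s >= H) ? INFINITY : H * s / (H - (s - f)) / 1000.0;
}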
To represent the projection and the distortion of the camera, a pinhole camera model is used with the addition of a distortion model similar to that used by OpenCV as described at https://docs.opencv.org/4.x/d9/d0c/group__calib3d.html#details.
This model describes most general-purpose camera lenses well but is not suitable for describing the distortion of fisheye, anamorphic, or tilt-shift lenses.
The model is centered around the entrance pupil of the camera (the center of perspective, sometimes also incorrectly called the nodal point). The entrance pupil is not a physical point in the lens or the camera; it’s the point where a virtual pinhole camera must be placed for it to perfectly overlap the field of view of the real camera. The entrance pupil is located on the camera’s optical axis, somewhere behind the lens. In the case of a zoom lens, its position is not fixed; it moves forward and backward as the field of view changes.
The camera’s coordinate system is centered on the entrance pupil, with the X axis pointing right, the Y axis pointing up, and the Z axis pointing backward.
The following steps are used to find the coordinates (u, v) of the pixel at which the camera displays a point at coordinates (xc, yc, zc) in this coordinate system:
The coordinates are projected onto the z = 1 plane:
$$x_1 = \frac{x_c}{z_c}$$
$$y_1 = \frac{y_c}{z_c}$$
Distance from the optical axis is calculated:
$$r = \sqrt{x_1^2 + y_1^2}$$
Lens distortion is applied:
$$ x_d = x_1 \cdot \frac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + 2p_1 x_1 y_1 + p_2 (r^2 + 2x_1^2) + s_1 r^2 + s_2 r^4 $$
$$ y_d = y_1 \cdot \frac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + p_1 (r^2 + 2 y_1^2) + 2 p_2 x_1 y_1 + s_3 r^2 + s_4 r^4 $$
The distorted point is projected onto the image sensor:
$$u = f_x \cdot x_d + c_x$$
$$v = f_y \cdot y_d + c_y$$
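The projection steps above translate directly into code; here is a minimal sketch (the function name is hypothetical, and fx, fy, cx, cy are assumed to be known in pixels):

#include <cmath>

// Camera-space point (xc, yc, zc) to pixel coordinates (u, v), following
// the four steps above.
void ProjectPoint(double xc, double yc, double zc,
                  double k1, double k2, double k3, double k4, double k5, double k6,
                  double p1, double p2, double s1, double s2, double s3, double s4,
                  double fx, double fy, double cx, double cy,
                  double& u, double& v)
{
    // 1. Project onto the z = 1 plane.
    double x1 = xc / zc;
    double y1 = yc / zc;
    // 2. Distance from the optical axis (squared powers of r).
    double r2 = x1 * x1 + y1 * y1;
    double r4 = r2 * r2, r6 = r4 * r2;
    // 3. Apply lens distortion.
    double radial = (1.0 + k1 * r2 + k2 * r4 + k3 * r6)
                  / (1.0 + k4 * r2 + k5 * r4 + k6 * r6);
    double xd = x1 * radial + 2.0 * p1 * x1 * y1 + p2 * (r2 + 2.0 * x1 * x1)
              + s1 * r2 + s2 * r4;
    double yd = y1 * radial + p1 * (r2 + 2.0 * y1 * y1) + 2.0 * p2 * x1 * y1
              + s3 * r2 + s4 * r4;
    // 4. Project onto the image sensor.
    u = fx * xd + cx;
    v = fy * yd + cy;
}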
Coefficients k1 through k6 represent radial distortion (barrel distortion, pincushion distortion, or a combination of the two). p1 and p2 represent tangential (de-centering) distortion. s1 through s4 represent thin prism distortion. The “Lens distortion” element contains these coefficients. If any of these coefficients is not measured, it should be passed as zero.
The most prevalent type of distortion on general-purpose lenses is radial distortion, and it can be usually described fairly accurately using only k1 and k2. If lens distortion is calculated in a model using only these two coefficients, the rest of the fields in the “Lens distortion” element should be zero.
In the final step, fx and fy contain the focal length measured in pixels, and (cx, cy) are the coordinates of the principal point (the point where the optical axis intersects the image sensor) in pixels. These values are not contained in the packets directly to avoid dependence on the resolution of the image transmitted by the camera. The following data is transmitted instead:
$$\textit{FOV}_x = 2 \cdot \textit{arctan}\left(\frac{r_x}{2 f_x}\right)$$
\[ \textit{AR} = \frac{r_x}{r_y} \cdot \frac{f_x}{f_y} \]
(FOV is measured in degrees), and
\[ C_x = \frac{c_x}{r_x} - 0.5 \]
\[ C_y = \frac{c_y}{r_y} - 0.5 \]
where rx and ry are the horizontal and vertical resolution of the image, respectively.
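On the receiving side, the intrinsic values can be recovered for a chosen output resolution by inverting these formulas; a minimal sketch (function name hypothetical):

#include <cmath>

void RecoverIntrinsics(double fovXDegrees, double ar, double Cx, double Cy,
                       double rx, double ry,
                       double& fx, double& fy, double& cx, double& cy)
{
    const double pi = 3.14159265358979323846;
    double fovX = fovXDegrees * pi / 180.0;
    fx = rx / (2.0 * tan(fovX / 2.0));
    fy = fx * (rx / ry) / ar; // from AR = (rx / ry) * (fx / fy)
    cx = (Cx + 0.5) * rx;
    cy = (Cy + 0.5) * ry;
}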
Though the extended model is a superset of the basic model, its K1 and K2 coefficients can differ from the K1 and K2 values of the basic model. The sender should send the model with which the lens calibration was performed. However, if the sender uses the extended model, it is highly recommended to also calculate and send the basic parameter set, or at least to provide a selectable option in its UI to do so.
The receiver is allowed to support only one of the models, depending on which one is more suitable for its purposes.
The protocol accepts lens distortion data in only one format. If the distortion of the lens is calculated using a different model, it needs to be converted to this representation. The exact way to do this depends on the distortion model used; some common differences are listed below.
Camera models usually use focal length to describe the field of view; this protocol expects field of view angles instead. In general, the
$$\textit{FOV}_x = 2 \cdot \textit{arctan}\left(\frac{s_x}{2 f_x}\right)$$
$$\textit{FOV}_y = 2 \cdot \textit{arctan}\left(\frac{s_y}{2 f_y}\right)$$
equations can be used to calculate the field of view from the focal length (fx, fy) and the sensor size (sx, sy). The two quantities have to be in the same unit of measurement. Calibration tools usually output fx and fy in pixels, in which case sx and sy are the horizontal and vertical resolution of the image, respectively. Some devices may output both in millimeters, in which case the same formula can be used.
Other models may describe the principal point using different units or in different directions. If the model’s description doesn’t explicitly mention what the principal point is, it’s often easiest to find it as the central point of the radial distortion.
Some models apply distortion and projection in the opposite order; that is, they first project the object coordinates onto the sensor (using fx, fy, cx and cy), and apply distortion after this. To convert the distortion coefficients from such a model to the one used in this protocol, one first has to reproject the image onto a plane at unit distance and transform the coefficients accordingly.
In addition, the values of the coefficients depend on the choice of unit for the coordinates. The coordinates are often specified in pixels, millimeters (on the camera sensor), or in multiples of the image width, height, or diagonal. As an example, if the coordinates are measured in millimeters, k1 is measured in mm⁻² and k2 in mm⁻⁴. The unit of measurement needs to be taken into account when converting the coefficients to the representation used in this protocol.
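For example, if the source model measures coordinates in millimeters on the sensor and the lens has focal length f in millimeters (assuming fx ≈ fy), then a radius of r₁ on the unit plane corresponds to f·r₁ on the sensor, so the coefficients convert as

\[ k_1' = k_1 \cdot f^2, \qquad k_2' = k_2 \cdot f^4 \]

where k₁, k₂ are the millimeter-based coefficients and k₁′, k₂′ are the unit-plane coefficients used in this protocol.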
In the model used in this protocol, the undistorted coordinates x1 and y1 are measured on a plane at unit distance from the entrance pupil, which results in what could be considered a natural representation. This means that the distortion coefficients are independent of both the focal length and of the unit used for measuring xc, yc and zc.
It should also be noted that technically distortion applied after projection using the focal lengths (fx and fy) cannot be represented using the model used in this protocol (and vice versa): a circularly symmetric radial distortion in one will generally result in a non-circular distortion in the other. However, as long as fx and fy are almost identical—as is the case in any lens which does not purposefully compress the image in one direction—the result should be visually indistinguishable.
In addition to the above, some models describe distortion in the inverse direction; that is, they give the coordinates of the undistorted pixels as a function of the coordinates of the distorted ones. There is generally no exact way to calculate the inverse form of such a model, but the following method works well for distortions that are not too large. areaWidth refers to the width of the projection area, typically the used part of the camera sensor; its unit depends on the model you’re converting from.
#include <cmath>
#include <cstring>

// Fits inverse radial distortion coefficients (invK1, invK2) to a forward
// model with coefficients (K1, K2) by linear least squares over N sample
// radii. Returns false if the normal equations are singular.
bool InvertRadial(double K1, double K2, double areaWidth, double& invK1, double& invK2)
{
    invK1 = invK2 = 0.0;
    const int N = 5;             // number of sample radii
    double pa = 0.5 * areaWidth; // maximum radius (half the area width)
    double dx = 1.0 / N;
    double x = dx;
    double D[N][2];              // design matrix: [r^3, r^5] per sample
    double d[N];                 // right-hand side: q - r per sample
    for (int i = 0; i < N; i++, x += dx)
    {
        double undist = x;
        // Apply the forward distortion to the physical radius p = x * pa.
        double p = x * pa;
        double p2 = p * p;
        p *= (1.0 + p2 * (K1 + p2 * K2));
        double dist = p / pa;
        double r = dist * pa;    // distorted radius
        double q = undist * pa;  // undistorted radius
        // We fit q ≈ r + invK1 * r^3 + invK2 * r^5.
        double r2 = r * r;
        D[i][0] = r * r2;
        D[i][1] = r * r2 * r2;
        d[i] = q - r;
    }
    // A = D.transp() * D
    double A[2][2];
    memset(A, 0, sizeof(A));
    for (int i = 0; i < N; i++)
    {
        A[0][0] += D[i][0] * D[i][0];
        A[0][1] += D[i][0] * D[i][1];
        A[1][1] += D[i][1] * D[i][1];
    }
    A[1][0] = A[0][1];
    // B = (D.transp() * D).inv()
    double B[2][2];
    double det = A[0][0] * A[1][1] - A[0][1] * A[1][0];
    if (fabs(det) < 1e-14) return false;
    B[0][0] = A[1][1] / det;
    B[1][1] = A[0][0] / det;
    B[0][1] = B[1][0] = -A[0][1] / det;
    // C = (D.transp() * D).inv() * D.transp()
    double C[2][N];
    for (int i = 0; i < N; i++)
    {
        C[0][i] = B[0][0] * D[i][0] + B[0][1] * D[i][1];
        C[1][i] = B[1][0] * D[i][0] + B[1][1] * D[i][1];
    }
    // invK = (D.transp() * D).inv() * D.transp() * d
    for (int i = 0; i < N; i++)
    {
        invK1 += C[0][i] * d[i];
        invK2 += C[1][i] * d[i];
    }
    return true;
}
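Hypothetical usage, with example coefficient values:

double invK1, invK2;
// Forward coefficients fitted over a 36 mm wide projection area (example values).
if (InvertRadial(-0.21, 0.05, 36.0, invK1, invK2))
{
    // invK1 and invK2 now approximate the inverse mapping
    // undistorted ≈ distorted * (1 + invK1 * r^2 + invK2 * r^4).
}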
C-Tracking is designed for sending real physical values. For example, instead of sending the raw encoder value of a zoom sensor, it is recommended to send the real field of view and lens distortion data. This requires the lens calibration to happen on the tracking device's side. It takes somewhat more effort, but in return the tracking device becomes truly plug-and-play. In some cases, such as PTZ cameras, calibration can happen in the factory: for a batch of cameras with the same build parameters, it only needs to be done once, and the calibration profile can be stored in the firmware.