Version: dev-latest

Evaluate Mode

Purpose

  • Evaluate the quality of a calibration on a test dataset with fiducials.

Evaluate mode has been hidden in the latest MetriCal release.

At present, evaluate mode performs a single-iteration optimization, which may not produce results consistent with the intended use of the mode. The mode is intended to test and produce metrics against an object space and dataset separate from those passed into calibrate mode. However, if there is significant error in the calibration, evaluate is likely to projectively compensate by pushing all of that error into the object space during the single-iteration optimization.

Because of this, the results of evaluate mode can be difficult to interpret. We have therefore hidden the mode by default and plan to fix the underlying implementation in a future release so that results are easier to interpret. The mode is still available, but hidden. We advise caution in using this mode if you are not sure what you are doing. Feel free to contact support@tangramvision.com if you need any assistance in using this mode.

Usage

metrical evaluate [OPTIONS] \
<INPUT_DATA_PATH> \
<PLEX_OR_RESULTS_PATH> \
<OBJECT_SPACE_OR_RESULTS_PATH>

Concepts

Evaluate applies a calibration to a given dataset and produces metrics to validate the quality of the calibration. This is commonly referred to as validating the calibration on a "test" dataset, rather than on the "training" dataset that produced the calibration in the first place.

Evaluate runs the same adjustment as a metrical calibrate run, but with one crucial difference: it fixes all values in the plex and uses them to generate metrics.

Evaluate mode is designed to use the results of a calibration as its inputs. The input plex can be either a plex extracted from a results JSON or the results JSON itself. The same principle applies to the input object space: you may provide an optimized object space extracted from a results JSON, or the results JSON directly.

Just like Calibrate mode, Evaluate mode has the ability to use cached detections.
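A minimal sketch of reusing cached detections: the detections JSON from an earlier run takes the place of the dataset as the first argument. All paths below are hypothetical placeholders (the detections file name in particular is an assumption), and the snippet only prints the assembled command so it can be reviewed before running.

```shell
# Hypothetical placeholder paths; substitute your own files.
DETECTIONS=/my/test_data.detections.json  # cached detections from a previous run (file name is an assumption)
RESULTS=/my/results.json                  # results JSON written by metrical calibrate

# Assemble the command as positional parameters.
# The detections JSON replaces the dataset; the results JSON supplies the plex and the object space.
set -- metrical evaluate "$DETECTIONS" "$RESULTS" "$RESULTS"

# Keep a printable copy of the command for review. To execute it, run: "$@"
CACHED_CMD="$*"
echo "$CACHED_CMD"
```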

Examples

Run an evaluation on a test dataset

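TRAINING_DATA=/my/training_data.mcap
TEST_DATA=/my/test_data.mcap # <-- Should have the same object space as the training data!
PLEX=/my/plex.json           # placeholder path to the initial plex
OBJ=/my/object_space.json    # placeholder path to the object space
RESULTS=/my/results.json

# Calibrate on the training data and write the results JSON.
metrical calibrate -o "$RESULTS" "$TRAINING_DATA" "$PLEX" "$OBJ"

# Evaluate on the test data; the results JSON supplies both the plex and the object space.
metrical evaluate "$TEST_DATA" "$RESULTS" "$RESULTS"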

Notice that the input plex and object space for Evaluate mode are both extracted directly from the results JSON of Calibrate mode.

Arguments

[INPUT_DATA_OR_DETECTIONS_PATH]

The dataset to evaluate against, or the detections from an earlier run. Users can pass:

  1. ROS1 bags, in the form of a .bag file.
  2. MCAP files, with an .mcap extension.
  3. A top-level directory containing a set of nested directories for each topic.
  4. A detections JSON with all the cached detections from a previous run.

In all cases, the topic/folder name must match a named component in the plex in order to be matched correctly. If it does not, there's no need to edit the plex directly; instead, use the --topic-to-component flag.

[PLEX_OR_RESULTS_PATH]

The path to the input plex. This can be a MetriCal results JSON or a plex.json.

[OBJECT_SPACE_OR_RESULTS_PATH]

A path pointing to a description of the object space for the adjustment. This can be a MetriCal results JSON or an object_space.json.

Options

Universal Options

As with every mode, all universal options are supported (though not all may be used).

-y, --overwrite-detections

Overwrite the detections at this location, if they exist.

--disable-motion-filter

Disable data filtering based on motion from any data source. This is useful for datasets that are a series of snapshots, rather than continuous motion.

--camera-motion-threshold [CAMERA_MOTION_THRESHOLD]

Default: 1.0 (pixels/observation)

This threshold is used for filtering camera data based on detected feature motion in the image. An image is considered "still" if the average delta in features between subsequent frames is below this threshold. The units for this threshold are pixels/observation, where each observation is one frame.
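The stillness rule can be illustrated with a small worked check. This is only a sketch of the rule as described here, not MetriCal's actual implementation; the feature displacement values are made up for illustration.

```shell
CAMERA_THRESHOLD=1.0              # documented default, in pixels/observation
FEATURE_DELTAS="0.2 0.9 0.4 1.1"  # hypothetical feature displacements between two frames, in pixels

# Average the displacements and compare the mean against the threshold.
VERDICT=$(echo "$FEATURE_DELTAS" | awk -v threshold="$CAMERA_THRESHOLD" '{
    sum = 0
    for (i = 1; i <= NF; i++) sum += $i   # total displacement over all features
    mean = sum / NF                       # average delta for this frame pair
    print ((mean < threshold + 0) ? "still" : "moving")
}')

echo "$VERDICT"
```

Here the mean displacement is 0.65 pixels, which is below the 1.0 default, so the frame would count as "still" under this reading of the rule.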

--lidar-motion-threshold [LIDAR_MOTION_THRESHOLD]

Default: 0.1 (meters/observation)

This threshold is used for filtering lidar data based on detected feature motion in the point cloud's detected circle center. A point cloud is considered "still" if the average delta in metric space between subsequent detected circle centers is below this threshold. The units for this threshold are in meters/observation.
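As a sketch, both motion thresholds can be set on one evaluate run. The values 2.0 and 0.2 are arbitrary illustrations (double the documented defaults), and the paths are hypothetical placeholders. Since an observation is "still" when its delta falls below the threshold, raising the thresholds lets more observations count as still. The snippet only prints the assembled command.

```shell
# Hypothetical placeholder paths; substitute your own files.
TEST_DATA=/my/test_data.mcap
RESULTS=/my/results.json

# Illustrative values only: 2.0 pixels/observation for cameras, 0.2 meters/observation for lidar.
set -- metrical evaluate \
    --camera-motion-threshold 2.0 \
    --lidar-motion-threshold 0.2 \
    "$TEST_DATA" "$RESULTS" "$RESULTS"

# Keep a printable copy of the command for review. To execute it, run: "$@"
THRESHOLD_CMD="$*"
echo "$THRESHOLD_CMD"
```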

-o, --results-path [RESULTS_PATH]

Default: path/of/dataset/[name_of_dataset].results.json

The output path to save the final results of the program, in JSON format.
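To make the default concrete, the sketch below derives the default results path for a hypothetical dataset. It assumes that [name_of_dataset] is the dataset's file name without its extension; that reading is an assumption, not something confirmed here.

```shell
# Hypothetical dataset path.
TEST_DATA=/my/test_data.mcap

# Assumption: [name_of_dataset] is the file name with its extension removed.
DATASET_DIR=$(dirname "$TEST_DATA")     # directory containing the dataset
DATASET_NAME=$(basename "$TEST_DATA")   # file name, including the extension
DATASET_NAME=${DATASET_NAME%.*}         # strip the final extension

DEFAULT_RESULTS="$DATASET_DIR/$DATASET_NAME.results.json"
echo "$DEFAULT_RESULTS"
```

Under that assumption, the dataset /my/test_data.mcap gives /my/test_data.results.json. Pass -o explicitly when the results should be written somewhere else.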

-m, --topic-to-component [TOPIC_TO_COMPONENT]

A mapping of ROS topic/folder names to component names/UUIDs in the input plex.

MetriCal only parses data that has a topic-component mapping. Ideally, topics and components share the same name. However, if this is not the case, use this flag to map topic names from the dataset to component names in the plex.

-r, --render

Whether to visualize the current mode using Rerun. Run

rerun --memory-limit=1GB

...to install Rerun and start a visualization server. This will listen for outbound messages from MetriCal on port 9876. Read more about configuring Rerun here.

--render-socket [RENDER_SOCKET]

The web socket address on which Rerun is listening. This should be an IP address and port number separated by a colon, e.g. --render-socket="127.0.0.1:3030". By default, MetriCal expects Rerun to be listening on socket host.docker.internal:9876. If running locally (not via Docker), Rerun's default address is 127.0.0.1:9876.

When running Rerun from its CLI, the IP would correspond to its --bind option and the port would correspond to its --port option.
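A sketch of a local (non-Docker) run with rendering enabled, pointing MetriCal at the local Rerun address mentioned above. The dataset and results paths are hypothetical placeholders, and the snippet only prints the assembled command; a Rerun server is assumed to be running already.

```shell
# Hypothetical placeholder paths; substitute your own files.
TEST_DATA=/my/test_data.mcap
RESULTS=/my/results.json

# Address of the locally running Rerun server (IP and port separated by a colon).
RENDER_SOCKET="127.0.0.1:9876"

set -- metrical evaluate \
    --render \
    --render-socket="$RENDER_SOCKET" \
    "$TEST_DATA" "$RESULTS" "$RESULTS"

# Keep a printable copy of the command for review. To execute it, run: "$@"
RENDER_CMD="$*"
echo "$RENDER_CMD"
```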

Deprecated Options

These options are now deprecated and will be removed in the next major version of MetriCal.

-d, --disable-filter

Replaced with --disable-motion-filter.

-t, --stillness-threshold [STILLNESS_THRESHOLD]

Replaced with --camera-motion-threshold.

-T (short argument)

Short arg -T replaced by -m.
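As a migration sketch, the deprecated forms appear below as comments, followed by their current equivalents. The paths and the threshold value are hypothetical placeholders, and the snippet only prints the updated commands.

```shell
# Hypothetical placeholder paths; substitute your own files.
TEST_DATA=/my/test_data.mcap
RESULTS=/my/results.json

# Deprecated:  metrical evaluate -d "$TEST_DATA" "$RESULTS" "$RESULTS"
# Current equivalent, using the long replacement flag:
set -- metrical evaluate --disable-motion-filter "$TEST_DATA" "$RESULTS" "$RESULTS"
NEW_DISABLE_CMD="$*"

# Deprecated:  metrical evaluate -t 2.0 "$TEST_DATA" "$RESULTS" "$RESULTS"
# Current equivalent (2.0 is an illustrative value, not a recommendation):
set -- metrical evaluate --camera-motion-threshold 2.0 "$TEST_DATA" "$RESULTS" "$RESULTS"
NEW_THRESHOLD_CMD="$*"

# Print both updated commands for review.
echo "$NEW_DISABLE_CMD"
echo "$NEW_THRESHOLD_CMD"
```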