Evaluate Mode
Purpose
- Evaluate the quality of a calibration on a test dataset with fiducials.
Evaluate mode has been hidden in the latest MetriCal release.
At present, evaluate mode performs a single-iteration optimization, which may not produce results consistent with the intended use of the mode. In particular, this mode is meant to test a calibration and produce metrics against a separate object space and dataset from those passed into calibrate mode. However, if there is significant error in the calibration, the results of evaluate are likely to projectively compensate by pushing all of that error into the object space during that single-iteration optimization. This can make the results of evaluate mode difficult to interpret.
As such, we have hidden this mode by default and plan to fix the underlying implementation in a future release to make its results easier to interpret. The mode is still available, just hidden. We advise caution in using this mode if you are not sure what you are doing. Feel free to contact support@tangramvision.com if you need any assistance in using this mode.
Usage
metrical evaluate [OPTIONS] \
<INPUT_DATA_PATH> \
<PLEX_OR_RESULTS_PATH> \
<OBJECT_SPACE_OR_RESULTS_PATH>
Concepts
Evaluate applies a calibration to a given dataset and produces metrics to validate the quality of the calibration. This is commonly referred to as validating a calibration on a "test" dataset, rather than using the "training" dataset that produced the calibration in the first place.
Evaluate runs the same adjustment as a metrical calibrate run, but with one crucial difference: it fixes all values in the plex and uses them to generate metrics.
Evaluate mode is designed to use the results of a calibration as its inputs. The input plex can be a plex extracted from a results JSON, or the results JSON itself! The same principle applies to the input object space: you may provide an optimized object space extracted from a results JSON, or the results JSON directly.
Just like Calibrate mode, Evaluate mode has the ability to use cached detections.
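As a minimal sketch (the detections path below is a hypothetical placeholder), an evaluation can be run straight from a detections cache rather than from raw data:

# Reuse cached detections from an earlier run instead of re-detecting features
DETECTIONS=/my/detections.json
RESULTS=/my/results.json

metrical evaluate $DETECTIONS $RESULTS $RESULTS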
Examples
Run an evaluation on a test dataset
TRAINING_DATA=/my/training_data.mcap
TEST_DATA=/my/test_data.mcap # <-- Should have the same object space as the training data!
PLEX=/my/plex.json
OBJ=/my/object_space.json
RESULTS=/my/results.json

metrical calibrate -o $RESULTS $TRAINING_DATA $PLEX $OBJ
metrical evaluate $TEST_DATA $RESULTS $RESULTS

Notice that the input plex and object space for Evaluate mode are extracted directly from the results JSON of Calibrate mode.
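If you have extracted the optimized plex and object space from the results JSON into their own files, those files can be passed directly instead. This is just a sketch; the file names below are placeholders:

OPT_PLEX=/my/optimized_plex.json
OPT_OBJ=/my/optimized_object_space.json

metrical evaluate $TEST_DATA $OPT_PLEX $OPT_OBJ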
Arguments
[INPUT_DATA_OR_DETECTIONS_PATH]
The dataset with which to run the evaluation, or the detections from an earlier run. Users can pass:
- ROS1 bags, in the form of a .bag file.
- MCAP files, with an .mcap extension.
- A top-level directory containing a set of nested directories for each topic.
- A detections JSON with all the cached detections from a previous run.
In all cases, the topic/folder name must match a named component in the plex in order to be matched correctly. If this is not the case, there's no need to edit the plex directly; instead, one may use the --topic-to-component flag.
[PLEX_OR_RESULTS_PATH]
The path to the input plex. This can be a MetriCal results JSON or a standalone plex.json.
[OBJECT_SPACE_OR_RESULTS_PATH]
A path pointing to a description of the object space for the adjustment. This can be a MetriCal results JSON or a standalone object_space.json.
Options
Universal Options
As with every mode, all universal options are supported (though not all may be used).
-y, --overwrite-detections
Overwrite the detections at this location, if they exist.
--disable-motion-filter
Disable data filtering based on motion from any data source. This is useful for datasets that are a series of snapshots, rather than continuous motion.
--camera-motion-threshold [CAMERA_MOTION_THRESHOLD]
Default: 1.0 (pixels/observation)
This threshold is used for filtering camera data based on detected feature motion in the image. An image is considered "still" if the average delta in features between subsequent frames is below this threshold. The units for this threshold are in pixels/frame.
--lidar-motion-threshold [LIDAR_MOTION_THRESHOLD]
Default: 0.1 (meters/observation)
This threshold is used for filtering lidar data based on detected feature motion in the point cloud's detected circle center. A point cloud is considered "still" if the average delta in metric space between subsequent detected circle centers is below this threshold. The units for this threshold are in meters/observation.
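As an illustrative sketch (the threshold values below are arbitrary, not recommendations), the motion filter can be loosened or disabled entirely for snapshot-style datasets:

# Loosen both motion thresholds for a dataset with slight residual motion
metrical evaluate \
  --camera-motion-threshold 2.0 \
  --lidar-motion-threshold 0.2 \
  $TEST_DATA $RESULTS $RESULTS

# ...or skip motion filtering altogether for a series of snapshots
metrical evaluate --disable-motion-filter $TEST_DATA $RESULTS $RESULTS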
-o, --results-path [RESULTS_PATH]
Default: path/of/dataset/[name_of_dataset].results.json
The output path to save the final results of the program, in JSON format.
-m, --topic-to-component [TOPIC_TO_COMPONENT]
A mapping of ROS topic/folder names to component names/UUIDs in the input plex.
MetriCal only parses data that has a topic-component mapping. Ideally, topics and components share the same name. However, if this is not the case, use this flag to map topic names from the dataset to component names in the plex.
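For example, to map a dataset topic to a differently-named plex component (the topic:component pairing syntax shown here is an assumption; check metrical evaluate --help for the exact format):

# Hypothetical mapping: dataset topic /camera_front feeds plex component front_cam
metrical evaluate \
  -m /camera_front:front_cam \
  $TEST_DATA $RESULTS $RESULTS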
-r, --render
Whether to visualize the current mode using Rerun. Run rerun --memory-limit=1GB to start a Rerun visualization server. This will listen for outbound messages on port 9876 from MetriCal. Read more about configuring Rerun here.
--render-socket [RENDER_SOCKET]
The web socket address on which Rerun is listening. This should be an IP address and port number separated by a colon, e.g. --render-socket="127.0.0.1:3030". By default, Rerun will listen on socket host.docker.internal:9876. If running locally (not via Docker), Rerun's default socket is 127.0.0.1:9876.
When running Rerun from its CLI, the IP would correspond to its --bind option and the port would correspond to its --port option.
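Putting the rendering options together might look like the following sketch; port 3030 is arbitrary and assumes Rerun is running locally:

# Terminal 1: start a Rerun viewer on a custom bind address and port
rerun --memory-limit=1GB --bind 127.0.0.1 --port 3030

# Terminal 2: point MetriCal's visualization output at that socket
metrical evaluate --render --render-socket="127.0.0.1:3030" $TEST_DATA $RESULTS $RESULTS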
Deprecated Options
These options are now deprecated and will be removed in the next major version of MetriCal.
-d, --disable-filter
Replaced with --disable-motion-filter.
-t, --stillness-threshold [STILLNESS_THRESHOLD]
Replaced with --camera-motion-threshold.
-T (short argument)
Short arg -T replaced by -m.