cifr.yml Contract Reference
Complete reference for every field in the agent contract file.
The cifr.yml file is the contract between your code and CIFR. It lives at the root of your repository and tells CIFR what your agent does, what it expects as input, and what it produces. Without it, your submission is a one-shot experiment. With it, your code becomes a registered, callable, citable agent.
Top-level structure
Every cifr.yml starts with a single agent: key:
    agent:
      name: ...
      version: ...
      description: ...
      # ... everything else goes here
Extra keys at any level are rejected. Typos fail loudly rather than being silently ignored.
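To illustrate the strict-key rule, here is a minimal Python sketch that flags unknown keys directly under agent:. The allowed set is collected from the fields listed in this reference. The function name is hypothetical, the check covers only the top level, and CIFR's actual validator may behave differently.

```python
# Keys that this reference lists directly under `agent:`.
ALLOWED_AGENT_KEYS = {
    "name", "version", "description", "invoke", "inputs", "outputs",
    "rai", "provenance_type", "paper", "depends_on", "functions", "benchmarks",
}


def unknown_agent_keys(contract: dict) -> list[str]:
    """Return the keys under `agent:` that this reference does not list, sorted."""
    agent = contract.get("agent", {})
    return sorted(set(agent) - ALLOWED_AGENT_KEYS)


# A misspelled `description` is reported instead of being silently ignored.
contract = {"agent": {"name": "demo-agent", "version": "1.0.0", "descripton": "typo"}}
print(unknown_agent_keys(contract))
```

The contract is passed as an already-parsed dictionary so the sketch stays independent of any particular YAML library.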
Field reference
Core fields
| Field | Required | Type | Description |
|---|---|---|---|
| name | yes | string | Kebab-case identifier, 3-80 characters. Lowercase letters, digits, and hyphens. Must start and end with a letter or digit. |
| version | yes | string | Semver MAJOR.MINOR.PATCH (e.g. 1.0.0). Immutable once registered -- bump to publish an update. |
| description | yes | string | One or two sentences explaining what the agent does. Up to 2000 characters. |
| invoke | conditional | string | The shell command CIFR runs inside the container. Required for single-function agents. Mutually exclusive with functions:. |
| inputs | no | list | Declared input fields. Each has name, format, and optional description. |
| outputs | conditional | list | Declared output fields. At least one required for single-function agents. Same shape as inputs. |
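The name and version rules in the table above can be checked locally before you submit. The following Python sketch is one possible reading of those rules, not CIFR's own validator. The regular expressions and function names are assumptions. In particular, the table does not say whether consecutive hyphens are allowed, and this sketch accepts them.

```python
import re

# 3-80 characters: lowercase letters, digits, hyphens; first and last are a letter or digit.
NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]{1,78}[a-z0-9]")

# MAJOR.MINOR.PATCH with no leading zeros, following the Semantic Versioning convention.
VERSION_RE = re.compile(r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)")


def is_valid_name(name: str) -> bool:
    """True if `name` satisfies the kebab-case rule as read from the table."""
    return NAME_RE.fullmatch(name) is not None


def is_valid_version(version: str) -> bool:
    """True if `version` is a plain MAJOR.MINOR.PATCH string."""
    return VERSION_RE.fullmatch(version) is not None
```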
Identity and provenance
| Field | Required | Type | Description |
|---|---|---|---|
| rai | conditional | string | Research Agent Identifier. Format: RAI-YYYY-author-slug. Required when paper: is present. See RAI docs. |
| provenance_type | no | string | How the agent's code relates to the paper. Defaults to author_original. See Provenance Types for all six types. |
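The conditional requirement on rai can be expressed as a small check. This Python sketch assumes that "RAI-YYYY-author-slug" means a four-digit year followed by a lowercase kebab-case slug. That pattern is an interpretation of the format string, and the function name is hypothetical.

```python
import re

# Assumed reading of RAI-YYYY-author-slug: four-digit year, then lowercase kebab-case.
RAI_RE = re.compile(r"RAI-[0-9]{4}-[a-z0-9]+(?:-[a-z0-9]+)*")


def check_rai(agent: dict) -> list[str]:
    """Return error messages for the rai field of a parsed `agent:` block."""
    errors = []
    rai = agent.get("rai")
    if "paper" in agent and rai is None:
        errors.append("rai is required when paper is present")
    if rai is not None and RAI_RE.fullmatch(rai) is None:
        errors.append(f"rai {rai!r} does not match RAI-YYYY-author-slug")
    return errors
```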
Paper metadata
The paper: block attaches publication metadata to your agent. When present, rai: becomes required.
| Field | Required | Type | Description |
|---|---|---|---|
| paper.title | yes | string | Full title of the publication. |
| paper.doi | no | string | DOI starting with 10. (e.g. 10.1109/TSG.2016.2561303). |
| paper.year | no | integer | Publication year (1900-2100). |
| paper.venue | no | string | Journal, conference, or preprint server name. |
| paper.abstract | no | string | Paper abstract, up to 8000 characters. |
| paper.keywords | no | list of strings | Up to 32 keywords, each 1-100 characters. |
| paper.authors | no | list | Author records (see below). |
| paper.preprint_url | no | string | Link to a preprint (arXiv, SSRN, etc.). |
| paper.related_rais | no | list of strings | RAIs of related agents. |
| paper.bibtex_key | no | string | Preferred BibTeX citation key. |
Each author in paper.authors has:
| Field | Required | Description |
|---|---|---|
| name | yes | Full name. |
| orcid | no | ORCID in 0000-0000-0000-000X format. |
| affiliation | no | Institution name. |
| email | no | Contact email. |
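The limits in the paper and author tables translate directly into checks. The Python sketch below applies them to a parsed paper: block. The function name and error messages are illustrative, and the ORCID pattern checks only the shape of the identifier, not its checksum.

```python
import re

# Shape check only: four groups of four, last character a digit or X.
ORCID_RE = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def check_paper(paper: dict) -> list[str]:
    """Return error messages for a parsed `paper:` block, using the documented limits."""
    errors = []
    if not paper.get("title"):
        errors.append("paper.title is required")
    doi = paper.get("doi")
    if doi is not None and not doi.startswith("10."):
        errors.append("paper.doi must start with '10.'")
    year = paper.get("year")
    if year is not None and not 1900 <= year <= 2100:
        errors.append("paper.year must be between 1900 and 2100")
    if len(paper.get("abstract", "")) > 8000:
        errors.append("paper.abstract is limited to 8000 characters")
    keywords = paper.get("keywords", [])
    if len(keywords) > 32 or any(not 1 <= len(k) <= 100 for k in keywords):
        errors.append("paper.keywords allows up to 32 entries of 1-100 characters")
    for i, author in enumerate(paper.get("authors", [])):
        if not author.get("name"):
            errors.append(f"paper.authors[{i}].name is required")
        orcid = author.get("orcid")
        if orcid is not None and ORCID_RE.fullmatch(orcid) is None:
            errors.append(f"paper.authors[{i}].orcid is malformed")
    return errors
```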
Input and output fields
Each entry in inputs: or outputs: has:
| Field | Required | Description |
|---|---|---|
| name | yes | Snake_case identifier. Lowercase letters, digits, underscores. Must start with a letter. |
| format | yes | MIME type (e.g. application/json, text/csv, image/png). |
| description | no | Human-readable explanation of the field. |
| from_agent | no | Composition binding (inputs only). See Composition. |
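A single inputs: or outputs: entry can be checked against the table above with a few lines of Python. This is a sketch under assumptions: the MIME pattern only verifies a loose type/subtype shape, and the function name and kind argument are hypothetical.

```python
import re

# Lowercase letters, digits, underscores; must start with a letter.
FIELD_NAME_RE = re.compile(r"[a-z][a-z0-9_]*")

# Loose type/subtype shape check; it does not consult any registry of MIME types.
MIME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9.+-]*/[A-Za-z0-9][A-Za-z0-9.+-]*")


def check_field(field: dict, kind: str) -> list[str]:
    """Return error messages for one entry; `kind` is "input" or "output"."""
    errors = []
    if FIELD_NAME_RE.fullmatch(field.get("name", "")) is None:
        errors.append("name must be snake_case and start with a letter")
    if MIME_RE.fullmatch(field.get("format", "")) is None:
        errors.append("format must look like a MIME type, e.g. application/json")
    if kind == "output" and "from_agent" in field:
        errors.append("from_agent is only allowed on inputs")
    return errors
```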
Composition
| Field | Required | Type | Description |
|---|---|---|---|
| depends_on | no | list of strings | RAIs of upstream agents this agent calls at runtime. |
An input field's from_agent: binding tells CIFR to fill that input by calling an upstream agent instead of expecting it from the user:
    from_agent:
      rai: RAI-2016-chanda-resiliency-pds
      output: result          # which upstream output to read
      version: 1.0.0          # optional: pin to a specific version
      inputs_from:            # map upstream inputs to your user-supplied fields
        topology: topology_a
Every RAI referenced in a from_agent: binding must appear in depends_on:.
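The depends_on rule is easy to verify mechanically. The Python sketch below collects every RAI referenced by a from_agent: binding on the top-level inputs and reports those missing from depends_on:. The function name is hypothetical, and inputs declared inside a functions: block are not covered.

```python
def missing_dependencies(agent: dict) -> list[str]:
    """Return RAIs used in from_agent bindings but absent from depends_on, sorted."""
    declared = set(agent.get("depends_on", []))
    referenced = {
        field["from_agent"]["rai"]
        for field in agent.get("inputs", [])
        if "from_agent" in field
    }
    return sorted(referenced - declared)


agent = {
    "depends_on": [],
    "inputs": [
        {
            "name": "resiliency",
            "format": "application/json",
            "from_agent": {"rai": "RAI-2016-chanda-resiliency-pds", "output": "result"},
        }
    ],
}
print(missing_dependencies(agent))  # ['RAI-2016-chanda-resiliency-pds']
```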
Multi-function agents
Use functions: instead of top-level invoke/inputs/outputs when one agent exposes multiple operations:
| Field | Required | Description |
|---|---|---|
| functions[].name | yes | Kebab-case function name, unique within the agent. |
| functions[].description | yes | What this function does. |
| functions[].invoke | yes | Shell command for this function. |
| functions[].inputs | no | Input fields for this function. |
| functions[].outputs | yes | Output fields (at least one). |
You cannot mix top-level invoke/outputs with a functions: block. Choose one style or the other.
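The choice between the two styles can be sketched as a check on a parsed agent: block. The Python below follows the rules stated in this section and in the core fields table. The function name and messages are illustrative. How CIFR treats top-level inputs: next to functions: is not spelled out here, so the sketch does not flag it.

```python
def check_style(agent: dict) -> list[str]:
    """Return error messages for mixing single-function and multi-function styles."""
    errors = []
    functions = agent.get("functions")
    if functions is not None:
        # Multi-function style: top-level invoke/outputs must not appear.
        for key in ("invoke", "outputs"):
            if key in agent:
                errors.append(f"top-level {key} cannot be combined with functions")
        names = [f.get("name") for f in functions]
        if len(names) != len(set(names)):
            errors.append("function names must be unique within the agent")
        for f in functions:
            if not f.get("outputs"):
                errors.append(f"function {f.get('name')!r} needs at least one output")
    else:
        # Single-function style: invoke and at least one output are required.
        if not agent.get("invoke"):
            errors.append("invoke is required for single-function agents")
        if not agent.get("outputs"):
            errors.append("single-function agents need at least one output")
    return errors
```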
Benchmarks
Declare performance claims that CIFR will verify automatically:
| Field | Required | Description |
|---|---|---|
| benchmarks[].dataset | yes | Identifier of the benchmark dataset (e.g. redd-house-1, imagenet-val-2012). |
| benchmarks[].metric | yes | Evaluation metric name, lowercase with underscores (e.g. f1_score, accuracy, rmse). |
| benchmarks[].value | yes | The claimed metric value (e.g. 0.973). |
| benchmarks[].description | no | Human-readable description of the benchmark. |
Complete examples
Example 1: Simple single-function agent
A wavelet-based event detector for power system waveforms. No paper, no RAI -- just a utility agent.
    agent:
      name: wavelet-event-detector
      version: 1.0.0
      description: Detect transient events in power system waveforms using discrete wavelet transform decomposition.
      provenance_type: original_unpublished
      invoke: python detect.py
      inputs:
        - name: waveform
          format: application/json
          description: Time-series array of voltage or current samples at a fixed sampling rate.
        - name: config
          format: application/json
          description: Detection parameters (wavelet family, threshold, minimum event duration).
      outputs:
        - name: events
          format: application/json
          description: Array of detected events with start time, end time, magnitude, and classification.
      benchmarks:
        - dataset: redd-house-1
          metric: f1_score
          value: 0.973
          description: Event detection F1 on REDD House 1 dataset.
Example 2: Paper-backed agent with full metadata
The resiliency index from Chanda 2016, with a complete publication record and an RAI.
    agent:
      name: resiliency-pds
      rai: RAI-2016-chanda-resiliency-pds
      version: 1.0.0
      description: Topological resiliency index for power distribution systems with multiple microgrids using analytic hierarchy process weights.
      provenance_type: author_original
      paper:
        title: Defining and Enabling Resiliency of Electric Distribution Systems with Multiple Microgrids
        doi: 10.1109/TSG.2016.2561303
        year: 2016
        venue: IEEE Transactions on Smart Grid
        abstract: This paper proposes a comprehensive resiliency metric for distribution systems...
        authors:
          - name: Sayonsom Chanda
            orcid: 0000-0003-4178-9482
          - name: Anurag K. Srivastava
        keywords:
          - resilience
          - microgrid
          - distribution network
          - analytic hierarchy process
      invoke: python -m resiliency
      inputs:
        - name: topology
          format: application/json
          description: Network topology as a node-edge adjacency structure with component attributes.
      outputs:
        - name: result
          format: application/json
          description: Resiliency index score and per-component breakdown.
      benchmarks:
        - dataset: ieee-33bus
          metric: r_index
          value: 0.847
          description: Resiliency index on the IEEE 33-bus test system.
Example 3: Data wrapper agent
Experimental measurement data exposed as a queryable agent. Researchers can invoke it to get specific subsets of measurements instead of downloading the entire dataset.
    agent:
      name: redd-house-measurements
      rai: RAI-2011-kolter-redd-house1
      version: 1.0.0
      description: Query interface for REDD House 1 energy disaggregation measurements. Returns time-windowed appliance-level and aggregate power readings.
      provenance_type: data_wrapper
      paper:
        title: "REDD: A Public Data Set for Energy Disaggregation Research"
        doi: 10.1007/978-3-642-25999-5_12
        year: 2011
        venue: Workshop on Data Mining Applications in Sustainability
        authors:
          - name: J. Zico Kolter
          - name: Matthew J. Johnson
        keywords:
          - energy disaggregation
          - NILM
          - smart meter
      invoke: python query.py
      inputs:
        - name: query
          format: application/json
          description: "Query parameters: start_time, end_time, appliances (list), resolution."
      outputs:
        - name: measurements
          format: application/json
          description: Time-series power readings for the requested appliances and time window.
How invocations work
When someone calls your agent via POST /api/agents/{id}/invoke_json, CIFR:
- Validates the request body against your declared inputs.
- Writes each input to /inputs/{field_name}{ext} inside a fresh container.
- Starts the container from your pinned image digest with --network none, the configured memory limit, and the configured timeout.
- Runs your invoke command.
- Captures everything written to /outputs/.
- Computes a provenance hash over (image_digest, inputs_sha256, outputs_sha256) and returns it.
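The final step can be pictured with a short Python sketch. This reference names the three components of the provenance hash but does not say how files are canonicalized or how the components are combined, so everything below is an assumption for illustration: files are hashed in sorted-name order, and the three strings are joined with newlines before a final SHA-256. Both function names are hypothetical.

```python
import hashlib


def sha256_of_files(files: dict[str, bytes]) -> str:
    """Hash a set of named files in sorted-name order (assumed canonicalization)."""
    h = hashlib.sha256()
    for name in sorted(files):
        h.update(name.encode("utf-8"))
        h.update(b"\0")  # separator between the name and the content digest
        h.update(hashlib.sha256(files[name]).digest())
    return h.hexdigest()


def provenance_hash(image_digest: str, inputs: dict[str, bytes], outputs: dict[str, bytes]) -> str:
    """Combine (image_digest, inputs_sha256, outputs_sha256) into one hex digest."""
    parts = (image_digest, sha256_of_files(inputs), sha256_of_files(outputs))
    # Assumed combination rule: newline-joined strings, hashed once more.
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
```

Whatever the exact scheme, the property that matters is that the same image, inputs, and outputs always yield the same hash, and any change to one of them yields a different one.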
Your code reads from /inputs/ and writes to /outputs/. That is the entire contract. The provenance hash makes every invocation independently verifiable.
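As a rough illustration of that contract, an entrypoint might look like the Python sketch below. It assumes that an application/json input named waveform arrives as waveform.json and that the events output should be written as events.json. The {ext} mapping is not specified in this section, so treat both file names as assumptions. The threshold logic is a placeholder, not a real detector.

```python
import json
from pathlib import Path


def run(inputs_dir: Path = Path("/inputs"), outputs_dir: Path = Path("/outputs")) -> None:
    """Read the declared input, compute a result, and write the declared output."""
    # Assumed file name: input `waveform` with format application/json.
    waveform = json.loads((inputs_dir / "waveform.json").read_text())

    # Placeholder logic: report samples whose magnitude exceeds a fixed threshold.
    events = [
        {"index": i, "magnitude": value}
        for i, value in enumerate(waveform)
        if abs(value) > 1.0
    ]

    # Assumed file name: output `events` with format application/json.
    outputs_dir.mkdir(parents=True, exist_ok=True)
    (outputs_dir / "events.json").write_text(json.dumps(events))
```

Inside the container, the script named in invoke would end with a call to run(). Keeping the directories as parameters lets you exercise the same function against temporary directories on your own machine.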
Versioning rules
Versions are immutable. Submitting the same (name, version) twice produces a clear error. Bump the version number to publish an update:
- Patch (1.0.0 to 1.0.1) -- bug fixes that do not change the input/output schema.
- Minor (1.0.0 to 1.1.0) -- new optional inputs, new outputs, additive features.
- Major (1.0.0 to 2.0.0) -- breaking changes to inputs, outputs, or semantics.
External callers typically pin to a specific minor version in their depends_on so patches flow through automatically while major changes require an explicit upgrade.
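The three bump levels can be told apart by comparing version components. This Python sketch uses hypothetical function names. It rejects a resubmitted version, as described above, and also rejects a lower version, which is an assumption that this section does not state.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split MAJOR.MINOR.PATCH into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def classify_bump(old: str, new: str) -> str:
    """Return "patch", "minor", or "major" for a move from `old` to `new`."""
    o, n = parse(old), parse(new)
    if n <= o:
        # Equal versions are rejected per the immutability rule; rejecting
        # lower versions as well is an assumption of this sketch.
        raise ValueError(f"{new} is not newer than the registered {old}")
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"


print(classify_bump("1.0.0", "1.0.1"))  # patch
print(classify_bump("1.0.0", "1.1.0"))  # minor
print(classify_bump("1.0.0", "2.0.0"))  # major
```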