cifr.yml Contract Reference

Complete reference for every field in the agent contract file.

The cifr.yml file is the contract between your code and CIFR. It lives at the root of your repository and tells CIFR what your agent does, what it expects as input, and what it produces. Without it, your submission is a one-shot experiment. With it, your code becomes a registered, callable, citable agent.

Top-level structure

Every cifr.yml has a single top-level agent: key, and every other field nests beneath it:

agent:
  name: ...
  version: ...
  description: ...
  # ... everything else goes here

Extra keys at any level are rejected. Typos fail loudly rather than being silently ignored.
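
A minimal sketch of what strict key checking could look like, assuming the contract has already been parsed into a Python dict. The allow-list below is collected from the fields documented on this page, and the function name is hypothetical; CIFR's actual validator may differ.

```python
# Hypothetical allow-list gathered from this reference; not CIFR's real schema.
ALLOWED_AGENT_KEYS = {
    "name", "version", "description", "invoke", "inputs", "outputs",
    "rai", "provenance_type", "paper", "depends_on", "functions", "benchmarks",
}


def unknown_keys(contract: dict) -> list[str]:
    """Return dotted paths of keys this sketch does not recognize."""
    # Anything next to the single top-level agent: key is unexpected.
    problems = [str(key) for key in contract if key != "agent"]
    agent = contract.get("agent", {})
    problems += [f"agent.{key}" for key in agent if key not in ALLOWED_AGENT_KEYS]
    return sorted(problems)


# A misspelled field name is reported instead of being silently dropped.
contract = {"agent": {"name": "demo-agent", "version": "1.0.0", "descripton": "typo"}}
print(unknown_keys(contract))
```

This sketch only inspects the top two levels; a full check would apply the same idea to nested blocks such as paper: and each inputs: entry.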

Field reference

Core fields

| Field | Required | Type | Description |
|---|---|---|---|
| name | yes | string | Kebab-case identifier, 3-80 characters. Lowercase letters, digits, and hyphens. Must start and end with a letter or digit. |
| version | yes | string | Semver MAJOR.MINOR.PATCH (e.g. 1.0.0). Immutable once registered -- bump to publish an update. |
| description | yes | string | One or two sentences explaining what the agent does. Up to 2000 characters. |
| invoke | conditional | string | The shell command CIFR runs inside the container. Required for single-function agents. Mutually exclusive with functions:. |
| inputs | no | list | Declared input fields. Each has name, format, and optional description. |
| outputs | conditional | list | Declared output fields. At least one required for single-function agents. Same shape as inputs. |
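
The name, version, and description rules can be checked locally before submitting. The sketch below is one reading of those rules: the regular expressions and the function name are assumptions, and whether CIFR also rejects consecutive hyphens in a name is not stated here.

```python
import re

# First and last character are a letter or digit; 3-80 characters in total.
NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]{1,78}[a-z0-9]")
# Plain MAJOR.MINOR.PATCH without leading zeros or pre-release suffixes.
VERSION_RE = re.compile(r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)")


def check_core(agent: dict) -> list[str]:
    """Return human-readable problems with the core fields of an agent block."""
    errors = []
    if not NAME_RE.fullmatch(str(agent.get("name", ""))):
        errors.append("name must be kebab-case, 3-80 characters")
    if not VERSION_RE.fullmatch(str(agent.get("version", ""))):
        errors.append("version must be MAJOR.MINOR.PATCH")
    description = agent.get("description", "")
    if not description or len(description) > 2000:
        errors.append("description is required and limited to 2000 characters")
    return errors


print(check_core({"name": "wavelet-event-detector", "version": "1.0.0",
                  "description": "Detect transient events."}))
print(check_core({"name": "-Bad-", "version": "1.0", "description": ""}))
```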

Identity and provenance

| Field | Required | Type | Description |
|---|---|---|---|
| rai | conditional | string | Research Agent Identifier. Format: RAI-YYYY-author-slug. Required when paper: is present. See RAI docs. |
| provenance_type | no | string | How the agent's code relates to the paper. Defaults to author_original. See Provenance Types for all six types. |
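
The conditional requirement on rai can be expressed as a small check. This is a sketch only: the regular expression is a guess at what RAI-YYYY-author-slug permits (lowercase letters, digits, and hyphens after the year), and the function name is hypothetical. The RAI docs are the authority on the exact grammar.

```python
import re

# Assumed shape: "RAI-", a four-digit year, then hyphen-separated lowercase tokens.
RAI_RE = re.compile(r"RAI-[0-9]{4}-[a-z0-9]+(?:-[a-z0-9]+)*")


def check_identity(agent: dict) -> list[str]:
    """Flag a missing or malformed rai on an agent block."""
    errors = []
    rai = agent.get("rai")
    if "paper" in agent and rai is None:
        errors.append("rai is required when paper is present")
    if rai is not None and not RAI_RE.fullmatch(str(rai)):
        errors.append(f"rai {rai!r} does not look like RAI-YYYY-author-slug")
    return errors


print(check_identity({"paper": {"title": "Some title"}}))
print(check_identity({"rai": "RAI-2016-chanda-resiliency-pds", "paper": {"title": "Some title"}}))
```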

Paper metadata

The paper: block attaches publication metadata to your agent. When present, rai: becomes required.

| Field | Required | Type | Description |
|---|---|---|---|
| paper.title | yes | string | Full title of the publication. |
| paper.doi | no | string | DOI starting with 10. (e.g. 10.1109/TSG.2016.2561303). |
| paper.year | no | integer | Publication year (1900-2100). |
| paper.venue | no | string | Journal, conference, or preprint server name. |
| paper.abstract | no | string | Paper abstract, up to 8000 characters. |
| paper.keywords | no | list of strings | Up to 32 keywords, each 1-100 characters. |
| paper.authors | no | list | Author records (see below). |
| paper.preprint_url | no | string | Link to a preprint (arXiv, SSRN, etc.). |
| paper.related_rais | no | list of strings | RAIs of related agents. |
| paper.bibtex_key | no | string | Preferred BibTeX citation key. |

Each author in paper.authors has:

| Field | Required | Description |
|---|---|---|
| name | yes | Full name. |
| orcid | no | ORCID in 0000-0000-0000-000X format. |
| affiliation | no | Institution name. |
| email | no | Contact email. |
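
The limits in the paper: and author tables translate directly into a pre-submission check. The sketch below covers a subset of the fields; the function name and error strings are illustrative, and the ORCID pattern checks only the printed shape, not the identifier's check digit.

```python
import re

# Four groups of four characters; the last character may be a digit or X.
ORCID_RE = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def check_paper(paper: dict) -> list[str]:
    """Return problems found in a paper: block (subset of the documented rules)."""
    errors = []
    if not paper.get("title"):
        errors.append("paper.title is required")
    doi = paper.get("doi")
    if doi is not None and not str(doi).startswith("10."):
        errors.append("paper.doi must start with '10.'")
    year = paper.get("year")
    if year is not None and not (isinstance(year, int) and 1900 <= year <= 2100):
        errors.append("paper.year must be an integer in 1900-2100")
    if len(paper.get("abstract", "")) > 8000:
        errors.append("paper.abstract is limited to 8000 characters")
    keywords = paper.get("keywords", [])
    if len(keywords) > 32 or any(not 1 <= len(k) <= 100 for k in keywords):
        errors.append("paper.keywords allows up to 32 entries of 1-100 characters")
    for i, author in enumerate(paper.get("authors", [])):
        if not author.get("name"):
            errors.append(f"paper.authors[{i}].name is required")
        orcid = author.get("orcid")
        if orcid is not None and not ORCID_RE.fullmatch(str(orcid)):
            errors.append(f"paper.authors[{i}].orcid is malformed")
    return errors


print(check_paper({"doi": "doi:10.1109/TSG.2016.2561303", "year": 1850}))
```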

Input and output fields

Each entry in inputs: or outputs: has:

| Field | Required | Description |
|---|---|---|
| name | yes | Snake_case identifier. Lowercase letters, digits, underscores. Must start with a letter. |
| format | yes | MIME type (e.g. application/json, text/csv, image/png). |
| description | no | Human-readable explanation of the field. |
| from_agent | no | Composition binding (inputs only). See Composition. |
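
As an illustration, a single inputs: or outputs: entry could be checked like this. The MIME pattern is deliberately simplified to a type/subtype shape, and the function name and kind parameter are hypothetical.

```python
import re

FIELD_NAME_RE = re.compile(r"[a-z][a-z0-9_]*")
# Simplified type/subtype shape; the full media type grammar is broader.
MIME_RE = re.compile(r"[a-z0-9][a-z0-9.+-]*/[a-z0-9][a-z0-9.+-]*")


def check_field(field: dict, kind: str) -> list[str]:
    """Check one entry of inputs: (kind='input') or outputs: (kind='output')."""
    errors = []
    if not FIELD_NAME_RE.fullmatch(str(field.get("name", ""))):
        errors.append("name must be snake_case and start with a letter")
    if not MIME_RE.fullmatch(str(field.get("format", ""))):
        errors.append("format must look like a MIME type")
    if kind == "output" and "from_agent" in field:
        errors.append("from_agent is only valid on inputs")
    return errors


print(check_field({"name": "waveform", "format": "application/json"}, "input"))
print(check_field({"name": "2fast", "format": "json"}, "input"))
```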

Composition

| Field | Required | Type | Description |
|---|---|---|---|
| depends_on | no | list of strings | RAIs of upstream agents this agent calls at runtime. |

An input field's from_agent: binding tells CIFR to fill that input by calling an upstream agent instead of expecting it from the user:

from_agent:
  rai: RAI-2016-chanda-resiliency-pds
  output: result          # which upstream output to read
  version: 1.0.0          # optional: pin to a specific version
  inputs_from:            # map upstream inputs to your user-supplied fields
    topology: topology_a

Every RAI referenced in a from_agent: binding must appear in depends_on:.
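
The rule above is easy to verify mechanically. A minimal sketch, assuming the agent block is already parsed into a dict; the function name is hypothetical, and scanning function-level inputs as well as top-level ones is an assumption about where from_agent: may appear.

```python
def missing_dependencies(agent: dict) -> list[str]:
    """Return RAIs used in from_agent: bindings but absent from depends_on:."""
    declared = set(agent.get("depends_on", []))
    # Gather input lists from the top level and from each function (assumed).
    input_lists = [agent.get("inputs", [])]
    input_lists += [fn.get("inputs", []) for fn in agent.get("functions", [])]
    referenced = {
        field["from_agent"]["rai"]
        for inputs in input_lists
        for field in inputs
        if "from_agent" in field
    }
    return sorted(referenced - declared)


agent = {
    "depends_on": [],
    "inputs": [
        {
            "name": "resiliency",
            "format": "application/json",
            "from_agent": {"rai": "RAI-2016-chanda-resiliency-pds", "output": "result"},
        }
    ],
}
print(missing_dependencies(agent))
```

An empty list means every binding is covered by depends_on:.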

Multi-function agents

Use functions: instead of top-level invoke/inputs/outputs when one agent exposes multiple operations:

| Field | Required | Description |
|---|---|---|
| functions[].name | yes | Kebab-case function name, unique within the agent. |
| functions[].description | yes | What this function does. |
| functions[].invoke | yes | Shell command for this function. |
| functions[].inputs | no | Input fields for this function. |
| functions[].outputs | yes | Output fields (at least one). |

You cannot mix top-level invoke/outputs with a functions: block. Choose one style or the other.
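
The either/or rule can be sketched as a single check over the parsed agent block. The function name and messages are illustrative, not CIFR's actual error text.

```python
def check_style(agent: dict) -> list[str]:
    """Enforce single-function versus multi-function style on an agent block."""
    errors = []
    if "functions" in agent:
        # Multi-function style: top-level invoke/outputs must be absent.
        for key in ("invoke", "outputs"):
            if key in agent:
                errors.append(f"top-level {key} cannot be combined with functions")
        names = [fn.get("name") for fn in agent["functions"]]
        if len(names) != len(set(names)):
            errors.append("function names must be unique within the agent")
        for fn in agent["functions"]:
            if not fn.get("outputs"):
                errors.append(f"function {fn.get('name')!r} needs at least one output")
    else:
        # Single-function style: invoke and at least one output are required.
        if not agent.get("invoke"):
            errors.append("invoke is required for single-function agents")
        if not agent.get("outputs"):
            errors.append("at least one output is required for single-function agents")
    return errors


mixed = {
    "invoke": "python detect.py",
    "functions": [{"name": "detect", "invoke": "python detect.py",
                   "outputs": [{"name": "events", "format": "application/json"}]}],
}
print(check_style(mixed))
```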

Benchmarks

Declare performance claims that CIFR will verify automatically:

| Field | Required | Description |
|---|---|---|
| benchmarks[].dataset | yes | Identifier of the benchmark dataset (e.g. redd-house-1, imagenet-val-2012). |
| benchmarks[].metric | yes | Evaluation metric name, lowercase with underscores (e.g. f1_score, accuracy, rmse). |
| benchmarks[].value | yes | The claimed metric value (e.g. 0.973). |
| benchmarks[].description | no | Human-readable description of the benchmark. |
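
A benchmark entry can be sanity-checked before submission. In this sketch the metric pattern and the requirement that value be numeric are assumptions drawn from the examples in the table, and the function name is hypothetical.

```python
import re

# Lowercase with underscores, as in f1_score, accuracy, rmse.
METRIC_RE = re.compile(r"[a-z][a-z0-9_]*")


def check_benchmark(entry: dict) -> list[str]:
    """Return problems with one benchmarks: entry."""
    errors = []
    if not entry.get("dataset"):
        errors.append("dataset is required")
    if not METRIC_RE.fullmatch(str(entry.get("metric", ""))):
        errors.append("metric must be lowercase with underscores")
    value = entry.get("value")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        errors.append("value must be a number")
    return errors


print(check_benchmark({"dataset": "redd-house-1", "metric": "f1_score", "value": 0.973}))
print(check_benchmark({"dataset": "redd-house-1", "metric": "F1 Score", "value": "high"}))
```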

Complete examples

Example 1: Simple single-function agent

A wavelet-based event detector for power system waveforms. No paper, no RAI -- just a utility agent.

agent:
  name: wavelet-event-detector
  version: 1.0.0
  description: Detect transient events in power system waveforms using discrete wavelet transform decomposition.
  provenance_type: original_unpublished
  invoke: python detect.py
  inputs:
    - name: waveform
      format: application/json
      description: Time-series array of voltage or current samples at a fixed sampling rate.
    - name: config
      format: application/json
      description: Detection parameters (wavelet family, threshold, minimum event duration).
  outputs:
    - name: events
      format: application/json
      description: Array of detected events with start time, end time, magnitude, and classification.
  benchmarks:
    - dataset: redd-house-1
      metric: f1_score
      value: 0.973
      description: Event detection F1 on REDD House 1 dataset.

Example 2: Paper-backed agent with full metadata

The resiliency index from Chanda 2016, with a complete publication record and an RAI.

agent:
  name: resiliency-pds
  rai: RAI-2016-chanda-resiliency-pds
  version: 1.0.0
  description: Topological resiliency index for power distribution systems with multiple microgrids using analytic hierarchy process weights.
  provenance_type: author_original
  paper:
    title: Defining and Enabling Resiliency of Electric Distribution Systems with Multiple Microgrids
    doi: 10.1109/TSG.2016.2561303
    year: 2016
    venue: IEEE Transactions on Smart Grid
    abstract: This paper proposes a comprehensive resiliency metric for distribution systems...
    authors:
      - name: Sayonsom Chanda
        orcid: 0000-0003-4178-9482
      - name: Anurag K. Srivastava
    keywords:
      - resilience
      - microgrid
      - distribution network
      - analytic hierarchy process
  invoke: python -m resiliency
  inputs:
    - name: topology
      format: application/json
      description: Network topology as a node-edge adjacency structure with component attributes.
  outputs:
    - name: result
      format: application/json
      description: Resiliency index score and per-component breakdown.
  benchmarks:
    - dataset: ieee-33bus
      metric: r_index
      value: 0.847
      description: Resiliency index on the IEEE 33-bus test system.

Example 3: Data wrapper agent

Experimental measurement data exposed as a queryable agent. Researchers can invoke it to get specific subsets of measurements instead of downloading the entire dataset.

agent:
  name: redd-house-measurements
  rai: RAI-2011-kolter-redd-house1
  version: 1.0.0
  description: Query interface for REDD House 1 energy disaggregation measurements. Returns time-windowed appliance-level and aggregate power readings.
  provenance_type: data_wrapper
  paper:
    title: "REDD: A Public Data Set for Energy Disaggregation Research"
    doi: 10.1007/978-3-642-25999-5_12
    year: 2011
    venue: Workshop on Data Mining Applications in Sustainability
    authors:
      - name: J. Zico Kolter
      - name: Matthew J. Johnson
    keywords:
      - energy disaggregation
      - NILM
      - smart meter
  invoke: python query.py
  inputs:
    - name: query
      format: application/json
      description: "Query parameters: start_time, end_time, appliances (list), resolution."
  outputs:
    - name: measurements
      format: application/json
      description: Time-series power readings for the requested appliances and time window.

How invocations work

When someone calls your agent via POST /api/agents/{id}/invoke_json, CIFR:

  1. Validates the request body against your declared inputs.
  2. Writes each input to /inputs/{field_name}{ext} inside a fresh container.
  3. Starts the container from your pinned image digest with --network none, the configured memory limit, and the configured timeout.
  4. Runs your invoke command.
  5. Captures everything written to /outputs/.
  6. Computes a provenance hash over (image_digest, inputs_sha256, outputs_sha256) and returns it.
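
To make step 6 concrete, here is one way a hash over (image_digest, inputs_sha256, outputs_sha256) could be computed. How CIFR serializes and combines the three values is not specified on this page, so the file ordering, separators, and function names below are all assumptions; hashes produced by this sketch should not be expected to match CIFR's.

```python
import hashlib


def digest_files(files: dict[str, bytes]) -> str:
    """Hash a mapping of file name -> content in a stable (sorted) order."""
    h = hashlib.sha256()
    for name in sorted(files):
        h.update(name.encode("utf-8"))
        h.update(b"\0")  # separator between the name and the content digest
        h.update(hashlib.sha256(files[name]).digest())
    return h.hexdigest()


def provenance_hash(image_digest: str, inputs: dict[str, bytes],
                    outputs: dict[str, bytes]) -> str:
    """Combine the image digest with the input and output digests."""
    parts = [image_digest, digest_files(inputs), digest_files(outputs)]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


# Placeholder digest and file contents, purely for illustration.
first = provenance_hash("sha256:placeholder", {"waveform.json": b"[1, 2]"}, {"events.json": b"[]"})
second = provenance_hash("sha256:placeholder", {"waveform.json": b"[1, 2]"}, {"events.json": b"[]"})
print(first == second)
```

The property that matters is determinism: the same image, inputs, and outputs always yield the same hash, and a change to any of the three yields a different one.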

Your code reads from /inputs/ and writes to /outputs/. That is the entire contract. The provenance hash makes every invocation independently verifiable.
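
A minimal agent entry point under this contract might look like the sketch below. It assumes an application/json input named waveform arrives as /inputs/waveform.json and that an output named events is written as /outputs/events.json; the exact file extensions CIFR chooses are not confirmed here. The threshold logic is a placeholder, not a real detector.

```python
import json
import tempfile
from pathlib import Path

THRESHOLD = 1.0  # placeholder constant, purely illustrative


def run(inputs_dir: str = "/inputs", outputs_dir: str = "/outputs") -> None:
    """Read the declared input, compute something, write the declared output."""
    waveform = json.loads((Path(inputs_dir) / "waveform.json").read_text())
    events = [
        {"index": i, "magnitude": sample}
        for i, sample in enumerate(waveform)
        if abs(sample) > THRESHOLD
    ]
    out = Path(outputs_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "events.json").write_text(json.dumps(events))


# Local dry run: temporary directories stand in for /inputs and /outputs.
with tempfile.TemporaryDirectory() as tmp:
    inputs, outputs = Path(tmp) / "inputs", Path(tmp) / "outputs"
    inputs.mkdir()
    (inputs / "waveform.json").write_text(json.dumps([0.1, -1.5, 0.3, 2.0]))
    run(str(inputs), str(outputs))
    result = json.loads((outputs / "events.json").read_text())

print(result)
```

Inside the container, the invoke command would call run() with its default paths. Because the container starts with --network none, the script cannot fetch anything at runtime and must rely on what is baked into the image.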

Versioning rules

Versions are immutable. Submitting the same (name, version) twice produces a clear error. Bump the version number to publish an update:

  • Patch (1.0.0 to 1.0.1) -- bug fixes that do not change the input/output schema.
  • Minor (1.0.0 to 1.1.0) -- new optional inputs, new outputs, additive features.
  • Major (1.0.0 to 2.0.0) -- breaking changes to inputs, outputs, or semantics.

External callers typically pin to a specific minor version in their depends_on, so patches flow through automatically while minor and major bumps require an explicit upgrade.
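
The bump levels and the minor-pin behaviour can be sketched in a few lines. The function names are hypothetical, and this illustrates the rules above rather than CIFR's resolver.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split MAJOR.MINOR.PATCH into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump_kind(old: str, new: str) -> str:
    """Classify the step from old to new as 'major', 'minor', or 'patch'."""
    o, n = parse(old), parse(new)
    if n <= o:
        raise ValueError(f"{new} is not newer than {old}")
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"


def within_minor_pin(pinned: str, candidate: str) -> bool:
    """True when candidate shares MAJOR.MINOR with pinned and is not older."""
    p, c = parse(pinned), parse(candidate)
    return c[:2] == p[:2] and c >= p


print(bump_kind("1.0.0", "1.0.1"))
print(bump_kind("1.0.0", "1.1.0"))
print(bump_kind("1.0.0", "2.0.0"))
print(within_minor_pin("1.0.0", "1.0.3"))
print(within_minor_pin("1.0.0", "1.1.0"))
```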