Name: Phoenix CLI
Author: GitHub

Install

Terminal · npx

$ npx skills add https://github.com/microsoft/github-copilot-for-azure --skill azure-ai

Works with Paperclip

How Phoenix CLI fits into a Paperclip company.

Phoenix CLI drops into any Paperclip agent that debugs LLM applications or analyzes trace data. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat, with no prompt engineering and no tool wiring.


Source file

SKILL.md · 162 lines · markdown


---
name: phoenix-cli
description: Debug LLM applications using the Phoenix CLI. Fetch traces, analyze errors, review experiments, inspect datasets, and query the GraphQL API. Use when debugging AI/LLM applications, analyzing trace data, working with Phoenix observability, or investigating LLM performance issues.
license: Apache-2.0
compatibility: Requires Node.js (for npx) or global install of @arizeai/phoenix-cli. Optionally requires jq for JSON processing.
metadata:
  author: arize-ai
  version: "2.0.0"
---

# Phoenix CLI

## Invocation

```bash
px <resource> <action>                          # if installed globally
npx @arizeai/phoenix-cli <resource> <action>    # no install required
```

The CLI uses singular resource commands with subcommands like `list` and `get`:

```bash
px trace list
px trace get <trace-id>
px span list
px dataset list
px dataset get <name>
```

## Setup

```bash
export PHOENIX_HOST=http://localhost:6006
export PHOENIX_PROJECT=my-project
export PHOENIX_API_KEY=your-api-key  # if auth is enabled
```

Always use `--format raw --no-progress` when piping to `jq`.

## Traces

```bash
px trace list --limit 20 --format raw --no-progress | jq .
px trace list --last-n-minutes 60 --limit 20 --format raw --no-progress | jq '.[] | select(.status == "ERROR")'
px trace list --format raw --no-progress | jq 'sort_by(-.duration) | .[0:5]'
px trace get <trace-id> --format raw | jq .
px trace get <trace-id> --format raw | jq '.spans[] | select(.status_code != "OK")'
```

## Spans

```bash
px span list --limit 20                                    # recent spans (table view)
px span list --last-n-minutes 60 --limit 50                # spans from last hour
px span list --span-kind LLM --limit 10                    # only LLM spans
px span list --status-code ERROR --limit 20                # only errored spans
px span list --name chat_completion --limit 10             # filter by span name
px span list --trace-id <id> --format raw --no-progress | jq .   # all spans for a trace
px span list --include-annotations --limit 10              # include annotation scores
px span list output.json --limit 100                       # save to JSON file
px span list --format raw --no-progress | jq '.[] | select(.status_code == "ERROR")'
```

### Span JSON shape

```
Span
  name, span_kind ("LLM"|"CHAIN"|"TOOL"|"RETRIEVER"|"EMBEDDING"|"AGENT"|"RERANKER"|"GUARDRAIL"|"EVALUATOR"|"UNKNOWN")
  status_code ("OK"|"ERROR"|"UNSET"), status_message
  context.span_id, context.trace_id, parent_id
  start_time, end_time
  attributes (same as trace span attributes above)
  annotations[] (with --include-annotations)
    name, result { score, label, explanation }
```

### Trace JSON shape

```
Trace
  traceId, status ("OK"|"ERROR"), duration (ms), startTime, endTime
  rootSpan  — top-level span (parent_id: null)
  spans[]
    name, span_kind ("LLM"|"CHAIN"|"TOOL"|"RETRIEVER"|"EMBEDDING"|"AGENT")
    status_code ("OK"|"ERROR"), parent_id, context.span_id
    attributes
      input.value, output.value          — raw input/output
      llm.model_name, llm.provider
      llm.token_count.prompt/completion/total
      llm.token_count.prompt_details.cache_read
      llm.token_count.completion_details.reasoning
      llm.input_messages.{N}.message.role/content
      llm.output_messages.{N}.message.role/content
      llm.invocation_parameters          — JSON string (temperature, etc.)
      exception.message                  — set if span errored
```

## Sessions

```bash
px session list --limit 10 --format raw --no-progress | jq .
px session list --order asc --format raw --no-progress | jq '.[].session_id'
px session get <session-id> --format raw | jq .
px session get <session-id> --include-annotations --format raw | jq '.annotations'
```

### Session JSON shape

```
SessionData
  id, session_id, project_id
  start_time, end_time
  traces[]
    id, trace_id, start_time, end_time

SessionAnnotation (with --include-annotations)
  id, name, annotator_kind ("LLM"|"CODE"|"HUMAN"), session_id
  result { label, score, explanation }
  metadata, identifier, source, created_at, updated_at
```

## Datasets / Experiments / Prompts

```bash
px dataset list --format raw --no-progress | jq '.[].name'
px dataset get <name> --format raw | jq '.examples[] | {input, output: .expected_output}'
px experiment list --dataset <name> --format raw --no-progress | jq '.[] | {id, name, failed_run_count}'
px experiment get <id> --format raw --no-progress | jq '.[] | select(.error != null) | {input, error}'
px prompt list --format raw --no-progress | jq '.[].name'
px prompt get <name> --format text --no-progress   # plain text, ideal for piping to AI
```

## GraphQL

For ad-hoc queries not covered by the commands above. Output is `{"data": {...}}`.

```bash
px api graphql '{ projectCount datasetCount promptCount evaluatorCount }'
px api graphql '{ projects { edges { node { name traceCount tokenCountTotal } } } }' | jq '.data.projects.edges[].node'
px api graphql '{ datasets { edges { node { name exampleCount experimentCount } } } }' | jq '.data.datasets.edges[].node'
px api graphql '{ evaluators { edges { node { name kind } } } }' | jq '.data.evaluators.edges[].node'

# Introspect any type
px api graphql '{ __type(name: "Project") { fields { name type { name } } } }' | jq '.data.__type.fields[]'
```

Key root fields: `projects`, `datasets`, `prompts`, `evaluators`, `projectCount`, `datasetCount`, `promptCount`, `evaluatorCount`, `viewer`.

## Docs

Download Phoenix documentation markdown for local use by coding agents.

```bash
px docs fetch                                # fetch default workflow docs to .px/docs
px docs fetch --workflow tracing             # fetch only tracing docs
px docs fetch --workflow tracing --workflow evaluation
px docs fetch --dry-run                      # preview what would be downloaded
px docs fetch --refresh                      # clear .px/docs and re-download
px docs fetch --output-dir ./my-docs         # custom output directory
```

Key options: `--workflow` (repeatable, values: `tracing`, `evaluation`, `datasets`, `prompts`, `integrations`, `sdk`, `self-hosting`, `all`), `--dry-run`, `--refresh`, `--output-dir` (default `.px/docs`), `--workers` (default 10).
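
The Invocation and Setup notes can be combined into one small wrapper. The sketch below is not part of the Phoenix CLI: `px_raw` and the `PX_DRY_RUN` variable are hypothetical names introduced here for illustration. It picks the global `px` binary when it is on `PATH`, falls back to the `npx` form otherwise, and appends the `--format raw --no-progress` flags recommended for piping to `jq`. It assumes the subcommand being called accepts both flags, which may not hold for every resource.

```bash
# Hypothetical helper, not shipped with @arizeai/phoenix-cli.
# Runs a Phoenix CLI command with the raw-output flags appended.
px_raw() {
  # Prefer a global install; otherwise use the no-install npx form.
  if command -v px >/dev/null 2>&1; then
    set -- px "$@"
  else
    set -- npx @arizeai/phoenix-cli "$@"
  fi

  # Flags the skill recommends whenever output is piped to jq.
  set -- "$@" --format raw --no-progress

  # PX_DRY_RUN is an assumed convenience switch: print, do not execute.
  if [ -n "${PX_DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Dry run: shows the command line that would be executed.
PX_DRY_RUN=1 px_raw trace list --limit 20
```

With `px` or `npx` available, `px_raw span list --status-code ERROR --limit 20 | jq .` would run the errored-span query from the Spans section with the raw-output flags already in place.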

Related skills

Add Educational Comments

Takes any code file and transforms it into a teaching resource by adding educational comments that explain syntax, design choices, and language concepts. Automa

Agent Governance

When your AI agents start calling APIs, touching databases, or executing shell commands, you need guardrails before something goes sideways. This gives you comp

Agentic Eval

Implements self-critique loops where Claude generates output, evaluates it against your criteria, then refines based on its own feedback. Includes evaluator-opt