
PipelineCostEvaluator

NexusLabs.Needlr.AgentFramework.Evaluation

PipelineCostEvaluator Class

Deterministic evaluator that scores token usage and cost breakdown per stage of a pipeline run from the captured NexusLabs.Needlr.AgentFramework.Diagnostics.IPipelineRunResult snapshot carried in a PipelineEvaluationContext.

public sealed class PipelineCostEvaluator : Microsoft.Extensions.AI.Evaluation.IEvaluator

Inheritance System.Object 🡒 PipelineCostEvaluator

Implements Microsoft.Extensions.AI.Evaluation.IEvaluator

Remarks

This evaluator never contacts a language model. It reads NexusLabs.Needlr.AgentFramework.Diagnostics.IPipelineRunResult.AggregateTokenUsage and per-stage NexusLabs.Needlr.AgentFramework.Diagnostics.IAgentRunDiagnostics.AggregateTokenUsage to produce:

- pipeline.total_tokens — sum of all stage tokens.
- pipeline.total_input_tokens — aggregate input tokens.
- pipeline.total_output_tokens — aggregate output tokens.
- pipeline.stage_count — number of stages in the pipeline.
- pipeline.stages_with_diagnostics — count of stages that have non-null diagnostics.
- pipeline.most_expensive_stage — name of the stage with the most tokens.
- pipeline.most_expensive_stage_pct — percentage of total tokens used by the most expensive stage.

When no PipelineEvaluationContext is present in the additionalContext collection, the evaluator returns an empty Microsoft.Extensions.AI.Evaluation.EvaluationResult — callers should treat that as "not applicable".
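The two behaviors above can be sketched as follows. This is a hedged sketch, not verbatim library usage: `GetCapturedRunResult` is a hypothetical helper standing in for however the IPipelineRunResult snapshot is captured, and the `PipelineEvaluationContext(runResult)` constructor call is an assumption about that type's surface. ChatMessage, ChatRole, ChatResponse, and EvaluationResult are from Microsoft.Extensions.AI / Microsoft.Extensions.AI.Evaluation.

```csharp
using Microsoft.Extensions.AI;
using Microsoft.Extensions.AI.Evaluation;
using NexusLabs.Needlr.AgentFramework.Evaluation;

var evaluator = new PipelineCostEvaluator();
var messages = new[] { new ChatMessage(ChatRole.User, "Summarize the report.") };
var response = new ChatResponse(new ChatMessage(ChatRole.Assistant, "Done."));

// No PipelineEvaluationContext in additionalContext: the evaluator
// returns an empty EvaluationResult -- treat as "not applicable".
EvaluationResult empty = await evaluator.EvaluateAsync(messages, response);

// With a captured run snapshot the deterministic cost metrics are produced.
// GetCapturedRunResult() is a hypothetical helper; the context constructor
// shown here is an assumption about PipelineEvaluationContext.
IPipelineRunResult runResult = GetCapturedRunResult();
EvaluationResult result = await evaluator.EvaluateAsync(
    messages,
    response,
    chatConfiguration: null, // never used: this evaluator calls no model
    additionalContext: new[] { new PipelineEvaluationContext(runResult) });
```

Because the evaluator is deterministic and model-free, chatConfiguration can always be left null.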

Fields

PipelineCostEvaluator.MostExpensiveStageMetricName Field

Metric name for the name of the most expensive stage by token count.

public const string MostExpensiveStageMetricName = "pipeline.most_expensive_stage";

Field Value

System.String

PipelineCostEvaluator.MostExpensiveStagePctMetricName Field

Metric name for the percentage of total tokens used by the most expensive stage.

public const string MostExpensiveStagePctMetricName = "pipeline.most_expensive_stage_pct";

Field Value

System.String

PipelineCostEvaluator.StageCountMetricName Field

Metric name for the number of stages in the pipeline.

public const string StageCountMetricName = "pipeline.stage_count";

Field Value

System.String

PipelineCostEvaluator.StagesWithDiagnosticsMetricName Field

Metric name for the count of stages that have diagnostics.

public const string StagesWithDiagnosticsMetricName = "pipeline.stages_with_diagnostics";

Field Value

System.String

PipelineCostEvaluator.TotalInputTokensMetricName Field

Metric name for the total input token count.

public const string TotalInputTokensMetricName = "pipeline.total_input_tokens";

Field Value

System.String

PipelineCostEvaluator.TotalOutputTokensMetricName Field

Metric name for the total output token count.

public const string TotalOutputTokensMetricName = "pipeline.total_output_tokens";

Field Value

System.String

PipelineCostEvaluator.TotalTokensMetricName Field

Metric name for the total token count across all stages.

public const string TotalTokensMetricName = "pipeline.total_tokens";

Field Value

System.String

Properties

PipelineCostEvaluator.EvaluationMetricNames Property

Gets the Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Name values of the Microsoft.Extensions.AI.Evaluation.EvaluationMetrics produced by this Microsoft.Extensions.AI.Evaluation.IEvaluator.

public System.Collections.Generic.IReadOnlyCollection<string> EvaluationMetricNames { get; }

Implements EvaluationMetricNames

Property Value

System.Collections.Generic.IReadOnlyCollection<System.String>
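A minimal sketch of enumerating the advertised metric names (the exact contents should match the pipeline.* constants documented in the Fields section above; the ordering is not specified here):

```csharp
using System;
using NexusLabs.Needlr.AgentFramework.Evaluation;

var evaluator = new PipelineCostEvaluator();
foreach (string name in evaluator.EvaluationMetricNames)
{
    // e.g. "pipeline.total_tokens", "pipeline.stage_count", ...
    Console.WriteLine(name);
}
```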

Methods

PipelineCostEvaluator.EvaluateAsync(IEnumerable<ChatMessage>, ChatResponse, ChatConfiguration, IEnumerable<EvaluationContext>, CancellationToken) Method

Evaluates the supplied modelResponse and returns a Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetrics.

public System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult> EvaluateAsync(System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages, Microsoft.Extensions.AI.ChatResponse modelResponse, Microsoft.Extensions.AI.Evaluation.ChatConfiguration? chatConfiguration=null, System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>? additionalContext=null, System.Threading.CancellationToken cancellationToken=default(System.Threading.CancellationToken));

Parameters

messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>

The conversation history including the request that produced the supplied modelResponse.

modelResponse Microsoft.Extensions.AI.ChatResponse

The response that is to be evaluated.

chatConfiguration Microsoft.Extensions.AI.Evaluation.ChatConfiguration

A Microsoft.Extensions.AI.Evaluation.ChatConfiguration that specifies the Microsoft.Extensions.AI.IChatClient that should be used if one or more composed Microsoft.Extensions.AI.Evaluation.IEvaluators use an AI model to perform evaluation.

additionalContext System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>

Additional contextual information (beyond that which is available in messages) that the Microsoft.Extensions.AI.Evaluation.IEvaluator may need to accurately evaluate the supplied modelResponse.

cancellationToken System.Threading.CancellationToken

A System.Threading.CancellationToken that can cancel the evaluation operation.

Implements EvaluateAsync(IEnumerable<ChatMessage>, ChatResponse, ChatConfiguration, IEnumerable<EvaluationContext>, CancellationToken)

Returns

System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult>
A Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetrics.

Remarks

The Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Names of the Microsoft.Extensions.AI.Evaluation.EvaluationMetrics contained in the returned Microsoft.Extensions.AI.Evaluation.EvaluationResult should match Microsoft.Extensions.AI.Evaluation.IEvaluator.EvaluationMetricNames.

Also note that chatConfiguration must not be omitted if the evaluation is performed using an AI model. PipelineCostEvaluator never calls a model, so chatConfiguration may safely be null.
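As a hedged usage sketch of reading the returned metrics: `pipelineContext` is assumed to be a PipelineEvaluationContext already built from a captured run (its construction is not shown here), while EvaluationResult.Get<T>, StringMetric, and NumericMetric are part of Microsoft.Extensions.AI.Evaluation.

```csharp
using System;
using Microsoft.Extensions.AI.Evaluation;
using NexusLabs.Needlr.AgentFramework.Evaluation;

EvaluationResult result = await evaluator.EvaluateAsync(
    messages,
    modelResponse,
    additionalContext: new[] { pipelineContext }); // assumed PipelineEvaluationContext

// Look up metrics by the constant names declared on the evaluator.
StringMetric stage = result.Get<StringMetric>(
    PipelineCostEvaluator.MostExpensiveStageMetricName);
NumericMetric pct = result.Get<NumericMetric>(
    PipelineCostEvaluator.MostExpensiveStagePctMetricName);

Console.WriteLine($"Hottest stage: {stage.Value} ({pct.Value:F1}% of tokens)");
```

Using the metric-name constants rather than string literals keeps call sites in sync if the names ever change.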