EfficiencyEvaluator
NexusLabs.Needlr.AgentFramework.Evaluation¶
EfficiencyEvaluator Class¶
Deterministic evaluator that scores the token efficiency and cost profile of an agent run from the captured NexusLabs.Needlr.AgentFramework.Diagnostics.IAgentRunDiagnostics snapshot carried in an AgentRunDiagnosticsContext.
Inheritance System.Object 🡒 EfficiencyEvaluator
Implements Microsoft.Extensions.AI.Evaluation.IEvaluator
Example¶
// Score efficiency with a 10,000-token budget.
// `diagnostics` is an IAgentRunDiagnostics snapshot captured from a completed agent run.
var evaluator = new EfficiencyEvaluator(tokenBudget: 10_000);
var result = await evaluator.EvaluateAsync(
    messages: Array.Empty<ChatMessage>(),
    modelResponse: new ChatResponse(),
    additionalContext: [new AgentRunDiagnosticsContext(diagnostics)]);

// "Under Budget" is present here only because a token budget was supplied above.
var underBudget = ((BooleanMetric)result.Metrics["Under Budget"]).Value;
var tokensPerTool = ((NumericMetric)result.Metrics["Tokens Per Tool Call"]).Value;
Remarks¶
This evaluator never contacts a language model. It reads
NexusLabs.Needlr.AgentFramework.Diagnostics.IAgentRunDiagnostics.AggregateTokenUsage and
NexusLabs.Needlr.AgentFramework.Diagnostics.IAgentRunDiagnostics.ToolCalls to produce:
- Total Tokens — aggregate token count across all LLM calls.
- Input Token Ratio — input tokens / total tokens. High values suggest verbose prompts; low values suggest verbose outputs.
- Tokens Per Tool Call — total tokens / tool call count. Measures the token cost of each tool invocation.
- Cache Hit Ratio — cached input tokens / input tokens. Higher values mean more prompt-cache reuse.
- Under Budget — boolean. true when total tokens is strictly below the configured token budget. Only emitted when a budget is configured.
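The arithmetic behind the metrics listed above can be sketched in a few lines. This is only an illustration, not the evaluator's source: `TokenTotals` and `EfficiencySketch` are hypothetical names, the total is assumed to be input plus output tokens, and returning `null` for a zero denominator is an assumption (the documentation does not say how the evaluator handles a run with no tool calls or no input tokens).

```csharp
using System;

// Illustrative counts only; these are not output from a real agent run.
var usage = new TokenTotals(InputTokens: 6_000, OutputTokens: 2_000, CachedInputTokens: 1_500);
int toolCallCount = 4;
long? tokenBudget = 10_000;

Console.WriteLine($"Total Tokens: {usage.Total}");
Console.WriteLine($"Input Token Ratio: {EfficiencySketch.Ratio(usage.InputTokens, usage.Total)}");
Console.WriteLine($"Tokens Per Tool Call: {EfficiencySketch.Ratio(usage.Total, toolCallCount)}");
Console.WriteLine($"Cache Hit Ratio: {EfficiencySketch.Ratio(usage.CachedInputTokens, usage.InputTokens)}");
Console.WriteLine($"Under Budget: {EfficiencySketch.IsUnderBudget(usage.Total, tokenBudget)}");

// Hypothetical stand-in for the aggregate token usage of a run.
public sealed record TokenTotals(long InputTokens, long OutputTokens, long CachedInputTokens)
{
    // Assumption: the total is the sum of input and output tokens.
    public long Total => InputTokens + OutputTokens;
}

// Hypothetical helper; not part of NexusLabs.Needlr.AgentFramework.
public static class EfficiencySketch
{
    // Assumption: a ratio with a zero denominator is reported as null.
    public static double? Ratio(long numerator, long denominator) =>
        denominator == 0 ? (double?)null : (double)numerator / denominator;

    // Strictly below the budget, as the list above describes; null when no budget is set.
    public static bool? IsUnderBudget(long totalTokens, long? tokenBudget) =>
        tokenBudget is null ? (bool?)null : totalTokens < tokenBudget.Value;
}
```

With these sample counts the run totals 8,000 tokens against a 10,000-token budget, so the budget check passes. A run that lands exactly on the budget would not pass, because the comparison is strict.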
When no AgentRunDiagnosticsContext is present in the
additionalContext collection, the evaluator returns an empty
Microsoft.Extensions.AI.Evaluation.EvaluationResult. Callers should treat an empty result as "not applicable" rather than as a failed evaluation.
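The "not applicable" behavior described above amounts to scanning the context collection for one specific context type and returning nothing when it is absent. The sketch below mirrors that shape with stand-in types; `SketchContext`, `SketchDiagnosticsContext`, `SketchOtherContext`, and `NotApplicableSketch` are all hypothetical and do not exist in the library, and the plain dictionary stands in for the real result type.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// A context collection that carries no diagnostics yields an empty result.
var none = NotApplicableSketch.Evaluate(new SketchContext[] { new SketchOtherContext() });
Console.WriteLine(none.Count == 0 ? "not applicable" : "scored");

// A collection that carries diagnostics yields at least one metric.
var some = NotApplicableSketch.Evaluate(new SketchContext[] { new SketchDiagnosticsContext(8_000) });
Console.WriteLine(some.Count == 0 ? "not applicable" : "scored");

// Hypothetical stand-ins for the evaluation context types.
public abstract class SketchContext { }

public sealed class SketchDiagnosticsContext : SketchContext
{
    public SketchDiagnosticsContext(long totalTokens) => TotalTokens = totalTokens;
    public long TotalTokens { get; }
}

public sealed class SketchOtherContext : SketchContext { }

public static class NotApplicableSketch
{
    // Returns an empty dictionary when no diagnostics context is supplied.
    public static IReadOnlyDictionary<string, double> Evaluate(IEnumerable<SketchContext>? additionalContext)
    {
        var diagnostics = additionalContext?.OfType<SketchDiagnosticsContext>().FirstOrDefault();
        if (diagnostics is null)
        {
            return new Dictionary<string, double>();
        }

        return new Dictionary<string, double> { ["Total Tokens"] = diagnostics.TotalTokens };
    }
}
```

Checking for an empty result before indexing into the metrics avoids a missing-key exception when the evaluator runs in a pipeline where diagnostics were never captured.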
Constructors¶
EfficiencyEvaluator(Nullable<long>) Constructor¶
Creates a new EfficiencyEvaluator.
Parameters¶
tokenBudget System.Nullable<System.Int64>
Optional token budget. When provided, the evaluator emits the UnderBudgetMetricName metric. When null, the metric is omitted.
Fields¶
EfficiencyEvaluator.CacheHitRatioMetricName Field¶
Metric name for the prompt-cache hit ratio.
Field Value¶
EfficiencyEvaluator.InputTokenRatioMetricName Field¶
Metric name for the input-to-total token ratio.
Field Value¶
EfficiencyEvaluator.TokensPerToolCallMetricName Field¶
Metric name for tokens consumed per tool call.
Field Value¶
EfficiencyEvaluator.TotalTokensMetricName Field¶
Metric name for the aggregate token count.
Field Value¶
EfficiencyEvaluator.UnderBudgetMetricName Field¶
Metric name for the boolean budget check.
Field Value¶
Properties¶
EfficiencyEvaluator.EvaluationMetricNames Property¶
Gets the Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Name of each Microsoft.Extensions.AI.Evaluation.EvaluationMetric produced by this Microsoft.Extensions.AI.Evaluation.IEvaluator.
Implements EvaluationMetricNames
Property Value¶
System.Collections.Generic.IReadOnlyCollection<System.String>
Methods¶
EfficiencyEvaluator.EvaluateAsync(IEnumerable<ChatMessage>, ChatResponse, ChatConfiguration, IEnumerable<EvaluationContext>, CancellationToken) Method¶
Evaluates the supplied modelResponse and returns a Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetric instances.
public System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult> EvaluateAsync(
    System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages,
    Microsoft.Extensions.AI.ChatResponse modelResponse,
    Microsoft.Extensions.AI.Evaluation.ChatConfiguration? chatConfiguration = null,
    System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>? additionalContext = null,
    System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken));
Parameters¶
messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>
The conversation history including the request that produced the supplied modelResponse.
modelResponse Microsoft.Extensions.AI.ChatResponse
The response that is to be evaluated.
chatConfiguration Microsoft.Extensions.AI.Evaluation.ChatConfiguration
A Microsoft.Extensions.AI.Evaluation.ChatConfiguration that specifies the Microsoft.Extensions.AI.IChatClient that should be used if one or more composed Microsoft.Extensions.AI.Evaluation.IEvaluator instances use an AI model to perform evaluation.
additionalContext System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>
Additional contextual information (beyond that which is available in messages) that the Microsoft.Extensions.AI.Evaluation.IEvaluator may need to accurately evaluate the supplied modelResponse.
cancellationToken System.Threading.CancellationToken
A System.Threading.CancellationToken that can cancel the evaluation operation.
Returns¶
System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult>
A Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetric instances.
Remarks¶
The Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Name of each Microsoft.Extensions.AI.Evaluation.EvaluationMetric contained in the returned Microsoft.Extensions.AI.Evaluation.EvaluationResult should match Microsoft.Extensions.AI.Evaluation.IEvaluator.EvaluationMetricNames.
Also note that chatConfiguration must not be omitted if the evaluation is performed using an AI model. This evaluator never contacts a language model, so chatConfiguration can be left as null.
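The name-matching expectation in these remarks can be checked mechanically in a test. The sketch below is a rough illustration with a hypothetical helper, `NameCheckSketch`, and plain strings in place of real metric objects. "Under Budget" and "Tokens Per Tool Call" appear as keys in the class example; the other three names are assumed to match the headings in the class remarks. Because "Under Budget" is emitted only when a budget is configured, the sketch checks that produced names are a subset of the declared names rather than an exact match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed metric names, taken from the class example and remarks.
string[] declared =
{
    "Total Tokens", "Input Token Ratio", "Tokens Per Tool Call", "Cache Hit Ratio", "Under Budget",
};

// Names a run without a token budget might produce (illustrative only).
string[] produced = { "Total Tokens", "Input Token Ratio", "Tokens Per Tool Call", "Cache Hit Ratio" };

var undeclared = NameCheckSketch.UndeclaredNames(declared, produced);
Console.WriteLine(undeclared.Count == 0
    ? "All produced metric names are declared."
    : "Undeclared metric names: " + string.Join(", ", undeclared));

// Hypothetical helper; not part of Microsoft.Extensions.AI.Evaluation.
public static class NameCheckSketch
{
    // Returns every produced name that is missing from the declared set.
    public static IReadOnlyList<string> UndeclaredNames(
        IEnumerable<string> declared, IEnumerable<string> produced)
    {
        var known = new HashSet<string>(declared, StringComparer.Ordinal);
        return produced.Where(name => !known.Contains(name)).ToList();
    }
}
```

In a real test, the declared names would come from EvaluationMetricNames and the produced names from the keys of the returned result's Metrics dictionary.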