ToolCallTrajectoryEvaluator
NexusLabs.Needlr.AgentFramework.Evaluation
ToolCallTrajectoryEvaluator Class
Deterministic evaluator that scores the tool-call trajectory of an agent run from the captured NexusLabs.Needlr.AgentFramework.Diagnostics.IAgentRunDiagnostics snapshot carried in an AgentRunDiagnosticsContext.
Inheritance System.Object 🡒 ToolCallTrajectoryEvaluator
Implements Microsoft.Extensions.AI.Evaluation.IEvaluator
Remarks
This evaluator never contacts a language model. It reads the ordered
NexusLabs.Needlr.AgentFramework.Diagnostics.IAgentRunDiagnostics.ToolCalls collection and produces:
- Tool Calls Total: total number of tool invocations.
- Tool Calls Failed: number of tool invocations whose NexusLabs.Needlr.AgentFramework.Diagnostics.ToolCallDiagnostics.Succeeded is false.
- Tool Call Sequence Gaps: number of missing slots in the NexusLabs.Needlr.AgentFramework.Diagnostics.ToolCallDiagnostics.Sequence stream (a sequence that starts at 0 and increases by one per call has zero gaps).
- All Tool Calls Succeeded: Boolean rollup that is true when every tool invocation succeeded, or when no tool calls occurred.
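The four metrics listed above can be illustrated with a minimal, self-contained sketch. It does not use the library's types: `ToolCallSample`, `TrajectorySummary`, and `TrajectorySketch` are hypothetical stand-ins, the `Sequence` value is assumed to be a non-negative integer, and a "gap" is assumed to mean any integer between 0 and the highest recorded sequence number that has no recorded call. The evaluator's actual implementation may count gaps differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the two ToolCallDiagnostics members named in the remarks.
public sealed record ToolCallSample(int Sequence, bool Succeeded);

// Hypothetical container for the four values the remarks describe.
public sealed record TrajectorySummary(int Total, int Failed, int SequenceGaps, bool AllSucceeded);

public static class TrajectorySketch
{
    public static TrajectorySummary Summarize(IReadOnlyList<ToolCallSample> calls)
    {
        int total = calls.Count;
        int failed = calls.Count(c => !c.Succeeded);

        // Assumption: a gap is any integer in [0, max] with no recorded call,
        // so the contiguous run 0, 1, 2, ... yields zero gaps.
        var seen = new HashSet<int>(calls.Select(c => c.Sequence));
        int gaps = seen.Count == 0 ? 0 : (seen.Max() + 1) - seen.Count;

        // With no calls at all, failed is 0 and the rollup is true.
        return new TrajectorySummary(total, failed, gaps, failed == 0);
    }
}
```

Under these assumptions, a run that recorded sequence numbers 0, 1, and 3 with one failed call would summarize as three total calls, one failure, one gap (slot 2), and a false rollup.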
When no AgentRunDiagnosticsContext is present in the additionalContext collection, the evaluator returns an empty Microsoft.Extensions.AI.Evaluation.EvaluationResult. Callers should treat that as "not applicable".
Fields
ToolCallTrajectoryEvaluator.AllSucceededMetricName Field
Metric name for the boolean rollup indicating every tool call succeeded.
Field Value
ToolCallTrajectoryEvaluator.FailedMetricName Field
Metric name for the failed tool-call count.
Field Value
ToolCallTrajectoryEvaluator.SequenceGapsMetricName Field
Metric name for the number of gaps in the recorded tool-call sequence.
Field Value
ToolCallTrajectoryEvaluator.TotalMetricName Field
Metric name for the total tool-call count.
Field Value
Properties
ToolCallTrajectoryEvaluator.EvaluationMetricNames Property
Gets the Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Name of each Microsoft.Extensions.AI.Evaluation.EvaluationMetric produced by this Microsoft.Extensions.AI.Evaluation.IEvaluator.
Implements EvaluationMetricNames
Property Value
System.Collections.Generic.IReadOnlyCollection<System.String>
Methods
ToolCallTrajectoryEvaluator.EvaluateAsync(IEnumerable<ChatMessage>, ChatResponse, ChatConfiguration, IEnumerable<EvaluationContext>, CancellationToken) Method
Evaluates the supplied modelResponse and returns a Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetric instances.
public System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult> EvaluateAsync(
    System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages,
    Microsoft.Extensions.AI.ChatResponse modelResponse,
    Microsoft.Extensions.AI.Evaluation.ChatConfiguration? chatConfiguration = null,
    System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>? additionalContext = null,
    System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken));
Parameters
messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>
The conversation history including the request that produced the supplied modelResponse.
modelResponse Microsoft.Extensions.AI.ChatResponse
The response that is to be evaluated.
chatConfiguration Microsoft.Extensions.AI.Evaluation.ChatConfiguration
A Microsoft.Extensions.AI.Evaluation.ChatConfiguration that specifies the Microsoft.Extensions.AI.IChatClient to use if one or more composed Microsoft.Extensions.AI.Evaluation.IEvaluator instances use an AI model to perform evaluation.
additionalContext System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>
Additional contextual information (beyond that which is available in messages) that the Microsoft.Extensions.AI.Evaluation.IEvaluator may need to accurately evaluate the supplied modelResponse.
cancellationToken System.Threading.CancellationToken
A System.Threading.CancellationToken that can cancel the evaluation operation.
Returns
System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult>
A Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetric instances.
Remarks
The Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Name of each Microsoft.Extensions.AI.Evaluation.EvaluationMetric contained in the returned Microsoft.Extensions.AI.Evaluation.EvaluationResult should match the names in Microsoft.Extensions.AI.Evaluation.IEvaluator.EvaluationMetricNames.
Also note that chatConfiguration must not be omitted if the evaluation is performed using an AI model. This evaluator never contacts a language model, so that requirement does not apply to it.