ToolCallTrajectoryEvaluator

NexusLabs.Needlr.AgentFramework.Evaluation

ToolCallTrajectoryEvaluator Class

Deterministic evaluator that scores the tool-call trajectory of an agent run from the captured NexusLabs.Needlr.AgentFramework.Diagnostics.IAgentRunDiagnostics snapshot carried in an AgentRunDiagnosticsContext.

public sealed class ToolCallTrajectoryEvaluator : Microsoft.Extensions.AI.Evaluation.IEvaluator

Inheritance System.Object 🡒 ToolCallTrajectoryEvaluator

Implements Microsoft.Extensions.AI.Evaluation.IEvaluator

Remarks

This evaluator never contacts a language model. It reads the ordered NexusLabs.Needlr.AgentFramework.Diagnostics.IAgentRunDiagnostics.ToolCalls collection and produces:

- Tool Calls Total: total number of tool invocations.
- Tool Calls Failed: count of tool invocations whose NexusLabs.Needlr.AgentFramework.Diagnostics.ToolCallDiagnostics.Succeeded is false.
- Tool Call Sequence Gaps: number of missing slots in the NexusLabs.Needlr.AgentFramework.Diagnostics.ToolCallDiagnostics.Sequence stream (a strictly increasing sequence starting at 0 has zero gaps).
- All Tool Calls Succeeded: boolean rollup. true when every tool invocation succeeded (or when no tool calls occurred).
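The four metrics above can be illustrated with a small standalone sketch. It does not use the library's types: `ToolCallSample`, `TrajectorySummary`, and `TrajectorySketch` are hypothetical names introduced only for this example, and the gap rule (count the integers from 0 up to the highest recorded sequence number that no call carries) is one plausible reading of "missing slots", not a statement of how the evaluator is implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for ToolCallDiagnostics; only Sequence and Succeeded are modeled.
public sealed record ToolCallSample(int Sequence, bool Succeeded);

// Hypothetical container for the four values described in the remarks.
public sealed record TrajectorySummary(int Total, int Failed, int SequenceGaps, bool AllSucceeded);

public static class TrajectorySketch
{
    public static TrajectorySummary Summarize(IReadOnlyList<ToolCallSample> calls)
    {
        int total = calls.Count;
        int failed = calls.Count(c => !c.Succeeded);

        // Assumption: sequence numbers are non-negative, and a gap is any
        // integer in 0..max(Sequence) that no recorded call carries.
        int gaps = 0;
        if (total > 0)
        {
            var seen = new HashSet<int>(calls.Select(c => c.Sequence));
            gaps = (seen.Max() + 1) - seen.Count;
        }

        // With zero calls, failed is 0, so the rollup is true, as the remarks state.
        return new TrajectorySummary(total, failed, gaps, failed == 0);
    }
}
```

Under these assumptions, three calls with sequence numbers 0, 1, and 3, where the second one failed, summarize to a total of 3, 1 failure, 1 sequence gap, and a rollup of false.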

When no AgentRunDiagnosticsContext is present in the additionalContext collection, the evaluator returns an empty Microsoft.Extensions.AI.Evaluation.EvaluationResult — callers should treat that as "not applicable".
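A caller might detect the "not applicable" case by checking whether the result carries any metrics at all. The helper below is a minimal sketch; `TrajectoryResultChecks` and `IsNotApplicable` are hypothetical names, and the check assumes that "empty" means the `Metrics` dictionary of the returned `EvaluationResult` has no entries.

```csharp
using Microsoft.Extensions.AI.Evaluation;

public static class TrajectoryResultChecks
{
    // Assumption: an "empty" result is one whose Metrics dictionary has no entries.
    public static bool IsNotApplicable(EvaluationResult result)
        => result.Metrics.Count == 0;
}
```

Keeping this check separate from metric interpretation lets reporting code skip runs that carried no diagnostics instead of treating them as failures.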

Fields

ToolCallTrajectoryEvaluator.AllSucceededMetricName Field

Metric name for the boolean rollup indicating every tool call succeeded.

public const string AllSucceededMetricName = "All Tool Calls Succeeded";

Field Value

System.String

ToolCallTrajectoryEvaluator.FailedMetricName Field

Metric name for the failed tool-call count.

public const string FailedMetricName = "Tool Calls Failed";

Field Value

System.String

ToolCallTrajectoryEvaluator.SequenceGapsMetricName Field

Metric name for the number of gaps in the recorded tool-call sequence.

public const string SequenceGapsMetricName = "Tool Call Sequence Gaps";

Field Value

System.String

ToolCallTrajectoryEvaluator.TotalMetricName Field

Metric name for the total tool-call count.

public const string TotalMetricName = "Tool Calls Total";

Field Value

System.String
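The name constants are intended for looking up metrics in a returned result without hard-coding strings. The sketch below shows one way to do that with `EvaluationResult.TryGet<T>`. It assumes, without confirmation from this page, that the count metrics are emitted as `NumericMetric` and the rollup as `BooleanMetric`; `TrajectoryMetricReader` and `Describe` are hypothetical names.

```csharp
using Microsoft.Extensions.AI.Evaluation;
using NexusLabs.Needlr.AgentFramework.Evaluation;

public static class TrajectoryMetricReader
{
    public static string Describe(EvaluationResult result)
    {
        // Assumption: counts are NumericMetric and the rollup is BooleanMetric.
        // TryGet returns false when a metric is missing or has a different type.
        if (!result.TryGet<NumericMetric>(ToolCallTrajectoryEvaluator.TotalMetricName, out var total) ||
            !result.TryGet<NumericMetric>(ToolCallTrajectoryEvaluator.FailedMetricName, out var failed) ||
            !result.TryGet<BooleanMetric>(ToolCallTrajectoryEvaluator.AllSucceededMetricName, out var allSucceeded))
        {
            return "trajectory metrics not available";
        }

        return $"{failed.Value} of {total.Value} tool calls failed; all succeeded: {allSucceeded.Value}";
    }
}
```

Using `TryGet` rather than `Get` keeps the reader safe if the metric types differ from the assumption here.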

Properties

ToolCallTrajectoryEvaluator.EvaluationMetricNames Property

Gets the Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Name of each Microsoft.Extensions.AI.Evaluation.EvaluationMetric produced by this Microsoft.Extensions.AI.Evaluation.IEvaluator.

public System.Collections.Generic.IReadOnlyCollection<string> EvaluationMetricNames { get; }

Implements EvaluationMetricNames

Property Value

System.Collections.Generic.IReadOnlyCollection<System.String>

Methods

ToolCallTrajectoryEvaluator.EvaluateAsync(IEnumerable<ChatMessage>, ChatResponse, ChatConfiguration, IEnumerable<EvaluationContext>, CancellationToken) Method

Evaluates the supplied modelResponse and returns an Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetrics.

public System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult> EvaluateAsync(System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages, Microsoft.Extensions.AI.ChatResponse modelResponse, Microsoft.Extensions.AI.Evaluation.ChatConfiguration? chatConfiguration=null, System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>? additionalContext=null, System.Threading.CancellationToken cancellationToken=default(System.Threading.CancellationToken));

Parameters

messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>

The conversation history including the request that produced the supplied modelResponse.

modelResponse Microsoft.Extensions.AI.ChatResponse

The response that is to be evaluated.

chatConfiguration Microsoft.Extensions.AI.Evaluation.ChatConfiguration

A Microsoft.Extensions.AI.Evaluation.ChatConfiguration that specifies the Microsoft.Extensions.AI.IChatClient that should be used if one or more composed Microsoft.Extensions.AI.Evaluation.IEvaluators use an AI model to perform evaluation.

additionalContext System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>

Additional contextual information (beyond that which is available in messages) that the Microsoft.Extensions.AI.Evaluation.IEvaluator may need to accurately evaluate the supplied modelResponse.

cancellationToken System.Threading.CancellationToken

A System.Threading.CancellationToken that can cancel the evaluation operation.

Implements EvaluateAsync(IEnumerable<ChatMessage>, ChatResponse, ChatConfiguration, IEnumerable<EvaluationContext>, CancellationToken)

Returns

System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult>
An Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetrics.

Remarks

The Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Name of each Microsoft.Extensions.AI.Evaluation.EvaluationMetric contained in the returned Microsoft.Extensions.AI.Evaluation.EvaluationResult should match an entry in Microsoft.Extensions.AI.Evaluation.IEvaluator.EvaluationMetricNames.

Also note that chatConfiguration must not be omitted if the evaluation is performed using an AI model. This evaluator never contacts a language model, so it does not need a chatConfiguration.
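A call to EvaluateAsync might look like the following sketch. It passes the evaluator in as an `IEvaluator` and takes the diagnostics context as an already-built `EvaluationContext`, because this page does not document how a ToolCallTrajectoryEvaluator or an AgentRunDiagnosticsContext is constructed. `TrajectoryEvaluationSketch`, `RunAsync`, and the message texts are hypothetical.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.AI.Evaluation;

public static class TrajectoryEvaluationSketch
{
    public static async Task<EvaluationResult> RunAsync(
        IEvaluator evaluator,                 // e.g. a ToolCallTrajectoryEvaluator instance
        EvaluationContext diagnosticsContext, // e.g. an AgentRunDiagnosticsContext built elsewhere
        CancellationToken cancellationToken = default)
    {
        // Placeholder conversation; the trajectory metrics come from the
        // diagnostics context rather than from these messages.
        var messages = new[] { new ChatMessage(ChatRole.User, "What is the weather in Paris?") };
        var response = new ChatResponse(new ChatMessage(ChatRole.Assistant, "It is sunny."));

        // chatConfiguration is left null because no language model is consulted.
        return await evaluator.EvaluateAsync(
            messages,
            response,
            chatConfiguration: null,
            additionalContext: new[] { diagnosticsContext },
            cancellationToken: cancellationToken);
    }
}
```

Accepting the interface type rather than the concrete class also makes the same helper usable with other deterministic evaluators.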
