NexusLabs.Needlr.AgentFramework.Evaluation

TaskCompletionEvaluator Class

LLM-judged evaluator that assesses whether an agent actually accomplished the task it was given. Unlike MEAI's TaskAdherenceEvaluator (which checks instruction following), this evaluator checks task success: did the agent produce output that satisfies the original request?

public sealed class TaskCompletionEvaluator : Microsoft.Extensions.AI.Evaluation.IEvaluator

Inheritance System.Object 🡒 TaskCompletionEvaluator

Implements Microsoft.Extensions.AI.Evaluation.IEvaluator

Remarks

This evaluator requires a Microsoft.Extensions.AI.Evaluation.ChatConfiguration with a judge Microsoft.Extensions.AI.IChatClient. It sends the original prompt and agent output to the judge with a structured evaluation prompt and parses the response.

When no judge is configured (chatConfiguration is null or has no Microsoft.Extensions.AI.Evaluation.ChatConfiguration.ChatClient), the evaluator returns an empty Microsoft.Extensions.AI.Evaluation.EvaluationResult.
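Callers may want to tell a skipped evaluation apart from a judged one. The following is a minimal sketch, assuming that an "empty" result is one whose Metrics dictionary has no entries. HasVerdict and EvaluationResultGuards are hypothetical names for illustration and are not part of Needlr.

```csharp
using Microsoft.Extensions.AI.Evaluation;

public static class EvaluationResultGuards
{
    // Hypothetical helper: true when the evaluator actually produced metrics.
    // Assumption: an "empty" EvaluationResult has no entries in Metrics.
    public static bool HasVerdict(EvaluationResult result) => result.Metrics.Count > 0;
}
```

A check like this keeps a missing judge from being misread as a failed task.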

Metrics produced:

- Task Completed: boolean. true when the judge determines the agent accomplished the requested task.
- Task Completion Score: numeric (1–5). How completely and correctly the agent fulfilled the request. 5 = fully complete, 1 = not started or completely wrong.
- Task Completion Reasoning: string. The judge's explanation for the score.
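The three metrics can be read back from the returned result by name. This is a sketch under assumptions: that the metrics surface as MEAI's BooleanMetric, NumericMetric, and StringMetric types (the page only says boolean, numeric, and string), and that the names match the constants documented under Fields. TaskCompletionMetrics and Read are hypothetical names.

```csharp
using Microsoft.Extensions.AI.Evaluation;

public static class TaskCompletionMetrics
{
    // Metric names copied from the constants documented on this page.
    private const string CompletedName = "Task Completed";
    private const string ScoreName = "Task Completion Score";
    private const string ReasoningName = "Task Completion Reasoning";

    // Returns null when the result carries no completion score
    // (for example, when no judge was configured).
    public static (bool Completed, double Score, string Reasoning)? Read(EvaluationResult result)
    {
        // Assumption: the score is exposed as a NumericMetric.
        if (!result.TryGet<NumericMetric>(ScoreName, out var score) || score.Value is null)
        {
            return null;
        }

        // Assumption: the flag and reasoning are BooleanMetric and StringMetric.
        result.TryGet<BooleanMetric>(CompletedName, out var completed);
        result.TryGet<StringMetric>(ReasoningName, out var reasoning);

        return (completed?.Value ?? false, score.Value.Value, reasoning?.Value ?? string.Empty);
    }
}
```

Returning null rather than a default tuple keeps "not evaluated" distinct from "evaluated and failed".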

Fields

TaskCompletionEvaluator.CompletionThreshold Field

Score threshold at or above which the task is considered completed.

public const int CompletionThreshold = 3;
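Read literally, "at or above" means a score of 3 counts as completed and a score of 2 does not. The sketch below restates that rule in isolation. CompletionRule and IsCompleted are hypothetical names, and the assumption that the evaluator derives the Task Completed flag from the score in exactly this way is inferred from the field description rather than confirmed by the source.

```csharp
public static class CompletionRule
{
    // Mirrors the documented constant TaskCompletionEvaluator.CompletionThreshold.
    public const int CompletionThreshold = 3;

    // Inclusive comparison: a score equal to the threshold counts as completed.
    public static bool IsCompleted(int score) => score >= CompletionThreshold;
}
```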

Field Value

System.Int32

TaskCompletionEvaluator.TaskCompletedMetricName Field

Metric name for the boolean task-completed flag.

public const string TaskCompletedMetricName = "Task Completed";

Field Value

System.String

TaskCompletionEvaluator.TaskCompletionReasoningMetricName Field

Metric name for the judge's reasoning.

public const string TaskCompletionReasoningMetricName = "Task Completion Reasoning";

Field Value

System.String

TaskCompletionEvaluator.TaskCompletionScoreMetricName Field

Metric name for the numeric 1–5 completion score.

public const string TaskCompletionScoreMetricName = "Task Completion Score";

Field Value

System.String

Properties

TaskCompletionEvaluator.EvaluationMetricNames Property

Gets the Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Names of the Microsoft.Extensions.AI.Evaluation.EvaluationMetrics produced by this Microsoft.Extensions.AI.Evaluation.IEvaluator.

public System.Collections.Generic.IReadOnlyCollection<string> EvaluationMetricNames { get; }

Implements EvaluationMetricNames

Property Value

System.Collections.Generic.IReadOnlyCollection<System.String>

Methods

TaskCompletionEvaluator.EvaluateAsync(IEnumerable<ChatMessage>, ChatResponse, ChatConfiguration, IEnumerable<EvaluationContext>, CancellationToken) Method

Evaluates the supplied modelResponse and returns a Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetrics.

public System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult> EvaluateAsync(
    System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages,
    Microsoft.Extensions.AI.ChatResponse modelResponse,
    Microsoft.Extensions.AI.Evaluation.ChatConfiguration? chatConfiguration = null,
    System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>? additionalContext = null,
    System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken));

Parameters

messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>

The conversation history including the request that produced the supplied modelResponse.

modelResponse Microsoft.Extensions.AI.ChatResponse

The response that is to be evaluated.

chatConfiguration Microsoft.Extensions.AI.Evaluation.ChatConfiguration

A Microsoft.Extensions.AI.Evaluation.ChatConfiguration that specifies the Microsoft.Extensions.AI.IChatClient that should be used if one or more composed Microsoft.Extensions.AI.Evaluation.IEvaluators use an AI model to perform evaluation.

additionalContext System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>

Additional contextual information (beyond that which is available in messages) that the Microsoft.Extensions.AI.Evaluation.IEvaluator may need to accurately evaluate the supplied modelResponse.

cancellationToken System.Threading.CancellationToken

A System.Threading.CancellationToken that can cancel the evaluation operation.

Implements EvaluateAsync(IEnumerable<ChatMessage>, ChatResponse, ChatConfiguration, IEnumerable<EvaluationContext>, CancellationToken)

Returns

System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult>

A Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetrics.

Remarks

The Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Names of the Microsoft.Extensions.AI.Evaluation.EvaluationMetrics contained in the returned Microsoft.Extensions.AI.Evaluation.EvaluationResult should match Microsoft.Extensions.AI.Evaluation.IEvaluator.EvaluationMetricNames.

Also note that chatConfiguration must not be omitted if the evaluation is performed using an AI model.
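As an illustration of a call that supplies a judge, here is a minimal sketch written against the IEvaluator interface, so it makes no assumption about how TaskCompletionEvaluator is constructed. TaskCompletionJudging and JudgeAsync are hypothetical names, and treating the original request as a single user message is an assumption made for brevity.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.AI.Evaluation;

public static class TaskCompletionJudging
{
    // Hypothetical helper: wraps a prompt and the agent's output in the shapes
    // EvaluateAsync expects, and passes the judge through ChatConfiguration.
    public static async Task<EvaluationResult> JudgeAsync(
        IEvaluator evaluator,
        IChatClient judge,
        string prompt,
        string agentOutput,
        CancellationToken cancellationToken = default)
    {
        // Conversation history: here, only the original request.
        var messages = new[] { new ChatMessage(ChatRole.User, prompt) };

        // The agent output to be judged.
        var modelResponse = new ChatResponse(new ChatMessage(ChatRole.Assistant, agentOutput));

        // The judge model is supplied via ChatConfiguration.
        var chatConfiguration = new ChatConfiguration(judge);

        return await evaluator.EvaluateAsync(
            messages, modelResponse, chatConfiguration, cancellationToken: cancellationToken);
    }
}
```

Passing the evaluator as an IEvaluator also lets the same helper drive other evaluators that follow the same contract.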