TaskCompletionEvaluator
NexusLabs.Needlr.AgentFramework.Evaluation¶
TaskCompletionEvaluator Class¶
LLM-judged evaluator that assesses whether an agent actually accomplished
the task it was given. Unlike MEAI's TaskAdherenceEvaluator (which
checks instruction following), this evaluator checks task success:
did the agent produce output that satisfies the original request?
Inheritance System.Object 🡒 TaskCompletionEvaluator
Implements Microsoft.Extensions.AI.Evaluation.IEvaluator
Remarks¶
This evaluator requires a Microsoft.Extensions.AI.Evaluation.ChatConfiguration with a judge Microsoft.Extensions.AI.IChatClient. It sends the original prompt and agent output to the judge with a structured evaluation prompt and parses the response.
When no judge is configured (chatConfiguration is null
or has no Microsoft.Extensions.AI.Evaluation.ChatConfiguration.ChatClient), the evaluator
returns an empty Microsoft.Extensions.AI.Evaluation.EvaluationResult.
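Callers may want to tell a skipped evaluation apart from a scored one. The following is a minimal sketch, assuming that "empty" means the returned result carries no metrics. HasJudgeVerdict is a hypothetical helper name, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.AI.Evaluation;

// Hypothetical helper: true when the evaluator actually produced metrics.
// Assumes an "empty" result is one whose Metrics dictionary has no entries.
static bool HasJudgeVerdict(EvaluationResult result) => result.Metrics.Count > 0;

// A result built from an empty dictionary stands in for the "no judge configured" case.
var skipped = new EvaluationResult(new Dictionary<string, EvaluationMetric>());

if (!HasJudgeVerdict(skipped))
{
    Console.WriteLine("No judge configured; task completion was not evaluated.");
}
```

Checking for this case up front avoids treating a missing verdict as a failed task.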
Metrics produced:
- Task Completed (boolean): true when the judge determines the agent accomplished the requested task.
- Task Completion Score (numeric, 1–5): how completely and correctly the agent fulfilled the request. 5 = fully complete, 1 = not started or completely wrong.
- Task Completion Reasoning (string): the judge's explanation for the score.
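The three metrics can be read back with EvaluationResult.TryGet. The sketch below runs against a hand-built result, so no judge is needed. It assumes the metric names are exactly the strings listed above; in real code the TaskCompletedMetricName, TaskCompletionScoreMetricName and TaskCompletionReasoningMetricName fields are the safer way to refer to them. Summarize and the sample values are hypothetical.

```csharp
using System;
using Microsoft.Extensions.AI.Evaluation;

// Hypothetical helper: formats the three metrics, or reports that none were produced.
// The literal metric names are an assumption based on the list above.
static string Summarize(EvaluationResult result)
{
    if (!result.TryGet<BooleanMetric>("Task Completed", out var completed) ||
        !result.TryGet<NumericMetric>("Task Completion Score", out var score) ||
        !result.TryGet<StringMetric>("Task Completion Reasoning", out var reasoning))
    {
        // TryGet returns false when a metric is missing, e.g. for an empty result.
        return "not evaluated";
    }

    return $"completed={completed.Value}, score={score.Value}/5, reasoning={reasoning.Value}";
}

// Hand-built stand-in for a judge's verdict; the values are illustrative only.
var result = new EvaluationResult(
    new BooleanMetric("Task Completed", true),
    new NumericMetric("Task Completion Score", 4),
    new StringMetric("Task Completion Reasoning", "Output covers the request with minor gaps."));

Console.WriteLine(Summarize(result));
```

Using TryGet rather than Get keeps the caller from throwing when the evaluator returned no metrics.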
Fields¶
TaskCompletionEvaluator.CompletionThreshold Field¶
Score threshold at or above which the task is considered completed.
Field Value¶
TaskCompletionEvaluator.TaskCompletedMetricName Field¶
Metric name for the boolean task-completed flag.
Field Value¶
TaskCompletionEvaluator.TaskCompletionReasoningMetricName Field¶
Metric name for the judge's reasoning.
Field Value¶
TaskCompletionEvaluator.TaskCompletionScoreMetricName Field¶
Metric name for the numeric 1–5 completion score.
Field Value¶
Properties¶
TaskCompletionEvaluator.EvaluationMetricNames Property¶
Gets the Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Names of the Microsoft.Extensions.AI.Evaluation.EvaluationMetrics produced by this Microsoft.Extensions.AI.Evaluation.IEvaluator.
Implements EvaluationMetricNames
Property Value¶
System.Collections.Generic.IReadOnlyCollection<System.String>
Methods¶
TaskCompletionEvaluator.EvaluateAsync(IEnumerable<ChatMessage>, ChatResponse, ChatConfiguration, IEnumerable<EvaluationContext>, CancellationToken) Method¶
Evaluates the supplied modelResponse and returns an Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetrics.
public System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult> EvaluateAsync(System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages, Microsoft.Extensions.AI.ChatResponse modelResponse, Microsoft.Extensions.AI.Evaluation.ChatConfiguration? chatConfiguration=null, System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>? additionalContext=null, System.Threading.CancellationToken cancellationToken=default(System.Threading.CancellationToken));
Parameters¶
messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>
The conversation history including the request that produced the supplied modelResponse.
modelResponse Microsoft.Extensions.AI.ChatResponse
The response that is to be evaluated.
chatConfiguration Microsoft.Extensions.AI.Evaluation.ChatConfiguration
A Microsoft.Extensions.AI.Evaluation.ChatConfiguration that specifies the Microsoft.Extensions.AI.IChatClient that should be used if one or more composed Microsoft.Extensions.AI.Evaluation.IEvaluators use an AI model to perform evaluation.
additionalContext System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.Evaluation.EvaluationContext>
Additional contextual information (beyond that which is available in messages) that the Microsoft.Extensions.AI.Evaluation.IEvaluator may need to accurately evaluate the supplied modelResponse.
cancellationToken System.Threading.CancellationToken
A System.Threading.CancellationToken that can cancel the evaluation operation.
Returns¶
System.Threading.Tasks.ValueTask<Microsoft.Extensions.AI.Evaluation.EvaluationResult>
An Microsoft.Extensions.AI.Evaluation.EvaluationResult containing one or more Microsoft.Extensions.AI.Evaluation.EvaluationMetrics.
Remarks¶
The Microsoft.Extensions.AI.Evaluation.EvaluationMetric.Names of the Microsoft.Extensions.AI.Evaluation.EvaluationMetrics contained in the returned Microsoft.Extensions.AI.Evaluation.EvaluationResult should match Microsoft.Extensions.AI.Evaluation.IEvaluator.EvaluationMetricNames.
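Also note that chatConfiguration must not be omitted if the evaluation is performed using an AI model. This evaluator always relies on a judge model, so omitting chatConfiguration yields an empty Microsoft.Extensions.AI.Evaluation.EvaluationResult.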
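A call with an explicit judge might look like the sketch below. This page does not document how a TaskCompletionEvaluator is constructed, so the helper accepts any IEvaluator and leaves construction to the caller. TaskCompletionExample, BuildInputs and EvaluateTaskAsync are hypothetical names, and judgeClient stands for whatever IChatClient serves as the judge.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.AI.Evaluation;

public static class TaskCompletionExample
{
    // Wraps the task prompt and the agent's final output in the message
    // types that EvaluateAsync expects.
    public static (IList<ChatMessage> Messages, ChatResponse Response) BuildInputs(
        string taskPrompt, string agentOutput)
    {
        var messages = new List<ChatMessage> { new ChatMessage(ChatRole.User, taskPrompt) };
        var response = new ChatResponse(new ChatMessage(ChatRole.Assistant, agentOutput));
        return (messages, response);
    }

    // Runs the supplied evaluator (for example a TaskCompletionEvaluator instance)
    // with an explicit judge client so that a verdict can be produced.
    public static async Task<EvaluationResult> EvaluateTaskAsync(
        IEvaluator evaluator,
        IChatClient judgeClient,
        string taskPrompt,
        string agentOutput,
        CancellationToken cancellationToken = default)
    {
        var (messages, response) = BuildInputs(taskPrompt, agentOutput);
        var chatConfiguration = new ChatConfiguration(judgeClient);

        // additionalContext is left at its default (null) by naming the last argument.
        return await evaluator.EvaluateAsync(
            messages, response, chatConfiguration, cancellationToken: cancellationToken);
    }
}
```

Passing the judge through ChatConfiguration on every call keeps the evaluator itself free of client state, which matches the IEvaluator signature shown above.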