NexusLabs.Needlr.AgentFramework.Workflows

NexusLabs.Needlr.AgentFramework.Workflows.Budget

TokenBudgetChatMiddleware Class

A Microsoft.Extensions.AI.DelegatingChatClient that accumulates token usage from each LLM call into a NexusLabs.Needlr.AgentFramework.Budget.ITokenBudgetTracker and aborts when the budget is exceeded. It emits NexusLabs.Needlr.AgentFramework.Progress.BudgetUpdatedEvent and NexusLabs.Needlr.AgentFramework.Progress.BudgetExceededEvent to the progress reporter in real time.

public sealed class TokenBudgetChatMiddleware : Microsoft.Extensions.AI.DelegatingChatClient

Inheritance System.Object 🡒 Microsoft.Extensions.AI.DelegatingChatClient 🡒 TokenBudgetChatMiddleware

Remarks

Budget enforcement uses two mechanisms:

1. A System.OperationCanceledException wrapping a NexusLabs.Needlr.AgentFramework.Budget.TokenBudgetExceededException is thrown from the middleware (works for direct agent runs).
2. NexusLabs.Needlr.AgentFramework.Budget.ITokenBudgetTracker.BudgetCancellationToken is cancelled when tokens are recorded past the limit (works for MAF workflow runs; pass this token to the workflow).
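The two mechanisms can be handled roughly as follows. This is a minimal sketch, not the library's prescribed pattern. It assumes that "wrapping" means the TokenBudgetExceededException is exposed as InnerException, and that BudgetCancellationToken is a System.Threading.CancellationToken property. The class name, method names, and the runWorkflow delegate are hypothetical placeholders for your own code.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;
using NexusLabs.Needlr.AgentFramework.Budget;

// Hypothetical helper class; not part of the library.
public static class BudgetEnforcementSketch
{
    // Mechanism 1: a direct call through the middleware.
    // Returns null when the call was aborted because the budget ran out.
    public static async Task<ChatResponse?> TryGetResponseAsync(
        IChatClient budgetedClient,
        IEnumerable<ChatMessage> messages,
        CancellationToken cancellationToken = default)
    {
        try
        {
            return await budgetedClient.GetResponseAsync(messages, null, cancellationToken);
        }
        catch (OperationCanceledException ex)
            when (ex.InnerException is TokenBudgetExceededException)
        {
            // Assumption: the budget exception is carried as InnerException.
            // Ordinary cancellations do not match the filter and keep propagating.
            return null;
        }
    }

    // Mechanism 2: a workflow run observes the tracker's budget token.
    // runWorkflow is a hypothetical delegate standing in for your workflow entry point.
    public static async Task RunWithBudgetAsync(
        ITokenBudgetTracker tracker,
        Func<CancellationToken, Task> runWorkflow,
        CancellationToken callerToken = default)
    {
        // Link the caller's token with the budget token so either one stops the run.
        // Assumption: BudgetCancellationToken is a CancellationToken.
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(
            callerToken, tracker.BudgetCancellationToken);

        await runWorkflow(linked.Token);
    }
}
```

The exception filter keeps caller-initiated cancellation distinct from budget exhaustion, so a caller can react to the two cases differently.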

Limitation: Only GetResponseAsync is budget-tracked. Streaming via GetStreamingResponseAsync passes through without enforcement.

Constructors

TokenBudgetChatMiddleware(IChatClient, ITokenBudgetTracker, IProgressReporterAccessor) Constructor

public TokenBudgetChatMiddleware(Microsoft.Extensions.AI.IChatClient innerClient, NexusLabs.Needlr.AgentFramework.Budget.ITokenBudgetTracker tracker, NexusLabs.Needlr.AgentFramework.Progress.IProgressReporterAccessor progressAccessor);

Parameters

innerClient Microsoft.Extensions.AI.IChatClient

The inner chat client to delegate to.

tracker NexusLabs.Needlr.AgentFramework.Budget.ITokenBudgetTracker

The token budget tracker scoped to the current pipeline run.

progressAccessor NexusLabs.Needlr.AgentFramework.Progress.IProgressReporterAccessor

Progress reporter accessor for emitting budget events.
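A minimal sketch of constructing the middleware from the three documented parameters. The factory class and method names are hypothetical. How the tracker and progress accessor are obtained (for example from dependency injection) is not covered on this page, so they are taken as parameters here. The middleware's namespace is assumed to be the one listed at the top of this page.

```csharp
using Microsoft.Extensions.AI;
using NexusLabs.Needlr.AgentFramework.Budget;
using NexusLabs.Needlr.AgentFramework.Progress;
using NexusLabs.Needlr.AgentFramework.Workflows.Budget;

// Hypothetical factory; not part of the library.
public static class BudgetedClientFactory
{
    public static IChatClient Wrap(
        IChatClient innerClient,
        ITokenBudgetTracker tracker,
        IProgressReporterAccessor progressAccessor)
    {
        // The middleware is itself an IChatClient (via DelegatingChatClient),
        // so callers can use the returned instance wherever they used innerClient.
        return new TokenBudgetChatMiddleware(innerClient, tracker, progressAccessor);
    }
}
```

Because the tracker is described as scoped to the current pipeline run, the wrapped client should probably be created per run rather than shared as a singleton.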

Methods

TokenBudgetChatMiddleware.GetResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken) Method

Sends chat messages and returns the response.

public override System.Threading.Tasks.Task<Microsoft.Extensions.AI.ChatResponse> GetResponseAsync(System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages, Microsoft.Extensions.AI.ChatOptions? options, System.Threading.CancellationToken cancellationToken);

Parameters

messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>

The sequence of chat messages to send.

options Microsoft.Extensions.AI.ChatOptions

The chat options with which to configure the request.

cancellationToken System.Threading.CancellationToken

The System.Threading.CancellationToken to monitor for cancellation requests. The default is System.Threading.CancellationToken.None.

Implements GetResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken)

Returns

System.Threading.Tasks.Task<Microsoft.Extensions.AI.ChatResponse>
The response messages generated by the client.

Exceptions

System.ArgumentNullException
messages is null.
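A short usage sketch for a non-streaming call through the budgeted client. The helper name and prompt text are illustrative. Reading ChatResponse.Usage is shown only to make the provider-reported token counts visible. Treating those counts as what the middleware records is an assumption, not something this page states.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical helper class; not part of the library.
public static class GetResponseSketch
{
    public static async Task<string> AskAsync(
        IChatClient budgetedClient,
        string question,
        CancellationToken cancellationToken = default)
    {
        // messages must not be null, or ArgumentNullException is thrown.
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, "You are a concise assistant."),
            new(ChatRole.User, question),
        };

        ChatResponse response =
            await budgetedClient.GetResponseAsync(messages, null, cancellationToken);

        // Usage may be null if the provider did not report token counts.
        var total = response.Usage?.TotalTokenCount;
        Console.WriteLine($"Tokens reported by provider: {total?.ToString() ?? "n/a"}");

        return response.Text;
    }
}
```

Calls made through GetStreamingResponseAsync on the same client are not budget-tracked, so this non-streaming path is the one to use when enforcement matters.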