TokenBudgetChatMiddleware
NexusLabs.Needlr.AgentFramework.Workflows¶
NexusLabs.Needlr.AgentFramework.Workflows.Budget¶
TokenBudgetChatMiddleware Class¶
A Microsoft.Extensions.AI.DelegatingChatClient that accumulates token usage from each LLM call into a NexusLabs.Needlr.AgentFramework.Budget.ITokenBudgetTracker and aborts when the budget is exceeded. Emits NexusLabs.Needlr.AgentFramework.Progress.BudgetUpdatedEvent and NexusLabs.Needlr.AgentFramework.Progress.BudgetExceededEvent to the progress reporter in real time.
Inheritance System.Object 🡒 Microsoft.Extensions.AI.DelegatingChatClient 🡒 TokenBudgetChatMiddleware
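The accumulate-then-abort behaviour described above can be sketched with a stand-in tracker. This is only an illustration: this page shows no member of ITokenBudgetTracker other than BudgetCancellationToken, so SketchBudgetTracker, Record, Used, and the limit parameter are hypothetical names, not the library's API.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in, NOT the library's ITokenBudgetTracker implementation.
public sealed class SketchBudgetTracker : IDisposable
{
    private readonly long _limit;
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private long _used;

    public SketchBudgetTracker(long limit) => _limit = limit;

    // Total tokens recorded so far.
    public long Used => Interlocked.Read(ref _used);

    // Plays the role of ITokenBudgetTracker.BudgetCancellationToken.
    public CancellationToken BudgetCancellationToken => _cts.Token;

    // Adds the tokens reported for one LLM call. Once the running total goes
    // past the limit, the budget token is cancelled and false is returned.
    public bool Record(long tokens)
    {
        long total = Interlocked.Add(ref _used, tokens);
        if (total > _limit)
        {
            _cts.Cancel();
            return false;
        }
        return true;
    }

    public void Dispose() => _cts.Dispose();
}
```

In this sketch the call that crosses the limit is still counted, so Used can end up above the limit. Whether the real tracker behaves the same way is not stated on this page.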
Remarks¶
Budget enforcement uses two mechanisms:
1. A System.OperationCanceledException wrapping NexusLabs.Needlr.AgentFramework.Budget.TokenBudgetExceededException is thrown from the middleware. This works for direct agent runs.
2. NexusLabs.Needlr.AgentFramework.Budget.ITokenBudgetTracker.BudgetCancellationToken is cancelled when tokens are recorded past the limit. This works for MAF workflow runs, provided you pass this token to the workflow.
Limitation: Only GetResponseAsync is budget-tracked.
Streaming via GetStreamingResponseAsync passes through without enforcement.
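A caller may want to tell a budget abort apart from an ordinary cancellation. The sketch below shows one possible way to do that for both mechanisms. It assumes that "wrapping" means the TokenBudgetExceededException is available as InnerException, which this page does not confirm. The exception class is a local stand-in declared so the example compiles on its own, and BudgetGuard, IsBudgetExceeded, RunAsync, and the status strings are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for NexusLabs.Needlr.AgentFramework.Budget.TokenBudgetExceededException.
// Declared only to keep the sketch self-contained; the real constructor may differ.
public sealed class TokenBudgetExceededException : Exception
{
    public TokenBudgetExceededException(string message) : base(message) { }
}

public static class BudgetGuard
{
    // Mechanism 1 (assumed shape): the OperationCanceledException carries the
    // budget exception as its InnerException.
    public static bool IsBudgetExceeded(OperationCanceledException ex) =>
        ex.InnerException is TokenBudgetExceededException;

    // Runs a call with the budget token and maps the outcome to a status string.
    public static async Task<string> RunAsync(
        Func<CancellationToken, Task<string>> call,
        CancellationToken budgetToken)
    {
        try
        {
            return await call(budgetToken);
        }
        catch (OperationCanceledException ex) when (IsBudgetExceeded(ex))
        {
            // Thrown by the middleware during a direct agent run.
            return "budget-exceeded";
        }
        catch (OperationCanceledException) when (budgetToken.IsCancellationRequested)
        {
            // Mechanism 2: the tracker's token was cancelled during a workflow run.
            return "budget-token-cancelled";
        }
    }
}
```

Any other OperationCanceledException, for example one caused by an unrelated token, is not caught here and propagates to the caller.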
Constructors¶
TokenBudgetChatMiddleware(IChatClient, ITokenBudgetTracker, IProgressReporterAccessor) Constructor¶
public TokenBudgetChatMiddleware(Microsoft.Extensions.AI.IChatClient innerClient, NexusLabs.Needlr.AgentFramework.Budget.ITokenBudgetTracker tracker, NexusLabs.Needlr.AgentFramework.Progress.IProgressReporterAccessor progressAccessor);
Parameters¶
innerClient Microsoft.Extensions.AI.IChatClient
The inner chat client to delegate to.
tracker NexusLabs.Needlr.AgentFramework.Budget.ITokenBudgetTracker
The token budget tracker scoped to the current pipeline run.
progressAccessor NexusLabs.Needlr.AgentFramework.Progress.IProgressReporterAccessor
Progress reporter accessor for emitting budget events.
Methods¶
TokenBudgetChatMiddleware.GetResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken) Method¶
Sends chat messages and returns the response.
public override System.Threading.Tasks.Task<Microsoft.Extensions.AI.ChatResponse> GetResponseAsync(System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages, Microsoft.Extensions.AI.ChatOptions? options, System.Threading.CancellationToken cancellationToken);
Parameters¶
messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>
The sequence of chat messages to send.
options Microsoft.Extensions.AI.ChatOptions
The chat options with which to configure the request.
cancellationToken System.Threading.CancellationToken
The System.Threading.CancellationToken to monitor for cancellation requests. The default is System.Threading.CancellationToken.None.
Implements GetResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken)
Returns¶
System.Threading.Tasks.Task<Microsoft.Extensions.AI.ChatResponse>
The response messages generated by the client.