ContextWindowGuardMiddleware
NexusLabs.Needlr.AgentFramework.Workflows¶
NexusLabs.Needlr.AgentFramework.Workflows.Budget¶
ContextWindowGuardMiddleware Class¶
A Microsoft.Extensions.AI.DelegatingChatClient safety net that estimates cumulative context size across LLM calls and emits a warning when the conversation approaches a configurable limit. Optionally prunes the oldest non-system messages to keep the context under that limit.
Inheritance System.Object 🡒 Microsoft.Extensions.AI.DelegatingChatClient 🡒 ContextWindowGuardMiddleware
Remarks¶
This middleware is a safety net for FunctionInvokingChatClient (FIC) usage
where conversation history accumulates. It does NOT replace the iterative loop
pattern — prefer NexusLabs.Needlr.AgentFramework.Iterative.IIterativeAgentLoop
for tool-heavy stages. Use this middleware on stages that remain FIC-based as a
guard against context window overflow.
Token estimation is approximate: each message's text content length is divided by CharsPerToken (default 4) since exact tokenization requires a model-specific tokenizer. This is conservative — it may trigger warnings earlier than necessary, but never later.
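The heuristic above can be sketched as follows. This is a minimal illustration of the described chars-per-token estimate, not the library's actual implementation; `TokenEstimator` and `EstimateTokens` are hypothetical names introduced here:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TokenEstimator
{
    // Hypothetical sketch of the heuristic described above: sum the text
    // length of every message and divide by CharsPerToken (default 4).
    public static int EstimateTokens(IEnumerable<string> messageTexts, int charsPerToken = 4)
        => messageTexts.Sum(t => t.Length) / charsPerToken;
}
```

For example, a 400-character history estimates to 100 tokens under the default divisor; a model-specific tokenizer may count differently, which is why the guard treats the estimate as conservative.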
Constructors¶
ContextWindowGuardMiddleware(IChatClient, int, IProgressReporterAccessor, double, bool) Constructor¶
public ContextWindowGuardMiddleware(Microsoft.Extensions.AI.IChatClient innerClient, int maxContextTokens, NexusLabs.Needlr.AgentFramework.Progress.IProgressReporterAccessor progressAccessor, double warningThreshold=0.8, bool pruneOnOverflow=false);
Parameters¶
innerClient Microsoft.Extensions.AI.IChatClient
The inner chat client to delegate to.
maxContextTokens System.Int32
Estimated maximum context window size in tokens. When the estimated size of the message list exceeds this limit, a warning is emitted and, if pruneOnOverflow is enabled, the oldest non-system messages are removed.
progressAccessor NexusLabs.Needlr.AgentFramework.Progress.IProgressReporterAccessor
Progress reporter for emitting warning events.
warningThreshold System.Double
Fraction of maxContextTokens at which to emit a warning.
Defaults to 0.8 (80%).
pruneOnOverflow System.Boolean
When true, automatically removes oldest non-system messages to keep estimated context under maxContextTokens. Defaults to false (warn only).
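A wiring sketch using the signature above. `baseClient` and `progressAccessor` are placeholders for instances you would normally resolve from your DI container, and the token budget is an illustrative value, not a recommendation:

```csharp
// Wrap an existing IChatClient in the guard. Argument names match the
// constructor signature documented above.
IChatClient guarded = new ContextWindowGuardMiddleware(
    innerClient: baseClient,            // assumed: your underlying chat client
    maxContextTokens: 128_000,          // illustrative context budget
    progressAccessor: progressAccessor, // assumed: resolved from DI
    warningThreshold: 0.8,              // warn at 80% of the budget (the default)
    pruneOnOverflow: false);            // warn only; true also prunes oldest non-system messages
```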
Properties¶
ContextWindowGuardMiddleware.CharsPerToken Property¶
Approximate characters per token for estimation. Defaults to 4.
Property Value¶
Methods¶
ContextWindowGuardMiddleware.GetResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken) Method¶
Sends chat messages and returns the response.
public override System.Threading.Tasks.Task<Microsoft.Extensions.AI.ChatResponse> GetResponseAsync(System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages, Microsoft.Extensions.AI.ChatOptions? options=null, System.Threading.CancellationToken cancellationToken=default(System.Threading.CancellationToken));
Parameters¶
messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>
The sequence of chat messages to send.
options Microsoft.Extensions.AI.ChatOptions
The chat options with which to configure the request.
cancellationToken System.Threading.CancellationToken
The System.Threading.CancellationToken to monitor for cancellation requests. The default is System.Threading.CancellationToken.None.
Implements GetResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken)
Returns¶
System.Threading.Tasks.Task<Microsoft.Extensions.AI.ChatResponse>
The response messages generated by the client.
Exceptions¶
System.ArgumentNullException
messages is null.
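A minimal call sketch. It assumes `guarded` is a ContextWindowGuardMiddleware instance (as constructed above) and `ct` is a CancellationToken; the message types are the Microsoft.Extensions.AI types named in the signature:

```csharp
var messages = new List<ChatMessage>
{
    new(ChatRole.System, "You are a concise assistant."),
    new(ChatRole.User, "Summarize the meeting notes."),
};

// The guard estimates the context size of `messages` before delegating to
// the inner client; a warning is reported once the threshold is crossed.
ChatResponse response = await guarded.GetResponseAsync(messages, cancellationToken: ct);
Console.WriteLine(response.Text);
```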
ContextWindowGuardMiddleware.GetStreamingResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken) Method¶
Sends chat messages and streams the response.
public override System.Collections.Generic.IAsyncEnumerable<Microsoft.Extensions.AI.ChatResponseUpdate> GetStreamingResponseAsync(System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages, Microsoft.Extensions.AI.ChatOptions? options=null, System.Threading.CancellationToken cancellationToken=default(System.Threading.CancellationToken));
Parameters¶
messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>
The sequence of chat messages to send.
options Microsoft.Extensions.AI.ChatOptions
The chat options with which to configure the request.
cancellationToken System.Threading.CancellationToken
The System.Threading.CancellationToken to monitor for cancellation requests. The default is System.Threading.CancellationToken.None.
Implements GetStreamingResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken)
Returns¶
System.Collections.Generic.IAsyncEnumerable<Microsoft.Extensions.AI.ChatResponseUpdate>
The response updates generated by the client, streamed incrementally.
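A streaming sketch under the same assumptions as the non-streaming example (`guarded`, `messages`, and `ct` are placeholders defined by the caller):

```csharp
// Updates arrive incrementally; the guard's context-size check applies to
// the input messages before the stream begins.
await foreach (ChatResponseUpdate update in
    guarded.GetStreamingResponseAsync(messages, cancellationToken: ct))
{
    Console.Write(update.Text);
}
```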