
Namespace: NexusLabs.Needlr.AgentFramework.Workflows.Budget

ContextWindowGuardMiddleware Class

A Microsoft.Extensions.AI.DelegatingChatClient safety net that estimates cumulative context size across LLM calls and emits a warning when approaching a configurable limit. Optionally prunes the oldest non-system messages to keep the context under the limit.

public sealed class ContextWindowGuardMiddleware : Microsoft.Extensions.AI.DelegatingChatClient

Inheritance System.Object 🡒 Microsoft.Extensions.AI.DelegatingChatClient 🡒 ContextWindowGuardMiddleware

Remarks

This middleware is a safety net for FunctionInvokingChatClient (FIC) usage where conversation history accumulates. It does NOT replace the iterative loop pattern — prefer NexusLabs.Needlr.AgentFramework.Iterative.IIterativeAgentLoop for tool-heavy stages. Use this middleware on stages that remain FIC-based as a guard against context window overflow.

Token estimation is approximate: each message's text content length is divided by CharsPerToken (default 4) since exact tokenization requires a model-specific tokenizer. This is conservative — it may trigger warnings earlier than necessary, but never later.
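The estimation heuristic above can be sketched as follows. `TokenEstimator` and `EstimateTokens` are hypothetical names for illustration, not part of the class's public API; the real middleware keeps this logic internal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of the chars-per-token heuristic described above.
static class TokenEstimator
{
    // Sum each message's text length and divide by charsPerToken
    // (default 4), rounding up so partial tokens still count.
    public static int EstimateTokens(IEnumerable<string> messageTexts, int charsPerToken = 4)
    {
        long totalChars = messageTexts.Sum(t => (long)(t?.Length ?? 0));
        return (int)((totalChars + charsPerToken - 1) / charsPerToken);
    }
}
```

With maxContextTokens of 8000 and the default warningThreshold of 0.8, a warning would fire once the estimate crosses 6400 tokens.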

Constructors

ContextWindowGuardMiddleware(IChatClient, int, IProgressReporterAccessor, double, bool) Constructor

public ContextWindowGuardMiddleware(Microsoft.Extensions.AI.IChatClient innerClient, int maxContextTokens, NexusLabs.Needlr.AgentFramework.Progress.IProgressReporterAccessor progressAccessor, double warningThreshold=0.8, bool pruneOnOverflow=false);

Parameters

innerClient Microsoft.Extensions.AI.IChatClient

The inner chat client to delegate to.

maxContextTokens System.Int32

Estimated maximum context window size in tokens. When the estimated token count of the message list exceeds this, a warning is emitted and, optionally, the oldest messages are pruned.

progressAccessor NexusLabs.Needlr.AgentFramework.Progress.IProgressReporterAccessor

Progress reporter for emitting warning events.

warningThreshold System.Double

Fraction of maxContextTokens at which to emit a warning. Defaults to 0.8 (80%).

pruneOnOverflow System.Boolean

When true, automatically removes oldest non-system messages to keep estimated context under maxContextTokens. Defaults to false (warn only).
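A minimal wiring sketch, assuming `innerClient` and `progressAccessor` already exist in the host application (for example, resolved from dependency injection). The 128,000-token window is an illustrative value; use your model's actual context size.

```csharp
using Microsoft.Extensions.AI;
using NexusLabs.Needlr.AgentFramework.Progress;
using NexusLabs.Needlr.AgentFramework.Workflows.Budget;

static IChatClient WrapWithGuard(
    IChatClient innerClient,
    IProgressReporterAccessor progressAccessor)
{
    // Warn at 80% of an estimated 128K-token window, and prune the
    // oldest non-system messages on overflow instead of only warning.
    return new ContextWindowGuardMiddleware(
        innerClient,
        maxContextTokens: 128_000,
        progressAccessor,
        warningThreshold: 0.8,
        pruneOnOverflow: true);
}
```

Because the guard is itself an IChatClient, it can be layered with other DelegatingChatClient middleware; place it outermost so it sees the full message list before any inner client runs.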

Properties

ContextWindowGuardMiddleware.CharsPerToken Property

Approximate characters per token for estimation. Defaults to 4.

public int CharsPerToken { get; set; }

Property Value

System.Int32

Methods

ContextWindowGuardMiddleware.GetResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken) Method

Sends chat messages and returns the response.

public override System.Threading.Tasks.Task<Microsoft.Extensions.AI.ChatResponse> GetResponseAsync(System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages, Microsoft.Extensions.AI.ChatOptions? options=null, System.Threading.CancellationToken cancellationToken=default(System.Threading.CancellationToken));

Parameters

messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>

The sequence of chat messages to send.

options Microsoft.Extensions.AI.ChatOptions

The chat options with which to configure the request.

cancellationToken System.Threading.CancellationToken

The System.Threading.CancellationToken to monitor for cancellation requests. The default is System.Threading.CancellationToken.None.

Implements GetResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken)

Returns

System.Threading.Tasks.Task<Microsoft.Extensions.AI.ChatResponse>
The response messages generated by the client.

Exceptions

System.ArgumentNullException
messages is null.
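A usage sketch, assuming `guardedClient` is an already-constructed ContextWindowGuardMiddleware. The call site is identical to any other IChatClient; the guard estimates context size before delegating to the inner client.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

static async Task<string> AskAsync(
    IChatClient guardedClient,
    List<ChatMessage> history,
    string question)
{
    // Accumulating history is exactly the scenario the guard watches.
    history.Add(new ChatMessage(ChatRole.User, question));
    ChatResponse response = await guardedClient.GetResponseAsync(history);
    return response.Text;
}
```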

ContextWindowGuardMiddleware.GetStreamingResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken) Method

Sends chat messages and streams the response.

public override System.Collections.Generic.IAsyncEnumerable<Microsoft.Extensions.AI.ChatResponseUpdate> GetStreamingResponseAsync(System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage> messages, Microsoft.Extensions.AI.ChatOptions? options=null, System.Threading.CancellationToken cancellationToken=default(System.Threading.CancellationToken));

Parameters

messages System.Collections.Generic.IEnumerable<Microsoft.Extensions.AI.ChatMessage>

The sequence of chat messages to send.

options Microsoft.Extensions.AI.ChatOptions

The chat options with which to configure the request.

cancellationToken System.Threading.CancellationToken

The System.Threading.CancellationToken to monitor for cancellation requests. The default is System.Threading.CancellationToken.None.

Implements GetStreamingResponseAsync(IEnumerable<ChatMessage>, ChatOptions, CancellationToken)

Returns

System.Collections.Generic.IAsyncEnumerable<Microsoft.Extensions.AI.ChatResponseUpdate>
The response updates generated by the client.

Exceptions

System.ArgumentNullException
messages is null.
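A streaming sketch under the same assumption of an already-constructed `guardedClient`. Presumably the guard applies its estimate (and optional pruning) once, up front, before the inner client's stream begins.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

static async Task StreamAsync(
    IChatClient guardedClient,
    IReadOnlyList<ChatMessage> history)
{
    // Consume the guarded stream update by update.
    await foreach (ChatResponseUpdate update in
        guardedClient.GetStreamingResponseAsync(history))
    {
        Console.Write(update.Text);
    }
}
```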