TechBooky

Apple Enhances On-Device AI for Better Context in iOS 26.4

by Paul Balo
March 23, 2026
in Artificial Intelligence

Apple is tightening up how developers manage the limited context window for its on-device Foundation Models, introducing new tools in iOS 26.4 Release Candidate that make token usage easier to track and control.

Like most large language models, Apple’s Foundation Models rely on a context window: a fixed number of tokens available to hold system instructions, user prompts and model responses. On Apple’s on-device models, that window is relatively small at 4,096 tokens. In chat-style apps where prompts and replies accumulate, that capacity can be exhausted quickly.

When the limit is hit, the framework throws an .exceededContextWindowSize error and the model can no longer respond within the same session. To recover, developers must spin up a new session and re-establish the necessary state so the user’s workflow can continue without a jarring interruption.
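The recovery pattern described above can be sketched as follows. The error case `.exceededContextWindowSize` is named in the article; the session type `LanguageModelSession`, its `respond(to:)` and `transcript` members, and the `condense(_:)` helper are assumptions for illustration, not confirmed API.

```swift
import FoundationModels

// App-defined summariser (illustrative stub): condense an old transcript
// into a short instruction string for the replacement session.
func condense(_ transcript: Transcript) -> String {
    "Continue the conversation. Earlier context (summarised): …"
}

// Hypothetical sketch: when the context window fills, the old session is
// unusable, so spin up a fresh one seeded with condensed state and retry.
func respondWithRecovery(
    _ session: inout LanguageModelSession,
    prompt: String
) async throws -> String {
    do {
        return try await session.respond(to: prompt).content
    } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
        // Re-establish the necessary state in a new session.
        let summary = condense(session.transcript)
        session = LanguageModelSession(instructions: summary)
        return try await session.respond(to: prompt).content
    }
}
```

The key design point is that recovery happens behind one call site, so the user sees a seamless continuation rather than a dead conversation.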

Apple’s recent work pushes developers to treat the context window as a constrained resource, much like memory in a low-resource system. Instead of assuming the model will always have room, apps are expected to plan for how that space is used and reclaimed over time.

Apple has previously published technical guidance with practical strategies for working within the limit. Those recommendations include:

  • Splitting large tasks into multiple language model sessions instead of trying to handle everything in one long conversation.
  • Requesting shorter answers from the model to reduce token consumption per response.
  • Trimming prompts, for example by summarising earlier parts of a conversation or keeping only the most relevant turns.
  • Using tool calling efficiently so the model doesn’t waste tokens on unnecessary context.
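Two of the strategies above, splitting work across sessions and capping response length, might look like this in practice. The article confirms the framework exists; `LanguageModelSession`, `GenerationOptions` and its `maximumResponseTokens` parameter are assumed API shapes, so treat this as a sketch rather than a definitive implementation.

```swift
import FoundationModels

// Sketch: one short-lived session per chunk keeps each context window
// small, and a response-token cap limits token spend per reply.
func summarise(chunks: [String]) async throws -> [String] {
    var results: [String] = []
    for chunk in chunks {
        // Fresh session per chunk, instead of one long conversation.
        let session = LanguageModelSession(
            instructions: "Summarise the given text in two sentences."
        )
        // Cap the reply length (assumed GenerationOptions parameter).
        let options = GenerationOptions(maximumResponseTokens: 120)
        let response = try await session.respond(to: chunk, options: options)
        results.append(response.content)
    }
    return results
}
```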

These approaches help reduce the likelihood of hitting the 4,096-token ceiling, but they don’t remove the need for precise accounting. Developers still have to understand what is contributing to token usage at any given time.

iOS 26.4 RC adds new capabilities to the Foundation Models framework aimed squarely at that problem. A new contextSize property on SystemLanguageModel exposes the available context capacity. Rather than hard-coding the 4,096-token maximum, apps can query contextSize directly, making token-aware logic more robust against future changes.
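Querying the capacity instead of hard-coding it might look like the following. `SystemLanguageModel` and `contextSize` come from the article; the integer type and the 25% headroom reserved for the reply are assumptions.

```swift
import FoundationModels

// Minimal sketch: read the capacity from the model rather than
// hard-coding 4,096, per the new contextSize property.
let model = SystemLanguageModel.default
let capacity = model.contextSize  // e.g. 4,096 on current devices

// Reserve headroom for the model's reply; the 25% split is illustrative.
let promptBudget = capacity - capacity / 4
print("Context capacity: \(capacity), prompt budget: \(promptBudget)")
```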

Complementing that, a tokenCount(for:) method lets developers measure how many tokens a given input will consume. This becomes the basis for what is effectively token bookkeeping: before sending prompts, tools or other data to the model, the app can estimate their token cost and adapt accordingly.
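That bookkeeping could be sketched as a simple trimming loop: measure the cost of the pending transcript and drop the oldest turns until it fits. `tokenCount(for:)` and `contextSize` are named in the article; the throwing signature, the `turns` representation and the budget arithmetic are assumptions.

```swift
import FoundationModels

// Token bookkeeping sketch: estimate the prompt's cost up front and
// trim oldest turns until it fits within a budget that leaves room
// for the model's reply.
func fit(turns: [String], in model: SystemLanguageModel) throws -> [String] {
    let budget = model.contextSize - model.contextSize / 4  // headroom for the reply
    var kept = turns
    while kept.count > 1,
          try model.tokenCount(for: kept.joined(separator: "\n")) > budget {
        kept.removeFirst()  // drop the oldest turn first
    }
    return kept
}
```

Calling this before each `respond` keeps the app from ever submitting a prompt it already knows will overflow the window.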

According to a practical walkthrough by developer Artem Novichkov, effective context management means accounting for every element that contributes to the window. That includes the system prompt, all user instructions and the model’s own responses. It also extends to tool usage, which can be a hidden source of token bloat.

When tools are involved, their definitions – including the tool’s name, description and argument schema – are serialised and sent alongside the instructions. This additional metadata can significantly increase the token count, eating into the context budget faster than developers might expect.

Novichkov’s article refers to a tokenUsage(for:) method; in the latest iOS 26.4 Release Candidate, that API appears under the name tokenCount(for:). The new additions to the Foundation Models framework are marked with @backDeployed(before: iOS 26.4, macOS 26.4, visionOS 26.4), which makes them available on earlier OS versions that already support the framework, not just on devices running iOS 26.4 and its desktop and visionOS counterparts.

This combination – knowing the actual context capacity via contextSize and measuring consumption via tokenCount(for:) – gives developers the raw data they need to manage the 4,096-token window more intelligently. It does not fully solve the complexity of deciding what to keep, summarise or discard in a live conversation, but it lays the groundwork for more predictable, user-friendly on-device AI experiences.



Tags: AI, Apple, iOS 26.4
Paul Balo

Paul Balo is the founder of TechBooky and a highly skilled wireless communications professional with a strong background in cloud computing, offering extensive experience in designing, implementing, and managing wireless communication systems.

© 2025 Designed By TechBooky Elite
