Cengiz Özemli
As large language models (LLMs) spread into edge computing, memory capacity has become a significant obstacle. A collaboration between ASRock Industrial and Phison Electronics targets this problem by pairing edge AI hardware with SSD-based memory expansion technology, so that larger local AI models can run with less system memory.
─────────────────────────
🧠 Memory Constraints and Solutions in Edge AI
As organizations move AI workloads closer to where data is generated, memory capacity has become a critical limiting factor. Large language models typically require significant DRAM or VRAM resources, which increases system cost and restricts deployment in compact edge devices.
The joint solution addresses this challenge by combining ASRock Industrial's edge AI platforms with Phison's aiDAPTIV memory expansion architecture. The goal is to extend model capacity beyond what system memory alone can hold, easing the constraints that limit local AI inference in enterprise and industrial applications.
─────────────────────────
⚙️ AI Memory Expansion Architecture Details
The collaboration integrates ASRock Industrial's AI BOX-A395 platform with Phison's aiDAPTIV technology. According to the companies, this system enables a 120-billion-parameter large language model to run locally with 64 GB of system memory and 85 GB of aiDAPTIV cache memory.
This architecture is designed to reduce traditional memory requirements by up to 50% compared to conventional deployments of similarly sized models. Instead of relying solely on DRAM or VRAM, the solution uses SSD-based memory expansion to store and access model data during inference operations.
By creating an additional high-speed storage layer optimized for AI workloads, the platform can support larger models while preserving system memory resources for operating systems, edge software agents, and concurrent applications.
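As a rough sanity check on the figures above, the weight footprint of a model can be estimated from its parameter count and numeric precision. The sketch below is illustrative only: the companies do not state which precision the 120-billion-parameter model uses, so `bits_per_param` is an assumption, and the function names are hypothetical. It counts weights only and ignores the KV cache, activations, and runtime overhead.

```python
def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


def fits_in_tiers(num_params: float, bits_per_param: int,
                  dram_gb: float, cache_gb: float) -> bool:
    """True if the weights fit in system memory plus the SSD-backed cache."""
    return weight_footprint_gb(num_params, bits_per_param) <= dram_gb + cache_gb


if __name__ == "__main__":
    PARAMS = 120e9   # 120 billion parameters (from the announcement)
    DRAM_GB = 64     # system memory (from the announcement)
    CACHE_GB = 85    # aiDAPTIV cache (from the announcement)

    # Precision is NOT given in the announcement; these values are assumptions.
    for bits in (4, 8, 16):
        size = weight_footprint_gb(PARAMS, bits)
        ok = fits_in_tiers(PARAMS, bits, DRAM_GB, CACHE_GB)
        print(f"{bits:>2}-bit weights: {size:.0f} GB, fits in 64 + 85 GB: {ok}")
```

Under these assumptions, 8-bit weights (about 120 GB) would fit within the combined 149 GB of system memory and cache, while 16-bit weights (about 240 GB) would not.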
─────────────────────────
⚡ Edge AI Platforms and Inference Performance
The implementation includes the AI BOX-A395 platform, powered by AMD Ryzen AI Max+ 395 processors, and the NUC Ultra 300 BOX Series, based on Intel Core Ultra Series 3 processors.
The architecture takes advantage of Mixture of Experts (MoE) models, in which only a selected subset of the model's components (the experts) is activated for each processing step. This reduces the computational load, and the amount of model data that must sit in fast memory at any moment, compared with dense architectures that activate the full model for every inference request.
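To make the idea of selective activation concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a generic illustration of how MoE layers are commonly described, not the routing used by any specific model or by the ASRock Industrial and Phison system; the names `top_k_route` and `moe_forward` are hypothetical, and the experts are toy scalar functions.

```python
import math
from typing import Callable, List, Sequence, Tuple


def top_k_route(gate_logits: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)[:k]
    peak = max(gate_logits[i] for i in ranked)  # subtract max for stability
    exps = [math.exp(gate_logits[i] - peak) for i in ranked]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(ranked, exps)]


def moe_forward(x: float, gate_logits: Sequence[float],
                experts: Sequence[Callable[[float], float]], k: int = 2) -> float:
    """Evaluate only the selected experts and blend their outputs."""
    return sum(weight * experts[i](x) for i, weight in top_k_route(gate_logits, k))


if __name__ == "__main__":
    # Four toy experts; only two of them are evaluated per call.
    experts = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3, lambda v: v * v]
    logits = [0.1, 2.0, -1.0, 1.5]  # made-up gate scores for illustration
    print(top_k_route(logits, k=2))
    print(moe_forward(3.0, logits, experts, k=2))
```

Because the unselected experts are never evaluated, their weights do not need to be in fast memory for that step, which is what makes MoE models a natural fit for tiered memory schemes.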
In configurations equipped with 128 GB of memory, model data can be dynamically streamed to SSD cache resources, enabling the system to allocate available DRAM and VRAM to additional workloads. This allows a single edge device to perform AI inference while simultaneously supporting other software processes.
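The general pattern of keeping only recently used model data in memory and fetching the rest from storage on demand can be sketched as a small least-recently-used (LRU) cache. This is an assumption-laden illustration of tiered caching in general, not a description of how aiDAPTIV works internally; the class `ExpertCache`, the file naming scheme, and the byte blobs standing in for weights are all hypothetical.

```python
import os
import tempfile
from collections import OrderedDict


class ExpertCache:
    """Keep at most `capacity` weight blobs in memory; load the rest from disk."""

    def __init__(self, directory: str, capacity: int) -> None:
        self.directory = directory
        self.capacity = capacity
        self._resident: "OrderedDict[int, bytes]" = OrderedDict()
        self.disk_reads = 0  # counts cache misses served from storage

    def _path(self, expert_id: int) -> str:
        return os.path.join(self.directory, f"expert_{expert_id}.bin")

    def get(self, expert_id: int) -> bytes:
        if expert_id in self._resident:
            # Hit: mark as most recently used.
            self._resident.move_to_end(expert_id)
            return self._resident[expert_id]
        # Miss: read from the storage tier.
        with open(self._path(expert_id), "rb") as handle:
            blob = handle.read()
        self.disk_reads += 1
        self._resident[expert_id] = blob
        if len(self._resident) > self.capacity:
            # Evict the least recently used entry to stay within budget.
            self._resident.popitem(last=False)
        return blob

    def resident_ids(self) -> list:
        return list(self._resident)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Write four placeholder "expert" files to stand in for weights on SSD.
        for expert_id in range(4):
            with open(os.path.join(tmp, f"expert_{expert_id}.bin"), "wb") as fh:
                fh.write(bytes([expert_id]) * 1024)

        cache = ExpertCache(tmp, capacity=2)
        for expert_id in (0, 1, 0, 2, 1):
            cache.get(expert_id)
        print("disk reads:", cache.disk_reads)
        print("resident:", cache.resident_ids())
```

A real system would also need to overlap storage reads with computation and manage much larger blocks, but the trade-off is the same: a smaller in-memory budget in exchange for occasional reads from the storage tier.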
─────────────────────────
📊 AI System Selection and Deployment Support
To simplify deployment planning, ASRock Industrial will integrate verified aiDAPTIV configurations into its AI-Pathfinder platform. This tool is designed to help customers identify suitable hardware configurations, accelerator cards, GPUs, and edge AI systems based on their workload characteristics and deployment requirements.
The goal is to reduce system design complexity and provide guidance for selecting infrastructure that can support specific AI models and operational environments.
If the companies' figures hold up in practice, the collaboration could make it practical to run larger AI models locally on compact edge systems.


















