IBM Introduces Content-Aware Storage for RAG Workloads

April 24, 2026
IBM has unveiled a content-aware storage (CAS) architecture that embeds AI data processing directly within the storage layer. This approach is tailored for retrieval-augmented generation (RAG) workflows, as it integrates document vectorization into the storage system itself—cutting down on the need for external preprocessing pipelines.

CAS transfers a key RAG function—document embedding via large language model (LLM)-based methods—into the storage infrastructure. This allows enterprises to process and index data in its existing location, aligning storage systems with AI-driven workloads and minimizing data movement across different infrastructure tiers. IBM positions this as a means to simplify deployment while boosting performance and enhancing data locality for AI applications.

Vector Database at Scale


At the heart of IBM’s CAS implementation lies a vector database optimized for semantic search. Vector databases support approximate nearest-neighbor (ANN) search, enabling AI systems to retrieve relevant data chunks based on similarity metrics like cosine similarity or L2 distance. This capability is fundamental to RAG, where user queries are converted into vectors and matched against indexed enterprise data to deliver context-aware responses.
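As an illustration of the retrieval step described above, the similarity metrics and nearest-neighbor ranking can be sketched in a few lines of NumPy. This is a toy brute-force scan, not IBM's implementation; ANN indexes approximate exactly this ranking while avoiding the full scan at billion-vector scale.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: dot product of the two vectors, normalized.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def l2_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Euclidean (L2) distance between two embeddings.
    return float(np.linalg.norm(a - b))

def top_k(query: np.ndarray, index: np.ndarray, k: int = 3) -> list[int]:
    # Exact (brute-force) search: rank every stored vector by cosine
    # similarity to the query. ANN methods approximate this ranking
    # without scanning the whole index.
    scores = index @ query / (np.linalg.norm(index, axis=1) * np.linalg.norm(query))
    return [int(i) for i in np.argsort(-scores)[:k]]

rng = np.random.default_rng(0)
index = rng.normal(size=(1000, 64))             # 1,000 stored 64-dim embeddings
query = index[42] + 0.01 * rng.normal(size=64)  # slightly perturbed copy of one
print(top_k(query, index))                      # vector 42 ranks first
```

The "recall" figures quoted later in the article measure how often an ANN index returns the same neighbors this exhaustive scan would.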


[Figure: IBM CAS chart. Source: IBM]

IBM Research, in collaboration with Samsung and NVIDIA, showcased a prototype system capable of scaling to 100 billion vectors on a single server. The system achieved over 90 percent recall and precision, with an average query latency of under 700 milliseconds. This scale caters to enterprise environments where datasets can span billions of files and, once fully indexed, grow to hundreds of billions of vectors.

RAG Pipeline Integration


RAG is becoming a favored approach for enterprise AI, as it enhances output accuracy without the need for model retraining. It works by supplementing prompts with enterprise-specific data retrieved from a vector database.

The pipeline starts with data ingestion, where documents such as PDFs and presentations are parsed, split into chunks, and converted into embeddings. These embeddings are stored in a vector database that organizes data for efficient similarity search. During querying, user input is embedded and matched against stored vectors, with relevant content passed to the language model as context. This grounding mechanism reduces hallucinations and increases trust in AI-generated outputs.
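The ingest-then-query flow just described can be sketched end to end in a few dozen lines. This is a minimal illustration, not IBM's pipeline: the hashing-based `embed` function is a stand-in for a real LLM embedding model, and the brute-force lookup stands in for a vector database.

```python
import hashlib
import numpy as np

DIM = 256

def embed(text: str) -> np.ndarray:
    # Toy stand-in for an LLM embedder: hash each token into a bucket of a
    # fixed-size vector, then normalize. Real pipelines use learned models.
    vec = np.zeros(DIM)
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % DIM] += 1.0
    n = np.linalg.norm(vec)
    return vec / n if n else vec

def chunk(doc: str, size: int = 8) -> list[str]:
    # Ingestion step: split a parsed document into fixed-size word chunks.
    words = doc.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# Ingestion: chunk documents and store their embeddings in an "index".
docs = ["storage systems keep enterprise data close to compute",
        "retrieval augmented generation grounds model answers in indexed data"]
chunks = [c for d in docs for c in chunk(d)]
index = np.stack([embed(c) for c in chunks])

# Query: embed the user input, retrieve the most similar chunk, and pass
# it to the language model as grounding context.
question = "how does retrieval augmented generation work"
best = chunks[int(np.argmax(index @ embed(question)))]
prompt = f"Context: {best}\n\nQuestion: {question}"
print(best)  # retrieves the RAG-related chunk
```

In a CAS deployment, the chunking, embedding, and index steps above would run inside the storage layer rather than in an external preprocessing pipeline.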

IBM’s CAS integrates this entire pipeline directly into storage, consolidating ingestion, indexing, and retrieval in close proximity to the data.

Addressing Scale and Cost Challenges


Enterprise storage systems already operate at petabyte scale. When extended to CAS, each file can generate hundreds of vectors, quickly expanding the dataset size. Traditional vector databases typically scale out across multiple servers, introducing additional costs and operational complexity. Indexing and reindexing large datasets also become time-consuming tasks.
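A back-of-envelope calculation shows how quickly vector data outgrows its source files. All parameters below are illustrative assumptions for the sketch (embedding dimensionality and vectors per file are not IBM's published figures):

```python
# Rough sizing of CAS-style vector growth; every parameter is an
# illustrative assumption, not a figure from IBM.
files = 1_000_000_000        # 1 billion files in the storage system
vectors_per_file = 200       # "hundreds of vectors" per chunked file
dim = 1024                   # assumed embedding dimensionality
bytes_per_float = 4          # float32 components

total_vectors = files * vectors_per_file
raw_bytes = total_vectors * dim * bytes_per_float

print(f"{total_vectors:,} vectors")                      # 200,000,000,000
print(f"{raw_bytes / 2**40:.0f} TiB of raw embeddings")  # 745 TiB, before index overhead
```

Hundreds of tebibytes of derived embeddings on top of the source data is why vector density and indexing overhead, discussed next, dominate the cost picture.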

IBM’s approach focuses on improving vector density and reducing indexing overhead to limit infrastructure sprawl. The architecture separates vector and index storage from query compute, enabling independent scaling of storage and compute resources. This is made possible by IBM Storage Scale and its high-performance parallel file system.

Storage and Hardware Architecture


The CAS implementation leverages the IBM Storage Scale System 6000 (ESS 6000), an all-flash platform designed for AI and high-performance workloads. The system supports up to 48 NVMe drives per 4U enclosure, with individual drive capacities ranging from 7 TB to 60 TB. It integrates PCIe Gen5, 400 Gb InfiniBand, or 200 Gb Ethernet connectivity, delivering up to 340 GB/s read and 175 GB/s write throughput per node, along with up to 7 million IOPS.

The platform also supports NVIDIA GPUDirect Storage, facilitating direct data paths between storage and GPUs, as well as BlueField-3 DPUs to offload network and data processing tasks.

Samsung PM9D3a PCIe Gen5 NVMe SSDs provide high-throughput, high-density storage. Based on eighth-generation TLC V-NAND, these drives offer up to 30.72 TB per device, with sequential read speeds of up to 12 GB/s and write speeds of up to 6.8 GB/s. The use of commercially available enterprise SSDs allows the architecture to scale using standard components.

Hierarchical Indexing and GPU Acceleration


To tackle indexing at scale, IBM developed a hierarchical indexing model consisting of multiple sub-indexes that can be optimized independently. This structure enables incremental updates and localized reindexing without disrupting the entire dataset, improving both availability and operational efficiency.
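The structure of such a hierarchy can be sketched as follows. This is a simplified illustration of the idea (centroid routing over independently rebuildable partitions), not IBM's actual index design:

```python
import numpy as np

class SubIndex:
    """One independently rebuildable partition of the global index."""
    def __init__(self, vectors: np.ndarray):
        self.vectors = vectors
        self.centroid = vectors.mean(axis=0)  # summary used for query routing

    def search(self, q: np.ndarray, k: int):
        scores = self.vectors @ q
        return [(float(scores[j]), int(j)) for j in np.argsort(-scores)[:k]]

class HierarchicalIndex:
    """Routes each query to the most promising sub-indexes; updating or
    rebuilding one partition never disturbs the others."""
    def __init__(self, partitions):
        self.subs = [SubIndex(p) for p in partitions]

    def search(self, q: np.ndarray, k: int = 3, probe: int = 2):
        # Probe only the `probe` sub-indexes whose centroids best match q.
        ranked = sorted(range(len(self.subs)),
                        key=lambda i: -float(self.subs[i].centroid @ q))
        hits = []
        for i in ranked[:probe]:
            hits += [(s, (i, j)) for s, j in self.subs[i].search(q, k)]
        return sorted(hits, reverse=True)[:k]

    def reindex_partition(self, i: int, vectors: np.ndarray):
        # Localized reindexing: rebuild a single sub-index in place.
        self.subs[i] = SubIndex(vectors)

rng = np.random.default_rng(0)
# Four partitions, each clustered around its own offset direction.
parts = [rng.normal(size=(50, 16)) + 5 * np.eye(16)[i] for i in range(4)]
parts[2][0] = np.full(16, 3.0)  # plant a known vector in partition 2
idx = HierarchicalIndex(parts)
print(idx.search(np.full(16, 3.0), k=1, probe=4)[0][1])  # (2, 0)
```

Because each `SubIndex` owns its own vectors, swapping one out via `reindex_partition` leaves the rest of the hierarchy serving queries untouched, which is the availability property the article describes.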

GPU acceleration drastically reduces indexing time compared to CPU-only approaches. Tasks that would take hours on CPUs can be completed in minutes using NVIDIA GPUs. In testing, building indexes for 100 billion vectors took 4 days with 6 NVIDIA H200 GPUs, compared to an estimated 120 days on a dual-socket CPU system.

The full dataset, including vectors and indexes, consumed approximately 153 TiB of storage. Initial data loading and partitioning took nine days. The resulting system delivered an average query latency of 694ms with 90% recall, validated against brute-force ground-truth calculations.
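A quick sanity check of these published figures shows what they imply per vector and for GPU acceleration; the inputs are the article's numbers, the derived values are simple arithmetic:

```python
# Derived values from the reported prototype figures (153 TiB total,
# 100 billion vectors, ~120 estimated CPU days vs. 4 measured GPU days).
total_bytes = 153 * 2**40          # 153 TiB, vectors plus indexes
vectors = 100_000_000_000          # 100 billion vectors
bytes_per_vector = total_bytes / vectors
gpu_speedup = 120 / 4              # estimated CPU days vs. measured GPU days

print(f"{bytes_per_vector:.0f} bytes/vector incl. index")  # 1682
print(f"{gpu_speedup:.0f}x faster indexing on GPUs")       # 30x
```

Roughly 1.7 KB per vector including index overhead is far below the raw footprint of a typical full-precision embedding, consistent with the vector-density focus described above.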

Roadmap


IBM and NVIDIA are continuing to optimize the platform, focusing on reducing indexing and query latency. Current targets include indexing 100 billion or more vectors within a single day, cutting data ingestion time from nine days to one day, and lowering query latency to the 50-100 millisecond range while maintaining 90 percent recall.

Integrating vector indexing into standard file systems aims to simplify deployment and lower barriers to enterprise AI adoption. By embedding RAG capabilities directly into storage, IBM is positioning CAS as a foundational layer for AI-enabled infrastructure.
