MetaKernel: Enabling Efficient Encrypted Neural Network Inference through Unified MVM and Convolution

Aug 20, 2025 • cmplr

Abstract

Practical encrypted neural network inference under the CKKS fully homomorphic encryption (FHE) scheme relies heavily on accelerating two key kernel operations: Matrix-Vector Multiplication (MVM) and Convolution (Conv). However, existing solutions—such as expert-tuned libraries and domain-specific languages—are designed in an ad hoc manner, leading to significant inefficiencies caused by excessive rotations.

We introduce MKR, a novel composition-based compiler approach that optimizes MVM and Conv kernel operations for DNN models under CKKS within a unified framework. MKR decomposes each kernel into composable units, called MetaKernels, to enhance SIMD parallelism within ciphertexts (via horizontal batching) and computational parallelism across them (via vertical batching). Our approach tackles previously unaddressed challenges, including reducing rotation overhead through a rotation-aware cost model for data packing, while also ensuring high slot utilization, uniform handling of inputs with arbitrary sizes, and compatibility with the output tensor layout. Implemented in a production-quality FHE compiler, MKR achieves inference time speedups of 10.08×–185.60× for individual MVM and Conv kernels and 1.75×–11.84× for end-to-end inference compared to a state-of-the-art FHE compiler. Moreover, MKR enables homomorphic execution of large DNN models, where prior methods fail, significantly advancing the practicality of FHE compilers.
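To see why rotations dominate, consider the classic diagonal (Halevi–Shoup) method for homomorphic matrix-vector multiplication: an n×n MVM costs one ciphertext rotation per diagonal. The plaintext simulation below is our own illustrative sketch of that rotation-heavy pattern (the function names are not MKR's API); it is this rotation count that MKR's rotation-aware packing seeks to reduce.

```python
# Plaintext simulation of the rotation-heavy MVM pattern under CKKS-style
# slot packing. A ciphertext packs a vector into SIMD "slots"; the diagonal
# method computes M @ v using only cyclic rotations and slot-wise ops.
# All names here are illustrative, not part of MKR.

def rotate(v, k):
    """Cyclic left rotation of the slot vector (models applying a CKKS rotation key)."""
    return v[k:] + v[:k]

def diag_mvm(matrix, vec):
    """Compute matrix @ vec with one rotation per diagonal -- n rotations
    for an n x n matrix, the cost that dominates encrypted inference."""
    n = len(vec)
    acc = [0.0] * n
    for k in range(n):
        # k-th generalized diagonal of the matrix (plaintext, so "free")
        diag = [matrix[i][(i + k) % n] for i in range(n)]
        rot = rotate(vec, k)                      # the expensive operation
        acc = [a + d * r for a, d, r in zip(acc, diag, rot)]
    return acc

M = [[1, 2], [3, 4]]
v = [5, 6]
print(diag_mvm(M, v))  # [17.0, 39.0], i.e. M @ v
```

Batching multiple independent MVMs into the unused slots of one ciphertext (horizontal batching) or across ciphertexts (vertical batching) amortizes these rotations, which is the opportunity the MetaKernel decomposition exploits.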

ACM Reference Format

Peng Yuan, Yan Liu, JianXin Lai, Long Li, Tianxiang Sui, Linjie Xiao, Xiaojing Zhang, Qing Zhu, and Jingling Xue. 2025. MetaKernel: Enabling Efficient Encrypted Neural Network Inference through Unified MVM and Convolution. Proc. ACM Program. Lang. 9, OOPSLA2, Article 317 (October 2025), 28 pages. https://doi.org/10.1145/3763095

[Paper Download]