
International Journal of Sensor Networks and Data Communications


A Compiler-Aware Framework of Network Pruning and Architecture Search for Mobile Acceleration

Abstract

Zhengang Li

With the increasing demand to deploy DNNs efficiently on mobile edge devices, reducing unnecessary computation and increasing execution speed have become much more important. Prior methods towards this goal, including model compression and network architecture search (NAS), are largely performed independently and do not fully consider compiler-level optimization, which is essential for mobile acceleration. In this work, we propose NPAS, a compiler-aware unified network pruning and architecture search framework, together with comprehensive compiler optimizations that support different DNNs and different pruning schemes, bridging the gap between weight pruning and NAS. Our framework achieves ImageNet inference times of 6.7 ms, 5.9 ms, and 3.9 ms with Top-1 accuracy of 78%, 75% (MobileNet-V3 level), and 71% (MobileNet-V2 level), respectively, on an off-the-shelf mobile phone, consistently outperforming prior work.

Disclaimer: This abstract was translated using artificial intelligence tools and has not yet been reviewed or verified.
