Ibrahim, Muhammad Sohail ; Usman, Muhammad ; Lee, Jeong-A.

USEFUSE: Uniform stride for enhanced performance in fused layer architecture of deep neural networks

Ibrahim, Muhammad Sohail, Usman, Muhammad and Lee, Jeong-A. (2025) USEFUSE: Uniform stride for enhanced performance in fused layer architecture of deep neural networks. Journal of Systems Architecture 166, p. 103459.

Publication date of this full text: 04 Jun 2025 09:15
Article
DOI for citing this document: 10.5283/epub.76770


Abstract

Convolutional Neural Networks (CNNs) are crucial in various applications, but deploying them on resource-constrained edge devices poses challenges. This study presents the Sum-of-Products (SOP) units for convolution, which utilize low-latency left-to-right bit-serial arithmetic to minimize response time and enhance overall performance. The study proposes a methodology for fusing multiple convolution layers to reduce off-chip memory communication and increase the overall performance. An effective mechanism detects and skips inefficient convolutions after ReLU layers, minimizing power consumption without compromising accuracy. Additionally, efficient tile movement guarantees uniform access to the fusion pyramid. An analysis demonstrates the uniform stride strategy improves operational intensity. Two designs cater to varied demands: one focuses on minimal response time for mission-critical applications, and another focuses on resource-constrained devices with comparable latency. This approach notably reduced redundant computations, improving the efficiency of CNN deployment on edge devices.
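
The fusion pyramid mentioned in the abstract can be illustrated with the standard receptive-field recurrence for stacked convolutions. The sketch below is not taken from the paper: the function names `pyramid_base_size` and `pyramid_base_step` are hypothetical, padding is ignored, and each layer is described only by an assumed (kernel, stride) pair. It shows how large an input tile a fused stack needs for a given output tile, and how far that tile moves at the input when the output tile advances by a fixed step.

```python
from math import prod


def pyramid_base_size(output_tile, layers):
    """Input tile width needed to produce `output_tile` outputs.

    `layers` is a list of (kernel, stride) pairs ordered from the first
    fused layer to the last. Padding is ignored (assumption).
    """
    size = output_tile
    # Walk backwards from the last layer to the first.
    for kernel, stride in reversed(layers):
        size = (size - 1) * stride + kernel
    return size


def pyramid_base_step(output_step, layers):
    """Distance the input tile moves when the output tile moves by `output_step`."""
    return output_step * prod(stride for _, stride in layers)


# Two fused 3x3 convolutions with stride 1 (illustrative values only).
layers = [(3, 1), (3, 1)]
base = pyramid_base_size(1, layers)
step = pyramid_base_step(1, layers)
overlap = base - step  # input columns shared by neighbouring pyramids
print(base, step, overlap)
```

The overlap between neighbouring pyramid bases is the part of the input that adjacent tiles share, which is where recomputation or buffering cost arises in a fused design.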




Details

Document type: Article
Journal title: Journal of Systems Architecture
Publisher: Elsevier
Volume: 166
Page range: p. 103459
Date: 27 May 2025
Institutions: Informatik und Data Science > Fachbereich Bioinformatik > Lehrstuhl für Bildverarbeitung (Prof. Dr.-Ing. Dorit Merhof)
Identification number: 10.1016/j.sysarc.2025.103459 (DOI)
Keywords: Convolution neural network, Online arithmetic, Most-significant-digit-first arithmetic, CNN acceleration, Layer fusion
Dewey Decimal Classification: 000 Computer science, information science, general works > 004 Computer science
Status: Published
Peer reviewed: Yes, this version has been peer reviewed
Created at the University of Regensburg: Partially
URN of the University Library of Regensburg: urn:nbn:de:bvb:355-epub-767708
Document ID: 76770
