1. Thesis introduction / motivation:
Neuromorphic computing (NC) aims to revolutionize the field of artificial intelligence. It involves designing and implementing electronic systems that emulate the behavior of biological neurons using specialized hardware such as field-programmable gate arrays (FPGAs) or dedicated neuromorphic chips [1, 2]. NC systems are designed to be highly efficient, optimized for low power consumption and high parallelism [3]. They are adaptable to changing environments and can learn during operation, making them well suited for solving dynamic and unpredictable problems [4].
However, the use of NC for solving real-life problems is currently limited because the performance of spiking neural networks (SNNs), the neural networks employed in NC, lags behind that of traditional computing systems, such as specialized deep-learning hardware, in both accuracy and learning speed [5, 6]. Several factors contribute to this gap: SNNs are harder to train because they require specialized training algorithms [7, 8]; they are more sensitive to hyperparameters, being dynamic systems with complex interactions [9]; they require specialized (neuromorphic) datasets, which are currently scarce and limited in size [10]; and the range of functions SNNs can approximate is more limited than that of traditional artificial neural networks (ANNs) [11]. Before NC can have a more significant impact on AI and computing technology, these SNN-related challenges must be addressed.
2. Contents of the study:
This dissertation aims to reduce the performance gap between neuromorphic computing systems and traditional computing systems, especially in solving pattern recognition tasks. To do so, we address the issues described above with two approaches.
First, we improve SNN performance using Auxiliary Learning (AL) [12]. AL is a technique used in ANNs in which the network is trained on the main task and on one or more additional (auxiliary) tasks. The additional tasks force the network to find more general and robust parameters. Using AL, however, requires careful selection of the auxiliary tasks as well as of the method for combining multiple tasks during training [13]. In our architecture, the network consists of a feature extraction block connected in a feed-forward fashion to the main-task and auxiliary-task blocks. The feature extraction block processes the spiking input signal into a latent p-dimensional spiking feature vector, which is then fed into the main and auxiliary task classifier blocks to produce the outputs. The idea behind this architecture is to let the feature extraction block receive feedback during training from the main classifier block (the main task loss) as well as from the auxiliary task classifier block(s) (the auxiliary task losses). The network is implemented using the SpikingJelly framework for SNN simulation [14], and the approach is validated on the DVS-CIFAR10 [15] and DVS128-Gesture [16] neuromorphic datasets. A minimal sketch of this architecture follows.
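The following sketch assumes SpikingJelly's activation_based API; the layer sizes, the number of classes per task, and the loss weight lambda_aux are illustrative placeholders rather than the dissertation's actual values.

    import torch
    import torch.nn as nn
    from spikingjelly.activation_based import neuron, functional

    class AuxiliarySNN(nn.Module):
        def __init__(self, in_features=2048, p=128, n_main=10, n_aux=4):
            super().__init__()
            # Shared feature extraction block: maps the spiking input to a
            # latent p-dimensional spiking feature vector.
            self.features = nn.Sequential(nn.Linear(in_features, p), neuron.LIFNode())
            # Main-task and auxiliary-task classifier blocks (feed-forward heads).
            self.main_head = nn.Sequential(nn.Linear(p, n_main), neuron.LIFNode())
            self.aux_head = nn.Sequential(nn.Linear(p, n_aux), neuron.LIFNode())

        def forward(self, x_seq):
            # x_seq: [T, batch, in_features] spike tensor; firing rates are
            # obtained by averaging the output spikes over the T time steps.
            main_out, aux_out = 0.0, 0.0
            for x in x_seq:
                z = self.features(x)
                main_out = main_out + self.main_head(z)
                aux_out = aux_out + self.aux_head(z)
            return main_out / len(x_seq), aux_out / len(x_seq)

    def training_step(net, x_seq, y_main, y_aux, optimizer, lambda_aux=0.5):
        # The feature block receives gradients from both losses, which is the
        # regularization effect AL relies on.
        criterion = nn.CrossEntropyLoss()
        main_out, aux_out = net(x_seq)
        loss = criterion(main_out, y_main) + lambda_aux * criterion(aux_out, y_aux)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        functional.reset_net(net)  # clear membrane states between samples
        return loss.item()

Here lambda_aux plays the role of the loss-weighting constant discussed in the conclusion: set too high, the auxiliary task dominates training; set to zero, the model reduces to single-task training.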
Second, we improve SNN performance by designing a new architecture whose operation can be modified by changing the firing threshold, in an attempt to exploit the dynamic capabilities of SNNs. The proposed network can learn two different tasks while performing only one of them at a time [17]. The task the network performs is selected by modulating the firing threshold of the spiking neurons. This operation is inspired by the neuromodulation property of biological neurons, which can regulate (modify) their internal dynamics based on external stimuli [18]. We refer to our proposed network as the multi-task spiking neural network (MT-SNN). MT-SNN consists of three blocks, each built of one or more spiking neuron layers connected in a feed-forward fashion. SLAYER, a spike-based backpropagation algorithm, is used to train the system [19]. We present experiments and results from implementing MT-SNN in Intel's Lava neuromorphic framework for multi-task classification on the NMNIST neuromorphic dataset. The threshold-modulation mechanism is sketched below.
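To make the mechanism concrete, here is a minimal, framework-free sketch of a discrete-time LIF layer whose firing threshold is set externally to select the task. It illustrates the principle only; it is not the SLAYER/Lava implementation, and the per-task threshold values are hypothetical.

    import torch

    class ModulatedLIF:
        def __init__(self, n_neurons, tau=2.0):
            self.v = torch.zeros(n_neurons)  # membrane potentials
            self.tau = tau                   # membrane time constant
            self.v_threshold = 1.0           # modulated externally, per task

        def set_task(self, task_id, thresholds=(1.0, 0.5)):
            # Neuromodulation analogue: each task is assigned its own firing
            # threshold, switching the layer's dynamics without changing weights.
            self.v_threshold = thresholds[task_id]

        def step(self, input_current):
            # Leaky integration, threshold comparison, and hard reset.
            self.v = self.v + (input_current - self.v) / self.tau
            spikes = (self.v >= self.v_threshold).float()
            self.v = self.v * (1.0 - spikes)  # reset the neurons that fired
            return spikes

Changing the threshold makes the same weighted inputs produce a different spiking regime, which is what allows a single trained network to behave differently for each task.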
Further, we enhance both approaches by using advanced spiking neuron models and neuromorphic data augmentation. Specifically, we focus on the parametric leaky integrate-and-fire (PLIF) neuron model [20]. PLIF neurons are modified leaky integrate-and-fire neurons in which not only the weights but also the membrane time constants are trained. Using this neuron introduces neuron variability, an important property for achieving network robustness. Directly training the membrane time constant has the additional benefit of eliminating its hand tuning, which alleviates the hyperparameter-sensitivity issue described above. We also use neuromorphic data augmentation to reduce the overfitting and unstable convergence that arise during SNN training [10]. A sketch of the PLIF neuron follows.
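The sketch below shows the learnable-time-constant idea, assuming the sigmoid reparameterization from Fang et al. [20]; SpikingJelly also provides this neuron as neuron.ParametricLIFNode. The initial value of tau is illustrative.

    import torch
    import torch.nn as nn

    class PLIF(nn.Module):
        def __init__(self, init_tau=2.0):
            super().__init__()
            # Learn w such that 1/tau = sigmoid(w): this keeps 1/tau in (0, 1)
            # while making the membrane time constant trainable by backprop.
            self.w = nn.Parameter(-torch.log(torch.tensor(init_tau - 1.0)))
            self.v = None  # membrane potential, allocated on first call

        def forward(self, input_current):
            if self.v is None:
                self.v = torch.zeros_like(input_current)
            one_over_tau = torch.sigmoid(self.w)
            self.v = self.v + (input_current - self.v) * one_over_tau
            spikes = (self.v >= 1.0).float()  # unit threshold, hard reset
            # NOTE: in training, a surrogate gradient replaces the hard
            # threshold's derivative; it is omitted here for brevity.
            self.v = self.v * (1.0 - spikes)
            return spikes

Because tau is now an ordinary parameter, each layer (or neuron) can converge to its own time constant, providing the variability mentioned above without manual tuning.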
In terms of implementation, we test the developed networks on the Loihi2 neuromorphic chip [21], to which we have access through an agreement between VCU and Intel. The developed software will be added to Loihi2's existing software library, Lava. Accuracy, memory requirements, energy consumption, and latency are used to measure the performance of the developed SNNs on a variety of tasks over neuromorphic/event-based data.
3. Conclusion:
NC systems and the SNNs they employ have great potential for developing low-power, adaptable AI. However, challenges such as training complexity, hyperparameter selection, computational flexibility, and scarcity of training data hinder their wider use.
In this dissertation, we aim to increase the adoption of NC by enhancing the performance of SNNs. To achieve this goal, we proposed two SNN architectures that address these limitations. The first architecture uses auxiliary learning to improve training performance and data efficiency. It consists of a feature extraction block connected in a feed-forward fashion to a main classification block and one or more auxiliary task classification blocks. The auxiliary tasks supply additional information during training that helps regularize the feature extraction block. As a result, the feature extraction block is forced to learn more general and robust features, which improves network performance on the main task. Our experiments confirm that using AL during training improves performance; the improvement, however, depends on a careful selection of the auxiliary task(s) and on tuning of the loss-weighting constant. The presented results were obtained by simulation only, using the SpikingJelly neuromorphic library.
The second architecture, the Multi-task Spiking Neural Network (MT-SNN), leverages the neuromodulation capabilities of spiking neurons to enhance multitask performance. Specifically, firing threshold modulation is used to modify the network's operation, following a single-tasking-of-multiple-tasks approach. Results from experiments run on Intel's Lava neuromorphic simulation platform show that MT-SNN predicts both tasks with only slightly lower accuracy than a single-task SNN (ST-SNN). Additionally, comparing firing threshold modulation against external input current modulation shows that the firing threshold yields higher accuracy.
While our experiments demonstrate the effectiveness of the proposed architectures, they also reveal limitations worth studying in future work. One such limitation is the exclusive use of LIF neurons. Future work could explore more complex neuron models, such as the Izhikevich (IZ) neuron; utilizing IZ neurons, however, would require a version compatible with spike-based backpropagation.
4. Bibliography:
[1] C. Mead. "Neuromorphic electronic systems". In: Proceedings of the IEEE 78.10 (1990), pp. 1629–1636. doi: 10.1109/5.58356.
[2] Steve Furber. "Large-scale neuromorphic computing systems". In: Journal of Neural Engineering 13.5 (Aug. 2016), p. 051001. doi: 10.1088/1741-2560/13/5/051001. url: https://dx.doi.org/10.1088/1741-2560/13/5/051001.
[3] Geoffrey W. Burr et al. "Neuromorphic computing using non-volatile memory". In: Advances in Physics: X 2.1 (2017), pp. 89–124. doi: 10.1080/23746149.2016.1259585. eprint: https://doi.org/10.1080/23746149.2016.1259585.
[4] Shufang Zhao et al. "Neuromorphic-computing-based adaptive learning using ion dynamics in flexible energy storage devices". In: National Science Review 9.11 (Aug. 2022). nwac158. issn: 2095-5138. doi: 10.1093/nsr/nwac158.
[5] Aboozar Taherkhani et al. "A Review of Learning in Biologically Plausible Spiking Neural Networks". In: Neural Netw. 122.C (Feb. 2020), pp. 253–272. issn: 0893-6080. doi: 10.1016/j.neunet.2019.09.036.
[6] Michael Pfeiffer and Thomas Pfeil. "Deep Learning With Spiking Neurons: Opportunities and Challenges". In: Frontiers in Neuroscience 12 (2018). issn: 1662-453X. doi: 10.3389/fnins.2018.00774.
[7] Jason K. Eshraghian et al. Training Spiking Neural Networks Using Lessons From Deep Learning. 2021. doi: 10.48550/ARXIV.2109.12894.
[8] Amirhossein Tavanaei et al. "Deep learning in spiking neural networks". In: Neural Networks 111 (2019), pp. 47–63. issn: 0893-6080. doi: 10.1016/j.neunet.2018.12.002.
[9] C. Koch, M. Rapp, and I. Segev. "A brief history of time (constants)". In: Cerebral Cortex 6.2 (Mar. 1996), pp. 93–101.
[10] Yuhang Li et al. "Neuromorphic Data Augmentation for Training Spiking Neural Networks". In: Computer Vision – ECCV 2022. Ed. by Shai Avidan et al. Cham: Springer Nature Switzerland, 2022, pp. 631–649. isbn: 978-3-031-20071-7.
[11] Andrew R. Barron. "Approximation and estimation bounds for artificial neural networks". In: Machine Learning 14.1 (Jan. 1994), pp. 115–133. issn: 1573-0565. doi: 10.1007/BF00993164.
[12] Shikun Liu, Andrew Davison, and Edward Johns. "Self-Supervised Generalisation with Meta Auxiliary Learning". In: Advances in Neural Information Processing Systems. Ed. by H. Wallach et al. Vol. 32. Curran Associates, Inc., 2019.
[13] Trevor Standley et al. Which Tasks Should Be Learned Together in Multi-task Learning? 2019. doi: 10.48550/ARXIV.1905.07553. url: https://arxiv.org/abs/1905.07553.
[14] Wei Fang et al. SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: 2023-01-15. 2020.
[15] Hongmin Li et al. "CIFAR10-DVS: An Event-Stream Dataset for Object Classification". In: Frontiers in Neuroscience 11 (2017). issn: 1662-453X. doi: 10.3389/fnins.2017.00309.
[16] Arnon Amir et al. "A Low Power, Fully Event-Based Gesture Recognition System". In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017, pp. 7388–7397. doi: 10.1109/CVPR.2017.781.
[17] Kevis-Kokitsi Maninis, Ilija Radosavovic, and Iasonas Kokkinos. "Attentive Single-Tasking of Multiple Tasks". In: CoRR abs/1904.08918 (2019). arXiv: 1904.08918. url: http://arxiv.org/abs/1904.08918.
[18] Eve Marder. "Neuromodulation of neuronal circuits: back to the future". In: Neuron 76.1 (Oct. 2012), pp. 1–11. doi: 10.1016/j.neuron.2012.09.010.
[19] Sumit Bam Shrestha and Garrick Orchard. SLAYER: Spike Layer Error Reassignment in Time. 2018. doi: 10.48550/ARXIV.1810.08646.
[20] Wei Fang et al. Incorporating Learnable Membrane Time Constant to Enhance Learning of Spiking Neural Networks. 2020. doi: 10.48550/ARXIV.2007.05785.
[21] Garrick Orchard et al. "Efficient Neuromorphic Signal Processing with Loihi 2". In: CoRR abs/2111.03746 (2021). arXiv: 2111.03746. url: https://arxiv.org/abs/2111.03746.