Most multi-layer neural networks used in deep learning rely on rectified linear neurons. In our previous papers, we showed that if we want to use the exact same activation function for all neurons, then the rectified linear function is indeed a reasonable choice. However, preliminary analysis shows that for some applications it is more advantageous to use different activation functions for different neurons – i.e., to select a family of activation functions and then choose the parameters of each neuron's activation function during training. Specifically, this was shown for a special family of squashing functions that contains rectified linear neurons as a particular case. In this paper, we explain the empirical success of squashing functions by showing that the formulas describing this family follow from natural symmetry requirements.
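The family described above can be illustrated with one published parametrization of the squashing function: a differentiable approximation of the cutting (clamp) function. The parameter names a, b and beta below follow that common form, but the exact formula used in the cited papers should be checked against them; this is a minimal sketch.

```python
import math

def logaddexp(u, v):
    """Numerically stable log(exp(u) + exp(v))."""
    m = max(u, v)
    return m + math.log1p(math.exp(min(u, v) - m))

def squashing(x, a=0.0, b=1.0, beta=50.0):
    """One parametrization of the squashing function S_{a,b}^beta(x):
    a smooth approximation of clamp((x - a) / (b - a), 0, 1).
    As beta grows it converges to the hard cut function, while any
    finite beta keeps the function differentiable everywhere."""
    return (logaddexp(0.0, beta * (x - a))
            - logaddexp(0.0, beta * (x - b))) / (beta * (b - a))

def cut(x, a=0.0, b=1.0):
    """The hard cutting function that the squashing function approximates."""
    return min(1.0, max(0.0, (x - a) / (b - a)))
```

For moderate-to-large beta the squashing output is already close to the cut function, e.g. squashing(0.25, beta=200.0) is within 1e-2 of cut(0.25), while remaining smooth at the corner points a and b.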
One-dimensional objects such as nanowires (NWs) have proven to be building blocks in novel applications due to their unique functionalities. In the realm of magnetic materials, iron oxides form an important class, providing potential solutions in catalysis, magnetic devices, drug delivery, and sensing. Accurate analysis of composition and spatial structure is crucial to describe the mechanical behavior and to optimize design strategies for multi-component NWs. Atom probe tomography offers a unique analytic characterization tool to map the (re-)distribution of the constituents, leading to deeper insight into NW growth, thermally assisted kinetics, and related mechanisms. As NW-based devices critically rely on the mechanical properties of NWs, appropriate mechanical modeling with the resulting material constants is also in high demand and can open novel routes to potential applications. Here, we report a compositional and structural study of quasi-ceramic one-dimensional objects: α-Fe ⊕ α-FeOOH (goethite) ⊕ Pt and α-Fe ⊕ α-Fe3O4 (magnetite) ⊕ Pt core–shell NWs. We provide a theoretical model for the elastic behavior with terms accounting for geometrical and mechanical nonlinearity, prior and subsequent to thermal treatment. The as-deposited system, with a homogeneous distribution of the constituents, demonstrates strikingly different structural and elastic features from those observed after annealing, as shown by atom probe tomography, energy-dispersive spectroscopy, analytic electron microscopy, and a micromanipulator nanoprobe system.
During annealing at 350 °C for 20 h, (i) compositional partitioning between phases (α-Fe, α-Fe3O4 and, to a minor extent, α-Fe2O3) takes place via diffusional solid–solid phase transformations, (ii) a distinct new shell develops, (iii) the degree of crystallinity increases, and (iv) nanosized precipitates of the evolving phases are detected, leading to a considerable change in the description of the elastic material properties. The as-deposited nanowires already exhibit large maximum strains (1–8%) and stresses (3–13 GPa) in moderately large bending tests, which increase further after annealing, reaching maxima of about 2.5–10.5% and 6–18 GPa, respectively. As a constitutive parameter, the strain-dependent stretch modulus clearly captures the changes in the material properties as the deformation progresses.
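The notion of a strain-dependent stretch modulus can be sketched with a toy quadratic constitutive law. The functional form and coefficients below are invented for illustration only; they are not the paper's model or fitted to its data.

```python
def stress(strain, y1=300.0, y2=1500.0):
    """Toy quadratic constitutive law sigma = y1*eps + y2*eps**2 (GPa).
    The quadratic term stands in for geometric/mechanical nonlinearity;
    the coefficients are illustrative, not taken from the paper."""
    return y1 * strain + y2 * strain ** 2

def stretch_modulus(strain, y1=300.0, y2=1500.0):
    """Strain-dependent (tangent) stretch modulus d(sigma)/d(eps):
    for the quadratic law above it grows linearly with strain."""
    return y1 + 2.0 * y2 * strain
```

With such a law the tangent modulus at zero strain equals y1, and it increases as deformation progresses, which is the qualitative behavior a strain-dependent stretch modulus describes.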
Over the past few years, deep neural networks have shown excellent results in multiple tasks; however, there is still a growing need to address the problem of interpretability to improve model transparency, performance, and safety. Achieving eXplainable Artificial Intelligence (XAI) by combining neural networks with continuous logic and multi-criteria decision-making tools is one of the most promising ways to approach this problem: through this combination, the black-box nature of neural models can be reduced. The continuous-logic-based neural model uses so-called Squashing activation functions, a parametric family of functions that satisfy natural invariance requirements and contain rectified linear units as a particular case. This work presents the first benchmark tests that measure the performance of Squashing functions in neural networks. Three experiments were carried out to examine their usability, and a comparison with the most popular activation functions was made for five different network types. Performance was determined by measuring accuracy, loss, and time per epoch. These benchmarks show that Squashing functions are usable in practice and perform comparably to conventional activation functions. Moreover, a further experiment implemented nilpotent logical gates to demonstrate how simple classification tasks can be solved successfully and with high performance. The results indicate that, thanks to the embedded nilpotent logical operators and the differentiability of the Squashing function, it is possible to solve classification problems where other commonly used activation functions fail.
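A micro-benchmark in the spirit of the comparison above can be sketched as follows. This is not the paper's test suite: it only times a forward pass of each activation over identical inputs, and the squashing form used is one published parametrization assumed here for illustration.

```python
import math
import timeit

def relu(x):
    return x if x > 0.0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def squashing(x, a=0.0, b=1.0, beta=50.0):
    # Assumed parametrization: a smooth approximation of
    # clamp((x - a) / (b - a), 0, 1), evaluated stably via
    # log(1 + e^z) = max(0, z) + log1p(e^{-|z|}).
    za, zb = beta * (x - a), beta * (x - b)
    num = max(0.0, za) + math.log1p(math.exp(-abs(za)))
    den = max(0.0, zb) + math.log1p(math.exp(-abs(zb)))
    return (num - den) / (beta * (b - a))

inputs = [i / 100.0 - 2.0 for i in range(400)]  # 400 points in [-2, 2)

# Time 100 forward passes per activation over the same inputs.
timings = {
    name: timeit.timeit(lambda f=f: [f(x) for x in inputs], number=100)
    for name, f in [("relu", relu), ("sigmoid", sigmoid), ("squashing", squashing)]
}
```

A real benchmark would of course measure accuracy and loss of trained networks as well, as the paper does; this sketch only isolates the per-call cost of each activation.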
Interpretable neural networks based on continuous-valued logic and multicriteria decision operators
(2020)
Combining neural networks with continuous logic and multicriteria decision-making tools can reduce the black-box nature of neural models. In this study, we show that nilpotent logical systems offer an appropriate mathematical framework for hybridization of continuous nilpotent logic and neural models, helping to improve the interpretability and safety of machine learning. In our concept, perceptrons model soft inequalities, namely membership functions and continuous logical operators. We design the network architecture before training, using continuous logical operators and multicriteria decision tools with given weights working in the hidden layers. Designing the structure appropriately leads to a drastic reduction in the number of parameters to be learned. The theoretical basis offers a straightforward choice of activation functions (the cutting function or its differentiable approximation, the squashing function), and also suggests an explanation for the great success of the rectified linear unit (ReLU). In this study, we focus on the architecture of a hybrid model and introduce the building blocks for future applications in deep neural networks.
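The idea that perceptrons with the cutting activation realize nilpotent (Łukasiewicz-style) connectives can be sketched directly: fixing the weights and bias of a single cut-activated perceptron yields conjunction or disjunction. The operator forms are the standard nilpotent ones; the wiring below is a minimal sketch, not the paper's full architecture.

```python
def cut(x):
    """Cutting function: identity clipped to [0, 1] (replaced by its
    differentiable squashing approximation in a trainable network)."""
    return min(1.0, max(0.0, x))

def perceptron(inputs, weights, bias):
    """A single perceptron with the cutting function as activation."""
    return cut(sum(w * x for w, x in zip(weights, inputs)) + bias)

def nilpotent_and(x, y):
    # Lukasiewicz conjunction max(0, x + y - 1): weights (1, 1), bias -1.
    return perceptron([x, y], [1.0, 1.0], -1.0)

def nilpotent_or(x, y):
    # Lukasiewicz disjunction min(1, x + y): weights (1, 1), bias 0.
    return perceptron([x, y], [1.0, 1.0], 0.0)
```

Because the logical operators are fixed-weight perceptrons, only the membership-function layers need to be learned, which is the source of the parameter reduction mentioned above.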
The theories of multi-criteria decision-making (MCDM) and fuzzy logic both aim to model human thinking. In MCDM, aggregation processes and preference modeling play the central role. This paper suggests a consistent framework for modeling human thinking by using the tools of both fields: fuzzy logical operators as well as aggregation and preference operators. In this framework, aggregation, preference, and the logical operators are described by the same unary generator function. Just as implication is defined as a composition of the disjunction and negation operators, preference operators are introduced as a composition of the aggregative and negation operators. After a thorough examination of the main properties of the preference operator, our main goal is its implementation in neural networks. We show how preference can be modeled by a perceptron, and illustrate the results in practical neural applications.
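The composition described above can be sketched with a commonly cited form of the aggregative operator; that specific formula is an assumption here and should be checked against the paper. Preference of y over x is then the aggregation of the negation of x with y.

```python
def negation(x):
    """Standard strong negation n(x) = 1 - x."""
    return 1.0 - x

def aggregative(x, y):
    """A commonly cited form of the aggregative operator (assumed here):
    the representable uninorm a(x, y) = xy / (xy + (1-x)(1-y)),
    defined for arguments strictly between 0 and 1."""
    num = x * y
    return num / (num + (1.0 - x) * (1.0 - y))

def preference(x, y):
    """Preference as a composition of aggregation and negation:
    p(x, y) = a(n(x), y), the degree to which y is preferred over x."""
    return aggregative(negation(x), y)
```

With this construction, p(x, x) = 0.5 (indifference) and p(x, y) + p(y, x) = 1, which matches the intuitive behavior expected of a preference degree.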