Architectures for deep learning

Artificial Intelligence's Ascension

Connectionist systems have been around for almost 70 years, but new architectures and GPUs have propelled them to the forefront of artificial intelligence. Deep learning isn't a single technique; it is a family of algorithms and topologies that can be applied to a wide variety of problems. While deep learning is not a new concept, it is booming thanks to the convergence of deeply layered neural networks and the use of GPUs to accelerate their execution. Big data has also fuelled this expansion.

The structures and algorithms utilised in deep learning are many and diverse. This section looks at six deep learning architectures that have been developed during the last 20 years. Long short-term memory (LSTM) and convolutional neural networks (CNNs) are two of the oldest techniques on the list, but they're also two of the most popular.                                  

This article distinguishes between supervised and unsupervised deep learning architectures and introduces some common ones: convolutional neural networks (CNNs), recurrent neural networks (RNNs), long short-term memory (LSTM) and gated recurrent units (GRUs), self-organizing maps (SOMs), autoencoders (AEs), and restricted Boltzmann machines (RBMs). Deep belief networks (DBNs) and deep stacking networks (DSNs) are also discussed.

The basic architecture of deep learning is the artificial neural network (ANN). ANN has led to the development of a number of algorithm variants.

 

Supervised deep learning:

The problem space in which the objective to be predicted is explicitly labelled within the data used for training is referred to as supervised learning.

We present two of the most common supervised deep learning architectures, convolutional neural networks and recurrent neural networks, as well as some of their variations, at a high level in this section.

Convolutional neural networks:

A CNN is a multilayer neural network inspired by the animal visual cortex. The architecture is particularly useful in image-processing applications. Yann LeCun created the first CNN, which at the time was applied to handwritten character recognition, such as reading postal codes. The early layers of the deep network identify features (such as edges), which later layers recombine into higher-level features of the input. The LeNet CNN architecture is made up of several layers that perform feature extraction followed by classification, as the sketch below illustrates.
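To make the layered structure concrete, here is a minimal LeNet-style CNN sketched in PyTorch. The layer sizes, the 32x32 grayscale input, the ReLU activations, and the ten output classes are illustrative assumptions rather than a faithful reproduction of LeCun's original network.

# A minimal LeNet-style CNN sketch in PyTorch (sizes are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Early layers extract local features such as edges.
        self.conv1 = nn.Conv2d(1, 6, kernel_size=5)   # 1x32x32 -> 6x28x28
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)  # 6x14x14 -> 16x10x10
        self.pool = nn.MaxPool2d(2)                   # halves the spatial size
        # Later layers recombine those features and classify.
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.flatten(1)                              # keep the batch dimension
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)                            # class scores (logits)

# Example: a batch of 4 grayscale 32x32 images.
logits = LeNet()(torch.randn(4, 1, 32, 32))
print(logits.shape)  # torch.Size([4, 10])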

 

Recurrent neural networks:

The RNN is one of the foundational network architectures on which other deep learning architectures are built. The main difference between a conventional multilayer network and a recurrent network is that, rather than being fully feed-forward, a recurrent network may include connections that feed back into earlier layers (or into the same layer). This feedback lets RNNs retain a memory of past inputs and model problems that unfold over time. RNNs can be constructed in a number of ways; the LSTM is one prominent topology.
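As a sketch of this feedback idea, the following NumPy snippet implements a single Elman-style recurrent cell; the layer sizes, weight initialisation, and random input sequence are purely illustrative assumptions.

# A minimal Elman-style recurrent cell in NumPy (sizes are illustrative).
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 5, 7

# Weights for the current input, the previous hidden state, and a bias.
W_x = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

def step(x_t, h_prev):
    # The feedback connection: h_prev carries information about past inputs.
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

h = np.zeros(hidden_size)
for x_t in rng.normal(size=(seq_len, input_size)):
    h = step(x_t, h)          # the same cell is reused at every time step
print(h.shape)                # (5,)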

 

Unsupervised deep learning:

Unsupervised learning refers to a problem space in which the data being utilised for training has no target label. This section discusses three unsupervised deep learning architectures: self-organizing maps, autoencoders, and restricted Boltzmann machines. We also cover how deep belief networks and deep stacking networks are built on top of these underlying unsupervised architectures.

Restricted Boltzmann machines:

Though RBMs became popular much later, they were originally created in 1986 by Paul Smolensky, who called them Harmoniums. An RBM is a two-layer neural network consisting of a visible (input) layer and a hidden layer. In an RBM, every node in the hidden layer is connected to every node in the visible layer. In a conventional Boltzmann machine, nodes within the input and hidden layers are also linked to each other; in a restricted Boltzmann machine, these within-layer connections are removed to reduce computational complexity.
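The following NumPy sketch shows this bipartite structure along with one step of contrastive divergence (CD-1), a common way to train RBMs. The layer sizes, learning rate, and binary example are assumptions made for illustration, not details from the article.

# A minimal RBM with one step of contrastive divergence (CD-1) in NumPy.
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 3, 0.1

W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)   # visible biases
b_h = np.zeros(n_hidden)    # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0):
    # Every visible unit connects to every hidden unit, but there are no
    # visible-visible or hidden-hidden connections, so each layer can be
    # sampled in a single vectorised step.
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(n_hidden) < p_h0).astype(float)
    p_v1 = sigmoid(h0 @ W.T + b_v)          # reconstruction of the input
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # Contrastive divergence: data statistics minus reconstruction statistics.
    return lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))

v = (rng.random(n_visible) < 0.5).astype(float)   # one binary training example
W += cd1_update(v)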

 

Deep stacking networks:

The final architecture is the DSN, also known as a deep convex network. A DSN differs from typical deep learning frameworks in that it is really a deep set of individual networks, each with its own hidden layers, rather than a single deep network. This design addresses one of the problems with deep learning: the complexity of training. Because each layer in a deep learning architecture exponentially increases the difficulty of training, the DSN treats training as a set of individual training problems rather than a single problem.
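The sketch below illustrates the stacking idea in NumPy: each small module receives the raw input concatenated with the previous module's output, so modules can be handled one at a time. The module shape, the sizes, and the untrained random weights are simplifying assumptions, not a faithful DSN implementation.

# A rough sketch of DSN-style stacking in NumPy (all sizes are assumptions).
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, output_size, n_modules = 8, 16, 4, 3

def make_module(in_dim):
    # One single-hidden-layer module; in a DSN each module is trained
    # independently, which keeps every training problem small.
    return {"W1": rng.normal(scale=0.1, size=(in_dim, hidden_size)),
            "W2": rng.normal(scale=0.1, size=(hidden_size, output_size))}

def run_module(m, x):
    return np.tanh(x @ m["W1"]) @ m["W2"]

x = rng.normal(size=input_size)
y = np.zeros(output_size)
for k in range(n_modules):
    module = make_module(input_size + output_size)
    y = run_module(module, np.concatenate([x, y]))   # raw input + previous output
print(y.shape)   # (4,)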

 

Going a step further:

Deep learning is made up of a variety of architectures that may be used to solve a variety of problems. These solutions might be feed-forward or recurrent networks that take prior inputs into account. Although creating these sorts of deep architectures might be challenging, open source tools like Caffe, Deeplearning4j, TensorFlow, and DDL can help you get started quickly.
