Deep learning has had a profound impact on technology, powering intelligent applications across a multitude of fields. With the emergence of models such as CNNs (Convolutional Neural Networks), RNNs (Recurrent Neural Networks), LSTMs (Long Short-Term Memory networks), and GANs (Generative Adversarial Networks), deep learning has pushed the limits of artificial intelligence.

Most deep learning applications are built with what is known as a deep learning framework: a library or tool that lets you build deep learning models quickly and easily. Frameworks are preferred over writing…

Q-learning is an off-policy reinforcement learning algorithm that seeks to find the best action to take given the current state. It's considered off-policy because it learns from actions taken outside the policy it is learning, such as random exploratory actions, so the agent does not have to follow that policy while it learns. More specifically, Q-learning seeks to learn a policy that maximizes the total reward.
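To make the off-policy idea concrete, here is a minimal sketch of the standard tabular Q-learning update in C. The table sizes (`N_STATES`, `N_ACTIONS`) and the function names `greedy_action`, `epsilon_greedy`, and `q_update` are hypothetical and chosen only for illustration; they are not taken from the implementation discussed in this post. Note that the action is chosen by an exploratory epsilon-greedy rule, while the update always uses the best action in the next state.

```c
#include <assert.h>
#include <stdlib.h>

#define N_STATES  4   /* hypothetical number of states */
#define N_ACTIONS 2   /* hypothetical number of actions */

/* Index of the highest-valued action in one row of the Q-table. */
int greedy_action(const double q_row[N_ACTIONS]) {
    int best = 0;
    for (int a = 1; a < N_ACTIONS; a++) {
        if (q_row[a] > q_row[best]) best = a;
    }
    return best;
}

/* Behaviour policy: with probability epsilon take a random action,
 * otherwise take the greedy one. */
int epsilon_greedy(double q[N_STATES][N_ACTIONS], int s, double epsilon) {
    assert(s >= 0 && s < N_STATES);
    double u = (double)rand() / ((double)RAND_MAX + 1.0); /* u in [0, 1) */
    if (u < epsilon) return rand() % N_ACTIONS;
    return greedy_action(q[s]);
}

/* One Q-learning step:
 *   Q(s,a) += alpha * (reward + discount * max_a' Q(s_next,a') - Q(s,a))
 * The max over next actions is what makes the update off-policy: it does
 * not depend on which action the behaviour policy takes next. */
void q_update(double q[N_STATES][N_ACTIONS], int s, int a, double reward,
              int s_next, double alpha, double discount) {
    assert(s >= 0 && s < N_STATES && s_next >= 0 && s_next < N_STATES);
    assert(a >= 0 && a < N_ACTIONS);
    double best_next = q[s_next][greedy_action(q[s_next])];
    q[s][a] += alpha * (reward + discount * best_next - q[s][a]);
}
```

In a training loop, one would typically pick `a` with `epsilon_greedy`, observe the reward and next state from the environment, and then call `q_update`. Here `alpha` is the learning rate and `discount` is the discount factor, often written as gamma.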

Today we will find the shortest path connecting the Start and End vertices using Q-learning, implemented in C. For our implementation, we consider the following undirected, unweighted graph: