
gradient descent

Gradient descent is an optimization algorithm that minimizes a function by iteratively moving in the direction of steepest descent, which is given by the negative of the gradient. In machine learning, this function is typically a cost or loss function that measures the difference between the model's predicted values and the actual values. The algorithm updates the parameters of the model, such as the weights of a neural network, to reduce this cost.
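A single update can be written compactly as below. The notation is an assumption made here for illustration and does not come from the text above: θ stands for the model parameters, J for the cost function, and η for the step size (often called the learning rate).

```latex
\theta_{t+1} = \theta_t - \eta \, \nabla_\theta J(\theta_t)
```

As a small worked case under those assumptions, take J(θ) = θ² with θ₀ = 1 and η = 0.1. The gradient at θ₀ is 2θ₀ = 2, so θ₁ = 1 − 0.1 · 2 = 0.8, which is closer to the minimum at θ = 0.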

The process involves three main steps: calculating the gradient (the slope of the cost function) at the current parameters, updating the parameters in the direction opposite to the gradient by a step whose size is set by the learning rate, and repeating this process until the algorithm converges to a minimum of the cost function, which may be only a local minimum when the function is not convex. There are several variations of gradient descent, including batch, stochastic, and mini-batch, which differ in how much data is used to calculate the gradient at each iteration: the full dataset, a single example, or a small subset, respectively[1][2][3][4][5].
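The three steps and the three variants can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a reference implementation: the function name `gradient_descent`, the mean squared error cost, the linear model, the synthetic data, and all hyperparameter values are assumptions chosen for the example.

```python
import numpy as np

def gradient_descent(X, y, learning_rate=0.1, n_epochs=200, batch_size=None, seed=0):
    """Fit linear weights w by minimizing mean squared error (assumed cost).

    batch_size=None uses the full dataset (batch gradient descent),
    batch_size=1 gives stochastic gradient descent,
    and any value in between gives mini-batch gradient descent.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    if batch_size is None:
        batch_size = n_samples
    w = np.zeros(n_features)

    # Step 3: repeat for a fixed number of passes over the data.
    for _ in range(n_epochs):
        order = rng.permutation(n_samples)  # shuffle once per pass
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            X_b, y_b = X[idx], y[idx]
            # Step 1: gradient of the mean squared error at the current w.
            grad = 2.0 * X_b.T @ (X_b @ w - y_b) / len(idx)
            # Step 2: move against the gradient, scaled by the learning rate.
            w -= learning_rate * grad
    return w

# Hypothetical noise-free data generated from known weights.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 2))
true_w = np.array([3.0, -2.0])
y = X @ true_w

w_batch = gradient_descent(X, y)                                   # batch
w_mini = gradient_descent(X, y, batch_size=32)                     # mini-batch
w_sgd = gradient_descent(X, y, batch_size=1, learning_rate=0.01)   # stochastic
```

A fixed number of passes stands in for a convergence check here; in practice one might instead stop when the gradient norm or the change in cost falls below a tolerance. The stochastic run uses a smaller learning rate because single-example gradients vary more from step to step.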


Figure: An illustration depicting the concept of gradient descent. It visualizes the journey of an algorithm as it iteratively makes its way towards the minimum point of a loss function landscape, represented by a ball navigating down a hill through the path of steepest descent.

Citations:

[1] https://www.ibm.com/topics/gradient-descent

[2] https://developer.nvidia.com/blog/a-data-scientists-guide-to-gradient-descent-and-backpropagation-algorithms/

[3] https://youtube.com/watch?v=qg4PchTECck

[4] https://www.ruder.io/optimizing-gradient-descent/

[5] https://en.wikipedia.org/wiki/Gradient_descent

[6] https://stats.stackexchange.com/questions/181629/why-use-gradient-descent-with-neural-networks

[7] https://www.khanacademy.org/math/multivariable-calculus/applications-of-multivariable-derivatives/optimizing-multivariable-functions/a/what-is-gradient-descent

[8] https://www.geeksforgeeks.org/optimization-techniques-for-gradient-descent/

[9] https://builtin.com/data-science/gradient-descent

[10] https://youtube.com/watch?v=IHZwWFHWa-w

[11] https://machinelearningmastery.com/gradient-descent-optimization-from-scratch/

[12] https://youtube.com/watch?v=i62czvwDlsw

[13] https://machinelearningmastery.com/gradient-descent-for-machine-learning/

[14] https://towardsdatascience.com/gradient-descent-algorithm-a-deep-dive-cf04e8115f21

[15] https://www.geeksforgeeks.org/gradient-descent-algorithm-and-its-variants/

[16] https://ml-cheatsheet.readthedocs.io/en/latest/gradient_descent.html