Understanding Gradients: The Engine Behind Neural Network Learning

In the previous article, we explored activation functions and visualized them using Python.

Now, let’s see what gradients are.

Neural networks use activation functions to transform the values that pass through each layer.

But if a neural network gives a wrong output, how does it know what to fix?

This is where gradients come in.

What is a gradient?

Imagine you are walking on a hill. If the ground is steep, you can feel which direction goes up or down.

If the ground is almost flat, it is hard to tell where to go.

A gradient can be thought of as a number that tells us how steep a curve is at a point. Its sign tells us whether the curve is going up or down there.
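To make "steepness as a number" concrete, here is a minimal sketch that estimates the slope of a curve at a few points using a central difference. The curve `f` (a simple parabola), the helper `slope_at`, and the step size `h` are illustrative choices, not something taken from the previous article.

```python
def f(x):
    # Illustrative curve: a parabola whose lowest point is at x = 0
    return x ** 2

def slope_at(func, x, h=1e-5):
    # Central difference: approximate slope of func at the point x
    return (func(x + h) - func(x - h)) / (2 * h)

for point in (-3.0, 0.0, 0.5, 3.0):
    print(f"x = {point:+.1f}  slope = {slope_at(f, point):+.3f}")
```

Far from the bottom of the parabola the slope is large, and at the bottom it is close to zero, which is the "almost flat ground" case from the hill analogy.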

How does this apply in the case of neural networks? Let’s see.

Why gradients matter in neural networks

In neural networks:

  • Gradients tell us how much, and in which direction, a parameter should change
  • The bigger the gradient, the bigger the update
  • If the gradient is 0, that parameter stops learning
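The points above can be read off the plain gradient descent update, `new_weight = weight - learning_rate * gradient`. The sketch below assumes that update rule; the starting weight, the learning rate of 0.1, and the sample gradient values are made up for illustration.

```python
learning_rate = 0.1  # assumed value, chosen only for illustration
weight = 1.0         # assumed starting weight

def updated(weight, gradient, lr=learning_rate):
    # Plain gradient descent step: move against the gradient
    return weight - lr * gradient

for gradient in (4.0, 0.5, 0.0):
    new_weight = updated(weight, gradient)
    print(f"gradient = {gradient:>4}  change = {new_weight - weight:+.2f}")
```

The largest gradient produces the largest change, and a gradient of 0 leaves the weight exactly where it was.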

When training a neural network:

  1. We make a prediction
  2. We calculate how wrong it is (the loss)
  3. We compute the gradient of the loss with respect to each weight
  4. We update the weights using those gradients to reduce the loss
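The steps above can be sketched as a tiny training loop. This is a minimal illustration rather than the setup used in this series: it assumes a model with a single weight `w`, a mean squared error loss, a gradient derived by hand, and made-up data where the target is always twice the input.

```python
import numpy as np

# Made-up data: the target is 2 * input, so the ideal weight is 2.0
inputs = np.array([1.0, 2.0, 3.0])
targets = 2.0 * inputs

w = 0.0               # assumed starting weight
learning_rate = 0.05  # assumed value

for step in range(50):
    predictions = w * inputs                 # make a prediction
    errors = predictions - targets
    loss = np.mean(errors ** 2)              # measure how wrong it is
    gradient = np.mean(2 * errors * inputs)  # d(loss)/dw, derived by hand
    w = w - learning_rate * gradient         # update the weight using the gradient

print(f"learned w = {w:.4f}, last computed loss = {loss:.6f}")
```

Every change to `w` comes from the gradient line, so if that value were 0 the loop would keep running but `w` would never move.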

Gradients of activation functions

Each activation function has:

  • A curve
  • A gradient curve

Let’s check this in Python, starting first with ReLU.

Gradient of ReLU

We can define the ReLU gradient as:

import numpy as np

def relu_grad(x):
    return np.where(x > 0, 1, 0)

This means:

  • Gradient = 0 when input ≤ 0 (at exactly 0 the derivative is not defined, so using 0 there is a convention)
  • Gradient = 1 when input > 0

Let’s plot it:

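import matplotlib.pyplot as plt
import numpy as np

# Assumed input range; reuse the x from the previous article if it is already defined
x = np.linspace(-10, 10, 400)

plt.figure()
plt.plot(x, relu_grad(x), label="ReLU Gradient")
plt.title("Gradient of ReLU")
plt.xlabel("Input")
plt.ylabel("Gradient")
plt.grid(True)
plt.legend()
plt.show()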

This is our gradient. From the above, you can observe the following:

  • The entire negative side has zero gradient
  • The positive side has a constant gradient of 1

From this, we can further understand that:

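  • When a ReLU neuron is active, the gradient passes through at full strength (multiplied by 1), so learning is fast
  • A ReLU neuron can "die" if it always receives negative inputs: its gradient is then always 0, so its weights are never updated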
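To see what a "dead" ReLU neuron looks like in numbers, here is a minimal sketch of a single neuron whose pre-activation is negative for every input. The weight, bias, inputs, and the gradient arriving from the loss are all made-up values, and `relu_grad` is repeated so the snippet runs on its own.

```python
import numpy as np

def relu_grad(x):
    return np.where(x > 0, 1, 0)

inputs = np.array([0.5, 1.0, 2.0, 3.0])  # made-up inputs
w, b = -1.0, -0.5                         # made-up weight and bias
z = w * inputs + b                        # pre-activation, negative for every input here

upstream = np.ones_like(z)                # assumed gradient arriving from the loss
grad_w = np.sum(upstream * relu_grad(z) * inputs)  # chain rule for the weight
grad_b = np.sum(upstream * relu_grad(z))           # chain rule for the bias

print("pre-activations:", z)
print("grad_w:", grad_w, "grad_b:", grad_b)
```

Both gradients come out as zero, so no update can change `w` or `b`, and the neuron stays inactive for these inputs.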

Gradient of Softplus

Softplus is defined as ln(1 + e^x). Its gradient is:

def softplus_grad(x):
    # Derivative of ln(1 + e^x) with respect to x
    return 1 / (1 + np.exp(-x))

Let’s plot it:

plt.figure()
plt.plot(x, softplus_grad(x), label="Softplus Gradient")
plt.title("Gradient of Softplus")
plt.xlabel("Input")
plt.ylabel("Gradient")
plt.grid(True)
plt.legend()
plt.show()

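You can observe that this curve is exactly the sigmoid activation function: the derivative of Softplus is the sigmoid.

You can also observe that:

  • There is a smooth transition from 0 to 1 instead of a sudden jump
  • The gradient is never exactly 0, so learning never stops completely, although it becomes very slow for large negative inputs
  • This avoids dying neurons and adds stability, but the gradient is always below 1 and needs an exponential to compute, so it is slower than ReLU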
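The observation that the Softplus gradient matches the sigmoid can also be checked numerically. The sketch below assumes the standard definition of Softplus, ln(1 + e^x), and compares a central-difference estimate of its slope with `softplus_grad`. The names `softplus`, `xs`, and `h` are illustrative and not from the original notebook.

```python
import numpy as np

def softplus(x):
    # Softplus activation: ln(1 + e^x), computed in a numerically stable way
    return np.logaddexp(0, x)

def softplus_grad(x):
    return 1 / (1 + np.exp(-x))

xs = np.linspace(-6, 6, 25)  # assumed test points
h = 1e-5                     # assumed step size
numeric = (softplus(xs + h) - softplus(xs - h)) / (2 * h)  # central difference

max_gap = np.max(np.abs(numeric - softplus_grad(xs)))
print("largest difference:", max_gap)
```

The largest difference should be tiny, limited only by the finite-difference approximation and floating-point rounding.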

Gradient of Sigmoid

The sigmoid gradient looks like this:

def sigmoid(x):  # repeated here so the snippet runs on its own
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)

Let’s plot it:

plt.figure()
plt.plot(x, sigmoid_grad(x), label="Sigmoid Gradient")
plt.title("Gradient of Sigmoid")
plt.xlabel("Input")
plt.ylabel("Gradient")
plt.grid(True)
plt.legend()
plt.show()

From the above, you can observe the following:

  • The gradient is strong only around the middle, peaking at 0.25 when the input is 0
  • It is almost zero for large positive or negative inputs

This leads to a famous problem called the vanishing gradient problem. We will explore this more in the next article.
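As a small preview of that problem, the sketch below prints the sigmoid gradient at a few inputs and then multiplies its largest possible value across several layers. The input values and layer counts are arbitrary, and treating each layer as contributing exactly one sigmoid gradient factor is a deliberate simplification that ignores the weights.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)

for value in (-10.0, -5.0, 0.0, 5.0, 10.0):
    print(f"x = {value:+5.1f}  gradient = {sigmoid_grad(value):.6f}")

peak = sigmoid_grad(0.0)  # the largest value the sigmoid gradient can take
for layers in (1, 5, 10):
    # Best case: every layer contributes the peak gradient
    print(f"{layers:>2} layers  product <= {peak ** layers:.2e}")
```

Even in this best case the product shrinks quickly as layers are added, which is the intuition behind the name "vanishing gradient".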

You can try the examples out via the Colab notebook.

