Shashank Shanu

2 years ago

- What is a sigmoid function?

- Advantages of the sigmoid function

- Disadvantages of the sigmoid function

- How to write a sigmoid function and its derivative in Python

- Simple implementation of the sigmoid activation function in Python

In the previous article we briefly discussed **what activation functions are**
and the **types of activation function**. In this article I will explain one of
them in detail: the **Sigmoid Activation function.**

So, let’s start.

I think most of you are already familiar with activation functions. If not, I
recommend going through my previous article first and then coming back to this
one for a better understanding.

The Sigmoid function was the most widely used activation function in the
early days of deep learning. It is a smooth function that is easy to
differentiate and implement.

The name sigmoid is derived from the Greek letter sigma: when the function
is plotted, it appears as a sloping “S” crossing the Y-axis.

The term sigmoidal refers to any function that retains the “S” shape. The
logistic function is the traditional example, and tanh(x) is another.
Where the traditional sigmoid function takes values between **0** and **1**,
tanh(x) follows a similar shape but takes values between **-1** and **1**.
A sigmoidal function is also differentiable, so we can easily find the
slope of the sigmoid curve at any given point.
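The relationship between the two curves can be checked numerically. The sketch below is only an illustration: the sample grid is an arbitrary choice, and the identity tanh(x) = 2 * sigmoid(2x) - 1 is a standard result that the article itself does not state.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

# Arbitrary sample grid chosen for illustration
x = np.linspace(-10, 10, 2001)

s = sigmoid(x)
t = np.tanh(x)

# The logistic sigmoid stays inside (0, 1); tanh stays inside (-1, 1)
print(s.min(), s.max())
print(t.min(), t.max())

# tanh is a shifted and rescaled logistic sigmoid
print(np.allclose(t, 2.0 * sigmoid(2.0 * x) - 1.0))
```

Both functions share the same “S” shape; they differ only in how the input and output are scaled and shifted.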

The output of the sigmoid function lies in the open interval (0,1). It is
tempting to read the output as a probability, but in the strict sense it
should not be treated as one. The sigmoid function was once more popular than
it is today. It can be thought of as the firing rate of a neuron: the middle,
where the slope is relatively large, is the sensitive area of the neuron, and
the sides, where the slope is very gentle, are the neuron's inhibitory area.
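To make the "sensitive" and "inhibitory" areas concrete, here is a minimal sketch that evaluates the function and its slope at a few points. The sample points are arbitrary, and the helper name sigmoid_slope is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_slope(x):
    # Slope of the sigmoid curve, written in terms of the sigmoid itself
    s = sigmoid(x)
    return s * (1.0 - s)

# Arbitrary sample points: two on each side and one in the middle
points = np.array([-6.0, -2.0, 0.0, 2.0, 6.0])
outputs = sigmoid(points)
slopes = sigmoid_slope(points)

for p, o, d in zip(points, outputs, slopes):
    print(f"x={p:+.1f}  sigmoid={o:.4f}  slope={d:.4f}")
```

The slope peaks at x = 0, where it equals 0.25, and falls off quickly on both sides, while every output stays strictly between 0 and 1.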

The equation of the Sigmoid function is given by:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

And, the graph of the sigmoid function can be represented as:

The sigmoid function itself has some defects.

1) When the input moves away from the coordinate origin, the gradient of the
function becomes very small, almost zero. During backpropagation in a neural
network, we use the chain rule of differentiation to calculate the derivative
of the loss with respect to each weight w. When backpropagation passes through
a sigmoid function, the factor it contributes to this chain is small (never
more than 0.25). Moreover, the chain may pass through many sigmoid functions,
so a change in the **weight (w)** ends up having little effect on the loss
function, which is not conducive to optimizing the weight. This problem is
called **gradient saturation or gradient dispersion**, also known as the
vanishing gradient problem.

2) The function output is not centred on 0, which will reduce the efficiency of the weight update.

3) The sigmoid function performs exponential operations, which are
relatively slow for computers to evaluate.
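The first defect can be illustrated with a small sketch. It multiplies only the sigmoid slopes along a chain of layers and ignores the weight terms that a real backpropagation pass would also include, so it is a simplification. The pre-activation values are hypothetical, chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# Hypothetical pre-activation values, one per layer (illustration only)
pre_activations = np.array([0.0, 1.5, -2.0, 3.0, 0.5, -1.0, 2.5, -3.0])

# Slope contributed by each sigmoid, and the running product along the chain
local_slopes = sigmoid_prime(pre_activations)
chained = np.cumprod(local_slopes)

for depth, factor in enumerate(chained, start=1):
    print(f"after {depth} sigmoid layers: gradient factor = {factor:.2e}")

# Best case: every layer sits exactly at x = 0, where the slope peaks at 0.25
best_case = 0.25 ** len(pre_activations)
print(f"best case for {len(pre_activations)} layers: {best_case:.2e}")
```

Even in the best case the factor shrinks geometrically with depth, which is why weights in early layers can receive very small updates.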

Some of the advantages and disadvantages of the sigmoid function are
mentioned below.

Advantages of the sigmoid function:

- It provides a smooth gradient, which prevents “jumps” in output values.

- Output values are bounded between 0 and 1, normalizing the output of each neuron.

- It gives clear predictions: for inputs far from zero, the output is very close to 1 or 0.

Disadvantages of the sigmoid function:

- It is prone to the vanishing gradient problem.

- The function output is not zero-centred.

- Exponential operations are relatively time-consuming, which increases computational cost.
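The point about zero-centred output can be seen in a quick experiment. This is a minimal sketch, assuming pre-activations drawn from a standard normal distribution (a hypothetical choice), with tanh included only for comparison.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
# Hypothetical pre-activations, symmetric around zero
z = rng.standard_normal(10_000)

sig_out = sigmoid(z)
tanh_out = np.tanh(z)

print(sig_out.mean())        # close to 0.5
print(tanh_out.mean())       # close to 0
print((sig_out > 0).all())   # every sigmoid output is positive
```

Because every sigmoid output is positive, the inputs passed to the next layer all share the same sign, which is the usual explanation for the less efficient weight updates mentioned above.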

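Writing a sigmoid function and its derivative is quite easy: we simply
define a function for each formula. The derivative can be written in terms of
the sigmoid itself, sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)), so the
derivative function just reuses the sigmoid function. They are implemented as
shown below: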

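```
import numpy as np

def sigmoid(z):
    # Logistic sigmoid: maps any real z into (0, 1)
    return 1.0 / (1 + np.exp(-z))
```

```
def sigmoid_prime(z):
    # Derivative of the sigmoid, written in terms of the sigmoid itself
    return sigmoid(z) * (1 - sigmoid(z))
```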
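One way to gain confidence in the derivative formula is to compare it against a numerical estimate. The sketch below uses a central finite difference. The step size h and the sample grid are arbitrary choices, not something the article prescribes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    return sigmoid(z) * (1.0 - sigmoid(z))

z = np.linspace(-5.0, 5.0, 101)
h = 1e-5  # step size for the central difference; an arbitrary small value

# Numerical slope estimate versus the closed-form derivative
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2.0 * h)
analytic = sigmoid_prime(z)

print(np.max(np.abs(numeric - analytic)))
print(np.allclose(numeric, analytic, atol=1e-8))
```

If the two arrays agree to within a small tolerance, the closed-form derivative matches the actual slope of the curve.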

```
# import libraries
import matplotlib.pyplot as plt
import numpy as np

# sigmoid function that returns both the value and its derivative
def sigmoid(x):
    s = 1 / (1 + np.exp(-x))
    ds = s * (1 - s)
    return s, ds

# input values to plot over
a = np.arange(-6, 6, 0.01)
s, ds = sigmoid(a)

# Set up centered axes
fig, ax = plt.subplots(figsize=(9, 5))
ax.spines['left'].set_position('center')
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.xaxis.set_ticks_position('bottom')
ax.yaxis.set_ticks_position('left')

# Create and show plot
ax.plot(a, s, color="#307EC7", linewidth=3, label="sigmoid")
ax.plot(a, ds, color="#9621E2", linewidth=3, label="derivative")
ax.legend(loc="upper right", frameon=False)
plt.show()
```

The plot shown below is the output of the above code, which plots the
sigmoid function and its derivative.

I hope you enjoyed reading this article and now know about the
**Sigmoid Activation Function and how we can implement it in Python.**

For more such blogs/courses on data science, machine
learning, artificial intelligence and emerging new technologies do visit us at InsideAIML.

Thanks for reading…

Happy Learning…