Activation functions

Environment setup

import platform

print(f"Python version: {platform.python_version()}")
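# Compare as integers: comparing strings would wrongly rank "10" below "6"
assert tuple(int(part) for part in platform.python_version_tuple()[:2]) >= (3, 6)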

import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns

from scipy.special import softmax
Python version: 3.7.5
# Setup plots
%matplotlib inline
plt.rcParams["figure.figsize"] = 10, 8
%config InlineBackend.figure_format = 'retina'
sns.set()

Sigmoid function

Maps any real number to a value strictly between 0 and 1.

\[\sigma(z) = \frac{1}{1 + e^{-z}}\]
\[\sigma'(z) = \frac{e^{-z}}{(1 + e^{-z})^2} = \sigma(z)\big(1 - \sigma(z)\big)\]
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
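
As a quick sanity check of the derivative identity above, the following sketch compares the closed form \(\sigma(z)\big(1 - \sigma(z)\big)\) with a central finite-difference estimate. The helper name `sigmoid_prime` and the step size `h` are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_prime(z):
    # Closed form of the derivative: sigma(z) * (1 - sigma(z))
    s = sigmoid(z)
    return s * (1 - s)

z = np.linspace(-5, 5, 11)
h = 1e-5  # finite-difference step (assumed value, chosen for illustration)

# Central difference approximation of the derivative
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

# Largest absolute gap between the two estimates; should be tiny
max_err = np.max(np.abs(numeric - sigmoid_prime(z)))
print(max_err)
```

The derivative peaks at \(z = 0\), where \(\sigma'(0) = 0.25\), and decays toward 0 for large \(|z|\).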

tanh function (hyperbolic tangent)

\[\tanh(z) = 2\sigma(2z) - 1\]
def tanh(z):
    return 2 * sigmoid(2 * z) - 1
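
The identity follows from \(2\sigma(2z) - 1 = \frac{2}{1 + e^{-2z}} - 1 = \frac{1 - e^{-2z}}{1 + e^{-2z}}\). A minimal check, assuming NumPy's built-in `np.tanh` as the reference, might look like this:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def tanh(z):
    # tanh expressed through the sigmoid, as in the formula above
    return 2 * sigmoid(2 * z) - 1

z = np.linspace(-5, 5, 200)

# Compare the sigmoid-based version against NumPy's implementation
matches = np.allclose(tanh(z), np.tanh(z))
print(matches)
```

In practice `np.tanh` is the more direct choice; the sigmoid-based version mainly illustrates how the two functions are related.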

ReLU function (Rectified Linear Unit)

\[\mathrm{ReLU}(z) = \max(0, z)\]
def relu(z):
    return np.maximum(0, z)
z = np.linspace(-5, 5, 200)
plt.plot(z, sigmoid(z), "g--", linewidth=2, label="Sigmoid")
plt.plot(z, tanh(z), "b-", linewidth=2, label="Tanh")
plt.plot(z, relu(z), "m-.", linewidth=2, label="ReLU")
plt.plot([-10, 10], [0, 0], "k-")
plt.plot([0, 0], [-1.5, 1.5], "k-")
plt.axis([-5, 5, -1.5, 1.5])
plt.legend(loc="lower right", fontsize=14)
plt.show()
[Figure: sigmoid, tanh, and ReLU curves plotted for z in [-5, 5]]

Softmax function

Transforms a vector \(\pmb{v} \in \mathbb{R}^K\) into a probability distribution: each component lies in \((0, 1)\) and the components sum to 1. It is the multiclass generalization of the sigmoid function.

\[\sigma(s(\pmb{x}))_k = \frac{e^{s_k(\pmb{x})}}{\sum_{j=1}^K e^{s_j(\pmb{x})}} \qquad \sum_{k=1}^K \sigma(s(\pmb{x}))_k = 1\]
  • \(K\): number of classes.

  • \(s(\pmb{x})\): vector containing the scores of each class for the instance \(\pmb{x}\).

  • \(\sigma(s(\pmb{x}))_k\): probability that \(\pmb{x}\) belongs to class \(k\).
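
The definition above can be sketched directly in NumPy. The function name `softmax_manual` is hypothetical, and subtracting the maximum score before exponentiating is an added assumption: it leaves the result unchanged mathematically but avoids overflow for large scores.

```python
import numpy as np

def softmax_manual(scores):
    """Illustrative softmax for a 1-D vector of class scores."""
    scores = np.asarray(scores, dtype=float)
    # Shift by the max score: same result, but exp() cannot overflow
    shifted = scores - np.max(scores)
    exp_scores = np.exp(shifted)
    # Normalize so that the outputs sum to 1
    return exp_scores / np.sum(exp_scores)

probs = softmax_manual([3.0, 1.0, 0.2])
print(probs)
print(probs.sum())

# Large scores would overflow a naive exp(), but work here
large = softmax_manual([1000.0, 1000.0])
print(large)
```

The largest score receives the largest probability, and equal scores receive equal probabilities.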

# Print probability distribution for a sample vector
scores = [3.0, 1.0, 0.2]
s = softmax(scores)
print(s)

# Sum of all probabilities is equal to 1
print(sum(s))
[0.8360188  0.11314284 0.05083836]
1.0
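
To illustrate why softmax is described as a generalization of the sigmoid: with two classes, \(\frac{e^{a}}{e^{a} + e^{b}} = \frac{1}{1 + e^{-(a - b)}} = \sigma(a - b)\). The sketch below checks this numerically; the scores `a` and `b` are arbitrary example values, and softmax is computed inline with NumPy to keep the block self-contained.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

a, b = 2.0, -1.0  # arbitrary example scores for two classes

# Two-class softmax computed straight from the definition
exp_scores = np.exp([a, b])
p = exp_scores / exp_scores.sum()

# Probability of the first class equals the sigmoid of the score difference
same = np.isclose(p[0], sigmoid(a - b))
print(same)
```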