forked from OrFeuerSnyk/Mila

Mila: Controlling Minima Concavity in Activation Function


Mila

Mila is a uniparametric activation function inspired by the Mish activation function. The parameter β controls the concavity of the global minimum of the activation function, where β = 0 recovers the baseline Mish activation function. Making β more negative reduces the concavity, and vice versa. β is introduced to tackle gradient-death scenarios caused by the sharp global minimum of the Mish activation function.

The mathematical function of Mila is defined as:

f(x) = x · tanh(softplus(x + β)) = x · tanh(ln(1 + e^(x + β)))
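As a reference, here is a minimal pure-Python sketch of the function, assuming the definition f(x) = x · tanh(ln(1 + e^(x + β))) with β = 0 reducing to Mish (framework-specific implementations live in the repository itself):

```python
import math

def softplus(x):
    # Numerically stable ln(1 + e^x)
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def mila(x, beta=-0.25):
    # Mila: x * tanh(softplus(x + beta))
    return x * math.tanh(softplus(x + beta))

def mish(x):
    # Baseline Mish: x * tanh(softplus(x)), i.e. Mila with beta = 0
    return x * math.tanh(softplus(x))
```

For large positive inputs Mila approaches the identity, like Mish; the β shift only changes where the negative-side minimum sits and how sharp it is.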

Its partial derivatives:

1st derivative of Mila when β=-0.25:
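Assuming the definition f(x) = x · tanh(softplus(x + β)), the first derivative follows from the product and chain rules; a quick finite-difference sanity check at β = -0.25:

```python
import math

def softplus(x):
    # Numerically stable ln(1 + e^x)
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mila(x, beta=-0.25):
    return x * math.tanh(softplus(x + beta))

def mila_grad(x, beta=-0.25):
    # d/dx [x * tanh(softplus(x + beta))]
    # = tanh(s) + x * (1 - tanh(s)^2) * sigmoid(x + beta),  s = softplus(x + beta)
    t = math.tanh(softplus(x + beta))
    return t + x * (1.0 - t * t) * sigmoid(x + beta)

# Central finite differences agree with the closed form
h = 1e-6
for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    fd = (mila(x + h) - mila(x - h)) / (2.0 * h)
    assert abs(fd - mila_grad(x)) < 1e-5
```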

Graphs of the function and its derivatives for various values of β:

Contour plot of Mila and its 1st derivative:

Early Analysis of Mila Activation Function:

The output landscapes of a 5-layer randomly initialized neural network were compared for ReLU, Swish, and Mila (β = -0.25). The observations clearly show sharp transitions between the scalar magnitudes across the coordinates for ReLU, compared to Swish and Mila (β = -0.25). Smoother transitions produce smoother loss functions, which are easier to optimize, and hence the network generalizes better.
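The smoothness contrast can be sanity-checked numerically: ReLU's derivative jumps from 0 to 1 at the origin, while Mila's derivative (assuming f(x) = x · tanh(softplus(x + β))) varies continuously. An illustrative sketch:

```python
import math

def softplus(x):
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def mila(x, beta=-0.25):
    return x * math.tanh(softplus(x + beta))

def relu(x):
    return max(x, 0.0)

def num_grad(f, x, h=1e-6):
    # Central finite-difference derivative
    return (f(x + h) - f(x - h)) / (2.0 * h)

# ReLU's derivative jumps across x = 0 ...
assert abs(num_grad(relu, 1e-3) - num_grad(relu, -1e-3)) > 0.99
# ... while Mila's derivative changes only slightly over the same interval
assert abs(num_grad(mila, 1e-3) - num_grad(mila, -1e-3)) < 1e-2
```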

The gradients of Mila's scalar function representation were also visualized:

Benchmarks:

Please see Results.md for the benchmarks.

Try it

Run the demo.py file to try out Mila in a simple network for Fashion MNIST classification.

  • First clone the repository and navigate to its folder using the following commands.

git clone https://github.com/sailfish009/Mila.git
cd path_to_Mila_directory

  • Install dependencies

pip install -r requirements.txt

  • Run the Python demo script.

python3 demo.py --activation mila --model_initialization class

Note: The demo script initializes Mila with β = -0.25. Change the β (the 'beta' variable) in the script to try other beta values.
