Neural networks, also called artificial neural networks (ANNs), are designed to mimic in a computer how a human learns.
Although today's artificial neural networks do not behave like biological neural networks as originally intended, we can still compare them:
![[Pasted image 20250404122814.png]]
### History
Neural networks were first created in the 1950s but fell out of favor.
They were used again in the 1980s but fell out of favor again in the 1990s.
They resurged around 2005 under the name *deep learning*.
### Neurons
A neuron is essentially a [[Classification#Logistic Regression|logistic regression]] unit. Given an input, it computes an activation between 0 and 1 using its [[Classification#Cost Function|sigmoid function]]:
![[Pasted image 20250404133638.png]]
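As a rough illustration of a single neuron, here is a minimal NumPy sketch. The names `sigmoid` and `neuron`, and the values of `x`, `w` and `b`, are made up for this note and are not taken from any library.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """Activation of one neuron: sigmoid of the weighted input plus the bias."""
    return sigmoid(np.dot(w, x) + b)

# Hypothetical input with two features, and made-up parameters.
x = np.array([2.0, 1.0])
w = np.array([1.0, -2.0])
b = 0.0

a = neuron(x, w, b)  # w.x + b = 0 here, so the activation is sigmoid(0) = 0.5
```

The activation is never exactly 0 or 1, so the neuron does not switch on or off; it reports how strongly it responds to the input.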
Going beyond a single [[Classification|classification]] unit, neurons can be chained together to get more meaningful outcomes. Each layer creates more useful *features* from its input, and the next layer builds a new set of features on top of them.
In Regression, this process is done *manually* and is called **feature engineering**.
![[Pasted image 20250404134019.png]]
### Model
![[Pasted image 20250406191219.png]]
Each neuron has its own $\vec{w}$ and $b$ values. In $\vec{w}_j^{[l]}$, $j$ refers to the neuron index within the layer and $l$ refers to the layer number. The activation of each neuron is:
$a_j^{[l]} = g(\vec{w}_j^{[l]}\cdot \vec{a}^{[l-1]} + b_j^{[l]})$
where $g$ is usually a [[Classification#Cost Function|sigmoid function]].
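The formula above can be sketched as a loop over the neurons of one layer. This is only an illustration: `layer_forward`, the weight matrix `W` (one row per neuron) and all numeric values are assumptions made for this note.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(a_prev, W, b, g=sigmoid):
    """Compute the activations of one layer, neuron by neuron.

    a_prev: activations of the previous layer, shape (n_prev,)
    W:      one row of weights per neuron, shape (n_units, n_prev)
    b:      one bias per neuron, shape (n_units,)
    """
    n_units = W.shape[0]
    a = np.zeros(n_units)
    for j in range(n_units):
        # a_j = g(w_j . a_prev + b_j), matching the formula above
        a[j] = g(np.dot(W[j], a_prev) + b[j])
    return a

# Hypothetical layer with 3 neurons fed by a 2-element input vector.
a_prev = np.array([1.0, 2.0])
W = np.array([[1.0, -0.5],
              [0.3, 0.8],
              [-2.0, 1.0]])
b = np.array([0.0, -1.0, 0.5])

a = layer_forward(a_prev, W, b)  # one activation per neuron, shape (3,)
```

The loop is written out to mirror the per-neuron formula; the same result can be computed in one step as `sigmoid(W @ a_prev + b)`.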
> [!info]
> The input and output of each layer are vectors. Every neuron in a layer receives the entire input vector.
> [!tip]
> Usually, the number of neurons **decreases** from one layer to the next.
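
To see how vectors flow from layer to layer, here is a small sketch of a network whose layers shrink in size. The layer sizes `[4, 3, 2, 1]`, the random weights and the helper names `dense` and `forward` are all assumptions chosen for illustration; nothing here is trained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense(a_prev, W, b):
    """Vectorized layer: every neuron sees the whole input vector a_prev."""
    return sigmoid(W @ a_prev + b)

def forward(x, params):
    """Feed x through each layer in turn; each output is the next layer's input."""
    a = x
    for W, b in params:
        a = dense(a, W, b)
    return a

rng = np.random.default_rng(0)
layer_sizes = [4, 3, 2, 1]  # input size, then neurons per layer (decreasing)

# One (W, b) pair per layer: W has shape (neurons in layer, size of its input).
params = [
    (rng.normal(size=(n_out, n_in)), np.zeros(n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

x = np.array([0.5, -1.0, 2.0, 0.0])  # hypothetical input with 4 features
y = forward(x, params)               # final output vector, shape (1,)
```

Because the weights are random, the output value itself is meaningless; the sketch only shows the shapes and the order of the computation.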