A DFA has exactly 1 transition out of each state for a given symbol of $\sum$
A NFA can have none or multiple transitions out of a state for a given symbol.
Say we want to reverse a language $L$ into $L^{Rev}$. We can reverse its DFA, and get a machine which kind of works.
We will have valid transitions to final states for valid words and no available transitions to final states for words which are not in the language.
However, we can also be left with multiple choices for transitioning or multiple start states.
This is an NFA. Non-determinism essentially refers to how we will evaluate a machine of this type - by brute force.
NFA are evaluated by splitting themselves into copies and taking each available choice whenever there are multiple.

NFA can have multiple start states. These are represented by creating a new start state, and creating epsilon $\epsilon$ transitions to all the start states we want to include.
<aside> ⚧️ An automaton can change its state using $\epsilon$ transitions without consuming an input letter.
</aside>
$$ \epsilon CLOSE: Q \rightarrow 2^Q $$
This denotes all states that can be reached from some $q$ by only taking $\epsilon$ transitions, including $q$ itself.
This can be generalised to sets of states easily:
$$ \forall X \subseteq Q, \epsilon CLOSE(X) = \bigcup_{x\in X} \epsilon CLOSE(x) \\ \epsilon CLOSE(\emptyset) = \emptyset $$
So, they are written using the same tuple notation as DFA’s, except our transition function will be different.
$$ \delta: Q \times (\sum \cup \{\epsilon\}) \rightarrow 2^Q $$
We now have an epsilon transition, and each (state, letter) pair can transition to a new set of states.