Softmax loss (binary-class case)

For two classes, the softmax function turns the outputs of the last fully connected layer (weights $W_i$ and biases $b_i$, applied to the feature $x$) into posterior probabilities:
$$ p_1=\frac{\exp(W^T_1 x + b_1)}{\exp(W^T_1 x + b_1) + \exp(W^T_2 x + b_2)} \tag{1} $$
$$ p_2=\frac{\exp(W^T_2 x + b_2)}{\exp(W^T_1 x + b_1) + \exp(W^T_2 x + b_2)} \tag{2} $$
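As a sanity check, here is a minimal NumPy sketch of Eqs. (1)–(2); the feature `x`, weights `W1`, `W2`, and biases `b1`, `b2` are made-up values, and subtracting the max logit is a standard numerical-stability trick not present in the equations themselves.

```python
import numpy as np

# Sketch of Eqs. (1)-(2) for the binary case; all values are assumptions.
rng = np.random.default_rng(0)
x = rng.standard_normal(4)                        # feature vector
W1, W2 = rng.standard_normal(4), rng.standard_normal(4)
b1, b2 = 0.1, -0.2

logits = np.array([W1 @ x + b1, W2 @ x + b2])     # class scores
logits -= logits.max()                            # stability: softmax is shift-invariant
p = np.exp(logits) / np.exp(logits).sum()
p1, p2 = p                                        # posteriors; p1 + p2 == 1
```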
Original softmax loss

In the general multi-class case, the softmax (cross-entropy) loss averages the per-sample losses $L_i$ over the $N$ training samples, where $f_j$ is the $j$-th element of the class-score vector:
$$ L=\frac{1}{N} \sum_i L_i = \frac{1}{N} \sum_i -\log\left(\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}\right) \tag{3} $$
For the $i$-th sample, $f_j = W^T_j x_i + b_j$, and the inner product can be rewritten as $||W_j|| \ ||x_i|| \cos(\theta_{j,i})$, where $\theta_{j,i}$ is the angle between $W_j$ and $x_i$:

$$ \begin{aligned} L_i &= -\log\left(\frac{e^{W^T_{y_i} x_i + b_{y_i}}}{\sum_j e^{W^T_j x_i + b_j}}\right) \\ &= -\log\left(\frac{e^{||W_{y_i}|| \ ||x_i|| \cos(\theta_{y_i,i}) + b_{y_i}}}{\sum_j e^{||W_j||\ ||x_i|| \cos(\theta_{j,i}) + b_j}}\right) \end{aligned} \tag{4} $$
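The decomposition in Eq. (4) is easy to verify numerically. The sketch below (with assumed shapes: a 4-dimensional feature and 3 classes) checks that the two forms of the logits agree and computes $L_i$ for one sample.

```python
import numpy as np

# Check Eq. (4): W_j^T x_i + b_j == ||W_j|| ||x_i|| cos(theta_{j,i}) + b_j.
# Shapes and values are assumptions for illustration.
rng = np.random.default_rng(1)
W = rng.standard_normal((4, 3))     # one weight column per class
b = rng.standard_normal(3)
x = rng.standard_normal(4)
y = 2                               # ground-truth class of this sample

cos_theta = (W.T @ x) / (np.linalg.norm(W, axis=0) * np.linalg.norm(x))
logits = np.linalg.norm(W, axis=0) * np.linalg.norm(x) * cos_theta + b
assert np.allclose(logits, W.T @ x + b)   # both forms of Eq. (4) agree

L_i = -np.log(np.exp(logits[y]) / np.exp(logits).sum())
```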
Constraining the weights to unit norm ($||W_j|| = 1$) and zeroing the biases ($b_j = 0$) gives the modified softmax loss, whose logits depend only on the feature norm and the angles $\theta_{j,i}$:

$$ L_\text{modified} = \frac{1}{N} \sum_i -\log\left(\frac{e^{||x_i|| \cos(\theta_{y_i,i})}}{\sum_j e^{||x_i|| \cos(\theta_{j,i})}}\right) \tag{5} $$
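A minimal NumPy sketch of Eq. (5), not the authors' implementation; the shapes are assumptions (`X`: $N \times d$ features, `y`: $N$ labels, `W`: $d \times C$ weights). Normalizing each column of `W` and dropping the bias makes every logit equal to $||x_i|| \cos(\theta_{j,i})$.

```python
import numpy as np

def modified_softmax_loss(X, y, W):
    """Eq. (5): softmax loss with ||W_j|| = 1 and b_j = 0, so the
    logits reduce to ||x_i|| cos(theta_{j,i})."""
    W_unit = W / np.linalg.norm(W, axis=0, keepdims=True)   # normalize each class weight
    logits = X @ W_unit                                     # = ||x_i|| cos(theta_{j,i})
    logits = logits - logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()          # average over N samples

# Hypothetical usage with random data: 8 samples, 4-d features, 3 classes.
rng = np.random.default_rng(2)
X = rng.standard_normal((8, 4))
y = rng.integers(0, 3, size=8)
W = rng.standard_normal((4, 3))
loss = modified_softmax_loss(X, y, W)
```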