We start with our initial optimization problem

$$\underset{\overrightarrow{\theta_G}}{\min}\underset{\overrightarrow{\theta_D}}{\max} \hspace{3pt} \Bigg(\text{Pr}\bigg(D(\overrightarrow{\theta_D}, R) = |\text{real}\rangle\bigg) - \hspace{2pt} \text{Pr}\bigg(D(\overrightarrow{\theta_D}, G(\overrightarrow{\theta_G},z)) = |\text{fake}\rangle\bigg)\Bigg)$$

and then rewrite it as a cost function in terms of expectation values instead of probabilities

$$V(\overrightarrow{\theta_G}, \overrightarrow{\theta_D}) = \Bigg( \frac{1}{2}\Big(\text{tr}(Z\rho^{DR}(\overrightarrow{\theta_D}))+1\Big)\Bigg) - \Bigg( \frac{1}{2} \Big( \text{tr}(Z\rho^{DG}(\overrightarrow{\theta_D}, \overrightarrow{\theta_G}, z))+1\Big)\Bigg)$$

which simplifies to

$$V(\overrightarrow{\theta_G}, \overrightarrow{\theta_D}) = \frac{1}{2}\Bigg(\text{tr}(Z\rho^{DR}(\overrightarrow{\theta_D}))- \text{tr}(Z\rho^{DG}(\overrightarrow{\theta_D}, \overrightarrow{\theta_G}, z))\Bigg)$$
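This simplification can be checked numerically. The sketch below is only an illustration: it assumes NumPy and uses two random single-qubit density matrices, `rho_dr` and `rho_dg`, as hypothetical stand-ins for $\rho^{DR}$ and $\rho^{DG}$ on the discriminator's output qubit. It confirms that $\frac{1}{2}(\text{tr}(Z\rho)+1)$ equals the probability of measuring $|0\rangle$, and that the constant offsets cancel in the difference.

```python
import numpy as np

# Pauli-Z observable measured on the discriminator's output qubit.
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def random_density_matrix(rng, dim=2):
    """Return a random valid density matrix (Hermitian, PSD, unit trace)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho)

def prob_zero(rho):
    """Probability of outcome |0>, written via the Z expectation value."""
    return 0.5 * (np.trace(Z @ rho).real + 1.0)

rng = np.random.default_rng(seed=0)
rho_dr = random_density_matrix(rng)  # hypothetical stand-in for rho^{DR}
rho_dg = random_density_matrix(rng)  # hypothetical stand-in for rho^{DG}

# Difference of probabilities, with the +1 offsets still present.
v_full = prob_zero(rho_dr) - prob_zero(rho_dg)
# Simplified form: the offsets have cancelled.
v_simplified = 0.5 * (np.trace(Z @ rho_dr) - np.trace(Z @ rho_dg)).real

print(np.isclose(v_full, v_simplified))
```

The random matrices carry no physical meaning here; any pair of valid density matrices gives the same agreement between the two forms.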

A constant positive factor does not change where a cost function attains its optimum, but for now we keep the factor of $\frac{1}{2}$ so that $V$ retains its meaning as a difference of probabilities.

Up to now, we have not constrained the bias of the data source. We do so by introducing an angle $\phi$ that parametrizes the bias of the source.

$$V(\overrightarrow{\theta_G}, \overrightarrow{\theta_D}) = \frac{1}{2}\Bigg( \text{cos}^2(\phi) \text{tr}\Big(Z\rho^{DR}(\overrightarrow{\theta_D})\Big) - \text{sin}^2(\phi) \text{tr}\Big(Z\rho^{DG}(\overrightarrow{\theta_D}, \overrightarrow{\theta_G}, z)\Big) \Bigg)$$

where $\text{cos}^2(\phi)$ and $\text{sin}^2(\phi)$ denote the probabilities of feeding real data and generated data, respectively.

Assuming a balanced random choice between $R$ and $G$ ($\phi = \pi/4$), the cost function has the form

$$V(\overrightarrow{\theta_G}, \overrightarrow{\theta_D}) = \frac{1}{2}\Bigg( \frac{1}{2}\text{tr}\Big(Z\rho^{DR}(\overrightarrow{\theta_D})\Big) - \frac{1}{2} \text{tr}\Big(Z\rho^{DG}(\overrightarrow{\theta_D}, \overrightarrow{\theta_G}, z)\Big) \Bigg)$$

which can be simplified to

$$V(\overrightarrow{\theta_G}, \overrightarrow{\theta_D}) = \frac{1}{4}\Bigg(\text{tr}\Big(Z\rho^{DR}(\overrightarrow{\theta_D})\Big) - \text{tr}\Big(Z\rho^{DG}(\overrightarrow{\theta_D}, \overrightarrow{\theta_G}, z)\Big) \Bigg)$$
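To see how the bias angle enters, here is a minimal sketch, assuming NumPy. The function name `cost` and the two expectation values are hypothetical and chosen only for illustration. It evaluates the $\phi$-weighted cost and compares the balanced case $\phi = \pi/4$ against the $\frac{1}{4}$ form.

```python
import numpy as np

def cost(exp_z_real, exp_z_fake, phi):
    """Biased cost: 1/2 * (cos^2(phi) * <Z>_DR - sin^2(phi) * <Z>_DG)."""
    return 0.5 * (np.cos(phi) ** 2 * exp_z_real
                  - np.sin(phi) ** 2 * exp_z_fake)

# Hypothetical expectation values in [-1, 1], not taken from any experiment.
exp_z_real, exp_z_fake = 0.6, -0.2

# Balanced source: cos^2(pi/4) = sin^2(pi/4) = 1/2.
balanced = cost(exp_z_real, exp_z_fake, np.pi / 4)
quarter_form = 0.25 * (exp_z_real - exp_z_fake)

print(np.isclose(balanced, quarter_form))
```

Setting `phi=0.0` in this sketch corresponds to a source that only ever supplies real data, so only the first term contributes.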

By the linearity of the trace, the cost function can be written as

$$V(\overrightarrow{\theta_G}, \overrightarrow{\theta_D}) = \frac{1}{4}\Bigg(\text{tr}\bigg(\big(\rho^{DR}(\overrightarrow{\theta_D})- \rho^{DG}(\overrightarrow{\theta_D}, \overrightarrow{\theta_G}, z)\big) Z\bigg)\Bigg)$$
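The step from two traces to a single trace uses linearity together with the cyclic property $\text{tr}(Z\rho) = \text{tr}(\rho Z)$. As a rough numerical check, the sketch below assumes NumPy and again uses random density matrices as hypothetical stand-ins for $\rho^{DR}$ and $\rho^{DG}$.

```python
import numpy as np

Z = np.array([[1, 0], [0, -1]], dtype=complex)

def random_density_matrix(rng, dim=2):
    """Return a random valid density matrix (Hermitian, PSD, unit trace)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho)

rng = np.random.default_rng(seed=1)
rho_dr = random_density_matrix(rng)  # hypothetical stand-in for rho^{DR}
rho_dg = random_density_matrix(rng)  # hypothetical stand-in for rho^{DG}

# Two separate traces, as in the earlier form of the cost function.
separate = np.trace(Z @ rho_dr) - np.trace(Z @ rho_dg)
# One trace of the difference of states times Z.
combined = np.trace((rho_dr - rho_dg) @ Z)

print(np.isclose(separate, combined))
```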

Dropping the scaling factors, the quantum optimization problem has the final form

$$\underset{\overrightarrow{\theta_G}}{\min}\space \underset{\overrightarrow{\theta_D}}{\max} \space \text{tr}\bigg(\big(\rho^{DR}(\overrightarrow{\theta_D})- \rho^{DG}(\overrightarrow{\theta_D}, \overrightarrow{\theta_G}, z)\big) Z\bigg)$$

Pavan.J