To enable analytical services driven by historical data processing, DeFi Rating has introduced the ‘Q’ formula, based on the Q-learning algorithm.

The formula will also become the official logotype of the DeFi rating tool:

$Q(s,a) = r(s,a) + \gamma \max_{a'} Q(s',a')$

Q(s,a) is the expected value, or cumulative reward, of taking action a in state s and following the best available policy afterwards.

r(s,a) (reward) is the value received immediately after completing action a in state s.

γ (gamma) is a discount factor between 0 and 1 that balances immediate and future rewards.

max takes the largest expected value over the actions available in the next state s′; discounted by γ, it is added to the reward for the current state.
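
As a rough illustration of how the terms of the formula combine, the sketch below computes the right-hand side for one state-action pair. The function name `bellman_target` and all numbers are hypothetical and chosen only for the example; they are not part of the DeFi Rating tool.

```python
def bellman_target(reward, next_q_values, gamma=0.9):
    """Return r(s, a) + gamma * max over next actions of Q(s', a').

    next_q_values holds the current Q estimates for every action
    available in the next state; an empty list means s' is terminal.
    """
    best_future = max(next_q_values) if next_q_values else 0.0
    return reward + gamma * best_future


# Hypothetical inputs: an immediate reward of 1.0 and three candidate
# actions in the next state with estimated values 0.5, 2.0 and 1.2.
target = bellman_target(1.0, [0.5, 2.0, 1.2], gamma=0.9)
print(round(target, 6))
```

With these inputs the best future value is 2.0, so the result is 1.0 + 0.9 × 2.0 = 2.8. A smaller γ would shrink the contribution of the future term and make the immediate reward dominate.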

What is Q-learning?

Q-learning is a model-free reinforcement learning algorithm. It is aimed at finding the best action in the current circumstances/state. The Q-learning function can learn from actions taken outside the current policy, such as random actions, and it needs no model of the environment. In other words, the formula helps us understand how useful a particular action is for getting some reward in the future.

Q-learning uses temporal differences (TD) to estimate the value of Q. That means an agent learns from the environment through episodes without having any prior knowledge of the environment. That’s why it can be applied to DeFi, a newly emerged segment whose rules and tendencies are still forming.
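
A single temporal-difference step could be sketched as follows. This is a minimal tabular example, not the DeFi Rating implementation: the learning rate `alpha`, the state names `"s0"` and `"s1"`, and the action names are assumptions introduced only for illustration. With `alpha=1` the update reduces to the formula given earlier.

```python
from collections import defaultdict


def td_update(q, state, action, reward, next_state, actions,
              alpha=0.1, gamma=0.9):
    """Apply one Q-learning update in place and return the TD error.

    alpha (learning rate) is an assumed hyperparameter; it controls
    how far the old estimate moves toward the new target.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


# The table starts empty: unseen (state, action) pairs default to 0.0,
# which reflects an agent with no prior knowledge of the environment.
q = defaultdict(float)
actions = ["hold", "rebalance"]  # hypothetical action names
err = td_update(q, "s0", "rebalance", 1.0, "s1", actions)
print(err, q[("s0", "rebalance")])
```

The TD error is the gap between the observed target and the current estimate; the estimate is nudged toward the target by a fraction `alpha` of that gap.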

Because Q-learning is an artificial intelligence (AI) technique, the agent does not learn from a single episode; over many episodes it develops a strategy that eventually finds the optimal Q values. Machine learning can speed up the approximation of the Q function. DeFi Rating will use AI algorithms too, which will accelerate the algorithm’s development.
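
To show how Q values improve over many episodes rather than one, here is a small self-contained sketch on a toy environment. Everything in it is an assumption made for illustration: a five-state chain where the agent steps left or right and receives a reward of 1 on reaching the last state, epsilon-greedy exploration, and the hyperparameter values. It says nothing about how DeFi Rating models its own states or rewards.

```python
import random


def train(episodes=500, n_states=5, alpha=0.5, gamma=0.9,
          epsilon=0.2, seed=0):
    """Run tabular Q-learning on a toy chain and return the Q table."""
    rng = random.Random(seed)
    actions = (-1, 1)  # step left or step right
    goal = n_states - 1
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}

    for _ in range(episodes):
        s = 0
        while s != goal:
            if rng.random() < epsilon:
                a = rng.choice(actions)  # explore with a random action
            else:
                # Exploit the current estimates, breaking ties randomly.
                best = max(q[(s, b)] for b in actions)
                a = rng.choice([b for b in actions if q[(s, b)] == best])

            s2 = min(max(s + a, 0), goal)  # stay inside the chain
            r = 1.0 if s2 == goal else 0.0
            # A terminal next state contributes no future value.
            best_next = 0.0 if s2 == goal else max(q[(s2, b)] for b in actions)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    return q


q = train()
# Greedy action per non-terminal state after training.
policy = [max((-1, 1), key=lambda a: q[(s, a)]) for s in range(4)]
print(policy)
```

After the first episode only the state next to the goal has a useful estimate; repeated episodes propagate that value backwards along the chain until the greedy action in every state points toward the goal.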
