First method
→ Classification models: XGBoost (as the baseline) and a Transformer
→ Get stock data with pykrx and news with gnews
→ Run sentiment analysis on the news with ‘tabularisai/multilingual-sentiment-analysis’
→ Add technical features with ‘ta’
→ Use class_weights in CrossEntropyLoss to compensate for the imbalanced classes (decline / flat / rise)
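A minimal sketch of how the class weights could be derived, assuming inverse-frequency weighting (the notes don't say which scheme was used). The label list and the 0/1/2 encoding are hypothetical. The hand-written loss mirrors what `torch.nn.CrossEntropyLoss(weight=...)` computes with its default mean reduction, which divides by the sum of the target weights.

```python
from collections import Counter
import math

def inverse_frequency_weights(labels, num_classes):
    """Weight class c by n_samples / (num_classes * count_c); rare classes get larger weights."""
    counts = Counter(labels)
    n = len(labels)
    return [n / (num_classes * counts[c]) if counts[c] else 0.0
            for c in range(num_classes)]

def weighted_cross_entropy(logits, labels, weights):
    """Weighted mean of per-sample negative log-likelihoods,
    normalized by the sum of the weights of the target classes."""
    total, weight_sum = 0.0, 0.0
    for row, y in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += weights[y] * (log_z - row[y])
        weight_sum += weights[y]
    return total / weight_sum

# Hypothetical labels: 0 = decline, 1 = flat, 2 = rise
labels = [0, 0, 1, 1, 1, 1, 1, 1, 2, 2]
weights = inverse_frequency_weights(labels, num_classes=3)
print(weights)  # the "flat" class gets the smallest weight
```

In PyTorch the same list would be passed as `weight=torch.tensor(weights)` when constructing the loss.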
XGBoost: 43.5%, Transformer: 41.7%
⇒ Class Imbalance problem?
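One way to check the imbalance hypothesis is to compare the scores against a majority-class baseline and look at per-class recall. The notes don't give the class distribution, so the labels and predictions below are hypothetical placeholders, and this is only a sketch of the check.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)

def per_class_recall(y_true, y_pred, classes):
    """Fraction of each true class that the model predicted correctly."""
    recall = {}
    for c in classes:
        idx = [i for i, y in enumerate(y_true) if y == c]
        recall[c] = (sum(y_pred[i] == c for i in idx) / len(idx)
                     if idx else float("nan"))
    return recall

# Hypothetical data, not the real results
y_true = ["flat", "flat", "flat", "rise", "decline", "rise"]
y_pred = ["flat", "flat", "flat", "flat", "flat", "rise"]

print(majority_baseline_accuracy(y_true))
print(per_class_recall(y_true, y_pred, ["decline", "flat", "rise"]))
```

If the model's accuracy is close to the baseline and the recall for the minority classes is near zero, the model is mostly predicting the dominant class.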
Trials (but all failed):
Okay. I feel like I'm missing something.
Turn the classification problem into a regression problem
→ RMSE: 0.0212
Impressive…. It’s even worse. Okay, regression is much harder than classification.
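An RMSE is hard to judge on its own, so it helps to compare it with a naive predictor. The sketch below assumes the regression target is a daily return (the notes don't say), and `skill_vs_zero` is a hypothetical helper name. A positive score means the model beats always predicting zero.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )

def skill_vs_zero(y_true, y_pred):
    """1 - RMSE_model / RMSE_zero, where RMSE_zero always predicts 0."""
    zero_rmse = rmse(y_true, [0.0] * len(y_true))
    return 1.0 - rmse(y_true, y_pred) / zero_rmse

# Hypothetical daily returns and predictions
actual = [0.01, -0.02, 0.015, -0.005]
predicted = [0.002, -0.001, 0.003, 0.0]

print(rmse(actual, predicted))
print(skill_vs_zero(actual, predicted))
```

A score near zero would suggest the model has learned little beyond the scale of the returns.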
Roll back to classification
Add macroeconomic data from FRED
⇒ Still low perf…
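One thing worth checking when mixing macro data with daily prices is how the two are aligned: macro series are published less often than stocks trade, and each value should only be used from its release date onward. A minimal as-of join, assuming sorted release dates; the dates and values are made up, and nothing here reflects how the original pipeline joined the data.

```python
from bisect import bisect_right
from datetime import date

def asof_align(trading_days, release_dates, values):
    """For each trading day, take the latest value released on or before
    that day (None if nothing has been released yet).
    release_dates must be sorted in ascending order."""
    aligned = []
    for day in trading_days:
        i = bisect_right(release_dates, day)  # releases with date <= day
        aligned.append(values[i - 1] if i else None)
    return aligned

# Hypothetical monthly macro series and trading days
release_dates = [date(2024, 1, 10), date(2024, 2, 12)]
values = [3.1, 3.4]
trading_days = [date(2024, 1, 9), date(2024, 1, 10),
                date(2024, 2, 9), date(2024, 2, 13)]

print(asof_align(trading_days, release_dates, values))
```

Joining on the observation period instead of the release date would leak future information into the features.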