
Sentiment Analysis in the Turkish Stock Market (BIST): Generating Signals from Financial News with Qwen and Llama

Building a time-aware, Turkish-native NLP pipeline using Qwen and Llama for financial news signal extraction on Borsa Istanbul.

Abstract

Sentiment analysis in equities is easy to oversell and hard to do well. This paper presents a practical framework for building a time-aware, Turkish-native financial NLP pipeline using open-weight LLMs (Qwen and Llama) for signal extraction on Borsa Istanbul. We address label design, forward return targets, local inference prototyping, and the critical pitfalls of leakage, non-stationarity, and Turkish morphology in financial text.

Key Takeaways

Sentiment as a Forecast Variable

Sentiment analysis in equities is easy to oversell and hard to do well. The useful version treats sentiment as a conditional forecast variable: a noisy signal that may explain cross-sectional returns, volatility, or volume once aligned to the correct event timestamp and trading horizon.

In the Turkish market, open-weight LLM ecosystems have matured significantly. The Qwen team has publicly released Qwen3-family weights, and Meta promotes Llama 4-family models and Llama Stack distributions for self-hosted workflows. That makes a local, Turkish-language financial NLP stack increasingly practical.

The central problem is label design. For a news item arriving at time \(t\), a common target is the forward return over horizon \(h\):

Forward Return Target
$$r_{t,h} = \ln\left(\frac{P_{t+h}}{P_t}\right)$$

A practical sentiment score can be defined as:

Sentiment Score
$$s_t = p(\text{bullish} \mid x_t) - p(\text{bearish} \mid x_t)$$
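The two definitions above are straightforward to compute. A minimal sketch (the price and probability values below are hypothetical, for illustration only):

```python
import math

def forward_log_return(p_t: float, p_t_plus_h: float) -> float:
    """Forward return target r_{t,h} = ln(P_{t+h} / P_t)."""
    return math.log(p_t_plus_h / p_t)

def sentiment_score(p_bullish: float, p_bearish: float) -> float:
    """Sentiment score s_t = p(bullish | x_t) - p(bearish | x_t), in [-1, 1]."""
    return p_bullish - p_bearish

# Hypothetical example: price moves 100.0 -> 103.0 over horizon h,
# and the classifier outputs p(bullish) = 0.62, p(bearish) = 0.18.
r = forward_log_return(100.0, 103.0)  # ln(1.03) ~ 0.0296
s = sentiment_score(0.62, 0.18)       # 0.44
```

Note that \(P_t\) must be the last price observable *before* the news timestamp, and \(P_{t+h}\) a price strictly after it; otherwise the target leaks future information.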

Local Inference Prototype

bist_sentiment.py (Python)
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "Qwen/Qwen3-8B"  # replace with local checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Qwen tokenizers ship without a pad token

# Note: the 3-way classification head is randomly initialized here; it must be
# fine-tuned on labeled Turkish financial news before its outputs are meaningful.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)
model.config.pad_token_id = tokenizer.pad_token_id
model.eval()

texts = [
    # "The company reported net profit above expectations and announced a new investment plan."
    "Şirket, beklentilerin üzerinde net kar açıkladı ve yeni yatırım planı duyurdu.",
    # "Selling pressure on bank stocks is rising after the rate decision."
    "Faiz kararı sonrası banka hisselerinde satış baskısı artıyor.",
]

inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)

print(probs)  # one row per text: [bearish, neutral, bullish] after fine-tuning

Critical Pitfalls

The edge comes not from "using an LLM," but from building a time-aware, Turkish-native, market-aligned inference pipeline. The main failure modes are lookahead leakage (labels built from prices the model could not have observed at publication time), non-stationarity (the news-return relationship drifts across market regimes), and Turkish morphology (agglutinative word forms that complicate tokenization and ticker/entity matching). The alpha comes from labeling discipline, entity resolution, and proper out-of-sample testing.
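The leakage discipline argued for above reduces to a strict alignment rule: a news item may only be labeled with price bars strictly after its publication timestamp. A minimal sketch of that rule, using the standard library (bar times and the helper name are illustrative assumptions, not part of any specific pipeline):

```python
from bisect import bisect_right
from datetime import datetime

def next_bar_index(bar_times, news_time):
    """Index of the first price bar strictly after the news timestamp.

    Labeling r_{t,h} from this bar onward guarantees the target never
    uses prices observable before publication (no lookahead leakage).
    Returns None if no later bar exists.
    """
    i = bisect_right(bar_times, news_time)  # first bar with time > news_time
    return i if i < len(bar_times) else None

# Hypothetical intraday bars for a BIST ticker:
bars = [
    datetime(2025, 1, 2, 10, 0),
    datetime(2025, 1, 2, 11, 0),
    datetime(2025, 1, 2, 12, 0),
]

# News published at 10:30 is labeled against the 11:00 bar onward.
idx = next_bar_index(bars, datetime(2025, 1, 2, 10, 30))  # 1
```

Using `bisect_right` (rather than `bisect_left`) makes the comparison strict: news timestamped exactly at a bar close is labeled from the *next* bar, which is the conservative choice when exchange and newswire clocks may disagree.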