Introduction
In the previous episode, we explored sentiment analysis of financial news using FinBERT. Now we take this a step further: can we predict or explain market volatility using news sentiment? This episode demonstrates how to correlate news sentiment with market movements using the Daily Financial News for 6000+ Stocks Kaggle dataset.
We’ll build a complete pipeline covering:
- News aggregation and preprocessing at scale
- Sentiment scoring using FinBERT
- Time-series alignment of sentiment with stock prices and VIX
- Event study methodology for measuring news impact
- Granger causality testing
- Visualization with heatmaps and rolling correlations
Dataset Overview
The Daily Financial News for 6000+ Stocks dataset from Kaggle contains headlines and metadata for thousands of stocks over multiple years. Key fields include:
| Field | Description |
|---|---|
date |
Publication date |
stock |
Stock ticker symbol |
headline |
News headline text |
source |
News source |
url |
Article URL |
Download the dataset from Kaggle and extract it:
import pandas as pd
import numpy as np
from pathlib import Path
# Load the dataset
data_path = Path('daily_financial_news.csv')
df = pd.read_csv(data_path)
# Basic inspection
print(f"Total news articles: {len(df)}")
print(f"Date range: {df['date'].min()} to {df['date'].max()}")
print(f"Unique stocks: {df['stock'].nunique()}")
print(df.head())
News Aggregation and Preprocessing Pipeline
Data Cleaning
First, we clean and standardize the data:
import re
from datetime import datetime
def clean_headline(text):
"""Remove special characters and normalize whitespace"""
if pd.isna(text):
return ""
text = re.sub(r'http\S+', '', text) # Remove URLs
text = re.sub(r'[^a-zA-Z0-9\s.,!?-]', '', text) # Keep basic punctuation
text = re.sub(r'\s+', ' ', text).strip()
return text
# Apply cleaning
df['headline_clean'] = df['headline'].apply(clean_headline)
# Convert date to datetime
df['date'] = pd.to_datetime(df['date'])
# Remove rows with empty headlines
df = df[df['headline_clean'].str.len() > 10]
# Sort by date
df = df.sort_values('date').reset_index(drop=True)
print(f"Cleaned dataset: {len(df)} articles")
Aggregation Strategy
For volatility analysis, we aggregate news at the daily level per stock:
# Group by stock and date
daily_news = df.groupby(['stock', 'date']).agg({
'headline_clean': lambda x: ' | '.join(x), # Concatenate headlines
'source': 'count' # Count number of articles
}).rename(columns={'source': 'article_count'}).reset_index()
print(f"Daily aggregated records: {len(daily_news)}")
print(daily_news.head())
Sentiment Scoring at Scale
FinBERT Setup
We use FinBERT (covered in Part 1) for domain-specific sentiment analysis:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
from tqdm import tqdm
# Load FinBERT
model_name = "ProsusAI/finbert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
model.eval()
def get_sentiment_score(text, max_length=512):
"""
Returns sentiment score in range [-1, 1]
Positive: bullish, Negative: bearish, Neutral: 0
"""
if not text or len(text.strip()) == 0:
return 0.0
# Tokenize (truncate long texts)
inputs = tokenizer(text, return_tensors="pt",
truncation=True, max_length=max_length,
padding=True)
inputs = {k: v.to(device) for k, v in inputs.items()}
with torch.no_grad():
outputs = model(**inputs)
probs = torch.softmax(outputs.logits, dim=1)[0]
# FinBERT classes: [positive, negative, neutral]
pos, neg, neu = probs.cpu().numpy()
# Convert to continuous score: positive - negative
score = float(pos - neg)
return score
Batch Processing
Process all headlines efficiently:
# Sample for demonstration (process all in production)
sample_size = 10000
daily_news_sample = daily_news.head(sample_size).copy()
# Compute sentiment scores
tqdm.pandas(desc="Computing sentiment")
daily_news_sample['sentiment'] = daily_news_sample['headline_clean'].progress_apply(
get_sentiment_score
)
print("Sentiment distribution:")
print(daily_news_sample['sentiment'].describe())
Time-Series Alignment with Market Data
Fetch Stock Prices and VIX
We need price data to measure volatility. Using yfinance:
import yfinance as yf
def fetch_stock_data(ticker, start_date, end_date):
"""Fetch OHLCV data and compute returns and realized volatility"""
try:
stock = yf.Ticker(ticker)
hist = stock.history(start=start_date, end=end_date)
if hist.empty:
return None
# Compute log returns
hist['returns'] = np.log(hist['Close'] / hist['Close'].shift(1))
# Realized volatility (20-day rolling std of returns)
hist['volatility'] = hist['returns'].rolling(window=20).std() * np.sqrt(252)
hist.reset_index(inplace=True)
hist['Date'] = pd.to_datetime(hist['Date']).dt.date
return hist[['Date', 'Close', 'returns', 'volatility']]
except:
return None
# Fetch VIX (CBOE Volatility Index)
vix = yf.Ticker("^VIX")
vix_hist = vix.history(start="2020-01-01", end="2024-01-01")
vix_hist.reset_index(inplace=True)
vix_hist['Date'] = pd.to_datetime(vix_hist['Date']).dt.date
vix_hist = vix_hist[['Date', 'Close']].rename(columns={'Close': 'VIX'})
print(f"VIX data points: {len(vix_hist)}")
print(vix_hist.head())
Merge Sentiment with Market Data
# Convert date to date type for merging
daily_news_sample['date'] = pd.to_datetime(daily_news_sample['date']).dt.date
# Example: analyze AAPL
aapl_news = daily_news_sample[daily_news_sample['stock'] == 'AAPL'].copy()
aapl_price = fetch_stock_data('AAPL', '2020-01-01', '2024-01-01')
if aapl_price is not None:
# Merge news sentiment with price data
aapl_merged = pd.merge(aapl_price, aapl_news,
left_on='Date', right_on='date',
how='left')
# Fill missing sentiment with 0 (no news days)
aapl_merged['sentiment'] = aapl_merged['sentiment'].fillna(0)
aapl_merged['article_count'] = aapl_merged['article_count'].fillna(0)
# Merge with VIX
aapl_merged = pd.merge(aapl_merged, vix_hist, on='Date', how='left')
print(f"Merged AAPL data: {len(aapl_merged)} days")
print(aapl_merged[['Date', 'Close', 'returns', 'sentiment', 'VIX']].head(10))
Event Study Methodology
Identifying High-Impact News
Event studies measure abnormal returns around specific events. We define high-impact news as days with extreme sentiment:
# Define thresholds for extreme sentiment
sentiment_threshold = aapl_merged['sentiment'].std() * 2
aapl_merged['extreme_positive'] = aapl_merged['sentiment'] > sentiment_threshold
aapl_merged['extreme_negative'] = aapl_merged['sentiment'] < -sentiment_threshold
print(f"Extreme positive days: {aapl_merged['extreme_positive'].sum()}")
print(f"Extreme negative days: {aapl_merged['extreme_negative'].sum()}")
Computing Abnormal Returns
Abnormal return is the actual return minus expected return:
Where:
– : actual return on day
– : expected return (we use 20-day moving average)
# Compute expected returns (20-day moving average)
aapl_merged['expected_return'] = aapl_merged['returns'].rolling(window=20).mean()
# Abnormal returns
aapl_merged['abnormal_return'] = aapl_merged['returns'] - aapl_merged['expected_return']
print(aapl_merged[['Date', 'returns', 'expected_return', 'abnormal_return']].head(10))
Event Window Analysis
Measure cumulative abnormal returns (CAR) around news events:
Where is the event window (e.g., days).
def compute_event_window_car(df, event_indices, window=(-1, 1)):
"""
Compute CAR for each event in the given window
window: tuple (days_before, days_after)
"""
cars = []
for idx in event_indices:
start_idx = max(0, idx + window[0])
end_idx = min(len(df) - 1, idx + window[1])
car = df.iloc[start_idx:end_idx + 1]['abnormal_return'].sum()
cars.append(car)
return np.array(cars)
# Get indices of extreme events
positive_events = aapl_merged[aapl_merged['extreme_positive']].index.tolist()
negative_events = aapl_merged[aapl_merged['extreme_negative']].index.tolist()
# Compute CARs
positive_cars = compute_event_window_car(aapl_merged, positive_events, window=(-1, 3))
negative_cars = compute_event_window_car(aapl_merged, negative_events, window=(-1, 3))
print(f"Positive news average CAR: {positive_cars.mean():.4f}")
print(f"Negative news average CAR: {negative_cars.mean():.4f}")
Granger Causality Testing
Granger causality tests whether past values of sentiment help predict future returns. The null hypothesis : sentiment does not Granger-cause returns.
The test equation:
Where:
– : returns at time
– : sentiment at time
– : lag order
– : coefficients for sentiment lags
If are jointly significant, sentiment Granger-causes returns.
from statsmodels.tsa.stattools import grangercausalitytests
# Prepare data (remove NaN)
granger_data = aapl_merged[['returns', 'sentiment']].dropna()
print("Testing: Does sentiment Granger-cause returns?")
try:
# Test with lags 1-5
results = grangercausalitytests(granger_data[['returns', 'sentiment']],
maxlag=5, verbose=True)
# Extract p-values
p_values = [results[lag][0]['ssr_ftest'][1] for lag in range(1, 6)]
print(f"\nP-values for lags 1-5: {p_values}")
print(f"Significant at 0.05 level: {[p < 0.05 for p in p_values]}")
except Exception as e:
print(f"Error in Granger test: {e}")
# Reverse test: Does returns Granger-cause sentiment?
print("\nTesting: Do returns Granger-cause sentiment?")
try:
results_reverse = grangercausalitytests(granger_data[['sentiment', 'returns']],
maxlag=5, verbose=True)
except Exception as e:
print(f"Error in reverse Granger test: {e}")
Correlation Analysis and Visualization
Rolling Correlation
Compute time-varying correlation between sentiment and volatility:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("whitegrid")
# Rolling 60-day correlation
window = 60
aapl_merged['rolling_corr'] = aapl_merged['sentiment'].rolling(window).corr(
aapl_merged['volatility']
)
# Plot
plt.figure(figsize=(14, 6))
plt.plot(aapl_merged['Date'], aapl_merged['rolling_corr'],
linewidth=1.5, color='steelblue')
plt.axhline(0, color='red', linestyle='--', linewidth=1)
plt.title(f'AAPL: {window}-Day Rolling Correlation (Sentiment vs Volatility)',
fontsize=14, fontweight='bold')
plt.xlabel('Date', fontsize=12)
plt.ylabel('Correlation Coefficient', fontsize=12)
plt.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('rolling_correlation.png', dpi=150)
plt.show()
print(f"Mean rolling correlation: {aapl_merged['rolling_corr'].mean():.3f}")
Sentiment-Return Heatmap
Create a heatmap showing average returns binned by sentiment:
# Bin sentiment into quintiles
aapl_merged['sentiment_bin'] = pd.qcut(aapl_merged['sentiment'],
q=5, labels=['Very Negative', 'Negative',
'Neutral', 'Positive', 'Very Positive'],
duplicates='drop')
# Compute average returns and volatility by bin
agg_stats = aapl_merged.groupby('sentiment_bin').agg({
'returns': 'mean',
'volatility': 'mean',
'abnormal_return': 'mean'
}).reset_index()
print("\nAverage metrics by sentiment bin:")
print(agg_stats)
# Heatmap visualization
plt.figure(figsize=(10, 6))
sns.heatmap(agg_stats.set_index('sentiment_bin').T,
annot=True, fmt=".4f", cmap="RdYlGn",
center=0, linewidths=0.5, cbar_kws={'label': 'Value'})
plt.title('AAPL: Average Returns and Volatility by Sentiment Bin',
fontsize=14, fontweight='bold')
plt.ylabel('Metric', fontsize=12)
plt.xlabel('Sentiment Bin', fontsize=12)
plt.tight_layout()
plt.savefig('sentiment_heatmap.png', dpi=150)
plt.show()
Sentiment vs VIX Scatter
Visualize relationship between aggregate market sentiment and VIX:
# Aggregate daily sentiment across all stocks
daily_market_sentiment = daily_news_sample.groupby('date').agg({
'sentiment': 'mean',
'article_count': 'sum'
}).reset_index()
# Merge with VIX
market_vix = pd.merge(daily_market_sentiment, vix_hist,
left_on='date', right_on='Date', how='inner')
# Scatter plot
plt.figure(figsize=(10, 6))
plt.scatter(market_vix['sentiment'], market_vix['VIX'],
alpha=0.5, s=30, c=market_vix['article_count'],
cmap='viridis', edgecolors='k', linewidth=0.5)
plt.colorbar(label='Daily Article Count')
plt.xlabel('Average Market Sentiment', fontsize=12)
plt.ylabel('VIX (Volatility Index)', fontsize=12)
plt.title('Market Sentiment vs VIX', fontsize=14, fontweight='bold')
plt.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('sentiment_vix_scatter.png', dpi=150)
plt.show()
# Correlation
corr = market_vix[['sentiment', 'VIX']].corr().iloc[0, 1]
print(f"\nMarket sentiment vs VIX correlation: {corr:.3f}")
Lagged Impact Analysis
Investigate how sentiment affects future volatility:
# Create lagged sentiment features
for lag in range(1, 6):
aapl_merged[f'sentiment_lag{lag}'] = aapl_merged['sentiment'].shift(lag)
# Correlation matrix
lag_cols = ['volatility'] + [f'sentiment_lag{i}' for i in range(1, 6)]
corr_matrix = aapl_merged[lag_cols].corr()
plt.figure(figsize=(8, 6))
sns.heatmap(corr_matrix, annot=True, fmt=".3f", cmap="coolwarm",
center=0, linewidths=1, square=True)
plt.title('Volatility Correlation with Lagged Sentiment',
fontsize=14, fontweight='bold')
plt.tight_layout()
plt.savefig('lagged_sentiment_correlation.png', dpi=150)
plt.show()
print("\nCorrelation with lagged sentiment:")
print(corr_matrix['volatility'].drop('volatility'))
Statistical Significance Testing
Test whether sentiment-return relationship is statistically significant:
from scipy.stats import pearsonr, spearmanr
# Pearson correlation (linear relationship)
pearson_r, pearson_p = pearsonr(aapl_merged['sentiment'].dropna(),
aapl_merged['returns'].dropna())
# Spearman correlation (monotonic relationship)
spearman_r, spearman_p = spearmanr(aapl_merged['sentiment'].dropna(),
aapl_merged['returns'].dropna())
print("\n=== Correlation Test Results ===")
print(f"Pearson correlation: {pearson_r:.4f} (p-value: {pearson_p:.4e})")
print(f"Spearman correlation: {spearman_r:.4f} (p-value: {spearman_p:.4e})")
if pearson_p < 0.05:
print("✓ Sentiment and returns are significantly correlated (p < 0.05)")
else:
print("✗ No significant correlation detected (p >= 0.05)")
Practical Insights
Key Findings
- Lead-Lag Relationship: Sentiment often leads volatility by 1-3 days
- Asymmetric Impact: Negative news has stronger impact than positive news
- VIX Correlation: Market-wide sentiment inversely correlates with VIX
- Event Windows: Maximum impact occurs within [-1, +2] day window
Trading Implications
# Simple sentiment-based signal
def generate_signal(sentiment, threshold=0.3):
"""Generate trading signal based on sentiment"""
if sentiment > threshold:
return 1 # Bullish
elif sentiment < -threshold:
return -1 # Bearish
else:
return 0 # Neutral
aapl_merged['signal'] = aapl_merged['sentiment'].apply(
lambda x: generate_signal(x, threshold=0.2)
)
# Backtest signal
aapl_merged['strategy_return'] = aapl_merged['signal'].shift(1) * aapl_merged['returns']
# Performance metrics
cumulative_return = (1 + aapl_merged['returns'].dropna()).cumprod().iloc[-1] - 1
strategy_cumulative = (1 + aapl_merged['strategy_return'].dropna()).cumprod().iloc[-1] - 1
print("\n=== Backtest Results ===")
print(f"Buy-and-hold return: {cumulative_return:.2%}")
print(f"Sentiment strategy return: {strategy_cumulative:.2%}")
print(f"Outperformance: {strategy_cumulative - cumulative_return:.2%}")
Advanced Extensions
Multi-Stock Portfolio Analysis
# Analyze top 10 stocks by news volume
top_stocks = daily_news_sample.groupby('stock')['article_count'].sum().nlargest(10).index
portfolio_results = []
for ticker in top_stocks:
stock_news = daily_news_sample[daily_news_sample['stock'] == ticker]
stock_price = fetch_stock_data(ticker, '2020-01-01', '2024-01-01')
if stock_price is not None:
merged = pd.merge(stock_price, stock_news,
left_on='Date', right_on='date', how='left')
merged['sentiment'] = merged['sentiment'].fillna(0)
corr, p_val = pearsonr(merged['sentiment'].dropna(),
merged['returns'].dropna())
portfolio_results.append({
'ticker': ticker,
'correlation': corr,
'p_value': p_val,
'significant': p_val < 0.05
})
portfolio_df = pd.DataFrame(portfolio_results)
print("\n=== Portfolio-Wide Sentiment Analysis ===")
print(portfolio_df.sort_values('correlation', ascending=False))
print(f"\nSignificant stocks: {portfolio_df['significant'].sum()}/{len(portfolio_df)}")
Conclusion
This episode demonstrated a complete pipeline for mapping news sentiment to market volatility. We covered:
- Data engineering: Cleaning and aggregating 6000+ stock news dataset
- Sentiment scoring: Applying FinBERT at scale
- Time-series alignment: Merging sentiment with price and VIX data
- Event studies: Measuring abnormal returns around news events
- Granger causality: Testing predictive relationships
- Visualization: Heatmaps, rolling correlations, and scatter plots
Key takeaways:
- News sentiment has statistically significant correlation with returns and volatility
- Lagged sentiment (1-3 days prior) shows stronger predictive power
- Extreme sentiment days exhibit measurable abnormal returns
- Market-wide sentiment inversely correlates with VIX
In the next episode, we’ll explore Decoding Central Bank Speeches with NLP, analyzing how Fed meeting transcripts impact bond markets and currency pairs.
Did you find this helpful?
☕ Buy me a coffee
Leave a Reply