Part 2: Mapping Market Volatility to Global News Headlines

Updated Feb 6, 2026

Introduction

In the previous episode, we explored sentiment analysis of financial news using FinBERT. Now we take this a step further: can we predict or explain market volatility using news sentiment? This episode demonstrates how to correlate news sentiment with market movements using the Daily Financial News for 6000+ Stocks Kaggle dataset.

We’ll build a complete pipeline covering:

  • News aggregation and preprocessing at scale
  • Sentiment scoring using FinBERT
  • Time-series alignment of sentiment with stock prices and VIX
  • Event study methodology for measuring news impact
  • Granger causality testing
  • Visualization with heatmaps and rolling correlations

Dataset Overview

The Daily Financial News for 6000+ Stocks dataset from Kaggle contains headlines and metadata for thousands of stocks over multiple years. Key fields include:

Field       Description
---------   -----------------------------
date        Publication date
stock       Stock ticker symbol
headline    News headline text
source      News source
url         Article URL

Download the dataset from Kaggle and extract it:

import pandas as pd
import numpy as np
from pathlib import Path

# Load the dataset
data_path = Path('daily_financial_news.csv')
df = pd.read_csv(data_path)

# Basic inspection
print(f"Total news articles: {len(df)}")
print(f"Date range: {df['date'].min()} to {df['date'].max()}")
print(f"Unique stocks: {df['stock'].nunique()}")
print(df.head())

News Aggregation and Preprocessing Pipeline

Data Cleaning

First, we clean and standardize the data:

import re
from datetime import datetime

def clean_headline(text):
    """Remove special characters and normalize whitespace"""
    if pd.isna(text):
        return ""
    text = re.sub(r'http\S+', '', text)  # Remove URLs
    text = re.sub(r'[^a-zA-Z0-9\s.,!?-]', '', text)  # Keep basic punctuation
    text = re.sub(r'\s+', ' ', text).strip()
    return text

# Apply cleaning
df['headline_clean'] = df['headline'].apply(clean_headline)

# Convert date to datetime
df['date'] = pd.to_datetime(df['date'])

# Remove rows with empty or very short headlines (10 characters or fewer)
df = df[df['headline_clean'].str.len() > 10]

# Sort by date
df = df.sort_values('date').reset_index(drop=True)

print(f"Cleaned dataset: {len(df)} articles")

Aggregation Strategy

For volatility analysis, we aggregate news at the daily level per stock:

# Group by stock and date
daily_news = df.groupby(['stock', 'date']).agg({
    'headline_clean': lambda x: ' | '.join(x),  # Concatenate headlines
    'source': 'count'  # Count number of articles
}).rename(columns={'source': 'article_count'}).reset_index()

print(f"Daily aggregated records: {len(daily_news)}")
print(daily_news.head())
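
Counting articles via the source column quietly skips rows where source is missing. A named-aggregation variant that counts rows directly is a small defensive alternative (my own tweak; the output columns are the same as above):

# Equivalent aggregation, but 'size' counts every row even if source is NaN
daily_news = (
    df.groupby(['stock', 'date'])
      .agg(headline_clean=('headline_clean', ' | '.join),
           article_count=('headline_clean', 'size'))
      .reset_index()
)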

Sentiment Scoring at Scale

FinBERT Setup

We use FinBERT (covered in Part 1) for domain-specific sentiment analysis:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
from tqdm import tqdm

# Load FinBERT
model_name = "ProsusAI/finbert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
model.eval()

def get_sentiment_score(text, max_length=512):
    """
    Returns sentiment score in range [-1, 1]
    Positive: bullish, Negative: bearish, Neutral: 0
    """
    if not text or len(text.strip()) == 0:
        return 0.0

    # Tokenize (truncate long texts)
    inputs = tokenizer(text, return_tensors="pt", 
                      truncation=True, max_length=max_length,
                      padding=True)
    inputs = {k: v.to(device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model(**inputs)
        probs = torch.softmax(outputs.logits, dim=1)[0]

    # FinBERT classes: [positive, negative, neutral]
    pos, neg, neu = probs.cpu().numpy()

    # Convert to continuous score: positive - negative
    score = float(pos - neg)
    return score
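
Before scoring the whole dataset, a quick check on two invented headlines helps confirm the sign convention (the example strings are mine; the exact scores will vary):

# Should produce a clearly positive and a clearly negative score, respectively
print(get_sentiment_score("Company beats earnings expectations and raises full-year guidance"))
print(get_sentiment_score("Company misses revenue estimates amid weak demand"))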

Batch Processing

Process all headlines efficiently:

# Sample for demonstration (process all in production)
sample_size = 10000
daily_news_sample = daily_news.head(sample_size).copy()

# Compute sentiment scores
tqdm.pandas(desc="Computing sentiment")
daily_news_sample['sentiment'] = daily_news_sample['headline_clean'].progress_apply(
    get_sentiment_score
)

print("Sentiment distribution:")
print(daily_news_sample['sentiment'].describe())
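
The progress_apply above scores one record at a time. For the full dataset, passing batches of texts through the tokenizer and model is considerably faster on a GPU. A minimal sketch, assuming a batch size of 32 (the batch_score helper is mine, not part of the original post):

def batch_score(texts, batch_size=32, max_length=512):
    """Score a list of texts in batches; returns positive-minus-negative scores."""
    scores = []
    for i in tqdm(range(0, len(texts), batch_size), desc="Batched sentiment"):
        batch = [t if isinstance(t, str) and t.strip() else "" for t in texts[i:i + batch_size]]
        inputs = tokenizer(batch, return_tensors="pt", truncation=True,
                           max_length=max_length, padding=True)
        inputs = {k: v.to(device) for k, v in inputs.items()}
        with torch.no_grad():
            probs = torch.softmax(model(**inputs).logits, dim=1)
        # FinBERT class order: [positive, negative, neutral]
        scores.extend((probs[:, 0] - probs[:, 1]).cpu().numpy().tolist())
    return scores

# Drop-in replacement for the progress_apply call above
daily_news_sample['sentiment'] = batch_score(daily_news_sample['headline_clean'].tolist())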

Time-Series Alignment with Market Data

Fetch Stock Prices and VIX

We need price data to measure volatility. Using yfinance:

import yfinance as yf

def fetch_stock_data(ticker, start_date, end_date):
    """Fetch OHLCV data and compute returns and realized volatility"""
    try:
        stock = yf.Ticker(ticker)
        hist = stock.history(start=start_date, end=end_date)

        if hist.empty:
            return None

        # Compute log returns
        hist['returns'] = np.log(hist['Close'] / hist['Close'].shift(1))

        # Annualized realized volatility (20-day rolling std of daily returns, scaled by sqrt(252))
        hist['volatility'] = hist['returns'].rolling(window=20).std() * np.sqrt(252)

        hist.reset_index(inplace=True)
        hist['Date'] = pd.to_datetime(hist['Date']).dt.date

        return hist[['Date', 'Close', 'returns', 'volatility']]
    except Exception as e:
        print(f"Failed to fetch {ticker}: {e}")
        return None

# Fetch VIX (CBOE Volatility Index)
vix = yf.Ticker("^VIX")
vix_hist = vix.history(start="2020-01-01", end="2024-01-01")
vix_hist.reset_index(inplace=True)
vix_hist['Date'] = pd.to_datetime(vix_hist['Date']).dt.date
vix_hist = vix_hist[['Date', 'Close']].rename(columns={'Close': 'VIX'})

print(f"VIX data points: {len(vix_hist)}")
print(vix_hist.head())

Merge Sentiment with Market Data

# Convert date to date type for merging
daily_news_sample['date'] = pd.to_datetime(daily_news_sample['date']).dt.date

# Example: analyze AAPL
aapl_news = daily_news_sample[daily_news_sample['stock'] == 'AAPL'].copy()
aapl_price = fetch_stock_data('AAPL', '2020-01-01', '2024-01-01')

if aapl_price is not None:
    # Merge news sentiment with price data
    aapl_merged = pd.merge(aapl_price, aapl_news, 
                          left_on='Date', right_on='date', 
                          how='left')

    # Fill missing sentiment with 0 (no news days)
    aapl_merged['sentiment'] = aapl_merged['sentiment'].fillna(0)
    aapl_merged['article_count'] = aapl_merged['article_count'].fillna(0)

    # Merge with VIX
    aapl_merged = pd.merge(aapl_merged, vix_hist, on='Date', how='left')

    print(f"Merged AAPL data: {len(aapl_merged)} days")
    print(aapl_merged[['Date', 'Close', 'returns', 'sentiment', 'VIX']].head(10))

Event Study Methodology

Identifying High-Impact News

Event studies measure abnormal returns around specific events. We define high-impact news as days with extreme sentiment:

# Define thresholds for extreme sentiment
sentiment_threshold = aapl_merged['sentiment'].std() * 2

aapl_merged['extreme_positive'] = aapl_merged['sentiment'] > sentiment_threshold
aapl_merged['extreme_negative'] = aapl_merged['sentiment'] < -sentiment_threshold

print(f"Extreme positive days: {aapl_merged['extreme_positive'].sum()}")
print(f"Extreme negative days: {aapl_merged['extreme_negative'].sum()}")

Computing Abnormal Returns

Abnormal return AR_t is the actual return minus the expected return:

AR_t = R_t - E[R_t]

Where:
  • R_t: actual return on day t
  • E[R_t]: expected return (we use a 20-day moving average)

# Compute expected returns (20-day moving average)
aapl_merged['expected_return'] = aapl_merged['returns'].rolling(window=20).mean()

# Abnormal returns
aapl_merged['abnormal_return'] = aapl_merged['returns'] - aapl_merged['expected_return']

print(aapl_merged[['Date', 'returns', 'expected_return', 'abnormal_return']].head(10))

Event Window Analysis

Measure cumulative abnormal returns (CAR) around news events:

CAR_{(t_1, t_2)} = \sum_{t=t_1}^{t_2} AR_t

Where (t_1, t_2) is the event window (e.g., [-1, +1] days).

def compute_event_window_car(df, event_indices, window=(-1, 1)):
    """
    Compute CAR for each event in the given window
    window: tuple (days_before, days_after)
    """
    cars = []

    for idx in event_indices:
        start_idx = max(0, idx + window[0])
        end_idx = min(len(df) - 1, idx + window[1])

        car = df.iloc[start_idx:end_idx + 1]['abnormal_return'].sum()
        cars.append(car)

    return np.array(cars)

# Get indices of extreme events
positive_events = aapl_merged[aapl_merged['extreme_positive']].index.tolist()
negative_events = aapl_merged[aapl_merged['extreme_negative']].index.tolist()

# Compute CARs
positive_cars = compute_event_window_car(aapl_merged, positive_events, window=(-1, 3))
negative_cars = compute_event_window_car(aapl_merged, negative_events, window=(-1, 3))

print(f"Positive news average CAR: {positive_cars.mean():.4f}")
print(f"Negative news average CAR: {negative_cars.mean():.4f}")

Granger Causality Testing

Granger causality tests whether past values of sentiment help predict future returns. The null hypothesis H_0 is that sentiment does not Granger-cause returns.

The test equation:

R_t = \alpha + \sum_{i=1}^{p} \beta_i R_{t-i} + \sum_{i=1}^{p} \gamma_i S_{t-i} + \epsilon_t

Where:
  • R_t: returns at time t
  • S_t: sentiment at time t
  • p: lag order
  • \gamma_i: coefficients for the sentiment lags

If the \gamma_i are jointly significant, sentiment Granger-causes returns.

from statsmodels.tsa.stattools import grangercausalitytests

# Prepare data (remove NaN)
granger_data = aapl_merged[['returns', 'sentiment']].dropna()

print("Testing: Does sentiment Granger-cause returns?")
try:
    # Test with lags 1-5
    results = grangercausalitytests(granger_data[['returns', 'sentiment']], 
                                    maxlag=5, verbose=True)

    # Extract p-values
    p_values = [results[lag][0]['ssr_ftest'][1] for lag in range(1, 6)]
    print(f"\nP-values for lags 1-5: {p_values}")
    print(f"Significant at 0.05 level: {[p < 0.05 for p in p_values]}")
except Exception as e:
    print(f"Error in Granger test: {e}")

# Reverse test: Does returns Granger-cause sentiment?
print("\nTesting: Do returns Granger-cause sentiment?")
try:
    results_reverse = grangercausalitytests(granger_data[['sentiment', 'returns']], 
                                           maxlag=5, verbose=True)
except Exception as e:
    print(f"Error in reverse Granger test: {e}")

Correlation Analysis and Visualization

Rolling Correlation

Compute time-varying correlation between sentiment and volatility:

import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("whitegrid")

# Rolling 60-day correlation
window = 60
aapl_merged['rolling_corr'] = aapl_merged['sentiment'].rolling(window).corr(
    aapl_merged['volatility']
)

# Plot
plt.figure(figsize=(14, 6))
plt.plot(aapl_merged['Date'], aapl_merged['rolling_corr'], 
         linewidth=1.5, color='steelblue')
plt.axhline(0, color='red', linestyle='--', linewidth=1)
plt.title(f'AAPL: {window}-Day Rolling Correlation (Sentiment vs Volatility)', 
          fontsize=14, fontweight='bold')
plt.xlabel('Date', fontsize=12)
plt.ylabel('Correlation Coefficient', fontsize=12)
plt.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('rolling_correlation.png', dpi=150)
plt.show()

print(f"Mean rolling correlation: {aapl_merged['rolling_corr'].mean():.3f}")

Sentiment-Return Heatmap

Create a heatmap showing average returns binned by sentiment:

# Bin sentiment into quintiles
# Note: many zero-sentiment (no-news) days can produce duplicate quantile edges;
# duplicates='drop' then reduces the bin count, the five fixed labels no longer fit,
# and pd.qcut raises an error; binning only days with article_count > 0 is a fallback.
aapl_merged['sentiment_bin'] = pd.qcut(aapl_merged['sentiment'],
                                       q=5, labels=['Very Negative', 'Negative',
                                                    'Neutral', 'Positive', 'Very Positive'],
                                       duplicates='drop')

# Compute average returns and volatility by bin
agg_stats = aapl_merged.groupby('sentiment_bin').agg({
    'returns': 'mean',
    'volatility': 'mean',
    'abnormal_return': 'mean'
}).reset_index()

print("\nAverage metrics by sentiment bin:")
print(agg_stats)

# Heatmap visualization
plt.figure(figsize=(10, 6))
sns.heatmap(agg_stats.set_index('sentiment_bin').T, 
            annot=True, fmt=".4f", cmap="RdYlGn", 
            center=0, linewidths=0.5, cbar_kws={'label': 'Value'})
plt.title('AAPL: Average Returns and Volatility by Sentiment Bin', 
          fontsize=14, fontweight='bold')
plt.ylabel('Metric', fontsize=12)
plt.xlabel('Sentiment Bin', fontsize=12)
plt.tight_layout()
plt.savefig('sentiment_heatmap.png', dpi=150)
plt.show()

Sentiment vs VIX Scatter

Visualize relationship between aggregate market sentiment and VIX:

# Aggregate daily sentiment across all stocks
daily_market_sentiment = daily_news_sample.groupby('date').agg({
    'sentiment': 'mean',
    'article_count': 'sum'
}).reset_index()

# Merge with VIX
market_vix = pd.merge(daily_market_sentiment, vix_hist, 
                     left_on='date', right_on='Date', how='inner')

# Scatter plot
plt.figure(figsize=(10, 6))
plt.scatter(market_vix['sentiment'], market_vix['VIX'], 
           alpha=0.5, s=30, c=market_vix['article_count'], 
           cmap='viridis', edgecolors='k', linewidth=0.5)
plt.colorbar(label='Daily Article Count')
plt.xlabel('Average Market Sentiment', fontsize=12)
plt.ylabel('VIX (Volatility Index)', fontsize=12)
plt.title('Market Sentiment vs VIX', fontsize=14, fontweight='bold')
plt.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('sentiment_vix_scatter.png', dpi=150)
plt.show()

# Correlation
corr = market_vix[['sentiment', 'VIX']].corr().iloc[0, 1]
print(f"\nMarket sentiment vs VIX correlation: {corr:.3f}")

Lagged Impact Analysis

Investigate how sentiment affects future volatility:

# Create lagged sentiment features
for lag in range(1, 6):
    aapl_merged[f'sentiment_lag{lag}'] = aapl_merged['sentiment'].shift(lag)

# Correlation matrix
lag_cols = ['volatility'] + [f'sentiment_lag{i}' for i in range(1, 6)]
corr_matrix = aapl_merged[lag_cols].corr()

plt.figure(figsize=(8, 6))
sns.heatmap(corr_matrix, annot=True, fmt=".3f", cmap="coolwarm", 
            center=0, linewidths=1, square=True)
plt.title('Volatility Correlation with Lagged Sentiment', 
          fontsize=14, fontweight='bold')
plt.tight_layout()
plt.savefig('lagged_sentiment_correlation.png', dpi=150)
plt.show()

print("\nCorrelation with lagged sentiment:")
print(corr_matrix['volatility'].drop('volatility'))
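
Pairwise correlations look at each lag in isolation. A single regression of volatility on all five sentiment lags gives the joint picture, with Newey-West standard errors because rolling volatility is strongly autocorrelated (a sketch using statsmodels, not part of the original analysis):

import statsmodels.api as sm

# Regress realized volatility on the five lagged sentiment columns
ols_data = aapl_merged[['volatility'] + [f'sentiment_lag{i}' for i in range(1, 6)]].dropna()
X = sm.add_constant(ols_data.drop(columns='volatility'))
y = ols_data['volatility']

# HAC (Newey-West) covariance to account for autocorrelation in the rolling-volatility target
ols_model = sm.OLS(y, X).fit(cov_type='HAC', cov_kwds={'maxlags': 5})
print(ols_model.summary())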

Statistical Significance Testing

Test whether sentiment-return relationship is statistically significant:

from scipy.stats import pearsonr, spearmanr

# Align the two series first so both inputs have the same length
valid = aapl_merged[['sentiment', 'returns']].dropna()

# Pearson correlation (linear relationship)
pearson_r, pearson_p = pearsonr(valid['sentiment'], valid['returns'])

# Spearman correlation (monotonic relationship)
spearman_r, spearman_p = spearmanr(valid['sentiment'], valid['returns'])

print("\n=== Correlation Test Results ===")
print(f"Pearson correlation: {pearson_r:.4f} (p-value: {pearson_p:.4e})")
print(f"Spearman correlation: {spearman_r:.4f} (p-value: {spearman_p:.4e})")

if pearson_p < 0.05:
    print("✓ Sentiment and returns are significantly correlated (p < 0.05)")
else:
    print("✗ No significant correlation detected (p >= 0.05)")

Practical Insights

Key Findings

  1. Lead-Lag Relationship: Sentiment often leads volatility by 1-3 days
  2. Asymmetric Impact: Negative news has stronger impact than positive news
  3. VIX Correlation: Market-wide sentiment inversely correlates with VIX
  4. Event Windows: Maximum impact occurs within [-1, +2] day window

Trading Implications

# Simple sentiment-based signal
def generate_signal(sentiment, threshold=0.3):
    """Generate trading signal based on sentiment"""
    if sentiment > threshold:
        return 1  # Bullish
    elif sentiment < -threshold:
        return -1  # Bearish
    else:
        return 0  # Neutral

aapl_merged['signal'] = aapl_merged['sentiment'].apply(
    lambda x: generate_signal(x, threshold=0.2)
)

# Backtest signal
aapl_merged['strategy_return'] = aapl_merged['signal'].shift(1) * aapl_merged['returns']

# Performance metrics
cumulative_return = (1 + aapl_merged['returns'].dropna()).cumprod().iloc[-1] - 1
strategy_cumulative = (1 + aapl_merged['strategy_return'].dropna()).cumprod().iloc[-1] - 1

print("\n=== Backtest Results ===")
print(f"Buy-and-hold return: {cumulative_return:.2%}")
print(f"Sentiment strategy return: {strategy_cumulative:.2%}")
print(f"Outperformance: {strategy_cumulative - cumulative_return:.2%}")

Advanced Extensions

Multi-Stock Portfolio Analysis

# Analyze top 10 stocks by news volume
top_stocks = daily_news_sample.groupby('stock')['article_count'].sum().nlargest(10).index

portfolio_results = []

for ticker in top_stocks:
    stock_news = daily_news_sample[daily_news_sample['stock'] == ticker]
    stock_price = fetch_stock_data(ticker, '2020-01-01', '2024-01-01')

    if stock_price is not None:
        merged = pd.merge(stock_price, stock_news, 
                         left_on='Date', right_on='date', how='left')
        merged['sentiment'] = merged['sentiment'].fillna(0)

        # Align the two series before correlating (returns has NaN on the first day)
        valid = merged[['sentiment', 'returns']].dropna()
        corr, p_val = pearsonr(valid['sentiment'], valid['returns'])

        portfolio_results.append({
            'ticker': ticker,
            'correlation': corr,
            'p_value': p_val,
            'significant': p_val < 0.05
        })

portfolio_df = pd.DataFrame(portfolio_results)
print("\n=== Portfolio-Wide Sentiment Analysis ===")
print(portfolio_df.sort_values('correlation', ascending=False))
print(f"\nSignificant stocks: {portfolio_df['significant'].sum()}/{len(portfolio_df)}")

Conclusion

This episode demonstrated a complete pipeline for mapping news sentiment to market volatility. We covered:

  • Data engineering: Cleaning and aggregating a news dataset covering 6,000+ stocks
  • Sentiment scoring: Applying FinBERT at scale
  • Time-series alignment: Merging sentiment with price and VIX data
  • Event studies: Measuring abnormal returns around news events
  • Granger causality: Testing predictive relationships
  • Visualization: Heatmaps, rolling correlations, and scatter plots

Key takeaways:

  1. News sentiment shows a statistically significant correlation with returns and volatility
  2. Lagged sentiment (1-3 days prior) shows stronger predictive power
  3. Extreme sentiment days exhibit measurable abnormal returns
  4. Market-wide sentiment inversely correlates with VIX

In the next episode, we’ll explore Decoding Central Bank Speeches with NLP, analyzing how Fed meeting transcripts impact bond markets and currency pairs.

AI-Based Financial Text Mining Series (2/5)
