# Improving Time Series Analysis: STL Decomposition with Price Normalization
In the world of data analysis, understanding seasonal patterns in time series data can be crucial for making informed business decisions. Whether you're analyzing web traffic, product sales, or marketing impressions, seasonal trends often tell an important story. However, external factors like price changes can mask these patterns, making it difficult to identify true seasonality.
In this article, I'll walk you through an enhanced approach to STL (Seasonal and Trend decomposition using Loess) that normalizes for price impacts and automatically flags seasons and transition periods. This technique is particularly valuable for businesses where pricing dynamics can significantly affect performance metrics.
## The Challenge with Traditional Time Series Analysis
Standard time series decomposition separates data into three components:
- **Trend**: The long-term progression of the series
- **Seasonality**: Repeating patterns at fixed intervals
- **Residual**: The irregular remainder after removing trend and seasonality
But what happens when external factors like price changes influence your data? A sudden drop in prices might cause a spike in impressions or sales, which could be misinterpreted as a seasonal effect. This is where price normalization becomes essential.
## Solution: Price-Normalized STL Decomposition
I've developed a Python solution that enhances traditional STL decomposition by:
1. Normalizing your metrics (like impressions) against price fluctuations
2. Extracting clean seasonal patterns after removing price effects
3. Automatically flagging seasons and transition periods
4. Calculating seasonal strength to identify significant patterns
Let's dive into the implementation.
## The Implementation
First, let's import the necessary libraries:
```python
import pandas as pd
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.seasonal import STL
import matplotlib.pyplot as plt
```
Now, here's the core function that performs the price-normalized STL decomposition:
```python
def extract_normalized_stl(df, time_col, value_col, price_col=None, period=52):
    """
    Extract STL decomposition with price normalization

    Parameters:
    -----------
    df : pandas DataFrame
        Input data containing time series
    time_col : str
        Column name for datetime index
    value_col : str
        Column name for the metric to decompose (e.g., impressions)
    price_col : str, optional
        Column name for price data to normalize against
    period : int
        Number of observations per seasonal cycle (52 for weekly data with a yearly cycle)

    Returns:
    --------
    DataFrame with decomposition components and season flags
    """
    # Set a sorted datetime index (STL expects observations in time order)
    df = df.copy()
    df = df.set_index(pd.DatetimeIndex(df[time_col])).sort_index()

    # Normalize for price impact if price column is provided
    if price_col is not None:
        # Simple price elasticity model: log-log regression of value on price.
        # Both columns must be strictly positive for the logs to be defined.
        model = sm.OLS(
            np.log(df[value_col]),
            sm.add_constant(np.log(df[price_col]))
        ).fit()
        # params is a labeled Series (const, price); take the slope by position
        elasticity = model.params.iloc[1]
        print(f"Price elasticity: {elasticity:.4f}")

        # Normalize the value using the estimated price impact
        baseline_price = df[price_col].median()
        df['normalized_value'] = df[value_col] * (baseline_price / df[price_col]) ** elasticity
        target_col = 'normalized_value'
    else:
        target_col = value_col

    # Apply STL decomposition
    stl = STL(df[target_col], period=period, robust=True)
    result = stl.fit()

    # Extract components
    trend = result.trend
    seasonal = result.seasonal
    residual = result.resid

    # Create result DataFrame
    result_df = pd.DataFrame({
        'original': df[value_col],
        'trend': trend,
        'seasonal': seasonal,
        'residual': residual
    }, index=df.index)

    if price_col is not None:
        result_df['price'] = df[price_col]
        result_df['normalized_value'] = df['normalized_value']

    # Add seasonal strength metric (computed per observation)
    result_df['seasonal_strength'] = np.abs(seasonal) / (np.abs(trend) + np.abs(seasonal))

    # Flag seasons based on month
    result_df['month'] = df.index.month

    # Define seasons (Northern Hemisphere)
    result_df['season'] = 'all_season'
    result_df.loc[result_df['month'].isin([12, 1, 2]), 'season'] = 'winter'
    result_df.loc[result_df['month'].isin([3, 4, 5]), 'season'] = 'spring'
    result_df.loc[result_df['month'].isin([6, 7, 8]), 'season'] = 'summer'
    result_df.loc[result_df['month'].isin([9, 10, 11]), 'season'] = 'fall'

    # Flag cusps (transition months); other months are left as NaN
    cusps = {2: 'winter-spring', 5: 'spring-summer', 8: 'summer-fall', 11: 'fall-winter'}
    for month, cusp_name in cusps.items():
        result_df.loc[result_df['month'] == month, 'season_cusp'] = cusp_name

    return result_df
```
For visualization, I've also created a helpful plotting function:
```python
def plot_decomposition(result_df, title="STL Decomposition with Price Normalization"):
    """Plot the STL decomposition components"""
    fig, axes = plt.subplots(4, 1, figsize=(12, 10), sharex=True)

    # Top panel: original series, plus the price-normalized series if available
    if 'normalized_value' in result_df.columns:
        axes[0].plot(result_df.index, result_df['original'], label='Original')
        axes[0].plot(result_df.index, result_df['normalized_value'], label='Price Normalized')
        axes[0].set_ylabel('Value')
        axes[0].legend()
    else:
        axes[0].plot(result_df.index, result_df['original'])
        axes[0].set_ylabel('Original')

    axes[1].plot(result_df.index, result_df['trend'])
    axes[1].set_ylabel('Trend')

    axes[2].plot(result_df.index, result_df['seasonal'])
    axes[2].set_ylabel('Seasonal')

    axes[3].plot(result_df.index, result_df['residual'])
    axes[3].set_ylabel('Residual')

    plt.suptitle(title)
    plt.tight_layout()
    return fig
```
## How It Works
### 1. Price Normalization
The key step here is price normalization. Using a log-log regression of the metric on price, we estimate the price elasticity: the percentage change in the value associated with a 1% change in price. We then re-express every observation at a baseline price (the median in this case), which removes the estimated price effect from the data. Note that this assumes a single, constant elasticity over the whole period.
The formula used is:
```
normalized_value = original_value * (baseline_price / actual_price) ^ elasticity
```
This is the constant-elasticity (log-log) demand model from economics, in which the value is assumed to be proportional to `price ^ elasticity`.
### 2. STL Decomposition
After normalization, we apply STL decomposition, which is particularly effective because:
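- With `robust=True`, it downweights outliers instead of letting them distort the trend and seasonal estimates
- The STL algorithm was designed to tolerate missing values, though the `statsmodels` implementation expects a gap-free series, so interpolate or fill gaps before fitting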
- It allows for evolving seasonality
### 3. Season Flagging
The code automatically identifies and flags seasons based on the month:
- Winter (December, January, February)
- Spring (March, April, May)
- Summer (June, July, August)
- Fall (September, October, November)
Additionally, it marks transition months as "cusps" (e.g., February as "winter-spring"), which can be valuable for detecting shoulder-season effects.
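The same flags can also be produced with dictionary lookups, which some readers may find easier to adapt (for example, to Southern Hemisphere seasons). This is an alternative sketch, not the code used in the function above; the names `SEASON_BY_MONTH`, `CUSP_BY_MONTH`, and `flags` are illustrative.

```python
import pandas as pd

# Month-to-season lookup (Northern Hemisphere)
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}
# Transition months at the end of each season
CUSP_BY_MONTH = {2: "winter-spring", 5: "spring-summer", 8: "summer-fall", 11: "fall-winter"}

dates = pd.date_range("2023-01-01", periods=12, freq="MS")
flags = pd.DataFrame(index=dates)
flags["month"] = flags.index.month
flags["season"] = flags["month"].map(SEASON_BY_MONTH)
flags["season_cusp"] = flags["month"].map(CUSP_BY_MONTH)  # NaN outside cusp months
print(flags)
```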
### 4. Seasonal Strength Analysis
To quantify the significance of seasonal patterns, we calculate a seasonal strength metric:
```
seasonal_strength = |seasonal component| / (|trend component| + |seasonal component|)
```
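This gives us a value between 0 and 1 for each observation, where higher values indicate that the seasonal component is large relative to the trend at that point in time. Because the seasonal component oscillates around zero, the metric dips toward 0 wherever the seasonal curve crosses zero, even in a strongly seasonal series.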
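A small worked example of the formula, using made-up trend and seasonal values. The `np.divide` guard against a zero denominator is an addition for illustration and is not part of the function above.

```python
import numpy as np

trend = np.array([100.0, 100.0, 100.0, 100.0])
seasonal = np.array([25.0, -25.0, 0.0, 100.0])

denominator = np.abs(trend) + np.abs(seasonal)
# Guard against 0/0 when both components are zero at the same time step
strength = np.divide(
    np.abs(seasonal), denominator,
    out=np.zeros_like(denominator), where=denominator > 0,
)
print(strength)  # values: 0.2, 0.2, 0.0, 0.5
```

The sign of the seasonal component does not matter (25 and -25 both give 0.2), and a seasonal swing as large as the trend level gives 0.5.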
## Example Usage
Here's how you can apply this analysis to your own data:
```python
# Load your data
data = pd.read_csv("impressions_data.csv")
# Extract STL with price normalization
result = extract_normalized_stl(
    data,
    time_col='date',
    value_col='impressions',
    price_col='price',
    period=52  # Weekly data
)
# Plot results
plot_decomposition(result)
plt.show()
# Identify strong seasonal patterns
seasonal_threshold = 0.3 # Adjust based on your data
seasonal_periods = result[result['seasonal_strength'] > seasonal_threshold]
print(f"Periods with strong seasonality:\n{seasonal_periods[['season', 'seasonal_strength']].head()}")
```
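If you don't have a suitable CSV at hand, a synthetic dataset with the same three columns is enough to try the workflow. The sketch below is purely illustrative: the elasticity of `-1.2`, the promotion window, and the noise levels are invented. The resulting `data` frame can be passed to `extract_normalized_stl` in place of the CSV.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n_weeks = 208  # four years of weekly observations
dates = pd.date_range("2020-01-06", periods=n_weeks, freq="W-MON")
week = np.arange(n_weeks)

# Price hovers around 10, with a ten-week promotion (made up for illustration)
price = 10 + rng.normal(0, 0.5, n_weeks)
price[100:110] = 7.0

# Yearly seasonal cycle multiplied by a constant-elasticity price response
seasonal = 1 + 0.3 * np.sin(2 * np.pi * week / 52)
impressions = (
    20000 * seasonal * (price / 10) ** -1.2 * np.exp(rng.normal(0, 0.03, n_weeks))
)

data = pd.DataFrame({"date": dates, "impressions": impressions, "price": price})
print(data.head())
```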
## Business Applications
This enhanced STL decomposition can be valuable for:
1. **Marketing Analysis**: Understand which impression spikes are due to seasonal effects versus price promotions
2. **Inventory Planning**: Prepare for seasonal demands after accounting for price elasticity
3. **Budget Forecasting**: Separate price-driven fluctuations from seasonal patterns for better predictions
4. **Competitor Analysis**: Identify true seasonal trends in your market after normalizing for pricing strategies
## Conclusion
By normalizing for price effects before performing STL decomposition, we can uncover true seasonal patterns that might otherwise be masked. This approach provides a more accurate picture of time series data, especially in price-sensitive markets.
The automatic season flagging and strength analysis further enhance this method, making it a powerful tool for business analysts and data scientists working with time series data.
Give it a try with your own data and see what hidden seasonal patterns you might discover!
---
*Note: This implementation uses the statsmodels package for STL decomposition and price elasticity modeling. Make sure you have pandas, numpy, statsmodels, and matplotlib installed in your Python environment.*