December 2, 2025
Correlation based on historic returns is arbitrarily calculated and ignores price context
How long should the lookback window be? What periodicity of returns should be used? Such questions confront anyone using Pearson correlation1 of historic returns to estimate portfolio risk. Results can vary significantly depending on the choices made. What doesn’t matter is the chronology of the return observation pairs.
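Both points can be checked with a short experiment. The sketch below uses synthetic returns for two hypothetical tickers (the data, seed, and the 63 and 252 day windows are illustrative assumptions, not VecViz data). It shows that the lookback choice moves the estimate, while shuffling the date order of the return pairs does not.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


random.seed(0)
# Synthetic daily returns: two hypothetical tickers sharing a common factor.
common = [random.gauss(0, 0.01) for _ in range(504)]
ret_a = [c + random.gauss(0, 0.01) for c in common]
ret_b = [c + random.gauss(0, 0.01) for c in common]

# 1) The lookback window is a free choice, and it changes the estimate.
corr_63 = pearson(ret_a[-63:], ret_b[-63:])
corr_252 = pearson(ret_a[-252:], ret_b[-252:])

# 2) Shuffling the order of the (a, b) pairs leaves the estimate unchanged,
#    because Pearson correlation only uses sums over the pairs.
pairs = list(zip(ret_a[-252:], ret_b[-252:]))
random.shuffle(pairs)
shuf_a, shuf_b = zip(*pairs)
corr_252_shuffled = pearson(shuf_a, shuf_b)

print(f"63d: {corr_63:.3f}  252d: {corr_252:.3f}  "
      f"252d shuffled: {corr_252_shuffled:.3f}")
```

Any two windows will generally disagree to some degree; the shuffled and unshuffled 252 day figures match up to floating point rounding.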
Marcos López de Prado has argued that quants neglect a lot of valuable information when they base their analytics exclusively on returns and totally ignore price2. So, should price-based correlation be used? Most quants would say no. Should we use the “fractionally differenced” values that de Prado proposes? Many likely do, but some find them challenging to implement, interpret, and work with.
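For readers unfamiliar with the idea, fractional differencing applies the binomial expansion of (1 - B)^d, where B is the lag operator and d is a non-integer order between 0 (raw prices) and 1 (ordinary differences). The following is a minimal sketch of a fixed-window version, offered as an illustration only; the function names and the window length are assumptions, and it is not part of the VecViz method.

```python
def frac_diff_weights(d, size):
    """First `size` weights of the expansion of (1 - B)^d."""
    weights = [1.0]
    for k in range(1, size):
        # Recurrence: w_k = -w_{k-1} * (d - k + 1) / k
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


def frac_diff(series, d, window):
    """Fixed-window fractional difference of a series (e.g. log prices).

    Returns len(series) - window + 1 values; the first window - 1
    observations are consumed as history.
    """
    w = frac_diff_weights(d, window)
    return [
        sum(w[k] * series[t - k] for k in range(window))
        for t in range(window - 1, len(series))
    ]


# d = 1 recovers ordinary first differences; d = 0.5 keeps a slowly
# decaying memory of past price levels.
print(frac_diff_weights(1.0, 4))
print(frac_diff_weights(0.5, 3))
print(frac_diff([1.0, 3.0, 6.0], 1.0, 2))
```

The slowly decaying weights for fractional d are what preserve some price-level memory, and they are also part of why the resulting series can be harder to interpret than plain returns.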
The VecViz metric “fingerprint” provides a broad basis for measuring ticker similarity
Tickers trading just above an important top behave differently than those that are not, even if their recent daily returns look similar. Conversely, two stocks might have disparate recent daily returns (low Pearson correlation), but if they currently have weak support and low expected forward return rankings they might be poised to gap lower together. These are the types of considerations captured by the VecViz “Fingerprint”.
Specifically, the VecViz “Fingerprint” comprises twenty-one metrics across the three categories described below:
- Chart Geometry (6 features): Where is the price relative to support/resistance channels? (e.g., Proximity to the highest “top”, days since last top, etc.). These relate directly to the Vector Strength Histogram chart depicted below, and are elements of the V-Score spider chart to the left of it.
- Probability Distributions (7 features): What are the returns to the upside and downside Vector Model price percentiles, both outright and relative to the bell-curve-based Sigma Model? (e.g., 95D_Ret, VecViz 99U_Ret/Sigma 99U_Ret, etc.). These relate to the relationships between the blue and red price probability percentile lines in the Vector Strength Histogram, depicted below.
- Expected Return (8 features): How does the model rank the ticker’s future potential? (e.g., V-Score and related regime-based feature selection metrics).

VecViz Fingerprint Based Correlation is the Pearson Correlation of Fingerprint Elements
When each metric is standardized to a zero-to-one scale, each contributes equally to calculations of ticker similarity. VecViz “fingerprint” correlation takes the Pearson correlation of the fingerprint metric percentiles for every pair of tickers.
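A minimal sketch of that calculation follows. It assumes percentiles are taken across the ticker universe for each metric and then correlated between tickers; the four tickers, the four metrics (rather than twenty-one), the values, and the tie handling are all hypothetical choices made for illustration, not the VecViz implementation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def percentile_ranks(values):
    """Scale ranks to [0, 1]; tied values share their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / (n - 1) if n > 1 else 0.5
        i = j + 1
    return ranks


# Hypothetical raw fingerprint metrics: one row per ticker, one column per metric.
raw = {
    "AAA": [0.02, 15, -0.08, 0.9],
    "BBB": [0.03, 20, -0.07, 0.8],
    "CCC": [0.20, 200, -0.02, 0.1],
    "DDD": [0.10, 90, -0.15, 0.5],
}
tickers = list(raw)

# Step 1: convert each metric (column) to percentiles across tickers.
columns = list(zip(*(raw[t] for t in tickers)))
pct_columns = [percentile_ranks(list(col)) for col in columns]
fingerprints = {t: [col[i] for col in pct_columns] for i, t in enumerate(tickers)}

# Step 2: Pearson correlation of the percentile vectors for every ticker pair.
corr = {
    a: {b: pearson(fingerprints[a], fingerprints[b]) for b in tickers}
    for a in tickers
}
for a in tickers:
    print(a, [round(corr[a][b], 2) for b in tickers])
```

In this toy data AAA and BBB sit at similar percentiles on every metric and come out positively correlated, while AAA and CCC sit at opposite ends and come out negatively correlated, regardless of what their recent daily returns look like.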
The distribution of Fingerprint correlation values is displayed in the chart below in pink. It is contrasted against rolling 252d daily-return-based Pearson correlation in blue and against another VecViz approach to correlation, a text-based variant that utilizes VecEvents, in green.

Fingerprint correlation outperforms T252d return based Pearson correlation
We did a study of VecViz analytics, including the regime based expected return feature discussed above, as inputs to portfolio mean-variance optimization (MVO)3. In this study we compared the VecViz inputs to each other and to simple trailing 252d return based alternatives. You can find the full study, entitled “VecViz Analytics Performance as MVO Portfolio Optimization Inputs” here.
The study ranks portfolios based on a metric we call “SummaryZ”, which contemplates standardized scores for Annual Average Return, Max Drawdown, Sharpe Ratio, Calmar Ratio4, Multi-Factor Alpha5, and Kupiec P-Value6.
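To make the idea of a composite of standardized scores concrete, here is a rough sketch. The aggregation rule is an assumption: it takes an equal-weighted average of z-scores and flips the sign of Max Drawdown so that higher is always better. The study itself may weight or combine the components differently, and the portfolio names and numbers below are made up.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) std deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical results for three portfolios (illustrative numbers only).
portfolios = ["P1", "P2", "P3"]
metrics = {
    "annual_return": [0.10, 0.14, 0.06],
    "max_drawdown": [0.20, 0.12, 0.30],  # expressed as a positive loss fraction
    "sharpe": [0.8, 1.2, 0.5],
    "calmar": [0.5, 1.17, 0.2],
    "multi_factor_alpha": [0.01, 0.03, -0.01],
    "kupiec_p_value": [0.40, 0.60, 0.05],
}
# Assumption: a smaller drawdown is better, so its z-score is negated.
lower_is_better = {"max_drawdown"}

signed_z = {}
for name, values in metrics.items():
    zs = z_scores(values)
    signed_z[name] = [-z for z in zs] if name in lower_is_better else zs

# Assumption: equal-weighted average of the signed z-scores.
summary_z = {
    p: mean([signed_z[name][i] for name in metrics])
    for i, p in enumerate(portfolios)
}
best = max(summary_z, key=summary_z.get)
print(summary_z, "best:", best)
```

Standardizing first keeps any one metric's units from dominating the ranking, which is the practical appeal of a z-score composite.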
The table below, excerpted from the report, shows the results across a grid search of constraints and rebalance frequencies. The VecViz fingerprint-based correlation process described here is denoted as “Correl(FP)_VV”. It outperformed trailing 252-day Pearson correlation, denoted as “Correl_T252d”, but underperformed VecEvent-based correlation, denoted as “Correl(VE)_VV”7.

Key to Input Variable abbreviations:
- Vol_VV = VecViz’s 99D_Ret (i.e. VecViz 99% VaR)
- Correl(VE)_VV = VecEvent based correlation
- Ret_T252d = average price return over the prior 252 days
- Correl(FP)_VV = VecViz analytic metric “fingerprint” based correlation
- Ret_VV = VecViz’s VaR and OaR breakage regime based expected return metric
- Correl_T252d = correlation of price returns over the prior 252 days
- Vol_T252d = standard deviation of price returns over the prior 252 days
Conclusion
In summary, the VecViz ‘Fingerprint’ is a composite DNA of a stock’s historic and estimated prospective behavior. It combines Chart Geometry (is it near support?), Forward Price Probability (is the upside tail fat?), and Expected Return. By combining these 21 metrics, we create a holistic view of the ticker’s current state and signposts of its future state, not just its recent motion.
My thanks to Yushuang (Sylvia) Wu for her help in developing this approach to correlation.
Notes:
- A standard measure (-1 to +1) of linear relationships. At its core is summing the product of differences from the mean for two sets of data that are linked in some way (ex: they relate to the same date). Essentially, it quantifies how often two variables are on the “same side” of their respective averages simultaneously, giving greater weighting to larger deviations. ↩︎
- “Advances in Financial Machine Learning” (2018), Marcos López de Prado ↩︎
- Mean-Variance Optimization (MVO) is a quantitative tool used to construct portfolios. It weighs three factors: the expected return of an asset (the Mean), the risk of that asset (the Variance) and how the asset relates to other assets under consideration (correlation). MVO identifies the portfolio with the maximum expected return for a given level of risk and other user defined constraints (ex: max ticker weighting). ↩︎
- Calmar Ratio = Average Annualized Return / Max Drawdown ↩︎
- Multi-Factor Alpha is determined via multiple regression of MVO generated portfolio returns on the returns of MTUM, VLUE and SPY (the iShares MSCI USA Momentum Factor ETF, the iShares MSCI USA Value Factor ETF, and the SPDR S&P 500 Trust ETF, respectively) ↩︎
- “Kupiec P-Value” = Kupiec Test Statistic P-Value, which here reflects the probability that the portfolio’s 99% VaR, as implied by its volatility constraint (assuming normality and independent, identically distributed daily returns), was well specified. ↩︎
- VecEvent correlation relies on VecEvents and associated start and end dates of influence generated via LLM in April and July 2025, deep into the Test period. LLM generated retrospectives may suffer from look ahead bias relative to the influence dates generated. Thus, the performance of VecEvent correlation (“Correl(VE)_VV”) may be at least somewhat overstated. ↩︎
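Two of the metrics in the notes above are simple enough to sketch directly. The code below computes a Calmar Ratio from a max drawdown and a Kupiec p-value using the standard proportion-of-failures likelihood ratio test. The function names, the equity curve, and the breach counts are hypothetical; this is an illustration of the textbook calculations, not the code used in the study.

```python
import math


def max_drawdown(values):
    """Largest peak-to-trough decline of a value series, as a positive fraction."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst


def calmar_ratio(avg_annual_return, max_dd):
    """Calmar Ratio = Average Annualized Return / Max Drawdown."""
    return avg_annual_return / max_dd


def kupiec_pof(num_obs, num_breaches, var_level=0.99):
    """Kupiec proportion-of-failures test: returns (LR statistic, p-value)."""
    p = 1.0 - var_level          # expected breach rate under the VaR model
    x, t = num_breaches, num_obs
    p_hat = x / t                # observed breach rate

    def log_lik(q):
        # Binomial log-likelihood, skipping zero-count terms (0 * log 0 = 0).
        ll = 0.0
        if t - x > 0:
            ll += (t - x) * math.log(1.0 - q)
        if x > 0:
            ll += x * math.log(q)
        return ll

    lr = max(-2.0 * (log_lik(p) - log_lik(p_hat)), 0.0)
    # Chi-square survival function with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


dd = max_drawdown([100, 120, 90, 130])   # peak 120, trough 90
print(round(dd, 4), round(calmar_ratio(0.10, dd), 4))

# 10 breaches in 1000 days matches a 99% VaR; 30 breaches does not.
print(kupiec_pof(1000, 10))
print(kupiec_pof(1000, 30))
```

A high p-value means the observed number of VaR breaches is consistent with the stated 99% level, while a low one suggests the VaR was misspecified.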