Transparency

AI Signal Methodology

We believe AI-powered investment tools should be transparent. This page explains exactly how our machine learning models generate stock signals, what data they use, and how we measure accuracy.

Last updated: March 5, 2026

1. Data Sources

Our AI models analyze four primary data sources, all of which are publicly available:

Source	Data	Volume	Update Frequency
SEC EDGAR	10-K, 10-Q, 8-K, Form 4 filings	3,000+ filings/day	Real-time
Earnings Transcripts	Full call transcripts via FMP API	4,000+ per quarter	Within hours of call
Financial News	Major outlets, press releases	10,000+ articles/day	Real-time
Financial Data	Prices, fundamentals, estimates	All S&P 500 stocks	Real-time / daily

2. NLP Pipeline

Unstructured text (earnings call transcripts, SEC filings, news articles) is processed through our NLP pipeline to extract structured features:

a. Sentiment scoring: Each document receives a sentiment score from -1 (strongly negative) to +1 (strongly positive) using fine-tuned transformer models trained on financial text.
b. Entity extraction: We identify companies, financial metrics, dates, and monetary values mentioned in the text.
c. Tone analysis: For earnings calls, we measure management confidence level, hedging language frequency, and forward-looking statement ratio.
d. Change detection: For SEC filings, we compare against prior filings to flag material changes in risk factors, accounting policies, and financial metrics.
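
To make the change-detection step more concrete, here is a minimal sketch of how paragraphs in a new filing might be compared against the prior one. It is an illustration only and not our production pipeline: the function name `flag_changed_paragraphs`, the paragraph-splitting rule, and the `threshold` value are all assumptions chosen for the example, and it uses Python's standard `difflib` module instead of a trained model.

```python
import difflib


def flag_changed_paragraphs(prior: str, current: str, threshold: float = 0.9):
    """Return paragraphs in `current` that have no close match in `prior`.

    `threshold` is a hypothetical similarity cutoff between 0 and 1;
    it is not a value taken from the methodology described on this page.
    """
    # Assumption: paragraphs are separated by blank lines.
    prior_paras = [p.strip() for p in prior.split("\n\n") if p.strip()]
    current_paras = [p.strip() for p in current.split("\n\n") if p.strip()]

    flagged = []
    for para in current_paras:
        # Best similarity between this paragraph and any prior paragraph.
        best = max(
            (difflib.SequenceMatcher(None, para, old).ratio() for old in prior_paras),
            default=0.0,
        )
        if best < threshold:
            flagged.append((para, round(best, 2)))
    return flagged


# Toy risk-factor sections (invented text, for illustration only).
prior_filing = "We face competition.\n\nInterest rates may rise."
current_filing = (
    "We face competition.\n\n"
    "We are subject to a new regulatory investigation."
)

for paragraph, similarity in flag_changed_paragraphs(prior_filing, current_filing):
    print(f"changed (best similarity {similarity}): {paragraph}")
```

In this toy example the unchanged first paragraph is skipped and only the new risk-factor paragraph is flagged. A real system would also need to handle reordered sections and minor boilerplate edits.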

3. Machine Learning Models

We use a two-stage model architecture:

Stage 1: Individual Signal Models

We train five specialized gradient-boosted decision tree models (XGBoost), each on features specific to its signal type. Each model outputs a score from 1 to 10 with a confidence interval.

Stage 2: Ensemble Meta-Model

A meta-model combines the individual signal scores, weighting each by its historical accuracy for the stock's sector and market-cap range. Weights are updated monthly based on rolling 12-month performance.
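
As a rough illustration of accuracy-based weighting, the sketch below computes a composite score as a weighted average in which each signal's weight is proportional to its historical accuracy. The proportional rule is an assumption made for the example (the actual meta-model may weight signals differently), and the function name, signal keys, and numbers are hypothetical. In practice the accuracy values would be looked up for the stock's sector and market-cap range.

```python
def composite_score(scores: dict[str, float], accuracy: dict[str, float]) -> float:
    """Accuracy-weighted average of individual signal scores (1 to 10 scale).

    Assumption: weights are proportional to historical accuracy.
    """
    total = sum(accuracy[name] for name in scores)
    if total <= 0:
        raise ValueError("accuracy weights must sum to a positive number")
    return sum(scores[name] * accuracy[name] / total for name in scores)


# Hypothetical inputs for one stock; not published figures.
scores = {"earnings_nlp": 8.0, "filing_analysis": 6.0, "news_sentiment": 4.0}
accuracy = {"earnings_nlp": 0.80, "filing_analysis": 0.70, "news_sentiment": 0.50}

print(round(composite_score(scores, accuracy), 2))  # 6.3
```

Here the composite (6.3) lands above the plain average of the three scores (6.0) because the highest score comes from the signal with the best track record.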

4. Backtesting & Accuracy

All signal models are backtested on historical data before deployment. Our backtesting methodology:

Walk-forward validation with no lookahead bias
3-year rolling window (2023-2025 historical data)
Out-of-sample testing on 20% held-out data
Directional accuracy measured (did the signal correctly predict the stock's direction over the next 30 days?)
Results published monthly with full transparency
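
The walk-forward idea in the list above can be sketched as a generator of train/test windows over time-ordered data. This is a simplified illustration: the function name and the window sizes (24 months of training, 6 months of testing, over 36 monthly periods) are assumptions for the example, not our actual configuration.

```python
def walk_forward_splits(n_periods: int, train_size: int, test_size: int):
    """Yield (train_indices, test_indices) over time-ordered periods.

    Each test window starts right after its training window, so a model
    is never fitted on data from the period it is evaluated on.
    """
    start = 0
    while start + train_size + test_size <= n_periods:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size  # roll the window forward by one test block


# 36 monthly periods corresponds to a 3-year history.
for train, test in walk_forward_splits(n_periods=36, train_size=24, test_size=6):
    print(f"train {train[0]}-{train[-1]}, test {test[0]}-{test[-1]}")
# train 0-23, test 24-29
# train 6-29, test 30-35
```

Because every training index is strictly earlier than every test index in the same fold, the split itself cannot leak future information. Features still need to be built only from data available at each point in time.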

Signal Type	Directional Accuracy	Avg. Confidence
Earnings NLP	78%	72%
Filing Analysis	82%	68%
News Sentiment	71%	65%
Insider Activity	74%	70%
Composite Signal	76%	74%
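
For readers who want to see what a directional accuracy figure means in practice, here is a minimal sketch of the calculation. The function name and sample data are hypothetical, and the handling of a flat (zero) return as "not up" is an assumption made for the example.

```python
def directional_accuracy(predicted_up: list[bool], returns_30d: list[float]) -> float:
    """Fraction of signals whose predicted direction matched the sign of
    the realized 30-day return.

    Assumption: a return of exactly zero counts as "not up".
    """
    if len(predicted_up) != len(returns_30d) or not predicted_up:
        raise ValueError("inputs must be non-empty and the same length")
    hits = sum(pred == (ret > 0) for pred, ret in zip(predicted_up, returns_30d))
    return hits / len(predicted_up)


# Invented sample: four signals and the returns that followed them.
predicted_up = [True, True, False, False]
returns_30d = [0.04, -0.02, -0.05, 0.01]

print(directional_accuracy(predicted_up, returns_30d))  # 0.5
```

Note that this metric ignores the size of each move: a signal that is right about small moves and wrong about large ones can still score well.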

5. Limitations & Honest Disclaimers

AI signals are not investment advice. They are informational tools that should be one input among many in your research process.

Past performance does not predict future results. Backtested accuracy may not reflect real-world performance due to market regime changes.

Models have blind spots. Our AI cannot predict black swan events, regulatory changes, or management fraud that isn't reflected in public filings.

NLP has inherent limitations. Sarcasm, ambiguity, and context-dependent language can lead to misinterpretation in automated text analysis.

6. Academic References

Our approach builds on peer-reviewed research in financial NLP and machine learning:

Loughran & McDonald (2011). "When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks." Journal of Finance.
Chen et al. (2024). "Artificial Intelligence in Financial Market Prediction." Frontiers in AI.
Aggarwal et al. (2023). "GEO: Generative Engine Optimization." KDD 2024.
