Alternative Data for Investing: Satellite, Web Scraping & Unconventional Sources
A guide to alternative data: satellite imagery, web traffic, credit card data, and other unconventional sources of investment insight.
What Is Alternative Data?
Alternative data = Non-traditional data used for investment insights.
Traditional data: Financial statements, economic releases, prices
Alternative data: Everything else that signals economic activity
Market size: ~$7B and growing rapidly.
Categories of Alternative Data
Satellite & Geolocation
Satellite Imagery:
- Parking lot car counts (retail traffic)
- Oil storage tank shadows (inventory)
- Crop health (agriculture)
- Construction activity
- Shipping/port activity
Providers: Orbital Insight, Planet, Descartes Labs, SpaceKnow
Geolocation Data:
- Foot traffic (SafeGraph, Placer.ai)
- Store visits
- Trade area analysis
- Competitor monitoring
Web & App Data
Web Traffic:
- Site visits (SimilarWeb)
- Search trends (Google Trends)
- App downloads/usage (Sensor Tower, data.ai, formerly App Annie)
- E-commerce activity
Web Scraping:
- Pricing data
- Job postings
- Product availability
- Review sentiment
Transaction Data
Credit/Debit Card Data:
- Consumer spending (anonymized, aggregated)
- Sector trends
- Geographic patterns
Providers: Bloomberg Second Measure, Earnest Research, Affinity Solutions
Point-of-Sale Data:
- Retailer sales
- Item-level data
- Real-time indicators
Social & Sentiment
Social Media:
- Twitter/X sentiment
- Reddit discussions
- StockTwits
- Influencer tracking
News Sentiment:
- News volume
- Tone analysis
- Topic extraction
Providers: Dataminr, Sprinklr, RavenPack
Expert & Survey
Expert Networks:
- GLG, AlphaSights, Third Bridge
- Industry expert calls
- Primary research
Proprietary Surveys:
- Consumer surveys
- Business surveys
- Custom panels
Other Alternative Data
Patent Filings: Innovation tracking
Regulatory Filings: SEC, FDA, EPA
Government Data: Permits, licenses
Weather: Agricultural, energy impact
Shipping/Logistics: Container data, AIS
How Alternative Data Is Used
Company-Level Signals
Revenue Nowcasting:
- Credit card data → quarterly sales estimate
- Web traffic → customer trends
- App usage → engagement metrics
Example: Track Target foot traffic to estimate same-store sales.
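As a rough illustration of the nowcasting idea above, the sketch below fits a one-variable least-squares line from historical panel card spend to reported revenue and applies it to the current quarter. All figures are invented for illustration, and the names fit_line and nowcast are hypothetical helpers, not part of any vendor's API.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical history: quarterly panel card spend ($M) vs. reported revenue ($M)
panel_spend = [10.0, 12.0, 11.0, 14.0, 15.0, 13.0]
reported_revenue = [205.0, 244.0, 226.0, 283.0, 301.0, 262.0]

intercept, slope = fit_line(panel_spend, reported_revenue)


def nowcast(spend):
    """Map observed panel spend to an estimated revenue figure."""
    return intercept + slope * spend


# Panel spend observed so far this quarter (hypothetical)
estimate = nowcast(16.0)
print(f"Nowcast revenue: {estimate:.1f}")
```

In practice the panel covers only a slice of customers, so the fitted slope also absorbs the panel-to-population scaling, and that scaling can drift as the panel changes.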
Sector/Industry Signals
Consumer Health:
- Aggregate spending patterns
- Restaurant visits
- Travel activity
Example: Use airline bookings data to gauge travel-sector demand.
Macro Signals
Real-Time Economic Activity:
- Aggregate credit card spend → consumption
- Job postings → labor demand
- Shipping data → trade activity
Example: Aggregate satellite parking lot counts across retailers to estimate overall retail sales.
Evaluating Alternative Data
Key Questions
- Coverage: Does it represent the target adequately?
- History: Enough backtest data?
- Frequency: How timely?
- Accuracy: Validated against ground truth?
- Alpha decay: How long before the signal is commoditized?
- Compliance: Legal and ethical sourcing?
Common Pitfalls
Survivorship bias: Sample includes only companies that still exist today
Look-ahead bias in backtests: History includes data that was not actually available in real time
Overfitting: Too many variables, too little history
Correlation ≠ Causation: Spurious relationships
Regime changes: Shocks such as the pandemic can break historical relationships
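One way to guard against the real-time availability pitfall is to store a delivery date alongside every observation and let the backtest see only what had been delivered by each decision date. The sketch below assumes a simple tuple layout; the records and the latest_known helper are hypothetical.

```python
from datetime import date

# Hypothetical records: (period the data describes, value, date it was delivered)
observations = [
    (date(2023, 1, 31), 101.0, date(2023, 2, 10)),
    (date(2023, 2, 28), 98.0, date(2023, 3, 12)),
    (date(2023, 3, 31), 105.0, date(2023, 4, 9)),
]


def latest_known(observations, as_of):
    """Return the most recent observation delivered on or before as_of, or None."""
    available = [obs for obs in observations if obs[2] <= as_of]
    if not available:
        return None
    return max(available, key=lambda obs: obs[0])


# On 5 March the February figure had not been delivered yet,
# so a backtest should only see the January observation.
decision_date = date(2023, 3, 5)
print(latest_known(observations, decision_date))
```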
Data Quality Considerations
Data Sourcing Issues
- Panel representativeness
- Geographic coverage
- Demographic coverage
- Opt-in bias
Processing Challenges
- Noise vs signal
- Seasonality adjustment
- Normalization
- Missing data
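Two of the processing steps listed above can be sketched briefly: a year-over-year change as a crude seasonality adjustment, and a z-score as one form of normalization. The series is invented, the function names are hypothetical, and year-over-year differencing is only one of several possible seasonal adjustments.

```python
from statistics import mean, pstdev


def year_over_year(series, period=12):
    """Fractional change versus the same period one year earlier."""
    return [
        (series[i] - series[i - period]) / series[i - period]
        for i in range(period, len(series))
    ]


def zscores(values):
    """Standardize values to mean 0 and population standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical quarterly foot-traffic index with a Q4 holiday spike
traffic = [100, 102, 101, 130, 104, 107, 105, 137]

yoy = year_over_year(traffic, period=4)  # compares each quarter with the same quarter last year
normalized = zscores(yoy)
print(yoy)
print(normalized)
```

Comparing each quarter with the same quarter a year earlier removes the recurring holiday spike, so the remaining variation is closer to the underlying trend.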
Compliance & Ethics
- Privacy regulations (GDPR, CCPA)
- Consent and disclosure
- Material non-public information (MNPI)
Alternative Data Vendors
Data Aggregators
- Quandl/Nasdaq Data Link: Multiple alt data sources
- Eagle Alpha: Alt data marketplace
- Neudata: Alt data scout and reviews
Sector-Specific
Consumer:
- Earnest Research (credit cards)
- Placer.ai (foot traffic)
- SimilarWeb (web traffic)
Energy:
- Orbital Insight (tank levels)
- Kpler (shipping)
- Kayrros (satellite)
Agriculture:
- Gro Intelligence
- aWhere
- Descartes Labs
DIY Alternative Data
Free sources:
- Google Trends (search interest)
- Reddit API (sentiment)
- Government data (unconventional uses)
- OpenStreetMap (location data)
Web scraping (carefully):
- Job postings
- Pricing data
- Product availability
- Review aggregation
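Part of scraping carefully is checking a site's robots.txt before requesting pages. The sketch below uses Python's standard urllib.robotparser on an inline, hypothetical robots.txt so that it runs without network access; the user agent name and URLs are placeholders. Passing this check does not settle whether a site's terms of service permit scraping.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; in practice it would be fetched from the site.
robots_txt = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

user_agent = "research-bot"  # hypothetical user agent name

allowed_jobs = parser.can_fetch(user_agent, "https://example.com/jobs/listing")
allowed_private = parser.can_fetch(user_agent, "https://example.com/private/data")
delay = parser.crawl_delay(user_agent)  # seconds to wait between requests, if specified

print(allowed_jobs, allowed_private, delay)
```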
Integration with Traditional Data
Blending Approaches
- Alternative confirms traditional: Higher conviction
- Alternative contradicts traditional: Possible early warning
- Alternative leads traditional: Nowcasting advantage
Weighting Considerations
- Alternative data often higher frequency
- Traditional data more reliable
- Optimal blend depends on use case
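One common way to weight a noisy, timely estimate against a slower, more reliable one is inverse-variance weighting. This is an illustrative choice rather than a method the article prescribes, and it assumes the two estimates have independent errors. The numbers and the inverse_variance_blend helper are hypothetical.

```python
def inverse_variance_blend(estimates):
    """Combine (value, standard_error) pairs, weighting each by 1 / variance."""
    weights = [1.0 / (se ** 2) for _, se in estimates]
    total = sum(weights)
    blended = sum(w * value for w, (value, _) in zip(weights, estimates)) / total
    blended_se = (1.0 / total) ** 0.5
    return blended, blended_se


# Hypothetical same-store sales growth estimates: (percent, standard error)
alt_data_estimate = (4.0, 2.0)     # timely but noisy
traditional_estimate = (3.0, 1.0)  # slower but more reliable

blended, blended_se = inverse_variance_blend([alt_data_estimate, traditional_estimate])
print(f"Blended estimate: {blended:.2f} (se {blended_se:.2f})")
```

The blend leans toward the more reliable source, and its standard error is smaller than either input's when the independence assumption holds.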
Building an Alt Data Capability
Starting Points
- Google Trends: Free, easy to start
- Indeed/LinkedIn: Job posting analysis
- Web traffic tools: Free tiers available
- Satellite (free): Sentinel imagery from the EU Copernicus programme
Scaling Up
- Identify specific use case
- Evaluate vendors (trials usually available)
- Build data pipeline
- Backtest rigorously
- Monitor live performance
Team Capabilities
- Data engineering
- Statistical skills
- Domain expertise
- Compliance awareness
Pro Tips
- Start with thesis: What are you trying to predict?
- Beware overfitting: Alternative data = lots of variables
- Real-time challenge: Verify the data was actually available on the dates the vendor claims
- Decay is real: Good signals get arbitraged away
- Cost-benefit: Some data very expensive
- Combine sources: Multiple alt data > single source
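Signal decay can be monitored by tracking how strongly a signal relates to the outcome it is meant to predict over successive windows. The sketch below computes a Pearson correlation per non-overlapping window on invented data; the function names and the window length are assumptions for illustration.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def windowed_correlation(signal, outcome, window):
    """Correlation of signal vs. outcome over consecutive non-overlapping windows."""
    return [
        pearson(signal[i:i + window], outcome[i:i + window])
        for i in range(0, len(signal) - window + 1, window)
    ]


# Hypothetical data: the signal tracks the outcome early on, then loses its edge
signal = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
outcome = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 3, 5, 2, 4, 3, 4]

correlations = windowed_correlation(signal, outcome, window=6)
print([round(c, 2) for c in correlations])
```

A correlation that falls from window to window is a prompt to re-examine the signal, although with short windows a drop can also be noise.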
