Base Mech Interp


Mar 7, 2023


Use cases

Trading

  • Data - news items labeled by price move (up / down / no change)
  • Layer - 16
  • Probe and SAE - both trained on the above data and layer
  • Buy or sell only when the probe and SAE results agree.
  • Use vectorbt for backtesting and quantstats for reporting.
  • Result:
    • [four result figures]
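The agreement rule in the list above can be sketched as follows. This is a minimal illustration, not the setup used for the results: the label strings ("up", "down", "no_change"), the function name, and the mapping of "up" to buy and "down" to sell are all assumptions.

```python
from typing import List

def agreement_signal(probe_preds: List[str], sae_preds: List[str]) -> List[int]:
    """Return +1 (buy), -1 (sell) or 0 (no trade) per news item.

    A trade is taken only when both classifiers give the same label;
    any disagreement, or an agreed "no_change", maps to 0.
    """
    direction = {"up": 1, "down": -1, "no_change": 0}  # assumed label names
    signals = []
    for p, s in zip(probe_preds, sae_preds):
        signals.append(direction[p] if p == s else 0)
    return signals

# Hypothetical predictions for five news items.
probe = ["up", "down", "up", "no_change", "down"]
sae = ["up", "up", "up", "no_change", "down"]
signals = agreement_signal(probe, sae)
print(signals)
```

The resulting signal series is the kind of input a backtester such as vectorbt would consume; that step is not shown here.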
 

Audit and agentic tracing

Agentic

  • Setup - Get market data, get news data, get sector info, investment recommendation agent.
  • Green, amber or red status based on whether the expected features activate. Work still needs to be done here.
  • Question: What is the value add on top of CoT? What is the ground truth?
  • Sample Results:
    • [figure]
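One way to turn "expected features activate" into a green/amber/red status is to score what fraction of the expected features fired at a given agent step. The sketch below assumes that reading; the thresholds, feature ids, and function name are hypothetical.

```python
from typing import Set

def trace_status(expected: Set[int], active: Set[int],
                 green_at: float = 0.8, amber_at: float = 0.5) -> str:
    """Map the share of expected features that fired to a status.

    green_at and amber_at are illustrative thresholds, not tuned values.
    """
    if not expected:
        raise ValueError("expected feature set must not be empty")
    coverage = len(expected & active) / len(expected)
    if coverage >= green_at:
        return "green"
    if coverage >= amber_at:
        return "amber"
    return "red"

expected = {101, 205, 317, 442}  # hypothetical SAE feature ids
status_all = trace_status(expected, {101, 205, 317, 442, 900})
status_some = trace_status(expected, {101, 205, 317})
status_none = trace_status(expected, {900})
print(status_all, status_some, status_none)
```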

Audit

  • For a prompt, look at the top “selected” features activated across layers
[two figures]
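The audit view above amounts to ranking features per layer for a single prompt. A minimal sketch, assuming the selection criterion is activation magnitude (the notes do not say how features are "selected") and that activations are available as a layer-to-feature mapping:

```python
from typing import Dict, List, Tuple

def top_features(acts: Dict[int, Dict[int, float]],
                 k: int = 2) -> Dict[int, List[Tuple[int, float]]]:
    """For each layer, return the k (feature_id, activation) pairs
    with the largest activation, in descending order."""
    return {
        layer: sorted(feats.items(), key=lambda kv: kv[1], reverse=True)[:k]
        for layer, feats in acts.items()
    }

# Hypothetical activations for one prompt: layer -> {feature_id: activation}.
acts = {
    8: {12: 0.1, 40: 2.3, 77: 1.1},
    16: {5: 0.9, 40: 0.2, 63: 3.4},
}
top = top_features(acts, k=2)
print(top)
```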

What happens during fine-tuning

New features emerge:
[figure]
 

Hallucination

  • Classification using SAE features
  • In progress:
    • Train a linear probe and get the hallucination percentage for a sentence.
    • Check whether the probe and SAE results overlap
[figure]
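The two in-progress steps above could be computed as below. This is a sketch under assumptions: that the probe emits a per-token hallucination probability, that the sentence-level percentage is the share of tokens above a threshold, and that overlap is measured as Jaccard similarity between the token positions flagged by each method. None of these choices is confirmed by the notes.

```python
from typing import List, Set

def hallucination_pct(token_probs: List[float], threshold: float = 0.5) -> float:
    """Percentage of tokens in a sentence whose probe score is >= threshold."""
    flagged = sum(p >= threshold for p in token_probs)
    return 100.0 * flagged / len(token_probs)

def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard overlap of two sets of flagged token positions."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Hypothetical per-token probe scores for one five-token sentence.
token_probs = [0.1, 0.7, 0.9, 0.2, 0.6]
pct = hallucination_pct(token_probs)

probe_flagged = {i for i, p in enumerate(token_probs) if p >= 0.5}
sae_flagged = {2, 4}  # hypothetical positions flagged via SAE features
overlap = jaccard(probe_flagged, sae_flagged)
print(pct, overlap)
```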
 

Other supervised tasks: doc parsing, sentiment, credit risk

  • Probe vs. SAE-based classification
 

Mech Interp: domain-specific model/tool prep

SAE Training

  • Findings on SAE training (SAEs trained on Llama/BERT), covering the following items:
    • Reconstruction loss vs. sparsity.
    • Importance of out-of-sample (OOS) testing
    • Number of latents relative to data/model size
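For reference, the reconstruction-versus-sparsity trade-off listed above is commonly written as a single objective. The form below assumes a standard L1-penalized SAE with a ReLU encoder; the notes do not state which variant was trained, and the symbols $W_e, b_e, W_d, b_d, \lambda$ are introduced here only for illustration.

```latex
\begin{aligned}
f(x) &= \mathrm{ReLU}(W_e x + b_e), \\
\hat{x} &= W_d f(x) + b_d, \\
\mathcal{L}(x) &= \lVert x - \hat{x} \rVert_2^2 + \lambda \lVert f(x) \rVert_1 .
\end{aligned}
```

Raising $\lambda$ pushes more latents to zero at the cost of reconstruction error. Measuring both terms on held-out activations is one way to check that the trade-off holds out of sample.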

Auto-interp

  • Domain-specific vs. generic data - superposition
  • F1 and other metrics
  • Self-interp based cross-validation of auto-interp labels
  • Feature clustering - similar to the figure below from Anthropic
[figure]

UMAP on our cluster:
[figure]
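The F1 metric mentioned in the list above can be sketched as follows, assuming the usual auto-interp scoring setup: a label is used to predict whether the feature fires on each example, and those predictions are compared with the feature's actual firing. The example data and function name are hypothetical.

```python
from typing import List, Tuple

def precision_recall_f1(actual: List[bool],
                        predicted: List[bool]) -> Tuple[float, float, float]:
    """Score a feature label: `actual` marks examples where the feature
    fired, `predicted` marks examples where the label says it should."""
    pairs = list(zip(actual, predicted))
    tp = sum(a and p for a, p in pairs)
    fp = sum((not a) and p for a, p in pairs)
    fn = sum(a and (not p) for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical results for one feature over six examples.
actual = [True, True, True, False, False, False]
predicted = [True, True, False, True, False, False]
precision, recall, f1 = precision_recall_f1(actual, predicted)
print(precision, recall, f1)
```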

Supervised tool prep

  • Probe - increasing accuracy, advantages, Lasso
  • SAE classifier - increasing accuracy through Lasso, random forest, etc., using only the relevant features
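As a reference for how Lasso can pick out "relevant features", here is the L1-penalized logistic regression objective. This assumes Lasso in these notes refers to an L1 penalty on a linear classifier, which the notes do not confirm; $z_i$ stands for the input vector (raw activations for the probe, SAE feature activations for the SAE classifier) and $\alpha$ for the penalty strength.

```latex
\hat{w}, \hat{b} = \arg\min_{w,\, b} \; \frac{1}{n} \sum_{i=1}^{n}
  \log\!\left(1 + e^{-y_i \left(w^\top z_i + b\right)}\right)
  + \alpha \lVert w \rVert_1 ,
\qquad y_i \in \{-1, +1\}
```

The L1 term drives many weights exactly to zero, so the relevant features are those with $\hat{w}_j \neq 0$. A second model such as a random forest can then be trained on that subset only.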