How Andrej Karpathy uses LLMs

LLMs
coding
tips
A great overview of ways to use AI, although I knew most of it already. It gave me some ideas.
Published

March 18, 2025

https://www.youtube.com/watch?v=EWvNQjAaOHw

Personal takeaways

  • A great overview of ways to use AI. Some thoughts and ideas stimulated by this:
    • I should start reading books using a ChatAI as a research companion.
      • I could use this with audiobooks -> mp3 -> txt -> LLM summaries, key info.
      • Add summaries to reviews so I keep everything together.
    • I need to try Cursor again, and VS Code CoPilot.
    • Also experiment more with Projects in Claude and ChatGPT.

tiktokenizer.vercel.app

“Hi I am ChatGPT. I am a 1 terabyte zip file. My knowledge comes from the internet, which I read ~6 months ago and remember only vaguely. My winning personality was programmed, by example, by human labelers at OpenAI :)”

Stuff that occurs frequently on the internet is remembered better. Ask a question where the answer doesn’t change and is well represented on the internet, it will (probably) give an accurate answer.

Always start a new chat when you start a new topic. Previous tokens are a distraction. Also the query gets gradually more expensive and slow. Tokens in the context are like working memory.

Keep in mind what model you are using — is it right for the task? If you don’t have a paid account you may be using a dumber model. Ask all of them.

“Thinking” models

  • Additional RF
  • Use RL in such a way as a model can try out lots of ways to find a solution.
  • Used for example with math problems.
  • A recent method.
  • A lot slower and more expensive during inference.
  • For OpenAI, it is the models that start with o that are the thinking models, e.g. o1, o3-mini-high.
  • DeepSeek R1 is a thinking model.

LLMs with tools

Deep Research

  • Combination of web search and thinking.
  • Creates a huge context window.
  • Grok has a good interface for this.
  • ChatGPT the most thorough.
  • [Me: Problem of companies filling the web with information just for AI, for example on medicines or products]
  • There can still be hallucinations. Treat as first research but not as definitely true.

Adding documents to increase the context

  • Some of the information in a PDF may be lost (e.g. images)
  • Or complex diagrams may not be understood by the LLM.
  • PDF will be converted to text file, that is loaded into the token window.
  • Useful when reading books.
    • I could use this with audiobooks -> mp3 -> txt -> LLM summaries, key info.
  • Copy-paste into LLM text box, by chapter.
  • Read with the LLM, asking questions as you read.

Python interpreter

  • LLM writes a computer program to answer a question.
  • LLM writes code, then uses special tokens to get it executed.
  • If a code interpreter is not used, LLMs will give the wrong result for big sums.
  • ChatGPT advanced data analysis
    • ask it to plot data - it will plot using Python.
    • Careful - it can misreport and change figures.

Claude artifacts

  • Create JS apps and see/edit them in the browser.
  • Create diagrams (uses Mermaid).

Cursor

  • VS Code, Windsurf, Cursor…
  • I need to try Cursor again, and VS Code CoPilot.

Other modalities

Speech

  • Talk to the LLM.

  • Much faster.

  • With basic type speech -> Text -> Tokens.

  • But there is advanced voice mode where the voice is handled natively inside the language model.

    • These use audio tokens, not text tokens.

Notebooklm from Google for making podcasts from documents.

Images

  • Possible to convert images into tokens.
  • Model doesn’t know which tokens are text, which audio, which images.
    • Multimodal.
    • It does the same thing with all of the tokens.
  • Transcribe from image of table into text table.

ChatGPT memory

  • Memory is prepended to all conversations.
  • Claude can save new memories as md files in a project.

Custom instructions

  • How you want ChatGPT to respond.

Custom GPTs

  • Just special instructions/prompts.
  • Give description and examples.