“What if your AI pair programmer knew everything you’ve ever learned?”
In school, we’re told to take notes.
In dev life, we forget they exist after saving `notes.md`.
So I asked myself:
What happens if I feed my entire dev knowledge — journal entries, notes, tips, terminal commands — into GitHub Copilot or an LLM?
Answer:
I built a personal knowledge-trained Copilot clone, and suddenly I was pair programming with… my past self.
This is how I turned years of scattered markdowns into the most useful coding companion I’ve ever had.
💡 The Problem: Your Knowledge Is Trapped
Let me guess:
- You’ve saved `cheatsheet.md`, `til.txt`, and `docker-fix-notes.md`
- You wrote blog drafts, copied StackOverflow tricks, and kept AI prompts
- But when it’s time to use them, you can’t find or recall anything
What if you could just type:
“What’s the trick I used for SSH tunneling inside Docker Compose?”
And get an instant answer from your own writing, inside VS Code?
🧠 The Idea: Build a Self-Trained Coding Copilot
Instead of asking an AI “how to do X,” I wanted an AI that answered:
“How did I do X in the past?”
So I built a local, LLM-powered assistant that:
- Reads all my `.md`, `.txt`, `.ipynb`, and `.bash_history` files
- Indexes them into a searchable database
- Responds to natural language queries like:
- “How did I deploy that huggingface model?”
- “What did I learn about PostgreSQL full text search?”
- “What alias did I use for kubectl?”
And it runs entirely offline, using small open-source models like Phi-3 and TinyLlama for generation and Instructor-XL for embeddings.
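Most of those formats are plain text already; notebooks are the exception. Here’s a minimal sketch of the flattening step I’m assuming, based on the standard `.ipynb` JSON layout (the `notebook_to_text` helper is illustrative, not the exact script):

```python
import json
from pathlib import Path

def notebook_to_text(path: Path) -> str:
    """Flatten an .ipynb file into plain text so it can be indexed like any note."""
    nb = json.loads(path.read_text(encoding="utf-8"))
    # Each notebook cell stores its source as a list of line strings
    return "\n\n".join("".join(cell.get("source", [])) for cell in nb.get("cells", []))

# .bash_history is already one command per line, so it can be read as-is
history = (Path.home() / ".bash_history").read_text(errors="ignore")
```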
🛠️ How I Built It
| Component | Tool Used |
|---|---|
| Note ingestion | Python scripts + recursive folder scanning |
| Embedding | Instructor-XL (context-aware for dev docs) |
| Vector DB | Chroma |
| Local LLM | Phi-3 via Ollama |
| Interface | VS Code sidebar + CLI |
| Memory sync | Git hooks + Dropbox sync script |
🧾 Sample script to index notes:
```python
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.vectorstores import Chroma

# Recursively load every note under the folder as plain text
docs = DirectoryLoader("/my/notes/", glob="**/*", loader_cls=TextLoader).load()
# Instructor-XL embeddings via the LangChain wrapper (the interface Chroma expects)
embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-xl")
# Index everything into a persistent Chroma store
db = Chroma.from_documents(docs, embedding=embeddings, persist_directory="./notes_db")
```
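The “Memory sync” row in the table is just a trigger to re-run that indexer whenever the notes change. A minimal sketch as a Python post-commit hook, assuming the indexing script above is saved somewhere callable (the path below is a placeholder):

```python
#!/usr/bin/env python3
# Save as .git/hooks/post-commit (and make it executable) in your notes repo
import subprocess
import sys

# Rebuild the Chroma index after every commit; the indexing-script path is hypothetical
subprocess.run([sys.executable, "/my/tools/index_notes.py"], check=False)
```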
🗣️ Sample query CLI:
```bash
mycopilot "What’s the SSH trick for Django + Postgres in Docker?"
```
Response:
“In `til-docker.md`, you used `ssh -L 5433:db:5432 your-ec2` and added `localhost:5433` in settings.”
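Under the hood, the CLI is a small retrieval-augmented loop: fetch the closest note chunks from Chroma, then let the local model answer from them. A minimal sketch, assuming the persisted `./notes_db` index from the indexing script above (the prompt wording and the `ask` helper are mine, not the exact implementation):

```python
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.llms import Ollama
from langchain.vectorstores import Chroma

# Reopen the index built by the indexing script and point at the local model
embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-xl")
db = Chroma(persist_directory="./notes_db", embedding_function=embeddings)
llm = Ollama(model="phi3")

def ask(question: str) -> str:
    # Retrieve the most relevant note chunks, then answer strictly from them
    hits = db.similarity_search(question, k=4)
    context = "\n\n".join(doc.page_content for doc in hits)
    prompt = f"Answer using only these notes:\n{context}\n\nQuestion: {question}"
    return llm(prompt)

print(ask("What’s the SSH trick for Django + Postgres in Docker?"))
```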
🤖 Why This Works: It’s Like Dev Journaling on Steroids
This isn’t just automation. It’s augmentation.
It solves:
- 👻 Forgetting how you solved a bug last year
- 🔁 Rewriting the same bash script over and over
- 💬 Re-answering questions your teammates already asked
And it teaches you something schools never do:
How to talk to your own mind.
📚 Bonus: I Hooked It Up to My GitHub Repos Too
What if your AI could also answer:
“What was I thinking when I wrote this function?”
So I did this:
- Pulled my old GitHub repo commit messages
- Parsed `README.md` and code comments
- Linked commits to note entries using timestamps (see the sketch at the end of this section)
Now it answers stuff like:
“What’s the difference between `tokenizer.py` v1 and v2?”
“v2 introduces HuggingFace fast tokenizer; v1 had regex hacks, see `notes/tokenizer-rewrite.md`”
It’s like GPT met Obsidian + Git + me.
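For the commit side, here is a minimal sketch of the ingestion step, assuming `git log` output from a local clone gets added to the same Chroma index (`db`) built earlier; the repo path is a placeholder:

```python
import subprocess
from langchain.schema import Document

# Dump history as "hash|ISO date|subject" lines (repo path is hypothetical)
log = subprocess.run(
    ["git", "-C", "/my/repos/myproject", "log", "--pretty=%H|%aI|%s"],
    capture_output=True, text=True, check=True,
).stdout

commit_docs = [
    Document(page_content=subject, metadata={"commit": sha, "timestamp": when})
    for sha, when, subject in (line.split("|", 2) for line in log.splitlines())
]
# The timestamp metadata is what lets commits be joined to dated note entries
db.add_documents(commit_docs)
```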
✨ Unexpected Magic
- 🧠 Improved recall: I remembered why I did something, not just how
- 🧱 Modular reuse: My old notes turned into reusable snippets
- 💬 Debugging partner: I’d paste an error and it’d say “You solved this in `bugfix-log-2023.md`”
And guess what?
I started writing better notes, because I knew they’d actually be used.
🧰 You Can Build Yours Too (for Free)
Requirements:
- A folder of notes, commits, or codebases
- A free HuggingFace model like `Instructor-XL`
- ChromaDB + LangChain
- Ollama + Phi-3 or `mistral-7b` (optional)
If you want:
✅ Full walkthrough
✅ GitHub template repo
✅ VS Code extension
Let me know and I’ll publish it all!
🧘 Final Thought: This Is the Future of Learning
AI won’t just write code for us.
It will teach us from our own minds.
Don’t build a second brain.
Build a smart brain that talks back.