LLM Digital Twin

Fine-tuning a model to contain my memories, personality, and preferences. Data gathered from an iPhone, trained using local compute (2 Mac Studios), to run on a 7b model (Mistral) that is capable of running inference on an iPhone.

Fine-tuning framework

Apple MLX (LoRA)

Training data set

100k Q&A examples

Skills

Data, training, evals, alignment

Project length

2 weeks

Year

2025

Challenge

I wanted an always-up-to-date “digital twin” that could answer questions exactly the way I would—from personal anecdotes to my latest calendar plans—without ever leaving my devices or exposing raw data to the cloud. Existing tools were either heavyweight (server-side LLMs) or too limited (on-device chatbots). The real hurdle: turning gigabytes of iPhone data (Photos, Messages, Mail, Calendar, Location, Web history…) into a lightweight, privacy-preserving model that could run in real time on an iPhone-class chip.

Results

MeModel: A Digital Daniel—a 7-B parameter model that mirrors my personality, knowledge and writing style, running locally on my iPhone with <1 s first-token latency.

Zero-cloud footprint: all raw data stays encrypted on my personal hardware; only the distilled weights live on the phone.

Seamless updates: new memories appear in answers within hours, not weeks.

Foundation for more: this pipeline can power future user simulations in an AI first computing world.

90%

Accuracy on personal evals

7b

Parameter fully tuned model

Mac Studio

Local compute used for fine-tuning

Process

Data Extraction & Normalisation

Used Apple’s data-export APIs to pull each app’s history, then converted every stream into unified JSON schemas.

Synthetic Q&A Generation

Prompted a base LLM to read those JSON slices and auto-write ~100 k conversational Q&A pairs that reflect my tone, memories and relationships.

Dataset Curation

Applied heuristics for balance (time, topic, sentiment) and stripped sensitive fields; encrypted everything at rest on two local Mac Studios.

Fine-Tuning Pipeline

Split into train/val/test, then fine-tuned a Mistral-7B model in Apple MLX with LoRA adapters and safesensors weight format.

Quantised to 4-bit (SmoothQuant) for on-device inference.

Continuous Refresh

Nightly cron job exports the day’s new JSON, regenerates incremental Q&A, and performs a quick LoRA refresh so the model evolves with me.

Deployment

Packaged the quantised weights and adapters into an iOS inference app; integrated a retrieval layer for very recent memories (<24 h) to avoid constant re-training.