Two solid open-source options:
1. PocketPal AI (GitHub: a-ghorbani/pocketpal-ai) — the easiest option; it runs llama.cpp under the hood, has a decent UI, and works well with 3B-7B models
2. Termux + llama.cpp — more control, and you can run larger quants if your device has the RAM
For most people, PocketPal is the right starting point. Phi-3 mini or Gemma 2B runs fine even on mid-range phones.
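
If you go the Termux route, the setup looks roughly like this. Treat it as a sketch: package names and build flags can drift between llama.cpp releases, and the model URL below is a placeholder you'd swap for a real GGUF download (e.g. from Hugging Face):

```shell
# Inside Termux (install Termux from F-Droid, not the outdated Play Store build)
pkg update && pkg install git cmake clang wget

# Build llama.cpp from source
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j

# Fetch a small quantized model (placeholder URL; pick any GGUF that fits your RAM)
wget -O phi-3-mini-q4.gguf "https://example.com/phi-3-mini-q4.gguf"

# Interactive chat: -m model file, -c context size, -t thread count, -cnv conversation mode
./build/bin/llama-cli -m phi-3-mini-q4.gguf -c 2048 -t 4 -cnv
```

As a rough guide, a Q4 quant of a 3B model needs about 2-3 GB of free RAM; if the process gets killed mid-load, try a smaller quant or close background apps first.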
