Comparing AI-Generated Running Plans
For the last few weeks, I've been following a training plan from Runna to train for my first marathon in 2025. I'm about 5 weeks into that plan and super happy with it so far. I'll get into some specifics of that Runna plan below, but as someone who works with Large Language Models (LLMs) at my day job, I wanted to see what kinds of plans AI could generate and how they compare.
Setup
I wanted to play around with a few different models, and in order to do that quickly, I leveraged GitHub Models. GitHub Models is a free playground where you can quickly test and compare various models, which makes it perfect for this quick experiment! I swear, this is not just a shameless plug!!
For those familiar with LLMs, you might know that prompts play a key role in the equation. More folks might be familiar with OpenAI's ChatGPT, which is a Large Language Model that you can chat with. In simple terms, you can think of whatever you ask/send to ChatGPT as the prompt. The model takes that prompt, thinks about it, and spits out a response. For the purpose of this running plan experiment, here's the prompt I gave the various models:
Prompt
I am an intermediate runner who averages 17 miles per week. I am training for a marathon and have 26 weeks to prepare for it. I want to run 4 times a week. My most recent half marathon race was done at a 8:29 minute per mile pace. Every week, I want:
- 1 recovery run
- 1 long run
The other two runs can be anything from recovery, tempo, intervals, etc. Can you build me a plan for the first 4 weeks? For each run, I want the type of run and the pace I should be going at.
That's it! I gave specific information in the prompt because it's the same information Runna asks for, and I wanted the comparison to be as fair as possible:
- Running level (beginner, intermediate, advanced)
- Mileage per week
- Number of running days
- Last race time
I also asked for a guaranteed recovery run and long run each week because those seem to be consistent across all of Runna's training plans. I limited the request to the first four weeks to make the comparison quicker and easier. Finally, I gave it a basic structure for the output (type of run and pace).
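If you want to reuse this prompt with your own numbers, it can be templated. The snippet below is a minimal sketch, not something I actually ran for this post: `RunnerProfile` and `build_prompt` are hypothetical names, and the wording only approximates my prompt above.

```python
from dataclasses import dataclass


@dataclass
class RunnerProfile:
    """Hypothetical container for the inputs listed above."""
    level: str             # "beginner", "intermediate", or "advanced"
    weekly_miles: int      # current average mileage per week
    weeks_to_race: int     # weeks available before the marathon
    runs_per_week: int     # number of running days
    last_race: str         # e.g. "half marathon"
    last_race_pace: str    # per-mile pace as "M:SS", e.g. "8:29"


def build_prompt(p: RunnerProfile, plan_weeks: int = 4) -> str:
    """Assemble a plan request; one recovery run and one long run are fixed."""
    article = "an" if p.level[0].lower() in "aeiou" else "a"
    other_runs = p.runs_per_week - 2  # runs left after the recovery and long run
    return (
        f"I am {article} {p.level} runner who averages {p.weekly_miles} miles per week. "
        f"I am training for a marathon and have {p.weeks_to_race} weeks to prepare for it. "
        f"I want to run {p.runs_per_week} times a week. "
        f"My most recent {p.last_race} race was done at a pace of "
        f"{p.last_race_pace} per mile. Every week, I want:\n"
        "- 1 recovery run\n"
        "- 1 long run\n"
        f"The other {other_runs} runs can be anything from recovery, tempo, intervals, etc. "
        f"Can you build me a plan for the first {plan_weeks} weeks? "
        "For each run, I want the type of run and the pace I should be going at."
    )


me = RunnerProfile("intermediate", 17, 26, 4, "half marathon", "8:29")
print(build_prompt(me))
```

Keeping the inputs in one place makes it easy to send the exact same text to every model, which is the whole point of a fair comparison.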
Model / Plan Comparison
The models I compared are:
- GPT-4o
- o1-preview
- Llama 3.1
- Mistral Large
These models were available on GitHub Models and stood out to me. Of course, there are other models that would've been cool to play around with (Claude, Gemini), but I think the above gives us a mix of open-source and non-open-source models.
And with that, let's get into the comparison! I'll be breaking it down by week below.
Week 1
- GPT-4o
Run 1: Recovery Run
- Distance: 3 miles
- Pace: 9:30–10:00/mile
Run 2: Tempo Run
- Distance: 5 miles
- Pace: 1 mile (9:30-10:00), 3 miles (8:10-8:15), 1 mile (9:45-10:15)
Run 3: Long Run
- Distance: 8 miles
- Pace: 9:15-9:45/mile
Run 4: Recovery Run
- Distance: 3 miles
- Pace: 9:30–10:00/mile
Total Mileage: 19 miles
- o1-preview
Run 1: Recovery Run
- Distance: 4 miles
- Pace: 9:18–9:58/mile
Run 2: Tempo Run
- Distance: 5 miles
- Pace: 1 mile (9:18–9:58), 3 miles (8:17), 1 mile easy pace
Run 3: Recovery Run
- Distance: 4 miles
- Pace: 9:18–9:58/mile
Run 4: Long Run
- Distance: 6 miles
- Pace: 9:18–9:58/mile
Total Mileage: 19 miles
- Llama 3.1
Run 1: Recovery Run
- Distance: 3 miles
- Pace: 9:30-10:00/mile
Run 2: Intervals
- Distance: 5 miles
- Pace: 4x800m (8:00-8:15), with 400m active recovery (9:30-10:00)
Run 3: Tempo Run
- Distance: 5 miles
- Pace: 1 mile (9:30-10:00), 3 miles (8:20-8:40), 1 mile (9:30-10:00)
Run 4: Long Run
- Distance: 8 miles
- Pace: 9:00-9:20/mile
Total Mileage: 21 miles
- Mistral Large
Run 1: Recovery Run
- Distance: 3 miles
- Pace: 9:30-10:00/mile
Run 2: Tempo Run
- Distance: 5 miles
- Pace: 8:15-8:20/mile
Run 3: Intervals
- Distance: 4 miles
- Pace: 8x400m (7:30), with 400m active recovery (9:30)
Run 4: Long Run
- Distance: 8 miles
- Pace: 9:00-9:30/mile
Total Mileage: 20 miles
Week 2
- GPT-4o
Run 1: Recovery Run
- Distance: 3-4 miles
- Pace: 9:30–10:00/mile
Run 2: Intervals
- Distance: 4 miles
- Pace: 1 mile (9:30-10:00), 4 x 400 meters (7:30-7:45) with 2 minute jog recovery in between, cooldown (9:45-10:15)
Run 3: Long Run
- Distance: 9 miles
- Pace: 9:15-9:45/mile, optional 8:45/mile for last mile
Run 4: Recovery Run
- Distance: 3 miles
- Pace: 9:30–10:00/mile
Total Mileage: 20-21 miles
- o1-preview
Run 1: Recovery Run
- Distance: 4 miles
- Pace: 9:18–9:58/mile
Run 2: Intervals
- Distance: 5 miles
- Pace: 1 mile easy pace, 4 x 800 meters (7:41 min/mile) with 400 meter jog recoveries, 1 mile easy pace
Run 3: Recovery Run
- Distance: 4 miles
- Pace: 9:18–9:58/mile
Run 4: Long Run
- Distance: 7 miles
- Pace: 9:18–9:58/mile
Total Mileage: 20 miles
- Llama 3.1
Run 1: Recovery Run
- Distance: 3 miles
- Pace: 9:30-10:00/mile
Run 2: Hill Repeats
- Distance: 5 miles
- Pace: 6x400m hills (8:00-8:15), with 200m active recovery (9:30-10:00)
Run 3: Tempo Run
- Distance: 5.5 miles
- Pace: 1 mile (9:30-10:00), 3.5 miles (8:20-8:40), 1 mile (9:30-10:00)
Run 4: Long Run
- Distance: 9 miles
- Pace: 9:00-9:20/mile
Total Mileage: 22.5 miles
- Mistral Large
Run 1: Recovery Run
- Distance: 3 miles
- Pace: 9:30-10:00/mile
Run 2: Tempo Run
- Distance: 6 miles
- Pace: 8:15-8:20/mile
Run 3: Hill Repeats
- Distance: 4 miles
- Pace: 6x200m hills (8:00), with jog down recovery
Run 4: Long Run
- Distance: 9 miles
- Pace: 9:00-9:30/mile
Total Mileage: 22 miles
Week 3
- GPT-4o
Run 1: Recovery Run
- Distance: 3-4 miles
- Pace: 9:30–10:00/mile
Run 2: Tempo Run
- Distance: 6 miles
- Pace: 1 mile (9:30-10:00), 4 miles (8:10-8:15), 1 mile (9:45-10:15)
Run 3: Long Run
- Distance: 10 miles
- Pace: 9:15-9:45/mile, optional 8:45/mile for last mile
Run 4: Recovery Run
- Distance: 3-4 miles
- Pace: 9:30–10:00/mile
Total Mileage: 22-23 miles
- o1-preview
Run 1: Recovery Run
- Distance: 4 miles
- Pace: 9:18–9:58/mile
Run 2: Tempo Run
- Distance: 6 miles
- Pace: 1 mile easy pace, 4 miles (8:17 min/mile), 1 mile easy pace
Run 3: Recovery Run
- Distance: 4 miles
- Pace: 9:18–9:58/mile
Run 4: Long Run
- Distance: 8 miles
- Pace: 9:18–9:58/mile
Total Mileage: 22 miles
- Llama 3.1
Run 1: Recovery Run
- Distance: 3 miles
- Pace: 9:30-10:00/mile
Run 2: Intervals
- Distance: 6 miles
- Pace: 5x800m (8:00-8:15), with 400m active recovery (9:30-10:00)
Run 3: Easy Run
- Distance: 4 miles
- Pace: 9:00-9:20/mile
Run 4: Long Run
- Distance: 10 miles
- Pace: 9:00-9:20/mile
Total Mileage: 23 miles
- Mistral Large
Run 1: Recovery Run
- Distance: 3 miles
- Pace: 9:30-10:00/mile
Run 2: Tempo Run
- Distance: 7 miles
- Pace: 8:15-8:20/mile
Run 3: Intervals
- Distance: 5 miles
- Pace: 8x800m (7:45), with 400m active recovery (9:30)
Run 4: Long Run
- Distance: 10 miles
- Pace: 9:00-9:30/mile
Total Mileage: 25 miles
Week 4
- GPT-4o
Run 1: Recovery Run
- Distance: 4 miles
- Pace: 9:30–10:00/mile
Run 2: Intervals
- Distance: 5 miles
- Pace: 1 mile (9:30-10:00), 6 x 400 meters (7:20-7:40) with 90 second jog recovery in between, cooldown (9:45-10:15)
Run 3: Long Run
- Distance: 11 miles
- Pace: 9:15-9:45/mile, optional 8:30-8:45/mile for last mile
Run 4: Recovery Run
- Distance: 3 miles
- Pace: 9:30–10:00/mile
Total Mileage: 23 miles
- o1-preview
Run 1: Recovery Run
- Distance: 4 miles
- Pace: 9:18–9:58/mile
Run 2: Intervals
- Distance: 6 miles
- Pace: 1 mile easy pace, 5 x 800 meters (7:41 min/mile) with 400 meter jog recoveries, 1 mile easy pace
Run 3: Recovery Run
- Distance: 4 miles
- Pace: 9:18–9:58/mile
Run 4: Long Run
- Distance: 9 miles
- Pace: 9:18–9:58/mile
Total Mileage: 23 miles
- Llama 3.1
Run 1: Recovery Run
- Distance: 3 miles
- Pace: 9:30-10:00/mile
Run 2: Tempo Run
- Distance: 6 miles
- Pace: 1 mile (9:30-10:00), 4 miles (8:20-8:40), 1 mile (9:30-10:00)
Run 3: Hill Repeats
- Distance: 6 miles
- Pace: 7x400m hills (8:00-8:15), with 200m active recovery (9:30-10:00)
Run 4: Long Run
- Distance: 11 miles
- Pace: 9:00-9:20/mile
Total Mileage: 26 miles
- Mistral Large
Run 1: Recovery Run
- Distance: 3 miles
- Pace: 9:30-10:00/mile
Run 2: Tempo Run
- Distance: 8 miles
- Pace: 8:15-8:20/mile
Run 3: Hill Repeats
- Distance: 5 miles
- Pace: 8x200m hills (8:00), with jog down recovery
Run 4: Long Run
- Distance: 11 miles
- Pace: 9:00-9:30/mile
Total Mileage: 27 miles
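One thing that can be confusing in the plans above: the interval and hill reps are measured in meters, but their paces are quoted per mile. Here is a small sketch for turning a per-mile pace into a target time for a single rep. The helper names are my own, and it assumes a single "M:SS" pace rather than a range.

```python
METERS_PER_MILE = 1609.344


def pace_to_seconds(pace: str) -> int:
    """Convert an "M:SS" per-mile pace into seconds per mile."""
    minutes, seconds = pace.split(":")
    return int(minutes) * 60 + int(seconds)


def rep_time(pace: str, rep_meters: float) -> str:
    """Time for one rep of rep_meters when run at the given per-mile pace."""
    total = round(pace_to_seconds(pace) * rep_meters / METERS_PER_MILE)
    return f"{total // 60}:{total % 60:02d}"


print(rep_time("7:30", 400))  # a 400m rep at 7:30/mile -> 1:52
print(rep_time("8:00", 800))  # an 800m rep at 8:00/mile -> 3:59
```

So a 400m rep at 7:30/mile pace works out to just under two minutes per lap of a standard track.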
Comparison TLDR
If you don't want to look at those detailed plans by week, I'll give my insight on the main takeaways below from a running perspective:
- Mistral wants me to run A LOT - I'm already up to 27 miles by week 4
- Mistral and Llama both have me doing 2 speed workouts per week
- Although I typically only do 1 speed workout, Runna recommends I do 2 per week
- I found I was getting injured when doing 2 speed workouts per week, so I brought it down to 1
- I got much more detail / structure from the OpenAI models, and they closely mirror what I did for the first two weeks in terms of mileage and types of runs
- Bonus points to o1-preview because the week structure of recovery-speed-easy-long run is exactly what I do now
- No deload weeks!
- Runna usually includes a deload week every 4 weeks where I run a lot less. It helps with recovery and I actually enjoy these weeks!
- Given all of this, the plan that matched most closely with Runna's structured plan was Llama 3.1
- The weekly mileage is slightly higher, but the types of runs are similar
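To put numbers on the mileage differences, here is a rough sketch that compares each model's weekly totals against the 17 miles per week from my prompt. The totals are copied from the plans above. As an assumption, wherever GPT-4o gave a range (20-21, 22-23), I use the upper bound.

```python
BASELINE_MILES = 17  # current weekly average stated in the prompt

# Weekly totals for weeks 1-4 as reported by each model.
# Assumption: where GPT-4o gave a range, the upper bound is used.
weekly_totals = {
    "GPT-4o": [19, 21, 23, 23],
    "o1-preview": [19, 20, 22, 23],
    "Llama 3.1": [21, 22.5, 23, 26],
    "Mistral Large": [20, 22, 25, 27],
}


def summarize(totals):
    """Return (week-4 mileage, percent over baseline, largest week-to-week jump)."""
    week4 = totals[-1]
    pct_over_baseline = (week4 - BASELINE_MILES) / BASELINE_MILES * 100
    biggest_jump = max(b - a for a, b in zip(totals, totals[1:]))
    return week4, pct_over_baseline, biggest_jump


for model, totals in weekly_totals.items():
    week4, pct, jump = summarize(totals)
    print(f"{model:14s} week 4: {week4:>4} mi (+{pct:.0f}% vs baseline), biggest jump: {jump} mi")
```

By week 4, Mistral Large is at roughly 59% above my starting mileage, compared with about 35% for the two OpenAI models.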
Final thoughts
If you want a custom running plan, using an LLM might be the way to go! Of course, you can give additional prompts to further customize it to your liking, but I found o1-preview to match my preferences the most using a super simple prompt. It's worth saying that I probably won't use AI to replace Runna, even though Runna requires a subscription.
I love the integration Runna has with my smartwatch: I can start a workout from my plan, and it gives me cues when I'm on pace or off pace. Most newer running watches have a feature where you can build a custom workout, but Runna makes that so simple. I also like how Runna gives you predicted race times, which vary depending on your recent performance. I didn't prompt the LLMs to give me a predicted race time, but that's possible. That said, I trust Runna more with those predictions since they actually use my data.
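For a rough idea of what a race prediction looks like without any training data, here is a sketch using Riegel's formula, a common rule of thumb that scales a known race time to another distance. To be clear, this is my own illustration: it is not how Runna makes its predictions, and it is not something any of the models produced. The 1.06 exponent is the conventional default for the formula.

```python
HALF_MARATHON_MILES = 13.1094
MARATHON_MILES = 26.2188


def pace_to_seconds(pace: str) -> int:
    """Convert an "M:SS" per-mile pace into seconds per mile."""
    minutes, seconds = pace.split(":")
    return int(minutes) * 60 + int(seconds)


def riegel_predict(known_seconds, known_miles, target_miles, exponent=1.06):
    """Riegel's rule of thumb: T2 = T1 * (D2 / D1) ** exponent."""
    return known_seconds * (target_miles / known_miles) ** exponent


def format_hms(seconds: float) -> str:
    """Format a duration in seconds as H:MM:SS."""
    hours, rem = divmod(round(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Half marathon time implied by the 8:29/mile pace from my prompt
half_seconds = pace_to_seconds("8:29") * HALF_MARATHON_MILES
marathon_seconds = riegel_predict(half_seconds, HALF_MARATHON_MILES, MARATHON_MILES)
print("Half marathon:", format_hms(half_seconds))
print("Predicted marathon:", format_hms(marathon_seconds))
```

With my half marathon pace, this lands at a little over 3 hours 50 minutes for the marathon. It only looks at one past race, which is exactly why I'd trust a prediction built on my ongoing training data more.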
Finally, it's pretty interesting that these different models give varying plans - even with the same prompt. Some were more focused on speed work, and others kept the load a bit easier. While there were differences in the plans, I don't think you can go wrong with any of these models if you wanted to use AI to build a custom running plan for you.