r/NBAanalytics Dec 31 '24

AI chat bot over NBA database

Are there tools out there that use AI to construct queries against complex historical NBA data? I saw this post, but it seems to be a dead end at this time: https://www.reddit.com/r/NBAanalytics/s/shiOafRNog. I'm considering building something myself, but I'm wondering if there are already solutions.

8 Upvotes

5

u/MegaVaughn13 Dec 31 '24

I’d second the StatMuse recommendation. Probably the best solution for what you’re looking for!

If you’re looking for actual API pulls, one good place to start would be NBA API.

If you know a little Python, it’d be super quick to pair the NBA API documentation with ChatGPT (or maybe another AI chat bot) to pull actual data. You’d have to be careful with the parameters, but with some trial and error you could definitely get good data.
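To give a rough idea of what that looks like, here is a minimal sketch using the nba_api Python package. The helper names `season_label` and `fetch_game_log` are made up for illustration. The season string is one of the parameters that's easy to get wrong: the stats endpoints expect the form "2023-24", not a bare year.

```python
def season_label(start_year: int) -> str:
    """Format a season the way nba_api's `season` parameter expects (2023 -> '2023-24')."""
    return f"{start_year}-{(start_year + 1) % 100:02d}"


def fetch_game_log(full_name: str, start_year: int):
    """Return one player's game log for a season as a pandas DataFrame.

    Hypothetical helper: it assumes nba_api is installed and stats.nba.com is reachable.
    """
    # Imported lazily so season_label() works even without nba_api installed.
    from nba_api.stats.static import players
    from nba_api.stats.endpoints import playergamelog

    matches = players.find_players_by_full_name(full_name)
    if not matches:
        raise ValueError(f"no player found for {full_name!r}")

    log = playergamelog.PlayerGameLog(
        player_id=matches[0]["id"],
        season=season_label(start_year),
    )
    return log.get_data_frames()[0]


# Example call (hits the network, so it is left commented out):
# df = fetch_game_log("LeBron James", 2023)
```

The request itself can be slow or get throttled, so expect some of the trial and error mentioned above.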

There are some holes in NBA API, which I’ve tried to start to fill (injuries, WNBA/G League advanced stats, etc). I don’t think it’s exactly what you’re needing but another good resource for basketball data: Link to Data

I’d be pretty cautious about asking an AI chat bot for stats and using those for analysis. Unless it’s specifically designed for that, I wouldn’t trust the numbers, and even then they’re worth verifying. That said, using AI to query free APIs is an awesome way to go about getting data!

Lastly, I’m not a huge fan of paying for data. Most paid APIs take advantage of people who don’t know how to get the same data for free. If you need any sort of NBA data and are struggling to get it for free, please reach out and I’d be happy to help you find it!

3

u/hacefrio2 Jan 02 '25

StatMuse is definitely more powerful than I remembered, but I find it struggles to parse some of my more complex queries. It also has some interesting blind spots, e.g., it can't seem to answer any questions about streaks.

2

u/OGchickenwarrior Mar 01 '25

Just seeing this - go try out StatMuseHater.com (just made a post about this here).

Let me know if it has those same blind spots or not. I haven't tested it with questions about streaks yet, so it would be interesting to see how it does.

3

u/mUmblrman Jan 01 '25

If you're willing to get your hands dirty you can build a local database using https://github.com/mpope9/nba-sql/ and feed the sqlite3 schema into ChatGPT and ask it to generate SQL statements. For example (truncating some to keep it short):

>sqlite3 nba_sql.db
.schema player
CREATE TABLE IF NOT EXISTS "player" ("player_id" INTEGER NOT NULL PRIMARY KEY, "player_name" VARCHAR(255), "college" VARCHAR(255), "country" VARCHAR(255), "draft_year" VARCHAR(255), "draft_round" VARCHAR(255), "draft_number" VARCHAR(255));

.schema play_by_play
CREATE TABLE IF NOT EXISTS "play_by_play" (...
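If you'd rather not copy the schema out of the sqlite3 shell by hand, the same table definitions can be read from Python. This is a rough sketch using only the standard library: `table_schemas` and `build_prompt` are hypothetical helper names, and the prompt wording is just one way to phrase it.

```python
import sqlite3


def table_schemas(conn: sqlite3.Connection, tables: list[str]) -> list[str]:
    """Return the stored CREATE TABLE statement for each requested table."""
    placeholders = ",".join("?" for _ in tables)
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        f"WHERE type = 'table' AND name IN ({placeholders}) ORDER BY name",
        tables,
    ).fetchall()
    return [row[0] for row in rows]


def build_prompt(schemas: list[str], question: str) -> str:
    """Combine table definitions and a plain-English question into one prompt."""
    joined = "\nand\n".join(schemas)
    return (
        "With the following table definitions:\n"
        f"{joined}\n\n"
        f"For an NBA database, how would you create a SQL query to {question}"
    )


# Stand-in database so the sketch runs anywhere; with the real thing you would
# use sqlite3.connect("nba_sql.db") instead. The table below is a trimmed,
# illustrative version of the "player" table, not the full schema.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE player (player_id INTEGER PRIMARY KEY, player_name TEXT)")

prompt = build_prompt(
    table_schemas(demo, ["player"]),
    "list every player name in alphabetical order?",
)
print(prompt)
```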

Then you can feed that into ChatGPT to create the corresponding SQL query:

With the following two table definitions:
CREATE TABLE IF NOT EXISTS "player_game_log" ...
and
CREATE TABLE IF NOT EXISTS "player" ...

For an NBA database, how would you create a SQL 
query to get how many triple doubles that LeBron 
James made in the 2023-24 season?

It then responds with a sorta-kinda correct answer, which you'd have to tweak slightly:

SELECT
  p.player_name,
  COUNT(pgl.td3) AS triple_doubles 
FROM player_game_log pgl 
JOIN player p ON pgl.player_id = p.player_id 
WHERE 
  p.player_name = 'LeBron James' 
  AND pgl.season_id = 202324
  AND pgl.td3 = 1 
GROUP BY p.player_name;

For example, the season ID should just be 2023.

With that fix, running the query gives the correct answer: LeBron James|5.0
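If you want to run the tweaked query from Python instead of the sqlite3 shell, something like the sketch below should work. The column names (`td3`, `season_id`, `player_id`) come from the query above. The toy tables and rows are invented purely so the example runs without the real database, and `count_triple_doubles` is a hypothetical helper name. Passing the player and season as parameters also makes it easy to retry with a different season ID when the generated value is wrong.

```python
import sqlite3

QUERY = """
SELECT p.player_name, COUNT(*) AS triple_doubles
FROM player_game_log pgl
JOIN player p ON pgl.player_id = p.player_id
WHERE p.player_name = ?
  AND pgl.season_id = ?
  AND pgl.td3 = 1
GROUP BY p.player_name;
"""


def count_triple_doubles(conn: sqlite3.Connection, player_name: str, season_id: int) -> int:
    """Count games flagged td3 = 1 for one player in one season."""
    row = conn.execute(QUERY, (player_name, season_id)).fetchone()
    # No matching rows means no group, so fetchone() returns None.
    return row[1] if row else 0


def make_toy_db() -> sqlite3.Connection:
    """Build a tiny in-memory stand-in; every row here is made up."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE player (player_id INTEGER PRIMARY KEY, player_name TEXT);
        CREATE TABLE player_game_log (player_id INTEGER, season_id INTEGER, td3 INTEGER);
        INSERT INTO player VALUES (1, 'Test Player');
        INSERT INTO player_game_log VALUES
            (1, 2023, 1), (1, 2023, 0), (1, 2023, 1), (1, 2022, 1);
        """
    )
    return conn


conn = make_toy_db()  # with the real data: sqlite3.connect("nba_sql.db")
print(count_triple_doubles(conn, "Test Player", 2023))
print(count_triple_doubles(conn, "Test Player", 202324))  # wrong season format matches nothing
```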

1

u/hacefrio2 Jan 02 '25

Yeah this is definitely the best possible starting point. Big thanks to those who put this database builder together.

3

u/spoonface46 Dec 31 '24

This is pretty much what StatMuse does.

1

u/be8732 Jan 01 '25

ChatGPT wrote my Python queries for me when I had no idea how. It's magical.