Those of you who have followed me on my other Substack know that I've been a thorn in the side of many health data folks over the last three years. I've been something of a data contrarian when it comes to Covid policy (see my book).
I've also familiarized myself with a lot of the data sets the CDC provides. One of these, the YRBSS (Youth Risk Behavior Surveillance System), asks specific questions of American youth, and I wanted to compare the results from 2019 with those from 2021. (As a side note, I believe they decided not to do the survey in 2020 because the results were so dramatic.) The questions cover all sorts of topics, from drug use in school to bullying to physical activity. I wanted to know what the impact of the lockdowns was on our youth.
Now that would take a lot of data parsing, but fortunately, I have ChatGPT. Using the Code Interpreter, I was able to upload two sheets in a single Excel spreadsheet, one for 2019 and one for 2021, and I simply explained what I wanted done.
It worked on the problem for just about a minute, then came back and asked some very specific questions. This is a pretty straightforward table, but there are many, many questions, and the format is not that of a normal data set. I love that it stopped to ask for directions before messing things up.
Now, its first time around was a fail, as it wasn't able to parse the data correctly, but I told it to focus on the questions and on mapping those between the years to see if that would help.
It turns out the footnote symbols may have been messing it up, so it took to parsing those out. If you click on "Show work", you can actually see it running the Python code in real time.
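For the curious, here is a rough sketch of what that cleanup and mapping step might look like. This is not the code ChatGPT actually ran. The footnote characters, the function names, and the sample rows and numbers are all my own assumptions for illustration.

```python
import re

# Assumed footnote markers; the real sheets may use different symbols.
FOOTNOTE_CHARS = "*†‡§¶"


def clean_question(text):
    """Strip footnote symbols, collapse whitespace, and lowercase the text."""
    text = re.sub(f"[{re.escape(FOOTNOTE_CHARS)}]", "", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def match_questions(rows_2019, rows_2021):
    """Pair rows whose cleaned question text is identical in both years.

    Each input is a list of (question, value) tuples. The result maps the
    cleaned question text to (value_2019, value_2021). Questions that
    appear in only one year are dropped.
    """
    by_question_2019 = {clean_question(q): v for q, v in rows_2019}
    matched = {}
    for question, value in rows_2021:
        key = clean_question(question)
        if key in by_question_2019:
            matched[key] = (by_question_2019[key], value)
    return matched


# Made-up rows and values, only to show the shape of the data.
rows_2019 = [("Did not eat breakfast*", 10.0), ("Only asked in 2019†", 5.0)]
rows_2021 = [("Did not eat  breakfast§", 12.0)]
matched = match_questions(rows_2019, rows_2021)
print(matched)
```

Matching on exact cleaned text is the simplest possible approach. If the wording of a question changed between years, it would need a looser comparison or a hand-built mapping.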
Here are the questions with the largest proportional changes from 2019 to 2021:
Total % Change: The question with the largest overall proportional change is "Did not eat breakfast (during the 7 days before the survey)" with an increase of approximately 31.74%.
Female % Change: The question with the largest proportional change among females is "Did not eat fruit or drink 100% fruit juices (during the 7 days before the survey)" with an increase of approximately 48.98%.
Male % Change: The question with the largest proportional change among males is "Smoked more than 10 cigarettes per day (on the days smoked during the past 30 days)" with an increase of approximately 51.46%.
I asked for some additional insights:
Here are some interesting findings based on the largest proportional changes from 2019 to 2021 related to physical activity:
Total % Change - Top Physical Activity Question
"Did not do exercises to strengthen or tone muscles (on 3 or more of the past 7 days)": 9.11% increase
Female % Change - Top Physical Activity Question
"Did not do exercises to strengthen or tone muscles (on 3 or more of the past 7 days)": 12.27% increase
Male % Change - Top Physical Activity Question
"Did not do exercises to strengthen or tone muscles (on 3 or more of the past 7 days)": 5.85% increase
It's interesting to note that the same question appears for all three categories (Total, Female, Male). This question is related to strength exercises, and the data indicates an increase in the number of respondents who did not engage in such activities in 2021 compared to 2019.
Please note that these are proportional changes and not absolute changes.
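To make that last distinction concrete, here is a small sketch of how the two kinds of change differ and how the top question in a column could be picked. I am assuming that "proportional change" means the change relative to the 2019 value. I did not confirm that this is exactly how ChatGPT computed it. The function names and the numbers are made up for illustration.

```python
def proportional_change(old, new):
    """Percent change relative to the old value: (new - old) / old * 100."""
    if old == 0:
        raise ValueError("proportional change is undefined when the old value is 0")
    return (new - old) / old * 100


def absolute_change(old, new):
    """Change in percentage points: new - old."""
    return new - old


def largest_proportional_change(matched):
    """Return (question, change) for the largest-magnitude proportional change.

    `matched` maps a question to its (value_2019, value_2021) pair.
    """
    changes = {q: proportional_change(old, new) for q, (old, new) in matched.items()}
    question = max(changes, key=lambda q: abs(changes[q]))
    return question, changes[question]


# Hypothetical values: a question that went from 10.0% to 12.5% of respondents.
print(absolute_change(10.0, 12.5))      # 2.5 percentage points
print(proportional_change(10.0, 12.5))  # 25.0 percent

# Hypothetical pair of questions to rank.
example = {"question a": (10.0, 12.5), "question b": (50.0, 40.0)}
print(largest_proportional_change(example))
```

A question with a small starting value can show a large proportional change from a small shift in percentage points, so it is worth looking at both numbers.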