Thanks to everyone for their help with the Miss USA survey. There were 713 responses, which should give me what I need. I've closed the survey and started crunching numbers, and I'll post the results shortly.
A question was raised about the wording of the survey, so I changed how the questions were introduced about halfway through. It was a modest change, clarifying that I wanted respondents to assess the contestants' answers on their objective merits, not just on whether they answered the question asked. I didn't expect it to have a big impact, but I was curious whether it caused a significant change in how people rated the Miss USA responses. To check, I graphed the average score each contestant received after the change against the average score received before the change. As you can see, the impact is small and well within the error bars, which represent the median difference between ratings among all 713 answers (there were 368 responses before the wording change and 345 after).
The faint diagonal line has a slope of 1, which is where the data should cluster if the wording change had no effect. The data do cluster there, fairly consistently and within the error bounds. It looks a little as though the scores after the change run higher at the high end and lower at the low end, but the slope is not statistically different from 1, the correlation between the before and after results is 0.99, and I'm confident that it's safe to pool all 713 responses.
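For anyone who wants to run the same kind of check on their own data, here is a minimal sketch of the before/after comparison in Python. It is not the code I used for the graph. The function name `compare_before_after`, the 51 contestants, the 2-to-8 score range, and the noise level are all assumptions made up for illustration, and the scores are randomly generated rather than taken from the survey.

```python
import math
import random


def compare_before_after(before, after):
    """Compare paired per-contestant mean scores from the two survey halves.

    Fits after = intercept + slope * before by ordinary least squares and
    reports the Pearson correlation plus a t statistic for "slope equals 1".
    """
    n = len(before)
    if n != len(after) or n < 3:
        raise ValueError("need at least 3 paired scores")

    mean_b = sum(before) / n
    mean_a = sum(after) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((b - mean_b) ** 2 for b in before)
    syy = sum((a - mean_a) ** 2 for a in after)
    sxy = sum((b - mean_b) * (a - mean_a) for b, a in zip(before, after))

    slope = sxy / sxx
    intercept = mean_a - slope * mean_b
    r = sxy / math.sqrt(sxx * syy)

    # Standard error of the slope from the residual sum of squares.
    sse = sum((a - (intercept + slope * b)) ** 2 for b, a in zip(before, after))
    se_slope = math.sqrt(sse / (n - 2) / sxx)

    # How many standard errors the slope sits from 1 (undefined for a perfect fit).
    t_vs_one = (slope - 1.0) / se_slope if se_slope > 0 else float("nan")

    return {"slope": slope, "intercept": intercept, "r": r,
            "se_slope": se_slope, "t_vs_one": t_vs_one}


# Made-up data: hypothetical contestant means, NOT the survey results.
random.seed(0)
before_means = [random.uniform(2.0, 8.0) for _ in range(51)]
after_means = [b + random.gauss(0.0, 0.15) for b in before_means]

result = compare_before_after(before_means, after_means)
print("slope = %.3f, r = %.3f, t vs slope 1 = %.2f"
      % (result["slope"], result["r"], result["t_vs_one"]))
```

With about 50 contestants, a `t_vs_one` whose absolute value is below roughly 2 means the slope can't be distinguished from 1 at the usual 5% level, which is the sense in which pooling the two halves is reasonable.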
Thank you all for your help.