Does anyone around here know of a program or programs that can do the following things with text:
- Frequency counts for parts of speech (nouns, verbs, adjectives, etc.).
- Sort or score words/phrases based on how abstract or concrete they are.
UPDATE: Thank you, everyone, for the suggestions and tips. I’ll try them out tomorrow when I get into the lab.
Since I asked without giving you any details, let me give you a brief, though vague, description of the project. A few years ago, another psychologist and I wrote a review/theory paper about a particular type of category that we thought sounded plausible and could have important implications for concept research. After we published the paper, we tried a bunch of different ways to test empirically for the existence of these categories, but it proved difficult, mostly due to my own lack of creativity, and ultimately the research program stalled. However, this spring, I sat down with another colleague who’d been doing research that was related, though not directly linked, to the paper. In one lunch (well, I just had coffee), he and I came up with a bunch of possible empirical routes, one of which involved the typical/ideal distinction that the concepts folks out there might recognize from Larry Barsalou’s work on ad hoc categories from the 80s and Doug Medin’s work on concepts and expertise. Basically, we wondered if the prototypical members of our type of category might be ideals, rather than central tendencies, much as the prototypical members of ad hoc categories, and of the categories of some experts, are ideals. If that were the case, then we’d have a pretty good way of determining whether a particular category was one of ours or not.
To make a long story short, we had participants list characteristics for, and examples of, typical and ideal members of various natural categories. We didn’t expect to find anything in this particular task (it’s meant to serve as a comparison for another task), but in entering all the characteristics people listed, I began to notice some things: possible differences in the word types (e.g., adjectives vs. nouns) used to describe different categories, and differences in the abstractness of the characteristics (not surprising, since adjectives tend to be more abstract). After the three of us working on the project talked it over, we decided there might be something interesting in there, but we weren’t sure exactly how to measure those sorts of things.
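To make the two measurements concrete, here is a minimal Python sketch of what they might look like: a tally of parts of speech and an average concreteness score over a set of listed characteristics. Everything specific in it is an assumption for illustration only. The `TOY_LEXICON` table, its part-of-speech tags, and its 1-to-5 concreteness ratings are made-up placeholders, and the example responses are invented rather than taken from our data. A real analysis would presumably swap in a proper part-of-speech tagger and a published set of concreteness norms.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> (part of speech, concreteness on a 1-5 scale).
# These tags and ratings are made-up placeholders, NOT values from published norms.
TOY_LEXICON = {
    "loyal": ("ADJ", 1.8),
    "friendly": ("ADJ", 2.0),
    "tail": ("NOUN", 4.9),
    "fur": ("NOUN", 4.8),
    "barks": ("VERB", 4.0),
}

PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, and lowercase."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def pos_counts(responses):
    """Count how often each part of speech occurs across all responses.

    Words missing from the lexicon are counted under "UNKNOWN".
    """
    counts = Counter()
    for response in responses:
        for word in tokenize(response):
            pos, _rating = TOY_LEXICON.get(word, ("UNKNOWN", None))
            counts[pos] += 1
    return counts


def mean_concreteness(responses):
    """Average the concreteness rating of every word found in the lexicon.

    Returns None when no word in the responses has a rating.
    """
    ratings = [
        TOY_LEXICON[word][1]
        for response in responses
        for word in tokenize(response)
        if word in TOY_LEXICON
    ]
    return sum(ratings) / len(ratings) if ratings else None


# Invented example: characteristics listed for a "typical" vs. an "ideal" dog.
typical = ["fur", "tail", "barks"]
ideal = ["loyal", "friendly"]

print(pos_counts(typical), mean_concreteness(typical))
print(pos_counts(ideal), mean_concreteness(ideal))
```

Comparing the two printed lines for each category would show whether ideal members attract more adjectives and lower concreteness scores than typical members, which is the pattern we thought we were seeing. The lookup-table approach is only a stand-in: it ignores context, so a word that can be either a noun or a verb gets a single fixed tag.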