Datasets for BETSY

Here are some datasets to play with:

  ERIC Abstracts: Abstracts written by the ERIC Clearinghouses on Assessment, Early Childhood, and Education Management. Excellent classification success.
  Federalist Papers: Coming.
  High School essays: Responses to a high school biology item. This is the data used in the JTLA paper. 2 groups.
  Grade 5 essays: Responses to a prompt, written by 5th graders. 3 score groups. BETSY does horribly with these essays (50% accuracy).
  Grade 8 essays: Still being typed.


The essays are all distributed as ZIP archives. Within each archive, each category is its own subdirectory.
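
For anyone who wants to inspect the essays outside of BETSY, here is a minimal Python sketch of reading one archive and grouping the essays by category. It is not part of BETSY. The function name load_essays_by_category is made up for illustration, and the sketch assumes that each essay is a separate text file and that the category is the name of the directory directly containing it.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def load_essays_by_category(zip_path):
    """Return {category: [essay_text, ...]} for one ZIP archive.

    Assumption: the category of an essay is the name of the directory
    that directly contains the essay file.
    """
    essays = defaultdict(list)
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no essay text
            path = PurePosixPath(info.filename)
            if len(path.parts) < 2:
                continue  # file sits outside any category directory
            category = path.parent.name
            # Assumption: plain text; undecodable bytes are replaced.
            text = archive.read(info).decode("utf-8", errors="replace")
            essays[category].append(text)
    return dict(essays)
```

Calling load_essays_by_category("essays.zip") (a hypothetical file name) would return one list of essay texts per category, so the length of each list gives the size of that group.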

Suggestions are welcome. Again, please keep me posted on your results using BETSY. - Larry Rudner