Tech Report
A Comparative Study of Term Weighting Methods for Information Filtering
Users of an information filtering system can be expected to provide only a small amount of information to initialize their user profiles. Term weighting methods for information filtering therefore have somewhat different requirements from those for information retrieval and text categorization. We present a comparative evaluation of term weighting methods, including one novel method, relative document frequency, designed specifically for information filtering. The best weighting methods appear to be those that balance the use of user input with the use of data from the collection.