Available stopword lists are:
catalan
Catalan stopwords (obtained from
http://latel.upf.edu/morgana/altres/pub/ca_stop.htm),
romanian
Romanian stopwords (extracted from
http://snowball.tartarus.org/otherapps/romanian/romanian1.tgz),
SMART
English stopwords from the SMART information
retrieval system (as documented in Appendix 11 of
http://jmlr.csail.mit.edu/papers/volume5/lewis04a/),
which coincide with the stopword list used by the MC toolkit
(http://www.cs.utexas.edu/users/dml/software/mc/),
and a set of stopword lists from the Snowball stemmer project in different
languages (obtained from
http://svn.tartarus.org/snowball/trunk/website/algorithms/*/stop.txt).
Supported languages are danish, dutch, english, finnish, french, german,
hungarian, italian, norwegian, portuguese, russian, spanish, and swedish.
Language names are case sensitive. Alternatively, their IETF language tags
may be used.
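
The naming rules above (case-sensitive list names, with IETF language tags as an alternative) can be illustrated with a minimal sketch. This is not the actual lookup code: the function name resolve_stopword_list and both tables are hypothetical, and the sketch assumes that the tags in question are the two-letter primary language subtags. Tags are mapped only for the Snowball languages, since the text mentions tags only for those.

```python
# Hypothetical table: two-letter IETF primary language subtags for the
# Snowball languages listed above. An illustration, not the real lookup data.
IETF_TO_NAME = {
    "da": "danish", "nl": "dutch", "en": "english", "fi": "finnish",
    "fr": "french", "de": "german", "hu": "hungarian", "it": "italian",
    "no": "norwegian", "pt": "portuguese", "ru": "russian", "es": "spanish",
    "sv": "swedish",
}

# Every list name from the text, spelled exactly as documented.
LIST_NAMES = frozenset(IETF_TO_NAME.values()) | {"catalan", "romanian", "SMART"}


def resolve_stopword_list(kind):
    """Return the canonical list name for a list name or an IETF tag."""
    if kind in LIST_NAMES:        # exact, case-sensitive match on the name
        return kind
    if kind in IETF_TO_NAME:      # otherwise try the name as a language tag
        return IETF_TO_NAME[kind]
    raise ValueError(f"no stopword list for {kind!r}")
```

Under these assumptions, resolve_stopword_list("en") and resolve_stopword_list("english") both resolve to "english", while "English" is rejected because matching is case sensitive.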