CLANES

CLANES is an English/Spanish comparable corpus made up of seven sub-corpora in the food and drink domain. It contains just over 1.4 million words (1,416,785) across both languages and is grammatically and semantically annotated. Pragmatic annotation is currently being tested and will be deployed shortly. CLANES is being used to design authorship support materials and serves as the basis for the CLANES drafter-type CNL. The table below shows its composition:

Name | Number of words (EN) | Number of words (ES) | Contents
Word count | 704,451 | 712,334 |
C-FIVI: Wine tasting notes | 52,413 | 54,408 | http://contraste2.unileon.es/web/es/corpus0_FIVI.html
BiTeX: Culinary recipes | 227,294 | 251,218 | http://contraste2.unileon.es/web/es/corpus0_BiTeXcook.html
C-FITECVI: Wine technical sheets | 118,846 | 118,760 | http://contraste2.unileon.es/web/es/corpus0_C-FITECVI.html
C-GDQ: Cheese descriptions | 121,461 | 111,871 | http://contraste2.unileon.es/web/es/corpus0_C-GDQ.html
ACTEaS_Promo: Promotional texts for herbal teas | 100,288 | 45,313 | http://contraste2.unileon.es/web/es/corpus0_ACTEaS_Promo.html
C-GEFEM: Sausages and cured meats | 34,681 | 70,994 | http://contraste2.unileon.es/web/es/corpus0_C-GEFEM.html
C-BakedGoods: Bakery and pastry descriptions | 49,468 | 59,770 | http://contraste2.unileon.es/web/es/corpus0_C-BakedGoods.html
TOTAL (EN + ES) | 1,416,785
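The per-language word counts and the overall total can be re-derived from the sub-corpus figures. The following is a minimal Python sketch of that arithmetic; the figures are copied from the table above, while the names `CLANES_COUNTS` and `language_totals` are illustrative assumptions and not part of any CLANES tooling.

```python
# Word counts copied from the table above: (English, Spanish) per sub-corpus.
# The dictionary and function names are hypothetical, chosen for illustration.
CLANES_COUNTS = {
    "C-FIVI": (52_413, 54_408),
    "BiTeX": (227_294, 251_218),
    "C-FITECVI": (118_846, 118_760),
    "C-GDQ": (121_461, 111_871),
    "ACTEaS_Promo": (100_288, 45_313),
    "C-GEFEM": (34_681, 70_994),
    "C-BakedGoods": (49_468, 59_770),
}


def language_totals(counts):
    """Return (english_total, spanish_total) summed over all sub-corpora."""
    en_total = sum(en for en, _ in counts.values())
    es_total = sum(es for _, es in counts.values())
    return en_total, es_total


en_total, es_total = language_totals(CLANES_COUNTS)
print(f"EN: {en_total:,}  ES: {es_total:,}  TOTAL: {en_total + es_total:,}")
# prints: EN: 704,451  ES: 712,334  TOTAL: 1,416,785
```

The sums match the "Word count" and "TOTAL" rows, so the corpus is closely balanced overall between the two languages even though individual sub-corpora (for example ACTEaS_Promo and C-GEFEM) are not.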

OTHER COMPARABLE CORPORA

Name | Number of words (EN) | Number of words (ES) | Contents
C-GDPE: Electronic products | 32,519 | 46,878 | http://contraste2.unileon.es/web/es/corpus0_ElectronicProducts.html
C-GITEC: Technical reports | 28,844 | 54,940 | http://contraste2.unileon.es/web/es/corpus0_GITEC.html
C-GIT: Appliance instruction manuals | 174,134 | 206,669 | http://contraste2.unileon.es/web/es/corpus0_GIT.html
C-GARE: Meeting minutes | 139,919 | 174,347 | http://contraste2.unileon.es/web/es/corpus0_GARE.html
C-AuRs: Audit reports | 90,105 | 117,082 | Unavailable on confidentiality grounds.
C-FMR: Football match reports | 30,986 | 30,153 | http://contraste2.unileon.es/web/es/corpus0_FMR.html
C-OPRES: Opinion articles | 1,007,414 | 1,007,384 | http://contraste2.unileon.es/web/es/corpus0_OPRES.html
C-GAC: Biomedical abstracts | 15,113 | 14,484 | http://contraste2.unileon.es/web/es/corpus0_BioABSTRACTS_C-ACTRES.html
C-CT: Clinical trials | 80,808 | 89,129 | http://contraste2.unileon.es/web/es/corpus0_CT.html