Parallel Corpus

The ACTRES Parallel Corpus (P-ACTRES 2.0) is a bidirectional English-Spanish corpus consisting of original texts in one language and their translation into the other. P-ACTRES 2.0 contains over 6 million words considering both directions together, i.e., from original English texts to their Spanish translations (the former P-ACTRES 1.0) and from original Spanish texts to their English translations. The table below shows the composition:

P-ACTRES 2.0          English → Spanish                    Spanish → English
Sub-corpus              English     Spanish       Total    Spanish    English        Total
Books – fiction       1,276,791   1,357,296   2,634,087    766,796    790,173    1,556,969
Books – nonfiction      514,786     573,523   1,088,309     15,068     13,205       28,273
Newspaper articles      118,665     129,810     248,475          -          -            -
Magazine articles       114,634     122,024     236,658     37,027     33,825       70,852
Miscellaneous            40,178      49,026      89,204          -          -            -
Total (words)                                 4,296,733                          1,656,094
TOTAL Corpus: 5,952,827 words
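
The figures in the composition table can be cross-checked with a few lines of Python. This is only a minimal sketch: the dictionary names and the helper function are illustrative, and the assignment of the three Spanish→English rows to sub-corpora (fiction, nonfiction, magazine articles) is an assumption based on the order in which the figures appear.

```python
# Word counts per sub-corpus as (original-language words, translation words).
# Figures are copied from the composition table above.
EN_ES = {
    "Books - fiction": (1_276_791, 1_357_296),
    "Books - nonfiction": (514_786, 573_523),
    "Newspaper articles": (118_665, 129_810),
    "Magazine articles": (114_634, 122_024),
    "Miscellaneous": (40_178, 49_026),
}

# Assumption: these three rows belong to the sub-corpora named here.
ES_EN = {
    "Books - fiction": (766_796, 790_173),
    "Books - nonfiction": (15_068, 13_205),
    "Magazine articles": (37_027, 33_825),
}


def direction_total(subcorpora):
    """Sum original and translation word counts over all sub-corpora."""
    return sum(original + translation for original, translation in subcorpora.values())


en_es_total = direction_total(EN_ES)
es_en_total = direction_total(ES_EN)
corpus_total = en_es_total + es_en_total

print(f"EN->ES: {en_es_total:,} words")    # EN->ES: 4,296,733 words
print(f"ES->EN: {es_en_total:,} words")    # ES->EN: 1,656,094 words
print(f"Total:  {corpus_total:,} words")   # Total:  5,952,827 words
```

The per-direction sums agree with the totals reported in the table, which is what makes the row grouping above a plausible reading.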

P-ACTRES EN→ES contains slightly more than 2.5 million words distributed across five sub-corpora covering different text types: fiction books, non-fiction books, newspaper articles, magazine articles and miscellaneous texts. For the first two sub-corpora, excerpts of around 15,000 words have been extracted from a variety of books; for the other three, full articles or texts have been included.


P-ACTRES ES→EN is still under construction, mirroring the compilation of P-ACTRES EN→ES. At present, it contains ca. 1.7 million words, most of them in the books-fiction sub-corpus, while pairs of non-fiction books are currently being aligned.

P-ACTRES 2.0 has been compiled as a tool for corpus-based contrastive studies and translation studies, carried out either independently or jointly. It has proved useful for studies at both the lexico-grammatical and the rhetorical levels. It is searched with a browser originally developed for P-ACTRES 1.0 by Knut Hofland (University of Bergen) on the basis of CWB (Corpus Workbench). The browser was later modified to house both repositories by Hugo Sanjurjo-González (University of León) in collaboration with Knut Hofland.

A specialized parallel corpus of Spanish original texts and their translations into English. It consists of approximately 2,000,000 words in English and 2,000,000 in Spanish, and it includes annual reports and university manuals of marketing, macroeconomics, microeconomics and organization. This corpus provides material for studies comparing original English with translated English at the terminological, grammatical and textual levels.

A specialized parallel corpus of English original texts and their translations into Spanish. It consists of 49,421 words in English and 32,334 words in Spanish, and it includes magazine articles from The Economist published in Spanish in Actualidad Económica. It is used in contrastive rhetorical studies comparing original Spanish with translated Spanish.