About the Project

The BYU-Oxford University Syriac Corpus Project aims to prepare a large corpus of annotated Syriac texts using machine-assisted annotation systems. Texts will be annotated with grammatical information, linked to one or more dynamic lexica, and made freely accessible, both to researchers and as open linked data to project partners.


Project Team & Partners


Project Directors

Dr. Kristian Heal (Maxwell Institute, BYU)

Prof. Eric Ringger (Computer Science, BYU)

Dr. David Taylor (University of Oxford)

Computer-Assisted Annotation Development

Prof. Deryle Lonsdale (Linguistics, BYU)

Prof. Eric Ringger (Computer Science, BYU)

Prof. Kevin Seppi (Computer Science, BYU)

Current Student Associates

Paul Felt (MS, PhD Student, Computer Science, BYU)

Kevin Black (MS Student, Computer Science, BYU)

Past Student Associates

James Carroll (PhD, Computer Science, BYU)

Marc Carmen (MA, Linguistics, BYU)

Robbie Haertel (PhD, Computer Science, BYU)

Peter McClanahan (MS, Computer Science, BYU)

Contributing Partners

Prof. Stephen Kaufman, The Comprehensive Aramaic Lexicon

Dr. George Kiraz, Gorgias Press & Beth Mardutho: The Syriac Institute

Profs. Bas Romeny & Wido van Peursen, The Leiden Peshitta Institute

Prof. Michael Sokoloff (Bar Ilan), Syriac Lexicon

Strategic Partners

Prof. David Michelson, The Syriac Reference Portal

Publications and Presentations


Eric Ringger, Peter McClanahan, Robbie Haertel, George Busby, Marc Carmen, James Carroll, Kevin Seppi, and Deryle Lonsdale. June 2007. “Active Learning for Part-of-Speech Tagging: Accelerating Corpus Annotation.” In Proceedings of the ACL Linguistic Annotation Workshop, Association for Computational Linguistics, Prague, Czech Republic. Pp. 101-108.

Deryle Lonsdale, “A Computational Perspective on Syriac Corpus Development and Annotation” (IOSOT, Ljubljana, Slovenia).

Kristian Heal & David Taylor, “Towards an Electronic Corpus of Syriac Texts” (IOSOT, Ljubljana, Slovenia).


Robbie Haertel, Kevin Seppi, Eric Ringger, James Carroll. December 2008. “Return on Investment for Active Learning.” In the Proceedings of the NIPS 2008 Workshop on Cost-Sensitive Machine Learning. Whistler, British Columbia, Canada.

Robbie Haertel, Eric Ringger, Kevin Seppi, James Carroll, Peter McClanahan. June 2008. “Assessing the Costs of Sampling Methods in Active Learning for Annotation.” In the Proceedings of the Conference of the Association for Computational Linguistics (ACL-NAACL: HLT 2008), Columbus, Ohio.

Eric Ringger, Marc Carmen, Robbie Haertel, Noel Ellison, Kevin Seppi, Deryle Lonsdale, Peter McClanahan, James Carroll. May 2008. “Assessing the Costs of Machine-Assisted Corpus Annotation through a User Study.” In the Proceedings of the Language Resources and Evaluation Conference (LREC) 2008.

James L. Carroll, Robbie Haertel, Peter McClanahan, Eric Ringger, and Kevin Seppi. 2008. “Modeling the Annotation Process for Ancient Corpus Creation.” In Proceedings of the 2007 Conference on Electronic Corpora of Ancient Languages (ECAL). Prague, Czech Republic.

Kristian Heal & Eric Ringger, “The BYU-Oxford Corpus of Syriac Literature: An Interim Report” (Symposium Syriacum X, Granada, Spain).


Kristian S. Heal, “The BYU-Oxford Corpus of Syriac Literature,” at the Launching Conference of the RNP Comparative Oriental Manuscript Studies, Hamburg, 2009.*


Peter McClanahan, George Busby, Robbie Haertel, Kristian Heal, Deryle Lonsdale, Kevin Seppi & Eric Ringger, “A Probabilistic Morphological Analyzer for Syriac.” Pages 810-20 in Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, edited by Hang Li and Lluís Màrquez. Cambridge, MA: Association for Computational Linguistics, 2010.

Kristian S. Heal, “Corpora, eLibraries and databases: Locating Syriac Studies in the 21st Century,” at Beth Mardutho/Syriac Institute Symposium on Syriac Libraries, May 2010.*


Paul Felt, Eric Ringger, Kevin Seppi, Kristian Heal, Robbie Haertel & Deryle Lonsdale, “First Results in a Study Evaluating Pre-labeling and Correction Propagation for Machine-Assisted Syriac Morphological Analysis.” Pages 878-885 in Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC ’12). Istanbul, Turkey, 2012.

Kristian S. Heal, “Corpora, eLibraries and Databases: Locating Syriac Studies in the 21st Century.” Hugoye: Journal of Syriac Studies 15.1 (2012): 65-78.

Kristian S. Heal, “Report on Syriac Projects at BYU,” at the XI quadrennial Symposium Syriacum, Malta, 2012.*


Paul Felt, Eric Ringger, Kevin Seppi, Kristian Heal, Robbie Haertel & Deryle Lonsdale, “Evaluating Machine-Assisted Annotation in Under-Resourced Settings.” Language Resources and Evaluation (accepted).

Kristian S. Heal, “Accessing Late Antiquity: Syriac Digital Humanities Projects at BYU,” Invited Lecture, Committee for the Study of Late Antiquity, Princeton University, Feb. 20th, 2013.*