Markup Languages and XML

Extensible Markup Language (XML) is a system for creating markup languages: languages for digital text representation that let us convey information about textual structure and content in a form that both humans and computers can read. The use of XML in digital scholarship focuses on communicating what the data is, and XML-based markup languages can be created to suit a project’s specific needs. DSG uses XML for projects that include full-text data (such as the Early Caribbean Digital Archive, DHQ, TAPAS, and the Women Writers Project), where the structure and content of the texts are important aspects of their research value. In XML-based projects, this information can support advanced searching, visualization, and other forms of analysis.
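
As a rough illustration of how markup can support searching, the sketch below encodes a short invented letter and then asks for every place and person it names. The element names (`letter`, `salutation`, `persName`, `placeName`, `signed`), the sample text, and the helper `names_of` are assumptions made up for this example, not the vocabulary of any DSG project; the parsing uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup invented for this sketch: the tags say what each
# piece of text *is* (a person's name, a place name, a signature).
SAMPLE = """<letter date="1789-03-14">
  <salutation>Dear <persName>Anne</persName>,</salutation>
  <p>I arrived in <placeName>Boston</placeName> on Tuesday and
     will travel to <placeName>Salem</placeName> next week.</p>
  <signed><persName>Eliza</persName></signed>
</letter>"""


def names_of(tag, xml_text=SAMPLE):
    """Return the text of every element called `tag`, in document order."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(tag)]


if __name__ == "__main__":
    print("Places:", names_of("placeName"))
    print("People:", names_of("persName"))
```

Because the names are tagged rather than merely typed, a query like this does not have to guess which words are places; the same idea scales up to indexes, maps, and other visualizations.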
XML Publishing Tools


The Text Encoding Initiative (TEI) is an XML language for representing documents, particularly those intended for digital scholarly research. Examples of TEI projects here at Northeastern include TAPAS and the Early Caribbean Digital Archive.
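
For readers who have not seen TEI before, the following sketch shows what a very small TEI-style document might look like and how it could be read with Python's standard library. The document text is invented, and the helper `tei_summary` is a hypothetical name used only here; the namespace URI `http://www.tei-c.org/ns/1.0` and the element names are the ones TEI P5 defines. The sketch only parses the file and does not validate it against a TEI schema.

```python
import xml.etree.ElementTree as ET

# TEI elements live in this XML namespace, so queries must name it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A deliberately tiny, invented document: a header describing the file,
# followed by the encoded text itself.
TEI_SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>A Sample Letter</title></titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Invented for this example.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>I arrived in <placeName>Boston</placeName> on Tuesday.</p>
    </body>
  </text>
</TEI>"""


def tei_summary(xml_text=TEI_SAMPLE):
    """Return the document title and the list of tagged place names."""
    root = ET.fromstring(xml_text)
    title = root.findtext(
        "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title",
        namespaces=TEI_NS,
    )
    places = [el.text for el in root.iterfind(".//tei:placeName", TEI_NS)]
    return title, places


if __name__ == "__main__":
    print(tei_summary())
```

The header and the text are kept separate so that information about the file (title, source, publication status) can be searched independently of the transcription.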

Northeastern University is an institutional member of the TEI Consortium, and members of the Northeastern community can take advantage of membership benefits including discounts on the Oxygen XML editor, membership in TAPAS, and discounts on registration for the TEI annual conference. If you use TEI and would like to use Northeastern’s account, please sign up here.

TEI P5 Guidelines