Job: Research Assistant (E13 TV-L, qualification) in machine learning, TU Berlin

Dear members of the HNR community,

This position, shared via Prof. Matteo Valleriani and Dr. des. Hassan El-Hajj, might be of interest to some of you (more info at

Research Assistant – salary grade 13 TV-L Berliner Hochschulen – For qualification

Part-time employment may be possible

The Berlin Institute for the Foundations of Learning and Data (BIFOLD) at Technische Universität Berlin (Prof. Dr. Klaus-Robert Müller) is seeking a Research Associate in Machine Learning for an Agility subproject. The Agility project will be carried out in close cooperation with the project “The Sphere. Knowledge System Evolution and the Shared Scientific Identity of Europe” by Prof. Dr. Matteo Valleriani at the Max Planck Institute for the History of Science in Berlin.

Valleriani’s group is developing algorithms to study knowledge systems in the history of science. Building on a dataset extracted from astronomical tracts of the early modern period (ca. 1450-1650), the project aims to identify mechanisms of knowledge evolution and to quantify these processes. The data comprise texts, images, and numerical computational tables. The focus of this unit is on transcribing, augmenting and analyzing texts using machine learning.



The position involves independent and responsible research in the area of machine learning. The goal is to quantitatively determine semantic relations between texts.

The tasks involved are:

  • Data extraction from over 110,000 pages of the Sphaera corpus
  • Building efficient image segmentation pipelines and fine-tuning OCR approaches to adapt to different early modern print styles and languages
  • Improving text recognition for under-represented languages by transferring modern language technology, e.g., Large Language Models
  • Developing and analyzing approaches for extracting historical insights from the results
  • Communicating results through presentations
  • Assisting in the maintenance and enhancement of the Sphaera database (semantic technologies)

Requirements:


  • Successfully completed academic university degree (Master, Diplom or equivalent) in Mathematics, Physics, Computer Science, Data Science, Digital Humanities or related field;
  • Good knowledge of German and/or English required; willingness to acquire the respective missing language skills;
  • Solid mathematical foundations, especially in statistics and probability theory, analysis and linear algebra;
  • Proven experience in machine learning and data science with a strong understanding of algorithms, statistics and mathematical concepts;
  • Very good programming skills in Python, and solid knowledge of common machine learning frameworks such as PyTorch, TensorFlow or scikit-learn;
  • Familiarity with state-of-the-art (SOTA) machine learning models and approaches;
  • Familiarity with Explainable Artificial Intelligence (XAI);
  • Ability to interact with a team of historians and other ML experts.

Desirable qualifications:

  • Solid knowledge of SQL and SPARQL for efficient data extraction, manipulation and analysis;
  • Good knowledge of knowledge graph data structures (e.g. RDF);
  • Good understanding of network analysis;
  • Experience with version control systems (e.g. Git) and Docker containers;
  • Strong communication skills in English and the ability to explain complex topics to a broad audience from different backgrounds (i.e. both historians and computer scientists);
  • Basic knowledge of HTML and JavaScript;
  • Experience in implementing ML methods;
  • Interest in humanities/historical subjects;
  • Publications relevant to machine learning.

For information on how to apply, please visit

Published by Aline Deicke
December 15, 2023
