Digital archive visualization tools: lessons from the Biolographes experiment

As part of the ANR Biolographes project (2012-2016)[1], which brings together French and German researchers under the direction of Gisèle Séginger (University of Marne-la-Vallée) and Thomas Klinkert (University of Zurich), we aim to represent the networks that linked French-language men of letters and naturalists in the 19th century.

More broadly, “Biolographes” seeks to understand how knowledge of the living, popularization and texts considered literary interacted, particularly at a time when literary form was not yet foreign to the scholarly world and when a vast territory of “literary communication” [VAI 06] was opening up, with aims that were neither exclusively aesthetic nor exclusively informative. Conversely, the objects of science still occupied the leisure time of so-called literary men, and the participation of enlightened amateurs in the scientific debates of the time was still frequent enough for a man of letters not to be immediately discredited for raising these subjects. A canonical example is the invention of the “Alpine” subject by literary texts (Haller’s didactic poem The Alps (1732) and Rousseau’s New Heloise, in particular), whose considerable popularity inspired the vocations of naturalist-explorers. This singularity of the 19th century rules out thinking of the relationship between science and literature as simple influence, and commits us to approaches more sophisticated than describing “relationships” that run from the scholar to a literature that merely puts scientific material into words. We therefore hoped to explain what circulated between our different scientific and literary poles, and how. It quickly appeared that words could circulate as well as ideas and concepts (including concepts reduced to the rank of keywords), and that the vectors of this movement were people, who exchanged in person or by correspondence, just as much as texts, magazines or books that responded to one another. The team was quite large: it initially gathered 10 German and 12 French researchers, including doctoral candidates, from different universities reaching as far as Switzerland. Only three of them, who came from the Euterpe project, had experience in creating and exploiting digital archives.

The project’s first, symbolic gesture was therefore to create a “stratified” online chronology with a simple tool, Preceden. This provided an initial, admittedly rudimentary, means of visualizing concurrences or, on the contrary, marked discrepancies. It then remained to determine whether they were significant: the existence of delaying effects, and even the persistence of an older order, had been anticipated as early as the pre-project phase. This first step showed the need to involve the researchers, in this case the team’s historians of science, well upstream, in the data collection phase, whereas they more readily see themselves as end users of the assembled data. It was their disciplinary expertise that had to be relied upon to assess the importance, in a given field, of a particular event, publication or name that seemed obscure to non-specialists but holds special meaning for experts.

The second task was to complete the bibliography with a list of magazines likely to have served as intermediaries between specialists and laypeople, which were then analyzed. The digital tool, initially planned exclusively for visualizing sociability networks, now seemed bound to take on an unexpected importance: the data to be correlated, already abundant, extended across a spatial, even geographic, plane as well as a diachronic one. The project then took a sort of “digital turn”, if we may call it that, for it seemed that only digital tools could truly meet the initial aims. We called upon the skills of two specialists, Samuel Szoniecky in information sciences and Philippe Gambette in automatic language processing, as well as a postdoctoral researcher with the same training.

For non-specialists, the first difficulty of this type of project lies in identifying the tools that will maximize the return on the required investment, understood here as an investment of human time. As noted above, one of the conditions for success is the involvement of researchers in assembling the data and using the visualization tools: it is therefore important to bring the researchers on board and make them aware of the tools’ potential. At this stage, every presentation of progress is also an exercise in communication. Moreover, we hoped that the team could take ownership of the tools, which excluded powerful but austere tools like R, or libraries like d3.js that assume JavaScript skills. After reflection, we also excluded Gephi, a very widely used tool, but one that looks austere and requires graphic work to improve the visualizations obtained and make them fully convincing, work that we did not want to undertake at an experimental stage.

As a first step, we tested tools available on the Web with data from a spreadsheet created by Muriel Louâpre to identify the points of contact between knowledge of the living and an author central to these issues, the successful historian and popularizer of science Jules Michelet. The aim was to locate his direct and indirect relationships with scholars or professional popularizers and to qualify each relationship (contact in person, by letter, through books, etc.). We also recorded the geographic or institutional spaces and locations where contact took place, anticipating the long-term advantage of having maps of exchanges available for all the writers in the corpus (places of residence, institutions, vacations).
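By way of illustration only, a spreadsheet of this kind can be read as a list of qualified links between people. The short Python sketch below is not the tool used in the project: the column names (source, target, relation, place) and the sample rows ("Scholar A", "Popularizer B", "Scholar C") are hypothetical placeholders, not data from the corpus. It simply shows how direct and indirect relationships around Michelet could be told apart once each row records who is linked to whom and how.

```python
import csv
import io
from collections import defaultdict, deque

# Hypothetical spreadsheet export: the column names and rows are
# placeholders for illustration, not data from the Biolographes corpus.
SAMPLE_CSV = """source,target,relation,place
Jules Michelet,Scholar A,in person,Paris
Jules Michelet,Popularizer B,letter,
Scholar A,Scholar C,book,
"""

def load_edges(text):
    """Parse CSV text into a list of dicts, one per qualified relationship."""
    return list(csv.DictReader(io.StringIO(text)))

def build_graph(edges):
    """Build an undirected adjacency map: person -> set of contacts."""
    graph = defaultdict(set)
    for edge in edges:
        graph[edge["source"]].add(edge["target"])
        graph[edge["target"]].add(edge["source"])
    return graph

def contact_distances(graph, start):
    """Breadth-first search: number of links separating each person from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for other in graph[person]:
            if other not in dist:
                dist[other] = dist[person] + 1
                queue.append(other)
    return dist

edges = load_edges(SAMPLE_CSV)
graph = build_graph(edges)
dist = contact_distances(graph, "Jules Michelet")

# Direct relationships are one link away; indirect ones are further.
direct = sorted(p for p, d in dist.items() if d == 1)
indirect = sorted(p for p, d in dist.items() if d > 1)
```

The relation and place columns are kept on each row so that links can later be filtered by type of contact or plotted on a map; the sketch itself only uses the two endpoints of each link.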

