History of Science as a Challenge for Virtual Research Environments

Guenther Goerz
University of Erlangen-Nuremberg, Computer Science Dept., and MPIWG, Berlin
March 4, 2013

The digital age has given us enormous potential as well as new forms for the provision and analysis of scientific primary data and historical sources. With appropriate tools, the reinterpretation and multiple use of research data from different disciplinary perspectives, as well as the dissemination of results, have become much easier. Although digital data can in principle be accessed without temporal or spatial limitations, access to modern critical editions is still limited by legal issues, a problem that can only be overcome with an assertive open access strategy. The section title "jpeg or tiff? History of science in the digital age" also addresses the question of standards, which is an essential topic for interoperability and in particular for the flexible application of analysis and evaluation procedures. For the first time in history, it is possible to apply statistical, linguistic, and other methods to huge amounts of data. A new dimension of interpretation comes with symbolic and numerical simulations and visualization procedures. Last but not least, networked science offers many new opportunities for cooperation in research and for the publication of results. Of course, there are still many problems for which digital techniques have yet to be developed or improved, such as spatio-temporal models, formal representations of concept change, belief revision, or the modelling of argumentation structures.

As a motivating example, let us take up Kuhn's Copernican Revolution [3], first published in 1957, supposedly within a larger investigation on astronomical observations, instruments, mathematical astronomy and computation, and the role it played in Renaissance astronomy. Among the reviews of Kuhn's book one of the most important is [4].
Many of the twenty-seven reasons in favor of the Copernican theory outlined by Swerdlow are based on planetary observations, e.g. periods, retrograde movements, opposition of outer planets vs. limited elongation of inner planets, or brightness and elongation. Hypothesizing which data Copernicus could have known, we would expect the diverse sources to be available not only in digital form (e.g., as page images), but in representations allowing for uniform search, annotation and evaluation. For evaluation in particular, we would like our workspace to give access to tools for model building and calculation, so that we can assess Kuhn's findings and integrate them into a comprehensive context.

Virtual Research Environments (VREs) are software infrastructures which try to integrate the issues mentioned above in uniform and standardized ways. The JISC VRE programme [1] characterizes them as follows: "A VRE comprises a set of online tools and other network resources and technologies interoperating with each other to support or enhance the processes of a wide range of research practitioners within or across disciplinary and institutional boundaries. A key characteristic of a VRE is that it facilitates collaboration amongst researchers and research teams providing them with more effective means of collaboratively collecting, manipulating and managing data, as well as collaborative knowledge creation."

In more detail, salient features of VREs can be summarized as follows (cf. [2]):

- View, manipulate and enhance digitized images of documents and manuscripts within a portal framework.
- Search across multiple, distributed data sets, images and texts.
- Select, store and organise items from the above in a personal workspace.
- Add annotations and links to these items to store personal thoughts and responses.
- Support collaboration by allowing multiple researchers in separate locations to share a common view of the workspace, in conjunction with real time communication via chat, VoIP and desktop integration.
- Allow a collaborator to comment, point/highlight, discuss and annotate the items in the shared workspace.

To characterize workflows in the history of science and, more generally, in the transformation, translation, and transmission of cultural heritage items, discussions with Juergen Renn and others led to what I prefer to call the Scholarly Processing Cycle. It was motivated primarily by the documentation of collection objects and by requirements from the field of digital editions. It consists of four essential steps, ensuring authentication, authorization, persistence, and interoperability:

1. Provision of digital primary sources and their conditioning;
2. Modelling, including transcription and normalization, leading to annotated linked sources;
3. Semantic enrichment and interpretation, including collaborative refinement steps in scholarly communities;
4. Release, presentation and publishing of results, leading to new primary sources for future research.

Now, let us consider these four steps in detail. In the first step, digital primary sources, either captured or born digital, have to be conditioned, transformed into standard formats, and assigned metadata. Conditioning comprises various image processing steps, possibly the conversion of images to text by means of OCR, text transcription into a standardized (XML) format, post-processing operations and archiving.
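
The transcription part of this conditioning step might be sketched as follows. This is only a minimal illustration under stated assumptions, not the production chain of any particular VRE: it wraps already transcribed page texts in a small XML document whose element names are borrowed from the Text Encoding Initiative (TEI) guidelines as one possible standardized format. The function name `build_tei`, the image file names and the sample strings are hypothetical, and the output is not validated against a TEI schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def q(tag):
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_tei(title, source_note, pages):
    """Wrap transcribed pages in a minimal TEI-style document.

    pages is a list of (image_file, transcription) pairs; both values are
    assumed to come from earlier scanning and transcription steps.
    """
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace
    root = ET.Element(q("TEI"))

    # Metadata assigned to the source (header kept deliberately small).
    header = ET.SubElement(root, q("teiHeader"))
    file_desc = ET.SubElement(header, q("fileDesc"))
    title_stmt = ET.SubElement(file_desc, q("titleStmt"))
    ET.SubElement(title_stmt, q("title")).text = title
    publication = ET.SubElement(file_desc, q("publicationStmt"))
    ET.SubElement(publication, q("p")).text = "Working transcription"
    source_desc = ET.SubElement(file_desc, q("sourceDesc"))
    ET.SubElement(source_desc, q("p")).text = source_note

    # Transcription, with a page break that points at each page image.
    text = ET.SubElement(root, q("text"))
    body = ET.SubElement(text, q("body"))
    for image_file, transcription in pages:
        ET.SubElement(body, q("pb"), {"facs": image_file})
        ET.SubElement(body, q("p")).text = transcription
    return root


if __name__ == "__main__":
    # Placeholder input, not taken from any real source.
    doc = build_tei(
        "Sample observation notebook",
        "Hypothetical manuscript, shelfmark unknown",
        [("page_001.tif", "First transcribed page."),
         ("page_002.tif", "Second transcribed page.")],
    )
    print(ET.tostring(doc, encoding="unicode"))
```

Keeping the reference to the page image next to the transcribed text is what later allows text and image to be shown side by side and annotated together.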
It is important to note that digital media reach far beyond text and images: they comprise various multimedia formats as well as dynamic environments such as visualization and mapping modules and simulation models. XML production includes the collection, normalization and integration of text variants and the synchronous expression of text-image relations, for which expressive XML languages such as the modular system provided by the Text Encoding Initiative (TEI) are available.

Step two, modelling, addresses the building of conceptual models for curated knowledge, as is typical for cultural heritage institutions, and their application to the digital resources. Many metadata standards which have been around for a while were further developed into conceptual models for certain domains: formal ontologies, consisting of hierarchies of concepts and properties. To achieve interoperability and reusability, common parts of different domain ontologies were represented on a separate level as generic reference ontologies, e.g., CIDOC's Conceptual Reference Model (CRM), which has been adopted as an ISO standard (21127). Usually, such conceptual models are supplemented by thesauri and authority files, i.e., controlled vocabularies for person names, toponyms, etc. Ideally, an application domain has to be modelled only once by one or a few domain experts and, except for occasional updates, the models can be used by many. Such an inventory of standardized terms for concepts and properties serves for tagging and annotating digital resources, and formal descriptions of objects and events, which are fundamental for the CRM, can be constructed with them in certain domains in the form of associative networks. The terms can also support typed linking of parts of digital texts and images, internally and externally, with other resources for digital editions.
In the case of texts, we can think of dictionaries, thesauri, and tools for multilingual morphological analysis and terminology maintenance. Standardized representations facilitate text analysis with statistics modules, wordlist, concordance and other corpus and general linguistic tools, as well as interfaces to GIS services and various visualizations. The results of the second step are annotated linked sources with metadata, which serve as inputs for further processing yielding more comprehensive interpretations. It is important to keep in mind that the ontologically modelled concepts and properties are an important prerequisite for semantic annotation, but semantics in the proper sense does not come in until we provide a reasoning framework, which we will need for a true Epistemic Web.

The third step, semantic enrichment, addresses the issue of scholarly interpretation and collaborative research. With formal ontologies and thesauri as basic building blocks, formal knowledge representations can be extracted from the digital annotated resources. If, e.g., we have a specific text reporting a measurement of a planet's position, this could in CRM terms be instantiated as an observation event (in CRM: with actor, instrument, time, and place) containing a measurement (as an attribute assignment) of some entity with a dimension, etc., with certain numerical values. With analogous representations of other observations from other texts or tables, and depending on the knowledge model, a variety of inferences can be drawn regarding observers, places, planetary motion, etc. With the use of a reference ontology like the CRM as a generic semantic index, data federation from heterogeneous sources and by different scholars becomes possible. And finally, the uniformity of such formal representations makes it easier to combine them as basic ingredients for reconstructions of mental models.

A salient feature of VREs is the support of personal and group research workspaces.
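
To make the observation example from the third step more concrete, the following sketch stores two measurement events as subject-predicate-object statements and draws two simple conclusions from them. It is a toy illustration, assuming a plain in-memory list instead of a real triple store or description logic reasoner. The class and property labels follow common CIDOC CRM naming, but the exact identifiers depend on the CRM version and its serialization and should be checked against it. Every `ex:` identifier and every numerical value is an invented placeholder, not historical data.

```python
# Fragment of a class hierarchy (child -> parent), following CRM naming.
SUBCLASS_OF = {
    "crm:E16_Measurement": "crm:E13_Attribute_Assignment",
    "crm:E13_Attribute_Assignment": "crm:E7_Activity",
}

# Two observation events; all "ex:" names and numbers are placeholders.
TRIPLES = [
    ("ex:obs1", "rdf:type", "crm:E16_Measurement"),
    ("ex:obs1", "crm:P14_carried_out_by", "ex:observer_A"),
    ("ex:obs1", "crm:P16_used_specific_object", "ex:instrument_1"),
    ("ex:obs1", "crm:P4_has_time-span", "ex:night_1"),
    ("ex:obs1", "crm:P7_took_place_at", "ex:place_X"),
    ("ex:obs1", "crm:P39_measured", "ex:planet_P"),
    ("ex:obs1", "crm:P40_observed_dimension", "ex:dim1"),
    ("ex:dim1", "rdf:type", "crm:E54_Dimension"),
    ("ex:dim1", "crm:P90_has_value", 123.4),
    ("ex:dim1", "crm:P91_has_unit", "ex:degree"),
    ("ex:obs2", "rdf:type", "crm:E16_Measurement"),
    ("ex:obs2", "crm:P14_carried_out_by", "ex:observer_B"),
    ("ex:obs2", "crm:P7_took_place_at", "ex:place_Y"),
    ("ex:obs2", "crm:P39_measured", "ex:planet_P"),
    ("ex:obs2", "crm:P40_observed_dimension", "ex:dim2"),
    ("ex:dim2", "rdf:type", "crm:E54_Dimension"),
    ("ex:dim2", "crm:P90_has_value", 125.0),
    ("ex:dim2", "crm:P91_has_unit", "ex:degree"),
]


def objects(triples, subject, predicate):
    """All objects stated for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def is_a(cls, ancestor):
    """True if cls equals ancestor or is one of its (transitive) subclasses."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def instances_of(triples, cls):
    """Subjects whose asserted type is cls or a subclass of cls."""
    return {s for s, p, o in triples if p == "rdf:type" and is_a(o, cls)}


def measurements_of(triples, entity):
    """Collect observers, places, values and units for one measured entity."""
    rows = []
    for event in sorted(instances_of(triples, "crm:E16_Measurement")):
        if entity not in objects(triples, event, "crm:P39_measured"):
            continue
        for dim in objects(triples, event, "crm:P40_observed_dimension"):
            rows.append({
                "event": event,
                "observers": objects(triples, event, "crm:P14_carried_out_by"),
                "places": objects(triples, event, "crm:P7_took_place_at"),
                "values": objects(triples, dim, "crm:P90_has_value"),
                "units": objects(triples, dim, "crm:P91_has_unit"),
            })
    return rows


if __name__ == "__main__":
    # Deduction via the class hierarchy: measurements are also activities.
    print(sorted(instances_of(TRIPLES, "crm:E7_Activity")))
    # Federated view: who measured the same entity, where, and with what value.
    for row in measurements_of(TRIPLES, "ex:planet_P"):
        print(row)
```

Because both events use the same vocabulary, statements contributed by different scholars or extracted from different sources can be queried together, which is the sense in which a reference ontology acts as a generic semantic index.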
In particular, assistance is to be provided for the collection of elements (text, images, and multimedia objects), for annotation and linking, and for interfaces for collaboration and for the discussion of intermediate and final results within a community. In this way, semantically annotated sources will develop into knowledge bases with the help of inference techniques. Paths through personal workspaces can be regarded as preliminary stages of publications, open to peer review through the same communication mechanisms.

With the fourth and last step, Open Access Publication, research results are released for publication in various formats. Support for different forms of presentation and visualization is required, from online publications and print on demand in classical formats to virtual exhibitions and scholarly discussion forums. Online publications, at least, must include primary data and models. The VRE should support an XML production chain with Formatting Objects (FO) for formatted printing and for web publications. For the latter, enrichment by modelling and simulation will become a regular option. Furthermore, the connection between research and publication will become tighter in terms of micropublications.

With WissKI (DFG GO 452/6-1, accessed 1 Mar 2013), we (the Computer Science Department of Erlangen University, the Germanic National Museum Nuremberg, and the Zoologisches Museum Koenig Bonn) developed an approach to a research tool that aims to get as close as possible to the goals mentioned above. WissKI is a VRE focussing on object documentation, in particular for museums, i.e. the documentation of physical objects, handwritten and printed texts, images, etc. The methodological starting point for its development was a set of research questions motivated by the needs of museum documentation, object-based research, and interoperability.
It tries to support Unsworth's scholarly primitives [5] (discovering, annotating, comparing, referring, sampling, illustrating, representing) in a thoroughly generic fashion based on a reference ontology (configurable, but CRM in the actual case). Meaning is tied to the construction process and in particular encoded in the linked infrastructure. Among WissKI's design goals were the support of authorship of curated knowledge, authenticity, persistence, and link stability. It grew out of a general need for an infrastructure for interactive and net-based cooperation, but it is clearly oriented towards standardized semantic indexing in terms of reference ontologies, meshed with thesauri. In this fashion, WissKI tries to catch up with the epistemic level by integrating knowledge modelling and reasoning. Currently, analytic (deductive) reasoning is supported by description logic inference engines. Further developments aim at probabilistic inferences and synthetic (inductive) reasoning in order to support hypothesize-and-test cycles.

Our expectation is that VREs such as WissKI will support research by making (re)sources not only more readily available, but also more accessible from a subject perspective, i.e. in terms of thematic search and inference. Furthermore, they will facilitate linking and annotation and, in combination with ontology-based markup and indexing, provide new insights into the data by means of automatic reasoning methods. In the long run, federated sources will turn into knowledge bases.

References

[1] JISC: Virtual Research Environments Programme Phase 2 roadmap. Briefing paper, Joint Information Systems Committee, London, June 2006, accessed 1 Mar
[2] JISC: Virtual Research Environments Programme: Final Report VRE-SDM, Joint Information Systems Committee, London, March 2009, accessed 1 Mar
[3] Kuhn, T. S.: The Copernican Revolution. Planetary Astronomy in the History of Western Thought, Harvard University Press, Cambridge, Mass. and London, England, 18th ed.
[4] Swerdlow, N.: An Essay on Thomas Kuhn's First Scientific Revolution, The Copernican Revolution, Proceedings of the American Philosophical Society, vol. 148, no. 1, 2004
[5] Unsworth, J.: What is Humanities Computing and What is not?, in Jahrbuch für Computerphilologie, vol. 4, Darmstadt, 2002, accessed 1 Mar