2024/05/01 10:15:33

One day we sit down at the computer and click a link we use all the time, and what appears on the screen is not the familiar web page but a string of annoying characters: "404 File not found". I believe everyone who has used the Internet has had this unpleasant experience more than once. The Internet has opened up a whole new world for us and made the ideal of information reaching everywhere a reality, but it is a world full of variables and uncertainty. It is estimated that about 16% of links become dead links (a phenomenon known as link rot) every six months. What the Internet lacks most is no longer information, but rules.

The instability and haphazard loss of information are disastrous for academic research. Research results built on large amounts of unreliable information are castles built on sand. Some organizations and institutions have recognized the seriousness of this problem and have begun to establish information standards for the Internet. DOI is one of the most effective of them.

DOI stands for Digital Object Identifier, a tool for identifying digital information, including information on the Internet. Traditional physical publications, whether books, periodicals, tapes, or CDs, are assigned international standard numbers such as ISBN, ISSN, and ISRC, together with barcodes, which serve as the unique identifier of a publication in the sea of books and periodicals. These labels allow publications to be managed effectively and make them easier to find and use. A document on the Internet, by contrast, disappears without a trace once its address (URL) changes, and can no longer be tracked down. Adding a DOI to digital information is like attaching a barcode to a publication: the item can be traced wherever it goes. DOI is therefore vividly called the barcode of digital resources.

1 The encoding method and technical characteristics of DOI

The birth of DOI can be traced back to 1994, when the Association of American Publishers (AAP) set up its Enabling Technologies Committee. The committee was tasked with designing a system to protect intellectual property and the commercial interests of copyright owners in the digital environment. Its first step was to introduce a publishing-industry standard identifier for digital information that would support interoperation among the various systems of publishers and users and provide a basis for the coordinated management of copyright and usage rights. The DOI system made its debut at the 1997 Frankfurt Book Fair and became a standard for naming digital resources. In 1998 the International DOI Foundation (IDF), a non-profit organization, was established in Frankfurt; it is responsible for policy formulation, technical support, name and address registration, and other services related to DOI.

1.1 The encoding method of DOI

The structural formula of DOI is:

<DOI> = <DIR>.<REG>/<DSS>

A DOI is divided into two parts, a prefix and a suffix, separated by a slash. The prefix is itself divided into two parts by a dot.

<DIR> is the code specific to DOI; its value is 10, which distinguishes DOI from other systems built on Handle System technology. <REG> (Registrant's Code) is the code of the registrant. It is assigned by the DOI management agency, the International DOI Foundation (IDF), and consists of four Arabic numerals. The suffix <DSS> (DOI Suffix String) is assigned by the registrant, at present mainly academic publishers. There are no restrictions on its form, as long as it is unique among all of that publisher's products. For example, the following are all legal DOIs:

10.1234/5678

10.2341/0-7645-4889-1

10.5678/978-0-7645-4889-4

10.1000/ISBN0764548891

10.1234/Norman-presentation

10.2224/2003-1-29-CENDI-DOI

The naming structure of DOI enables each digital resource to be uniquely identified worldwide. A DOI differs from a URL: it is the name of a digital resource and is independent of its address. It is in fact a kind of URI (Uniform Resource Identifier) or URN (Uniform Resource Name), a digital label or ID for information. With it, information becomes unique and traceable.
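The structure described above (the fixed code 10, a dot, a four-digit registrant code, a slash, and a free-form suffix) can be checked mechanically. The following is a minimal sketch, not an official validator: the names `Doi` and `parse_doi` are hypothetical, and the four-digit rule follows this article's description, whereas registrant codes in use today can be longer.

```python
import re
from typing import NamedTuple


class Doi(NamedTuple):
    directory: str   # the fixed DOI code, always "10"
    registrant: str  # registrant code assigned by the IDF
    suffix: str      # free-form string chosen by the registrant


# "10", a dot, four digits, the first slash, then a non-empty suffix.
# The four-digit limit mirrors the article; it is an assumption, not a rule
# that holds for every DOI in circulation.
_DOI_RE = re.compile(r"^(10)\.(\d{4})/(.+)$")


def parse_doi(text: str) -> Doi:
    """Split a DOI into its three parts, or raise ValueError."""
    match = _DOI_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a DOI of the form 10.NNNN/suffix: {text!r}")
    return Doi(*match.groups())


if __name__ == "__main__":
    for example in ("10.1000/ISBN0764548891", "10.2224/2003-1-29-CENDI-DOI"):
        print(parse_doi(example))
```

Only the first slash separates prefix from suffix, so a suffix may itself contain slashes; uniqueness of the suffix within a publisher's products is something only the registrant can guarantee and is not checked here.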

1.2 Technical characteristics of DOI

DOI is based on two technologies: the Handle System and the <indecs> metadata framework.

The Handle System is a technology platform developed by the Corporation for National Research Initiatives (CNRI) for naming, resolving, and managing information on the Internet. <indecs> (Interoperability of Data in Ecommerce Systems) is a metadata framework for achieving data interoperability in an e-commerce environment. Choosing <indecs> as the metadata framework lays a foundation for the various applications of DOI.

Together, the Handle System and the <indecs> metadata framework give DOI applications ranging from single resolution to multiple resolution. The single-address resolution mechanism, the first to be applied, gives users permanent access to digital resources. To prevent links from breaking when a resource's address changes, the DOI system manages resource addresses actively. When a publisher registers a DOI for one of its resources, it must also submit the DOI name and the resource's URL to the Handle System host. The publisher is responsible for maintaining its DOI data: when a resource's address changes, for example when an online journal article is moved from the current-issue directory to the archive directory, the publisher should notify the Handle System host so that the record is updated and the link remains valid. When a user clicks on a resource's DOI, the request is sent to a Handle System server, which resolves the DOI into a URL and returns it to the user's client, giving the user access to the resource. All of this happens in the background. Users need not care about changes in a resource's address; they always see the same DOI. In theory, the links provided by DOIs remain valid permanently.

Providing permanent links to resources is only a basic, preliminary application of DOI. The Handle System itself also supports multiple resolution: a DOI can point not just to one URL but to several URLs, as well as to other types of data besides URLs. The following diagram shows how a DOI can be resolved into multiple types of data:

[Figure: a DOI resolved into multiple types of data]

Multiple resolution gives users more choice and convenience. When a DOI resolves to several URLs, users can download the data from the nearest mirror site. They can also follow links to a wealth of information related to the resource: its metadata, works on related subjects, related review literature, other works by the same author, related multimedia such as music, pictures, and animations, and information about and contact details for copyright holders and publishers. Multiple resolution not only guarantees access to a resource but also opens the door to many deeper uses of it.
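The two mechanisms can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the Handle System's actual interface: the class `ToyResolver`, its method names, the value types "URL" and "EMAIL", and the example.org-style addresses are all hypothetical. It shows a publisher registering a DOI, updating the address when the article moves, and adding further typed values that multiple resolution can return.

```python
from typing import Dict, List, Optional, Tuple


class ToyResolver:
    """In-memory stand-in for a resolution service (illustrative only)."""

    def __init__(self) -> None:
        # Each DOI maps to a list of (value type, data) pairs.
        self._records: Dict[str, List[Tuple[str, str]]] = {}

    def register(self, doi: str, url: str) -> None:
        """The publisher submits the DOI name together with the resource URL."""
        self._records[doi] = [("URL", url)]

    def add_value(self, doi: str, value_type: str, data: str) -> None:
        """Attach a further value (mirror URL, contact address, ...) to the DOI."""
        self._records[doi].append((value_type, data))

    def update_url(self, doi: str, old_url: str, new_url: str) -> None:
        """The publisher reports a moved resource; the DOI itself never changes."""
        self._records[doi] = [
            ("URL", new_url) if value == ("URL", old_url) else value
            for value in self._records[doi]
        ]

    def resolve_all(self, doi: str, value_type: Optional[str] = None) -> List[str]:
        """Multiple resolution: every value, optionally filtered by type."""
        return [
            data
            for kind, data in self._records[doi]
            if value_type is None or kind == value_type
        ]

    def resolve(self, doi: str) -> str:
        """Single resolution: the first URL registered for the DOI."""
        return self.resolve_all(doi, "URL")[0]


resolver = ToyResolver()
doi = "10.1234/5678"
resolver.register(doi, "https://publisher.example/current/article.html")
# The article moves from the current issue to the archive.
resolver.update_url(
    doi,
    "https://publisher.example/current/article.html",
    "https://publisher.example/archive/article.html",
)
resolver.add_value(doi, "URL", "https://mirror.example/article.html")
resolver.add_value(doi, "EMAIL", "rights@publisher.example")
print(resolver.resolve(doi))
print(resolver.resolve_all(doi))
```

The reader's side never changes: the same DOI is passed to `resolve` before and after the move, and only the publisher's record is edited.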

2 Application and development prospects of DOI

At present, more than 300 organizations and institutions have joined DOI, and the number of DOI records is close to 10 million. Documents carrying DOIs have grown from English alone to many languages, including French, German, Spanish, Italian, and Korean. DOI is currently used mainly for text, but encoding non-text objects such as sound and images is already being explored.

2.1 The successful application of CrossRef

DOI provides a powerful tool for ensuring stable links to online academic resources, and this is the area where it was first put to effective use, with the birth of CrossRef.

CrossRef is a reference-linking system. In September 2000 it became the first registration agency authorized by the International DOI Foundation. Academic publishers that join CrossRef assign DOIs to the papers they publish. When users see such a paper in the reference list of another paper, they need only click its DOI to reach the page where the paper resides and read the abstract or full text. CrossRef creates dynamic links between the references of academic papers, which is a great convenience for academic research, and it has been very successful.
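A clickable reference of this kind amounts to a link that hands the DOI to a resolver. The sketch below is a hedged illustration rather than CrossRef's own tooling: it assumes the public proxy at https://doi.org/, which the article does not name, and the function `doi_link` and the sample reference titles are hypothetical.

```python
from urllib.parse import quote

# Assumption: the public DOI proxy; the original article does not name a resolver.
RESOLVER_BASE = "https://doi.org/"


def doi_link(doi: str) -> str:
    """Build a resolver link for a DOI.

    Characters that are unsafe in a URL path are percent-encoded; "/" is kept
    so the separator between prefix and suffix survives.
    """
    return RESOLVER_BASE + quote(doi, safe="/")


# Hypothetical reference list: (title, DOI) pairs.
references = [
    ("Hypothetical paper A", "10.1234/5678"),
    ("Hypothetical paper B", "10.1234/Norman-presentation"),
]

for title, doi in references:
    print(f"{title}: {doi_link(doi)}")
```

Because the link carries only the DOI and never the publisher's current address, the reference list does not need to be edited when the cited paper moves.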

Currently, about 200 publishing organizations have joined CrossRef. Since January 2001, CrossRef has added about 3 million DOIs every year, and its servers handle about 2 million resolution requests every month.

2.2 Application prospects in e-commerce

In fact, ensuring stable links to academic information is only a basic application of DOI. As mentioned earlier, the multiple resolution mechanism gives users links to a large amount of related information in addition to the resource itself. But there is more to it than that. DOI is an actionable identifier: it is designed to prompt action, and it is a system that promotes and serves e-commerce. This is why DOI adopts the <indecs> metadata framework.

Looking back at the history of DOI, it is clear that DOI is primarily a standard initiated and established by the publishing industry. It gives greater weight to the industry's need to promote e-commerce and to protect intellectual property rights and publishers' interests. The larger, principal goal of DOI is application in e-commerce. The various types of data that DOI delivers through multiple resolution contain all the basic elements that e-commerce requires. When a reader clicks a DOI link to a resource that must be paid for, the reader can be taken straight into the e-commerce process. For example, embedding DOI support in electronic document reading software makes it possible to order documents online. Users can follow a DOI directly to the publisher's website to buy eBooks or to pay to print electronic documents. The potential of DOI for e-commerce in the publishing industry is huge. This field is still at an early stage, but it is developing rapidly and some experimental projects have already been launched.

2.3 Limitations of DOI and the participation of the library community

Although DOI has begun to take shape and has great potential for development, it also has certain limitations. DOI's vetting of registration agencies is relatively strict, and to keep the system running its members must pay membership fees that are far from low. As a result, most participants in DOI today are large publishers, and the products of many small publishers remain outside its scope. Although the number of DOI records is considerable, it is still only a drop in the ocean compared with the massive amount of information on the network. Judging from who participates, the leading roles in the operation and development of DOI are still played by representatives of the publishing industry. This gives DOI a somewhat commercial flavor and hinders its wider promotion and application as an information standard.

A noteworthy development is that government agencies, libraries, and other representatives of information users are taking part in the development of DOI. The Stationery Office (TSO) in the UK, which publishes government documents, has become the first registration agency from the government sector. Meanwhile, the national libraries of Germany and the Netherlands and the British Library have joined the informal DOI forum. After all, digital information resources are the common wealth of all mankind, not the exclusive property of publishers. The encoding and interoperability of digital information matter greatly for the sharing and use of information resources, and they require the joint participation of all parties concerned so that the interests and requirements of each are reflected. As public-interest institutions that preserve and disseminate information resources, libraries should take an active part in formulating the rules for sharing digital resources, in order to safeguard the public's right to reasonable use of information and to balance the interests of copyright owners and users.

Originally published in "Library Work and Research", Issue 5, 2003

Responsible editor: Chu Xintong

About the author

He Zhaohui, PhD in history, is currently a professor at the Institute of Classical Literature, Institute of Advanced Confucian Studies, Shandong University. His main research fields are Ming history, the bibliography of editions, and book history. He is the author of "Research on County Government in the Ming Dynasty" and "Scholars and Commercial Publishing in the Late Ming Dynasty", and the translator of "The Social History of Books - Books and Scholar Culture in the Late Chinese Empire" and "Introduction to the History of Books", among other works.
