Abstract
Student archives record students’ learning activities during school and serve as important proof of their learning experiences, so building a credible digital archives system is of great significance. Blockchain is a data structure and information storage technology that became well known with the development of Bitcoin and is now widely used to build trusted digital environments. Blockchain technology can address the low confidentiality and easy tampering of traditional archives management. This paper therefore proposes a blockchain-based student archives management scheme, designs the key data structures and algorithms of the system model, and tests the functions and performance of the system model on the ESXi platform. Experimental results show that the proposed scheme effectively improves the confidentiality of archives and makes them harder to tamper with, while retaining good resource sharing and retrieval efficiency, making archives management safer, more credible and more intelligent.
Introduction
Blockchain is an information storage technology and data structure that has become widely known and used with the development of Bitcoin. It uses hash pointers to link blocks of stored data into a list, achieving a data storage mode that is decentralized, tamper-resistant, fully traceable, collectively maintained, open and transparent [1]. A smart contract is a program written to the blockchain that can carry out trusted, traceable, immutable transactions or functions without third-party supervision; it is a set of digital agreements through which the parties execute their commitments [2].
Student archives are detailed records of students’ participation in learning activities during school. They record students’ basic information, academic performance, comprehensive assessments, rewards and punishments, and other data. These records are an important basis for students applying for degrees, further education and employment, and directly affect students’ future development, so their authenticity and accuracy are very important. Student archive information is generated according to the school and government regulations in force at the time; it is objective and must not be tampered with, and dedicated personnel or systems must supervise it to protect students’ interests. Student archives accumulate gradually in chronological order: the earlier a piece of data was created, the more stable it is, and it cannot be modified [3]. The school’s regulation documents are likewise updated and saved by version in chronological order for easy reference.
Many universities now use information systems to improve the efficiency of archives management, but many problems remain in practice, and new ideas and technologies are needed to improve archives management. The traditional student archives management system generally uses centralized database storage, which carries risks such as data duplication, information inconsistency, information leakage, arbitrary changes, and hacker attacks. Inconsistencies in archive data and the room for human manipulation may trigger mistrust and disputes between archive owners and archives management departments. The problems can be summarized as follows.
The security and reliability of the archives are questioned
Student archive information is generated in many forms, and even electronic student files are often recorded or transmitted on paper before being entered into the system, so readers may question the content of the files [4]. When doubts arise, the original paper documents must still be found as supporting evidence, which is a return to the traditional way of archives management.
Inconvenient to query and share data
Although many universities have computerized archives management, the content of archives is generally created by many different departments, and the information is scattered across the information systems of those departments, which hinders rapid and comprehensive access to archival information. The development platform and data structure of each information system also differ, so data cannot be integrated and used across systems, and much redundant and conflicting information accumulates, forming information islands [5]. The login authentication of each information system differs as well, so users need to maintain multiple account passwords or identity certificates to query information, which is very inconvenient.
The update of archives is not standardized
The formation of archives is a long-term process, and some archives may have to be updated and adjusted as time and circumstances change. Completing these tasks manually is not only time-consuming but also subject to many uncontrollable factors, which works against the standardization of student archives management [6]. For paper archives, updating the content is cumbersome, and the old paper documents may be damaged or go missing. For electronic files in information systems, updating the content is relatively easy, but the new content may overwrite the old content, and such updates reduce the credibility of the archives.
In order to solve the existing problems in archives management, we reviewed the research literature on applying blockchain technology in similar fields, where scholars have proposed various blockchain-based data application schemes. Liu [7], Rocsana [8], Wang [9], Singh [10] and others proposed applications of blockchain technology in medicine, education, energy and other fields, along with new blockchain-based management methods for the various types of data generated in different work scenarios. Other scholars proposed theoretical schemes for specific practical needs: Wei et al. [11] optimized the Raft consensus algorithm for the Hyperledger Fabric platform to address the performance degradation caused by the blockchain backup mechanism, and Liu et al. [12] designed an algorithm called Pharmaceutical-Practical Byzantine Fault Tolerance (P-PBFT) to address the high latency, high system overhead, and small supported scale of current pharmaceutical traceability applications that use blockchain technology. Building on this work, this paper proposes a blockchain-based archives management scheme and simplifies some system processes for the student archives management environment, so that the system can operate more concisely and efficiently.
The overall architecture of the blockchain-based student archives system can be divided into an interface layer, a logic layer and a data layer, as shown in Fig. 1. The interface layer is generally implemented by a browser or client: users enter the corresponding data on the graphical interface and submit it to the logic layer for processing, and the logic layer returns the processed result to the terminal interface for users to browse or use. The logic layer is responsible for the application functions and role permissions of the system, and includes the CA module, system administrator module, school administrator module, teaching management module, teacher module, and student module. The data layer is an important part of the system; it normalizes, verifies, encrypts and stores the data transmitted from the logic layer, and also maintains the normal operation of each node server [13, 14].
System deployment diagram.
According to their role functions, the clients of the interface layer are divided into three types: system management nodes, business management nodes, and ordinary nodes. The system administrator module and the school administrator module run on the system management node, the teaching management module runs on the business management node, and the teacher module and the student module run on ordinary nodes. Ordinary nodes can only create and publish transactions; business management nodes can publish, confirm, query and verify transactions; and system management nodes monitor the number of transactions in the confirmed transaction pool in real time and package and publish blocks to the blockchain. The client has two modes of operation: full node mode and light node mode. A full node saves all the data of the blockchain and can query and verify the transactions on it, so it needs large hard disk storage space and strong data processing capabilities. A light node saves only the header information of the blockchain and can query only the transactions it is authorized for, but it can still verify the integrity of all blocks.
The system management node runs on the school’s server cluster. A hot standby node saves the complete data of the system blockchain and runs 24 hours a day in full node mode to keep the system available around the clock. Business management nodes are involved in creating smart contracts, initiating transactions, packaging transactions into the blockchain, and other operations. They generally run in full node mode, are used by school administrators, and can be shut down and taken offline when not in use. Ordinary nodes run in light node mode and download only the block header data of the blockchain for creating, publishing and verifying transactions, which both protects the information privacy of teachers and students and reduces the data storage and computing load on ordinary nodes.
The CA module generates and provides digital certificates, encryption keys, and other information for each account. When the system creates a user account, the CA assigns role permissions to the account based on the user’s real information and generates digital certificates and other information for the account. When a user needs to query the information of an account, they can obtain trusted public information from the CA.
The system administrator module is used by the system administrator to initialize the system before the system is officially put into operation, including creating the Genesis block of the blockchain and storing some basic information related to the current system, setting the relevant parameters of the system, and creating the school administrator accounts.
The school administrator module is used for school administrator users to make some personalized settings for the system, including the division of user roles and permissions, the creation of usernames and passwords for administrators, teachers, students, etc., and the creation and review of smart contracts.
The teaching management module is used by school administrators to upload various work specification documents, training plans, course scheduling plans and other information into the system, and create key business-related smart contracts according to the current work specifications. After the smart contract is compiled, it needs to be confirmed by multiple signatures of the relevant personnel of the contract, and submitted to the blockchain for storage. When a smart contract needs to be updated, the previous contract is still stored in the blockchain, and a new smart contract needs to be created to guarantee data integrity and traceability.
The teacher module is used for teachers to enter students’ learning records, assessments, rewards and punishments information, etc. in the system, and call preset smart contracts to evaluate students. All evaluation records of students by teachers are generated in the form of transactions, signed by both teacher and student, and confirmed by management and released.
The student module is used by students to query their own learning records and assessments on the system, call smart contracts to select courses, apply for re-examination or retake, apply for graduation defense, apply for degrees, etc. The smart contract approves or rejects the student’s application based on the student’s current learning status. All students’ learning records and application information are generated in the form of transactions, which are signed and confirmed by the relevant parties and issued by the management.
All transactions confirmed by management are staged in the confirmed transaction pool for writing to the blockchain, and when the number of transactions in the transaction pool reaches a certain number, the system management node packages all transactions in the current transaction pool into a new block and writes it into the blockchain. After the new block is successfully added to the blockchain, the system management node broadcasts the latest blockchain information to other nodes.
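The staging behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `ConfirmedPool`, the `threshold` parameter and the `on_block` callback are hypothetical names, and the broadcast step is reduced to a callback that receives the batch to be packaged.

```python
from typing import Callable, List


class ConfirmedPool:
    """Staging area for confirmed transactions awaiting packaging (illustrative)."""

    def __init__(self, threshold: int, on_block: Callable[[List[dict]], None]):
        self.threshold = threshold      # assumed "certain number" that triggers packaging
        self.on_block = on_block        # stand-in for "package into a block and broadcast"
        self.pending: List[dict] = []

    def add(self, tx: dict) -> None:
        """Stage one confirmed transaction; package a block once the threshold is hit."""
        self.pending.append(tx)
        if len(self.pending) >= self.threshold:
            # Hand over every staged transaction and start a fresh pool.
            batch, self.pending = self.pending, []
            self.on_block(batch)


# Usage: seven confirmed transactions with a threshold of three.
blocks: List[List[dict]] = []
pool = ConfirmedPool(threshold=3, on_block=blocks.append)
for i in range(7):
    pool.add({"id": i})
```

With these inputs two batches are packaged and one transaction stays in the pool until more arrive.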
The data structure of a block is divided into the block header and the block body: the block body stores transaction data, and the block header stores the main information of the block. The header includes the following fields. Block number: the unique index number of the block, with the Genesis block numbered 0. Timestamp: the approximate time when the block was generated. Block header hash: the hash value of the current block header. Previous block pointer: the hash value of the previous block. Block body hash: the hash value of all transaction information in the block body. Nonce: a random variable that adjusts the block hash value.
Data structure of block header
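As a rough illustration of these header fields, the following Python sketch models a header as a data class. The field names, the use of SHA-256 and the JSON serialization are assumptions made for the example; the paper does not specify them. The header hash is computed from the other fields here rather than stored as an input.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass
class BlockHeader:
    number: int        # block number; the Genesis block is 0
    timestamp: float   # approximate time the block was generated
    prev_hash: str     # previous block pointer (hash of the previous block)
    body_hash: str     # block body hash (root hash of the transaction tree)
    nonce: int = 0     # value varied to change the header hash

    def hash(self) -> str:
        """Block header hash: SHA-256 over a canonical JSON form (assumed encoding)."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


# Usage: a Genesis header and a second header chained to it by hash pointer.
genesis = BlockHeader(number=0, timestamp=0.0, prev_hash="0" * 64,
                      body_hash=hashlib.sha256(b"").hexdigest())
block1 = BlockHeader(number=1, timestamp=1.0, prev_hash=genesis.hash(),
                     body_hash=hashlib.sha256(b"tx").hexdigest())
```

Because `block1.prev_hash` depends on every field of `genesis`, changing any earlier header invalidates the pointer stored in its successor.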
The transactions in the block body are saved in a tree structure. The leaf nodes at the bottom of the transaction tree save the specific content of the transactions. Leaf nodes are gathered under a parent node according to the department of the trader, and the hash value of that parent node is calculated. Parent nodes are then aggregated into higher-level nodes according to progressively larger department groupings, with a hash value calculated for each, until the root node of the tree is generated. The hash value of the root node is saved in the “block body hash” field of the block header.
The data structure of a transaction includes the following fields. Transaction number: an index number generated by taking a 64-bit digest from the hash value of the transaction. Transaction time: the time when the transaction is created. Transaction initiator: the account number of the transaction initiator, which is generated automatically by the CA module when the account is created. Transaction recipient: the account number of the transaction recipient. Transaction data: the information of the current transaction or the smart contract code; for a file, it stores the file name, version number and hash value of the file. Transaction signature: the signature data of the current transaction. Nonce: a random number used to adjust the hash of the transaction.
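A minimal sketch of such a department-grouped hash tree is shown below. It assumes SHA-256, represents each group as a nested dictionary keyed by a (made-up) department name, and sorts group names so the root hash is deterministic; none of these details are specified in the paper.

```python
import hashlib


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def node_hash(node) -> str:
    """Hash a tree node.

    A list holds the raw transactions of one department (leaf level);
    a dict maps group names to child nodes (higher grouping levels).
    """
    if isinstance(node, list):
        child_hashes = [sha256(tx) for tx in node]
    else:
        child_hashes = [node_hash(node[name]) for name in sorted(node)]
    # A parent's hash is the hash of its children's hashes, concatenated in order.
    return sha256("".join(child_hashes).encode("ascii"))


# Usage: two grouping levels (school, then department) above the transactions.
tree = {
    "School of Computing": {
        "Software Dept": [b"tx0", b"tx1"],
        "Network Dept": [b"tx2"],
    },
    "School of Physics": {
        "Optics Dept": [b"tx3"],
    },
}
root = node_hash(tree)  # value that would go in the "block body hash" field
```

Changing any single transaction changes the hash of its department node, of every group above it, and therefore of the root.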
Data structure of transaction
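The transaction fields can be illustrated with a small Python data class. The field names, SHA-256 and the JSON encoding are assumptions for this sketch, as is the choice to take the leading 64 bits of the hash as the transaction number; the paper only states that a 64-bit digest of the hash is used.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass
class Transaction:
    time: float          # transaction time
    initiator: str       # account number of the initiator (issued by the CA module)
    recipient: str       # account number of the recipient
    data: dict           # transaction information, contract code, or a file record
    signature: str = ""  # signature data of the transaction
    nonce: int = 0       # value used to adjust the transaction hash

    def hash(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def number(self) -> str:
        """Transaction number: first 64 bits (16 hex digits) of the hash (assumed)."""
        return self.hash()[:16]


# Usage: a transaction whose data field is a file record (name, version, hash).
tx = Transaction(
    time=1700000000.0,
    initiator="T0001",   # hypothetical account numbers
    recipient="S0001",
    data={"file_name": "grading-policy.pdf", "version": 2,
          "sha256": hashlib.sha256(b"file bytes").hexdigest()},
)
```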
Storage of files and smart contracts
School policy documents are issued in the form of transactions: the document content is written into the transaction data and saved to the blockchain. If a document is too large, only the file name, file version, issuing unit, issuance time, digital signature, hash value and other information are stored in the transaction, while the document file itself is stored in the file server directory of the permanent node for users to read. The integrity and legitimacy of the file can then be verified with the hash value and digital signature.
Smart contracts are also stored in the blockchain in the form of transactions, and users who want to query or call a smart contract need to make a request to the management node. Each smart contract has a release time and version number, and when the content of a smart contract is updated, a transaction containing the new smart contract code is created. For smart contracts that implement the same function, users can only call the latest version, that is, the one with the most recent creation time. When a user needs to initiate a key business operation, they first send a request to the management node, which returns the address of the corresponding smart contract. After the user calls the smart contract, the contract checks, according to its code logic, whether the current user meets the conditions required by the business; it performs the business operation if they are met and returns an error message if they are not. After the smart contract executes successfully, the result is packaged into transaction data and signed, and the transaction information and signature data are sent to the user for confirmation. After confirming, the user sends the transaction information together with their own signature data to the management node. If the management node successfully validates the transaction, it stores the transaction in the confirmed transaction pool and broadcasts it to other nodes, which can view and verify the transaction information.
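The hash check for a document kept outside the chain can be sketched as follows. The record layout and function names are hypothetical, SHA-256 is an assumed choice of hash, and the digital signature check mentioned above is left out of this example.

```python
import hashlib


def file_record(name: str, version: int, content: bytes) -> dict:
    """Build the metadata that would be written into the transaction data."""
    return {
        "file_name": name,
        "version": version,
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def verify_file(record: dict, content: bytes) -> bool:
    """Check a file fetched from the file server against the on-chain hash."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


# Usage: the stored record matches the original bytes but not a modified copy.
original = b"Regulations on course assessment, 2023 edition."
record = file_record("assessment-regulations.txt", 1, original)
ok = verify_file(record, original)
bad = verify_file(record, original + b" (edited)")
```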
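The versioning rule for smart contracts can be illustrated with a small registry: every published version is kept, but a lookup returns only the most recently released one. The class and field names are hypothetical, and the "address" is modelled as a plain string because the paper does not say how contract addresses are formed.

```python
from typing import Dict, List, NamedTuple


class ContractVersion(NamedTuple):
    function: str    # business function the contract implements
    version: int     # version number
    released: float  # release time
    address: str     # hypothetical contract address returned to users


class ContractRegistry:
    """Keeps every contract version but resolves calls to the latest one."""

    def __init__(self) -> None:
        self._history: Dict[str, List[ContractVersion]] = {}

    def publish(self, cv: ContractVersion) -> None:
        # Older versions are never removed, which preserves traceability.
        self._history.setdefault(cv.function, []).append(cv)

    def resolve(self, function: str) -> str:
        """Return the address of the version with the most recent release time."""
        return max(self._history[function], key=lambda cv: cv.released).address

    def history(self, function: str) -> List[ContractVersion]:
        return list(self._history[function])


# Usage: two versions of a course selection contract.
registry = ContractRegistry()
registry.publish(ContractVersion("select_course", 1, 1000.0, "addr-v1"))
registry.publish(ContractVersion("select_course", 2, 2000.0, "addr-v2"))
current = registry.resolve("select_course")
```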
Generation of transactions
When creating a transaction, the initiator first sends the transaction data and its signature to the recipient. If the information needs to be transmitted in encrypted form, the initiator encrypts the data with the recipient’s public key before transmitting it, and the recipient decrypts the received data with its own private key. The recipient uses the initiator’s public key to verify the integrity of the transaction and, once verification passes, signs the transaction with its own private key to indicate recognition and confirmation of the transaction. The recipient then sends the transaction and the signatures of both parties to the management node. After receiving the transaction and signatures, the management node verifies the transaction with the public keys of both parties and, once it confirms that the transaction is correct, stores it in the confirmed transaction pool.
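The two-party signing flow can be illustrated with textbook RSA signatures written in plain Python. This is a toy sketch only: the paper does not name a signature algorithm, the keys are built from small well-known Mersenne primes and offer no real security, the optional encryption step is omitted, and `pow(E, -1, m)` requires Python 3.8 or later. A real deployment would use a vetted cryptography library.

```python
import hashlib

E = 65537  # public exponent


def make_keypair(p: int, q: int):
    """Build a textbook RSA key pair from two primes (toy parameters, not secure)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # private exponent (modular inverse of E)
    return (n, E), (n, d)              # (public key, private key)


def digest(message: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes, private_key) -> int:
    n, d = private_key
    return pow(digest(message, n), d, n)


def verify(message: bytes, signature: int, public_key) -> bool:
    n, e = public_key
    return pow(signature, e, n) == digest(message, n)


# Hypothetical parties: a teacher (initiator) and a student (recipient).
teacher_pub, teacher_priv = make_keypair(2**61 - 1, 2**89 - 1)
student_pub, student_priv = make_keypair(2**107 - 1, 2**127 - 1)
tx = b'{"course": "Databases", "grade": 88}'

# 1. The initiator signs the transaction and sends it to the recipient.
teacher_sig = sign(tx, teacher_priv)

# 2. The recipient verifies the initiator's signature, then countersigns.
student_sig = sign(tx, student_priv) if verify(tx, teacher_sig, teacher_pub) else None

# 3. The management node checks both signatures before staging the transaction.
confirmed_pool = []
if student_sig is not None and verify(tx, teacher_sig, teacher_pub) \
        and verify(tx, student_sig, student_pub):
    confirmed_pool.append(tx)
```

Because each signature can only be produced with the corresponding private key, neither party can later deny having approved the transaction.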
Generation of blocks
The system management node monitors the number of transactions in the confirmed transaction pool in real time, and when the number of transactions reaches the preset value, all current transactions are packaged into a new block and published to the blockchain. The system periodically increases or decreases the number of transactions contained in a single block based on the current block generation rate. Since this blockchain is a consortium chain used internally by the school, the difficulty of the block calculation does not need to be set too high, and there is no need to adjust it frequently.
Query and verification of transactions
When a user requests to query a transaction, the management node first verifies whether the user has permission to view the transaction. If so, it finds the block where the transaction is located, reads out the transaction content, encrypts the content with the requester’s public key, and sends it to the requester. After receiving the data, the requester decrypts it with their own private key and can then view the specific content of the transaction [15].
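The role of the nonce and of the calculation difficulty can be sketched with a simple proof-of-work loop. The paper does not define how difficulty is measured; this example assumes it is the number of leading zero hexadecimal digits required in a SHA-256 hash, and the header string is a made-up placeholder.

```python
import hashlib


def mine(header: str, difficulty: int):
    """Search for a nonce whose hash starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while True:
        block_hash = hashlib.sha256(f"{header}{nonce}".encode("utf-8")).hexdigest()
        if block_hash.startswith(target):
            return nonce, block_hash
        nonce += 1


# Usage: a low difficulty, as suggested for a school-internal consortium chain.
header = "number=1|prev=abc|body=def"   # placeholder header content
nonce, block_hash = mine(header, difficulty=3)
```

Under this assumed definition the expected number of hash attempts is about 16 to the power of the difficulty, so each additional level multiplies the expected work by 16.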
Verifying transactions.
When a user requests to verify the reliability of a transaction, the management node finds the block in which the transaction is stored and sends the transaction content, the transaction signature, the structure of the transaction tree, the hash values of related nodes on the transaction tree and other information to the requester. The blockchain header information (including the block hash value and the root hash of the transaction tree in the block body) is already known to all nodes. The authenticity of the transaction can be confirmed by verifying the digital signatures of both parties to the transaction, and the integrity of the transaction can be confirmed by calculating the hash values of the transaction tree layer by layer. For example, in Fig. 2, transaction node 0 is to be verified, and the hash value h8 of the tree root node is obtained by calculating the hash values layer by layer. Comparing the calculated root hash h8 with the root hash x saved in block B verifies the transaction: if h8 equals x, the transaction is complete and valid.
This paper uses Python 3.7 to implement the system model and builds virtual servers on the VMware ESXi 6.7 platform, with all nodes running on the same physical server. The physical server has 24 CPUs (Intel Xeon Silver 4310 @ 2.10 GHz) and 256 GB of memory. Each virtual machine used for the test runs the Windows Server 2012 64-bit operating system on a virtual Intel Xeon Silver 4310 CPU @ 2.10 GHz with 8 GB of memory. The experiment uses transaction latency and transaction throughput as indicators of system performance, with throughput expressed in transactions per second (TPS). Transaction latency and transaction throughput are among the most commonly used evaluation metrics for blockchain systems; for example, the throughput of Bitcoin is 7 transactions per second with a transaction delay of about 10 minutes, while the throughput of Ethereum is 17 transactions per second and its transaction delay can be reduced to 30 seconds. We test how the throughput and latency of the system change with different block sizes and mining difficulties, and the results are shown in the table below.
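The layer-by-layer check can be sketched as follows. The proof format is an assumption for the example: each step gives the position of the current hash among its parent's children together with the hashes of the sibling nodes, and SHA-256 stands in for whatever hash function the system uses.

```python
import hashlib
from typing import List, Tuple


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def combine(child_hashes: List[str]) -> str:
    """Hash of a parent node: hash of its children's hashes, concatenated in order."""
    return sha256("".join(child_hashes).encode("ascii"))


# One proof step: (index of our hash among the children, hashes of the siblings).
ProofStep = Tuple[int, List[str]]


def root_from_proof(tx: bytes, proof: List[ProofStep]) -> str:
    """Recompute the tree root from one transaction and its sibling hashes."""
    current = sha256(tx)
    for index, siblings in proof:
        children = siblings[:index] + [current] + siblings[index:]
        current = combine(children)
    return current


# Usage: a small tree with two groups; tx0 is the transaction to verify.
h0, h1, h2 = sha256(b"tx0"), sha256(b"tx1"), sha256(b"tx2")
group_a = combine([h0, h1])
group_b = combine([h2])
stored_root = combine([group_a, group_b])   # root hash kept in the block header

proof_for_tx0 = [(0, [h1]), (0, [group_b])]
is_valid = root_from_proof(b"tx0", proof_for_tx0) == stored_root
```

The requester needs only the sibling hashes along one path, not the contents of the other transactions, which is what lets a light node verify a transaction from block headers alone.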
Difficulty 1–5 block time
Difficulty 1-4 block time.
Difficulty 5 block time.
The results show a clear positive correlation between block time and mining difficulty, and a weaker relation between block time and the number of transactions contained in each block. When the mining difficulty is set low (1 to 4), the calculation of transaction hashes accounts for most of the total computation of the system, so the block time shows a rough positive correlation with the number of transactions per block. When the mining difficulty is set high (5 or above), the mining computation accounts for most of the total computation, so the block time appears randomly distributed. Compared with difficulty 1 to 4, the block time increases sharply when the mining difficulty is raised to 5. Therefore, in actual use, it is recommended to set the difficulty below 5, so that the throughput of the system can be improved by increasing the number of transactions per block.
Student archives are an important credential of students’ learning experiences and learning outcomes, and their confidentiality and immutability must be guaranteed. In the blockchain-based student archives system designed in this paper, data undergoes multiple reviews when it is first entered to reduce errors, modification is avoided once the file data is formed, and data updates are implemented by adding new information entries. Data operators sign with their own private keys, which prevents repudiation of operations and preserves the data update process, making operations on archive data transparent and trustworthy. Key business operations must be completed by calling preset smart contracts; these contracts are based on the relevant laws and regulations and take effect only after multiple signatures from different auditors, ensuring the fairness and justice of the smart contracts. Through blockchain and smart contract technology, the credibility of student archives is therefore further guaranteed, the possibility of file fraud and school disputes is reduced, and the efficiency of archives management is improved. Further research can focus on how consensus mechanisms are implemented in specific fields, and the consensus algorithm modules can be isolated to adapt flexibly to different application scenarios.
Acknowledgments
The authors acknowledge the Teaching and Research Project of Hubei Engineering University (Grant: 2019043) and the Hubei province Education Science Planning key topics (Grant: 2019GA043).
