A parallel biological computing algorithm to solve the vertex coloring problem with polynomial time complexity

Abstract

The vertex coloring problem is a well-known combinatorial problem that requires assigning a color to each vertex so that adjacent vertices receive different colors and the total number of colors used is minimized. It is a famous NP-hard problem in graph theory, and no polynomial-time algorithm for it is known. As a kind of intelligent computing method, DNA computing offers high parallelism and high storage density, so it is widely used to solve classical combinatorial optimization problems. In this paper, we propose a new DNA algorithm that uses DNA molecular operations to solve the vertex coloring problem. For a simple n-vertex graph and k different colors, we use DNA strands to represent edges and vertices. Through basic biochemical operations, the solution to the problem is obtained in O(kn²) time complexity. The proposed DNA algorithm is a new attempt at solving an NP-hard problem, and it provides evidence for the ability of DNA computing to handle such difficult computational problems in the future.

Keywords

NP-hard problem, the vertex coloring problem, Adleman-Lipton model, DNA computing

1 Introduction

As early as the 1950s, Feynman, one of the founders of nanotechnology, proposed the idea of manipulating matter on a small scale [1]. However, it was not until 1994 that Adleman achieved molecular-scale computation [2], using DNA molecules to solve the directed Hamiltonian path problem (HPP). DNA computing has gradually evolved from this pioneering work. Adleman’s experiments show that computation with DNA molecules is highly parallel. This parallelism enables Non-deterministic Polynomial (NP)-hard problems to be solved with DNA molecules in time that grows linearly with the problem size, whereas the time needed to solve these problems on a Turing machine tends to increase exponentially with the scale of the problem.

In this section, we briefly introduce the research progress of DNA computing, the existing algorithms of vertex coloring problem, and summarize the research content of this paper.

1.1 Research progress of DNA computing

In 1995, Lipton [3] proposed a DNA computing model that solves SAT problems by constructing contact network diagrams. Since then, DNA computing has become a hot topic, and the achievements made in the field have turned NP-hard problems into a vital research direction for DNA computing models. Some typical DNA computing models, such as the Adleman-Lipton model [2, 3], the sticker model [4], the surface-based model [5], the hairpin model [6] and the self-assembly model [7], have been proposed to promote the development of related research. Based on these models, many papers have designed DNA programs and algorithms to solve various NP-hard problems [8–29]. After years of development, scientists in computer science, mathematics, biology and other related fields have proposed a variety of models for DNA computing, which are generally divided into three categories. The first is based on formalization theory: it abstracts the biological operations of DNA computing into formal symbols and obtains the final result through formal derivation and mathematical modeling. The second is based on biological theory: it realizes all operations of DNA computing as biological operations and obtains the result through actual work in the biological laboratory. The third is based on the structure and characteristics of DNA molecules and is used to reduce the number of DNA strand operations. To make better use of the power of biological computing, it is necessary to apply DNA computing to more NP-hard problems that have not yet been addressed.

Chai et al. [43] proposed an image encryption algorithm based on a chaotic system and deoxyribonucleic acid (DNA) sequence operations. They encoded ordinary images into a DNA matrix and used a two-dimensional Logistic chaotic map to generate chaotic sequences for row cycle permutation (RCP) and column cycle permutation (CCP). A line-by-line image diffusion method was then applied at the DNA level, and the diffused DNA matrix was decoded to obtain the cipher image. Experimental results and security analysis showed that the algorithm has a good encryption effect and can resist various typical attacks. Liang et al. [44] compared molecular computing and bioinformatics, two important interdisciplinary subjects between molecular science and computer science, which are closely related but are developing in different directions. Because molecules are only a few nanometers in size and inexpensive, it is possible to make DNA chips containing billions or even trillions of switches and modules. Xu et al. [45] used a 3-colorable graph with 61 vertices to illustrate the ability of the DNA computing model. Their experimental results show that, when the initial solution space is constructed, all solutions of the graph are found and more than 99% of the false solutions are eliminated. Liu et al. [46] classified biological networks according to the types of networks used by several standard computing methods, and introduced the computing methods used for each type of network. They used the advantages of the DNA computing model, such as the construction of a non-enumerable initial solution space, partition and combination of subgraphs, and parallel biological operations, to solve a graph vertex coloring problem with a computational complexity of O(3⁵⁹).

1.2 Algorithms of vertex coloring problem

The vertex coloring problem (VCP) is a famous subject of modern graph theory in discrete mathematics. It has theoretical significance and engineering value in areas such as scheduling, circuit wiring, frequency assignment and storage. However, the vertex coloring problem is NP-hard [30], and no exact polynomial-time algorithm for it is known. Therefore, the use of heuristic algorithms to find approximate solutions has become a hot topic for many scholars and experts. The vertex coloring problem requires each vertex to be assigned a color from k given colors so that adjacent vertices receive different colors and the total number of colors used is minimized. For instance, the undirected graph G in Fig. 1 defines such a problem. The solution to the VCP of Fig. 1 is the color set {v₁, v₄} ⟶ “color1”, {v₂, v₆} ⟶ “color2”, {v₃, v₅} ⟶ “color3”.

Fig. 1

An undirected graph G with 6 vertices and a restriction to 3 colors.

Karp [31] and Garey and Johnson [30] proved that the vertex coloring problem is an NP-hard problem in graph theory, so many approximate and exact methods have been proposed. Hertz and De Werra [32] proposed a tabu search algorithm that maintains a partition of the graph vertices in each iteration. In this algorithm, each block of the partition is given a different color, but a block is not always guaranteed to be an independent set. Therefore, the algorithm works with solutions that are not necessarily feasible. Caramia and Dell’Olmo [33] proposed a local search algorithm, HCD, based on tabu search. The basic idea behind HCD is to use the tabu concept without maintaining an explicit tabu list. Instead, a dynamic assignment of priorities to the vertices of the graph performs the same task, avoiding repeated moves in subsequent steps of the algorithm. Voudouris and Tsang [34] proposed a guided local search (GLS) method for solving combinatorial optimization problems. Caramia et al. [35] proposed a priority search algorithm called CHECKCOL, which reduces runtime by avoiding unnecessary searches in most areas of the graph. Galinier et al. [36] gave an adaptive memory algorithm, AMACOL, for the graph coloring problem. It is a hybrid evolutionary heuristic that uses a central memory to store stable sets derived from the colorings generated in the previous stage of the search. Caramia and Dell’Olmo [37] also proposed a two-stage local search for vertex coloring. The algorithm alternates between two closely related procedures, a random and a deterministic local search.

1.3 Article outline

Based on the Adleman-Lipton model [2, 3], this paper proposes a new parallel DNA algorithm to solve the vertex coloring problem. Its time complexity is O(kn²), compared with the exponential time complexity of other electronic computing algorithms.

The rest of the paper is arranged as follows. Section 2 introduces the background of DNA computing and the Adleman-Lipton model. Section 3 presents the detailed steps of the DNA algorithm to solve the vertex coloring problem. Section 4 gives the proof and complexity of the proposed algorithm. The simulation results of DNA algorithm are shown in Section 5. We conclude in Section 6.

2 Background of DNA computing and Adleman-Lipton model

The overlap between life sciences and engineering is a striking characteristic of the development of modern science and technology. DNA computing is a computing method that uses DNA biomolecules as the computing medium and biochemical reactions as the computing tool. Although the classical digital computer remains unmatched on serial tasks, DNA computing has a natural advantage in problems where all possible solutions must be verified, and such problems are common. The advantage is greatest for large-scale computations that can be processed in parallel.

DNA computing is a controllable biochemical reaction process. It uses the double helix structure of DNA and the complementary base-pairing rule to encode information, maps the information to be computed onto DNA strands, generates various data pools with the help of biological enzymes, and then maps the data operations of the original problem onto the DNA strands in a highly parallel way according to specific rules. Finally, molecular biotechnologies (such as PCR, ultrasonic degradation, affinity chromatography, cloning, mutation, molecular purification, electrophoresis and magnetic separation) are used to screen for the required results. Mathematically speaking, a single DNA strand can be regarded as a string over the symbols A, C, G and T, a four-letter alphabet that encodes and processes information just as the codes 0 and 1 do in an electronic computer. The two ends of a DNA sequence are called the 5’ end and the 3’ end. Nucleotide A is complementary to T, and nucleotide C is complementary to G. Two complementary single-stranded DNA sequences of opposite polarity join to form a double helix during annealing. The opposite process, in which a double helix separates into its two single strands, is called melting. The length of a single-stranded DNA is the number of nucleotides in the strand, so a single strand containing 20 nucleotides is called a 20mer. The length of double-stranded DNA is counted in base pairs.

In the Adleman-Lipton model, a tube contains many DNA strands over the alphabet {A, G, C, T}. Given such tubes, we can perform certain operations with O(1) time complexity [38]. The DNA operations include, but are not limited to: Merge, Copy, Separation, Selection, Detect, Ligation, Discard, Annealing, Denaturation, Read and Append-tail. The biological meaning of these operations is described in [39]. Specific enzyme reactions can act as "software" to complete all kinds of information processing. In this way, a new type of computer based on DNA chips can be developed by carrying out rich, precise and controllable chemical reactions on the DNA double helix to complete various kinds of operations. At least in theory, it can be proved that DNA computing is universal and can solve all the problems that a Turing machine can solve.

3 DNA computing algorithm of the vertex coloring problem

3.1 Description of the vertex coloring problem

For simplicity of discussion, here is a formal description of the vertex coloring problem. Let G be a simple undirected graph without self-loops, with known vertex set V (G) = {v_i|i = 1, 2, …, n} and edge set E (G) = {e_i,j|1 ⩽ i ≠ j ⩽ n}. A coloring function is a mapping f: V (G) → C from the vertices to the color set C = {1, 2, …, k} that satisfies f (v_i) ≠ f (v_j) for every edge e_i,j ∈ E (G) joining v_i, v_j ∈ V (G). Finding such a mapping is the k-vertex coloring problem of the graph. Generally, k is considered to be a constant. For example, in the commonly studied graph 3-coloring problem, k is 3.
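The constraint in this definition can be checked directly. The following is a minimal sketch, not part of the DNA algorithm itself; the function name `is_proper_coloring` and the 4-vertex cycle are illustrative assumptions (the edge set of Fig. 1 is not listed in the text, so it is not used here).

```python
from typing import Dict, Iterable, Tuple

Edge = Tuple[int, int]


def is_proper_coloring(edges: Iterable[Edge], f: Dict[int, int]) -> bool:
    """Return True if f(v_i) != f(v_j) for every edge e_{i,j}."""
    return all(f[i] != f[j] for i, j in edges)


# Hypothetical 4-vertex cycle (NOT the graph of Fig. 1).
edges = [(1, 2), (2, 3), (3, 4), (1, 4)]
good = {1: 1, 2: 2, 3: 1, 4: 2}   # adjacent vertices differ
bad = {1: 1, 2: 1, 3: 2, 4: 2}    # v1 and v2 share color 1

print(is_proper_coloring(edges, good))
print(is_proper_coloring(edges, bad))
```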

3.2 Basic idea of the algorithm

The basic idea of solving the vertex coloring problem is as follows: the DNA chains representing the edges and vertices of the vertex coloring problem are put into the test tube, and the biological chains of various possible coloring schemes are obtained by the biochemical reaction. Then, the biological chains that do not meet the constraint conditions of the problem are screened out, and the color information contained in each feasible chain is recorded by the number of color types, to judge and search for the optimal solution. Specifically, the algorithm is implemented in four steps:

(1) The biological chain corresponding to various coloring schemes of vertex coloring problem is generated;

(2) The biological chains that do not meet the feasible solution conditions are deleted;

(3) The number of colors in each feasible solution chain is recorded by appending a biological chain;

(4) Find the optimal solution.

The flow chart of our proposed algorithm is shown in Fig. 2.

Fig. 2

The flow chart of proposed algorithm.

3.3 Symbol and parameter

In this paper, we use the symbols $\#, \bar{\#}, A_{i}, B_{i}$ (i = 1, 2, …, n) to represent different DNA single strands of length tmer (t is chosen as a small integer based on the size of the problem [40]). In addition, we use the single-stranded DNA symbol A_itB_i (1 ⩽ i ⩽ n, 1 ⩽ t ⩽ k) to represent the color state of vertex v_i, where A_i is the starting chain of the vertex v_i, B_i is the termination chain, and the middle symbol t (here a color index, not the strand length) indicates that the vertex is given the t-th color. Meanwhile, the symbols $\#, \bar{\#}$ are signal chains that delimit each vertex coloring set. If e_i,j ∈ E, a chain y_i,j is generated in the tube T₃. To count the number of colors used in a coloring set, we also design a DNA strand X. Let

T₁ = {1, ⋯ , k, # A₁, B_n # , B_t-1A_t|t = 2, ⋯ , n},

$T_{2} = \{\bar{\#}, \overline{B_{t} 1 A_{t}}, \overline{B_{t} 2 A_{t}}, \dots, \overline{B_{t} k A_{t}} \mid t = 1, \dots, n\}$ ,

T₃ = {y_i,j|1 ⩽ i < j ⩽ n, e_i,j ∈ E},

T₄ = {X} .

3.4 Algorithm steps

Algorithm 3.4.1. Production of all possible vertex coloring sets.

(1-1) Merge (T₁, T₂);

(1-2) Annealing (T₁);

(1-3) Ligation (T₁);

(1-4) Denaturation (T₁);

(1-5) Separation (T₁, {#} , T₅);

(1-6) Discard (T₁);

(1-7) Copy (T₅, T₁).
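The effect of steps (1-1) to (1-7) can be sketched in software as an enumeration of all kⁿ symbolic strands. This is only an illustration under stated assumptions: the helper names `strand` and `all_candidate_strands` are hypothetical, and the serial loop below takes kⁿ steps, whereas in the tube the strands are assumed to form in parallel through annealing and ligation.

```python
from itertools import product
from typing import List, Sequence


def strand(colors: Sequence[int]) -> str:
    """Symbolic strand '# A1 c B1 A2 c B2 ... #' for one color assignment."""
    body = " ".join(f"A{i} {c} B{i}" for i, c in enumerate(colors, start=1))
    return f"# {body} #"


def all_candidate_strands(n: int, k: int) -> List[str]:
    """Every assignment of one of k colors to each of n vertices."""
    return [strand(cs) for cs in product(range(1, k + 1), repeat=n)]


# Small illustrative instance: 3 vertices, 2 colors -> 2**3 strands.
tube_t1 = all_candidate_strands(n=3, k=2)
for s in tube_t1:
    print(s)
```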

Algorithm 3.4.2. Filter and exclude combination chains with the same color on adjacent vertices.

For i = 1 to i = n - 1

For j = i + 1 to j = n

(2-1) Separation (T₃, {y_i,j} , T₆);

(2-2) If (Detect (T₆) = “Yes”)

Then

(2-3) Discard (T₆);

For t = 1 to t = k

(2-4) Separation (T₁, {A_itB_i} , T₇);

(2-5) Separation (T₇, {A_jtB_j} , T₈);

(2-6) Discard (T₈);

(2-7) Merge (T₁, T₇);

(2-8) Discard (T₇).

End For

End If

End For

End For
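The filtering loop can be mimicked on symbolic data. In this sketch, under stated assumptions, a strand is a tuple of colors (position i-1 holds the color of v_i), a tube is a Python list, and `separation` is a hypothetical stand-in for the Separation operation; the path graph used in the example is illustrative and is not the graph of Fig. 1.

```python
from itertools import product


def separation(tube, has_pattern):
    """Split a tube into (strands without the pattern, strands with it)."""
    rest = [s for s in tube if not has_pattern(s)]
    extracted = [s for s in tube if has_pattern(s)]
    return rest, extracted


def filter_conflicts(tube_t1, edges, n, k):
    """Remove strands in which both endpoints of some edge share a color."""
    edge_set = {tuple(sorted(e)) for e in edges}      # plays the role of T3
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            if (i, j) not in edge_set:                # Detect(T6) = "No"
                continue
            for t in range(1, k + 1):
                # (2-4) strands where v_i has color t move to T7
                tube_t1, t7 = separation(tube_t1, lambda s: s[i - 1] == t)
                # (2-5) of those, strands where v_j also has color t move to T8
                t7, t8 = separation(t7, lambda s: s[j - 1] == t)
                # (2-6) T8 is discarded; (2-7) the rest is merged back into T1
                tube_t1 = tube_t1 + t7
    return tube_t1


# Hypothetical path v1 - v2 - v3 with k = 2 colors.
candidates = list(product((1, 2), repeat=3))
feasible = filter_conflicts(candidates, edges=[(1, 2), (2, 3)], n=3, k=2)
print(sorted(feasible))
```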

Algorithm 3.4.3. Add biological chains to represent the number of colors.

For t = 1 to t = k

(3-1) Separation (T₁, t, T₉);

(3-2) Append-tail (T₉, X);

(3-3) Merge (T₁, T₉);

(3-4) Discard (T₉).

End for

Algorithm 3.4.4. Find the optimal solution according to the length of the biological chain.

For j = 1 to j = k

(4-1) Selection (T₁, (3n + 2 + j) t, T₁₀);

(4-2) If (Detect (T₁₀) = “Yes”)

Break;

Else

Continue;

End If

End For

(4-3) Read (T₁₀);
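Steps 3 and 4 can be sketched together on symbolic data. The assumptions are the same as for a software simulation in general: a strand is a tuple of colors, the appended X chains are represented by a count p, and lengths follow the expression (3n + 2 + j)t used in step (4-1). The function names and the small feasible tube are hypothetical.

```python
def append_color_tails(tube, k):
    """Step 3: pair each strand with p, the number of colors t in 1..k it contains."""
    return [(s, sum(1 for t in range(1, k + 1) if t in s)) for s in tube]


def select_optimal(tagged, n, k, t_len):
    """Step 4: try lengths (3n + 2 + j) * t_len for j = 1..k, stop at the first hit."""
    for j in range(1, k + 1):
        target = (3 * n + 2 + j) * t_len
        t10 = [s for s, p in tagged if (3 * n + 2 + p) * t_len == target]
        if t10:                       # Detect(T10) = "Yes"
            return j, t10
    return None, []


# Hypothetical tube of feasible colorings for n = 3 vertices, k = 3 colors.
feasible = [(1, 2, 1), (1, 2, 3), (2, 1, 2), (3, 1, 2)]
tagged = append_color_tails(feasible, k=3)
colors_used, best = select_optimal(tagged, n=3, k=3, t_len=4)
print(colors_used, best)
```

Because j increases from 1, the first non-empty selection corresponds to the fewest colors, which is why the loop can stop at the first "Yes".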

3.5 Illustrate the steps

For a simple graph with n vertices, each possible coloring set of the vertex set V (G) can be represented by an n-digit number. A digit set to 1 indicates that the color of the vertex is “color 1”, and a digit set to 2 indicates that the color of the vertex is “color 2”. Correspondingly, a digit set to k indicates that the color of the vertex is “color k”. For instance, in Fig. 1, the 6-digit number 132131 indicates that the color of vertices {v₁, v₄, v₆} is “color1”, the color of vertex {v₃} is “color2”, and the color of vertices {v₂, v₅} is “color3”. By this method, we convert every coloring set of an n-vertex graph into a distinct n-digit number. We refer to this as the transformation of information.
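The digit encoding can be decoded mechanically. The sketch below assumes k ⩽ 9 so that each color fits in one decimal digit; the function name `color_classes` is illustrative.

```python
from collections import defaultdict
from typing import Dict, List


def color_classes(digits: str) -> Dict[int, List[int]]:
    """Map each color to the list of vertex indices (1-based) that carry it."""
    classes = defaultdict(list)
    for vertex, digit in enumerate(digits, start=1):
        classes[int(digit)].append(vertex)
    return dict(classes)


# The 6-digit example from the text.
classes = color_classes("132131")
print(classes)
```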

After the seven steps of Algorithm 3.4.1, the single strands in the tube T₁ encode all possible vertex coloring sets. Taking Fig. 1 as an example, the DNA strand # A₁2B₁A₂1B₂A₃1B₃A₄3B₄A₅2B₅A₆3B₆ # ∈ T₁ denotes the vertex coloring set {v₂, v₃} ⟶ “color1”, {v₁, v₅} ⟶ “color2”, {v₄, v₆} ⟶ “color3”.

A feasible solution to the vertex coloring problem requires different colors on adjacent vertices. Therefore, we must check whether each vertex coloring set meets this constraint. If e_i,j ∈ E, we discard the DNA strands in which vertices v_i and v_j have the same color. As shown in Fig. 1, the strand # A₁3B₁A₂1B₂A₃2B₃A₄3B₄A₅2B₅A₆2B₆ # (representing the vertex coloring set {v₂} ⟶ “color1”, {v₃, v₅, v₆} ⟶ “color2”, {v₁, v₄} ⟶ “color3”) should be excluded, because the vertices v₅ and v₆ connected by edge e_5,6 have the same “color 2”. Only then are all eligible vertex coloring sets selected. Algorithm 3.4.2 completes this work.

The optimal solution for the vertex coloring problem is the coloring that uses the fewest colors. We record the number of colors on each strand and select the optimal coloring set. In Algorithm 3.4.3, for each color that appears in a coloring set, we append the chain X once to the end of the corresponding DNA strand, so that the length of the strand represents the number of colors used. Taking Fig. 1 as an example, the strand # A₁2B₁A₂1B₂A₃4B₃A₄3B₄A₅4B₅A₆2B₆ # in T₁ represents a vertex coloring set with four colors, {v₂} ⟶ “1”, {v₁, v₆} ⟶ “2”, {v₄} ⟶ “3”, {v₃, v₅} ⟶ “4”, so X is appended four times, giving # A₁2B₁A₂1B₂A₃4B₃A₄3B₄A₅4B₅A₆2B₆ # XXXX.

In Algorithm 3.4.4, we take the shortest single strand in T₁ to obtain the solution of the VCP. In the example of Fig. 1, the single strand in T₁ with the shortest length is # A₁1B₁A₂2B₂A₃3B₃A₄1B₄A₅3B₅A₆2B₆ # XXXX. Therefore, the solution for Fig. 1 is the color set {v₁, v₄} ⟶ “1”, {v₂, v₆} ⟶ “2”, {v₃, v₅} ⟶ “3”.

4 Proofs and computational complexity of proposed algorithm

The following theorems show that the algorithm obtains solutions to the vertex coloring problem in O(kn²) time complexity of DNA molecular operations, using strands whose lengths lie in a limited range.

Theorem 1. The solution DNA strands for the vertex coloring problem in an n-vertex and m-edge graph can be found from a limited length range.

Proof. Each strand in T₁ after the first step denotes one vertex coloring set. The strands can be described as $\# A_{1} z_{1} B_{1} A_{2} z_{2} B_{2} \dots A_{i} z_{i} B_{i} \dots A_{n} z_{n} B_{n} \#$, where $z_{i} \in \{1, 2, \dots, k\}$.

Meanwhile, every strand in T₁ after the second step represents an eligible vertex coloring set. We design the chains #, 1, 2, ⋯ , k, A_i, B_i and X to have equal length: $\| A_{i} \| = \| B_{i} \| = \| \# \| = \| 1 \| = \| 2 \| = \dots = \| k \| = \| z_{i} \| = \| X \| = t$ mer. Let S denote a chain after the third step. Then S has the form $\# A_{1} z_{1} B_{1} A_{2} z_{2} B_{2} \dots A_{i} z_{i} B_{i} \dots A_{n} z_{n} B_{n} \# \underbrace{X \dots X}_{p}$, where p, the number of times X is appended, is the number of distinct vertex colors on the strand. Then

$\begin{aligned} \| S \| &= \| \# \| + \| A_{1} \| + \| z_{1} \| + \| B_{1} \| + \| A_{2} \| + \| z_{2} \| + \| B_{2} \| + \dots + \| A_{n} \| + \| z_{n} \| + \| B_{n} \| + \| \# \| + \underbrace{\| X \| + \dots + \| X \|}_{p} \\ &= 2 \| \# \| + \sum_{i = 1}^{n} \| A_{i} \| + \sum_{i = 1}^{n} \| z_{i} \| + \sum_{i = 1}^{n} \| B_{i} \| + p \| X \| \\ &= (3 n + 2) t + pt . \end{aligned}$ Since $k \geq p \geq 1$, it follows that $(3 n + 2 + k) t \geq \| S \| \geq (3 n + 2 + 1) t$.

Thereby, the solution of the vertex coloring problem in a limited length range is obtained.
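The length bound of Theorem 1 can be evaluated numerically. This is a small sketch with illustrative function names; the values n = 6, k = 3 and t = 4 are chosen to match a 6-vertex, 3-color instance with four-base code words, and are an assumption for the example only.

```python
def strand_length(n: int, p: int, t: int) -> int:
    """Length ||S|| = (3n + 2 + p) * t of a strand with p appended X chains."""
    return (3 * n + 2 + p) * t


def length_bounds(n: int, k: int, t: int):
    """Smallest (p = 1) and largest (p = k) possible strand length."""
    return strand_length(n, 1, t), strand_length(n, k, t)


low, high = length_bounds(n=6, k=3, t=4)
print(low, high)
```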

Theorem 2. The vertex coloring problem for a graph with n vertices and m edges can be solved in O(kn²) time complexity using DNA molecular computation, where k is the number of colors.

Proof. We assume that the graph processed in this paper is a simple undirected graph without loops or multiple edges. Each biological operation takes O(1) time [32]. The total time complexity T of the algorithm is as follows:

$\begin{aligned} T (\text{Step 1}) &= O (7) = O (1); \\ T (\text{Step 2}) &= O (k n^{2}); \\ T (\text{Step 3}) &= O (k); \\ T (\text{Step 4}) &\leq O (k); \\ T &= T (\text{Step 1}) + T (\text{Step 2}) + T (\text{Step 3}) + T (\text{Step 4}) \\ &= O (1) + O (k n^{2}) + O (k) + O (k) . \\ &\text{Since } k \leq n, \ O (k) \leq O (n), \text{ so} \\ T &\leq O (n) + O (k n^{2}) + O (n) + O (n) \\ &= O (k n^{2}) . \end{aligned}$

In summary, we can solve the vertex coloring problem in O(kn²) time complexity, where k is the number of colors in the vertex coloring problem and n is the number of vertices.
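The operation count behind this bound can be tallied explicitly. The sketch below assumes that each listed operation (including Detect and Read) costs one unit and counts the worst case for step 4; the function name `operation_count` and the per-step constants are read off the step listings and are an illustration, not a measured cost.

```python
def operation_count(n: int, k: int, m: int) -> int:
    """Worst-case number of tube operations for a graph with n vertices and m edges."""
    step1 = 7                                  # (1-1) .. (1-7)
    pairs = n * (n - 1) // 2                   # all (i, j) with i < j
    # per pair: Separation + Detect; per edge: Discard + k rounds of (2-4)..(2-8)
    step2 = 2 * pairs + m * (1 + 5 * k)
    step3 = 4 * k                              # (3-1) .. (3-4) for each color
    step4 = 2 * k + 1                          # Selection + Detect per j, then Read
    return step1 + step2 + step3 + step4


# Complete graph on 6 vertices (m = 15) with k = 3 colors, as a worst case.
total = operation_count(n=6, k=3, m=15)
print(total)
```

Since m ⩽ n(n - 1)/2, the dominant term is at most (5k + 3)·n(n - 1)/2, which grows as kn².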

5 Simulation experiment of DNA algorithms

The accuracy of DNA computing depends on the accuracy of the biochemical operations on the molecules; otherwise, errors in the biochemical reactions accumulate and expand. Therefore, designing suitable DNA sequences is a necessary basis for ensuring the accuracy of DNA computing. To this end, we used the sequence design methods in Refs. [38–40].

In this paper, we use the computational molecular biology tool Biopython as the system development platform, and Braich’s program to generate “appropriate DNA sequences” suitable for the biological computing algorithm, which can be used to solve the vertex coloring problem. When a DNA chain is generated, the first step is to determine whether the chain satisfies the restrictions specified in references [38–40]. If the generated DNA sequence does not meet the restrictions, the program generates a new sequence. When the restrictions are met, the sequence is accepted for the subsequent biochemical reactions. Using this method, we obtain “appropriate DNA sequences” that meet the restrictions and thereby improve the accuracy of the biochemical reactions.

Therefore, taking the vertex coloring problem (Fig. 1) as an example, the program generates random four-base sequences for A_i, B_i, #, c₁, c₂, c₃ and X, as shown in Table 1. Table 2 lists the node sequences generated by Braich’s method. The program also calculates the enthalpy, entropy and free energy of the DNA strands under various interactions. Table 3 gives these values for each probe bound to its homologous region on the strands.

Table 1
Strands for representing A_i, B_i, #, c₁, c₂, c₃ and X (i ∈ {1, 2, ⋯ , 6}) for the vertex coloring problem

Bit	3′ –5′ DNA sequence	Bit	3′ –5′ DNA sequence
A ₁	ATAG	B ₁	GTAT
A ₂	CCGT	B ₂	TTAG
A ₃	GCCT	B ₃	ACGC
A ₄	TCTC	B ₄	ATCC
A ₅	CGGT	B ₅	TGAC
A ₆	GCAC	B ₆	CTCG
#	AGGA	c ₁	GTCA
c ₂	ACTG	c ₃	CGAC
X	TAGA

Table 2

Sequences for representing the A_iB_i and B_iA_j (i < j) for the vertex coloring problem

Bit	3′ –5′ DNA sequence	Bit	3′ –5′ DNA sequence
A ₁ c ₁ B ₁	AGGTTCAG	A ₁ c ₂ B ₁	CCGTATCG
A ₁ c ₃ B ₁	GCGTTCTA	A ₂ c ₁ B ₂	GCAATATT
A ₂ c ₂ B ₂	CATGTAGC	A ₂ c ₃ B ₂	CGTCTTAG
A ₃ c ₁ B ₃	GCGTTCTA	A ₃ c ₂ B ₃	GCAATATT
A ₃ c ₃ B ₃	CATGTAGC	A ₄ c ₁ B ₄	CGTCTTAG
A ₄ c ₂ B ₄	GCGTTCTA	A ₄ c ₃ B ₄	GCAATATT
A ₅ c ₁ B ₅	CATGTAGC	A ₅ c ₂ B ₅	CGTCTTAG
A ₅ c ₃ B ₅	GCGTTCTA	A ₆ c ₁ B ₆	GCAATATT
A ₆ c ₂ B ₆	CATGTAGC	A ₆ c ₃ B ₆	CGTCTTAG

Table 3

Energy used to combine each probe with its corresponding region on the library strand

Vertex	Enthalpy energy H	Entropy energy S	Free energy G
A ₁ c ₁ B ₁	94.5	245.2	22.8
A ₁ c ₂ B ₁	110.6	273.9	26.0
A ₁ c ₃ B ₁	100.9	252.8	22.9
A ₂ c ₁ B ₂	103.9	268.3	23.9
A ₂ c ₂ B ₂	115.7	287.2	26.8
A ₂ c ₃ B ₂	105.6	274.9	23.3
A ₃ c ₁ B ₃	104.8	267.7	22.7
A ₃ c ₂ B ₃	113.4	291.3	27.2
A ₃ c ₃ B ₃	108.4	283.1	24.3
A ₄ c ₁ B ₄	107.5	270.7	25.4
A ₄ c ₂ B ₄	109.7	288.9	26.5
A ₄ c ₃ B ₄	102.4	273.7	22.7
A ₅ c ₁ B ₅	110.3	279.1	25.3
A ₅ c ₂ B ₅	97.5	249.9	24.7
A ₅ c ₃ B ₅	98.0	240.6	23.4
A ₆ c ₁ B ₆	106.5	255.0	24.7
A ₆ c ₂ B ₆	104.7	249.7	24.8
A ₆ c ₃ B ₆	103.6	264.2	25.7

Table 4 shows the average and standard deviation of these energies. We then obtain the DNA solution chains for the vertex coloring problem, which are shown in Table 5.

Table 4

Energy in all probe/library chain interactions

	Enthalpy energy H	Entropy energy S	Free energy G
Average	105.444	267.567	24.617
Standard deviation	5.5616	15.6546	1.4678
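The averages in Table 4 can be recomputed from the rows of Table 3. The sketch below uses only the Python standard library; the use of the sample standard deviation (`statistics.stdev`, n - 1 denominator) is an assumption about how Table 4 was produced.

```python
from statistics import mean, stdev

# Columns of Table 3, in row order A1c1B1 ... A6c3B6.
enthalpy = [94.5, 110.6, 100.9, 103.9, 115.7, 105.6, 104.8, 113.4, 108.4,
            107.5, 109.7, 102.4, 110.3, 97.5, 98.0, 106.5, 104.7, 103.6]
entropy = [245.2, 273.9, 252.8, 268.3, 287.2, 274.9, 267.7, 291.3, 283.1,
           270.7, 288.9, 273.7, 279.1, 249.9, 240.6, 255.0, 249.7, 264.2]
free_energy = [22.8, 26.0, 22.9, 23.9, 26.8, 23.3, 22.7, 27.2, 24.3,
               25.4, 26.5, 22.7, 25.3, 24.7, 23.4, 24.7, 24.8, 25.7]

columns = {"H": enthalpy, "S": entropy, "G": free_energy}
summary = {name: (mean(vals), stdev(vals)) for name, vals in columns.items()}

for name, (avg, sd) in summary.items():
    print(f"{name}: average = {avg:.3f}, sample standard deviation = {sd:.4f}")
```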

Table 5

Solution strands to the vertex coloring problem in Fig. 1

3′ - AGGAATAGGTCAGTATCCGTACTGTTAGGCCTCGACACGCTCTCGTCAATCCCGGTCGACTGACGCACACTGCTCGAGGATAGATAGATAGATAGA - 5′
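The solution strand in Table 5 can be read back with the code words of Table 1. The following sketch splits the strand into four-base words and looks each one up; the names `CODEBOOK` and `decode` are illustrative, and the script only reports what is present on the strand.

```python
# Code words from Table 1 (sequence -> symbol).
CODEBOOK = {
    "ATAG": "A1", "GTAT": "B1", "CCGT": "A2", "TTAG": "B2",
    "GCCT": "A3", "ACGC": "B3", "TCTC": "A4", "ATCC": "B4",
    "CGGT": "A5", "TGAC": "B5", "GCAC": "A6", "CTCG": "B6",
    "AGGA": "#", "GTCA": "c1", "ACTG": "c2", "CGAC": "c3", "TAGA": "X",
}

# Strand from Table 5, written 3' to 5' as in the table.
SOLUTION = (
    "AGGAATAGGTCAGTATCCGTACTGTTAGGCCTCGACACGCTCTCGTCAATCCCGGTCGACTGAC"
    "GCACACTGCTCGAGGATAGATAGATAGATAGA"
)


def decode(seq: str, word: int = 4):
    """Translate a strand into its list of symbols, one per four-base word."""
    return [CODEBOOK[seq[i:i + word]] for i in range(0, len(seq), word)]


symbols = decode(SOLUTION)
colors = [int(s[1:]) for s in symbols if s.startswith("c")]  # color of v1..v6
x_count = symbols.count("X")                                 # appended X chains

print(" ".join(symbols))
print("colors of v1..v6:", colors, "| X chains:", x_count)
```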

6 Conclusions

In this paper, we propose a new DNA vertex coloring algorithm based on biological operations. This algorithm has two advantages. First, because DNA sequences of relatively fixed length are used to express the vertices and edges of the problem, the algorithm has a lower error rate in ligation and denaturation. Second, for vertex coloring problems with n vertices and k color restrictions, the proposed algorithm works with O(kn²) time complexity, which compares favorably with previous algorithms (O(1.415ⁿ) [41], O(1.3289ⁿ) [42], O(1.7504ⁿ) [47], O(n⁵) [48] and so on). Our algorithm solves the vertex coloring problem in polynomial time, while most of the other algorithms mentioned above have exponential time complexity (see Table 6). In this respect, our algorithm has a clear advantage. This comparison suggests that DNA-based computing may be a good choice for large-scale combinatorial optimization problems.

Table 6

Time complexity of different algorithms for vertex coloring problem

Algorithm	Our algorithm	I. Schiermeyer	R. Beigel and D. Eppstein	J.M. Byskov	M.A. Shalu
Time complexity	O(kn²)	O(1.415ⁿ)	O(1.3289ⁿ)	O(1.7504ⁿ)	O(n⁵)

The key to reducing the computational complexity of our algorithm lies in the parallel operation of biological chains to generate all possible solutions, which is more efficient than the serial operation of a traditional computer. Our proposed DNA computing algorithm therefore provides a parallel intelligent method for generating all feasible solutions of the problem. Other complex NP problems, such as the assignment of set elements and continuous path search, may be approached in a similar way. Some intelligent algorithms for the vehicle routing problem and the task assignment problem have already been proposed, and our algorithm offers a useful reference for solving these problems. Moreover, we believe that more complex problems can be addressed with our DNA computing algorithm, which may improve the computational efficiency of solving them. This work is an attempt to apply new methods in intelligent computing algorithms, and it promotes research progress in this field to a certain extent.

Because of its remarkable parallelism and mass storage, DNA computing can solve complex computing problems in polynomial time. However, the current strategy of all DNA computing models is to exhaust all candidate solutions and then eliminate non-solutions through detection. As the number of variables increases, the size of the initial data pool of such an algorithm increases exponentially. Therefore, as the scale of the problem continues to expand, this brute-force enumeration will no longer be feasible. Integrating intelligent techniques could break this barrier by obtaining a reasonable solution from a relatively small initial data pool and avoiding the enumeration of all candidate solutions.

At present, much research on DNA computing is still on paper; many ideas and schemes are idealized, and the conditions to test them experimentally do not yet exist. There are still many technical obstacles to realizing DNA computing and building DNA computers. Further research is needed on the practicality and potential of DNA computing, on reducing errors in DNA computing, on effective general algorithms and on human-computer interaction. In particular, errors in DNA computing arise randomly with some probability and can be amplified step by step. The error rate has a direct impact on the accuracy of DNA computing and cannot be effectively overcome at present. Perhaps the DNA computer will only act as an arithmetic unit. Even so, the combination of DNA computing with traditional computers may have a far-reaching impact.

Acknowledgment

This work was supported by the Open Research Fund of State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, China Institute of Water Resources and Hydropower Research (Grant No. IWHR-SKL-201905, SKL2020ZY08). The project is also funded by the research Start-up project of Wenzhou Business School (RC202002) and the National Natural Science Foundation of China (Grant No. 11701363).

References

1. Feynman R.P., Mathematical formulation of the quantum theory of electromagnetic interaction, Physical Review 80(3) (1950), 440–457.

2. Adleman L.M., Molecular computation of solutions to combinatorial problems, Science 266 (1994), 1021–1024.

3. Lipton R.J., DNA solution of hard computational problems, Science 268 (1995), 542–545.

4. Kamio, Takehara and Fujiwara, Procedures for computing the maximum with DNA strands, in: Hamid R. Arabnia, Youngsong Mun (Eds.), Proceedings of the International Conference on DNA Based Computers, 2003.

5. Hug and Schuler, DNA-based parallel computation of simple arithmetic, in: Proceedings of the 7th International Meeting on DNA Based Computers, (2001), 159–166.

6. Guarnieri, Fliss and Bancroft, Making DNA add, Science 273 (1996), 220–223.

7. W.X., Xiao D.M. and He, DNA ternary addition, Applied Mathematics and Computation 182 (2006), 977–986.

8. Xiao D.M., Li W.X., Yu, Zhang X.D., Zhang Z.Z. and He, Procedures for a dynamical system on {0, 1}ⁿ with DNA molecules, BioSystems 84 (2006), 207–216.

9. Chang W.L., Fast Parallel DNA-Based Algorithms for Molecular Computation: Quadratic Congruence and Factoring Integers, IEEE Transactions on NanoBioscience 11(1) (2012), 62–69.

10. Chang W.L., Huang S.C. and Lin K.W., Fast parallel DNA-based algorithms for molecular computation: discrete logarithm, The Journal of Supercomputing 56(2) (2011), 129–163.

11. Z.W., Wang Z.C., Wu T.H. and Huang, Solving the 0-1 knapsack problem based on a parallel intelligent molecular computing model system, Journal of Intelligent & Fuzzy Systems 33(5) (2017), 2719–2726.

12. Wang X.L., Bao Z.M., Hu J.J., Wang and Zhan, Solving the SAT problem using a DNA computing algorithm based on ligase chain reaction, BioSystems 91 (2008), 117–125.

13. Deng, Xu, Song and Zhao, Differential evolution algorithm with wavelet basis function and optimal mutation strategy for complex optimization problem, Applied Soft Computing (2020), 106724.

14. Wang Z.C., Ren X.M., Ji Z.W., Huang and Wu T.H., A novel bio-heuristic computing algorithm to solve the capacitated vehicle routing problem based on Adleman-Lipton model, Biosystems 184 (2019), 103997.

15. Deng, Liu, Xu, Zhao and Song, An improved quantum-inspired differential evolution algorithm for deep belief network, IEEE Transactions on Instrumentation & Measurement 69(10) (2020), 7319–7327.

16. Yamamura, Hiroto and Matoba, Solutions of shortest path problems by concentration control, Lecture Notes Computer Science 2340 (2002), 231–240.

17. Song Z.G., Xu and Zhen, Mixed-coexistence of periodic orbits and chaotic attractors in an inertial neural system with a nonmonotonic activation function, Mathematical Biosciences and Engineering 16 (2019), 6406–6425.

18. Paun, Rozenberg and Salomaa, DNA Computing, Springer-Verlag, (1998).

19. Yao S.W., Ding L.W., Song Z.G. and Xu J.Q., Two bifurcation routes to multiple chaotic coexistence in an inertial two-neural system with time delay, Nonlinear Dynamics 95 (2019), 1549–1563.

20. Ouyang, Kaplan P.D., Liu and Libchaber, DNA solution of the maximal clique problem, Science 278 (1997), 446–449.

21. Zhao H.M., Zheng J.J., Deng and Song Y.J., Semi-supervised broad learning system based on manifold regularization and broad network, IEEE Transactions on Circuits and Systems I- Regular Papers 67(3) (2020), 983–994.

22. Wang Z.C., Ji Z.W., Wang X.M., Wu T.H. and Huang, A new parallel DNA algorithm to solve the task scheduling problem based on inspired computational model, Biosystems 162 (2017), 59–65.

23. Deng, Wang, Liu and Wu, A bio-inspired algorithm for a classical water resources allocation problem based on Adleman-Lipton model, Desalination and Water Treatment 185 (2020), 168–174.

24.

Wang

Z.C.

, Huang

D.M.

, Meng

H.J.

, et al., A new fast algorithm for solving the minimum spanning tree problem based on DNA molecules computation, Biosystems 114(1) (2013), 1–7.

25.

Gualandi

and Malucelli

, A simple branching scheme for vertex coloring problems, Discrete Applied Mathematics 160 (2012), 192–196.

26.

Malaguti

, Monaci

and Toth

, An exact approach for the Vertex Coloring Problem, Discrete Optimization 8 (2011), 174–190.

27.

Czap

, Parity vertex coloring of outerplane graphs, Discrete Mathematics 311 (2011), 2570–2573.

28.

Majid

, A New Solution for Maximal Clique Problem based Sticker Model, Biosystems 95 (2009), 145–149.

29.

Mokhtar

, DNA sequence design for DNA computation based on binary particle swarm optimization, International Journal of Innovative Computing, Information and Control 8(5B) (2012), 3441–3450.

30.

Garey

M.R.

and Johnson

D.S.

, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, New York, 1979.

31.

Karp

R.M.

, Reducibility among combinatorial problems[M], Complexity of computer computations. Springer, Boston, MA, 1972:85–103.

32.

Hertz

and de Werra

, Using tabu search techniques for graph coloring, Computing 39(4) (1987), 345–351.

33.

Caramia

and Dell’Olmo

, A Fast and Simple Local Search for Graph Coloring, Lecture Notes in Computer Science 1668 (1999), 317–330.

34.

Voudouris

and Tsang

E.P.K.

, Guided Local Search. In Handbook of Metaheuristics, edited by Glover

, 185–218. Norwell, MA: Kluwer Academic Publishers, 2003.

35.

Caramia

and Dell’Olmo

, Coloring graphs by iterated local search traversing feasible and infeasible solutions, Discrete Applied Mathematics 156(2) (2008), 201–217.

36.

Galinier

, Hertz

and Zufferey

, An adaptive memory algorithm for the k-coloring problem, Discrete Applied Mathematics 156(2) (2008), 267–279.

37.

Caramia

and Dell’Olmo

, Embedding a novel objective function in a two-phased local search for robust vertex coloring, European Journal of Operational Research 189(3) (2008), 1358–1380.

38.

Wang

Z.C.

, Huang

D.M.

, Tan

, Liu

T.G.

, Tan

and Li

, A parallel algorithm for solving the n-queens problem based on inspired computational model, Biosystems 131 (2015), 22–29.

39.

Wang

Z.C.

, Zhang

Y.M

, Zhou

W.H.

and Liu

H.F.

, Solving traveling salesman problem in the Adleman-Lipton model, Applied Mathematics and Computation 219 (2012), 2267–2270.

40.

Narayanan

, Zorbalas

, et al., DNA algorithms for computing shortest paths, in: J.R. Koza (Ed.), Proceedings of the Genetic Programming 1998, Morgan Kaufmann, (1998), 718–723.

41.

Schiermeyer

, Deciding 3-colourability in less than O(1.415ⁿ) steps[C], International Workshop on Graph-Theoretic Concepts in Computer Science, Springer, Berlin, Heidelberg (1993), 177–188.

42.

Beigel

and Eppstein

, 3-coloring in time O (1 . ⁿ), Journal of Algorithms 54(2) (2005), 168–204.

43.

Chai

, Chen

and Broyde

, A novel chaos-based image encryption algorithm using DNA sequence operations, Optics and Lasers in Engineering 88 (2017), 197–213.

44.

Liang

, Zhu

and Lv

, Molecular Computing and Bioinformatics, Molecules 24(13) (2019), 2358.

45.

, Qiang

, Zhang

and Yang

, A DNA Computing Model for the Graph Vertex Coloring Problem Based on a Probe Graph, Engineering 4(1) (2018), 61–77.

46.

Liu

, Hong

, Liu

, Lin

, Alfonso

R.P.

, Zou

and Zeng

, Computational methods for identifying the critical nodes in biological networks, Briefings in Bioinformatics 21(2) (2020), 486–497.

47.

Byskov

J.M.

, Exact algorithms for graph colouring and exactsatisfiability, Operations Research Letters 32 (2004), 547–556.

48.

Shalu

M.A.

, Vijayakumar

and Sandhya

T.P.

, On the complexity of cd-coloring of graphs, Discrete Applied Mathematics 280 (2020), 171–185.