Abstract
A textile fabric consists of countless parallel vertical yarns (warps) and horizontal yarns (wefts). While common looms can weave repetitive patterns, Jacquard looms can weave the patterns without repetition restrictions. A pattern in which the warps and wefts cross on a grid is defined in a binary matrix. The binary matrix can define which warp and weft is on top at each grid point of the Jacquard fabric. The process can be regarded as encoding from pattern to textile. In this work, we propose a decoding method that generates a binary pattern from a textile fabric that has been already woven. We could not use a deep neural network to learn the process based solely on the training set of patterns and observed fabric images. The crossing points in the observed image were not completely located on the grid points, so it was difficult to take a direct correspondence between the fabric images and the pattern represented by the matrix in the framework of deep learning. Therefore, we propose a method that can apply the framework of deep learning viau the intermediate representation of patterns and images. We show how to convert a pattern into an intermediate representation and how to reconvert the output into a pattern and confirm its effectiveness. In this experiment, we confirmed that 93% of correct pattern was obtained by decoding the pattern from the actual fabric images and weaving them again.
Introduction
With the rapid development of modern production techniques, the textile industry has undergone rapid changes. Intelligent weave machines have been introduced to make textiles convenient to produce and diversified in style and pattern. In addition, we have a clear need for personalized customization and on the whole, are not satisfied with traditional, uniform styles. Many of the ancient textiles exist as real objects only, and the patterns are still valuable and not outdated. If we want to reproduce such ancient fabrics, we need to analyze the patterns. Technicians can analyze the pattern by observing the fabric using a microscope and recording the crossing state of each yarn by disassembling them, but this is time consuming and tedious. At the same time, the original textile will be destroyed. Therefore, it is not a good way to solve this problem and we need to find a novel technique for automatically extracting the pattern. This would be useful for both reproduction and generating new patterns.
In this work, we focused on Jacquard fabric in which the crossing of warp and weft was specified by a binary matrix pattern. In contrast to ordinary looms, which only weave repetitive patterns, Jacquard looms can weave free patterns without repetition restrictions. With Jacquard fabric, the warp and weft can be defined for each crossing point; the resulting fabric is not composed of uniform pattern regions. If the fabric is made with small repetitive patterns, the analysis can be done manually without much effort, but for Jacquard fabrics, the analysis of large patterns that make up the entire fabric is required, and manual analysis requires much effort. The segmentation based on pattern uniformity does not work for analyzing Jacquard textile patterns.
Jacquard textiles are often used in valuable fabrics such as traditional costumes and neckties, and there is high industrial value for the encoding and decoding of the patterns. Therefore, in this work, we focused on Jacquard fabric, in which the crossing of the warp and weft was specified by a binary matrix pattern.
The warp and weft yarns were dyed different colors and the pattern defined how the yarns crossed. The binary matrix pattern was defined as a binary image. For example, when the black yarn was above the white yarn, the crossing point was displayed in either black or white. We input the binary pattern into the weaving machine to produce the fabric. The Jacquard loom was able to accept the over-under relationship for the individual crossing points. Modern Jacquard looms can weave a coded pattern by loading a file of the pattern [1].
The appearances of images are different from each other, even at the same crossing point where a weft overlaps a warp. Figure 1 shows zoomed-in images of the crossing point area in an observational image. Conventional template matching was not able to estimate the exact positions of the crossing points.
The variety of crossing points makes the problem difficult. Conventional template matching cannot deal with the crossing points.
Where there are many pairs of fabric images and their corresponding binary patterns, deep neural networks (DNNs) can decode unknown binary patterns from a corresponding fabric image. However, DNNs cannot directly output a binary pattern with thousands of crossing points from a fabric image with millions of pixels. To solve the problem, we introduced an intermediate representation that bridged the pair and enabled us to output the intermediate representational patterns by a DNN. In addition, to convert a fabric image into the intermediate representation, and the intermediate representation into a binary pattern, we built a practical method for the conversion. We introduced intermediate representation images because we expected the images to directly represent the likelihoods which are the crossing points. In addition, due to insufficient sample size, complex networks that require a large amount of training data are not likely to be trained well. A more complex network, like an end-to-end network which directly answers the positions of crossing points, trained by numerous samples could provide accurate results.
We believe that the objective of this study, textile pattern decoding, has not been done to the best of the authors’ knowledge in previous work. Therefore, we have not found any previous method that allows direct comparison of accuracy. Instead, by changing the parameters and kinds of filters in various ways, we confirmed the best possible settings for the current situation.
The overview of our proposed method including the steps of (1) Pre-processing; (2) Manual labeling; (3) Integration; (4) Training; (5) Execution; and (6) Post-processing.
The contributions of this paper are as follows:
Introduction of an intermediate representation for textile decoding tasks; Interface design for manual tagging of cross points on an observed textile image (Section 3.1.1); Proposal of a pre-process for converting a fabric image to an intermediate representation (Section 3.1.3); and Proposal of a post-process for converting an output intermediate representation pattern into a regular binary matrix pattern (Section 3.3).
Figure 2 shows the overview of our proposed method. Section 2 introduces the background and related work, and Section 3 shows a method of converting a fabric image into an intermediate representation pattern as well as an intermediate representation pattern into a binary pattern. We also describe the details of DNN configurations. In Section 4 we present experimental results and in Section 5, a summary.
Pattern creation by computer support
Depending on the local density of the crossing points as well as the colors of yarn appearing on the top, a textile pattern brings different levels of brightness. Inappropriate patterns result in misaligned grid points and partial fabric stiffness. Considering many conditions, and through trial and error, textile patterns have been manually created since ancient times. In recent years, systems for designing textile patterns with the support of computers have been proposed and it has become possible to create more complex patterns [2, 3, 4].
Toyoura et al. [5] proposed a dithering method for reproducing smoothly changing tones and fine details of natural images on woven fabric, focusing on representing gray scale images by using two colors of warp and weft yarns. The weaving pattern is generated by binarizing the input image using dither masks. The step dithering method alternately places values of 0 and 255 at given intervals in each row of the dither mask such that there is at least one cross point of warp and weft yarns in the spacing in the resulting fabric. Within these intervals, the threshold increases from 0 to 255 or from 255 to 0. This forms a stepping structure up and down. The resulting binary image reproduces the brightness of the input image while limiting the number of crossing points in the weave pattern.
By modeling and rendering 3D CG (computer graphics) from a textile pattern, the woven result can also be predicted. By adding a physical collision detection, it is possible to reproduce the appearance of fabrics by CG. Users will be able to modify the pattern without actually weaving it.
There have been many studies on computer graphics to generate photo-realistic fabric images from defined patterns [6, 7, 8, 9]. The images obtained by observing yarns, fabrics, and 3D yarn data obtained by CT are used to generate realistic images. 3D fabric models are constructed from the data. These are the opposite of directional studies, as our proposed method restores patterns from actual fabrics. In this work, we aimed to output a pattern of weaving, which would be a complementary work.
Pattern analysis for fabric images
Due to industrial demand, the pattern analysis by DNNs has been introduced for content-based image retrieval [10], noisy image recognition [11], 3D medical image super-resolution [12], video surveillance [13], foreground detection [14], multi-object tracking [15], explosive device detection [16], pupil detection [17], online data streaming [18], airport baggage handling [19], and many other objectives. Computational photography [20] and purpose-specific machine learning [21] have been also employed to solve industrial problems.
Many studies have been conducted to detect defects from fabric images. Huangpeng et al. [22] constructed a weighted, low-rank, representational model of textures and detected defects. Ren et al. [23] realized the detection from a small sample set using pre-trained DNNs, and Wei et al. [24] also employed DNN for the classification of fabric using a small number of samples. Jeyaraj et al. [25] improved the accuracy of defect detection by introducing the characteristic of texture in the training of DNN. Jing et al. [26] analyzed the detailed parameter settings of pre-trained networks, image patch sizes, the number of layers, and so on for detecting defects in repetitive patterns. Li et al. [27] proposed a method for detecting defects by Fisher criterion-based stacked denoising autoencoders. Liu et al. focused on information entropy and frequency domain saliency [28]. The ability to detect defects is valuable for the industry [29, 30, 31] and patents for defect detection have been published [32]. On the other hand, defect detection is a task of anomaly detection or classification; however the pattern analysis of interest in this work is the binarization of the regularly aligned crossing points of warp and weft yarns.
Estimation of warp and weft positions. For the input image shown in (a), the LOG filter image shown in (b) is calculated (
In order to measure the density of the weft for the woven fabric, Schneider et al. detected the intensity minima of brightness and estimated the positions for each [33]. Compared with the simple Fourier transform, the change in the density of the weft can be obtained more accurately. Luo and Li [34] employed image processing techniques for quantifying fiber distribution uniformity in blended yarns based on cell counting and dilation area estimation. Meng et al. [35] realized their goal of estimating the yarn density of multi-color woven fabrics by deep learning. In this method, assuming that the whole fabric is woven in a periodical pattern, the intersection of the warp and the weft arranged on a regular grid is detected. Near-infrared spectroscopy revealed the material of yarns [36]. Machine learning also contributed to estimate the kinematic behavior of yarns [37]. In this paper, we aimed to analyze different patterns in a fabric. The grid formed by the warp and the weft may be greatly collapsed.
Zheng et al. [38] proposed a method for recognizing how predefined patterns are arranged on a piece of fabric. Loke and Cheong used co-occurrence matrices for the recognition [39]. These methods can be applied when the fabric is composed of a small number of known patterns, but this cannot be assumed for woven fabrics in which the pattern is created manually. Therefore, we aimed to determine the intersection of each point on the grid.
Here, we describe a method for decoding binary textile patterns from observed images. The method consists of three parts; pre-processing for converting a fabric image to an intermediate representation (Section 3.1); using a DNN for generating a label image (Section 3.2); and post-processing for converting the intermediate representation into a binary matrix textile pattern (Section 3.3). The training data is a set of observed images and a corresponding set of intermediate representations of these images with manually labeled cross point positions. At runtime, the final binary pattern is output from an observed image only. Figure 2 shows an overview of the process.
Pre-process for converting a fabric image to an intermediate representation
Manual tagging of crossing points
Analysis of the weaving pattern can be done manually by observing the fabric using a microscope and recording the cross state of each yarn by disassembling it. This method is very time consuming, laborious, inefficient, and costly. In addition, it is also undesirable to perform such destructive inspections on textiles of high historical value. As is shown in Fig. 3, we estimated fabric patterns from captured images and interactively modified fabric patterns to analyze weaving patterns. Initial estimation results of the crossing point positions of warp and weft yarns were given by image analysis. The positions were interactively modified with a GUI so that the pattern could be obtained in a short time without destroying the actual fabric. Figure 3 outlines the detection of the positions of weft yarns by filtering the image with a Laplacian of Gaussian (LOG) filter. A LOG filter is used to extract edges, such as regional boundaries, while removing small noise. We expected to ignore the fine threads and to output the boundaries between warp and weft yarns. Where the boundary pixels are few, it can be assumed that it is the center of the crossing point of warp and weft yarns. Since the edge pixels could be observed at the edge of the yarn, the center of the yarn was found by picking up the local minima with fewer edge pixels [38].
Converting processes of input image into labeled image. The observed image was captured with a camera with a macro lens. The image was smoothed in the pre-process and the positions of the crossing points were manually tagged. By filtering the crossing points, we obtained the final, labeled image.
When the pixel values were integrated in the horizontal direction (X direction), the intensity of the edge component in each row was obtained. The position where the edge intensity reached the maximum was the end of the row with many edge elements, and the row whose edge intensity reached the local minimum was the center of the yarn with relatively few edges. Applying the same process to the warp yarns, the crossing points of the warp and weft yarns could be obtained as the initial positions of crossing points. Although this method can analyze patterns quickly, it is not automated enough to perform large-scale analysis. Therefore, we used this method only to prepare training data.
Once the initial positions of the warps and wefts were determined from the input image, we computed which of the warps or wefts was higher for each of the grid points. By giving two representative colors of warp and weft, we could see which color was closer at a grid point, and then give the state of the grid at that point as 0 or 1.
In the developed GUI, a user can add, remove, and move yarns by clicking and dragging on the screen. The operational mode was defined as move, add, or delete by key presses: The operation with the shift key pressed was for the warp; the operation with the ctrl key pressed was for the weft, and the operation without pressing any key was for the crossing point. When the weft and warp positions were updated, the initial intersection (based on the new warp and weft positions) could be obtained by pressing the C key. When the F key was pressed, the vertical direction of the weft and warp at the crossing point closest to the clicked mouse position was reversed.
In the observed image, even fine fibers are observed, as shown in Fig. 4a. The fine fibers produce intensity edges which result in high values in the LOG-filtered image. This problem can be easily solved by the image processing of erosion and dilation, resulting in the image shown in Fig. 4b. Erosion and dilation are used to remove the regions with little area. In our experiments, we used three functions in MATLAB; strel, imerode, and imdilate. To obtain the results, the image was eroded with a 5.5 pixel radius setting and then expanded with the same setting. The radius setting was determined by checking to see if the thin fibers disappeared in the sample image.
Figure 4c shows the image with the manually labeled crossing points. The red and blue points are central with the red point indicating that the warp is on top at the point. The blue point indicates that the weft is on top at the point. Figure 4d shows the image with white pixels for the point of the warp on the weft; black pixels for the point of weft on the warp; and gray pixels for other than the crossing points in the image. For machine learning, the pre-processed image in Fig. 4b and the intermediate representation image in Fig. 4d are provided as a pair; the DNN receives the pre-processed image and outputs an intermediate representational image.
Impulse peak pattern and three kinds of labeled images.
Here, we considered how to make an intermediate representational image. If we were to use the result of manual labeling in its present form, we would get an image
In the input image, there is no significant change in the immediate vicinity of the crossing position, so it is expected that DNN would have difficulty producing such an image. The image
Note that
In Fig. 5d, the final binary pattern is transformed into an image, but the size of the image is much smaller than the input image. Additionally, the positions of the crossing points are different from those of the input image. To obtain such an image by DNN is difficult because DNN tries to determine whether a point is a crossing point based on the information of a pixel and its neighbors. In that situation, DNN cannot produce an image like Fig. 5d in which the position is completely misaligned.
We used a DNN to output a label image from the input image. It was necessary to determine whether a pixel corresponded to a crossing point by looking at the surroundings of the pixel without losing the accuracy of the position. We employed a DNN with a U-net structure [40] to solve this task in which a network with this structure can take into account the surroundings by a network of autoencoders and maintain the resolution by jumping paths.
Neural network model structure for converting labeled intermediate representation image.
The model structure is shown in Fig. 6. The input to the DNN is a pre-processed image, which is an image with fine noise removed, and the output from the DNN is an intermediate representation of the image, which shows the likelihood of the existence of the crossing points on the image. The model has a total of six down-roll multi-layer and six upper convolution layers; the final output size is the same as the input size. When the pre-processed observation images were input, the network was trained so that the label images were output. The input image size of
In the learning phase, we gave a pair of observed images and a label image that were manually generated. For the loss function, the L1 norm of pixel values was employed. The number of neurons in each layer is also shown in the figure. PyTorch was used for implementation. A more detailed implementation environment is described at the beginning of the experimental section.
Post processing for the conversion of output image from a deep neural network.
DNN outputs an intermediate representation image; however, a non-trivial post-processing is required to obtain the final binary pattern. This occurs in several steps: 1) the intermediate representation image is converted into a tri-valued image of 0, 0.5, and 1; 2) each peak region is merged into one; 3) the approximate horizontal and vertical positions of the warp and weft yarns are found; and 4) a determination is made as to which of warp and weft are over at each grid point. Finally, we get a binary pattern with 0 s and 1 s on the grid points, as shown in Fig. 7.
Converting the intermediate representation image into a tri-valued image
The intermediate representation image output from the DNN had continuous values in the range of (0, 1). In order to find the crossing points by segmenting the image, each pixel must have a discrete value. We replaced the value of each pixel with the closest of the three values, in which the crossing point of the warp was 0, the crossing point of the weft was 1, and 0.5 otherwise. This process is equivalent to thresholding by two values of 0.25 and 0.75. An image is converted into a tri-valued image only with white, black, and gray pixels.
Integrating regions of multiple values
In the image obtained by the process described in 3.3.1, based on the gray background, a region consisting of white pixels only, a region consisting of black pixels only, and a region containing both white and black pixels appeared. Since the crossing points were never adjacent to each other in real fabrics, no region ever contained both white and black pixels.
We aim to avoid the coexistence of white and black pixels in each region that represents a crossing point. The region containing both white and black pixels was merged into the region with the larger number of white or black pixels. The pseudo code for the process is described in Algorithm 1.
[H] Integrating regions with multiple values with connected component labeling (CCL)[1] [Foreach]ForeachEndForeach[1]
The regions were labeled by segmentation and the adjacent regions for each region were searched. The connected components labeling (CCL) gave the individual, connected regions of a binary image drawn by identical numbers. A region containing both white and black pixels could be detected from the adjacency matrix. Ignoring the background gray region, we first flagged the adjacent regions. We then flagged the adjacent regions of the first set of adjacent regions and repeated the process until no more flags appeared. Finally, the black and white regions included in each independent region were recognized. The number of white and black pixels belonging to each region were counted and the region was merged by the higher color.
After we merged the regions with only white pixels and black pixels, we next found the representative points of each region. Since the representative point should be located in the center of the crossing point region, the pixel at the gravity point of each segmented region was to be the representative point of that region.
Finding approximate horizontal and vertical positions
Regional representative points indicating crossing points included duplicated and missing points. The crossing points were constraints that existed in a grid along the yarns. The regional representative points satisfying this constraint were picked up and converted into binary matrix patterns.
First, by detecting the positions of the warp and weft yarns, the approximate positions of the grid points were estimated. The distance transformation image for the obtained candidate crossing points were able to show the likelihood of a warp or weft passing through its position. The positions of warp and weft yarns were estimated in the same way as in the pre-processing, using the distance transformation image as the target.
The crossing point pixels were replaced by 1 as the object region and the other pixels were replaced by 0 as the background region. The value of each pixel in the distance transformation image indicated the distance from that pixel to the nearest pixel in the object region. The distance was 0 for pixels in the object region; the further away the pixel was, the larger the value was.
We integrated the distance transformation values of pixels in a column. Along a column, the more pixels that were in the object region or close to it, the smaller the integral became while the likelihood of being the center of the warp or weft yarn increased. As described in Section 3.A, the positions of warp and weft yarns were estimated by determining the likelihood according to the power of intensity edges. The integral was smoothed in the same way to reduce the effect of local noise. By way of this process, the positions of the warp and weft yarns were estimated.
Determining value at each grid point
As the positions of the warp and weft yarns were estimated, the positions of the grid points were also estimated automatically. By assigning a value of 0 or 1 to each grid point, we were able to obtain the final binary pattern in a matrix form. In Fig. 7, the blue lines indicate the estimated location of the warps while the red lines indicate the estimated location of the wefts. The white and black points are representative points with the values of 1 or 0.
For each grid point, we extracted the best candidate point within distance
Experiment
Comparison of the different kinds of intermediate representations for DNN training. The results shown in (c) are given with 
Textile samples were observed with an SLR camera Canon EOS M for consumer use; we attached a fixed focus macro lens with LED illumination Canon EF-S 35 mm f/2.8 Macro IS STM. High-resolution images were divided into small, partial images. The images were made by observing samples represented in reference [5]. The samples were woven with black and white yarns only, and the patterns were generated from natural images. We obtained 176 images; each had a resolution of
PyTorch was employed as a deep learning framework and GPU GeForce2080Ti 11 GB was used for training and validation. It took about 2.4 hours to train. The batch size was set to 4 and the maximum epoch was set to 400. To complete the process of one image at runtime it took about 1.21 seconds, which included 0.24 second for pre-processing, 0.08 second for executing DNN, and 0.89 second for post-processing.
DNNs were respectively trained by the three intermediate representational images described in Section 3.2. We performed an 11-fold cross-validation procedure to ensure the robustness of the method. The 176 images were divided into 11 groups of 16 images. After training the network with 160 images, we verified the accuracy of the remaining 16 images and repeated this process 11 times. The output images obtained by each training are shown in Fig. 8. For the results, no post-processing was done.
First, we can see that impulse peak pattern training does not provide good results. Many regions of the white yarns were detected as crossing points, and candidate points also appeared at the edges formed by the yarn contours. When considering the segmentation of post-processing, the noisy clutters did not result in the correct detection of candidate points.
Although much better results were provided than those by impulse peak pattern training, Gaussian-filtered peak pattern training generated blurry images. The post-processing of the segmentation was not successful because each of the regions became one continuous region of adjacent, segmented regions. This trend did not improve, even after changing the variance of the Gaussian peak. When the variance was set too small, the results were similar to those induced by the impulse peak pattern.
Finally, the results trained by the box-filtered peak pattern were better than the two previous results. The region indicating each crossing point was less likely to be integrated with its adjacent regions. The extraction of the candidate representative points by post-processing also showed good results in appearance. The correct rate for all groups was 83.25% on average, and the minimum was 74.13%, the maximum was 99.17%, and the standard deviation was 8.40%.
Performance of cross point detection
The resulting binary patterns obtained in the form of matrices could not be evaluated directly because the same number of crossing points in each row and column could not be aligned in the observed image. Therefore, the observed crossing points could not be represented in a matrix manner. Instead of the direct evaluation of the binary patterns, we verified whether the label images were correctly obtained. The correct label image could be given by manual labeling. The accuracy of the position of the crossing points could also be examined.
Here, we evaluated the representative points of the crossing point regions obtained from the resulting label image. Many pixels took a value of 0.5 and a few took 0 or 1 to indicate that the pixels were representative crossing points. Similarly, in the manually labeled images, a small number of crossing points were placed in a background of many pixels that took a value of 0.5. In order to verify the match between them, we set a threshold
Threshold 
Comparison experiment for different number of layers, filters, and distance threshold 
For quantitative evaluation, for each representative crossing point in the estimated binary pattern, the rate of correctly detected crossing points (Correct), the rate of incorrectly detected crossing points (Error), and the rate of points with no candidate points in distance
Figure 8e shows the resulting label images and ROC-like curves. Note that we omitted the quantitative analysis for impulse peak pattern training. There were not enough extracted crossing points from impulse peak pattern training because resulting output images had few large white and black regions after post-processing.
After extensive trial and error, we found a DNN with 12 layers, a
Reproduced woven fabric by final decoded binary patterns.
To confirm the optimal structure of the DNN, we designed comparative experiments, reducing and increasing the number of neural network layers. The number of layers should be even because the same number of layers is required for encoder and decoder parts. The deepest layers near the bottleneck were added and removed for changing the number of layers. The smallest layer of 14-layer network is
Filter parameters
For the intermediate representation, we controlled other unchanged variables, and compared their results by changing the size of the filter. Although there were a number of values from which to choose, considering the feasibility of the experiment, as shown in Fig. 10b and c, we compared the Box methods with filter sizes of 5, 7, 9, and 11; we compared the Gaussian methods with filter sizes of 3, 5, 7, and 9. When the filter size was 9, the Box-filter method had the highest accuracy of 78.92%, 7 and 11 had accuracy levels of 72.21% and 73.99%, respectively. When the filter size was 5, the Gaussian-filter method had the highest accuracy of about 71.69%; 7 and 9 had accuracy levels of about 60.77% and 41.00%, respectively. For Gaussian size 3 and Box size 5, we could not extract the binary pattern, so their accuracy was 0. By comparison, when the filter size of the Box-filter method was 9, that yielded the best result, and that was the scheme adopted for our method.
Distance threshold
Each resulting label image showed the same trend, as shown in Fig. 10d and e, there were 47.47% of correct crossing points at
Validation of decoded binary patterns
Figure 11 shows the results that were woven by the obtained binary pattern, together with the observed image. Since the input image contained crossing points near the image boundary that could not be expressed in the form of a matrix, we ignored the intersections near the boundary and evaluated them. In addition, we added binary pattern validation in Fig. 11c to show the analysis results. We evaluated whether the values of 0 and 1, given the grid points in the estimated binary image matched the values in the ground truth image. Each grid point corresponded to a pixel of the observed image, so we could check the match between it and the binary value of the nearest crossing point in the ground truth image. In Fig. 11c, a red box shows that 0 was wrongly estimated as 1, and a blue box shows that 1 was wrongly estimated as 0 for each grid point. In the patterns, we were able to obtain results that were close in appearance. Since some of the patterns were extracted that are not common as woven patterns, such as too many or too few crossing, there was room for the pattern to be improved by converting it with an uncommon pattern such as a restriction. Although the error in which 0 was wrongly set to 1 and vice-versa is different from a false positive or a false negative in a general sense, the accuracy and F-measure of the statistical metric can be applied to the data. The accuracy for 176 images was 0.930 and the F-measure was 0.929, on average.
Although the obtained pattern was not perfect, it can be said that the same pattern was approximately obtained automatically.
Conclusions
In this paper, we proposed a method for decoding the binary patterns that define the weaving of fabric. By developing intermediate representations, we were able to accomplish the task by deep learning. The pre-processing and post-processing allowed us to bridge the intermediate representation image and the binary pattern. The experimental results showed that our method allowed for correctly extracting 93% of the crossing points, and the reproduced textiles were close to the original one in appearance. Although the box filter may not be optimal, it gives better results than the isotropic Gaussian filter. We consider this to be due to the grid arrangement of the yarns.
Only black and white yarn images were discussed in this paper. If the warp and weft are of one color each, the same processing is considered possible by converting the image according to the proximity of each color. Multiple colors may be used for weft yarns, and this will be handled by converting the image by the color distance from that of warp yarn; however we have not confirmed whether this will work well. That is a topic for future studies.
It is necessary to ensure that each of the pieces of yarn are captured one by one in the observed image, which limits the scope of the observation. We need a method to integrate the resulting patterns observed at multiple locations. The resulting patterns are partially incomplete, which makes the problem more difficult. Various image stitching techniques are helpful. Although we used LOG-filtered images for detecting the positions of the yarns, there is the possibility of improving the accuracy by introducing the structural tensor. In the case of textiles, the yarns are made of even finer yarns twisted together, so the fine edges can interfere with calculating the correct tensor. We would like to address these issues in the future.
We will also leave the further tasks of the optimization of network configuration and the selection of the backbone network to future work. Recent powerful machine learning techniques [41, 42, 43, 44] will contribute to improve the accuracy of pattern decoding even when the dataset is not large enough.
Footnotes
Acknowledgments
The authors appreciate the joint support for this pro- ject by the JSPS KAKENHI (Grant Nos. JP16H05867, JP20H04472), the NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization (Grant No. U1909210), the National Natural Science Foundation of China (Grant Nos. 61761136010, 61772163), the Natural Science Foundation of Zhejiang Province (No. LY18F020016), and Zhejiang Lab Tianshu Open Source AI Platform (No. 111007-AD1901).
