Abstract
Image retrieval is the task of retrieving images similar to a given query image from a large database. Lacy and embroidered fabrics contain repetitive patterns and rich textures, which makes image retrieval difficult. The GIST feature is a spatial information feature that performs well in retrieving images with duplicate patterns. The speeded-up robust features (SURF) descriptor is invariant to rotation, which makes it powerful in retrieving rotated images. The method proposed in this paper combines the benefits of GIST and SURF features to support image retrieval from a fabric image database. In addition, we extract the structure from the texture via relative total variation to eliminate the influence of complex texture on feature point extraction. A key insight and contribution of our paper is that this combination enables accurate fabric image retrieval, especially for rotated images. To demonstrate the robustness and accuracy of our method, we applied it to a database that contains 527 fabric images. The experimental results show that the proposed algorithm outperforms state-of-the-art methods on fabric images with hollow and embroidery patterns.
At present, images of many kinds are used extensively in people’s lives. Consequently, there is an increasingly high demand for efficient and effective image indexing and retrieval methods, and image retrieval has become a long-standing and important research topic in computer vision. One common application is image retrieval in the apparel industry. However, in the textile industry, the retrieval of fabric images remains a great challenge, especially for lacy and embroidered fabrics. These fabric images usually have the following characteristics:
1. Fabric images usually contain a large number of duplicate patterns.
2. Due to the richness of textures contained in embroidered textiles, it is difficult to obtain dominant feature information and retrieve the fabric images. 1
3. Queried images, taken by hand, are usually rotated.
Recently, a number of methods for retrieving images have been proposed (discussed in the section on related works of image retrieval), which borrow ideas from low-level descriptors or deep learning. Despite much research on image retrieval, limited attention has been paid to the retrieval of fabric images, especially lacy and embroidered fabrics. The GIST descriptor provides an appropriate representation of the spatial information contained in fabric images. However, it cannot handle rotation well in the retrieval process. The rotation invariance of speeded-up robust features (SURF) can effectively compensate for this drawback. Our goal is to develop an algorithm that, in comparison to state-of-the-art methods, retrieves fabric images accurately and handles rotated images effectively. Therefore, we propose a method that combines the GIST and SURF descriptors to increase the accuracy of fabric image retrieval with respect to the characteristics of fabric images. In this method, the GIST descriptor extracts spatial information features to detect the repetitive patterns of fabric images, and the SURF descriptor tackles image rotation. Moreover, we use structure extraction from texture via relative total variation to remove useless textures, which allows us to extract more meaningful SURF features.
The difference between the proposed approach and existing approaches lies in the combination of the two features. Furthermore, we present a step to determine the most appropriate weighting values for the SURF and GIST descriptors. In addition to these two differences, a rotate-rate index is introduced to measure the ability to retrieve rotated images.
Related works of image retrieval
In the past several decades, extensive work has been done on image retrieval. Detecting the repetitive patterns of images is one of the most important steps in fabric image retrieval. To find repeated patterns, Karp et al. 2 presented a theoretical treatment to attack such problems. Schindler et al. 3 developed an algorithm to detect the repetitive patterns contained in man-made environments. The method fully exploited symmetric wallpaper patterns to geo-tag the image. After these early works, a shift-invariant descriptor that represents duplicate patterns was proposed to retrieve images from a dataset. 4 However, because this method relies on a single feature, its retrieval results are not accurate on large datasets or complex images.
In recent years, color features are widely used as indexing features for image retrieval.5–7 Huang et al. 7 defined an image feature called color correlogram which distills the spatial correlations of colors to compare images. Color moments (CM) 8 is a classic feature which characterizes color distribution in an image. The method is inappropriate for retrieving partially similar images but is generally used to narrow down the retrieval range and combined with other features.
Apart from color features, other low-level features have been adopted for image retrieval. Lowe 9 proposed the scale invariant feature transform (SIFT), whose features are invariant to image scale and rotation. Consequently, SIFT and its refinements10,11 can be used to match repetitive patterns in fabric images. A widely used method for image retrieval is SURF, proposed by Bay in 2006 and improved by Pang in 2012,12–14 to extract fully affine invariant interest points. SURF performs well in matching rotated images, but it does not perform well in retrieving images with rich interest points. 15 To extract features by simulating visual processing, Liu et al. 16 proposed a novel feature descriptor, namely the micro-structure descriptor, to effectively integrate texture, shape and color information. Although the method achieves a higher retrieval precision than SURF, retrieving images with rich interest points is less successful and still requires further research. GIST and its variants17–19 are important methods for extracting spatial information. The spatial information feature can provide an appropriate representation of fabric images, as fabric images contain numerous duplicate patterns, which are rich in spatial information. Jing et al. 20 proposed a method that utilized CM and GIST features to retrieve fabric images. However, this algorithm does not address the issues raised by rotated images.
Inspired by the advancement of artificial intelligence (AI), a series of AI techniques can be employed to handle the image retrieval problem in the apparel industry. 21 Lin et al. 22 introduced an effective supervised convolutional neural network (CNN) framework that can simultaneously learn image representations and binary codes for rapid image retrieval. Lettry et al. 23 explored the capabilities of pre-trained CNN to detect repetitions in images. However, the above deep learning methods assume that the data are labeled. Without label information, these algorithms suffer from severe performance degradation.
Description of proposed algorithm
We propose a robust fabric image retrieval method based on GIST and SURF descriptors. Figure 1 illustrates the framework of our method. Our pipeline takes as input a fabric image and outputs the 10 most similar images. The process can be divided into the following three parts:
1. Extract spatial information features. Fabric images usually contain repetitive patterns, which carry a large amount of spatial information. Hence, the spatial information feature is selected as the dominant feature. Given that GIST is efficient and performs well in extracting global features, we use the GIST descriptor to extract spatial information from fabric images.
2. Extract interest points by SURF. Spatial information features are easily disturbed by affine changes such as rotation, and some of the images taken by hand are rotated, so the accuracy of fabric image retrieval is unstable if only the GIST descriptor is used. Therefore, affine invariant SURF features are extracted to cover the deficiency of GIST.
3. Similarity matching. After the GIST and SURF features are obtained, the similarity for each feature is computed from the Euclidean distance between the query fabric image and each image in the database. The overall similarity is the weighted sum of the normalized distances.
The process of image retrieval.

Extracting spatial information
The repetitive patterns of fabric images contain rich spatial information between pixels. The GIST feature is a biologically inspired feature that simulates human vision, captures contextual information in the fabric image and forms a spatial representation of the fabric image.
Gabor filter group
The GIST feature filters the fabric image through a multi-scale and multi-directional Gabor filter group, and then the filtered image is divided into
The multi-scale and multi-directional Gabor filter group is a multi-channel filtering scheme based on the Gabor filter
Here the parameter m is the number of the scale, n is the number of directions of the Gabor filter group,
GIST feature extraction
The essence of image feature extraction using the Gabor filter is to convolve the image with a set of Gabor wavelet basis functions. Each Gabor function in the set generates a strong response at edges perpendicular to its oscillating direction, so salient features of the image with the corresponding directional frequency information can be detected. Thus, a robust and compact feature of the original input image is obtained. Divide a fabric image
In this paper, a fabric image is divided into a regular 4×4 grid. To compute the GIST feature, we experimentally set m = 4 and n = 8 for the Gabor filter group.
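To make the dimensions concrete, the following is a minimal sketch of a GIST-style descriptor with m = 4 scales, n = 8 directions and a 4×4 grid, which yields 4 × 8 × 16 = 512 values. It is not the implementation used in this paper: the function names, the centre frequencies, the bandwidths and the frequency-domain filter shape are all assumptions chosen for illustration, and the input is assumed to be a square grayscale array.

```python
import numpy as np

def gabor_bank(size, n_scales=4, n_orients=8):
    """Frequency-domain Gabor-like transfer functions, one per scale and direction."""
    fy, fx = np.meshgrid(np.fft.fftfreq(size), np.fft.fftfreq(size), indexing="ij")
    radius = np.hypot(fx, fy)
    angle = np.arctan2(fy, fx)
    filters = []
    for s in range(n_scales):
        centre = 0.3 / (2 ** s)  # assumed centre frequency, halved at each scale
        for o in range(n_orients):
            theta = np.pi * o / n_orients
            # Angular distance to the filter direction, wrapped to [-pi, pi].
            d_theta = np.angle(np.exp(1j * (angle - theta)))
            radial = np.exp(-((radius - centre) ** 2) / (2 * (0.35 * centre) ** 2))
            angular = np.exp(-(d_theta ** 2) / (2 * (np.pi / n_orients) ** 2))
            filters.append(radial * angular)
    return filters

def gist_descriptor(image, grid=4, n_scales=4, n_orients=8):
    """Average the magnitude of each filter response over a grid x grid layout."""
    if image.ndim != 2 or image.shape[0] != image.shape[1]:
        raise ValueError("expected a square grayscale image")
    size = image.shape[0]
    cell = size // grid
    spectrum = np.fft.fft2(image)
    features = []
    for transfer in gabor_bank(size, n_scales, n_orients):
        response = np.abs(np.fft.ifft2(spectrum * transfer))
        cropped = response[:cell * grid, :cell * grid]
        blocks = cropped.reshape(grid, cell, grid, cell).mean(axis=(1, 3))
        features.extend(blocks.ravel())
    return np.asarray(features)

# Synthetic 64x64 image standing in for a fabric image.
image = np.random.default_rng(0).random((64, 64))
descriptor = gist_descriptor(image)
print(descriptor.shape)  # (512,)
```

Filtering in the frequency domain keeps the sketch short; a spatial-domain convolution with explicit Gabor kernels would serve equally well.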
Extracting interest points
Interest points are local representations of image features, which can reflect the local specificity of images. Global features, such as the color feature, the texture feature and the spatial information feature, are vulnerable to environmental interference (for example, lighting, rotation and noise). Unlike global features, interest point features tend to correspond to line intersections in the image and are less affected by changes in light and shade.
Image pre-processing
Fabric patterns are almost always formed by, or appear on, textured surfaces. A variety of texture patterns makes the fabrics more diverse, but also makes feature extraction more difficult. To make the extracted features more accurate, textures should be removed first.
Xu et al. 24 proposed a technique that achieves structure extraction from texture via relative total variation. The technique, as described in equation (8), has been shown to separate structures and textures satisfactorily. In this paper, we use this algorithm to remove textures from the fabric image. The results of fabric image texture removal are shown in Figure 2.
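The equation itself is not reproduced in this text. As a reminder, the relative total variation objective is usually written in the form below; the notation is an assumption taken from the common statement of the method and may differ from the authors' equation (8). Here λ is a smoothing weight, ε is a small constant that avoids division by zero, R(p) is a window centred at pixel p, and g is a Gaussian weighting function. The terms for the y direction are defined analogously.

```latex
\min_{S} \sum_{p} \left( S_p - I_p \right)^2
  + \lambda \left( \frac{\mathcal{D}_x(p)}{\mathcal{L}_x(p) + \varepsilon}
                 + \frac{\mathcal{D}_y(p)}{\mathcal{L}_y(p) + \varepsilon} \right),
\qquad
\mathcal{D}_x(p) = \sum_{q \in R(p)} g_{p,q} \left| (\partial_x S)_q \right|,
\quad
\mathcal{L}_x(p) = \left| \sum_{q \in R(p)} g_{p,q} \, (\partial_x S)_q \right|
```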
The result of removing texture from a fabric image: the left is original and the right is result image.
In equation (8), I is the input, which represents the luminance channel of the fabric image, p indexes the 2D pixels and S denotes the resulting structure image. The data term
Structure extraction from texture via relative total variation is simple yet very effective at making the main structures stand out. However, as different images need different amounts of smoothing, the number of executions has to be adjusted repeatedly. To make this algorithm usable in practical engineering applications, we add automatic convergence to structure extraction from texture via relative total variation as follows
SURF features
In this paper, SURF is used to extract the features of fabric images and construct feature vectors.
SURF is a stable and fast feature extraction algorithm. On the basis of the integral image, it detects local extreme points at different image scales with the fast Hessian detector, and then determines the main direction and the descriptor of each feature point by calculating wavelet responses. The integral image, which allows for fast computation of box type convolution filters, is expressed as
Given a point
The extreme point, obtained by calculating the Hessian matrix determinant of each pixel in the image, is expressed as
As the correspondence search often requires comparison between different scales, extreme points should be found at different scales to gain interest points. Traditionally, scale spaces are usually implemented as an image pyramid, in which an image is subjected to repeated smoothing and subsampling to achieve a higher level. However, the scale space is implemented by up-scaling the filter size here. Meanwhile, the use of the integral image and the box filter enables this procedure to be accomplished at a constant cost.
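The constant cost mentioned above comes from the fact that, once the integral image is built, the sum over any rectangle needs only four look-ups regardless of the rectangle's size. The sketch below illustrates this with plain Python lists; the function names and the zero-padded table layout are illustrative choices, not taken from the paper.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table

def box_sum(table, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
table = integral_image(image)
total = box_sum(table, 0, 0, 4, 4)   # whole image: 136
centre = box_sum(table, 1, 1, 3, 3)  # 6 + 7 + 10 + 11 = 34
```

Enlarging the box filter therefore changes only the four corner coordinates, which is why the scale space can be built by up-scaling the filter rather than by repeatedly smoothing and subsampling the image.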
The Hessian matrix is used to find extreme points as candidate interest points. Furthermore, we use the Haar wavelet to determine the main direction of the interest points. Let s be the scale at which the point is detected; the sampling step and the length of the wavelets are scale dependent and set to be s and 4s respectively. After extracting interest points, the Haar wavelet responses are calculated in horizontal and vertical directions within a circular neighborhood of radius 6s around the point. Finally, horizontal and vertical responses within a window of π/3 are summed to form a local direction vector. The longest local vector is defined as the dominant direction.
To compute the SURF descriptor of each interest point, we take the interest point as the center and construct a square window of size 20s around the point. The square is oriented along the dominant direction of the interest point to preserve rotation invariance. Then the window is divided into 4×4 subregions, so the side length of each subregion is 5s. The horizontal and vertical Haar wavelet responses relative to the dominant direction are recorded as
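The orientation assignment described above can be sketched as follows, assuming the Haar wavelet responses around an interest point have already been computed. The function name, the input format (a list of (dx, dy) pairs) and the number of window positions are hypothetical, and the Gaussian weighting of the responses used in the original SURF formulation is omitted for brevity.

```python
import math

def dominant_orientation(responses, window=math.pi / 3, steps=72):
    """Return the angle of the longest summed response vector over all windows.

    responses: list of (dx, dy) Haar wavelet responses sampled around the point.
    """
    two_pi = 2 * math.pi
    angles = [math.atan2(dy, dx) % two_pi for dx, dy in responses]
    best_length, best_angle = -1.0, 0.0
    for k in range(steps):
        start = two_pi * k / steps
        sum_dx = sum_dy = 0.0
        for (dx, dy), a in zip(responses, angles):
            # Keep responses whose direction falls inside the sliding window.
            if (a - start) % two_pi < window:
                sum_dx += dx
                sum_dy += dy
        length = math.hypot(sum_dx, sum_dy)
        if length > best_length:
            best_length, best_angle = length, math.atan2(sum_dy, sum_dx)
    return best_angle

# Three responses pointing roughly upwards and one outlier pointing right.
samples = [(0.0, 1.0), (0.1, 1.0), (-0.1, 1.0), (1.0, 0.0)]
theta = dominant_orientation(samples)
```

Rotating every response by the same angle shifts the returned direction by that angle, which is the property the descriptor relies on when its sampling window is aligned with the dominant direction.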
To further substantiate the rotation invariance of the SURF descriptor, a rotate-rate index (discussed in the results and analysis section) is introduced. The experimental results verify the applicability and superiority of SURF for rotated fabric images. One of the results is given in Figure 3, which shows the correspondence between the interest points of two fabric images and indicates that SURF features are not affected by the rotation angle of the fabric image.
The corresponding interest points between image 12101B and rotated image 12101B.
Similarity matching
The final step of the retrieval pipeline calculates the distance between features to assess fabric image similarity. Euclidean distance is adopted to compute the distance between the queried fabric image and the images in the database, as described in equation (19). The shorter the distance between two features, the higher the similarity between them.
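The matching step, including the weighted combination of GIST and SURF distances used in this section, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, min-max scaling is assumed as the normalization, and a single SURF distance per database image is assumed to be available already. The default weights of 0.3 for GIST and 0.7 for SURF are the values reported in the results and analysis section.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def min_max(values):
    """Scale distances to [0, 1]; assumed normalization, not given in the text."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def rank_database(gist_dists, surf_dists, w_gist=0.3, w_surf=0.7, top_k=10):
    """Combine normalized distances and return the indices of the closest images."""
    g = min_max(gist_dists)
    s = min_max(surf_dists)
    combined = [w_gist * gi + w_surf * si for gi, si in zip(g, s)]
    order = sorted(range(len(combined)), key=combined.__getitem__)
    return order[:top_k], combined

# Toy database of three images with 2-D stand-ins for GIST vectors.
query_gist = [0.0, 0.0]
db_gist = [[3.0, 4.0], [0.0, 1.0], [1.0, 0.0]]
gist_d = [euclidean(query_gist, g) for g in db_gist]  # [5.0, 1.0, 1.0]
surf_d = [0.9, 0.1, 0.5]  # assumed precomputed per-image SURF distances
top, scores = rank_database(gist_d, surf_d, top_k=2)
```

Normalizing before weighting matters because GIST and SURF distances live on different scales; without it, the weights would not express the intended balance between the two descriptors.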
The similarity of two fabric images, which combines the distance of SURF and GIST, is expressed as
Here,
Experiment
We evaluated the performance of the proposed method on a fabric image database and compared it with state-of-the-art descriptors. For the quantitative evaluation, we first introduce the fabric image dataset we compiled. Then we present the algorithms we compare against, namely CM, GIST, SURF, CM+SURF and CM+GIST, which allow a fair assessment of the image retrieval.
Fabric image retrieval database
The performance of the proposed algorithm is tested on a fabric image database that contains 527 fabric images. All of the images come from fabrics in an actual factory and each image has a unique number. Experiments are conducted in a MATLAB environment on a personal computer with an Intel 1197 MHz processor and 4 GB RAM. Note that the patterns contained in the images include streamlined, ring, leaf and flower patterns and their combinations. These images, based on the diversity of the image elements and the degree of regularity in their arrangement, can be divided into a variety of categories. As shown in Figure 4, the elements of the fabric images become more complex and more arbitrary from panel (a) to panel (f), where the categories of the fabric images are described. The diversity of the image dataset helps to verify the robustness and effectiveness of our method. For each query image, the system returns the 10 most similar images as a result.
Categories of the fabric image database. (a) Streamlined pattern, (b) Ring pattern, (c) Ring and leaf pattern, (d) Flower and leaf pattern, (e) Flower and leaf pattern without distinct borderline and (f) Unstructured pattern.
We use precision and recall to measure the accuracy and robustness of image retrieval, and use rotate-rate to measure the effect of retrieving rotated images. The precision and recall indices are the most commonly used assessment criteria for image retrieval. Precision is the ratio of the number of retrieved similar images to the total number of retrieved images and measures the ability of the method to retrieve only relevant images. Recall is the ratio of the number of retrieved similar images to the total number of similar images in the database, which indicates the ability to retrieve all relevant images. The rotate-rate depicts the ability to retrieve all rotated images that are relevant to the query image in the fabric image dataset. Note that the precision,
Results and analysis
There are multiple visual descriptors for expressing fabric images. Spatial information can accurately detect repetitive patterns, and interest point features can appropriately detect images under rotation. In this fabric image retrieval method, GIST is adopted to extract spatial information, whereas SURF is employed to acquire the interest point feature. To verify the feasibility of the proposed method, a series of experiments were conducted on fabric image retrieval, including comparison with CM, GIST, SURF, CM+SURF and CM+GIST. Because the weights affect the retrieval results, it is necessary to determine the weighting values used to integrate the GIST and SURF descriptors. For this purpose, 100 images are retrieved to find the appropriate weight. The mean retrieval precision, recall and rotate-rate of the 100 query results for different weight values are shown in Figure 5. These factors respectively reflect the accuracy, the robustness and the ability to detect rotated images of the proposed method. As shown in Figure 5, when the weight of the SURF descriptor is equal to or larger than 0.3, the rotate-rate remains unchanged. When the weights of GIST and SURF are 0.3 and 0.7, respectively, recall and precision reach their peaks. Therefore, the optimal weights are 0.3 and 0.7. As GIST is the dominant descriptor for representing the repetitive patterns and global features of images, the performance decreases when the weight of GIST is too small. Hence, recall and precision drop when the weight of GIST is less than 0.3.
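The evaluation indices and the weight search can be sketched as follows. The code is illustrative only: the function names are hypothetical, the rotate-rate is assumed to be recall restricted to the rotated relevant images (following the verbal definition in the experiment section), the two weights are assumed to sum to 1, and the distances are assumed to be normalized to [0, 1] beforehand. For brevity the sweep reports mean precision only; recall and rotate-rate could be averaged in the same way.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved images that are relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for r in retrieved if r in relevant) / len(retrieved)

def recall(retrieved, relevant):
    """Fraction of relevant images that were retrieved."""
    if not relevant:
        return 0.0
    return sum(1 for r in retrieved if r in relevant) / len(relevant)

def rotate_rate(retrieved, rotated_relevant):
    """Assumed definition: recall restricted to the rotated relevant images."""
    return recall(retrieved, rotated_relevant)

def sweep_surf_weight(queries, top_k, steps=10):
    """Mean precision for each SURF weight 0, 1/steps, ..., 1.

    queries: list of (gist_dists, surf_dists, relevant) tuples, with
    distances already normalized; the GIST weight is taken as 1 - w_surf.
    """
    results = []
    for i in range(steps + 1):
        w_surf = i / steps
        scores = []
        for gist_dists, surf_dists, relevant in queries:
            combined = [(1 - w_surf) * g + w_surf * s
                        for g, s in zip(gist_dists, surf_dists)]
            ranked = sorted(range(len(combined)), key=combined.__getitem__)
            scores.append(precision(ranked[:top_k], relevant))
        results.append((w_surf, sum(scores) / len(scores)))
    return results

# Toy query in which only the SURF distances separate the relevant images.
toy = [([0.1, 0.9, 0.2, 0.8], [0.9, 0.1, 0.8, 0.2], {1, 3})]
curve = sweep_surf_weight(toy, top_k=2)
```

In this toy query the mean precision is 0 with GIST alone and 1 with SURF alone; with real data the curve would be expected to peak at an intermediate weight, as Figure 5 reports.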
The mean of retrieval recall, precision and rotate-rate of 100 query results for different weight values.
To display the experimental results intuitively, a fabric image is randomly selected and the retrieval results using the above methods are shown in Figure 6. It is noteworthy that the SURF scheme retrieves all four rotated images successfully. GIST achieves higher precision than the CM and SURF descriptors. However, for CM, GIST, SURF and CM+SURF, some of the retrieved images differ greatly from the query image in terms of color and texture features. In contrast, most of the images retrieved by CM+GIST and our algorithm have texture and color appearance very similar to the query image. Despite relatively high precision and recall, the CM+GIST method is insensitive to rotation and some rotated query images are not identified.
The fabric image retrieval results using six schemes. (a) CM, (b) GIST, (c) SURF, (d) CM+GIST, (e) CM+SURF, and (f) SURF+GIST (the proposed method).
To better illustrate the retrieval precision of our method, 20 images were randomly selected from each category and the comparison was conducted. Figure 7 visualizes the average recall, precision and rotate-rate of our algorithm and the others. As can be seen in the figure, the overall result is consistent with the above sample. The average recall and precision of our method are the highest, followed by CM+GIST. Moreover, the figure shows that SURF, CM+SURF and the proposed method handle rotation in fabric images better. As precision and recall represent the overall effect of image retrieval, the two indices are more important in practical production. The result confirms that the proposed algorithm has the greatest ability to retrieve fabric images that contain rotation, complex texture and repetitive patterns. Therefore, this method is appropriate for fabric image retrieval and meets the requirements of the actual factory.
Average recall, precision and rotate-rate of six methods.
Conclusions and future work
In this paper, a new fabric image retrieval algorithm based on GIST and SURF is proposed. Our main contribution is twofold. First, the proposed method makes full use of the accuracy of GIST and the rotation invariance of SURF, which ensures the robustness of the algorithm and overcomes the drawback of GIST by introducing the SURF descriptor. Second, suitable weighting values are determined to obtain the best image retrieval performance.
Our experiments on the fabric image dataset show that the method achieves higher precision and rotate-rate than existing representative image feature algorithms, such as CM, GIST and SURF, for lacy and embroidered fabric. Although the proposed method alleviates the key challenge in fabric image retrieval, the mismatch due to fabric wrinkles has not been solved. We plan to explore this direction in future work.
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by Industrial Prospective Project of Jiangsu Technology Department under Grant Number BE2017081.
