Unsupervised multi-source domain adaptation for person re-identification via sample weighting

Abstract

The aim of unsupervised domain adaptation (UDA) in person re-identification (re-ID) is to develop a model that can identify the same individual across different cameras in the target domain, using labeled data from the source domain and unlabeled data from the target domain. However, existing UDA person re-ID methods typically assume a single source domain and a single target domain, and seldom consider the scenario of multiple source domains and a single target domain. In the latter scenario, differences in sample size between domains can lead to biased training of the model. To address this, we propose an unsupervised multi-source domain adaptation person re-ID method via sample weighting. Our approach uses multiple source domains to exploit valuable label information and mitigates the inter-domain sample-size imbalance through sample weighting. We also employ an adversarial learning method to align the domains. The experimental results, obtained on four datasets, demonstrate the effectiveness of our proposed method.

Keywords

Person re-identification, unsupervised domain adaptation, sample weighting, unsupervised multi-source domain adaptation

1. Introduction

1.1 Research challenges

Person re-identification (re-ID) is the task of identifying a specific individual across different cameras by retrieving relevant images from a large set of candidate images captured under non-overlapping camera views. It is considered an open-set problem because the identities seen at test time differ from those seen during training. The model therefore has to learn feature representations that remain valid for identities it never encountered in training. Recently, supervised person re-ID methods have made significant progress [1, 2, 3], but learning a person re-ID model that generalizes well to an unlabeled target domain is still very difficult. That is, when tested on an unseen dataset, supervised person re-ID methods suffer from severe degradation in retrieval accuracy. One of the primary causes is the difference in data distribution between the source and target domains, which can be attributed to factors such as body posture, camera perspective, lighting conditions, image quality, surroundings, and occluding objects. Overcoming these challenges [4, 5, 6] with supervised learning requires gathering a substantial amount of training data and manually labeling it for each specific setting, which is time-consuming and arduous. To reduce the labeling burden, various unsupervised domain adaptation (UDA) methods [7, 8, 9, 10, 11, 12, 13] have been developed.

To deal with the person re-ID problem in UDA scenarios, some recent works [14, 15, 16, 17, 18, 19] have focused on transferring knowledge from a labeled source domain to a target domain or on applying clustering algorithms to an unlabeled target domain. When applied to a novel target domain, many UDA person re-ID methods rely on source domain data for pre-training or joint training [20]. Among the many UDA person re-ID methods proposed recently, one of the most effective classes is based on pseudo-labeling strategies [21, 22, 23]. These methods work in two main stages. First, a pre-trained model is obtained by training on the source domain. Next, an iterative pseudo-label prediction and fine-tuning strategy is employed to train the model on the target domain. While pseudo-label-based methods have demonstrated their effectiveness, most of them use only a limited amount of data from a single source domain for pre-training. As a result, the label information available in other labeled datasets goes unused.

Figure 1.

Multi-source domain adaptation scenario. In the figure, $S_{1}$ , $S_{2}$ , and $S_{3}$ on the left represent the three source domains, and $T$ on the right represents a target domain. Multiple source domains can contain more valuable label information.

1.2 Main contributions

In order to effectively utilize a vast amount of available labeled data, we have introduced multi-source domain adaptation into the realm of UDA for person re-identification. Figure 1 shows the multi-source domain adaptation scenario. The multi-source domain adaptation scenario involves the utilization of multiple domain datasets during both the pre-training and fine-tuning stages of the model. This approach leverages both the true labels from the source domains and the pseudo labels generated from the target domain to provide joint supervision. Furthermore, prior methods [24, 25, 26] did not address the issue of imbalanced domain sample sizes when working with multiple domains. Specifically, domains with larger sample sizes have greater weights in the model optimization process. This sample size imbalance can easily cause model bias problems during training. Additionally, this imbalance can easily deteriorate the discrimination effect. To address the issue of imbalanced sample sizes across different domains, we suggest employing a sample weighting strategy to balance out the variations in samples between domains.

In summary, the main contributions of this paper are as follows:

• We propose an unsupervised multi-source domain adaptation person re-ID method based on sample weighting. By applying multi-source domain adaptation to UDA person re-ID, valuable label information is fully utilized.
• We propose a sample weighting method that weights the samples to offset the imbalance in sample size between domains. An adversarial learning method is then used for domain alignment.
• To evaluate the effectiveness of the proposed approach, we carry out a comprehensive set of comparison experiments, ablation experiments, and supplementary experiments on four different datasets: Market1501, DukeMTMC-reID, CUHK03 and MSMT17. Our experimental results demonstrate the effectiveness of the proposed method and show its superior performance compared to other existing methods.

The remainder of this paper is organized as follows. Section 2 introduces the related work. Section 3 first gives an overview of the proposed framework for unsupervised multi-source domain adaptation for person re-ID via sample weighting, and then explains the sample weighting strategy and the loss functions used in this paper. Section 4 presents the implementation details and discusses the experimental results. Section 5 describes the threats to this work and potential solutions. Section 6 concludes this paper and gives an outlook on future work.
2. Related work

2.1 Unsupervised domain adaptation in person re-ID

The current mainstream UDA person re-ID methods are mainly divided into the following categories. One uses a family of generative adversarial networks (GANs) to transfer the style of labeled images from the source domain to the target domain; the transferred images are then used for training. Based on this pipeline, PTGAN [16] uses semantic segmentation techniques to constrain the consistency of human regions during style transfer. Similarly, SPGAN [27] proposes a method which maintains image similarity before and after image translation. Although these methods have achieved some results, their performance is still unsatisfactory.

The other is through the use of pseudo-label methods. Along this line, ACT [22] mitigates label noise generated by clustering algorithms in an asymmetric cooperative teaching framework. MMT [23] utilizes both hard and soft labels by constructing a mutual mean-teacher network. SSG [28] uses both global and local features, and when assigning pseudo-labels, pseudo-label assignments are performed separately for global features and local features. Moreover, DG-Net $++$ [29] extracts and focuses on identity-related information by using a decomposition module. PPLR [30] reduces label noise by leveraging complementary relationships between global and partial features.

Moreover, deep domain adaptation methods have been used in person re-ID to reduce the distance between the source and target domains at the feature level. MMFA [31] reduces the inter-domain gap by using the Maximum Mean Discrepancy (MMD). CAT [32] introduces an adversarial framework to mitigate inter-camera differences by confusing a camera discriminator. ACAN [33] reduces the distribution differences by using only intra-camera labeling information and not inter-camera labeling information.

2.2 Multi-source unsupervised domain adaptation

The UDA methods mentioned above mainly consider a single source domain. They are therefore limited in their adaptation scenarios and are not as practical as multi-source domain adaptation methods. The wide application of multi-source domain adaptation has been illustrated by key studies in [34, 35, 36]. Building on this work, MDAN [37] aligns source and target domains by using domain adversarial networks. M3SDA [38] transfers knowledge learned from multiple labeled source domains to unlabeled target domains by dynamically aligning feature distribution moments. Besides, MDDA [39] not only considers the difference in distance between multiple source and target domains, but also investigates the dissimilarity between source and target domain samples. LtC-MSDA [40] performs class-level alignment by using graph convolutional networks.

In recent years, more and more multi-source domain adaptation methods have been proposed with practical applications in mind. DSBN [41] reduces the inter-domain gap by integrating domain-specific batch normalization (BN) layers. CMSS [42] sets up an adversarial agent through which a dynamic curriculum over the source domain samples is learned. RDSBN [43] reduces the inter-domain gap from both a domain-invariant view and a multi-domain fusion view to make full use of the labeled data.

2.3 Sample weighting

When there is an imbalance in the sample size between domains, it can result in model bias and negative transfer during training. To address this issue, several methods have been proposed. DWL [44] dynamically weights the learning loss of alignment and discrimination by taking into account the degree of alignment and discrimination, while also ensuring the balance of information between domains by weighting the samples. DSW [45] balances positive and negative samples by spatial location and confidence level, respectively. In addition, DRMN [46] uses a pairwise re-weighting mechanism to address the domain discrepancy and mis-matching problem between source and target domains in multi-source domain adaptation. DWDA [47] performs sample weighting on source and target domain samples separately by k-means clustering algorithm. TIT [48] weights the samples by increasing the weights of pivot samples and decreasing the weights of outlier samples.

To summarize, research on UDA person re-ID is at present mostly carried out in the single-source, single-target scenario, and only rarely in the multi-source, single-target scenario. Therefore, inspired by multi-source UDA, we propose unsupervised multi-source domain adaptation for person re-ID. Although previous work [43] introduced the concept of multiple sources to the field of person re-ID for the first time, this method did not consider the imbalance in sample size between domains. To solve this problem, we propose to weight the samples in each domain as a way to balance the difference in sample size between domains. In summary, in this work, we propose an unsupervised multi-source domain adaptation method for person re-ID via sample weighting.

3. Methodology

3.1 Overview

In the scenario of unsupervised multi-source domain adaptation, we are given multiple source domains $S=\{S_{1},S_{2},\dots,S_{K}\}$ and a target domain $T$. The $k$-th source domain contains the labeled samples $S_{k}=\{(x_{1}^{S_{k}},y_{1}^{S_{k}}),\dots,(x_{n_{S_{k}}}^{S_{k}},y_{n_{S_{k}}}^{S_{k}})\}$, and the target domain contains the unlabeled samples $T=\{x_{1}^{T},\dots,x_{n_{T}}^{T}\}$, where $K$ is the number of source domains, $k=1,2,\dots,K$, $n_{S_{k}}$ is the sample size of the $k$-th source domain, and $n_{T}$ is the sample size of the target domain. In this paper, we aim to develop a person re-ID model using labeled samples from the source domains and unlabeled samples from the target domain in the scenario of unsupervised multi-source domain adaptation. The ultimate objective is to achieve good generalization performance of the model in the target domain. To simplify the derivation of the model and algorithm, the relevant variables and symbols used in this paper are defined in Table 1.

Table 1
Definition of symbols involved in this paper

Notation	Meaning
$S$	The collection of source domains
$S_{k}$	The $k$ -th source domain
$T$	The target domain
$x_{n_{S_{k}}}^{S_{k}}$	The $n$ -th sample from the $k$ -th source domain
$y_{n_{S_{k}}}^{S_{k}}$	The label of the $n$ -th sample from the $k$ -th source domain
$x_{n}^{T}$	The $n$ -th sample from the target domain
$K$	The number of the source domains
$n_{S_{k}}$	The number of samples from the $k$ -th source domain
$n_{T}$	The number of samples from the target domain
$\widetilde{x}_{i}^{S_{k}}$ , $\widetilde{x}_{j}^{T}$	The weighted samples of the source and target domains, respectively
$\alpha_{k}^{\prime}$	The weight of the $k$ -th source domain samples
$\alpha_{K+1}^{\prime}$	The weight of the target domain samples
$G$	The feature extractor
$D$	The model discriminator
$C$	The model classifier
$G(\tilde{x}_{i}^{S_{k}})$	The features of the weighted source domain samples
$G(\tilde{x}_{i}^{T})$	The features of the weighted target domain samples
$f_{i}$	The anchor sample feature
$f_{i,p}$	The most difficult positive sample feature
$f_{i,n}$	The most difficult negative sample feature

To avoid model bias caused by the imbalance in the number of samples between different domains during model training, we first weight the samples of multiple source domains and a single target domain, as shown in Fig. 2. Next, the feature extractor $G$ is responsible for extracting the domain-invariant features of the weighted samples. Many popular deep networks can serve as the feature extractor, such as LeNet [49], AlexNet [50] and ResNet [51]; we use ResNet-101 as the backbone feature extraction network in this paper. Following most prior work [21, 22], we use the DBSCAN [52] clustering algorithm as a pseudo-label generator in this paper. In each epoch, all target domain sample features are fed into the pseudo-label generator, and the clustering results output by the pseudo-label generator are used as pseudo-labels.

Figure 2.

The overall framework of the proposed methodology. Firstly, sample weighting is performed on the input samples. Secondly, the feature extractor $G$ extracts the sample-weighted features. Thirdly, the DBSCAN clustering algorithm is used as a pseudo-label generator. Finally, the domain alignment loss and classification loss are calculated by discriminator $D$ and classifier $C$ , respectively.

3.2 Sample weighting of source and target domains

In unsupervised multi-source domain adaptation for person re-ID, an imbalance in sample size between domains can in general lead to model bias and poor alignment and discrimination effects, resulting in negative transfer. To address this issue, we propose a sample weighting method in this paper. Specifically, we assign weights to all samples from all domains, where the weight for each domain is inversely proportional to the proportion of samples from that domain to the total sample size of all domains. This weighting scheme aims to achieve a more balanced representation of samples from all domains during training. Formally, our proposed sample weighting method is as follows:

$\displaystyle\alpha_{k}=\frac{\sum_{i=1}^{K}n_{S_{i}}+n_{T}}{n_{S_{k}}}$ (1)

$\displaystyle\alpha_{K+1}=\frac{\sum_{i=1}^{K}n_{S_{i}}+n_{T}}{n_{T}}$ (2)

Normalization:

$\displaystyle\alpha_{k}^{\prime}=\frac{\alpha_{k}}{\sum_{i=1}^{K}\alpha_{i}+\alpha_{K+1}}$ (3)

$\displaystyle\alpha_{K+1}^{\prime}=\frac{\alpha_{K+1}}{\sum_{i=1}^{K}\alpha_{i}+\alpha_{K+1}}$ (4)

Weighted sample:

$\displaystyle\widetilde{x}_{i}^{S_{k}}=\alpha_{k}^{\prime}x_{i}^{S_{k}}$ (5)

$\displaystyle\widetilde{x}_{j}^{T}=\alpha_{K+1}^{\prime}x_{j}^{T}$ (6)

where $\widetilde{x}_{i}^{S_{k}}$ and $\widetilde{x}_{j}^{T}$ are the weighted samples, $\alpha_{k}^{\prime}$ is the weight of the $k$ -th source domain samples after normalization, $\alpha_{K+1}^{\prime}$ is the weight of the normalized target domain samples, $x_{i}^{S_{k}}$ represents the $i$ -th sample of the $k$ -th source domain, $x_{j}^{T}$ represents the $j$ -th sample of the target domain, $i=1,2,\dots,n_{S_{k}}$ , $j=1,2,\dots,n_{T}$ .
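As an illustration of Eqs (1) to (4), the following minimal Python sketch computes the normalized domain weights from the per-domain sample counts. The function name `domain_weights` is hypothetical and not part of the original method description; the example call uses the Market and MSMT training-set sizes quoted in Section 4.1 for the Market $\to$ MSMT task.

```python
def domain_weights(source_sizes, target_size):
    """Return [alpha'_1, ..., alpha'_K, alpha'_{K+1}] following Eqs. (1)-(4).

    source_sizes: list with the sample count n_{S_k} of each source domain.
    target_size:  sample count n_T of the target domain.
    """
    sizes = list(source_sizes) + [target_size]
    total = sum(sizes)
    # Eqs. (1)-(2): raw weight = total sample count / the domain's own count.
    raw = [total / n for n in sizes]
    # Eqs. (3)-(4): normalize so that the K+1 weights sum to one.
    norm = sum(raw)
    return [a / norm for a in raw]


# Illustrative call: Market training set (12936 images) as the single source,
# MSMT training set (32621 images) as the target.
weights = domain_weights([12936], 32621)
```

Because the total sample count cancels during normalization, each normalized weight is proportional to the reciprocal of its domain's sample count, so the smaller domain (here Market) receives the larger weight.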

3.3 Domain alignment

Adversarial domain adaptation [53, 54, 55] is based on the concept of generative adversarial networks (GANs) [56], which consists of a generator and a discriminator. In adversarial domain adaptation, the generator is a feature extractor that learns domain-invariant features from both the source and target domains. The discriminator, on the other hand, is responsible for distinguishing whether the samples come from the source domain or the target domain. During model training, the feature extractor and discriminator are trained in an adversarial manner. When a feature cannot be distinguished by the discriminator from the source domain or the target domain, the feature is domain-invariant. Our goal is to train a feature extractor to learn these domain-invariant features. In this paper, we use $G$ for feature extractor and $D$ for discriminator.

To obtain domain-invariant feature representations, the weighted samples $\tilde{x}^{S}$ and $\tilde{x}^{T}$ are input to the feature extractor $G$. Formally, our adversarial domain adaptation objective is defined as follows:

$\displaystyle\min_{G}\max_{D}\mathcal{L}_{\textit{adv}}=\mathbb{E}_{\widetilde{x}_{i}^{S_{k}}\sim S_{k}}\log[D(G(\tilde{x}_{i}^{S_{k}}))]+\mathbb{E}_{\widetilde{x}_{j}^{T}\sim T}\log[1-D(G(\tilde{x}_{j}^{T}))]$ (7)

where $G(\tilde{x}_{i}^{S_{k}})$ and $G(\tilde{x}_{j}^{T})$ are the weighted features of the source domain samples and the target domain samples extracted by the feature extractor, respectively.
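To make the roles of the two players in Eq. (7) concrete, the sketch below evaluates an empirical estimate of $\mathcal{L}_{\textit{adv}}$ from discriminator outputs. It is an illustration under stated assumptions, not the authors' implementation: the function name `adversarial_value` is hypothetical, the discriminator outputs are assumed to be probabilities in (0, 1) of a sample coming from a source domain, and the small constant `eps` is added only for numerical safety.

```python
import math


def adversarial_value(d_source, d_target, eps=1e-12):
    """Mini-batch estimate of the objective in Eq. (7).

    d_source: values D(G(x)) for weighted source samples.
    d_target: values D(G(x)) for weighted target samples.
    """
    src_term = sum(math.log(p + eps) for p in d_source) / len(d_source)
    tgt_term = sum(math.log(1.0 - p + eps) for p in d_target) / len(d_target)
    return src_term + tgt_term


# A confused discriminator (0.5 everywhere) versus a confident one.
confused = adversarial_value([0.5, 0.5], [0.5, 0.5])
separated = adversarial_value([0.99, 0.98], [0.01, 0.02])
```

The discriminator $D$ tries to push this value up, towards 0, by separating the domains, while the feature extractor $G$ tries to push it down; when $D$ outputs 0.5 for every sample, the value equals $2\log 0.5$.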

3.4 Classification loss

Since this paper studies person re-ID in UDA scenarios, the target domain data is unlabeled, and labeling it would be time-consuming and laborious. In order to make full use of the data information in the target domain, we adopt a pseudo-label generation strategy. In this paper, we apply the DBSCAN [52] clustering algorithm as a pseudo-label generator to generate pseudo-labels. In each epoch, the target domain features are input to the pseudo-label generator, which outputs the clustering results.

During this stage, we leverage the labeled source domain data and the pseudo-labeled target domain data to effectively utilize the domain-invariant feature representations and train the final model in a fully supervised manner. The model architecture incorporates two classification losses: the first one is the classification loss of the source domain with the true labels, while the second one is the classification loss of the target domain with pseudo-labels. The total classification loss is the sum of the two, in the following form:

$\displaystyle\mathcal{L}_{\textit{cls}}=\mathcal{L}_{\textit{cls}}^{S}+\mathcal{L}_{\textit{cls}}^{T}$ (8)

where

$\displaystyle\mathcal{L}_{\textit{cls}}^{S}=-\sum_{k=1}^{K}\sum_{i=1}^{n_{S_{k}}}y_{i}^{S_{k}}\log(C(G(\tilde{x}_{i}^{S_{k}})))$ (9)

$\displaystyle\mathcal{L}_{\textit{cls}}^{T}=-\sum_{i=1}^{n_{T}}\tilde{y}_{i}^{T}\log(C(G(\tilde{x}_{i}^{T})))$ (10)

In Eqs (9) and (10), $\mathcal{L}_{\textit{cls}}^{S}$ and $\mathcal{L}_{\textit{cls}}^{T}$ are cross-entropy losses, $C$ is the classifier, $y_{i}^{S_{k}}$ is the true label of the $i$-th sample of the $k$-th source domain, and $\tilde{y}_{i}^{T}$ is the pseudo-label that the pseudo-label generator assigns to the $i$-th target domain sample based on its features.
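The combined classification loss of Eqs (8) to (10) can be sketched in plain Python as follows. This is a minimal illustration under assumptions that go beyond the text: the classifier is assumed to output raw scores (logits) that are turned into probabilities by a softmax, labels are assumed to be integer class indices (equivalent to one-hot vectors in the equations), and any target samples left unclustered by the pseudo-label generator are assumed to have been filtered out beforehand. All function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(logits_batch, labels):
    """Sum over samples of -log p(label): Eqs. (9)/(10) with one-hot labels."""
    loss = 0.0
    for logits, label in zip(logits_batch, labels):
        loss -= math.log(softmax(logits)[label])
    return loss


def classification_loss(src_logits, src_labels, tgt_logits, tgt_pseudo_labels):
    """Eq. (8): source loss with true labels plus target loss with pseudo-labels."""
    return (cross_entropy(src_logits, src_labels)
            + cross_entropy(tgt_logits, tgt_pseudo_labels))
```

With uniform scores over two classes, each sample contributes $\log 2$ to the loss, which gives a quick sanity check of the implementation.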

3.5 Triplet loss

To enhance the discriminative power of our person re-ID model, we propose a hard batch triplet selection strategy. This approach selects the most difficult positive and negative samples for each anchor sample, which brings samples with the same person identity closer and separates samples with different person identities. Our model framework includes two triplet losses: one for the source domain sample and one for the target domain sample. The total triplet loss is the sum of the source and target domain triplet losses in the following form:

$\displaystyle\mathcal{L}_{\textit{tri}}=\mathcal{L}_{\textit{tri}}^{S}+\mathcal{L}_{\textit{tri}}^{T}$ (11)

Inspired by PPLR [30], the definition of triplet loss is as follows:

$\displaystyle\mathcal{L}_{\textit{tri}}^{S}=-\sum_{k=1}^{K}\sum_{i=1}^{n_{S_{k}}}\log\left(\frac{e^{\|f_{i}^{S_{k}}-f_{i,n}^{S_{k}}\|}}{e^{\|f_{i}^{S_{k}}-f_{i,p}^{S_{k}}\|}+e^{\|f_{i}^{S_{k}}-f_{i,n}^{S_{k}}\|}}\right)$ (12)

$\displaystyle\mathcal{L}_{\textit{tri}}^{T}=-\sum_{i=1}^{n_{T}}\log\left(\frac{e^{\|f_{i}^{T}-f_{i,n}^{T}\|}}{e^{\|f_{i}^{T}-f_{i,p}^{T}\|}+e^{\|f_{i}^{T}-f_{i,n}^{T}\|}}\right)$ (13)

where $\|\cdot\|$ refers to the $L_{2}$ norm, and $f_{i}$, $f_{i,p}$, and $f_{i,n}$ represent the anchor sample feature, the most difficult positive sample feature and the most difficult negative sample feature, respectively, obtained by using the hard batch triplet selection strategy in a mini-batch. The most difficult positive sample refers to the positive sample farthest from the anchor sample, and the most difficult negative sample refers to the negative sample closest to the anchor sample.
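The hard batch selection and the loss form of Eqs (12) and (13) can be sketched for a single mini-batch as follows. This is an illustrative pure-Python version, not the authors' code: the function names are hypothetical, features are plain lists of floats, and anchors without a positive or a negative in the batch are skipped, which is an assumption the text does not spell out.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def softmax_triplet_loss(features, labels):
    """Hard-batch softmax-triplet loss over one mini-batch."""
    loss = 0.0
    for i, (f_i, y_i) in enumerate(zip(features, labels)):
        pos = [l2_distance(f_i, f)
               for j, (f, y) in enumerate(zip(features, labels))
               if y == y_i and j != i]
        neg = [l2_distance(f_i, f)
               for f, y in zip(features, labels) if y != y_i]
        if not pos or not neg:
            continue  # assumption: skip anchors with no valid triplet
        d_pos = max(pos)  # hardest positive: farthest same-identity sample
        d_neg = min(neg)  # hardest negative: closest other-identity sample
        # -log(e^{d_neg} / (e^{d_pos} + e^{d_neg})) = log(1 + e^{d_pos - d_neg})
        loss += math.log1p(math.exp(d_pos - d_neg))
    return loss
```

The rewritten form $\log(1+e^{d_{p}-d_{n}})$ shows why the loss shrinks as the hardest negative moves farther from the anchor than the hardest positive.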

3.6 Overall training objectives

The final loss function of the proposed method is shown in Eq. (14):

$\displaystyle\mathcal{L}=\mathcal{L}_{\textit{cls}}+\beta\mathcal{L}_{\textit{adv}}+\lambda\mathcal{L}_{\textit{tri}}$ (14)

where $\beta$ and $\lambda$ are balancing hyperparameters. $\beta$ controls the degree of domain alignment and $\beta>0$ . We set $\lambda=1$ in our experiments. $\mathcal{L}_{\textit{cls}}$ is the classification loss which can be calculated by Eq. (8) and includes the classification loss of the source domain and the classification loss of the target domain, $\mathcal{L}_{\textit{tri}}$ is the softmax-triplet loss and $\mathcal{L}_{\textit{adv}}$ is the adversarial loss for domain alignment. For clarity, we describe the main steps of the method in Algorithm 3.6.

Algorithm 3.6 Unsupervised Multi-source Domain Adaptation for Person Re-ID via Sample Weighting

Input: Source domain data $\{(x^{S_{k}},y^{S_{k}})\}_{k=1}^{K}$; target domain data $x^{T}$; hyperparameters $\beta$ and $\lambda$; max_epoch.

Output: The predicted label of the unlabeled target domain data $y^{T}$.

The input samples are weighted: $\widetilde{x}_{i}^{S_{k}}\leftarrow\alpha_{k}^{\prime}x_{i}^{S_{k}}$ and $\widetilde{x}_{j}^{T}\leftarrow\alpha_{K+1}^{\prime}x_{j}^{T}$.

while not converged and epoch $<$ max_epoch do

    Apply DBSCAN clustering algorithm to generate pseudo-labels $\{\tilde{y}_{i}^{T}\}_{i=1}^{n_{T}}$.

    Calculate $\mathcal{L}_{\textit{adv}}\leftarrow\mathbb{E}_{\widetilde{x}_{i}^{S_{k}}\sim S_{k}}\log[D(G(\tilde{x}_{i}^{S_{k}}))]+\mathbb{E}_{\widetilde{x}_{j}^{T}\sim T}\log[1-D(G(\tilde{x}_{j}^{T}))]$.

    Calculate $\mathcal{L}_{\textit{cls}}^{S}\leftarrow-\sum_{k=1}^{K}\sum_{i=1}^{n_{S_{k}}}y_{i}^{S_{k}}\log(C(G(\tilde{x}_{i}^{S_{k}})))$ and $\mathcal{L}_{\textit{cls}}^{T}\leftarrow-\sum_{i=1}^{n_{T}}\tilde{y}_{i}^{T}\log(C(G(\tilde{x}_{i}^{T})))$.

    Calculate $\mathcal{L}_{\textit{tri}}^{S}\leftarrow-\sum_{k=1}^{K}\sum_{i=1}^{n_{S_{k}}}\log\left(\frac{e^{\|f_{i}^{S_{k}}-f_{i,n}^{S_{k}}\|}}{e^{\|f_{i}^{S_{k}}-f_{i,p}^{S_{k}}\|}+e^{\|f_{i}^{S_{k}}-f_{i,n}^{S_{k}}\|}}\right)$ and $\mathcal{L}_{\textit{tri}}^{T}\leftarrow-\sum_{i=1}^{n_{T}}\log\left(\frac{e^{\|f_{i}^{T}-f_{i,n}^{T}\|}}{e^{\|f_{i}^{T}-f_{i,p}^{T}\|}+e^{\|f_{i}^{T}-f_{i,n}^{T}\|}}\right)$.

    Calculate $\mathcal{L}\leftarrow\mathcal{L}_{\textit{cls}}+\beta\mathcal{L}_{\textit{adv}}+\lambda\mathcal{L}_{\textit{tri}}$.

    Update $G$, $C$ and $D$.

end while

4. Experiments

4.1 Datasets and evaluation schemes

We conducted experiments to evaluate the effectiveness of our proposed method on four popular person re-ID datasets: Market1501 (Market) [57], DukeMTMC-reID (Duke) [58], CUHK03 (CUHK) [59] and MSMT17 (MSMT) [15]. A dataset represents a domain. Examples of person images from the four datasets are shown in Fig. 3.

Market consists of 32668 images of 1501 person identities from 6 non-overlapping camera views. The dataset is split into a training set, which includes 12936 images of 751 person identities, and a test set, which includes 19732 images of 750 person identities.

Duke contains 36411 images of 1404 person identities captured by 8 cameras. The dataset is split into a training set, which has 16522 images of 702 identities, and a test set, which has 19889 images of the other 702 identities.

Figure 3.

Examples of persons from different datasets: (a) Market1501, (b) DukeMTMC-reID, (c) CUHK03, (d) MSMT17.

CUHK contains 14096 images of 1467 individuals captured from 6 camera views. The dataset is split into a training set of 767 identities and a test set of 700 identities.

MSMT contains 126441 images of 4101 persons from 15 camera views. The dataset is split into a training set of 32621 images with 1041 person identities and a test set of 93820 images with 3060 person identities. It is more challenging than the other three datasets mentioned above.

We use mean average precision (mAP) and Rank-1 (R-1), Rank-5 (R-5), Rank-10 (R-10) accuracies in cumulative matching characteristic (CMC) to assess the effectiveness of our method.
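For reference, the two metrics can be computed from ranked retrieval results as in the following sketch. It assumes that, for each query, the gallery has already been sorted by ascending feature distance and reduced to a list of booleans marking identity matches, and that any protocol-specific gallery filtering has been applied beforehand. The function names are hypothetical.

```python
def average_precision(ranked_matches):
    """AP for one query.

    ranked_matches[i] is True if the i-th ranked gallery image
    shares the query identity.
    """
    hits, precision_sum = 0, 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def evaluate(all_ranked_matches, ranks=(1, 5, 10)):
    """Return (mAP, {k: Rank-k accuracy}) over a list of per-query match lists."""
    n = len(all_ranked_matches)
    mean_ap = sum(average_precision(m) for m in all_ranked_matches) / n
    # Rank-k counts a query as correct if any of its top-k results is a match.
    cmc = {k: sum(any(m[:k]) for m in all_ranked_matches) / n for k in ranks}
    return mean_ap, cmc


# Two toy queries over a three-image gallery.
mean_ap, cmc = evaluate([[True, False, True], [False, True, False]], ranks=(1, 2))
```

Rank-k only asks whether a correct match appears among the top k results, whereas mAP also rewards ranking all correct matches early, which is why the two are reported together.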

Table 2

Experimental results of different quantities of source domain datasets on the target domain Market dataset

Source(s)	mAP	R-1	R-5	R-10
Duke	81.2	91.9	96.7	97.5
Duke $+$ CUHK	83.6	93.9	97.2	98.1
Duke $+$ CUHK $+$ MSMT	85.5	94.3	97.8	98.3

4.2 Implementation details

We utilize the ResNet-101 [51] architecture, which was pre-trained on ImageNet [60], as the backbone network. The experiment is conducted on two NVIDIA GeForce RTX3090 GPUs. We follow the experimental settings of MMT [23]. Specifically, we use the Adam optimizer [61] with a weight decay of 0.0005 to optimize the network. The input image data is resized to 256 $\times$ 128 and we apply data augmentation techniques such as random flipping, cropping, and erasing. The mini-batch size is set to 8, and the initial learning rate is set to 0.00035. The learning rate is reduced to 1/10 of its previous value on the 40th and 70th epochs of a total of 80 epochs. We use DBSCAN [52] as the clustering algorithm.
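The step learning-rate schedule described above can be written as a small function. This sketch assumes that epochs are numbered from 1 and that each decay takes effect at the milestone epoch itself; the text does not state the indexing convention, and the function name is hypothetical.

```python
def learning_rate(epoch, base_lr=0.00035, milestones=(40, 70), gamma=0.1):
    """Step schedule: multiply the rate by gamma at each milestone epoch.

    Assumption: epochs are 1-indexed and the drop applies from the
    milestone epoch onwards.
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops


# Learning rate at the first epoch, after the first drop, and at the last epoch.
schedule = [learning_rate(e) for e in (1, 40, 80)]
```

Under these assumptions the rate is 0.00035 for epochs 1 to 39, 0.000035 for epochs 40 to 69, and 0.0000035 for epochs 70 to 80.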

Table 3
Experimental results of different quantities of source domain datasets using the sample weighting method on the target domain Market dataset

Source(s)	mAP	R-1	R-5	R-10
Duke (w/o)	81.2	91.9	96.7	97.5
Duke (w)	81.9	93.1	97.6	98.7
Duke $+$ CUHK (w/o)	83.6	93.9	97.2	98.1
Duke $+$ CUHK (w)	84.5	94.6	97.9	98.7
Duke $+$ CUHK $+$ MSMT (w/o)	85.5	94.3	97.8	98.3
Duke $+$ CUHK $+$ MSMT (w)	86.4	95.3	98.5	99.1

4.3 Ablation study

4.3.1 The necessity of multi-source domains

In order to verify the necessity of multi-source domains, without using the sample weighting algorithm, we conduct the following experiments: Duke $\to$ Market, Duke $+$ CUHK $\to$ Market and Duke $+$ CUHK $+$ MSMT $\to$ Market. In these experiments, Duke, CUHK and MSMT are used as the source domain datasets and Market is used as the target domain dataset; the experimental results are shown in Table 2. The comparison shows that using multiple source domain datasets performs better than using a single source domain dataset. The mAP on the Duke $+$ CUHK $\to$ Market task is 2.4% higher than on the Duke $\to$ Market task. The mAP on the Duke $+$ CUHK $+$ MSMT $\to$ Market task is 4.3% higher than on the Duke $\to$ Market task and 1.9% higher than on the Duke $+$ CUHK $\to$ Market task. Therefore, introducing multiple source domains is shown to be effective.

Table 4
Comparison with the state-of-the-art methods on Duke $\to$ Market and Market $\to$ Duke tasks

Methods	Duke $\to$ Market				Market $\to$ Duke
	mAP	R-1	R-5	R-10	mAP	R-1	R-5	R-10
ECN $++$ [62]	63.8	84.1	92.8	95.4	54.4	74.0	83.7	87.4
MMCL [63]	60.4	84.4	92.8	95.0	51.4	72.4	82.9	85.0
SNR [64]	61.7	82.8	–	–	58.1	76.3	–	–
NRMT [65]	71.7	87.8	94.6	96.5	62.2	77.8	86.9	89.5
MMT [23]	74.6	88.4	96.2	97.8	61.0	75.1	87.3	91.2
UNRN [66]	78.1	91.9	96.1	97.8	69.1	82.0	90.7	93.5
1-NNCT [60]	65.3	88.0	94.3	96.3	52.7	73.6	82.9	86.0
MMT $+$ SG [67]	70.5	88.1	–	–	64.8	78.5	–	–
SSKD [68]	78.7	91.7	–	–	67.2	80.2	–	–
GCL [69]	75.4	90.5	96.2	97.1	67.6	81.9	88.9	90.6
Dual-Refinement [70]	78.0	90.9	96.4	97.7	67.7	82.1	90.1	92.5
SECRET [71]	79.8	92.3	–	–	67.1	80.3	–	–
HGA [72]	70.3	89.5	93.6	95.5	67.1	80.4	88.7	90.3
MDL [73]	73.4	88.9	95.5	97.6	65.4	79.3	89.2	92.9
RDSBN [43]	81.5	92.9	97.6	98.4	66.6	80.3	89.1	92.6
Ours	81.9	93.4	97.6	98.7	66.9	81.0	90.7	93.0

Table 5

Comparison with the state-of-the-art methods on Market $\to$ MSMT and Duke $\to$ MSMT tasks

Methods	Market $\to$ MSMT				Duke $\to$ MSMT
	mAP	R-1	R-5	R-10	mAP	R-1	R-5	R-10
ECN $++$ [62]	15.2	40.4	53.1	58.7	16.0	42.5	55.9	61.5
MMCL [63]	15.1	40.8	51.8	56.7	16.2	43.6	54.3	58.9
MMT [23]	25.7	51.9	65.3	70.9	28.1	56.1	68.9	74.3
UNRN [66]	25.3	52.4	64.7	69.7	26.2	54.9	67.3	70.6
MLOL [74]	21.7	46.9	59.4	64.7	22.4	48.3	60.7	66.1
MMT $+$ SG [67]	23.5	50.2	–	–	27.5	56.1	–	–
NRLE-Net [75]	28.3	57.9	68.1	73.2	29.3	62.1	71.7	76.9
GCL [69]	27.0	51.1	63.9	69.9	29.7	54.4	68.2	74.2
Dual-Refinement [70]	25.1	53.3	66.1	71.5	26.9	55.0	68.4	73.2
HGA [72]	25.5	55.1	61.2	65.5	26.8	58.6	64.7	69.2
MDL [73]	20.6	45.5	59.6	66.3	23.1	50.0	62.5	69.2
RDSBN [43]	30.9	61.2	73.1	77.4	33.6	64.0	75.6	79.6
Ours	31.3	62.0	72.9	78.1	33.9	64.7	76.0	80.2

4.3.2 The necessity of sample weighting

To assess the impact of the sample weighting module on the performance of our model, we extend the previous experiments by applying sample weighting on the Duke $\to$ Market, Duke $+$ CUHK $\to$ Market and Duke $+$ CUHK $+$ MSMT $\to$ Market tasks, where ‘w’ indicates that the sample weighting method is used and ‘w/o’ means that it is not used. The experimental results are shown in Table 3. The comparison shows that sample weighting improves the results regardless of the number of source domain datasets. With sample weighting, the mAP increases by 0.7% on the Duke $\to$ Market task, by 0.9% on the Duke $+$ CUHK $\to$ Market task, and by 0.9% on the Duke $+$ CUHK $+$ MSMT $\to$ Market task. Therefore, the introduction of the sample weighting method is shown to be effective and necessary.

Table 6
Comparison with the state-of-the-art methods on Duke $+$ CUHK $+$ MSMT $\to$ Market task

Methods	Duke $+$ CUHK $+$ MSMT $\to$ Market
	mAP	R-1	R-5	R-10
MMT $+$ GRL [76]	77.1	90.4	96.8	97.8
MMT $+$ MomentMatching [38]	78.1	91.2	96.8	98.0
MMT $+$ DSBN [41]	81.1	92.8	97.3	98.5
RDSBN [43]	86.0	94.8	97.9	98.6
Ours	86.4	95.3	98.5	99.1

Table 7
Comparison with the state-of-the-art methods on Market $+$ CUHK $+$ MSMT $\to$ Duke task

Methods	Market $+$ CUHK $+$ MSMT $\to$ Duke
	mAP	R-1	R-5	R-10
MMT $+$ GRL [76]	64.3	77.6	88.1	91.3
MMT $+$ MomentMatching [38]	64.2	77.6	87.9	91.4
MMT $+$ DSBN [41]	65.6	79.6	89.1	92.1
RDSBN [43]	68.9	82.1	90.4	93.0
Ours	69.7	82.8	91.0	93.9

4.4 Comparison with the state-of-the-art methods

To demonstrate the effectiveness of our proposed method, we conduct a comparison with state-of-the-art unsupervised domain adaptive person re-ID methods on four datasets: Duke, CUHK, MSMT and Market. We evaluate our method on the following domain adaptation tasks: Duke $\to$ Market, Market $\to$ Duke, Market $\to$ MSMT, Duke $\to$ MSMT, Duke $+$ CUHK $+$ MSMT $\to$ Market and Market $+$ CUHK $+$ MSMT $\to$ Duke. All experimental results are shown in Tables 4 to 7.
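For readers less familiar with the reported metrics, the following minimal sketch shows how mAP and Rank-k accuracy can be computed from a query-by-gallery distance matrix. It is a simplified illustration: the standard re-ID protocol additionally excludes gallery images that share both identity and camera with the query, which is omitted here, and the function names and toy data are assumptions.

```python
def average_precision(ranked_matches):
    """AP for one query; ranked_matches[i] is True if the i-th ranked gallery item is a correct match."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def evaluate(dist, query_ids, gallery_ids, ks=(1, 5, 10)):
    """Return (mAP, {k: Rank-k accuracy}) for a query-by-gallery distance matrix."""
    aps = []
    rank_hits = {k: 0 for k in ks}
    for q, row in enumerate(dist):
        # Sort gallery indices from the smallest to the largest distance.
        order = sorted(range(len(row)), key=row.__getitem__)
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        aps.append(average_precision(matches))
        for k in ks:
            # Rank-k counts a query as correct if any of its top-k results matches.
            rank_hits[k] += any(matches[:k])
    n = len(dist)
    return sum(aps) / n, {k: rank_hits[k] / n for k in ks}


# Toy example: two queries and three gallery images (identities are hypothetical).
dist = [[0.1, 0.9, 0.4],
        [0.8, 0.3, 0.2]]
mean_ap, cmc = evaluate(dist, query_ids=[7, 9], gallery_ids=[7, 9, 7])
print(mean_ap, cmc)
```

In the toy example the first query retrieves both of its matches at the top of the ranking, while the second query finds its only match at rank 2, which lowers both its average precision and the Rank-1 accuracy.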

Figure 4.

Parameter sensitivity analysis. (a) Experiment of different values of $\beta$ on the Duke $\to$ Market task. (b) Experiment of different values of $\beta$ on the Duke $\to$ MSMT task. (c) Experiment of different values of $\beta$ on the Duke $+$ CUHK $+$ MSMT $\to$ Market task. (d) Experiment of different batch sizes on the Duke $+$ CUHK $+$ MSMT $\to$ Market task.

Firstly, we carry out UDA person re-ID experiments in the scenario of a single source domain and a single target domain. We compare the proposed method with recent state-of-the-art UDA person re-ID methods, and the results are presented in Tables 4 and 5. The results indicate that our proposed method outperforms the current state-of-the-art methods. On the Duke $\to$ Market task, our method achieves 81.9% mAP and 93.4% Rank-1 accuracy, which are 0.4% and 0.5% higher than RDSBN, respectively. On the Market $\to$ Duke task, our method achieves 66.9% mAP and 90.7% Rank-5 accuracy, which are 0.3% and 1.5% higher than RDSBN, respectively. On the Market $\to$ MSMT task, our method achieves 31.3% mAP and 62.0% Rank-1 accuracy, an improvement of 0.4% and 0.8% over RDSBN, respectively. On the Duke $\to$ MSMT task, our method achieves 33.9% mAP and 64.7% Rank-1 accuracy, an improvement of 0.3% and 0.7% over RDSBN (33.6% and 64.0% in Table 5), respectively. The results on these four tasks indicate that the sample weighting method is effective in improving model performance.

Next, we extend the experiments to the scenario of multiple source domains and a single target domain. Since there is little research on unsupervised multi-source domain adaptation for person re-ID, we follow the experimental setup of [43] and carry out experiments on the Duke $+$ CUHK $+$ MSMT $\to$ Market and Market $+$ CUHK $+$ MSMT $\to$ Duke tasks. The experimental results are shown in Tables 6 and 7. On the Duke $+$ CUHK $+$ MSMT $\to$ Market task, our method achieves 86.4% mAP and 95.3% Rank-1 accuracy, which are 0.4% and 0.5% higher than RDSBN, respectively. On the Market $+$ CUHK $+$ MSMT $\to$ Duke task, our method achieves 69.7% mAP and 82.8% Rank-1 accuracy, which are 0.8% and 0.7% higher than RDSBN, respectively. The experimental results indicate that the proposed method outperforms the state-of-the-art methods in the unsupervised multi-source domain adaptation scenario for person re-ID.
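The margins over RDSBN on the two multi-source tasks can be recomputed directly from Tables 6 and 7. The short sketch below does this arithmetic for all four reported metrics; the values are copied from the tables, and the dictionary layout and the helper name `gains` are illustrative choices.

```python
# (mAP, R-1, R-5, R-10) copied from Tables 6 and 7.
results = {
    "Duke + CUHK + MSMT -> Market": {
        "RDSBN": (86.0, 94.8, 97.9, 98.6),
        "Ours": (86.4, 95.3, 98.5, 99.1),
    },
    "Market + CUHK + MSMT -> Duke": {
        "RDSBN": (68.9, 82.1, 90.4, 93.0),
        "Ours": (69.7, 82.8, 91.0, 93.9),
    },
}


def gains(task):
    """Difference between our method and RDSBN for each metric, rounded to one decimal."""
    base = results[task]["RDSBN"]
    ours = results[task]["Ours"]
    return tuple(round(o - b, 1) for o, b in zip(ours, base))


for task in results:
    d_map, d_r1, d_r5, d_r10 = gains(task)
    print(f"{task}: mAP +{d_map}, R-1 +{d_r1}, R-5 +{d_r5}, R-10 +{d_r10}")
```

The differences are absolute differences between percentages, so a gain of 0.4 means, for example, 86.4% versus 86.0% mAP.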

4.5 Parameter sensitivity analysis

We analyze the sensitivity of the proposed method to the hyperparameter $\beta$ in Eq. (14), which determines the degree of domain alignment. To verify the influence of $\beta$ on the performance of the model, we conduct a series of experiments on the Duke $\to$ Market, Duke $\to$ MSMT and Duke $+$ CUHK $+$ MSMT $\to$ Market tasks, with $\beta$ set to 0.0, 0.2, 0.4, 0.6, 0.8, 1.0 and 1.2. The results are shown in Fig. 4. Specifically, Fig. 4a shows that on the Duke $\to$ Market task the best performance is obtained when $\beta$ is 0.8. When $\beta$ is 0, the model performs poorly because no domain alignment is applied, so the distributions of the source and target domains are not well aligned. Figure 4b shows the experimental results on the Duke $\to$ MSMT task, and Fig. 4c shows the experimental results on the Duke $+$ CUHK $+$ MSMT $\to$ Market task. As can be seen from Fig. 4b and c, the best performance on these two tasks is obtained when $\beta$ is 1.0. When $\beta$ is 1.2, the model performance begins to degrade, because the alignment term receives too much weight, which leads to excessive alignment of the distributions between domains.
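The role of $\beta$ and the sweep described above can be sketched as follows. This is a minimal illustration under stated assumptions: `total_loss` assumes that Eq. (14) adds a $\beta$-weighted alignment loss to the re-ID loss, and `toy_map` is a hypothetical stand-in for a full training and evaluation run, not a measured result.

```python
def total_loss(reid_loss, align_loss, beta):
    """Assumed form of Eq. (14): re-ID loss plus a beta-weighted alignment loss."""
    return reid_loss + beta * align_loss


def sweep_beta(evaluate, betas):
    """Evaluate every candidate beta and return (best_beta, {beta: score})."""
    scores = {beta: evaluate(beta) for beta in betas}
    best = max(scores, key=scores.get)
    return best, scores


def toy_map(beta):
    """Hypothetical score curve that peaks at beta = 0.8; replace with real training."""
    return 80.0 - 5.0 * (beta - 0.8) ** 2


betas = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
best_beta, scores = sweep_beta(toy_map, betas)
print(best_beta, scores)
```

With $\beta = 0$ the assumed objective reduces to the re-ID loss alone, which corresponds to training without domain alignment. The same `sweep_beta` helper could be reused for other hyperparameters, such as the batch size.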

Figure 5.

Time cost comparison on Duke $+$ CUHK $+$ MSMT $\to$ Market task. ( $*$ ) indicates the implementation is based on the author’s code.

In addition, to verify the influence of batch size on model performance, we conduct a series of experiments on the Duke $+$ CUHK $+$ MSMT $\to$ Market task. Due to GPU limitations, the batch sizes tested are 2, 4, 8 and 16, and the experimental results are shown in Fig. 4d. Figure 4d shows that a larger batch size is not always better: the model performs best when the batch size is 8, and its performance begins to decline when the batch size is 16.

4.6 Time cost analysis

The time cost comparison between the proposed method and other methods is shown in Fig. 5. ResNet-101, a non-DA method that does not require time-consuming iterations, is the fastest method but exhibits significantly lower accuracy than the other methods. MMT $+$ GRL and MMT $+$ MomentMatching also offer significant advantages in time consumption, but their accuracy is considerably lower. Among the remaining methods, which have comparable time costs, our method utilizing ResNet-101 achieves the highest accuracy, which indicates a favorable trade-off between accuracy and time cost.

5. Threats to validity

While the work in this paper provides valuable insights into person re-ID in the unsupervised multi-source domain adaptation scenario, we must acknowledge potential threats to the validity of our findings. Because this paper addresses the scenario of multiple source domains and a single target domain, domain gaps inevitably exist between the domains, and these gaps limit the improvement of model performance. In addition, this paper uses a clustering algorithm to generate pseudo-labels, and the generated pseudo-labels may contain label noise, which also affects the performance of the model. Therefore, in future work, we need to find a suitable and effective method to narrow the domain gaps between multiple domains and to adopt an effective pseudo-label refinement method, in order to reduce model bias and improve model performance.

6. Conclusion

This paper proposed an unsupervised multi-source domain adaptation method for person re-ID based on sample weighting. Firstly, multiple datasets were used as source domains to make full use of valuable label information. Secondly, the samples of each domain were weighted to reduce the model bias caused by differences in sample size between domains during training. Next, adversarial learning was used for domain alignment. Finally, to verify the effectiveness of the proposed method, we conducted extensive comparative experiments, ablation experiments and parameter sensitivity experiments on the Market, Duke, CUHK and MSMT datasets. The experimental results verified the effectiveness and superiority of the proposed method. Nevertheless, domain gaps may exist between the different domains, and the pseudo-labels generated by clustering methods may contain label noise, both of which can affect the performance of the model. Therefore, future work will address these issues to further improve the proposed method.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant 62176128, the Open Projects Program of State Key Laboratory for Novel Software Technology of Nanjing University under Grant KFKT2022B06, the Fundamental Research Funds for the Central Universities No. NJ2022028, the Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund, as well as the Qing Lan Project.

References

1. Sun, Zheng, Yang, Tian, Wang, Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline), in: Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 480–496.

2. Wang, Yuan, Chen, Zhou, Learning discriminative features with multiple granularities for person re-identification, in: Proceedings of the 26th ACM International Conference on Multimedia, 2018, pp. 274–282.

3. Zhang, Lan, Zeng, Jin, Chen, Relation-aware global attention for person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 3186–3195.

4. Tian, Zhang, Lin J.C.-W., Zuo, Zhang, Lin C.-W., Generative adversarial networks for image super-resolution: A survey, arXiv preprint arXiv:2204.13620, 2022.

5. Tian, Zhang, Zuo, Lin C.-W., Zhang, Yuan, A heterogeneous group CNN for image super-resolution, IEEE Transactions on Neural Networks and Learning Systems, 2022.

6. Fei, Zhang, Tian, Teng, Wen, Jointly learning multi-instance hand-based biometric descriptor, Information Sciences 562 (2021), 1–12.

7. Chen, Zhu, Gong, Instance-guided context rendering for cross-domain person re-identification, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 232–242.

8. H.-X., Zheng W.-S., Unsupervised person re-identification by deep asymmetric metric embedding, IEEE Transactions on Pattern Analysis and Machine Intelligence 42(4) (2018), 956–973.

9. Zhong, Zheng, Luo, Yang, Invariance matters: Exemplar memory for domain adaptive person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 598–607.

10. Guo, Feng, Hao, Chen, JAC-Net: Joint learning with adaptive exploration and concise attention for unsupervised domain adaptive person re-identification, Neurocomputing 483 (2022), 262–274.

11. Lin, Ren, Yeh C.-H., Yao, Song, Chang, Unsupervised person re-identification: A systematic survey of challenges and solutions, arXiv preprint arXiv:2109.06057, 2021.

12. Tian, Sun, Cao, Chu, Chen, Heterogeneous domain adaptation with structure and classification space alignment, IEEE Transactions on Cybernetics 52(10) (2021), 10328–10338.

13. Tian, Zhu, Sun, Chen, Yin, Unsupervised domain adaptation through dynamically aligning both the feature and label spaces, IEEE Transactions on Circuits and Systems for Video Technology 32(12) (2022), 8562–8573.

14. Deng, Zheng, Kang, Yang, Jiao, Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 994–1003.

15. Wang, Zhu, Gong, Transferable joint attribute-identity deep learning for unsupervised person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 2275–2284.

16. Wei, Zhang, Gao, Tian, Person transfer GAN to bridge domain gap for person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 79–88.

17. H.-X., Zheng W.-S., Guo, Gong, Lai J.-H., Unsupervised person re-identification by soft multilabel learning, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 2148–2157.

18. Zhong, Zheng, Yang, Generalizing a person retrieval model hetero- and homogeneously, in: Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 172–188.

19. Zhong, Zheng, Luo, Yang, Invariance matters: Exemplar memory for domain adaptive person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 598–607.

20. Ding, Duan, Source-free unsupervised multi-source domain adaptation via proxy task for person re-identification, The Visual Computer, 2022, 1–12.

21. Song, Wang, Zhang, Huang, Wang, Unsupervised domain adaptive re-identification: Theory and practice, Pattern Recognition 102 (2020), 107173.

22. Yang, Zhong, Luo, Sun, Cheng, Guo, Huang, Asymmetric co-teaching for unsupervised cross-domain person re-identification, Proceedings of the AAAI Conference on Artificial Intelligence 34(07) (2020), 12597–12604.

23. Chen, Mutual Mean-Teaching: Pseudo Label Refinery for Unsupervised Domain Adaptation on Person Re-identification, in: International Conference on Learning Representations, 2020.

24. Ren C.-X., Liu Y.-H., Zhang X.-W., Huang K.-K., Multi-source unsupervised domain adaptation via pseudo target domain, IEEE Transactions on Image Processing 31 (2022), 2122–2135.

25. Wei, Yang, Han, Multi-source Collaborative Contrastive Learning for Decentralized Domain Adaptation, IEEE Transactions on Circuits and Systems for Video Technology, 2022.

26. Zuo, Zhang, Dynamic classifier alignment for unsupervised multi-source domain adaptation, IEEE Transactions on Knowledge and Data Engineering, 2022.

27. Wei, Wang, Zhou, Shi, Huang T.S., Self-similarity grouping: A simple unsupervised cross domain adaptation approach for person re-identification, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 6112–6121.

28. Lin, Yan, Yang, Unsupervised person re-identification via cross-camera similarity exploration, IEEE Transactions on Image Processing 29 (2020), 5481–5490.

29. Zou, Yang, Kumar B.V.K.V., Kautz, Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification, in: Proceedings of the European Conference on Computer Vision (ECCV), Vol. 12347, 2020, pp. 87–104.

30. Cho, Kim W.J., Hong, Yoon S.-E., Part-based pseudo label refinement for unsupervised person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 7308–7318.

31. Lin, Kot A.C., Multi-task Mid-level Feature Alignment Network for Unsupervised Cross-Dataset Person Re-Identification, in: British Machine Vision Conference 2018, BMVC 2018, 2018, p. 9.

32. Delorme, Alameda-Pineda, Lathuilière, Horaud, Camera adversarial transfer for unsupervised person re-identification, arXiv preprint arXiv:1904.01308, 2019.

33. Wang, Huo, Shi, Geng, Gao, Adversarial camera alignment network for unsupervised cross-camera person re-identification, IEEE Transactions on Circuits and Systems for Video Technology 32(5) (2021), 2921–2936.

34. Crammer, Kearns, Wortman, Learning from multiple sources, Journal of Machine Learning Research 9(8) (2008).

35. Mansour, Mohri, Rostamizadeh, Domain Adaptation with Multiple Sources, Advances in Neural Information Processing Systems, 2008, 1041–1048.

36. Mansour, Mohri, Rostamizadeh, Domain adaptation with multiple sources, Advances in Neural Information Processing Systems 21 (2008).

37. Zhao, Zhang, Moura J.M., Costeira J.P., Gordon G.J., Adversarial multiple source domain adaptation, Advances in Neural Information Processing Systems 31 (2018).

38. Peng, Bai, Xia, Huang, Saenko, Wang, Moment matching for multi-source domain adaptation, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 1406–1415.

39. Zhao, Wang, Zhang, Song, Chai, Keutzer, Multi-source distilling domain adaptation, Proceedings of the AAAI Conference on Artificial Intelligence 34(07) (2020), 12975–12983.

40. Wang, Zhang, Learning to combine: Knowledge aggregation for multi-source domain adaptation, in: Proceedings of the European Conference on Computer Vision (ECCV), 2020, pp. 727–744.

41. Chang W.-G., You, Seo, Kwak, Han, Domain-specific batch normalization for unsupervised domain adaptation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 7354–7362.

42. Yang, Balaji, Lim S.-N., Shrivastava, Curriculum manager for source selection in multi-source domain adaptation, in: Proceedings of the European Conference on Computer Vision (ECCV), 2020, pp. 608–624.

43. Bai, Wang, Ding, Unsupervised multi-source domain adaptation for person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 12914–12923.

44. Xiao, Zhang, Dynamic weighted learning for unsupervised domain adaptation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 15242–15251.

45. Zhang, Jiang, Wang, Liu, Gao et al., Dynamic sample weighting for weakly supervised object detection, Image and Vision Computing 122 (2022), 104444.

46. Cai, Zhang, Jing X.-Y., Dual Re-Weighting Network for Multi-Source Domain Adaptation, in: 2022 IEEE International Conference on Multimedia and Expo (ICME), 2022, pp. 1–6.

47. Lü, Unsupervised double weighted domain adaptation, Neural Computing and Applications 33 (2021), 3545–3566.

48. Huang, Zhu, Shen H.T., Transfer independently together: A generalized framework for domain adaptation, IEEE Transactions on Cybernetics 49(6) (2018), 2144–2155.

49. LeCun, Bottou, Bengio, Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE 86(11) (1998), 2278–2324.

50. Krizhevsky, Sutskever, Hinton G.E., ImageNet classification with deep convolutional neural networks, Communications of the ACM 60(6) (2017), 84–90.

51. Zhang, Ren, Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.

52. Ester, Kriegel H.-P., Sander et al., A density-based algorithm for discovering clusters in large spatial databases with noise, KDD 96(34) (1996), 226–231.

53. Hoffman, Tzeng, Park, Zhu J.-Y., Isola, Saenko, Efros, Darrell, CyCADA: Cycle-consistent adversarial domain adaptation, in: International Conference on Machine Learning, 2018, pp. 1989–1998.

54. Long, Cao, Wang, Jordan M.I., Conditional adversarial domain adaptation, Advances in Neural Information Processing Systems 31 (2018).

55. Tang, Jia, Discriminative adversarial domain adaptation, Proceedings of the AAAI Conference on Artificial Intelligence 34(04) (2020), 5940–5947.

56. Goodfellow I.J., Pouget-Abadie, Mirza, Warde-Farley, Ozair, Courville A.C., Bengio, Generative Adversarial Nets, Advances in Neural Information Processing Systems, 2014, 2672–2680.

57. Zheng, Shen, Tian, Wang, Tian, Scalable person re-identification: A benchmark, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2015, pp. 1116–1124.

58. Ristani, Solera, Zou, Cucchiara, Tomasi, Performance measures and a data set for multi-target, multi-camera tracking, in: Proceedings of the European Conference on Computer Vision (ECCV), 2016, pp. 17–35.

59. Zhao, Xiao, Wang, DeepReID: Deep filter pairing neural network for person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2014, pp. 152–159.

60. Tang K.-H., Unsupervised person re-identification via nearest neighbor collaborative training strategy, in: 2021 IEEE International Conference on Image Processing (ICIP), 2021, pp. 1139–1143.

61. Kingma D.P., Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980, 2014.

62. Zhong, Zheng, Luo, Yang, Learning to adapt invariance in memory for person re-identification, IEEE Transactions on Pattern Analysis and Machine Intelligence 43(8) (2020), 2723–2738.

63. Wang, Zhang, Unsupervised person re-identification via multi-label classification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 10981–10990.

64. Jin, Lan, Zeng, Chen, Zhang, Style normalization and restitution for generalizable person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 3143–3152.

65. Zhao, Liao, Xie G.-S., Zhao, Zhang, Shao, Unsupervised domain adaptation with noise resistible mutual-training for person re-identification, in: Proceedings of the European Conference on Computer Vision (ECCV), 2020, pp. 526–544.

66. Zheng, Lan, Zeng, Zhang, Zha Z.-J., Exploiting sample uncertainty for domain adaptive person re-identification, Proceedings of the AAAI Conference on Artificial Intelligence 35(4) (2021), 3538–3546.

67. Dubourvieux, Audigier, Loesch, Ainouz, Canu, Unsupervised domain adaptation for person re-identification through source-guided pseudo-labeling, in: 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 4957–4964.

68. Liu, Nie, Yin, Wang, Gao, Jin, SSKD: Self-supervised knowledge distillation for cross domain adaptive person re-identification, in: 2021 7th IEEE International Conference on Network Intelligence and Digital Content (IC-NIDC), 2021, pp. 81–85.

69. Chen, Wang, Lagadec, Dantcheva, Bremond, Joint generative and contrastive learning for unsupervised person re-identification, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 2004–2013.

70. Dai, Liu, Bai, Tong, Duan L.-Y., Dual-refinement: Joint label and feature refinement for unsupervised domain adaptive person re-identification, IEEE Transactions on Image Processing 30 (2021), 7815–7829.

71. Shen, Guo, Ding, Guo, Secret: Self-consistent pseudo label refinement for unsupervised domain adaptive person re-identification, Proceedings of the AAAI Conference on Artificial Intelligence 36(1) (2022), 879–887.

72. Zhang, Liu, Guo, Duan, Long, Jin, Unsupervised domain adaptation for person re-identification via heterogeneous graph alignment, Proceedings of the AAAI Conference on Artificial Intelligence 35(4) (2021), 3360–3368.

73. Sun, Chen, Peng, Zhu, Unsupervised cross domain person re-identification by multi-loss optimization learning, IEEE Transactions on Image Processing 30 (2021), 2935–2946.

74. Sun, Chen, Peng, Zhu, Unsupervised Cross Domain Person Re-Identification by Multi-Loss Optimization Learning, IEEE Transactions on Image Processing 30 (2021), 2935–2946.

75. Xia, Zhu, Refining pseudo labels for unsupervised domain adaptive person re-identification, IEEE Access 9 (2021), 121288–121301.

76. Ganin, Lempitsky V.S., Unsupervised domain adaptation by backpropagation, International Conference on Machine Learning 37 (2015), 1180–1189.
