Year: 2019 | Volume: 5 | Issue: 2 | Page: 76-84
Robust point set registration method based on global structure and local constraints
Kai Yang, Yufei Chen, Haotian Zhang, Xianhui Liu, Weidong Zhao
College of Electronic and Information Engineering, Tongji University, Shanghai, China
Date of Web Publication: 23-Sep-2019
College of Electronic and Information Engineering, Tongji University, No. 4800, Cao'an Highway, Shanghai 201804
Source of Support: None, Conflict of Interest: None
Background and Objectives: Point set registration is a fundamental problem in computer vision. The registration problem can be divided into rigid and nonrigid registration. The transformation function of rigid registration is simpler to model, whereas nonrigid registration is better suited to practical problems. Materials and Methods: We propose a robust point set registration method using both global and local structures. We use a popular probability model, the Gaussian mixture model, to preserve the global structure of the point set. We then design a local constraint provided by neighboring points to maintain the local structure of the point set. Finally, the expectation–maximization algorithm is used to update the model parameters. Results: We first carried out experiments on synthesized data covering four degradation cases: deformation, noise, outliers, and rotation. By comparing the mean and standard deviation of registration errors with several state-of-the-art methods, our method proved to be more robust. We then conducted experiments on real retinal fundus images, aiming to establish reliable feature point correspondences between the two images. The experimental results show that our method performs better when the two images have larger shooting angles and more noise. Conclusions: The Gaussian mixture preserves the global structure of the point set, and the local constraints make full use of the local structure, which makes our method more robust. Experiments on synthetic data show that our method obtains superior results to those of the state-of-the-art methods. Experiments on retinal image data show that our method also performs very well in practical applications.
Keywords: Gaussian mixture model, global structure, local constraints, retinal image registration
How to cite this article:
Yang K, Chen Y, Zhang H, Liu X, Zhao W. Robust point set registration method based on global structure and local constraints. Digit Med 2019;5:76-84
Introduction
The so-called point set registration is to establish the correspondence between the two point sets and estimate the transformation function of the model point set to the scene point set. It is a very fundamental problem in computer vision and plays an important role in medical image analysis, object recognition, remote sensing analysis, image retrieval, and so on. By extracting edge points or interest points from the image, the above problems can be transformed into the matching of two point sets.
In general, the registration problem can be divided into two categories, namely rigid and nonrigid registration. For the former, the point set is rigidly transformed, which makes it easy to model the transformation function. For the latter, however, the point set is transformed nonrigidly. Nonrigid registration is complicated and difficult, but it is more suitable for solving many practical problems. At present, registration methods based on probability distributions are a hotspot of nonrigid point set registration. The idea of these methods is to first estimate the probability density of the point set and then establish the correspondence and calculate the transformation function through the probability density. In this article, we also use a probability model to describe the global structure of the point set. In addition, local constraint terms are introduced through adjacent points to help solve the problem.
The first step in applying point set registration to a real task is to detect feature points from images. Feature points mainly include corner points detected by the Harris algorithm and features from accelerated segment test (FAST), and interest points detected by the scale-invariant feature transform (SIFT), speeded-up robust features (SURF), and binary robust invariant scalable keypoints (BRISK). Then, through the registration of feature point sets, image registration in practical problems is realized.
In recent years, to improve the performance of point set registration, many excellent methods have been proposed. Iterative closest point is one of the most widely used point set registration algorithms. It selects the closest point to establish the correspondence. Another very classic algorithm, thin-plate spline-robust point matching (TPS-RPM), models the transformation as a TPS and uses deterministic annealing and soft assignment to solve the joint optimization problem of the mapping and correspondence. In Tsin and Kanade, the point set registration method was based on kernel correlation, with a cost function proportional to the correlation of the two kernel density estimates.
Probability methods based on the Gaussian mixture model (GMM) are very popular in solving point set registration. This is because the two sets of points to be registered always have similar shape structures and conform to the same probability density model. These methods can fully explore the global structural relationship between the two point sets and ensure that the shape represented by the set of points after registration has the correct corresponding structure. Myronenko and Song proposed the coherent point drift (CPD) algorithm. They considered the model point set as the GMM centroids, and the points in the scene point set were regarded as observed values of the GMM. Jian and Vemuri used two GMMs to represent the model point set and scene point set, respectively, and obtained the transformation by minimizing the L2 distance between the two corresponding mixtures. In addition, the mixture of asymmetric Gaussians (MoAG) and the Student's-t mixture model (SMM) have also been introduced to point set registration. Wang et al. used two MoAGs to represent the two point sets and chose a correlation-based method to estimate the transformation parameters. Zhou et al. proposed the pSMM algorithm based on the SMM. Then, they used the Dirichlet distribution in the SMM to formulate the varying mixture proportions and came up with the DSMM algorithm with better performance.
Research on the local structure information of point sets has always been the focus of point set registration. Belongie et al. introduced the shape context (SC) to measure the similarity between two shapes. Then, Peng et al. and Ma et al. optimized CPD algorithm by determining the membership probability through SC. Zhang et al. used shape distance and Euclidean distance to represent the similarity measures of local and global features, respectively, and established a global-local correspondence based on them. Ge et al. introduced local linear embedding to handle highly articulated nonrigid deformation while sustaining the local structure. Yang et al. introduced local distance to measure local differences from the model point set to scene point set and presented a global and local mixture distance-based nonrigid point set registration method. Bai et al. constructed the local connectivity constraint as a weighted least square error item of k-connected neighbors.
The rest of this article is organized as follows. In Section 2, we review the previous related work. In Section 3, we explain the principle of our algorithm in detail. Section 4 demonstrates our algorithm on synthetic data and real retinal fundus image data. In Section 5, we present the conclusion of this article.
Materials and Methods
Our method focuses on point set registration using global and local structure information. First, a GMM is used to describe the global structure of point set. Then, local constraints are constructed by adjacent points. Finally, the expectation–maximization (EM) algorithm is used to update the model parameters.
For the model point set X = {x_1, …, x_N} and the scene point set Y = {y_1, …, y_M}, we regard X as the centroids of a GMM and Y as the data points generated by the GMM. The centroids gradually move toward the data points, and the closer the distance between a centroid and a data point, the higher the probability that the two points are corresponding points. The relation between data points and Gaussian centroids can be expressed as:

p(y_m \mid x_n) = \frac{1}{(2\pi\sigma^2)^{D/2}} \exp\left(-\frac{\lVert y_m - \Gamma(x_n)\rVert^2}{2\sigma^2}\right)  (1)
We use a uniform distribution to deal with outliers and noise. The weight of the uniform distribution is set to ω, 0 ≤ ω ≤ 1. In addition, the membership probabilities of the GMM, π_mn, satisfy \sum_{n=1}^{N} \pi_{mn} = 1. Then, the mixture model takes the following form:

p(y_m) = \omega \frac{1}{a} + (1-\omega) \sum_{n=1}^{N} \pi_{mn}\, p(y_m \mid x_n)  (2)

where 1/a denotes the uniform density.
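As a concrete illustration, the mixture above (one Gaussian component per model point plus a uniform outlier component with weight ω) can be sketched in NumPy. The variable names and the uniform density 1/a are our own notation, not the authors' code:

```python
import numpy as np

def mixture_density(Y, TX, pi, sigma2, omega, a):
    """Evaluate the mixture density p(y_m) for each scene point.

    Y      : (M, D) scene points
    TX     : (N, D) transformed model points Gamma(X)
    pi     : (M, N) membership probabilities (rows sum to 1)
    sigma2 : Gaussian variance
    omega  : weight of the uniform outlier component
    a      : support volume, so the uniform density is 1/a (assumed)
    """
    M, D = Y.shape
    # squared distances ||y_m - Gamma(x_n)||^2, shape (M, N)
    d2 = ((Y[:, None, :] - TX[None, :, :]) ** 2).sum(-1)
    gauss = np.exp(-d2 / (2 * sigma2)) / (2 * np.pi * sigma2) ** (D / 2)
    return omega / a + (1 - omega) * (pi * gauss).sum(axis=1)  # (M,)
```

When the transformed centroids coincide with the data points, the density is high; moving them far away leaves only the uniform floor ω/a.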
SC is used to measure the similarity between two points. The SC of a point p is built by establishing a log-polar coordinate system with p as the origin, dividing the log-polar plane into a histogram of 5 × 12 cells, and using h(k) to represent the number of points in the kth cell of the histogram. The SC difference between point x_n and point y_m is defined as follows:

C(x_n, y_m) = \frac{1}{2} \sum_{k} \frac{\left[h_{x_n}(k) - h_{y_m}(k)\right]^2}{h_{x_n}(k) + h_{y_m}(k)}  (3)
Based on Equation (3), we can obtain the SC difference between any two points in the model point set and the scene point set. It is reasonable that the smaller the SC difference between two points x_n and y_m, the more likely they are corresponding points. The Hungarian method is used to establish the initial correspondence that minimizes the total SC difference. Then, we can determine the membership probability π_mn as follows:

\pi_{mn} = \begin{cases} \tau, & n = k \\ (1-\tau)/(N-1), & n \neq k \end{cases}  (4)

where x_k denotes the point to which y_m corresponds, and the parameter τ, 0 ≤ τ ≤ 1, represents the confidence of a correspondence. Note that if a scene point y_m does not match a corresponding model point, it can be considered to correspond to all model points with equal probability, so π_mn is set to the constant 1/N. Obviously, when the total SC difference is small, we can set the parameter τ to a large value to help speed up the solution, whereas we have to set the parameter τ to a small value to avoid getting a wrong answer.
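The initialization above can be sketched as follows, using a 5 × 12 log-polar histogram in the spirit of Belongie et al. and SciPy's Hungarian solver (`linear_sum_assignment`). The bin edges and the mean-distance scale normalization are our assumptions, not the paper's exact choices:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def shape_context(points, n_r=5, n_theta=12):
    """Log-polar shape context histogram (n_r x n_theta bins) per point."""
    n = len(points)
    diff = points[None, :, :] - points[:, None, :]        # (n, n, 2)
    r = np.linalg.norm(diff, axis=-1)
    r_norm = r / (r.mean() + 1e-12)                       # scale invariance
    theta = np.arctan2(diff[..., 1], diff[..., 0])        # in [-pi, pi]
    r_edges = np.logspace(np.log10(0.125), np.log10(2.0), n_r + 1)
    hists = np.zeros((n, n_r * n_theta))
    for i in range(n):
        mask = np.arange(n) != i                          # exclude the point itself
        r_bin = np.digitize(r_norm[i, mask], r_edges) - 1
        t_bin = ((theta[i, mask] + np.pi) / (2 * np.pi) * n_theta).astype(int) % n_theta
        ok = (r_bin >= 0) & (r_bin < n_r)                 # drop out-of-range radii
        np.add.at(hists[i], r_bin[ok] * n_theta + t_bin[ok], 1)
    return hists

def sc_cost(hx, hy):
    """Chi-square SC difference between two sets of histograms, shape (N, M)."""
    num = (hx[:, None, :] - hy[None, :, :]) ** 2
    den = hx[:, None, :] + hy[None, :, :] + 1e-12
    return 0.5 * (num / den).sum(-1)

def initial_match(X, Y):
    """Hungarian assignment minimizing the total SC difference."""
    C = sc_cost(shape_context(X), shape_context(Y))
    return linear_sum_assignment(C)
```

Matching a point set against itself yields a zero-cost assignment, which is a quick sanity check of the histograms.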
We denote the set of unknown parameters by θ = (Γ, σ², ω). These parameters will be estimated by minimizing the negative log-likelihood function as follows:

E(\theta) = -\sum_{m=1}^{M} \ln p(y_m)  (5)
According to Jensen's inequality, an upper bound of E in Equation (5) is Q_g as follows:

Q_g = -\sum_{m=1}^{M}\sum_{n=1}^{N} p_{mn} \ln\left[(1-\omega)\,\pi_{mn}\,p(y_m \mid x_n)\right]  (6)
Ignoring the constants independent of θ, we rewrite Equation (6) and get the global term of the objective function as follows:

Q_g(\theta) = \frac{1}{2\sigma^2}\sum_{m=1}^{M}\sum_{n=1}^{N} p_{mn}\,\lVert y_m - \Gamma(x_n)\rVert^2 + \frac{M_P D}{2}\ln\sigma^2  (7)

where M_P = \sum_{m=1}^{M}\sum_{n=1}^{N} p_{mn} (with M_P = M only if ω = 0).
In order to make use of the local structure information of point set, we construct the local constraint through the adjacent points as follows:
where N(x_n)_k and N(y_m)_k are the kth closest neighbor points of x_n and y_m, respectively. The weight w_nk is related to the distance between the model point and its neighbor point: the shorter the distance, the stronger the local constraint, and vice versa. T is the translation function, defined as:
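A sketch of the neighbor selection and distance-based weighting described above. The exact weighting formula in the paper was not recoverable from the extraction, so the Gaussian-style decay and per-point normalization below are our assumptions; only the qualitative behavior (shorter distance, larger weight) follows the text:

```python
import numpy as np

def neighbor_weights(X, k=3):
    """For each model point, return its k nearest neighbors and weights
    that decay with distance (assumed form), normalized per point."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                 # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]          # (N, k) neighbor indices
    nd2 = np.take_along_axis(d2, idx, axis=1)    # their squared distances
    w = np.exp(-nd2 / (nd2.mean() + 1e-12))      # nearer neighbor -> larger weight
    w /= w.sum(axis=1, keepdims=True)            # normalize each row
    return idx, w
```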
Then, the definition of the local item of our objective function is shown as follows:
According to Equation (7) and Equation (10), the objective function is formulated as

Q(\theta) = Q_g + \lambda Q_l + \eta \Phi(\Gamma)  (11)

where Φ(Γ) is the transformation constraint item, and η and λ are the coefficients.
The updating of the model parameters is solved by the EM algorithm, which alternates between an expectation step and a maximization step. The details of the two steps are as follows:

E-step: Given the current values θ_old of the parameters, the posterior probability of the mixture components can be written as:

p_{mn} = \frac{(1-\omega)\,\pi_{mn}\,p(y_m \mid x_n)}{(1-\omega)\sum_{j=1}^{N}\pi_{mj}\,p(y_m \mid x_j) + \omega/a}

M-step: The parameters are reestimated by minimizing the objective function Q in Equation (11).
Taking ∂Q/∂σ² = 0, σ² is updated as follows:
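The published update formula did not survive extraction. Under the standard GMM derivation, consistent with the global term in Equation (7) and its weight M_P, setting the derivative to zero yields a reconstruction of the form:

```latex
\sigma^2 = \frac{1}{M_P D}\sum_{m=1}^{M}\sum_{n=1}^{N}
           p_{mn}\,\lVert y_m - \Gamma(x_n)\rVert^2 ,
\qquad
M_P = \sum_{m=1}^{M}\sum_{n=1}^{N} p_{mn}
```

This follows directly from differentiating Equation (7) with respect to σ² and solving.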
Then, taking the derivative of Q with respect to ω and setting it to zero, we obtain
After omitting the terms independent of transformation Γ and defining Γ as the sum of the original position and displacement function v: Γ(X) = X + v(X), we obtain
Displacement function v is modeled in a reproducing kernel Hilbert space (RKHS). It is defined by a diagonal Gaussian kernel matrix G as follows:
In RKHS, v has the form v(z) = \sum_{n=1}^{N} c_n G(z, x_n), where c_n represents the coefficients, and the smoothness function is defined as \lVert v \rVert^2_{\mathcal{H}} = \operatorname{tr}(C^{\mathsf T} G C). Equation (15) can then be easily expressed in the following matrix form:
where P denotes the posterior probability matrix with entries p_mn, P_m is the mth row of P, and d(·) denotes the diagonal matrix.
Setting ∂Q/∂C = 0, C can be calculated by the following equation:
where 1 is a column vector of all ones.
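The kernel construction and the resulting transformation Γ(X) = X + v(X) with v(X) = GC can be sketched as follows. The kernel width `beta` is an assumed hyperparameter (the paper's value was not recoverable):

```python
import numpy as np

def gaussian_gram(X, beta=2.0):
    """Gram matrix G with G_ij = exp(-||x_i - x_j||^2 / (2*beta^2)),
    the Gaussian kernel defining the RKHS."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * beta ** 2))

def transform(X, C, beta=2.0):
    """Gamma(X) = X + v(X), with the displacement in matrix form v(X) = G C."""
    return X + gaussian_gram(X, beta) @ C
```

With all coefficients zero, the transformation reduces to the identity, which matches the decomposition of Γ into position plus displacement.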
The complete process of our method is shown in [Table 1].
Results
To evaluate our method, we first design a set of experiments on synthetic contour point set data under degradations (deformation, noise, outliers, and rotation), and then focus on building reliable feature point correspondences for real retinal fundus images.
Root mean-squared error (RMSE) and recall rate are used to evaluate our method. The RMSE measures point set registration performance on synthetic data. The recall rate evaluates our method's ability to find correct correspondences when applied to retinal fundus image registration.
RMSE = \sqrt{\frac{1}{M}\sum_{j=1}^{M} \lVert \Gamma(x_j) - t_j \rVert^2}

where M is the number of estimated correspondences and t_j is the ground truth point that corresponds to x_j.
Recall = \frac{TP}{TP + FN}

where TP means true positive and FN means false negative.
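Both metrics are straightforward to compute; a minimal sketch (function and argument names are ours):

```python
import numpy as np

def rmse(TX, T):
    """Root mean-squared error between transformed model points Gamma(x_j)
    and their ground truth correspondences t_j."""
    return np.sqrt(((TX - T) ** 2).sum(axis=1).mean())

def recall(distances, threshold):
    """Recall = TP / (TP + FN): a ground truth pair counts as a true positive
    when its post-transformation distance falls below the threshold."""
    tp = int((distances < threshold).sum())
    return tp / len(distances)
```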
In our experiments, K equals 3; that is, local constraints are provided by the nearest three neighbor points. The experiments are performed on a PC with a 3.2 GHz Intel Core i5-6500 CPU, 8 GB RAM, and MATLAB code.
Results on synthetic data
We first evaluate our method on two synthetic contour point set data: Chinese character and Fish. Chinese character contains 105 points, whereas Fish contains 98 points. For both data models, there are four types of degradation, namely deformation, noise, outliers, and rotation, and each degradation contains five different degrees.
We make a quantitative comparison with four state-of-the-art methods: TPS-RPM, GMMREG, CPD, and PR-GLS. For Chinese character and Fish, there are 100 samples at each degradation level. The mean and standard deviation of the errors are used to evaluate the performance of all five registration methods. The error bars are shown in [Figure 1].
|Figure 1: The registration performance of our method on Chinese character and Fish|
As can be seen from [Figure 1], our method has achieved good results in all four types of degradation. For deformation, our method has a good effect on Chinese character and a very good effect on Fish. For noise, our method performs better on Chinese character when there is a higher level of noise, and our method always performs very well on Fish. For outliers, our method performs well on Chinese character, whereas for Fish, it is worse than CPD but significantly better than TPS-RPM and GMMREG. For rotations, all the methods perform badly when the rotation angle is large, but our method performs better when the angle is small.
Some of the most representative samples are shown in [Figure 2] and [Figure 3]. In [Figure 2], all the methods have a good effect on the deformation and noise samples, but our method performs better in the details. Only PR-GLS and our method register the outlier sample successfully, and in the rotation sample, all methods except GMMREG perform well. In [Figure 3], GMMREG has a poor effect on the deformation and noise samples, while our method has a good effect. In the outlier sample, as before, only PR-GLS and our method are successfully registered. In the rotation sample, all four of the other methods fail except ours. As a result, our method shows better robustness.
|Figure 2: Point set registration results of our method on the Fish. The degree of deformation is 0.08, the noise level is 0.05, the outlier data ratio is 1.5, and the degree of rotation is 60|
|Figure 3: Point set registration results of our method on the Chinese character. The degree of deformation is 0.08, the noise level is 0.05, the outlier data ratio is 1.5, and the degree of rotation is 60|
Results on retinal fundus image data
We next test our method on a real retinal fundus image dataset called the Fundus Image Registration dataset (FIRE). The dataset consists of 129 retinal images forming 134 image pairs, which are split into three categories depending on their characteristics. The images were taken with a Nidek AFC-210 fundus camera from 39 patients at the Papageorgiou Hospital, Aristotle University of Thessaloniki, Thessaloniki. The complete steps of retinal fundus image registration in the experiment are:
- Detect feature points by SURF detector
- Extract feature descriptor based on PIIFD
- Feature points matching using our method
- Estimate the transformation using an affine or second-order polynomial model.
In our experiment, SURF and PIIFD are used to detect points and extract features, respectively. According to the experimental results of Mukherjee et al., SIFT, FAST, and SURF all detect enough key points. FAST detects the most points, but its invariance against blurring is poor. SURF detects more points than SIFT, and the points are more evenly distributed, so we choose SURF. PIIFD is short for partial intensity invariant feature descriptor. It was designed by Chen et al. for retinal fundus image registration.
Our purpose is to find the correspondence between two sets of feature points with corresponding PIIFDs. Here, we use the PIIFD to determine the initial correspondences, and hence, the value of the membership probability π_mn does not have to be recalculated in each iteration. However, the initial correspondences may contain errors. Our method is used to eliminate these errors and find the right correspondences. After registration using our method, we consider that two points x_n and y_m correspond to each other if the posterior probability is larger than 0.5. Finally, the right correspondences are used to estimate the transformation. We select the transformation model based on the number of point pairs with correct correspondence: three for affine and six for second-order polynomial.
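The thresholding and model-selection rules above can be sketched as follows (function names are ours; the paper does not name these steps, and we interpret the pair counts as minimum requirements):

```python
import numpy as np

def select_correspondences(P, threshold=0.5):
    """Keep pairs (m, n) whose posterior probability exceeds the threshold."""
    return np.argwhere(P > threshold)

def choose_model(n_pairs):
    """Pick the transformation model from the number of correct pairs:
    second-order polynomial needs at least 6 pairs, affine at least 3."""
    if n_pairs >= 6:
        return "second-order polynomial"
    if n_pairs >= 3:
        return "affine"
    return "insufficient correspondences"
```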
We use the recall rate to evaluate our method's ability to find the correct correspondences among all the initial correspondences and compare it with Harris-PIIFD, a well-known retinal fundus image registration algorithm. In FIRE, each image pair provides 10 pairs of ground truth corresponding points. If the distance between two ground truth corresponding points after transformation is less than the threshold, the pair is regarded as a true positive; otherwise, it is regarded as a false negative.
[Figure 4] shows the variation of the recall rates of the Harris-PIIFD algorithm and our method at different thresholds. As can be seen from [Figure 4], the performance of Harris-PIIFD and our method is similar in Categories A and S. In Category S, the registration effect of Harris-PIIFD is slightly better, while in Category A, our method performs better. In Category P, the small overlap area between the two images makes the registration effect of Harris-PIIFD very poor, while our method still performs well.
|Figure 4: The registration performance of our method on the retinal fundus image in FIRE|
In [Figure 5], we can see that a large number of feature points were detected in the image samples in the three directories. However, for the image pairs in Category A, the size and distribution of bright spots and dark spots in the two images are greatly different, which makes a lot of noise in the feature point sets. For the image pairs in the Category P, the shooting angles of the two images are quite different, which leads to a large number of outliers in the feature point sets. Therefore, the number of initial correspondences in Category A and P is small, and there are many errors. Hence, we can only find a few correct relations to estimate an inaccurate transformation. For the image pairs in Category S, the similarity between the two images is extremely high, which enables us to obtain enough correct correspondence to estimate an accurate transformation.
Discussion
Our method shows good robustness on synthetic data and obtained excellent results in the deformation, noise, outlier, and rotation experiments. In our method, we adopt the idea of Peng et al. and Ma et al.: using SC to determine the weight of each Gaussian distribution. To solve the objective function faster and avoid falling into a local optimum, we set different values for τ according to the size of the SC difference. The experimental results show that the performance of our method on synthetic data registration is significantly improved. When we apply our point set registration method to retinal image registration, we have to admit that the quality of the feature point detection and feature extraction methods has a certain impact on our method. The better the quality and number of feature points, the better the performance of our method.
Theoretically, the construction of local constraint terms increases the computational burden, which affects the efficiency of point set registration. However, considering the improvement in registration performance, the reduction in efficiency is acceptable. In the experiments, although we did not measure the time efficiency of each algorithm, there was no significant difference between our method and the other methods, because the local constraints help the algorithm converge in fewer iterations.
Conclusions
In this article, we proposed a robust point set registration method based on the global structure and local constraints. Experiments on synthetic data prove that our method obtains superior results to those of the state-of-the-art methods, even if there are significant deformations, noise, outliers, and rotation in the data. The application of retinal fundus images has also achieved very good results.
This work was supported by the National High Technology Research, Development Program of China (No. 2017YFB0304102), the Shanghai Engineering Research Center of Industrial Vision Perception & Intelligent Computing (17DZ2251600), the Shanghai Innovation Action Project of Science and Technology (No. 17411952200), and the Fundamental Research Funds for the Central Universities.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
References
Rosten E, Drummond T. Machine Learning for High-Speed Corner Detection. Berlin, Heidelberg: Springer Berlin Heidelberg; 2006. p. 430-43.
Lowe DG. Object Recognition from Local Scale-Invariant Features. IEEE; 1999.
Lowe DG. Distinctive image features from scale-invariant keypoints. Int J Comput Vis 2004;60:91-110.
Bay H, Tuytelaars T. SURF: Speeded up Robust Features. Berlin, Heidelberg: Springer Berlin Heidelberg; 2006. p. 404-17.
Leutenegger S, Chli M, Siegwart RY. BRISK: Binary Robust Invariant Scalable Keypoints. IEEE; 2011.
Besl PJ, McKay ND. A method for registration of 3-D shapes. IEEE Trans Pattern Anal Mach Intell 1992;14:239-56.
Chui H, Rangarajan A. A new point matching algorithm for non-rigid registration. Comput Vis Image Underst 2003;89:114-41.
Tsin Y, Kanade T. A correlation-based approach to robust point set registration. Comput Vis Eccv PT 3 2004;3023:558-69.
Myronenko A, Song X. Point set registration: Coherent point drift. IEEE Trans Pattern Anal Mach Intell 2010;32:2262-75.
Jian B, Vemuri BC. Robust point set registration using gaussian mixture models. IEEE Trans Pattern Anal Mach Intell 2011;33:1633-45.
Kato T, Omachi S, Aso H. Asymmetric Gaussian and Its Application to Pattern Recognition. Berlin, Heidelberg: Springer Berlin Heidelberg; 2002. p. 405-13.
Svensén M, Bishop CM. Robust Bayesian mixture modelling. Neurocomputing 2005;64:235-52.
Wang G, Wang Z, Chen Y, Zhao W. A robust non-rigid point set registration method based on asymmetric gaussian representation. Comput Vis Image Underst 2015;141:67-80.
Zhou Z, Zheng J, Dai Y, Zhou Z, Chen S. Robust non-rigid point set registration using student's-t mixture model. PLoS One 2014;9:e91381.
Zhou Z, Tu J, Geng C, Hu J, Tong B, Ji J, et al. Accurate and robust non-rigid point set registration using student's-t mixture model with prior probability modeling. Sci Rep 2018;8:8742.
Belongie S, Malik J, Puzicha J. Shape matching and object recognition using shape contexts. IEEE Trans Pattern Anal Mach Intell 2002;24:509-22.
Peng L, Li G, Xiao M, Xie L. Robust CPD algorithm for non-rigid point set registration based on structure information. PLoS One 2016;11:e0148483.
Ma J, Zhao J, Yuille AL. Non-rigid point set registration by preserving global and local structures. IEEE Trans Image Process 2016;25:53-64.
Zhang S, Yang Y, Yang K, Luo Y, Ong SH. Point Set Registration with Global-Local Correspondence and Transformation Estimation. IEEE International Conference on Computer Vision (ICCV); 2017.
Ge S, Fan G, Ding M. Non-Rigid Point Set Registration with Global-Local Topology Preservation. IEEE; 2014.
Yang Y, Ong SH, Foong KW. A robust global and local mixture distance based non-rigid point set registration. Pattern Recognit 2015;48:156-73.
Bai L, Yang X, Gao H. Nonrigid point set registration by preserving local connectivity. IEEE Trans Cybern 2018;48:826-35.
Starck J, Hilton A. Correspondence Labelling for Wide-Timeframe Free-Form Surface Matching. IEEE; 2007.
Zheng Y, Doermann D. Robust point matching for nonrigid shapes by preserving local neighborhood structures. IEEE Trans Pattern Anal Mach Intell 2006;28:643-9.
Hernandez-Matas C, Zabulis X, Triantafyllou A, Anyfanti P, Douma S, Argyros AA. FIRE: fundus image registration dataset. J Model Ophthalmol 2017;1:16-28.
Chen J, Tian J, Lee N, Zheng J, Smith RT, Laine AF, et al. A partial intensity invariant feature descriptor for multimodal retinal image registration. IEEE Trans Biomed Eng 2010;57:1707-18.
Mukherjee D, Jonathan Wu QM, Wang G. A comparative experimental study of image feature detectors and descriptors. Mach Vis Appl 2015;26:443-66.