Hi,
I have started 2 PRs to solve the problems you mentioned.
About the "CentroidInitializer", I have a new idea:
move the CentroidInitializer implementations into "KMeansPlusPlusClusterer" as nested classes,
and add a constructor parameter and a property "useKMeansPlusPlus" to "KMeansPlusPlusClusterer":
```java
// Add "useKMeansPlusPlus" to "KMeansPlusPlusClusterer"
public class KMeansPlusPlusClusterer<T extends Clusterable> extends Clusterer<T> {
public KMeansPlusPlusClusterer(final int k, final int maxIterations,
final DistanceMeasure measure,
final UniformRandomProvider random,
final EmptyClusterStrategy emptyStrategy,
+ final boolean useKMeansPlusPlus) {
// ...
- // Use Kmeans++ to choose the initial centers.
- this.centroidInitializer = new KMeansPlusPlusCentroidInitializer(measure, random);
+ this.useKMeansPlusPlus = useKMeansPlusPlus;
}
public boolean isUseKMeansPlusPlus() { return this.useKMeansPlusPlus; }
// Make "chooseInitialCenters" package-private and have it call "CentroidInitializer.selectCentroids",
// so that "chooseInitialCenters" can be reused by "MiniBatchKMeansClusterer".
List<CentroidCluster<T>> chooseInitialCenters(final Collection<T> points){
// Use Kmeans++ to choose the initial centers.
final CentroidInitializer centroidInitializer = useKMeansPlusPlus
    ? new KMeansPlusPlusCentroidInitializer(this.measure, this.random)
    : new RandomCentroidInitializer(this.random);
return centroidInitializer.selectCentroids(points, this.k);
}
// Make CentroidInitializer private
private static interface CentroidInitializer {
<T extends Clusterable> List<CentroidCluster<T>> selectCentroids(final Collection<T> points, final int k);
}
private static class RandomCentroidInitializer implements CentroidInitializer {...}
private static class KMeansPlusPlusCentroidInitializer implements CentroidInitializer {...}
```
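To make the idea concrete, here is a minimal, self-contained sketch of the flag-selects-initializer pattern. It is not the Commons Math code: "SketchClusterer", "RandomInitializer" and "KMeansPlusPlusInitializer" are hypothetical stand-ins, points are plain double[] instead of Clusterable, java.util.Random replaces UniformRandomProvider, and the distance is hard-coded to squared Euclidean.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for KMeansPlusPlusClusterer (illustration only).
class SketchClusterer {
    private final int k;
    private final Random random;
    private final boolean useKMeansPlusPlus;

    SketchClusterer(final int k, final Random random, final boolean useKMeansPlusPlus) {
        this.k = k;
        this.random = random;
        this.useKMeansPlusPlus = useKMeansPlusPlus;
    }

    boolean isUseKMeansPlusPlus() {
        return useKMeansPlusPlus;
    }

    // Package-private, so a subclass in the same package can reuse it.
    List<double[]> chooseInitialCenters(final Collection<double[]> points) {
        final CentroidInitializer initializer = useKMeansPlusPlus
            ? new KMeansPlusPlusInitializer(random)
            : new RandomInitializer(random);
        return initializer.selectCentroids(points, k);
    }

    // The strategy interface and its implementations stay hidden inside the clusterer.
    private interface CentroidInitializer {
        List<double[]> selectCentroids(Collection<double[]> points, int k);
    }

    // Picks k distinct points uniformly at random.
    private static final class RandomInitializer implements CentroidInitializer {
        private final Random random;

        RandomInitializer(final Random random) {
            this.random = random;
        }

        @Override
        public List<double[]> selectCentroids(final Collection<double[]> points, final int k) {
            final List<double[]> shuffled = new ArrayList<>(points);
            Collections.shuffle(shuffled, random);
            return new ArrayList<>(shuffled.subList(0, Math.max(0, Math.min(k, shuffled.size()))));
        }
    }

    // k-means++ seeding: each new center is drawn with probability proportional
    // to its squared distance from the nearest center chosen so far.
    private static final class KMeansPlusPlusInitializer implements CentroidInitializer {
        private final Random random;

        KMeansPlusPlusInitializer(final Random random) {
            this.random = random;
        }

        @Override
        public List<double[]> selectCentroids(final Collection<double[]> points, final int k) {
            final List<double[]> pool = new ArrayList<>(points);
            final List<double[]> centers = new ArrayList<>();
            if (pool.isEmpty() || k <= 0) {
                return centers;
            }
            centers.add(pool.remove(random.nextInt(pool.size())));
            while (centers.size() < k && !pool.isEmpty()) {
                final double[] d2 = new double[pool.size()];
                double sum = 0;
                for (int i = 0; i < d2.length; i++) {
                    d2[i] = minSquaredDistance(pool.get(i), centers);
                    sum += d2[i];
                }
                double r = random.nextDouble() * sum;
                int chosen = d2.length - 1; // fallback when all remaining distances are zero
                for (int i = 0; i < d2.length; i++) {
                    r -= d2[i];
                    if (r < 0) {
                        chosen = i;
                        break;
                    }
                }
                centers.add(pool.remove(chosen));
            }
            return centers;
        }

        private static double minSquaredDistance(final double[] p, final List<double[]> centers) {
            double min = Double.POSITIVE_INFINITY;
            for (final double[] c : centers) {
                double s = 0;
                for (int i = 0; i < p.length; i++) {
                    s += (p[i] - c[i]) * (p[i] - c[i]);
                }
                min = Math.min(min, s);
            }
            return min;
        }
    }
}
```

With this shape, callers only ever see the boolean; the initializer types never leak into the public API.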
The "CentroidInitializer" is only used in "KMeansPlusPlusClusterer" and "MiniBatchKMeansClusterer";
the other k-means-based algorithms take a "KMeansPlusPlusClusterer" as a parameter.
```java
// Changes in "MiniBatchKMeansClusterer"
public class MiniBatchKMeansClusterer<T extends Clusterable>
    extends KMeansPlusPlusClusterer<T> {
public MiniBatchKMeansClusterer(final int k,
final int maxIterations,
final int batchSize,
final int initIterations,
final int initBatchSize,
final int maxNoImprovementTimes,
final DistanceMeasure measure,
final UniformRandomProvider random,
final EmptyClusterStrategy emptyStrategy,
+ final boolean useKMeansPlusPlus) {
- super(k, maxIterations, measure, random, emptyStrategy);
+ super(k, maxIterations, measure, random, emptyStrategy, useKMeansPlusPlus);
//...
}
//...
private List<CentroidCluster<T>> initialCenters(final List<T> points) {
//...
- final List<CentroidCluster<T>> clusters = getCentroidInitializer().selectCentroids(initialPoints, getK());
+ final List<CentroidCluster<T>> clusters = chooseInitialCenters(initialPoints);
//...
}
}
```
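The reuse on the mini-batch side could look roughly like the sketch below. Again, this only illustrates the inheritance shape and is not the real classes: "BaseClustererSketch" and "MiniBatchClustererSketch" are hypothetical names, points are plain double[], and the base class's selection logic is reduced to a placeholder (a random pick that ignores the flag) so the example stays short.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for KMeansPlusPlusClusterer (illustration only).
class BaseClustererSketch {
    private final int k;
    private final Random random;
    private final boolean useKMeansPlusPlus;

    BaseClustererSketch(final int k, final Random random, final boolean useKMeansPlusPlus) {
        this.k = k;
        this.random = random;
        this.useKMeansPlusPlus = useKMeansPlusPlus;
    }

    Random getRandom() {
        return random;
    }

    boolean isUseKMeansPlusPlus() {
        return useKMeansPlusPlus;
    }

    // Package-private. Placeholder body: picks k points at random and ignores
    // the flag; the real method would select the initializer based on it.
    List<double[]> chooseInitialCenters(final Collection<double[]> points) {
        final List<double[]> shuffled = new ArrayList<>(points);
        Collections.shuffle(shuffled, random);
        return new ArrayList<>(shuffled.subList(0, Math.max(0, Math.min(k, shuffled.size()))));
    }
}

// Hypothetical stand-in for MiniBatchKMeansClusterer.
class MiniBatchClustererSketch extends BaseClustererSketch {
    private final int initBatchSize;

    MiniBatchClustererSketch(final int k, final int initBatchSize,
                             final Random random, final boolean useKMeansPlusPlus) {
        // The flag is simply forwarded to the base class.
        super(k, random, useKMeansPlusPlus);
        this.initBatchSize = initBatchSize;
    }

    // Draws a random batch, then reuses the inherited method on that batch only.
    List<double[]> initialCenters(final List<double[]> points) {
        final List<double[]> shuffled = new ArrayList<>(points);
        Collections.shuffle(shuffled, getRandom());
        final int size = Math.max(0, Math.min(initBatchSize, shuffled.size()));
        final List<double[]> initialPoints = shuffled.subList(0, size);
        return chooseInitialCenters(initialPoints);
    }
}
```

The point is that the subclass no longer needs a getter for the initializer; it only needs to live in the same package as the base class.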

On Tue, 24 Mar 2020 at 06:39


