# [math] Matrix parallel operations

7 messages
Open this post in threaded view
|

## [math] Matrix parallel operations

 Hi, Hope ya'll are having an awesome new year! Some matrix operations, like createRealIdentityMatrix can be turned into one liners like this:         IntStream.range(0, dimension).forEach(i -> m.setEntry(i, i, 1.0)); And can be performed in parallel like this:          IntStream.range(0, dimension).parallel().forEach(i -> m.setEntry(i, i, 1.0)); Applying that approach in general we could probably create a ParallelXxxRealMatrix fairly easily.   Just thought I'd float the idea. Cheers, Ole --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email]
Open this post in threaded view
|

## Re: [math] Matrix parallel operations

 On Sat, Jan 2, 2016 at 4:38 AM, Ole Ersoy <[hidden email]> wrote: > Hi, > > Hope ya'll are having an awesome new year! > > Some matrix operations, like createRealIdentityMatrix can be turned into one > liners like this: > >        IntStream.range(0, dimension).forEach(i -> m.setEntry(i, i, 1.0)); > > And can be performed in parallel like this: >         IntStream.range(0, dimension).parallel().forEach(i -> m.setEntry(i, > i, 1.0)); > > Applying that approach in general we could probably create a > ParallelXxxRealMatrix fairly easily.   Just thought I'd float the idea. > > Cheers, > Ole > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [hidden email] > For additional commands, e-mail: [hidden email] > Once we go for Java 8, yes. However, all these constructs have some overhead. I doubt that the naive parallelization using parallel() is faster than the sequential counterpart, especially for operations such as matrix initializations. A more efficient parallel implementation would require the definition of a Spliterator that divides an operation into less but larger chunks of work in order to amortize synchronization costs. In general, the implementation of efficient and well scaling parallel matrix operations requires more work than writing just a single line of code. Otmar --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email]
Open this post in threaded view
|

## Re: [math] Matrix parallel operations

 Hi Otmar, On 01/02/2016 10:33 AM, Otmar Ertl wrote: > On Sat, Jan 2, 2016 at 4:38 AM, Ole Ersoy <[hidden email]> wrote: >> Hi, >> >> Hope ya'll are having an awesome new year! >> >> Some matrix operations, like createRealIdentityMatrix can be turned into one >> liners like this: >> >>         IntStream.range(0, dimension).forEach(i -> m.setEntry(i, i, 1.0)); >> >> And can be performed in parallel like this: >>          IntStream.range(0, dimension).parallel().forEach(i -> m.setEntry(i, >> i, 1.0)); >> >> Applying that approach in general we could probably create a >> ParallelXxxRealMatrix fairly easily.   Just thought I'd float the idea. >> >> Cheers, >> Ole >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [hidden email] >> For additional commands, e-mail: [hidden email] >> > Once we go for Java 8, yes. However, all these constructs have some > overhead. I doubt that the naive parallelization using parallel() is > faster than the sequential counterpart, especially for operations such > as matrix initializations. I'm sure it depends...But just to find out whether there is any merit to it I ran the identity matrix code single threaded and parallel (With JMH), and these were the results (On my HP Envy laptop 8 cores): # Run complete. Total time: 00:14:08 Benchmark                          Mode  Cnt      Score     Error Units MyBenchmark.parallelIdentity      thrpt  200  18875.739 ± 400.382 ops/s MyBenchmark.singleThreadIdentity  thrpt  200  11046.079 ± 130.713 ops/s So almost twice the throughput with parallel().  This was on a 20K X 20K matrix.  I'll paste the test code at the bottom. >   A more efficient parallel implementation > would require the definition of a Spliterator that divides an > operation into less but larger chunks of work in order to amortize > synchronization costs. In general, the implementation of efficient and > well scaling parallel matrix operations requires more work than > writing just a single line of code. I'm sure you are right.  I'm going to test array based matrix vector multiplication next.  I'm curious to see what happens once some basic operations are added. Cheers, Ole public class IdentityBenchmark {      @State(Scope.Thread)      public static class Identity {          public int dimension = 20000;          public double[][] m = new double[dimension][dimension];      }      @Benchmark      public void singleThreadIdentity(Identity m) {          IntStream.range(0, m.dimension).forEach(i -> m.m[i][i] = 1.0);      }      @Benchmark      public void parallelIdentity(Identity m) {          IntStream.range(0, m.dimension).parallel().forEach(i -> m.m[i][i] = 1.0);      } } --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email]
Open this post in threaded view
|

## Re: [math] Matrix parallel operations

 In reply to this post by Otmar Ertl Hi, I ran another test using a single parallel loop for array based matrix vector multiplication.  Throughput almost tripled (Test pasted at bottom): # Run complete. Total time: 00:13:24 Benchmark                                      Mode  Cnt Score Error  Units MultiplyBenchmark.parallelMultiplication      thrpt  200  2221.682 ± 48.689  ops/s MultiplyBenchmark.singleThreadMultiplication  thrpt  200   818.755 ±  9.782  ops/s public class MultiplyBenchmark {      public static double[] multiplySingleThreaded(double[][] matrix, double[] vector) {          return Arrays.stream(matrix)                  .mapToDouble(row -> IntStream.range(0, row.length).mapToDouble(col -> row[col]                          * vector[col]).sum())                  .toArray();      }      public static double[] multiplyConcurrent(double[][] matrix, double[] vector) {          return Arrays.stream(matrix).parallel()                  .mapToDouble(row -> IntStream.range(0, row.length).mapToDouble(col -> row[col]                          * vector[col]).sum())                  .toArray();      }      @State(Scope.Thread)      public static class Matrix {          static int size = 10000;          static double[] vector = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };          public static double[][] matrix = new double[size][10];          static {              for (int i = 0; i < size; i++) {                  matrix[i] = vector.clone();              }          }      }      @Benchmark      public void singleThreadMultiplication(Matrix m) {          multiplySingleThreaded(m.matrix, m.vector);      }      @Benchmark      public void parallelMultiplication(Matrix m) {          multiplyConcurrent(m.matrix, m.vector);      } } Cheers, Ole --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email]
Open this post in threaded view
|

## Re: [math] Matrix parallel operations

 Am 03.01.2016 7:49 vorm. schrieb "Ole Ersoy" <[hidden email]>: > > Hi, > > I ran another test using a single parallel loop for array based matrix vector multiplication.  Throughput almost tripled (Test pasted at bottom): > > # Run complete. Total time: 00:13:24 > > > Benchmark                                      Mode  Cnt Score Error Units > MultiplyBenchmark.parallelMultiplication      thrpt  200  2221.682 ± 48.689  ops/s > MultiplyBenchmark.singleThreadMultiplication  thrpt  200   818.755 ± 9.782  ops/s > > public class MultiplyBenchmark { > >     public static double[] multiplySingleThreaded(double[][] matrix, double[] vector) { >         return Arrays.stream(matrix) >                 .mapToDouble(row -> IntStream.range(0, row.length).mapToDouble(col -> row[col] >                         * vector[col]).sum()) >                 .toArray(); >     } > >     public static double[] multiplyConcurrent(double[][] matrix, double[] vector) { >         return Arrays.stream(matrix).parallel() >                 .mapToDouble(row -> IntStream.range(0, row.length).mapToDouble(col -> row[col] >                         * vector[col]).sum()) >                 .toArray(); >     } > >     @State(Scope.Thread) >     public static class Matrix { >         static int size = 10000; >         static double[] vector = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }; > >         public static double[][] matrix = new double[size][10]; > >         static { >             for (int i = 0; i < size; i++) { >                 matrix[i] = vector.clone(); >             } >         } >     } > >     @Benchmark >     public void singleThreadMultiplication(Matrix m) { >         multiplySingleThreaded(m.matrix, m.vector); >     } > >     @Benchmark >     public void parallelMultiplication(Matrix m) { >         multiplyConcurrent(m.matrix, m.vector); > >     } > } > > Cheers, > Ole > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [hidden email] > For additional commands, e-mail: dev-help@ <[hidden email]> commons.apache.org <[hidden email]> > I am curious to see how this compares to simple for-loops which I can imagine help the JIT compiler to do loop unrolling and to make use of instruction-level parallelism. Otmar