Scalable Scala

     

Back to main site

 

Why is using for/foreach on a Range slow?


When running timing tests for the first JVM Concurrency article I was surprised to see the original Scala code taking much longer to execute than the Java code - more than twice as long. I experimented a bit, and found that the performance issue came down to the structure of the loops in the Scala editDistance method. The original idiomatic Scala form of the loops:

      @tailrec
      def distanceByRow(rnum: Int, r0: Array[Int], r1: Array[Int]): Int = {
        if (rnum >= length) r0(limit)
        else {

          // first element of r1 = delete (i+1) chars from target to match empty 'word'
          r1(0) = rnum + 1

          // use formula to fill in the rest of the row
          for (j <- 0 until limit) {
            val cost = if (word(rnum) == targetText(j)) 0 else 1
            r1(j + 1) = min(r1(j) + 1, r0(j + 1) + 1, r0(j) + cost);
          }

          // recurse with arrays swapped for next row
          distanceByRow(rnum + 1, r1, r0)
        }
      }

      // initialize v0 (prior row of distances) as edit distance for empty 'word'
      for (i <- 0 to limit) v0(i) = i

uses for comprehensions, which actually translate into Range(0 to limit).foreach(...). When I converted these into tail recursive method calls (which Scala translates into the equivalent of Java while loops) the performance went up to par with the Java code.

So why the big difference? If you look into the Scala library code you'll see that the Range.foreach() just does a range check and then executes a simple while loop with a mutable control variable, calling the passed-in closure with the variable value each time through the loop. This is the version in use with the 2.10.3 library (it's changed a bit for 2.11.0, by pulling the range check into this method, but without any noticeable effect on performance):

  @inline final override def foreach[@specialized(Unit) U](f: Int => U) {
    if (validateRangeBoundaries(f)) {
      var i = start
      val terminal = terminalElement
      val step = this.step
      while (i != terminal) {
        f(i)
        i += step
      }
    }
  }

The call to foreach() is a one-time thing, so this shouldn't effect the overall time. Looking at the generated bytecode for the closures also doesn't show anything that leaps out at me — here's the closure (i.e., anonymous inner class in Java terms) method called by for (i <- 0 to limit) v0(i) = i:

       public void apply$mcVI$sp(int);
    Code:
       0: aload_0       
       1: getfield      #22                 // Field v0$3:[I
       4: iload_1       
       5: iload_1       
       6: iastore       
       7: return        

So as far as I can see, the only big difference is that this code is being called from the same Range.foreach method that's also being called in many other places, and because of this shared call site the JVM doesn't end up inlining it. That's a shame if so — I'd like the JVM to be smart enough to see it should inline the Range.foreach method at the point of use, and then inline the method from the closure.

Unfortunately it doesn't look like this is handled any better by the Java 8 JVM, either. And there's an open Scala improvement request from 2008 to improve simple for loop performance which has not seen any action: Optimize simple for loops, so there's probably no chance things are going to get better any time soon.

If you're concerned about performance, and low-level loops are a big part of your Scala application, you're probably best off using tail-recursive methods, while loops with a mutable control variable, or one of the macro transforms that changes how the for/foreach is generated by Scala. See this thread from last year for some pointers on this topic: Re: [scala-user] Request for adding a fast for loop to the language. I've also started a new thread on scala-lang to discuss why this case is not handled by hotspot: Range.foreach performance. I'll add a follow-up post later with a summary of the conclusions.

 
 
 
 
Comments:

Post a Comment:
  • HTML Syntax: NOT allowed
 

« December 2023
SunMonTueWedThuFriSat
     
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
      
Today

Valid XHTML or CSS?

[This is a Roller site]
Theme by Rowell Sotto.
 
© dsosnoski