str.count("\n") is 1.3-170 times faster than str.lines.count or str.each_line.count depending on the string size #220

@ilyazub

str.count("\n") is 1.3-170 times faster than str.lines.count or str.each_line.count (ref: https://serpapi.com/blog/lines-count-failed-deployments/). The speed difference grows with the number of lines.
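One caveat worth keeping in mind when swapping these calls: `count("\n")` counts newline characters, while `lines.count` counts lines, so the two differ by one when the string does not end with a newline. The sketch below illustrates this; `line_count` is a hypothetical helper name, not part of Ruby or fast-ruby, and it assumes "\n" is the only line separator.

```ruby
# Hypothetical helper: counts lines via String#count and adds one
# when a non-empty string lacks a trailing newline.
def line_count(str)
  newlines = str.count("\n")
  return newlines if str.empty? || str.end_with?("\n")

  newlines + 1
end

with_trailing    = "a\nb\n"
without_trailing = "a\nb"

puts with_trailing.count("\n")       # newline characters in "a\nb\n"
puts with_trailing.lines.count       # lines in "a\nb\n"
puts without_trailing.count("\n")    # one fewer than the line count here
puts without_trailing.lines.count
puts line_count(without_trailing)    # agrees with lines.count
```

For input that always ends with a newline, as in the benchmark string, plain `count("\n")` already gives the same result as `lines.count`.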

$ ruby tmp/string_count_benchmark.rb
Warming up --------------------------------------
  String#count('\n')    86.000  i/100ms
   String#lines.size     1.000  i/100ms
  String#lines.count     1.000  i/100ms
String#each_line.count
                         1.000  i/100ms
Calculating -------------------------------------
  String#count('\n')    771.031  (± 6.6%) i/s -      3.870k in   5.041849s
   String#lines.size      4.785  (± 0.0%) i/s -     24.000  in   5.037242s
  String#lines.count      4.513  (± 0.0%) i/s -     23.000  in   5.112095s
String#each_line.count
                          4.763  (± 0.0%) i/s -     24.000  in   5.075882s

Comparison:
  String#count('\n'):      771.0 i/s
   String#lines.size:        4.8 i/s - 161.12x  (± 0.00) slower
String#each_line.count:        4.8 i/s - 161.87x  (± 0.00) slower
  String#lines.count:        4.5 i/s - 170.86x  (± 0.00) slower

Benchmark code:

require "benchmark/ips"

HTML = "\nruby\n" * 1024 * 1024

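# Counts newline characters directly, without allocating per-line strings.
def fastest
  HTML.count("\n")
end

# Builds an array of lines, then reads its size.
def faster
  HTML.lines.size
end

# Builds an array of lines, then counts its elements.
def fast
  HTML.lines.count
end

# Iterates the lines through an enumerator and counts them.
def slow
  HTML.each_line.count
end

# Each label now matches the expression its method calls.
Benchmark.ips do |x|
  x.report("String#count('\\n')")     { fastest }
  x.report("String#lines.size")       { faster  }
  x.report("String#lines.count")      { fast    }
  x.report("String#each_line.count")  { slow    }
  x.compare!
end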
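Before comparing speeds, it may be worth confirming that the candidates return the same value for this kind of input. The following is a minimal sketch using a smaller string than the benchmark's; `SAMPLE` and `counts` are names chosen for illustration only.

```ruby
# Smaller version of the benchmark string: 1024 repetitions of "\nruby\n".
SAMPLE = "\nruby\n" * 1024

# Collect the result of each candidate expression under its label.
counts = {
  "count"           => SAMPLE.count("\n"),
  "lines.size"      => SAMPLE.lines.size,
  "lines.count"     => SAMPLE.lines.count,
  "each_line.count" => SAMPLE.each_line.count
}

counts.each { |label, value| puts "#{label}: #{value}" }
puts "all equal: #{counts.values.uniq.size == 1}"
```

The values agree here because every repetition ends with a newline; a string without a trailing newline would make `count("\n")` come out one lower than the others.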

I'd like to add this benchmark to fast-ruby. Wdyt?


Based on our updates to @guilhermesimoes' very helpful gist: https://gist.github.com/guilhermesimoes/d69e547884e556c3dc95?permalink_comment_id=4687645#gistcomment-4687645
