Sunday, October 28, 2007

Faster than Ruby and still scalable

Update 2007-11-01: Correction of typo in wf_pichi3.erl.
@@ -14,7 +14,7 @@

 -compile([native]).

-main([File]) -> start(File), halt().
+main([File]) -> start_bmets(File), halt().

 start_bmets(FileName) ->
     {ok, F} = nlt_reader:open(FileName),
I worked on the Wide Finder Project again. So what happened? I improved Anders Nygren's code using a suggestion from my previous blog post and a big suggestion from Caoyuan's blog. First I made some building blocks: chunk_reder.erl, which does read-ahead reading with consequential support; nlt_reader.erl, a concurrent splitter and catenator of newline-terminated blocks; and the file_map_reduce.erl engine. Then I plugged them together in the wf_pichi3.erl wide finder. And what is great about it? It's about 40% faster on a single core than the Ruby code and still scalable:
$ time ruby1.8 tbray.rb o1M.ap
8900: 2006/09/29/Dynamic-IDE
2000: 2006/07/28/Open-Data
1300: 2003/07/25/NotGaming
800: 2006/01/31/Data-Protection
800: 2003/09/18/NXML
800: 2003/10/16/Debbie
700: 2003/06/23/SamsPie
600: 2006/01/08/No-New-XML-Languages
600: 2005/11/03/Cars-and-Office-Suites
600: 2005/07/27/Atomic-RSS

real    0m7.469s
user    0m6.528s
sys     0m0.940s
$ time erl -noshell -run wf_pichi3 main o1M.ap
8900: 2006/09/29/Dynamic-IDE
2000: 2006/07/28/Open-Data
1300: 2003/07/25/NotGaming
800: 2003/09/18/NXML
800: 2003/10/16/Debbie
800: 2006/01/31/Data-Protection
700: 2003/06/23/SamsPie
600: 2006/01/08/No-New-XML-Languages
600: 2006/09/07/JRuby-guys
600: 2005/07/27/Atomic-RSS

real    0m5.370s
user    0m4.412s
sys     0m0.952s
It's a big improvement over my last code, about 365% ;-) The good thing is that it's a nice jigsaw and very powerful. I think it is just what Tim Bray wanted when he started the Wide Finder Project.
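
To illustrate the idea behind a newline-terminated block splitter, here is a minimal sketch. It is not the code from nlt_reader.erl: the module name nl_split_sketch and the functions split_blocks/1 and last_newline/2 are made up for this example, it runs sequentially rather than concurrently, and it takes the chunks as an in-memory list instead of reading them from a file. It also uses binary:at/2, which needs a newer Erlang/OTP release (R14 or later). Each incoming chunk is glued to the leftover tail of the previous one and then cut after its last newline, so every emitted block contains only whole lines and can be handed to a worker on its own.

```erlang
-module(nl_split_sketch).
-export([split_blocks/1]).

%% Hypothetical helper, not part of the original wf_pichi3 sources.
%% Takes a list of raw chunks (binaries) and returns a list of blocks,
%% each ending on a newline. Trailing data without a newline, if any,
%% is returned as the last block.
split_blocks(Chunks) ->
    split_blocks(Chunks, <<>>, []).

%% No more chunks and nothing carried over: done.
split_blocks([], <<>>, Acc) ->
    lists:reverse(Acc);
%% No more chunks, but an unterminated tail remains: emit it as is.
split_blocks([], Rest, Acc) ->
    lists:reverse([Rest | Acc]);
split_blocks([Chunk | Chunks], Carry, Acc) ->
    %% Catenate the tail of the previous chunk with the new chunk.
    Data = <<Carry/binary, Chunk/binary>>,
    case last_newline(Data, byte_size(Data) - 1) of
        none ->
            %% No newline yet: keep accumulating.
            split_blocks(Chunks, Data, Acc);
        Pos ->
            %% Cut just after the last newline; the rest is carried over.
            Len = Pos + 1,
            <<Block:Len/binary, Rest/binary>> = Data,
            split_blocks(Chunks, Rest, [Block | Acc])
    end.

%% Scan backwards for the position of the last newline byte.
last_newline(_Data, Pos) when Pos < 0 ->
    none;
last_newline(Data, Pos) ->
    case binary:at(Data, Pos) of
        $\n -> Pos;
        _ -> last_newline(Data, Pos - 1)
    end.
```

For example, nl_split_sketch:split_blocks([<<"a\nb">>, <<"c\nd">>, <<"e">>]) glues the partial line "b" onto the next chunk, so no line is ever split between two blocks.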

1 comment:

Tim said...

Hey, I went and got wf_pichi3.erl and this happened:

sca12-3200a-40 ~/> erlc -smp wf_pichi3.erl
./wf_pichi3.erl:17: function start/1 undefined

Huh? Could you email me the code... tim dot bray at sun dot com