Sunday, October 7, 2007

Is bfile faster than the standard Erlang file module?

Steve Vinoski is using klacke’s bfile module in his Wide Finder Project work, but I don't see why bfile should be faster than the Erlang/OTP file module. So I measured. I ran Steve's read test and my own test on my old home desktop (model name : AMD Athlon(tm) processor, stepping : 2, cpu MHz : 1199.805, cache size : 256 KB). Here is my test, which uses the standard file module:
-module(readold).
-export([start/1, start/2]).
-compile([native]).

%% Read the file in Readsize-byte chunks and sum the chunk sizes.
scan_file(F, Readsize, Total) ->
   Rd = file:read(F, Readsize),
   case Rd of
       {ok, Bin} -> scan_file(F, Readsize, size(Bin)+Total);
       eof -> Total
   end.
scan_file(F, Readsize) -> scan_file(F, Readsize, 0).

%% Open File in raw binary mode, count its bytes and print the total.
start(File, Readsize) ->
   {ok, F} = file:open(File, [raw, binary, read]),
   T = scan_file(F, Readsize),
   io:format("read ~p bytes~n", [T]),
   file:close(F).
%% Default to 512 KB reads.
start(File) ->
   start(File, 512*1024).
And here are the results (timer:tc reports elapsed time in microseconds; readold is my module above, read is Steve's test):
2> timer:tc(readold,start,["o1M.ap"]).
read 200995500 bytes
{1041306,ok}
3> timer:tc(readold,start,["o1M.ap"]).
read 200995500 bytes
{836876,ok}
4> c(readold).
{ok,readold}
5> timer:tc(readold,start,["o1M.ap"]).
read 200995500 bytes
{837501,ok}
6> timer:tc(read,start,["o1M.ap"]).  
read 200995500 bytes
{1353678,true}
7> timer:tc(read,start,["o1M.ap"]).
read 200995500 bytes
{1237174,true}
8> timer:tc(read,start,["o1M.ap"]).
read 200995500 bytes
{1318029,true}
9> timer:tc(readold,start,["o1M.ap"]).
read 200995500 bytes
{856662,ok}
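The repeated shell calls above could be wrapped in a small helper. This is only a sketch: the module name bench and the function best_of/4 are hypothetical and not part of either test. It relies only on timer:tc/3 and takes the fastest of N runs, so that a cold file cache on the first run does not skew the comparison.

```erlang
-module(bench).
-export([best_of/4]).

%% Hypothetical helper: call M:F(Args) N times with timer:tc/3 and
%% return the fastest elapsed time in microseconds.
best_of(N, M, F, Args) when is_integer(N), N > 0 ->
    Times = [element(1, timer:tc(M, F, Args)) || _ <- lists:seq(1, N)],
    lists:min(Times).
```

For example, bench:best_of(5, readold, start, ["o1M.ap"]). would time five runs of the test above and return the shortest one.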
In general, I don't see why Erlang's file module should be slower. I don't know why bfile is about 45% slower than file on my old home desktop, but why should it be faster anywhere else? I tested on Linux; maybe bfile, which uses the BSD file implementation, is faster on Darwin, which is BSD-derived? In any case, the file implementation is fast enough on my Erlang/OTP R11B-5.
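The slowdown figure can be checked against the timings in the shell session. The sketch below is illustrative only: the module name slowdown and its functions are hypothetical, and it assumes the warm-cache runs (shell prompts 3, 5 and 9 for readold; 6, 7 and 8 for read) are the ones to compare.

```erlang
-module(slowdown).
-export([percent/2, report/0]).

%% Percentage by which Slow exceeds Fast (same time unit for both).
percent(Slow, Fast) ->
    (Slow - Fast) * 100 / Fast.

%% Timings in microseconds, copied from the shell session in the post.
report() ->
    File  = [836876, 837501, 856662],      % readold (file module)
    Bfile = [1353678, 1237174, 1318029],   % read (bfile)
    %% Most favourable pairing for bfile: its fastest run
    %% against the slowest warm run of the file module.
    percent(lists:min(Bfile), lists:max(File)).
```

With this pairing the result is roughly 44%, in line with the figure quoted above; other pairings of the same runs give a larger gap.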

1 comment:

Caoyuan Deng said...

Hi pichi,

I agree with your conclusion about file reading. I think what bfile tries to solve is the poor file:get_line performance. bfile:get_line should be much faster than file:get_line.

In summary, the overall problem is still Erlang's processing performance for large binaries or lists.