media, modernity, and microcode from your average farmer-turned-programmer

30 January 2007

Erlang fizzbuzzery: Project Euler Problem 1

Another excercise in solving a simple problem using multiple programming idioms in the same language, Erlang. This time the simple problem is Problem 1 from Project Euler: "If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000." In particular, the solutions presented are (1) a simple single-function loop which uses if to check for termination and matching numbers, (2) a simple loop using Erlang's guard syntax, and (3) a few solutions using some of the functions of the lists module, one using Erlang's list comprehension syntax.

Simple loop without guards:

-module(euler_1_loop).

-export([start/0,solve_euler_1/1]).

start() -> io:format("~w~n",[solve_euler_1(1000)]).

solve_euler_1(N) -> solve_euler_1(0,1,N).

solve_euler_1(Sum,Cur,N) ->
if Cur >= N -> Sum;
(0 == (Cur rem 3)) or (0 == (Cur rem 5)) -> solve_euler_1(Sum+Cur,Cur+1,N);
true -> solve_euler_1(Sum,Cur+1,N)
end.
The above does a few things right, such as remembering that all calls in Erlang are call by value and thus the easiest way to implement a recursive loop for this little problem is by simply keeping all the relevant data around as function parameters. It uses Erlang's if to test a series of expressions. As before, the simple module is constructed so that its main path, solve_euler_1/1, is exported so it can be externally tested, and the simple start function is exported so we can simply run the example (after compiling with erlc, of course) with:
erl -s euler_1_loop -run init stop -noshell
We can use Erlang's guard syntax to transform the above code into:
-module(euler_1_guards).

-export([start/0,solve_euler_1/1]).

start() -> io:format("~w~n",[solve_euler_1(1000)]).

solve_euler_1(N) -> solve_euler_1(0,1,N).

solve_euler_1(Sum,Cur,N) when Cur >= N -> Sum;
solve_euler_1(Sum,Cur,N) when (0 == (Cur rem 3)) or (0 == (Cur rem 5)) -> solve_euler_1(Sum+Cur,Cur+1,N);
solve_euler_1(Sum,Cur,N) -> solve_euler_1(Sum,Cur+1,N).
The above is a fairly simplistic example, but after all the problem is itself fairly simplistic. The first match for solve_euler_1/3 checks the end case, for when we have reached the end of the list of numbers we are supposed to be adding, and simply returns our running total. The second match checks to see if the current number is evenly divisible by 3 or 5, and if so, adds the current number to the running total, adds one to the current number, and re-enters the loop, keeping in mind that the matching process will start over again and so the end case will be handled appropriately. The last match simply leaves the running total alone, adds one to the current number, and re-enters the loop.

Using the lists module and Erlang's list comprehension syntax:
-module(euler_1_comprehension).

-export([start/0,solve_euler_1/1]).

start() -> io:format("~w~n",[solve_euler_1(1000)]).

solve_euler_1(N) -> lists:sum([X || X <- lists:seq(1,N-1), (X rem 3 == 0) or (X rem 5 == 0)]).
The above illustrates four important concepts:

lists:sum(List) which returns the sum of each element in a list;

lists:seq(Start,End) which returns a list of numbers from Start through End;

[X || X <- List] which is a simple list comprehension syntax example that actually produces the exact same input List (unlike [math:pow(X,2) || X <- List] which produces a list of the squares of the elements in the input List, or [{n,X} || X <- List] which produces a list of tuples where the first element is the atom 'n' and the second element is the original list element, or [{X,math:pow(X,2)} || X <- List], or...);

[X || X <- List, Expression] which filters the original List by only including those elements for which the given Expression is true, in this case, (X rem 3 == 0) or (X rem 5 == 0).

Similar to the above is using the lists module's filter function:
-module(euler_1_filter).

-export([start/0,solve_euler_1/1]).

start() -> io:format("~w~n",[solve_euler_1(1000)]).

solve_euler_1(N) -> lists:sum(lists:filter(fun(X) -> (X rem 3 == 0) or (X rem 5 == 0) end,lists:seq(1,N-1))).
This introduces the filter function itself, as well as the creation of an anonymous function in Erlang.

Yet another method of solving this problem comes again from the lists module, this time using foldl to accumulate along the list, and showing that anywhere you might declare an anonymous function (such as above) you can also use named functions:
-module(euler_1_foldl).

-export([start/0,solve_euler_1/1]).

start() -> io:format("~w~n",[solve_euler_1(1000)]).

solve_euler_1(N) -> lists:foldl(fun accum_fun/2,0,lists:seq(1,N-1)).

accum_fun(X,Sum) when X rem 3 == 0; X rem 5 == 0 -> X + Sum;
accum_fun(_X,Sum) -> Sum.
Here we have a couple of new things to see, of course the foldl function itself, but also we use a named function, accum_fun, where previously we had declared an anonymous function inline, and we used a special guard syntax where ; is really quite the same as or.

Hopefully this was at least a little helpful in illustrating a few ways that using some of Erlang's features can be fun, simple, fast, and clear. This example didn't really give us an opportunity to look at processes or matching, but I'm sure those will come along again soon enough. [More]

25 January 2007

Erlang fizzbuzzery

Herein, a few different ways of implementing "a" solution to the FizzBuzz problem, showing examples of some different programming styles in Erlang solving the same problem.

Using a loop:

-module(fizzbuzz_with_loop).

-export([start/0]).

start() ->
fizzbuzz(1,100).

fizzbuzz(N,M) when N > M -> noop;
fizzbuzz(N,M) ->
io:format("~s~n",[fizzbuzz(N)]),
fizzbuzz(N+1,M).

fizzbuzz(N) when N rem 3 == 0, N rem 5 == 0 -> "FizzBuzz";
fizzbuzz(N) when N rem 3 == 0 -> "Fizz";
fizzbuzz(N) when N rem 5 == 0 -> "Buzz";
fizzbuzz(N) -> integer_to_list(N).

Compile the module with erlc fizzbuzz_loop.erl and run with erl -s fizzbuzz_loop -run init stop -noshell [1] to produce:
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
16
...

Using foreach (foreach is used because the order of evaluation is important):
-module(fizzbuzz_with_foreach).

-export([start/0]).

start() ->
lists:foreach(fun(X) -> io:format("~s~n",[fizzbuzz(X)]) end, lists:seq(1,100)).

fizzbuzz(N) when N rem 3 == 0, N rem 5 == 0 -> "FizzBuzz";
fizzbuzz(N) when N rem 3 == 0 -> "Fizz";
fizzbuzz(N) when N rem 5 == 0 -> "Buzz";
fizzbuzz(N) -> integer_to_list(N).

Using list comprehension:
-module(fizzbuzz_with_list_comprehension).

-export([start/0]).

start() ->
lists:foreach(fun(Y) -> io:format("~s~n",[Y]) end, [fizzbuzz(X) || X <- lists:seq(1,100)]).

fizzbuzz(N) when N rem 3 == 0, N rem 5 == 0 -> "FizzBuzz";
fizzbuzz(N) when N rem 3 == 0 -> "Fizz";
fizzbuzz(N) when N rem 5 == 0 -> "Buzz";
fizzbuzz(N) -> integer_to_list(N).

Using processes:
-module(fizzbuzz_with_processes).

% export default start function for command line
-export([start/0]).

% export entry point for spawned processes
-export([fizzbuzz_process/0]).

start() ->
Fizz_PID = spawn(fizzbuzz_with_processes,fizzbuzz_process,[]),
lists:foreach(fun(X) -> ask_fizzbuzz(Fizz_PID,X) end, lists:seq(1,100)),
Fizz_PID ! {done}.

ask_fizzbuzz(Fizz_PID,N) ->
Fizz_PID ! {fizzbuzz,N,self()},
receive {Answer} -> io:format("~s~n",[Answer]) end.

fizzbuzz_process() ->
receive
{fizzbuzz,N,Caller_PID} -> Caller_PID ! {fizzbuzz(N)}, fizzbuzz_process();
{done} -> noop end.

fizzbuzz(N) when N rem 3 == 0, N rem 5 == 0 -> "FizzBuzz";
fizzbuzz(N) when N rem 3 == 0 -> "Fizz";
fizzbuzz(N) when N rem 5 == 0 -> "Buzz";
fizzbuzz(N) -> integer_to_list(N).

With too many processes:
-module(fizzbuzz_with_too_many_processes).

% export default start function for command line
-export([start/0]).

% export entry point for spawned processes
-export([fizzbuzz_fizzer/1]).
-export([fizzbuzz_buzzer/1]).
-export([fizzbuzz_number/1]).
-export([fizzbuzz_printer/0]).

start() ->
Printer = spawn(fizzbuzz_with_too_many_processes, fizzbuzz_printer, []),
Processes = lists:map(fun(F) -> spawn(fizzbuzz_with_too_many_processes, F, [Printer]) end, [fizzbuzz_fizzer,fizzbuzz_buzzer,fizzbuzz_number]),
lists:foreach(fun(X) -> lists:foreach(fun(P) -> P ! {n,X,self()}, receive {ok} -> noop end end, Processes), Printer ! {newline} end, lists:seq(1,100)),
lists:foreach(fun(P) -> P ! {done} end, [Printer|Processes]).

fizzbuzz_fizzer(Printer) ->
receive
{n,N,P} ->
if N rem 3 == 0 -> Printer ! {s,"Fizz"};
true -> noop
end,
P ! {ok},
fizzbuzz_fizzer(Printer);
{done} -> noop end.

fizzbuzz_buzzer(Printer) ->
receive
{n,N,P} ->
if N rem 5 == 0 -> Printer ! {s,"Buzz"};
true -> noop
end,
P ! {ok},
fizzbuzz_buzzer(Printer);
{done} -> noop end.

fizzbuzz_number(Printer) ->
receive
{n,N,P} ->
if (N rem 5 =/= 0) and (N rem 3 =/= 0) -> Printer ! {n,N};
true -> noop
end,
P ! {ok},
fizzbuzz_number(Printer);
{done} -> noop end.

fizzbuzz_printer() ->
receive
{s,S} -> io:format("~s",[S]), fizzbuzz_printer();
{n,N} -> io:format("~w",[N]), fizzbuzz_printer();
{newline} -> io:format("~n",[]), fizzbuzz_printer();
{done} -> noop end.

Why is the above so painful? Mostly because we have to wait for a reply from each fizzbuzz_X process because otherwise some of the processes will send messages to the printer out of order. To compare, see what the output is if you did not wait for an {ok}:
-module(fizzbuzz_with_asynch_processes).

% export default start function for command line
-export([start/0]).

% export entry point for spawned processes
-export([fizzbuzz_fizzer/1]).
-export([fizzbuzz_buzzer/1]).
-export([fizzbuzz_number/1]).
-export([fizzbuzz_printer/0]).

start() ->
Printer = spawn(fizzbuzz_with_asynch_processes, fizzbuzz_printer, []),
Processes = lists:map(fun(P) -> spawn(fizzbuzz_with_asynch_processes, P, [Printer]) end, [fizzbuzz_fizzer,fizzbuzz_buzzer,fizzbuzz_number]),
lists:foreach(fun(N) -> lists:foreach(fun(P) -> P ! {n,N} end, Processes), Printer ! {newline} end, lists:seq(1,100)),
lists:foreach(fun(P) -> P ! {done} end, [Printer|Processes]).

fizzbuzz_fizzer(Printer) ->
receive
{n,N} ->
if N rem 3 == 0 -> Printer ! {s,"Fizz"};
true -> noop
end,
fizzbuzz_fizzer(Printer);
{done} -> noop end.

fizzbuzz_buzzer(Printer) ->
receive
{n,N} ->
if N rem 5 == 0 -> Printer ! {s,"Buzz"};
true -> noop
end,
fizzbuzz_buzzer(Printer);
{done} -> noop end.

fizzbuzz_number(Printer) ->
receive
{n,N} when N rem 5 =/= 0, N rem 3 =/= 0 -> Printer ! {n,N}, fizzbuzz_number(Printer);
{n,_N} -> fizzbuzz_number(Printer);
{done} -> noop end.

fizzbuzz_printer() ->
receive
{s,S} -> io:format("~s",[S]), fizzbuzz_printer();
{n,N} -> io:format("~w",[N]), fizzbuzz_printer();
{newline} -> io:format("~n",[]), fizzbuzz_printer();
{done} -> noop end.

The output of this last program is consistently interleaved. Moral of the story: just because you send a message to each of a set of processes in a particular order, it does not mean that each message will be received or more importantly acted upon in that particular order. Messages to a process are queued in order, but if you send a message M1 to process P1 and another message M2 to process P2, there is no guarantee that process P1 will receive the message before process P2. There is a guarantee that if you send message M1 to process P1 and another message M2 to the same process P1, M1 will be received before M2. But this is a crucial difference.

Lastly, an implementation that works, consisting of far too many processes:
-module(fizzbuzz_with_far_too_many_processes).

-export([start/0]).
-export([fizzbuzz/2]).

start() ->
% spawn 100 processes to compute the 100 terms
lists:map(fun(N) -> spawn(fizzbuzz_with_far_too_many_processes,fizzbuzz,[N,self()]) end, lists:seq(1,100)),
lists:foreach(fun(T) -> io:format("~s~n",[element(2,T)]) end, [I || I <- receive_all(100)]).

receive_all(M) -> receive_all(M,[]).

receive_all(M,L) when M == length(L) -> lists:sort(L);
receive_all(M,L) ->
receive
{N,S} -> receive_all(M,[{N,S}|L])
end.

fizzbuzz(N,P) when N rem 3 == 0, N rem 5 == 0 -> P ! {N,"FizzBuzz"};
fizzbuzz(N,P) when N rem 3 == 0 -> P ! {N,"Fizz"};
fizzbuzz(N,P) when N rem 5 == 0 -> P ! {N,"Buzz"};
fizzbuzz(N,P) -> P ! {N,integer_to_list(N)}.

A key thing to learn from this last listing is that erlang tuples are indexed starting at 1. But that is a gripe for another blog entry.

Notes:
[1] add -smp to the execution of the program to enable Erlang's use of multiple processors. [More]

23 January 2007

Erlang example: "parallel primes"

The code for this example is fairly simplistic and does many things wrong. Nevertheless it does provide some examples of several Erlang constructs. Hopefully this little example is self-explanatory enough to be useful on its own for now, but I hope to now add some commentary as to its construction and flaws. First, it is obviously by far not a good way to generate a list of primes. The point is not to generate primes well, but to demonstrate, using a fairly easy-to-understand problem domain, how one might write a simple program that (1) spawns a set of worker processes to operate upon work items from a queue, (2) accumulate results from their work in an accumulation process, and (3) retrieve those results.

% pprimes.erl
-module(pprimes).

% functions that will be called from command-line must be exported
-export([start/1]).

% functions that will be spawned must be exported
-export([prime/3]).
-export([queue/3]).
-export([accum/1]).

% functions that could be externally testable could be exported
-export([is_prime/1]).
-export([atom_to_integer/1]).
-export([pprimes/2]).

% called from command-line with atoms, e.g. ['10','1000']
start([NProcessesAtom,PLimitAtom]) ->
NProcesses = atom_to_integer(NProcessesAtom),
PLimit = atom_to_integer(PLimitAtom),
T1 = erlang:now(),
L = pprimes(NProcesses,PLimit),
T2 = erlang:now(),
T = timer:now_diff(T2,T1),
io:format("~w processes computing the ~w primes under ~w took ~w microseconds~n",[NProcesses,length(L),PLimit,T]).

pprimes(NProcesses,PLimit) ->

% the simple accumulator process will collect the results
Accum_PID = spawn(pprimes, accum, [[]]),

% the queue process will serve out the next number to check as a prime
Queue_PID = spawn(pprimes, queue, [2,PLimit,self()]),

% spawn the worker processes
lists:map(fun(X) -> spawn(pprimes, prime, [X,Accum_PID,Queue_PID]) end, lists:seq(1,NProcesses)),

% wait for the worker processes to signal completion
wait_primes(NProcesses),

% when the worker processes are done, tell the queue process to complete
Queue_PID ! {done},

% send message to the accumulator process and wait for response
Accum_PID ! {get_primes, self()},
receive {primes,L} -> Accum_PID ! {done} end,
L.

% if there are no more worker processes to wait for...
wait_primes(0) -> noop;

% otherwise keep waiting...
wait_primes(N) ->
receive
{done} -> wait_primes(N-1)
end.

% simple accumulator process:
% when it receives an add_prime message, adds the prime and re-loops with new list
% when it receievs a get_primes message, responds with the current list and re-loops
% when it receives the "done" message, ends.
accum(L) ->
receive
{add_prime,P} -> accum([P|L]);
{get_primes,Client_PID} -> Client_PID ! {primes,L}, accum(L);
{done} -> noop
end.

% queue process end case: when the queue is empty, enter the queue_empty function.
queue(Cur,PLimit,Main_PID) when Cur == PLimit ->
queue_empty(Main_PID);

% queue process main case: wait for next_prime messages and re-enter
queue(Cur,PLimit,Main_PID) ->
receive
{next_prime,Prime_PID} ->
Prime_PID ! {ok,Cur}
end,
queue(Cur+1,PLimit,Main_PID).

% empty queue process: inform next_prime requesters that the queue is empty until we receive the "done" message
queue_empty(Main_PID) ->
receive
{next_prime,Prime_PID} ->
Prime_PID ! {queue_empty, Main_PID},
queue_empty(Main_PID);
{done} -> noop end.

% main worker process: get the next prime from the queue...
prime(X,Accum_PID,Queue_PID) ->
Queue_PID ! {next_prime, self()},
receive
{ok,N} ->
% io:format("worker ~w testing ~w ~n",[X,N]),
case is_prime(N) of
true ->
% io:format("worker ~w found ~w to be prime!~n",[X,N]),
Accum_PID ! {add_prime,N};
false ->
% io:format("worker ~w found ~w to NOT be prime!~n",[X,N]),
noop
end,
prime(X,Accum_PID,Queue_PID);
{queue_empty, Main_PID} -> Main_PID ! {done} end.

% very poor, but we want the process to actually do "some" work...
is_prime(N) when N < 2 -> false;
is_prime(N) when N == 2; N == 3 -> true;
is_prime(N) ->
lists:all(fun(A) -> N rem A =/= 0 end, lists:seq(2, N div 2)).

% utility funtion: we get our arguments as 'atoms' e.g. '12' but we want integers
atom_to_integer(Atom) ->
erlang:list_to_integer(erlang:atom_to_list(Atom)).

If you save the code from this listing as a file "pprimes.erl", you can then compile it from the command line with:
$ erlc pprimes.erl
The following shows an example of running the program from the command line:
$ erl -smp -s pprimes start 100 10000 -run init stop -noshell
You should receive some output that looks something like:
100 processes computing the 1229 primes under 10000 took 1894288 microseconds
And if that isn't "parallel enough", how about just blindly spawning a process for each work item immediately, bypassing the queue altogether?
-module(vpprimes).

-export([start/1]).
-export([vpprime/2]).

start([PLimitAtom]) ->
vpprimes(atom_to_integer(PLimitAtom)).

vpprimes(PLimit) ->
lists:map(fun(N) -> spawn(vpprimes, vpprime, [N,self()]) end, lists:seq(1,PLimit)),
lists:foreach(fun(N) -> io:format("~w~n",[N]) end, [N || N <- receive_all(PLimit,0,[])]).

receive_all(PLimit,N,L) when PLimit == N -> lists:sort(L);
receive_all(PLimit,N,L) ->
receive
{p,P} -> receive_all(PLimit,N+1,[P|L]);
{n,_N} -> receive_all(PLimit,N+1,L)
end.

vpprime(N,R) ->
T = is_prime(N),
if T -> R ! {p,N};
true -> R ! {n,N}
end.

% very poor, but we want the process to actually do "some" work...
is_prime(N) when N < 2 -> false;
is_prime(N) when N == 2; N == 3 -> true;
is_prime(N) ->
lists:all(fun(A) -> N rem A =/= 0 end, lists:seq(2, N div 2)).

% utility funtion: we get our arguments as 'atoms' e.g. '12' but we want integers
atom_to_integer(Atom) ->
erlang:list_to_integer(erlang:atom_to_list(Atom)).

If nothing else, erl -smp -s vpprimes start 10000 -run init stop -noshell will launch 10000 processes to poorly compute a list of primes. Now, is any of this useful? Not for computing primes, certainly. But imagine instead that you wanted to take advantage of an 8-core system and at least speed up the "map" part of a "map-reduce" algorithm. You could now (hopefully) go out and write a "parallel-map" function that applies a function in parallel to all elements of a list, returning the new list. [More]

Labels

Twitter Updates