Friday, November 16, 2007

How will we adapt to the future?

When I was at QCon San Francisco last week, Martin Fowler put together a panel of 4 people who had been speakers at the conference. They accepted questions from the crowd, and I asked a question about the future of software development given recent changes in parallelism. I want to ask that same question here (with a little bit of preliminary set-up first).

There is general consensus that the current set of programming languages and paradigms is not well suited to concurrent/parallel programming - almost all of these languages were created with sequential, single-threaded development in mind. Given that it is now almost impossible to get a single-core CPU, and that the number of cores on a computer is going to keep expanding dramatically, we need to do something to address this pain point.

There seem to be four major things that we can change to address the parallelism/concurrency problem:

  1. The programming language
  2. The execution environment
  3. The abstractions and APIs
  4. The programmer

Which do you think is the most likely to succeed, and why?

3 comments:

Anonymous said...

For 90% of systems it will be the programming languages and the environments. We already have *really* good ways of doing this (we have for a long time). Namely the lambda and the pi calculi.

Functional code is becoming more and more popular, either in its entirety or in concept alone (see LINQ and the functional concepts being introduced in .NET). This trend will continue because it becomes fairly trivial (no offense to the people actually implementing it :)) to create a parallel version of these systems (see PLINQ). This allows the parallelism to be transparent.
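The "transparent parallelism" point can be sketched outside of .NET too. Below is a minimal illustration in Python (not PLINQ itself, just an analogy): because a pure function has no shared state, a sequential map can be swapped for a parallel map without changing the result, which is exactly the property PLINQ exploits.

```python
# Analogy for LINQ -> PLINQ: a pure map can be parallelized transparently.
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # Pure function: no shared state, so evaluation order is irrelevant.
    return x * x

data = list(range(10))

# Sequential version.
sequential = list(map(square, data))

# Parallel version -- same semantics, different execution strategy.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(square, data))

assert sequential == parallel
print(parallel)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

The key design point is that the swap is safe only because `square` is side-effect free; that is why functional concepts make the parallel rewrite "fairly trivial."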

I was at this panel and tried to re-focus your question a bit ...

To me the multi-core problem is not too much of a problem right now. We will, however, see many more problems as we are pushed up against Amdahl's law (http://en.wikipedia.org/wiki/Amdahl's_law), which states that the achievable speedup is limited by the serial portion of a process (there is a maximum amount of parallelization for any process...)
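To make the law concrete, here is a small numerical sketch: if a fraction p of a program is parallelizable, the speedup on n cores is 1 / ((1 - p) + p/n), which is capped at 1 / (1 - p) no matter how many cores you add.

```python
# Amdahl's law: speedup saturates no matter how many cores are added.

def amdahl_speedup(p, n):
    """Speedup for parallel fraction p on n processors."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 95% of the work parallelizable, 1000 cores give under 20x.
for n in (2, 8, 72, 1000):
    print(n, round(amdahl_speedup(0.95, n), 2))
# 2 1.9
# 8 5.93
# 72 15.82
# 1000 19.63

# The asymptotic ceiling is 1 / (1 - 0.95) = 20x, however many cores exist.
```
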

Our problems are becoming not only more complex but more serially complex. Up until now, faster processors have allowed us to handle these more serially complex problems, but all of that is going to change. Now our more serially complex problems will begin to introduce latency... Sure, we can run 1000 of them at a time, but each will take longer. This has been hidden up until now by the increases in processor speed.

Ryan Slobojan said...

I'm not convinced that functional languages are the way this is going to be solved - it just doesn't really map to the way most people think. Functional languages have been around for a very long time (I'd guess at least 30 years), but they haven't really gained traction except in specialized verticals (like Erlang in phone switches).

Also, on the page that you mentioned, take a look at the limitations that are mentioned - they give an example where the speedup actually goes beyond the theoretical maximum due to speedups in other components of the system. I think that, with sufficient optimization and program redesign, we can stave off Amdahl's Law by making more and more of the program parallelizable.

Can you expand on what you mean about serial complexity and latency? I believe that few processes that are currently serial necessarily have to remain serial - there are creative ways to parallelize many things (just look at CPU pipelining and branch prediction). It might make sense to do such things at the code level when you get to the point that cores are very cheap.

Travelling Greg said...

I think you misunderstand me on functional languages. I am not saying that we will all be using ML in 5 years :) I am saying we will see a continued trend of hybrid languages (like what the .NET languages are becoming) in places where it makes sense. LINQ was a great example of this .... and soon PLINQ will come play happily in a parallel world.

There will also be many simple compiler optimizations coming soon (such as auto-threading loops; see Annex F of ECMA 335).

I think you misunderstood the section "Limitations" where super-linear speedup is mentioned ... this does not affect the basic principle of the law, namely that nearly all problems eventually reach a saturation point of parallelism. I will give an example:

Let's say I have a task ... it needs to get 5 values and add them together. Let's imagine I have no overhead for communications to make this parallel...

The task executes 5 concurrent subtasks, then rendezvous waiting for them ... my runtime can be defined as MAX(subtask) + C, where C is the constant amount of time for me to start the dispatch (we assumed 0) plus the time to add the last value and return the final result...

My point is that as we get more difficult problems, we don't just go from 5 things that can run in parallel to 10; instead we add another 5 things which depend on the results from the first 5. Because they depend on those results, we end up with a process where we now ...

dispatch 5 ..
rendezvous
dispatch 5 ...
rendezvous ...
return ...

Our problem has become serially more complex. Even with a 72-core machine we could just as easily dispatch 72, and that is good, but we can do nothing about the fact that our problem eventually becomes serial.
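The dispatch/rendezvous pattern above can be sketched in code (here in Python with a thread pool rather than .NET, and `subtask` is a hypothetical stand-in for real work): two fan-out phases where the second depends on the results of the first, so the rendezvous points stay serial no matter how many cores are available.

```python
# Two-phase fork-join: the second dispatch cannot begin until the
# first rendezvous completes, so adding cores cannot shorten the
# phase-to-phase dependency chain.
from concurrent.futures import ThreadPoolExecutor

def subtask(x):
    return x + 1  # stand-in for real work

with ThreadPoolExecutor(max_workers=5) as pool:
    # dispatch 5 ... rendezvous
    first = list(pool.map(subtask, [1, 2, 3, 4, 5]))
    # dispatch 5 (each depends on a first-phase result) ... rendezvous
    second = list(pool.map(subtask, first))

# return ...
result = sum(second)
print(result)  # 25
```

Each phase runs its 5 subtasks concurrently, but the total runtime is at least two MAX(subtask) rounds back to back - that is the serial complexity the comment describes.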

If you look back, our problems are becoming more complex both serially and in parallel. Up until now, in well-parallelized code, the serial complexity increases have been hidden by things like faster processors, but that will no longer be the case. Now, as these problems get more serially complex, we will gain latency (they take longer to execute). We will be able to execute *many* of these reduced-to-serial problems at once, but each individual task will take longer and longer to run.

Just to be clear, when discussing Amdahl my worry is more the theoretical problem that processes can only be reduced to a maximum point of parallelism, and once they hit that point it is the serial complexity which dictates how long they take to run. Without the benefit of increasing core speeds we will very quickly run into this wall, as the processes we want to run have been increasing in serial complexity at about the same rate as chips have been improving.