Monday, June 23, 2008

Red Hat Enterprise MRG Presentations From the 2008 Red Hat Summit

I've posted online the presentations that we just did at the 2008 Red Hat Summit about Red Hat Enterprise MRG. You can download them at:

Thursday, June 19, 2008

Red Hat Enterprise MRG v1 is Released

Today marks the release of version 1 of Red Hat Enterprise MRG, our high performance distributed computing platform that integrates Messaging, Realtime, and Grid technologies. Red Hat has been working across each of these technologies for years, so we're excited to be launching the initial release at the Red Hat Summit.

We've got some pretty impressive performance results, customers, partners, and use cases for MRG. For details, see:

Additionally, if you happen to be at the Red Hat Summit, we're featuring MRG pretty prominently:
  • Our CEO, Jim Whitehurst, highlighted MRG Messaging and AMQP yesterday in his keynote as an example of a customer (JPMC) contributing to open source
  • Our CTO, Brian Stevens, featured MRG in this morning's keynote
  • We have several sessions on MRG
  • We are doing MRG demos at the Red Hat booth in the Expo Hall
  • Cisco is a sponsor at the Summit and is demonstrating their AON Message Bus Interconnect (MBI) solution. Cisco is debuting support for Red Hat Enterprise MRG Messaging in their AON MBI product at the Summit and demonstrating this in the Expo Hall.
  • IBM is a sponsor at the Summit and is demonstrating their WebSphere Real Time, which is an RTSJ-compliant realtime JVM. IBM supports WebSphere Real Time exclusively on Red Hat Enterprise MRG. They have also been a strong development partner with Red Hat around realtime, and they are a winner in this year's Red Hat Innovation Awards for this work. IBM is demonstrating WebSphere Real Time in the Expo Hall.
Congratulations to the entire MRG team for this fantastic release!

Monday, June 16, 2008

Red Hat Summit 2008 (and FUDCon!)

Tomorrow is the start of the 2008 Red Hat Summit in Boston. There are going to be several sessions related to Red Hat Enterprise MRG there:

  • Thursday 1:30pm: Realtime Linux: Who, What, When, Where and Why by Clark Williams. Clark is the tech lead for realtime at Red Hat, so he'll have a lot of good stuff to say about performance results, how we've developed realtime, what's happening in the open source community, what's planned for the future, and so on.
  • Thursday 4:00pm: Red Hat Enterprise MRG Overview by Carl Trieloff. Carl is the technical director and visionary behind MRG, so this will be a great opportunity to hear first-hand about the origins, successes, and benefits of MRG. Way back, I spent over a year working to get Carl into Red Hat to launch and drive our MRG initiatives. Now, after creating AMQP, starting new open source projects, bringing realtime to maturity, and signing our partnership with the University of Wisconsin around Condor, we are starting to see significant traction around MRG.
  • Friday 9:00am: Dynamic Grid Computing With Red Hat Enterprise MRG & Amazon EC2 by Bryan Che. That's me! I hope you can get up early enough to attend my session. I'll be presenting on the work we've been doing to enable dynamically provisioning grid capacity at Amazon EC2's cloud infrastructure right from your MRG Grid's Condor scheduler. This will enable enterprises to add capacity dynamically to existing data centers or even to provision entire grids on-demand in the cloud. Cloud computing is hot these days, and we are seeing a lot of customer interest in MRG's integration with EC2.
This week also marks the start of FUDCon 2008 in Boston. Matt Farrellee, who is our tech lead for Condor and MRG Grid, will be coming to town to help lead discussions on implementing Fedora Nightlife. Of course, I'll be there too.

Thursday, June 5, 2008

Fedora Nightlife Article on lwn.net

There's a nice, detailed article about Fedora Nightlife on lwn.net: http://lwn.net/SubscriberLink/284887/b05744ca15f41a52/.

Sunday, June 1, 2008

Fedora Nightlife and Energy Usage

Wow, lots of response to my blog post about Nightlife! It's great to see so much interest right at the start. There are a lot of questions, but many of the conversations around these should happen on the Fedora Nightlife mailing lists as they're not just for me to answer. Also, I've now created an initial Wiki page for Nightlife (https://fedoraproject.org/wiki/Nightlife), so a lot of information will ultimately go over there. I will, however, blog about some of the topics that have stirred more discussion.

I'll start with one of the questions that always seems to arise when people talk about harvesting idle computing capacity: energy usage. Specifically, isn't it a waste of energy to leave your computer running when you're not using it so that others can leverage it for distributed computation? This is a complex issue with a complicated answer: it sometimes is a waste of energy, but it doesn't have to be a waste and can even save energy in the long run.

Cycle harvesting is sometimes a waste of energy
Let's start with the obvious: harvesting idle capacity across many computers--perhaps millions--can definitely waste energy. Computers that might otherwise have been turned off are now running at full power crunching data for projects that may not be useful. Furthermore, these computers won't all be fully utilized all of the time, so there will be many instances of computers running with nothing to do but waste energy. Yes, unfortunately, cycle harvesting can and often does waste energy.

Cycle harvesting doesn't have to be a waste of energy
Cycle harvesting can waste energy, but it doesn't have to do so. I hope that as we work on Nightlife, this will prove to be the case.

There are many worthwhile tasks which can only be accomplished by heavy computation. A lot of fundamental research today in biology or healthcare, for example, requires access to large computer grids. If you believe that this type of research is a worthy use of energy, then the issue of wasting energy becomes an engineering problem of utilization and efficiency. That is, there are certain tasks for which it is worthwhile to let others use your available computing power. As long as these tasks fully utilize your computer when it is idle and they do so in the most efficient manner, then they aren't really wasting energy. More concretely, what if finding a cure for cancer required a lot of computational modeling? Would it be a waste of energy if there were a good project devoted to harvesting idle capacity to find such a cure, and it efficiently and fully utilized all the computers donating capacity?

If the keys to preventing energy waste in cycle harvesting are utilization and efficiency for worthwhile projects, then this is a problem we can address for Nightlife in a variety of ways. For example, the Condor scheduler is highly adept at maximizing resource utilization. Given enough tasks or projects, we should be able to use all the resources available to Nightlife efficiently. The challenge will be finding enough good projects and work to which people can donate their computing capacity. As long as we've got a good queue of work, we should be able to ensure that all the computers donating to Nightlife are doing something worthwhile and not just sitting around.

There are also many things that we can do at Fedora to further increase our ability to utilize resources efficiently. For example, we could explore waking computers only to execute tasks that their owners deem worthwhile; the rest of the time, those computers would stay in a suspended or low/no-power mode. At Fedora, we can drive the Linux operating system to be much more efficient in how it uses power while doing computations. And Fedora's patron, Red Hat, has strong relationships with and influence over major hardware manufacturers and customers of grids. As commercial enterprises also look at how to save energy while doing their own grid computations, we have an opportunity to lead the way in developing and demonstrating the best techniques for doing this in an earth-friendly way.
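
As one illustration of what "waking computers" could look like, here is a minimal Python sketch using Wake-on-LAN, an existing mechanism that wakes a suspended machine when its network card receives a "magic packet" (six 0xFF bytes followed by the machine's MAC address repeated 16 times). This post does not say which mechanism Nightlife would use, so Wake-on-LAN is an assumption here, and the function names, the MAC address, and the broadcast address and port are hypothetical choices for the example.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain 12 hex digits")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes, then the 6-byte MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 is a common convention)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


# Hypothetical MAC address, used only to show the packet layout.
packet = build_magic_packet("00:11:22:33:44:55")
```

A scheduler could call something like `send_wake` only when a task that the owner has approved is waiting, which is the behavior described above. The target machine must also have Wake-on-LAN enabled in its firmware and network card for this to work.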

Cycle harvesting can save energy
Much of the work that would run on Fedora Nightlife is going to be computed one way or another. A project like Nightlife, however, can not only help speed these computations by providing additional processing power but can also help reduce total energy usage in the long run.

If you've ever visited a large data center, then you know that the energy usage of the individual computers in that data center is only a fraction of the total energy the data center consumes. When you put tens of thousands of computers together in a single room, many other energy hogs come into play. Foremost is cooling--large data centers require massive, redundant air conditioning systems to prevent the computers from overheating as they process in close proximity to each other. There are also many other devices that draw power: the numerous network switches and routers connecting the computers, all the devices that monitor the data center's health and security, the redundant power supplies that keep the data center operating in the event of a power failure, and so on.

If a project were to distribute its computations over many individual, geographically dispersed computers and didn't need to build out a large data center for all its work, then it would no longer have to use as large a cooling center or provide as much backup power or do any of the other energy-consuming things that putting so many computers close together requires. Instead, by distributing its work over a number of individual machines through Nightlife, a project could cut down on its total energy required per computation.
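
The argument above can be put into rough numbers. The following Python sketch treats the energy per job as the machine's own power draw, multiplied by an overhead factor for cooling, networking, and backup power, divided by the jobs completed per hour. The function name and every number in it are hypothetical and chosen only to illustrate the reasoning; they are not measurements of any real data center or of Nightlife.

```python
def energy_per_job_wh(machine_watts, overhead_factor, jobs_per_hour):
    """Watt-hours consumed per completed job.

    overhead_factor scales the machine's own draw to account for cooling,
    networking, backup power, and so on (1.0 means no extra overhead).
    """
    if jobs_per_hour <= 0:
        raise ValueError("jobs_per_hour must be positive")
    return machine_watts * overhead_factor / jobs_per_hour


# Hypothetical data center machine: every watt of computing is matched by
# another watt of facility overhead, but the machine is kept fully busy.
datacenter = energy_per_job_wh(machine_watts=200, overhead_factor=2.0, jobs_per_hour=10)

# Hypothetical donated desktop: no dedicated cooling or backup power,
# but somewhat lower throughput.
distributed = energy_per_job_wh(machine_watts=200, overhead_factor=1.0, jobs_per_hour=8)

print(f"data center: {datacenter} Wh/job, distributed: {distributed} Wh/job")
```

With these made-up inputs the distributed machine uses less energy per job even though it completes fewer jobs per hour. The same formula also shows the risk: if a donated machine sits mostly idle while powered on, its jobs per hour fall and its energy per job rises, which is why utilization matters so much.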

Another way that Nightlife can provide energy benefits over a dedicated data center is by avoiding concentrated power usage in a single geographic location. I once visited a large Internet company in a power crisis: its host city's power grid could provide no additional electricity for it to grow, because the company had already used up all the electricity available to it. Even if Nightlife didn't reduce total energy usage but instead increased the amount of energy required per calculation (which, as I've argued above, it doesn't have to do), it could still have a better overall environmental impact. Rather than concentrating all its energy use in one place, a project could distribute and amortize its energy impact across a much larger area by leveraging Nightlife.

You don't have to participate, but you can contribute
Finally, maybe you fundamentally believe that nothing is worth running on a large computer grid--no matter how noble the task--because of the energy a grid requires. That's fine--you don't have to donate capacity to Nightlife. But, pragmatically, you have to agree that people are going to compute certain things one way or another. At Fedora, we have a tremendous opportunity to improve the power usage of grid computing in general. So, even if you don't donate idle capacity to Nightlife, please consider helping Fedora as a whole become the most energy-efficient platform for computation.