Long time not here

AndyE

I just wanted to show a sign of life after staying away for approximately a year, and share how my HW has been used in the 12 months since I left the FAH crowd.

Reading through some of the threads, the departure from the FAH project seems to be a more widespread movement - showing the complexity and effort required to run a large, dispersed community over a longer period.

With all the HW investments for folding and sorting still available, some of the systems got reconfigured for a different objective - supporting students at local universities in their quest for computational resources to conduct simulation-based research.

While I enjoyed the work in distributed computing for a while, I found this second phase more rewarding. Not only did I donate my systems for research done by students I had the pleasure to meet; those projects were insightful and fun as well.

Take, for instance, a student working in bioinformatics using Gromacs or Rosetta to simulate proteins for a particular research interest or exam. Gromacs has a significantly different CPU/GPU computational pattern than FAH, so the FAH-optimized machines were quite sub-optimal for Gromacs code. Working with students to identify barriers and bottlenecks, reconfiguring systems, optimizing code bases for different GPU microarchitectures, and tuning HW appropriately is such a fun pastime that I might stick with this approach for the next year or so, until my HW arsenal ages out.
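
To give a flavour of the tuning involved: a big part of it was simply getting thread placement and GPU mapping right at launch. A minimal sketch using mdrun's own options (shown with the GROMACS 5 gmx wrapper; the input name my_protein is hypothetical, and the rank/thread split assumes a 2-socket box with two GPUs):

  # 2 thread-MPI ranks (one per socket/GPU), 8 OpenMP threads each,
  # threads pinned to cores, GPUs 0 and 1 mapped explicitly to the ranks
  gmx mdrun -deffnm my_protein -ntmpi 2 -ntomp 8 -pin on -gpu_id 01

Which split wins depends heavily on the code and the GPU generation - which was exactly the fun part.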

I appreciate and value all of you and what you are doing to support distributed computing projects around the globe, but I wanted to share this opportunity to also support local research and local students. It is real fun ...

Some of the experiences:
  • How badly Gromacs scales on 4-socket machines (it does well on 2-socket servers), due to the lack of reasonable NUMA optimization (see the numactl sketch after this list)
  • Rosetta "loves" Intel CPUs, but lacks NUMA optimization as well
  • How much more mature the CUDA ecosystem and programming model currently is vs. OpenCL.
  • How little scientists and students considered computational intensity in home-grown algorithms (none used watt meters ...)
  • The incredible efficiency and performance of the Maxwell GPU in Gromacs (BTW, will pick up my EVGA overclocked GTX 980 tomorrow)
  • How universally balanced the relatively old Radeon 7970 GE is for a range of OpenCL codes.
  • That quite a few codes run almost at the same speed on the 4-core 4790K and 6-core 3930K (and how hot the 4790K gets)
  • The convenience, utility and stability of running dual-socket water-cooled Xeon systems in the Cooler Master HAF XB cube, including 2 GPUs (with small ATX Asus motherboards)
  • The importance of system stability. Some of the codes used run for 7-10 days with no restart functionality built in. HW better be stable.
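
Since neither code handled NUMA placement well, the pragmatic workaround on the big boxes was usually to confine a run to one socket and its local memory rather than letting threads wander across nodes. A minimal sketch with numactl (node 0 is an assumption - check your topology first; my_simulation is a hypothetical binary):

  # inspect the NUMA topology of the machine
  numactl --hardware

  # bind the run to socket 0's cores and its local memory only
  numactl --cpunodebind=0 --membind=0 ./my_simulation

Running several such single-socket jobs side by side can use a 4-socket machine better than one NUMA-oblivious run across all sockets.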

So the one 4-socket server, the three 2-socket servers and the seven dual-GPU cubes are individually, or in subsets, "distributed" to a group of different people I trust, or run (more often) in my house. Anyway, it is fun and great to see how ubiquitous simulation has become in many scientific fields.

I learned a lot here @Hardforum and enjoy remembering those fond days, which helps me in this new phase.

All the best to you guys (and gals),
Andy
 
Wow, AndyE, has it been a year already?

It is great that you have found a way to go directly to the "customer". It sounds like you are also doing a great job training the scientists to think a little more like 'computer' scientists. It is clear from running FAH and BOINC that many of the true researchers don't think that way. To have the opportunity to actually talk with them, show them what is important in their hardware/code, and then get them the computational resources they need is awesome. You are sort of like a one-man BOINC :D

If you don't mind my asking, which universities have you helped out at? I don't recall where 'local' is for you.

And of course, drop in whenever you can. I always really enjoyed your posts. And who knows, perhaps someone on here can help as well. FAH uses Gromacs, and I know it didn't originally have NUMA support ('community' developed). Perhaps some ideas to help with that can be generated here as well.
 
Thank you for your kind reply.

BTW, I got the GTX 980 today. Maybe some people might be interested in the results.

Running the FAH benchmark the Maxwell chip shows its optimization vs the Kepler generation:

Explicit solvent: 63.99 ns/day for the GTX 980 vs. the best result listed so far (GTX 780: 49.68 ns/day) = +29%

Implicit solvent: 309.41 ns/day vs. the GTX 780 (202.33 ns/day) = +53%

Power consumption: approx. 175 W difference between idle and load
Highest temp during the run: 52 °C at 80% fan speed (most of the time around 45 °C)

rgds,
Andy

PS:
I am from Austria/Europe, and the projects supported came from physics, computer science and bioinformatics.
 
Hi AndyE

Nice to see you around again, and that you still contribute to science. Hope you stay around longer this time; some of the [H]ardware you were using was pretty insane - and I'm not just talking about the computers
 
Hey AndyE, have they had any interest in setting up their own projects? And if so, potentially using the BOINC platform? It would be kind of interesting to be a part of.
 
..... some of the [H]ardware you were using was pretty insane - and I'm not just talking about the computers

Hi Nathan_P :)
Part of the reason I was here less often was a stronger focus on this "hardware" - which I would not call insane but rather reasonable. I got a second one last week, plus a lot more stuff at a smaller scale over the last year. Most trips went fine, except one where I drowned the "hardware" in a big way.

Probably the wrong forum to share these stories ...

cheers,
Andy
 
Hey AndyE, have they had any interest in setting up their own projects? And if so, potentially using the BOINC platform? It would be kind of interesting to be a part of.
I talked with them about this, but given the size and duration of each of the projects, they did not want to invest in creating BOINC-ready projects, fearing that building and maintaining them would take too much attention.

The real fun part was the physical interaction: brainstorming, creating, instrumenting, optimizing and tweaking, to name a few.


But your post made me think about how these researchers could leverage the community of people willing to help. I need to think about it more.

rgds,
Andy
 
Yeah... maybe even coming up with another project like WCG, where you set up the central hosting service and then help the scientists bring the work. However, anything large would probably be better served by contacting WCG directly anyway. But it is an idea, and I find it interesting.
 
AndyE, thanks for checking back in. I had been wondering what ever happened to you and I'm extremely happy you were able to work directly with scientists using your extreme hardware. Hope you check in more often in the future!
 
I talked with them about this, but given the size and duration of each of the projects, they did not want to invest in creating BOINC-ready projects, fearing that building and maintaining them would take too much attention.

I wish I knew how much tweaking it requires to create BOINC-ready projects.
Perhaps that would be something you could help them with ;) and let the rest of us share in your tweaking fun.

Welp, glad to hear from you and glad you have been putting your "decent" hardware to use.
 
Hey AndyE, glad to hear you still have a hand in computing, don't be a stranger :)
 
Hello AndyE

I love to hear your stories and see your [H]ardware pr0n. Post some pics when you have a chance!
 