ATTN: Major Problem with 171.67.108.21 (NV GPU Server)

I'm feeling mildly annoyed by seeing those 'ready to send' messages for two of my GPUs in FahSpy all the time. Otherwise all three GPU cores are folding along nicely :)
 
Latest from Vijay:

I think we've had a breakthrough (well maybe that's too strong of a term), but certainly found something that will help. People should be getting more backlogged credits soon. We have to see whether this will fix all of the problems. I'm thinking it won't fix them all, but it is a step in the right direction.
 
I'm just happy it doesn't seem like those WUs were lost :)
 
I'm just happy it doesn't seem like those WUs were lost :)
I've lost at least one or two WUs in the past few days to server-related issues. The server this thread is concerned about doesn't seem to be the only one having issues, unless I'm mistaken.
 
Don't know.. I'm waiting for tobit or someone to officially post that the server's fixed before I turn the GPU client on again.. at this rate my electricity bill should be under 300 this month.. woot woot!
 
Don't know.. I'm waiting for tobit or someone to officially post that the server's fixed before I turn the GPU client on again..
Good question. Servers don't appear to be having problems issuing and receiving new work. However, older work (pre-fixes) appears to still be having problems. Essentially, it was only one server that was causing a problem. Other nvidia servers have been re-weighted to assign more work from them. One of the big problems they are working on right now revolves around issuing re-credits for Sunday's work units. There are sporadic net-load and CPU-load increases on the servers causing some slowdowns, but nothing serious like we were seeing on Sunday and early Monday.

However, there has been no new news from Vijay since this afternoon (EST) so YMMV. :rolleyes: FWIW, my nvidia clients appear to be working fine.
 
Just thought I would post a quick reply: it seems that at least one of my clients has downloaded another WU since my last post. I think it's just slow?
 
A guy on another forum reported a problem also, but I have 9/10 working clients... the 10th client is still sending its work packet.
 
Still getting random issues sending completed work but it's different servers.
 
Posted by Vijay just a few minutes ago:

Joe has made some good progress in tracking down the problem. He's found the bug that was recently introduced into the WS code that caused this problem and is now testing the fix to roll out to the NV GPU WS's.

He has also suggested a short term workaround which should allow many of the WUs that have been sitting in the queue to be sent back. We've instituted that fix this morning and are looking to see if that helps the situation.
 
It looks like the NV GPU servers are pretty much fixed now. The final fixes were sent to the servers last night and the results have been very favorable. None of my six GPUs has any work sitting in the queue anymore and everything is flowing nicely here. Vijay says they will continue to monitor the servers closely throughout the weekend. He has been very apologetic and recognizes this as one of the worst outages ever for the project.
 