Trouble getting WUs

Celerator

For the past few days I've been having problems downloading WUs from Stanford, especially if assigned to 171.67.89.150 (and once to 171.67.89.149), but not 171.67.89.151. The boxen upload the completed WU and then cycle every hour trying to get a new one -- sometimes for several hours! If this keeps up, I'll never stay ahead of Dr. Cleetus. :eek:

I posted a note on the Folding Community Forum and got replies (see below). Anyone else having this problem lately?

http://forum.folding-community.org/viewtopic.php?t=6971

 
If you're using port 8080, switch to port 80 and try again. Or go the other way around, whichever changes the port you're currently on. This usually works for me.
 
Where do I do that?
 
Shut down the client.
Open the options menu in EM, latest version.
Select the box name in the list.
Push button #1 of the three small buttons.
Your config file listing will display.
Change the port from 8080 to 80 and press the save button.
Restart the client...
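
If you want to double-check which port actually gets through from a given box before (or after) editing anything, here's a rough sketch using nothing but Python's standard socket module. The address is the .150 work server mentioned earlier in this thread; it only tests whether a TCP connection opens on each port, it doesn't speak the client's protocol:

import socket

SERVER = "171.67.89.150"  # work server mentioned earlier in the thread

for port in (8080, 80):
    try:
        # just try to open a TCP connection within 10 seconds
        with socket.create_connection((SERVER, port), timeout=10):
            print(f"port {port}: connection opened OK")
    except OSError as err:
        print(f"port {port}: failed ({err})")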
 
Thanks!
It appears the problem was at Stanford's end. Server .150 was heavily overloaded, and they seem to have put it into Receive Only mode for several hours to clear the backlog. I just checked the server stats and it is now Assign/Receive again, and the load is rising again...

I noticed that in the FAH client, under Connection, I have nothing ticked at all (the default, I believe). I guess it's using 8080...

 
I continue to have problems with the Stanford WU servers. My boxen race through a protein, upload it, and then need to wait (sometimes for hours) before a new WU is assigned. Right now I have 12 GHz awaiting assignment of a new WU. I can't be the only one losing valuable frame time because of a clogged-up server system. :(

Those new servers are overloaded and my boxen are cooling down. If anyone has any pull with Stanford, can you nudge them to add another few servers?

 
My stats have gone to hell:

from random crashes (on P4 machines) to waiting for units for hours on end.

Stanford needs to fix their stuff fast :mad:
 
The trouble I find is that unless you restart F@H about six times, there is no way to get off an overloaded server.

Plus, we are going through another batch of lockups.

Luck.........:D
 
8 GHz down for varying amounts of time over the last few days.

Stats are on the shitter lately. :(
 
Originally posted by Celerator
I continue to have problems with the Stanford WU servers. My boxen race through a protein, upload it, and then need to wait (sometimes for hours) before a new WU is assigned. Right now I have 12 GHz awaiting assignment of a new WU. I can't be the only one losing valuable frame time because of a clogged-up server system. :(

Those new servers are overloaded and my boxen are cooling down. If anyone has any pull with Stanford, can you nudge them to add another few servers?

This is a good reason to run one folding instance at low priority and one genome instance at idle per CPU. Remember that idle is lower than low.
That way, when folding isn't doing anything, genome can pick up the slack, and there is no time handicap for sending in genes. One instance of genome, with a full set of 10 genes, could last a long time.
Or UD, or anything else that helps.
See my sig,
Stanford servers always have problems.
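
If it helps to picture the setup, here's a rough sketch of launching the two clients with different Windows priority classes (Python 3.7+ on Windows). The paths are placeholders, and mapping "low" to the BELOW_NORMAL class and "idle" to the IDLE class is my own assumption about how those terms line up:

import subprocess

# Placeholder paths -- point these at wherever your clients actually live.
FOLDING_EXE = r"C:\folding\fah4console.exe"
GENOME_EXE = r"C:\genome\gahclient.exe"

# "Low" here is Windows' BELOW_NORMAL class; "idle" is the IDLE class,
# one step lower, so the genome client only gets cycles folding leaves behind.
subprocess.Popen([FOLDING_EXE],
                 creationflags=subprocess.BELOW_NORMAL_PRIORITY_CLASS)
subprocess.Popen([GENOME_EXE],
                 creationflags=subprocess.IDLE_PRIORITY_CLASS)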
 
I complained to Vijay and got this response almost immediately:
Let me know if this persists. We've made some changes that should help.

Vijay

Let's hope that frees things up so our boxen can get back to folding, their intended use.

 
Currently my entire farm is down awaiting WUs. I am going to cry. Nearly 15 GHz going to waste for more than three hours now :(
 
Nearly 15 GHz going to waste for more than three hours now

Ouch! 15 GHz of goodness going to waste. I continue to have problems too. I've been alerting Vijay, and several times he's made temporary adjustments that seem to help in the short run, but the basic problem persists. It seems like they need more servers to ease the congestion, or a better assignment algorithm.

When my clients fail to get a WU, I notice that they wait 1 hour :eek: before retrying (most often getting the same clogged server -- .150). :( Wouldn't it be more helpful to have the download time out faster and, when it does, immediately switch to another server (something like the sketch below)? Perhaps there could be a server dedicated to this...

I keep wondering how this problem could be resolved. Does anyone know if Stanford has a writeup on the strategy used to allocate work to their servers?
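
Just to make the idea concrete, here's a rough sketch of the retry logic I have in mind -- plain Python, nothing to do with the actual client internals. The server list is just the three addresses from this thread, and the timeout and back-off values are made-up examples:

import socket
import time

WORK_SERVERS = ["171.67.89.150", "171.67.89.149", "171.67.89.151"]
CONNECT_TIMEOUT = 60   # seconds -- far shorter than the client's one-hour wait
RETRY_DELAY = 5 * 60   # back off five minutes only after every server has failed

def try_fetch(server):
    # Placeholder for "download a WU from this server": here it only checks
    # that a TCP connection opens within the timeout.
    with socket.create_connection((server, 8080), timeout=CONNECT_TIMEOUT):
        return True

def get_work():
    while True:
        for server in WORK_SERVERS:
            try:
                if try_fetch(server):
                    print(f"got work from {server}")
                    return server
            except OSError:
                print(f"{server} not responding, trying the next one")
        print("all servers busy, waiting a few minutes before another pass")
        time.sleep(RETRY_DELAY)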

 
Originally posted by Celerator
Ouch! 15 GHz of goodness going to waste. I continue to have problems too. *snip*

Aye. I just did the math. Assuming you can push two FLOPs through each chip every clock, that's a waste of over 324 TFLOPs.

:( :( :confused: :confused: :rolleyes:
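
For anyone checking that figure, the arithmetic (taking the stated 2 FLOPs per clock across the full 15 GHz and the three-plus hours mentioned above):

15 x 10^9 clocks/s x 2 FLOPs/clock x 3 h x 3,600 s/h = 3.24 x 10^14 FLOPs, or roughly 324 TFLOPs.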
 
Here's the log from one of my machines, demonstrating the problem with the 171.67.89.150 WU server and the one-hour retry waiting period. :(

[00:23:23] + Attempting to send results
[00:23:34] + Results successfully sent
[00:23:34] Thank you for your contribution to Folding@home.
[00:23:34] + Number of Units Completed: 54

[00:23:38] - Preparing to get new work unit...
[00:23:38] + Attempting to get work packet
[00:23:38] - Connecting to assignment server
[00:23:38] - Successful: assigned to (171.67.89.150).
[00:23:38] + News From Folding@Home: v.4 client available
[00:23:38] Loaded queue successfully.
[01:23:39] + Could not get Work unit data from Work Server
[01:23:39] - Error: Attempt #1 to get work failed, and no other work to do.
Waiting before retry.
[01:23:54] + Attempting to get work packet
[01:23:54] - Connecting to assignment server
[01:23:55] - Successful: assigned to (171.67.89.150).
[01:23:55] + News From Folding@Home: v.4 client available
[01:23:55] Loaded queue successfully.
[02:26:08] + Could not get Work unit data from Work Server
[02:26:08] - Error: Attempt #2 to get work failed, and no other work to do.
Waiting before retry.
[02:26:19] + Attempting to get work packet
[02:26:19] - Connecting to assignment server
[02:26:19] - Successful: assigned to (171.67.89.150).
[02:26:19] + News From Folding@Home: v.4 client available
[02:26:19] Loaded queue successfully.
[03:26:20] + Could not get Work unit data from Work Server
[03:26:20] - Error: Attempt #3 to get work failed, and no other work to do.
Waiting before retry.
[03:26:43] + Attempting to get work packet
[03:26:43] - Connecting to assignment server
[03:26:43] - Successful: assigned to (171.67.89.150).
[03:26:43] + News From Folding@Home: v.4 client available
[03:26:43] Loaded queue successfully.
[04:26:45] + Could not get Work unit data from Work Server
[04:26:45] - Error: Attempt #4 to get work failed, and no other work to do.
Waiting before retry.
[04:27:25] + Attempting to get work packet
[04:27:25] - Connecting to assignment server
[04:27:25] - Successful: assigned to (171.67.89.150).
[04:27:25] + News From Folding@Home: v.4 client available
[04:27:26] Loaded queue successfully.

One problem with having a garden of fast boxen is that these frustrations come around frequently...

When I remove -advmethods, I generally have no problem getting a WU, but it's often a Tinker.
I figure that if a box has SSE, I should take advantage of the fact and aim for Gromacs.
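
For anyone following along, -advmethods is just a switch on the console client's command line, so toggling it is a one-flag change in whatever shortcut or script starts the client. The binary name below is only a placeholder for whatever your install actually uses:

fah4console.exe -advmethods   (asks for advanced/beta work -- more Gromacs, but it seems to land on the busier servers)
fah4console.exe               (standard assignments -- often a Tinker, but they hand out readily)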

 
Originally posted by Celerator

When I remove -advmethods, I generally have no problem getting a WU, but it's often a Tinker.
I figure that if a box has SSE, I should take advantage of the fact and aim for Gromacs.


Same here. My Barton 3000+ has been chewing on a Tinker all day today (just finished maybe an hour ago), and it started before I went to bed last night. It can generally do a Gromacs in less than 10 hours, depending on the size.
 