Message boards : Number crunching : Duplicated work
Author | Message |
---|---|
Hello: I would like to ask about something I do not understand. | |
ID: 20891 | Rating: 0 | |
You'll find that the first WU is over 2 days old without a result so it is issued to a second host. | |
ID: 20892 | Rating: 0 | |
Hello: For the past few days a number of tasks have been run in duplicate (by me and another volunteer), and not with two days between the two sends; in the latest cases only a few hours passed. | |
ID: 21152 | Rating: 0 | |
Hi, if you can, please report the task numbers which seem to be duplicated. | |
ID: 21154 | Rating: 0 | |
Hello: So far I see this one. There were two more, but they now show a computation error ...? I will report any others I find. Greetings. | |
ID: 21155 | Rating: 0 | |
After the first task sent out was not returned for about 3 days, another task was sent out. The first task then returned with an error. I presume this triggered a second task to be sent out, at which time there would have been 2 tasks in progress. Now one of these tasks has been returned, but there is still one in progress. | |
ID: 21157 | Rating: 0 | |
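The sequence described in the post above can be traced with a small counter. This is only a sketch: the `Event` type, the event names, and the exact day values are made up for illustration and are not taken from the GPUGRID server.

```python
from dataclasses import dataclass


@dataclass
class Event:
    day: float   # days since the first copy was sent (illustrative values)
    kind: str    # "send", "error" or "success"


def copies_in_progress(events):
    """Return a list of (day, copies still in progress) after each event."""
    count = 0
    timeline = []
    # sorted() is stable, so events on the same day keep their given order
    for ev in sorted(events, key=lambda e: e.day):
        if ev.kind == "send":
            count += 1
        elif ev.kind in ("error", "success"):
            count -= 1
        else:
            raise ValueError(f"unknown event kind: {ev.kind}")
        timeline.append((ev.day, count))
    return timeline


# Hypothetical timeline matching the description above
events = [
    Event(0.0, "send"),     # first copy goes out
    Event(3.0, "send"),     # no result after about 3 days, a second copy goes out
    Event(3.5, "error"),    # first copy comes back with an error
    Event(3.5, "send"),     # the error triggers another copy
    Event(4.5, "success"),  # one copy is returned, one is still running
]

for day, count in copies_in_progress(events):
    print(f"day {day}: {count} in progress")
```

The count peaks at two copies and ends at one, which is the state the post describes.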
Hello: It has happened again that a task was sent to another user before the first user had had it for two days ...! | |
ID: 21321 | Rating: 0 | |
Hi, again a task has been sent to two users on the same day (three hours apart). I think it is worth reporting in case it is a server failure or some special circumstance of the task ...? Greetings. | |
ID: 21392 | Rating: 0 | |
The work unit was sent out. | |
ID: 21393 | Rating: 0 | |
At the moment I suspect that a lot of work is being wasted due to this policy. Don't forget that BOINC is a multi-project environment and the scheduler takes into account the project resource share and the deadline when scheduling tasks. The problem is made worse in that BOINC can take up to a day to send the results back. | |
ID: 21406 | Rating: 0 | |
Hmm, I agree with making the deadline 2 days instead of 5. I often get sent tasks which are still running on slower GPUs; my WU will complete and it'll still be running on the other host, which is just a waste of resources if you ask me. It would also, hopefully, stop BOINC downloading 2 GPUGRID tasks (my buffer is set to 1 day because I need enough Einstein WUs to feed my GPU, which runs 4 of those at once). | |
ID: 21407 | Rating: 0 | |
I almost agree in principle, though I would opt for a 3 day cutoff point and change the credit system slightly (150% for <1day return, 100% for <2day return, 50% for <3day return). Anyway, it's really down to the scientists to work out what is best for the project overall, not us. They know things we don't. | |
ID: 21411 | Rating: 0 | |
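The tiered credit idea suggested in the post above can be written down in a few lines. This is a sketch of the proposal only, not of anything the project runs; the function name is hypothetical, and granting zero credit after the 3-day cutoff is an assumption, since the post does not say what happens then.

```python
def proposed_credit(base_credit, turnaround_days):
    """Credit under the proposed 150% / 100% / 50% tiers with a 3-day cutoff."""
    if turnaround_days < 0:
        raise ValueError("turnaround_days must be non-negative")
    if turnaround_days < 1:
        return base_credit * 1.5   # returned in under 1 day
    if turnaround_days < 2:
        return base_credit * 1.0   # returned in under 2 days
    if turnaround_days < 3:
        return base_credit * 0.5   # returned in under 3 days
    # Assumption: nothing is granted once the 3-day cutoff has passed
    return 0.0


for days in (0.5, 1.5, 2.5, 3.5):
    print(days, proposed_credit(100.0, days))
```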
Yes, I usually don't run both GPU projects at the same time. Also use report tasks immediately. I myself have no trouble completing WUs in time. | |
ID: 21415 | Rating: 0 | |
By all means reward users for a fast turnaround, but I don't think it's a good idea to penalise users for returning tasks within the deadline period. | |
ID: 21427 | Rating: 0 | |
| |
ID: 21431 | Rating: 0 | |
I agree with Dagorath on the deadline. | |
ID: 21434 | Rating: 0 | |
Boinc calculates the ratio of time the computer is on and running Boinc. If Boinc thinks it is not on long enough to finish a task in 2 days then Boinc will not ask for a task. Crunching for other GPU projects would also impact upon asking for GPUGrid tasks as would things like task failures, your Boinc configurations such as use GPU while computer is in use... | |
ID: 21435 | Rating: 0 | |
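The reasoning in the post above (scale the run time by the fraction of time the machine is on, then compare with the deadline) can be illustrated roughly as follows. This is a deliberately simplified sketch and not the actual BOINC client logic; the function name, its parameters, and the 2-day default are assumptions made for illustration.

```python
def can_finish_in_time(est_runtime_hours, on_fraction, deadline_days=2.0):
    """Rough check: would a task finish before the deadline on this host?

    est_runtime_hours: estimated compute time if the host ran non-stop
    on_fraction: share of wall-clock time the host is on and crunching (0..1)
    """
    if not 0.0 <= on_fraction <= 1.0:
        raise ValueError("on_fraction must be between 0 and 1")
    if on_fraction == 0.0:
        return False  # a host that never runs cannot finish anything
    # A host that is on half the time needs twice the wall-clock time
    wall_clock_hours = est_runtime_hours / on_fraction
    return wall_clock_hours <= deadline_days * 24.0


print(can_finish_in_time(12, 0.5))                   # 24 h needed, 48 h allowed
print(can_finish_in_time(30, 0.5))                   # 60 h needed, 48 h allowed
print(can_finish_in_time(30, 0.5, deadline_days=5))  # 60 h needed, 120 h allowed
```

The third call shows why the same host could be fine under a 5-day deadline yet get no work under a 2-day one.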
"Boinc calculates the ratio of time the computer is on and running Boinc. If Boinc thinks it is not on long enough to finish a task in 2 days then Boinc will not ask for a task."
How can that be when the deadline is currently 5 days? Was the 2 above a typo? Or do I not understand the scheduler?
"2 days would be long enough for many people for any tasks, but not long enough for many people trying to run long WU's."
Then make the deadline 3 days. Or 4 days. Or 100 days, I don't care. But stop resending tasks before the deadline is up.
"Boinc sees different tasks as all part of the same GPUGrid project, so just trying to run one long task might throw it out and prevent someone that would be capable of running normal tasks from getting any."
If they shortened the deadline to 2 days some people would find their ACEMD long tasks would get status Abandoned or No Response or whatever this server gives when a task goes over deadline. They would soon learn to go to their preferences and deselect ACEMD long so they get only ACEMD standard tasks. To make the transition less painful they could advertise in the threads and on the home page that the deadline is about to change and that those with slower GPUs and those who don't run BOINC 24/7 should deselect the long tasks.
"Then there are different Boinc versions (some might behave slightly differently to others) and what would happen if you had a poor GPU and replaced it with a high end GPU?"
I don't see any problem with that. If they've deselected the long tasks because they have a slower GPU and then they get a faster GPU they can easily select the long tasks. Seems simple enough. Did I miss something?
"Basically I think a bit more leeway is called for than a 2 day cutoff."
I mentioned 2 days in my last post because that seems to be the return time the admins want. I could live with a 3 day deadline. Or 4 days. Or 100 days. Whatever the admins want is fine with me. Just stop resending tasks before the deadline expires.
"Ideally Boinc would better understand separate task types and their requirements and even allow the user to more exactly specify what type of task and when these tasks should run. So GPUGrid has to be a bit more flexible and try to facilitate more people and operate within Boinc's restrictions."
Indeed projects should be flexible and facilitate and operate within BOINC's restrictions. However, I think project admins' primary obligation is to use donated resources as efficiently as possible. That's not happening now.
"The research team also spent a lot of time and effort getting to where we are now. At some stage (now) they have to concentrate more on the Science and maintenance (new GPUs/drivers/CUDA versions) and less on the project setup and development. Also, changes always annoy someone."
As I understand it, changing the deadline is as simple as changing a number in a config file. Resending tasks before the deadline expires seems to me to be custom code by GPUgrid devs but it shouldn't be too hard to go back to the standard BOINC server way. As for people being annoyed by changes...well...there are people annoyed now by the current policy of resending tasks before the deadline expires. So what's an admin to do? In my mind there is no contest. They MUST live up to their primary obligation to use donated resources as efficiently as possible. That means appease the people who want to eliminate duplicated work and to heck with those who support the status quo. Remember, less duplicated work means more throughput for the project. Who doesn't want more throughput? | |
ID: 21437 | Rating: 0 | |
I was explaining what might happen if we had a 2-day deadline, speaking hypothetically, rather than explaining the existing 5-day system. I agree that the system should be reviewed periodically, and the deadline probably reduced, but it's not my call and I don't have all the info. | |
ID: 21441 | Rating: 0 | |
"I was explaining what might happen if we had a 2-day deadline, speaking hypothetically, rather than explaining the existing 5-day system. I agree that the system should be reviewed periodically, and the deadline probably reduced, but it's not my call and I don't have all the info."
I realize it's not your call but maybe the project admins can be persuaded through a discussion here.
"Resends are always going to happen (tasks fail, as do GPU's, computers, routers)."
Resends for the reasons you state will always be with us but that doesn't mean we should set a deadline of 5 days but resend if the result hasn't returned in 2 days.
"Even if the deadline was 2 or 3 days, tasks would still fail/not be returned in time and have to be resent."
Indeed tasks would still fail but that's extraneous to discussion of the deadline. Tasks not returned in time is what we're focused on here.
"The number of resends might actually rise if the deadline was changed."
No, the number would not rise, because the tasks are being resent now after 2 days anyway. How can reducing the real deadline to match the artificial deadline induce more late returns? There is no cause-effect relationship there that I can see. If the real deadline (now 5 days) were reduced to match the artificial deadline (now 2 days) then volunteers who do not return tasks in 2 days would get some feedback that their results are not returning in time. They might notice that their RAC decreases or they might see many "Abandoned" or "No Reply" outcomes and zero credits awarded in their list of tasks on the website. They'll figure it out for themselves or they'll inquire in the forums as to what's going wrong. Ideally someone would create a sticky thread and put an item on the home page news warning folks of an upcoming reduction in real deadline a month or more in advance and advise users what to watch for and do. Another mechanism would kick in too. BOINC would warn users that task XYZ is about to miss its deadline.
That would get users asking questions too. One way or another, volunteers who cannot return a task in 2 days would come to realize that they need to select only the short tasks in their preferences or perhaps move their resources to a different project. With the way things are now they just keep missing the 2 day artificial deadline over and over but get credits anyway and no feedback that they've missed the artificial deadline. They have no reason/motivation to change. Therefore, in the long run, reducing the real deadline from 5 days to match the artificial deadline of 2 days will actually reduce the number of resends. As I said, that won't happen immediately. It will happen as volunteers become aware of the change. If we leave things as they are then there is almost zero chance of ever reducing the number of resends.
"I don't have the figures so I can't work it out, but the team could do some stats to work out in advance if it would expedite the project overall, or hinder it."
It seems to me pertinent stats won't exist until the real deadline is reduced to match the artificial deadline. I don't see how they can predict from the stats they have now. At present we can only reason our way to the outcome based on what we know (or assume) about volunteer behavior.
"If they were really up for the challenge they could write a program to continuously analyze returns and self-regulate the deadline within boundaries, but I expect Boinc would backfire."
Perhaps I don't fully understand what you're getting at but it sounds like that would result in a floating deadline. I think that could only confuse users in the sense that some would find their results return in time today but not the next day. I can't see that as being a good thing. Decide what the project needs for a return time (it seems to be 2 days), draw one line in the sand and stick to it. Don't fudge around with artificial deadlines or floating deadlines. | |
ID: 21455 | Rating: 0 | |
At least long workunits should not be sent to slow hosts. They have no chance to return them even in 5 days. I'm still getting resent tasks, which are still being processed on GT 9500 or GT 9600 cards. I don't feel like warning these users one by one. I wrote about it in March. | |
ID: 21456 | Rating: 0 | |
| |
ID: 21463 | Rating: 0 | |
Present GIANNI_KKFREE tasks now use less GPU (87% compared to 98%) and do not cause any Lag on either of my Fermi systems. | |
ID: 21475 | Rating: 0 | |
We have increased the duplication time from 2 to 3 days. | |
ID: 21477 | Rating: 0 | |
"We have increased the duplication time from 2 to 3 days."
Hello: I think the increase from 2 to 3 days was positive; I am not getting as many duplicate tasks ... as of today. The two tasks I received recently were already being run by other users, and within minutes one of them finished, so what I was computing was simply useless. I cancelled the other one, and we will see ... I sincerely believe the best solution to this issue is simply to make the return time of the tasks equal to the maximum time allowed (two, three, etc. days) and nothing more. Greetings. | |
ID: 21583 | Rating: 0 | |
Hi: The two tasks that I am running right now are also in progress on another user's host. | |
ID: 21657 | Rating: 0 | |
Hello: I understand the policy for resending half-finished tasks less and less ... Now I have received one that was sent to another user 24 hours ago ... so not even the 3 days apply. Greetings. | |
ID: 21680 | Rating: 0 | |
Thought I already explained that situation? | |
ID: 21681 | Rating: 0 | |
"Thought I already explained that situation?"
Hello: yes, you did explain it and I understood, but in my opinion it still makes little sense. The first user does not finish within 3 days and the task is resent; fine, but the first copy keeps running (surely that user still believes there are 5 days ...). The second user receives the task and has been running it normally for about 24 hours. Back to the first user, who now aborts the task (perhaps having noticed that it was resent and that it makes no sense to continue ...?); fine. What I do not understand is why, when the first user cancels the task, a new copy of the same task is sent out while it is running normally on another user's host and within the 3 days. If this is not duplicating work without any need, then tell me what is. Greetings. | |
ID: 21682 | Rating: 0 | |
Perhaps this system needs to be revised, but the importance of returning the task increases following a failure. As to why the user aborted, who knows. Usually they just think it will take too long, but they might have a problem with it, which would make it more important to return the task. The server cannot know that the first resend was/is running smoothly. Should one of the resends finish before the other starts (when someone has their cache too high) the other resend will not start (assuming intervening communications). | |
ID: 21684 | Rating: 0 | |
The system certainly does need to be revised but let's keep it simple rather than make it more complicated. The suggestion to issue resends only to CC 2.0 hosts would require new code on the server side, I think. | |
ID: 21687 | Rating: 0 | |
"Perhaps this system needs to be revised, but the importance of returning the task increases following a failure. As to why the user aborted, who knows. Usually they just think it will take too long, but they might have a problem with it, which would make it more important to return the task. The server cannot know that the first resend was/is running smoothly. Should one of the resends finish before the other starts (when someone has their cache too high) the other resend will not start (assuming intervening communications)."
Hi: Allow me a suggestion on the topic:
1 - Allow only 3 days to complete a task and remove the duplication between 3 and 5 days. This avoids many resends and duplicate tasks and puts less load on the server.
2 - The server knows when a task that is about to be resent is still running, so there is no problem if the task on the previous user's host is cancelled automatically before the resend.
The examples I am posting in these comments are only the ones that affect me personally, which I know for certain, but I fear that the overall volume of duplicate tasks is very high. Greetings. | |
ID: 21688 | Rating: 0 | |
Hello: As promised, here is a link to a new duplicated task. | |
ID: 21716 | Rating: 0 | |
| |
ID: 21890 | Rating: 0 | |
| |
ID: 21894 | Rating: 0 | |
I have just removed the duplication mechanism as announced in another thread. | |
ID: 21974 | Rating: 0 | |
"I have just removed the duplication mechanism as announced in another thread."
Thank you very much! So now it's like this:
less than 24h (except PYRT): bonus of 100%
less than 48h: bonus of 50%
before deadline: normal credits, no redundant results
after deadline: new result sent.
(BTW: What's the 100%-time for PYRT?)
____________
Greetings from Saenger
For questions about Boinc look in the BOINC-Wiki | |
ID: 21975 | Rating: 0 | |
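The scheme summarised in the post above can be sketched as a small lookup. It is an illustration under stated assumptions, not the project's code: the function name is hypothetical, the 5-day default deadline is the figure mentioned earlier in the thread, the PYRT exception is left out because its bonus window is not given, and what credit a late result receives is not stated, so the sketch returns `None` for it.

```python
def credit_outcome(turnaround_hours, deadline_hours=5 * 24):
    """Return (credit multiplier, resend flag) for a task after the change.

    A multiplier of 2.0 means normal credit plus a 100% bonus.
    """
    if turnaround_hours < 0:
        raise ValueError("turnaround_hours must be non-negative")
    if turnaround_hours < 24:
        return 2.0, False   # less than 24h: bonus of 100%
    if turnaround_hours < 48:
        return 1.5, False   # less than 48h: bonus of 50%
    if turnaround_hours <= deadline_hours:
        return 1.0, False   # before deadline: normal credits, no resend
    # After the deadline a new result is sent; credit for the late one
    # is not specified in the thread, hence None
    return None, True


for hours in (10, 30, 100, 130):
    print(hours, credit_outcome(hours))
```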
"I have just removed the duplication mechanism as announced in another thread."
I still have to talk with Ignasi about PYRT because we just came back from Berlin. But it should be valid for all workunits.
gdf | |
ID: 21976 | Rating: 0 | |
Hello. Thanks for the change. | |
ID: 21977 | Rating: 0 | |