Message boards : Number crunching : BOINC 7.0.15 won't download new work
Author | Message |
---|---|
As the title says, it won't download new work for any of my projects except SETI. | |
ID: 23388 | |
This is the debug information I get:

Windows-7-PC

    1336 Feb/11/2012 02:12:23 [work_fetch] work fetch start
    1337 Feb/11/2012 02:12:23 [rr_sim] start: work_buf min 864000 additional 864000 total 1728000 on_frac 0.974 active_frac 1.000
    1339 GPUGRID Feb/11/2012 02:12:23 [rr_sim] 4239.37: I1R48-NATHAN_FA5-99-100-RND5020_1 finishes (719073.45G/152.34G)
    1340 GPUGRID Feb/11/2012 02:12:23 [rr_sim] I1R48-NATHAN_FA5-99-100-RND5020_1 misses deadline by 473632.17
    1396 Feb/11/2012 02:12:23 [work_fetch] ------- start work fetch state -------
    1397 Feb/11/2012 02:12:23 [work_fetch] target work buffer: 864000.00 + 864000.00 sec
    1398 Feb/11/2012 02:12:23 [work_fetch] CPU: shortfall 3115244.87 nidle 0.00 saturated 154398.94 busy 0.00
    1406 GPUGRID Feb/11/2012 02:12:23 [work_fetch] CPU: fetch share 0.00 rec 962.02268 prio -2.00069 backoff dt 0.00 int 0.00 (no apps)
    1407 Feb/11/2012 02:12:23 [work_fetch] NVIDIA: shortfall 3431927.32 nidle 1.00 saturated 13906.36 busy 0.00
    1415 GPUGRID Feb/11/2012 02:12:23 [work_fetch] NVIDIA: fetch share 1.00 rec 962.02268 prio -2.00069 backoff dt 0.00 int 0.00
    1423 GPUGRID Feb/11/2012 02:12:23 [work_fetch] REC 962.022678
    1424 Feb/11/2012 02:12:23 [work_fetch] ------- end work fetch state -------
    1425 Feb/11/2012 02:12:23 [work_fetch] No project chosen for work fetch
    1426 Feb/11/2012 02:12:33 [work_fetch] Request work fetch: Backoff ended for SETI@home
    1427 Feb/11/2012 02:12:33 [work_fetch] work fetch start
    1428 Feb/11/2012 02:12:33 [rr_sim] start: work_buf min 864000 additional 864000 total 1728000 on_frac 0.974 active_frac 1.000
    1430 GPUGRID Feb/11/2012 02:12:33 [rr_sim] 4239.58: I1R48-NATHAN_FA5-99-100-RND5020_1 finishes (717084.36G/152.34G)
    1431 GPUGRID Feb/11/2012 02:12:33 [rr_sim] I1R48-NATHAN_FA5-99-100-RND5020_1 misses deadline by 473629.12
    1487 Feb/11/2012 02:12:33 [work_fetch] ------- start work fetch state -------
    1488 Feb/11/2012 02:12:33 [work_fetch] target work buffer: 864000.00 + 864000.00 sec
    1489 Feb/11/2012 02:12:33 [work_fetch] CPU: shortfall 3115259.58 nidle 0.00 saturated 154399.10 busy 0.00
    1497 GPUGRID Feb/11/2012 02:12:33 [work_fetch] CPU: fetch share 0.00 rec 962.02268 prio -1.17372 backoff dt 0.00 int 0.00 (no apps)
    1498 Feb/11/2012 02:12:33 [work_fetch] NVIDIA: shortfall 3431950.81 nidle 1.00 saturated 13899.72 busy 0.00
    1506 GPUGRID Feb/11/2012 02:12:33 [work_fetch] NVIDIA: fetch share 1.00 rec 962.02268 prio -1.17372 backoff dt 0.00 int 0.00
    1514 GPUGRID Feb/11/2012 02:12:33 [work_fetch] REC 962.022678
    1515 Feb/11/2012 02:12:33 [work_fetch] ------- end work fetch state -------
    1516 Feb/11/2012 02:12:33 [work_fetch] No project chosen for work fetch
    1517 Feb/11/2012 02:12:50 [cpu_sched_debug] Request CPU reschedule: periodic CPU scheduling
    1518 Feb/11/2012 02:12:50 [cpu_sched_debug] schedule_cpus(): start
    1519 Feb/11/2012 02:12:50 [rr_sim] start: work_buf min 864000 additional 864000 total 1728000 on_frac 0.974 active_frac 1.000
    1521 GPUGRID Feb/11/2012 02:12:50 [rr_sim] 4239.94: I1R48-NATHAN_FA5-99-100-RND5020_1 finishes (714597.47G/152.34G)
    1522 GPUGRID Feb/11/2012 02:12:50 [rr_sim] I1R48-NATHAN_FA5-99-100-RND5020_1 misses deadline by 473629.80
    1578 GPUGRID Feb/11/2012 02:12:50 [rr_sim] Result I1R48-NATHAN_FA5-99-100-RND5020_1 projected to miss deadline.
    1579 GPUGRID Feb/11/2012 02:12:50 [rr_sim] Project has 1 projected NVIDIA deadline misses
    1580 GPUGRID Feb/11/2012 02:12:50 [cpu_sched_debug] earliest deadline: 1329317831 I1R48-NATHAN_FA5-99-100-RND5020_1
    1581 GPUGRID Feb/11/2012 02:12:50 [cpu_sched_debug] scheduling I1R48-NATHAN_FA5-99-100-RND5020_1 (coprocessor job, EDF) (prio -0.825617)
    1582 GPUGRID Feb/11/2012 02:12:50 [cpu_sched_debug] reserving 1.000000 of coproc NVIDIA
    1591 Feb/11/2012 02:12:50 [cpu_sched_debug] enforce_schedule(): start
    1592 Feb/11/2012 02:12:50 [cpu_sched_debug] preliminary job list:
    1593 GPUGRID Feb/11/2012 02:12:50 [cpu_sched_debug] 0: I1R48-NATHAN_FA5-99-100-RND5020_1 (MD: yes; UTS: yes)
    1598 Feb/11/2012 02:12:50 [cpu_sched_debug] final job list:
    1599 GPUGRID Feb/11/2012 02:12:50 [cpu_sched_debug] 0: I1R48-NATHAN_FA5-99-100-RND5020_1 (MD: yes; UTS: yes)
    1604 GPUGRID Feb/11/2012 02:12:50 [cpu_sched_debug] scheduling I1R48-NATHAN_FA5-99-100-RND5020_1
    1612 GPUGRID Feb/11/2012 02:12:50 [cpu_sched_debug] I1R48-NATHAN_FA5-99-100-RND5020_1 sched state 2 next 2 task state 1
    1616 Feb/11/2012 02:12:50 [cpu_sched_debug] enforce_schedule: end

Anthony. ____________ The longer I live, the more reasons I develop for wanting to die. | |
ID: 23389 | |
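For reference, the buffer figures in the log above are in seconds. A minimal Python sketch converts them to days; the helper name `parse_work_buf` is hypothetical, and the regular expression assumes only the `[rr_sim] start:` line format shown in the post.

```python
import re

SECONDS_PER_DAY = 86400


def parse_work_buf(line):
    """Return (min, additional, total) buffer sizes in seconds from an rr_sim start line."""
    match = re.search(r"work_buf min (\d+) additional (\d+) total (\d+)", line)
    if match is None:
        raise ValueError("not an [rr_sim] start line")
    return tuple(int(group) for group in match.groups())


# Line copied from the log in the post above.
line = ("[rr_sim] start: work_buf min 864000 additional 864000 "
        "total 1728000 on_frac 0.974 active_frac 1.000")

min_s, add_s, total_s = parse_work_buf(line)
for name, secs in (("min", min_s), ("additional", add_s), ("total", total_s)):
    print(f"{name}: {secs} s = {secs / SECONDS_PER_DAY:.1f} days")

# The projected deadline miss reported by rr_sim, also in seconds.
miss_days = 473632.17 / SECONDS_PER_DAY
print(f"projected deadline miss: {miss_days:.2f} days")
```

Read this way, the client in the log is working with a 10-day minimum buffer plus 10 additional days, and the running GPUGRID task is projected to miss its deadline by roughly five and a half days.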
It seems that BOINC 7.0.15 will only download 1 WU per GPU. If I remove the <exclude_gpu> option from the cc_config.xml file, it will download a second WU (for my 2nd GPU). Version 6.12.34 downloaded 2 WUs per GPU, and I have 2 GPUs, so I ended up with 4 WUs at one time (until I went back to a 2-client setup). So now it will only download 2 WUs if I remove that option, which is a pain in the butt. And if I leave the option in the cc_config file, I don't know when it will start to download a new WU. Will it do that only after reporting the finished one? That might take a while, since uploading a WU can be slow with my upload speed. | |
ID: 23390 | |
> It seems that BOINC 7.0.15 will only download 1 WU per GPU. If I remove the <exclude_gpu> option from the cc_config.xml file, it will download a second WU (for my 2nd GPU). Version 6.12.34 downloaded 2 WUs per GPU, and I have 2 GPUs, so I ended up with 4 WUs at one time (until I went back to a 2-client setup). So now it will only download 2 WUs if I remove that option, which is a pain in the butt. And if I leave the option in the cc_config file, I don't know when it will start to download a new WU. Will it do that only after reporting the finished one? That might take a while, since uploading a WU can be slow with my upload speed.

So you have a machine with dual GPUs and your cc_config file specifies an exclude on one (presumably allocating one to SETI and the other to GPUGrid). You have "connect every" set to 1 day and "additional buffer" set to 1 day. Is this correct? Can you post your cc_config.xml file, please? Also, what's your on_frac value? You will have to browse your client_state.xml to find it. I have reported issues with GPU work fetch (or the lack of it). CPU work fetch seems to work correctly as far as I can tell, but GPU work fetch falls way too short. ____________ BOINC blog | |
ID: 23393 | |
> So you have a machine with dual GPUs and your cc_config file specifies an exclude on one (presumably allocating one to SETI and the other to GPUGrid). You have "connect every" set to 1 day and "additional buffer" set to 1 day. Is this correct?

I have "connect every" set to 1 day and "additional buffer" set to 10 days.

    <on_frac>0.975064</on_frac>

My cc_config.xml:

    <cc_config>
      <log_flags>
        <file_xfer>1</file_xfer>
        <sched_ops>1</sched_ops>
        <task>1</task>
        <app_msg_receive>0</app_msg_receive>
        <app_msg_send>0</app_msg_send>
        <benchmark_debug>0</benchmark_debug>
        <checkpoint_debug>0</checkpoint_debug>
        <coproc_debug>0</coproc_debug>
        <cpu_sched>0</cpu_sched>
        <cpu_sched_debug>0</cpu_sched_debug>
        <cpu_sched_status>0</cpu_sched_status>
        <dcf_debug>0</dcf_debug>
        <priority_debug>0</priority_debug>
        <file_xfer_debug>0</file_xfer_debug>
        <gui_rpc_debug>0</gui_rpc_debug>
        <heartbeat_debug>0</heartbeat_debug>
        <http_debug>0</http_debug>
        <http_xfer_debug>0</http_xfer_debug>
        <mem_usage_debug>0</mem_usage_debug>
        <network_status_debug>0</network_status_debug>
        <poll_debug>0</poll_debug>
        <proxy_debug>0</proxy_debug>
        <rr_simulation>0</rr_simulation>
        <rrsim_detail>0</rrsim_detail>
        <sched_op_debug>0</sched_op_debug>
        <scrsave_debug>0</scrsave_debug>
        <slot_debug>0</slot_debug>
        <state_debug>0</state_debug>
        <statefile_debug>0</statefile_debug>
        <std_debug>0</std_debug>
        <task_debug>0</task_debug>
        <time_debug>0</time_debug>
        <trickle_debug>0</trickle_debug>
        <unparsed_xml>0</unparsed_xml>
        <work_fetch_debug>0</work_fetch_debug>
        <notice_debug>0</notice_debug>
      </log_flags>
      <options>
        <abort_jobs_on_exit>0</abort_jobs_on_exit>
        <allow_multiple_clients>0</allow_multiple_clients>
        <allow_remote_gui_rpc>0</allow_remote_gui_rpc>
        <client_version_check_url>http://boinc.berkeley.edu/download.php?xml=1</client_version_check_url>
        <client_download_url>http://boinc.berkeley.edu/download.php</client_download_url>
        <disallow_attach>0</disallow_attach>
        <dont_check_file_sizes>0</dont_check_file_sizes>
        <dont_contact_ref_site>0</dont_contact_ref_site>
        <exit_after_finish>0</exit_after_finish>
        <exit_before_start>0</exit_before_start>
        <exit_when_idle>0</exit_when_idle>
        <fetch_minimal_work>0</fetch_minimal_work>
        <force_auth>default</force_auth>
        <http_1_0>0</http_1_0>
        <http_transfer_timeout>300</http_transfer_timeout>
        <http_transfer_timeout_bps>10</http_transfer_timeout_bps>
        <max_file_xfers>8</max_file_xfers>
        <max_file_xfers_per_project>2</max_file_xfers_per_project>
        <max_stderr_file_size>0</max_stderr_file_size>
        <max_stdout_file_size>0</max_stdout_file_size>
        <max_tasks_reported>0</max_tasks_reported>
        <ncpus>-1</ncpus>
        <network_test_url>http://www.google.com/</network_test_url>
        <no_alt_platform>0</no_alt_platform>
        <no_gpus>0</no_gpus>
        <no_info_fetch>0</no_info_fetch>
        <no_priority_change>0</no_priority_change>
        <os_random_only>0</os_random_only>
        <proxy_info>
          <socks_server_name></socks_server_name>
          <socks_server_port>80</socks_server_port>
          <http_server_name></http_server_name>
          <http_server_port>80</http_server_port>
          <socks5_user_name></socks5_user_name>
          <socks5_user_passwd></socks5_user_passwd>
          <http_user_name></http_user_name>
          <http_user_passwd></http_user_passwd>
          <no_proxy></no_proxy>
        </proxy_info>
        <rec_half_life_days>10.000000</rec_half_life_days>
        <report_results_immediately>0</report_results_immediately>
        <run_apps_manually>0</run_apps_manually>
        <save_stats_days>30</save_stats_days>
        <skip_cpu_benchmarks>0</skip_cpu_benchmarks>
        <simple_gui_only>0</simple_gui_only>
        <start_delay>0</start_delay>
        <stderr_head>0</stderr_head>
        <suppress_net_info>0</suppress_net_info>
        <unsigned_apps_ok>0</unsigned_apps_ok>
        <use_certs>0</use_certs>
        <use_certs_only>0</use_certs_only>
        <zero_debts>0</zero_debts>
        <use_all_gpus>1</use_all_gpus>
        <exclude_gpu>
          <url>http://www.gpugrid.net</url>
          <device_num>1</device_num>
        </exclude_gpu>
      </options>
    </cc_config>

Anthony. ____________ The longer I live, the more reasons I develop for wanting to die. | |
ID: 23394 | |
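To check which GPUs a config like the one above excludes, a short sketch along these lines could help. It assumes only the `<options>`/`<exclude_gpu>` layout shown in the post; the function name `excluded_gpus` and the trimmed `CONFIG` string are illustrative and not part of BOINC.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the relevant part of the posted cc_config.xml.
CONFIG = """<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
    <exclude_gpu>
      <url>http://www.gpugrid.net</url>
      <device_num>1</device_num>
    </exclude_gpu>
  </options>
</cc_config>"""


def excluded_gpus(xml_text):
    """Return a list of (project_url, device_num) pairs found in <exclude_gpu> blocks."""
    root = ET.fromstring(xml_text)
    pairs = []
    for node in root.findall("./options/exclude_gpu"):
        url = node.findtext("url", default="").strip()
        device = node.findtext("device_num")
        # device_num is kept as None when the element is absent.
        pairs.append((url, int(device) if device is not None else None))
    return pairs


print(excluded_gpus(CONFIG))
```

For the posted config this reports a single exclusion: device 1 for the GPUGrid project URL.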
Well, that's a good sample cc_config: every option it supports, even if most of them are off. | |
ID: 23396 | |
It just downloaded 4 (long) WUs, even though I'm only using 1 GPU for this project. | |
ID: 23401 | |
> Like I said before, I still think it has GPU work fetch issues.

I agree; it's a feast-or-famine regime. Some projects will drip-feed me one task at a time (which is fine) until the cache catches up to the required number, but more than half the time the client will not download anything until the cache is nearly empty, and then I get it all at once. That's with unchanged settings. I usually set half a day (1 day at most) for the primary cache at projects and never use the additional-days part; even so, work fetch is getting in a twist. It's going to be a big issue if it's not resolved by the time 7.X.X goes to production. Regards, Zy | |
ID: 23402 | |
I notice that under 7.0.15 the settings are now labelled "Minimum work buffer" and "Max additional work buffer". The idea is that the client builds up the cache and then waits until it drops below the minimum level before asking for more work, which reduces the number of work requests. | |
ID: 23411 | |
GPUGrid only issues a maximum of 2 tasks per card, so with high-end CC2.0 cards running 24/7 (or close to it), you don't actually have a large cache of GPU tasks anyway. This means you can safely increase your cache to reduce the chance of such issues, and perhaps better accommodate some CPU projects, especially the ones more prone to outages. | |
ID: 23420 | |
> GPUGrid only issues a maximum of 2 tasks per card, so with high-end CC2.0 cards running 24/7 (or close to it), you don't actually have a large cache of GPU tasks anyway.

So why did it download 4 WUs while GPUGrid is assigned to only 1 GPU (out of 2)? Anthony. ____________ The longer I live, the more reasons I develop for wanting to die. | |
ID: 23424 | |
Because there are two GPUs, and BOINC's not that smart. Obviously such configurations are an exception, as is configuring an app_info file. | |
ID: 23433 | |
And now it won't download a new task, even though I have just over 20 minutes of crunching time left on my last GPUGrid WU. | |
ID: 23436 | |
> And now it won't download a new task, even though I have just over 20 minutes of crunching time left on my last GPUGrid WU.

It usually waits until the task has about 5 minutes left before it requests a new one (assuming you are running 0.01 + 0). If you are running 1 + 3, it should keep 1 day's worth, and when it falls below that it's meant to request 3 days plus whatever its shortfall from the 1-day minimum is. I have raised a bug: project properties don't show that a GPU has been excluded. As skgiven has said, it may be requesting work for the excluded GPU. The joys of running alpha-test software. ____________ BOINC blog | |
ID: 23444 | |
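The "1 + 3" behaviour described in the post above can be sketched as a toy model. This only illustrates the rule as stated there and is not BOINC's actual work-fetch code; the function name `work_request_seconds` and its parameters are hypothetical.

```python
SECONDS_PER_DAY = 86400


def work_request_seconds(buffered_s, min_days, extra_days):
    """Toy model: request nothing while the buffer is at or above the minimum;
    once below it, request the additional buffer plus the shortfall from the minimum."""
    min_s = min_days * SECONDS_PER_DAY
    if buffered_s >= min_s:
        return 0.0
    shortfall_s = min_s - buffered_s
    return extra_days * SECONDS_PER_DAY + shortfall_s


# "1 + 3" settings with 1.5 days of work on hand: above the minimum, no request.
print(work_request_seconds(1.5 * SECONDS_PER_DAY, min_days=1, extra_days=3))

# Same settings with only half a day on hand: 3 days plus the half-day shortfall.
print(work_request_seconds(0.5 * SECONDS_PER_DAY, min_days=1, extra_days=3) / SECONDS_PER_DAY)
```

Under this reading, a client that sits just above its minimum requests nothing at all, which would look like "won't download new work" until the buffer drops below the threshold.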
It's only cool to use alpha BOINC versions if you have the following recommended cc_config and report issues to the alpha email list: | |
ID: 23452 | |
I had turned on some suggested log flags, but they flooded my BoincTasks message window, so I turned them off. | |
ID: 23453 | |
> I had turned on some suggested log flags, but they flooded my BoincTasks message window, so I turned them off.

I think work_fetch_debug and sched_op_debug would be sufficient log flags if you are going to report stuff to the mailing list. Also, Dr Anderson likes the client emulator, so if you can create a scenario there it might help debug things. It's on the BOINC Alpha web site. ____________ BOINC blog | |
ID: 23457 | |
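A minimal sketch of generating a cc_config.xml that enables only the two log flags suggested above, assuming the `<log_flags>` layout from the config posted earlier in the thread. The helper `build_cc_config` is hypothetical; the BOINC client itself only reads the resulting file.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Build a cc_config document whose <log_flags> section sets each named flag to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


xml_text = build_cc_config(["work_fetch_debug", "sched_op_debug"])
print(xml_text)
```

Keeping the flag list this short should limit the message flood mentioned earlier while still capturing the work-fetch decisions.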
I was getting a lot of errors last night, so I went back to 6.12.34 for the moment. | |
ID: 23460 | |