Incorrectly reported run time

Message boards : Number crunching : Incorrectly reported run time

Author	Message
catavalon21 Send message Joined: 1 Feb 09 Posts: 4 Credit: 383,039,090 RAC: 1,084,540 Level Scientific publications	Message 60015 - Posted: 5 Mar 2023 \| 12:27:32 UTC
	I notice that the run times reported for Python tasks on Win10 are much larger than the actual elapsed clock time; rather it is mirroring the value of CPU time. Example, a recently completed WU took just shy of 14 hours to complete (50739 seconds exactly based on first and last time reported in the log) while the Completed Tasks list shows the task taking 208,842.7 seconds to complete. Curious, is this a new anomaly? Task is here: https://www.gpugrid.net/result.php?resultid=33337456
	ID: 60015 \| Rating: 0 \| rate: / Reply Quote

KAMasud Send message Joined: 27 Jul 11 Posts: 137 Credit: 523,901,354 RAC: 0 Level Scientific publications	Message 60016 - Posted: 5 Mar 2023 \| 17:42:03 UTC - in response to Message 60015.
	I notice that the run times reported for Python tasks on Win10 are much larger than the actual elapsed clock time; rather it is mirroring the value of CPU time. Example, a recently completed WU took just shy of 14 hours to complete (50739 seconds exactly based on first and last time reported in the log) while the Completed Tasks list shows the task taking 208,842.7 seconds to complete. Curious, is this a new anomaly? Task is here: https://www.gpugrid.net/result.php?resultid=33337456 ________ Not a new issue. Go to the news section and then the Python concerning thread. Every issue is covered in that thread.
	ID: 60016 \| Rating: 0 \| rate: / Reply Quote

catavalon21 Send message Joined: 1 Feb 09 Posts: 4 Credit: 383,039,090 RAC: 1,084,540 Level Scientific publications	Message 60017 - Posted: 5 Mar 2023 \| 19:58:23 UTC - in response to Message 60016.
	I notice that the run times reported for Python tasks on Win10 are much larger than the actual elapsed clock time; rather it is mirroring the value of CPU time. Example, a recently completed WU took just shy of 14 hours to complete (50739 seconds exactly based on first and last time reported in the log) while the Completed Tasks list shows the task taking 208,842.7 seconds to complete. Curious, is this a new anomaly? Task is here: https://www.gpugrid.net/result.php?resultid=33337456 ________ Not a new issue. Go to the news section and then the Python concerning thread. Every issue is covered in that thread. Thanks. Searched a bit, but obviously not in the correct location.
	ID: 60017 \| Rating: 0 \| rate: / Reply Quote

Keith Myers Send message Joined: 13 Dec 17 Posts: 1340 Credit: 7,653,273,724 RAC: 13,371,735 Level Scientific publications	Message 60018 - Posted: 5 Mar 2023 \| 22:06:47 UTC - in response to Message 60017.
	The posts about this ARE in there. The gist is that since these tasks are of dual cpu-gpu nature and primarily compute on the cpu, BOINC has no mechanism for handling or understanding these tasks. It only knows that a task is either run on the cpu or run on the gpu. It does not compute correctly the cpu_time or run_time for any task that is run on both components. Ignore the cpu_times and run_times, they are meaningless values. Just pay attention to the delta between time sent and time returned to determine the period of time the task takes to crunch.
	ID: 60018 \| Rating: 0 \| rate: / Reply Quote

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,825,716,430 RAC: 19,538,814 Level Scientific publications	Message 60019 - Posted: 5 Mar 2023 \| 22:30:47 UTC - in response to Message 60018.
	Not quite. Runtime and CPU time are recorded locally by the BOINC client, and reported accurately to the project server. At some point, the server throws its hands up, and says "the runtime can't be a low as that, if the CPU time is so high", and 'corrects' the runtime to what it regards is the true value - even if that's impossible given the clock times when the task was issued and returned. I've never summoned up the energy to investigate further, and find out whether that's 1) an error by GPUGrid 2) a deliberate decision by BOINC 3) because GPUGrid is running an outdated server 4) something else But it could be worked out.
	ID: 60019 \| Rating: 0 \| rate: / Reply Quote

Keith Myers Send message Joined: 13 Dec 17 Posts: 1340 Credit: 7,653,273,724 RAC: 13,371,735 Level Scientific publications	Message 60021 - Posted: 6 Mar 2023 \| 1:26:03 UTC - in response to Message 60019. Last modified: 6 Mar 2023 \| 1:45:26 UTC
	But neither the run_time or cpu_time is correctly reported on ANY of my hosts. I return a PythonGPU task in 8.5 - 9 hours consistently on my fastest host. Times reported to the scheduler are ~80000 seconds or equivalent to 22.2 hours. So off by a factor of 3X there consistently. My slowest hosts report using 135K seconds or 37.8 hours of run_time. Tasks actually run and report in 12 - 13 hours. So also off by ~3X factor. So I ignore the reported values as nonsense. [Edit] OK, I see what you are talking about. You are referring the time recorded in the job_log for the task aren't you? But even that is not correct. This is the job_log output for this task. 1678063389 ue 3024664.119761 ct 80540.580000 et 32929.492896 ABOU_rnd_ppod_expand_demos29_2_exp3-0-1-RND8739_0 ue = Estimated runtime = 3024664 seconds = !wrong! ct CPU time = 80540 seconds = !correct! = that is correctly reported value to the project from the job_log. et = elapsed time = 32929 seconds = ! actual correct runtime not reported to the project. But does not reflect any true relation to the actual time the calculation needs to run and report. And here is the task at the project. https://www.gpugrid.net/result.php?resultid=33336333
	ID: 60021 \| Rating: 0 \| rate: / Reply Quote

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,825,716,430 RAC: 19,538,814 Level Scientific publications	Message 60023 - Posted: 6 Mar 2023 \| 12:50:40 UTC - in response to Message 60021.
	[Edit] OK, I see what you are talking about. You are referring the time recorded in the job_log for the task aren't you? Actually, no - I'm not relying on historical logs. I'm looking at the real-time data interchange between BOINC client and BOINC server. I've just reported task e00006a07567-ABOU_rnd_ppod_expand_demos29_2_exp3-0-1-RND8382_1 The event log messages relating to that task are: Mon 06 Mar 2023 12:34:06 GMT \| GPUGRID \| Computation for task e00006a07567-ABOU_rnd_ppod_expand_demos29_2_exp3-0-1-RND8382_1 finished Mon 06 Mar 2023 12:34:26 GMT \| GPUGRID \| Sending scheduler request: To report completed tasks. Mon 06 Mar 2023 12:34:26 GMT \| GPUGRID \| Reporting 1 completed tasks Mon 06 Mar 2023 12:34:27 GMT \| GPUGRID \| [sched_op] handle_scheduler_reply(): got ack for task e00006a07567-ABOU_rnd_ppod_expand_demos29_2_exp3-0-1-RND8382_1 That report was accomplished by sending a file called sched_request_www.gpugrid.net.xml The significant content of that file was: <result> <name>e00006a07567-ABOU_rnd_ppod_expand_demos29_2_exp3-0-1-RND8382_1</name> <final_cpu_time>79615.790000</final_cpu_time> <final_elapsed_time>30189.014614</final_elapsed_time> ... So, the CPU and elapsed times were different when the report left my machine. But they are the same on the website: Run time 79,615.79 CPU time 79,615.79 Neglecting the possibility that it was intercepted by a trojan router between northern England and eastern Spain, I'm asserting that the change happened within the GPUGrid server.
	ID: 60023 \| Rating: 0 \| rate: / Reply Quote

Keith Myers Send message Joined: 13 Dec 17 Posts: 1340 Credit: 7,653,273,724 RAC: 13,371,735 Level Scientific publications	Message 60024 - Posted: 6 Mar 2023 \| 20:30:52 UTC
	OK, poTaytoe - poTahtoe Same values in the job log culled from the sched_reply. But my statement about what the website displaying is nonsense values is correct.
	ID: 60024 \| Rating: 0 \| rate: / Reply Quote

Post to thread

Message boards : Number crunching : Incorrectly reported run time

	About	Science	Volunteers	Performance	Forum	Join us	Donate