
I'm going to suspend crunching for now.

dirkmittler
Joined: 13 Mar 12
Posts: 21
Credit: 8,773,573
RAC: 0
Message 34737 - Posted: 20 Jan 2014 | 4:17:17 UTC

Hello,

For the past few weeks, I got quite a bit of crunching done. But then I discovered what that means, on my machine. My CPU is an Intel i7 950 Quad-Core, with a top clock speed of 3.07 GHz (with 8 virtual cores, hyper-threaded in 4 pairs).

Even though GPU-Grid is most famous for using the GPU, it also uses a CPU quite heavily, just because CUDA computing truly runs both on the CPU and on the GPU. AND, while I was running WUs for GPU-Grid, I was also running WUs for some other BOINC projects. For the other projects, I limited my CPU-core loads to 30%, but it's the maximum between the core usage levels, that determines overall clock speed.

I specifically revved up the GPU fan to 80% (3240RPM), using MSI Afterburner, which I consider to be an absolute necessity for this sort of use.

Further, my CPU's cooling system is a "Corsair CWCH50" liquid cooling unit. What I find is that this little cooling unit is not all it's cracked up to be. It looks cute, and there's a YouTube video that does a good job of advertizing it. But finally, I feel that this CPU would actually need a better cooling system, to be used heavily and continuously so.

http://www.youtube.com/watch?v=jx5usoClfDs

Instead of fitting my PC with a real water-cooling system, the eBay seller chose this route because it was much easier for him to install, and because it sounds neat, just like a real water-cooling setup would. BUT, there is no way this unit could remove the 750W of heat that the video claims, which the seller of my PC also could not have known when he installed it.

But, during continuous use, my CPU can in fact put out 750W of heat, I observe. And so what happens is that in the short run, i.e. for 24-48 hours of operation, this thing will seem to work well. But then, past 48 hours of use, the temperatures in my case just seem to keep getting hotter and hotter. When idle, my CPU temperature hovers near 29 degrees C, while after a lot of crunching, I had CPU temps above 45 degrees, constantly. And, an Intel chip isn't meant to run as hot as an AMD. If this was an AMD, the same temperatures would not worry me. I think that the YouTube video skews some of these numbers slightly.

Finally, from Friday evening (January 17) to Saturday (January 18), I noticed a suspicious odor in my home, which resembled that of freshly-unpacked bandages, but also that of broiled bacon. I suspected this setup as the culprit, and suspended crunching. By this morning (January 19), the odor had vanished overnight. So what I surmise, is that the low-expansion fluid was not zero-expansion, and even though I couldn't prove it, the cooling unit may have lost some fluid. I had noticed before, that sounds were coming from it routinely, which suggested some small pressure buildup within the cooling loop.

Long story short, I could do a lot of fiddling with this thing, including by sandwiching the heat-exchanger between two cooling fans, that only came with one. But one fact which comes into play with my PC, is that all but one of my cooling fans already blow into the case. There's only one fan on it, that actually expels air. And there are many components inside the case, that generate heat, which on inspection were warmer than I thought they'd be, even if no components per se showed evidence of fluid on the outside (when inspected by finger, while running fully). So even with more fans, between the GPU, the CPU and the MB, I see little chance of expelling much more heat in continuous use.

Finally, I'm just happy that nothing fried. And, still owning a PC that has oodles of power with no real damage done, I'm going to hang up my crunching for now. I just don't see it as a safe thing to do anymore.

Some time ago, I had reached a similar conclusion. But recently I hoped, that just because I can rev up my GPU fan I can make it a safe thing to do. I presently feel I cannot.

Sorry,
Dirk

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 34738 - Posted: 20 Jan 2014 | 9:32:55 UTC - in response to Message 34737.

When idle, my CPU temperature hovers near 29degrees C, while after a lot of crunching, I had CPU temps above 45degrees, constantly. And, an Intel chip isn't meant to run as hot as an AMD. If this was an AMD, the same temperatures would not worry me.


How much above 45C... up to 50C? What makes you think that's too hot? Did you go to the Intel website and look at the spec sheet for that particular CPU?
Have you ever even Googled for something like "intel i7 max temp"? Try it, you'll be amazed.

But one fact which comes into play with my PC, is that all but one of my cooling fans already blow into the case. There's only one fan on it, that actually expels air.


That isn't necessarily a bad arrangement. The number of fans pointing this way or that way is irrelevant. What is important is that the total volumetric capacity (CFM) of the fans blowing in should equal the total volumetric capacity of the fans blowing out. For example, 3 small fans blowing in with 1 large fan blowing out isn't necessarily a bad arrangement, it might be, it might not be.

And there are many components inside the case, that generate heat, which on inspection were warmer than I thought they'd be, even if no components per se showed evidence of fluid on the outside (when inspected by finger, while running fully).


That isn't necessarily cause for alarm. Some of the components in there run hot but don't suffer from it. The bottom line is that for components that have temperature sensors on them (disks, CPUs, GPUs, etc.) you need to go to the manufacturer's website and look for the max. operating temp they recommend. Or get the numbers from some other reliable source, then use the temperature monitoring software available to see if those components are running at an acceptable temperature. That's all you can do and you must do it if you're going to crunch. The finger test means nothing.

So even with more fans, between the GPU, the CPU and the MB, I see little chance of expelling much more heat in continuous use.


More fans is rarely the best solution. Smart fan placement and lowering the ambient temp is the best solution.

And, still owning a PC that has oodles of power with no real damage done, I'm going to hang up my crunching for now. I just don't see it as a safe thing to do anymore.


It's safe if you do it the smart, safe way. You were doing it the dumb, unsafe way. Not saying you're dumb, just saying you did a dumb thing. We all do dumb things from time to time and frequently advertisers talk us into buying garbage on the promise it will solve all our problems.

Some time ago, I had reached a similar conclusion. But recently I hoped, that just because I can rev up my GPU fan I can make it a safe thing to do. I presently feel I cannot.


You're right, it's not that simple but it can be done. If you want to return to crunching GPUgrid and want some suggestions on how to do it safely then just ask.


____________
BOINC <<--- credit whores, pedants, alien hunters

dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 817,865,789
RAC: 0
Message 34740 - Posted: 20 Jan 2014 | 13:53:20 UTC

45 degrees is hot? Some of my CPUs (all Intel) never see temps lower than 70 ^^
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Old man
Joined: 24 Jan 09
Posts: 42
Credit: 16,676,387
RAC: 0
Message 34741 - Posted: 20 Jan 2014 | 15:07:21 UTC - in response to Message 34737.



But, during continuous use, my CPU can in fact put out 750W of heat I observe.


Maybe one of those numbers is wrong.
75 W is possible. 750 W is impossible.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,241,765,968
RAC: 3,389,627
Message 34744 - Posted: 20 Jan 2014 | 22:53:52 UTC - in response to Message 34737.

Even though GPU-Grid is most famous for using the GPU, it also uses a CPU quite heavily, just because CUDA computing truly runs both on the CPU and on the GPU.

It's true for the Kepler based GPUs, but you have a GTX 460 which is a Fermi based GPU, and that series doesn't need a full CPU thread to be fed well. You can tell it by the difference between the run time and the CPU time on the list of your host's tasks. The CPU time in every row is only 7-12% of the run time, so the GPUGrid app doesn't use a full CPU thread on your host.
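
The ratio described above is easy to check by hand. A minimal Python sketch, assuming the run time and CPU time (both in seconds) are copied from the host's task list; the function name and the sample numbers are made up for illustration:

```python
def cpu_share(run_time_s: float, cpu_time_s: float) -> float:
    """Return CPU time as a fraction of wall-clock run time."""
    if run_time_s <= 0:
        raise ValueError("run time must be positive")
    return cpu_time_s / run_time_s


# Hypothetical task: 10 hours of run time, 1 hour of CPU time.
share = cpu_share(36_000, 3_600)
print(f"{share:.0%} of one CPU thread")
```

A result somewhere around 0.07 to 0.12 would match the 7-12% figure quoted for this host.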

AND, while I was running WUs for GPU-Grid, I was also running WUs for some other BOINC projects.

Just like most of us do.

For the other projects, I limited my CPU-core loads to 30%,

Your CPU has 8 cores (with HT on), so one core is 12.5% of the total. The 30% setting instructs the BOINC manager to use only 2 cores of the 8, which is an unnecessarily low setting (even when one core is feeding the GPU). It should be between 50% and 60%.
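
The arithmetic behind those percentages can be sketched as follows. This is only an illustration: it assumes the percentage is rounded down to whole cores with a minimum of one, which matches the "30% means 2 of 8" figure above, and the function name is made up.

```python
import math


def usable_cores(logical_cores: int, percent: float) -> int:
    """Whole cores allowed by a "use at most N% of the processors" setting.

    Assumption: the fraction is rounded down, with a floor of one core.
    """
    return max(1, math.floor(logical_cores * percent / 100))


for pct in (30, 50, 60):
    print(f"{pct}% of 8 cores -> {usable_cores(8, pct)} cores")
```

Under that assumption, anything from 50% up to just under 62.5% allows 4 of the 8 cores.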

... but it's the maximum between the core usage levels, that determines overall clock speed.


I specifically revved up the GPU fan to 80% (3240RPM), using MSI Afterburner, which I consider to be an absolute necessity for this sort of use.

That is good. What are your GPU temps? What type of cooler does your card have? (radial, or axial fan - which is very rare among the GTX 460s, only the EVGA GTX 460 FTW has that type of fan). What type of card do you use?

Further, my CPU's cooling system is a "Corsair CWCH50" liquid cooling unit. What I find is that this little cooling unit is not all it's cracked up to be.

The catch in these sealed liquid units is that their radiator can't be placed entirely outside the case, so it will be cooled by the warm air from inside the case (which is a fairly stupid thing to do, instead of removing the hot coolant from the case, and cool it externally), or by the cool air from the outside, but in that way the warm air from the radiator will be blown inside the case (which is a really stupid thing to do).

It looks cute, and there's a YouTube video that does a good job of advertizing it. But finally, I feel that this CPU would actually need a better cooling system, to be used heavily and continuously so.

The cooling of a CPU (or GPU) works in two steps:
1. remove the heat from the CPU
2. remove the hot coolant from the case, and put fresh coolant in its place.
There are better water cooling systems than this one, and those do the 2nd part better.

http://www.youtube.com/watch?v=jx5usoClfDs

Instead of fitting my PC with a real water-cooling system, the eBay seller chose this route because it was much easier for him to install, and because it sounds neat, just like a real water-cooling setup would.

So, as far as I can understand, this guy fell for all this marketing BS, just like a one-off inexperienced customer.

BUT, there is no way this unit could remove the 750W of heat that the video claims,...

The video says nothing about the temperatures at which this unit could do that. For example, a 750W incandescent bulb removes that much heat from its filament by emission, but it reaches 3500K while doing so. Maybe this unit could do it at as low as 100°C. There are no temperature measurements published for this product on Corsair's website. I guess it could dissipate 250~300W through its radiators while the CPU wouldn't go above 80°C, but 750W is an overstatement.
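
The point about temperature can be put in numbers with the usual steady-state relation: heat flow equals temperature difference divided by thermal resistance. A rough Python sketch only; the 25°C air temperature is an assumption, and 275W is simply the midpoint of the 250~300W guess above, not a measured figure.

```python
def thermal_resistance(power_w: float, t_hot_c: float, t_air_c: float) -> float:
    """Thermal resistance in K/W implied by a power and a temperature difference."""
    return (t_hot_c - t_air_c) / power_w


def hot_side_temp(power_w: float, r_th: float, t_air_c: float) -> float:
    """Steady-state hot-side temperature for a given power and resistance."""
    return t_air_c + power_w * r_th


T_AIR_C = 25.0  # assumed air temperature
r_th = thermal_resistance(275.0, 80.0, T_AIR_C)  # guess: about 275 W at 80 C
print(f"implied resistance: {r_th:.2f} K/W")
print(f"hot side at 750 W: {hot_side_temp(750.0, r_th, T_AIR_C):.0f} C")
```

Under those assumptions, moving 750W would put the hot side far above anything a CPU tolerates, which is why a wattage figure without a temperature says very little.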

...which the seller of my PC also could not have known when he installed it.

If he didn't know that then it makes him a mountebank. If he did know (at least from experience with this one), then he is a chiseler.

But, during continuous use, my CPU can in fact put out 750W of heat I observe.

That is impossible, even with LN2 (Liquid Nitrogen, -196°C) cooling.
How did you observe it?
What is your PSU's wattage? Your whole PC can't consume more power than that.

And so what happens is that in the short run, i.e. for 24-48 hours of operation, this thing will seem to work well. But then, past 48 hours of use, the temperatures in my case just seem to keep getting hotter and hotter. When idle, my CPU temperature hovers near 29degrees C, while after a lot of crunching, I had CPU temps above 45degrees, constantly.

Check out your CPU's specifications at Intel's website. It has a maximum rating of 67.9°C, so at 45°C there is nothing to worry about. You can also see that its maximum TDP (Thermal Design Power) is 130W.
However if your temps are getting higher after 10 hours, then the airflow in your case is insufficient and/or inadequate. I think that the hot air from your GTX 460 heats up the air inside the case, and the water cooler's radiator can't do its job right using that hot air.
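
A quick way to see the margin is to subtract the reading from the limit. A minimal Python sketch using the 67.9°C rating quoted above; the function name is made up, and note that monitoring tools usually report core temperatures, which are not the same measurement as the case temperature in the spec, so treat this as a rough check only.

```python
TCASE_MAX_C = 67.9  # maximum rating quoted above for this CPU


def headroom_c(measured_c: float, limit_c: float = TCASE_MAX_C) -> float:
    """Degrees Celsius left before the limit (negative means over the limit)."""
    return round(limit_c - measured_c, 1)


for reading in (29.0, 45.0, 50.0):
    print(f"{reading:.0f} C -> {headroom_c(reading):+.1f} C of headroom")
```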

Finally, from Friday evening (January 17) to Saturday (January 18), I noticed a suspicious odor in my home, which resembled that of freshly-unpacked bandages, but also that of broiled bacon.

That's clearly a strong warning sign, but I'm not sure that it was the coolant's smell.

...no components per se showed evidence of fluid on the outside (when inspected by finger, while running fully).

I suggest using a paper tissue for that inspection.

dirkmittler
Joined: 13 Mar 12
Posts: 21
Credit: 8,773,573
RAC: 0
Message 34773 - Posted: 22 Jan 2014 | 23:37:37 UTC

These are the specs for my CPU:

http://ark.intel.com/products/37150/Intel-Core-i7-950-Processor-8M-Cache-3_06-GHz-4_80-GTs-Intel-QPI

It seems that you were right, about the 750W just being an impossible, bogus wattage. But then the video was also wrong about that. The CPU's maximum wattage is 130W.

The purpose of my finger-test was to look for any possible, leaked coolant, on the assumption that the coolant, of unknown composition, would evaporate more slowly than water. How is a finger-test unreliable for that?

And Tcase is listed here as 67.9 degrees Celsius. According to this, if I ran this CPU at 70 degrees as one of you suggested, the package would already have failed.

Also, if I'm not stupid, but only did something stupid - Besides suspending my number-crunching and using a cooling device that the computer shipped with - What stupid thing did I do?

Dirk

dirkmittler
Joined: 13 Mar 12
Posts: 21
Credit: 8,773,573
RAC: 0
Message 34774 - Posted: 22 Jan 2014 | 23:54:09 UTC - in response to Message 34744.

Even though GPU-Grid is most famous for using the GPU, it also uses a CPU quite heavily, just because CUDA computing truly runs both on the CPU and on the GPU.

It's true for the Kepler based GPUs, but you have a GTX 460 which is a Fermi based GPU, and that series doesn't need a full CPU thread to be fed well. You can tell it by the difference between the run time and the CPU time on the list of your host's tasks. The CPU time in every row is only 7-12% of the run time, so the GPUGrid app doesn't use a full CPU thread on your host.


There's a fact about me which you should take into consideration. I'm an older man, who makes assumptions about computing that are not always 100% up-to-date, with the most fashionable trends in computing, nor with CUDA. I did notice that the latest CUDA versions do support subroutines, and a non-zero GPU stack. But my first assumption isn't that GPU subroutines actually get used.

If the WU is computing nested loops, it makes perfect sense to me, that it could be scheduling the inner-most loops actually to run on GPU cores, while outer loops could run on the CPU. And I don't know everything about the driver issues.

Are your WUs actually compiled to run on a Fermi-based GPU, or were they compiled to be backwards-compatible with Kepler-based GPU cores?

Dirk

dirkmittler
Joined: 13 Mar 12
Posts: 21
Credit: 8,773,573
RAC: 0
Message 34775 - Posted: 23 Jan 2014 | 0:18:04 UTC - in response to Message 34738.

When idle, my CPU temperature hovers near 29degrees C, while after a lot of crunching, I had CPU temps above 45degrees, constantly. And, an Intel chip isn't meant to run as hot as an AMD. If this was an AMD, the same temperatures would not worry me.


How much above 45C... up to 50C? What makes you think that's too hot? Did you go to the Intel website and look at the spec sheet for that particular CPU?
Have you ever even Googled for something like intel i7 max temp. Try it, you'll be amazed.


Without being artificial, I'd say that my software was reporting CPU temperatures of ~50C. The only problem with the exact number, is that my preferences suspend GPU computing anyways, as soon as I touch the mouse and look at CPU temperatures. Thus, by the time I'm looking at them, those temperatures are already decreasing before my eyes. And, this CPU actually has 4 sensors. Two of those usually report temperatures below what the other two report, and the latter report temperatures that generally fluctuate.

I do not know why.

I customized my CPU-monitoring app, to use TJmax = 90C instead of the default TJmax = 100C, according to a thread on another forum, which I can't find right now.
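
The TJmax setting matters because tools of this kind typically read each core's distance below TJmax from the sensor and subtract it to get a temperature. A minimal Python sketch of that arithmetic; the function name and the sample reading of 40 are made up for illustration.

```python
def core_temp_c(tjmax_c: int, distance_to_tjmax: int) -> int:
    """Reported core temperature = TJmax minus the sensor's distance reading."""
    return tjmax_c - distance_to_tjmax


reading = 40  # hypothetical sensor value
for tjmax in (100, 90):
    print(f"TJmax = {tjmax} C -> reported {core_temp_c(tjmax, reading)} C")
```

So with the same sensor reading, lowering TJmax from 100C to 90C lowers every reported temperature by 10C.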

But even if I was to suppose my CPU temperature was "only" 50C, I also never reported here, any CPU errors. Only the fact that I was getting concerned about this.

Dirk

BTW, I never noticed that strange odor in my home again, since I suspended number-crunching. I don't see this as a GPU issue, as much as an issue with the CPU-cooling unit in question, which might just become noticeable, when I do my number-crunching.

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 34776 - Posted: 23 Jan 2014 | 1:25:37 UTC

Nobody's calling you stupid Dirk, it's more of an expression than anything else. I think the smell could have been pressure being released from your cooling unit, not leaking but releasing gas as pressure and it takes a very minuscule amount when hot to make a powerful odor.

Most late model motherboards will shut off when the CPU hits its heat threshold, not shut down but shut off to keep the CPU from being damaged. You will also get a lot of heat from the VRMs, memory, the north bridge and a little from the south bridge.

A good high CFM fan (80 or higher) blowing in the front onto those components and 2 smaller 40 CFM fans blowing out the back make a big difference. Or leaving the side panel off with a large cooling fan (like a household fan) blowing into the PC helps a lot. We use our computers harder than anyone else, that's why if the person building it doesn't crunch too, they really don't understand and think if they cool it to gamers' specs, it will be fine, which it won't.
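
The layout above (one 80 CFM intake, two 40 CFM exhausts) can be checked on paper by adding up the ratings on each side. A minimal Python sketch; the function name is made up, and rated CFM ignores filters, grills and other restrictions, so real airflow will differ.

```python
def airflow_balance_cfm(intake_cfm: list[float], exhaust_cfm: list[float]) -> float:
    """Intake minus exhaust in CFM; zero means balanced on paper."""
    return sum(intake_cfm) - sum(exhaust_cfm)


balance = airflow_balance_cfm([80], [40, 40])
print("balanced" if balance == 0 else f"off by {balance:+.0f} CFM")
```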

dirkmittler
Joined: 13 Mar 12
Posts: 21
Credit: 8,773,573
RAC: 0
Message 34777 - Posted: 23 Jan 2014 | 2:08:50 UTC

@flashawk:

I think that your description of what my CPU-cooler did, matches what I currently suspect very closely. This module has a reservoir, which is strangely mounted with the unit that goes on the CPU, and not with the heat-exchanger as I'd have expected. Further, this unit is quoted (from the video) as being sealed metal-to-metal, with the plastic around the metal. This would mean, that including the tubes, the cooler itself tolerates very little expansion. Then, if the fluid only expands microscopically, there can be some small blow-off. Which would seem to mean, that this cooling loop now has a tiny air bubble it didn't have before, due to which the pressure will also not climb as steeply as it did before.

Therefore, it seems possible that the blow-off might have taken place inside that huge, round thing visible in the YouTube video, where I can neither reach with my finger, nor with any cloth.

BTW, What I seemed to encounter, was that two work-units which ran for 18 hours (+) ran fine, while a shorter work unit, only supposed to run for "2-3 hours", actually caused this thing to flake. What this vague perception would seem to suggest, is that perhaps WU's compiled to take advantage of later CUDA features are less problematical, because they involve the CPU less, than earlier WU's. Somebody might want to check, whether this idea matches the facts (on how your WU's were actually compiled).

I did notice in my Task Manager, that the GPU WU's seemed to engage only 1 CPU core, and not fully so... But I also saw in the "RealTemp" app, that the clock speed was factually knocked up to 3.07GHz part of the time.

As far as the fans are concerned, you were right in your earlier assumption, that I have numerous smaller ones blowing into the case, and one larger fan expelling air - upwards. Maybe, one stupid thing I actually did - hindsight is 20/20 - was not to rev the large, upward-facing exhaust fan, to full RPMs as well, before starting the BOINC WU's.

You see, the maker of this case left it in such a state, that I have potentiometers with which I can adjust all my fan-speeds, although I think that the speed of the actual coolant-pump is on auto-pilot. I did turn up the smaller fans, but clumsily assumed the one larger fan didn't need it. Food for future thought...

I am happy that my GPU temperature was fully stable throughout, at 64-65C .

I'd hesitate to do much hardware-work on the case myself, because I'm just not mechanically inclined. I'm more of a software person.

Finally, right now Montreal is in the middle of another "Polar Vortex", which means that this is not the time to get ambitious just yet, but rather time to scan my home for possible electrical overloads. Belsibud is in control of the weather right now, and this can even paralyze people psychologically, when there has been no real malfunction yet.

I did warn you, that my latest, 'short' GPU WU will time out.

Dirk

dirkmittler
Joined: 13 Mar 12
Posts: 21
Credit: 8,773,573
RAC: 0
Message 34778 - Posted: 23 Jan 2014 | 4:58:01 UTC

I've decided to resume number crunching.

This is based on the observation, that even though I had most of my fans at 100%, I did not have the larger, 110mm fan at the top of my case so. Now that I truly have all my fans at 100% (except for the fan on my HD), I'm betting that the increased air circulation inside my case, will be enough to prevent a long-term overheat.

It was never really 100% certain, that the way I had it before was bound for disaster. But I had decided to play it safe...

Dirk

P.S. At some earlier point in time, I had developed the peculiar notion, that most of my fans were devoted to cooling something specific, while the big one on top was ~mainly for decoration~. In fact, it's the top one that actually expels air from the case. And, I have additional air grills built into the case, so that just to have more air expelled, will passively also cause more air to enter.

Because otherwise, air expelled by fans would equal air blown inside.

And, when I started number-crunching more recently, I had not thought to update the non-GC fan settings.

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 34781 - Posted: 23 Jan 2014 | 7:24:04 UTC - in response to Message 34778.

Dirk,

The dumb thing you did was buy a rig with a closed loop liquid cooling system. Again, not saying you're dumb, not saying you should have known better or anything like that. I did the same dumb thing years ago and it was nothing but a headache. Sure they sound like a great idea and yes they have great cooling capacity but they need constant watching and maintenance. There are so many things that can go wrong with them and if they're not installed properly they most certainly will fail very quickly.

After I tossed the first closed loop liquid cooler I bought I decided to make a few of my own to see if I could come up with something better. As a welder-millwright I've been building and maintaining all sorts of equipment for many, many years, built cooling systems big enough for 10 men to walk into, with coolant lines of 12" diameter steel pipe and 6,000 liter coolant reservoirs, for twin V-16 supercharged diesels. Built many smaller ones too, machined waterblocks from stock, welded them, planed the cold plates, built diaphragm pumps, impeller pumps, radiators you name it. I've wrestled with every problem you'll ever run into with cooling, seen it all, done it all and trust me when I tell you closed loop liquid cooling does not belong in a computer if you're not familiar with how to check and maintain such a system. And if you do know how to check and maintain one then you'll likely know better than to mess with one.

Now, if you know how tapered thread seals work vs. ORB type fittings and which barbs, hoses and clamps to buy (or go with compression sleeves and ferrules) and how to install them properly you can easily convert that closed loop piece of crap into an open loop system with a transparent reservoir that you can check. Or you can do the smart thing and ditch the works and go with air cooling. You're in Canada, me too. Most of the year you can get just as much cooling capacity with plain old air if you use your head. And unless you're down on the Niagara Peninsula where it gets hot as Hades in the summer you can get that all year round, easy, and all you have to worry about is blowing the dust out of the cooling fins and renewing the thermal grease. No leaks to worry about, no pump failure to worry about, ya just use it instead of worrying about it all the time.

One last hint... keep the reservoir above the pump else kiss the thing goodbye.


____________
BOINC <<--- credit whores, pedants, alien hunters

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 34782 - Posted: 23 Jan 2014 | 7:50:57 UTC - in response to Message 34778.

I've decided to resume number crunching.

This is based on the observation, that even though I had most of my fans at 100%, I did not have the larger, 110mm fan at the top of my case so. Now that I truly have all my fans at 100% (except for the fan on my HD), I'm betting that the increased air circulation inside my case, will be enough to prevent a long-term overheat.


That's a bet you could easily lose. It's been demonstrated many times and documented at various websites dedicated to computer cooling issues that too many fans can be just as bad as not enough fans. As flashawk mentioned, the configuration that works best for most users is fans in the front pulling cool air into the case and fans at the rear pushing hot air out. Numerous experiments have shown fans on top and fans on the sides frequently (not always) cause turbulence that reduces air flow and cooling. Every case has a different geometry, whatever works for you works.

It was never really 100% certain, that the way I had it before was bound for disaster. But I had decided to play it safe...


Safe is smart :-)

Dirk

P.S. At some earlier point in time, I had developed the peculiar notion, that most of my fans were devoted to cooling something specific, while the big one on top was ~mainly for decoration~. In fact, it's the top one that actually expels air from the case.


Uh-oh. That doesn't sound good. If the top fan is the only one that expels air and it is not the fan attached to the rad then it's blowing the heat from the CPU back into the case. In his post in this thread Retvari kind of figured that might be the case and now again it sounds like maybe that's what's happening. You don't want to do that, the fan on the rad should expel air.

____________
BOINC <<--- credit whores, pedants, alien hunters

dirkmittler
Joined: 13 Mar 12
Posts: 21
Credit: 8,773,573
RAC: 0
Message 34784 - Posted: 23 Jan 2014 | 11:15:43 UTC - in response to Message 34782.

P.S. At some earlier point in time, I had developed the peculiar notion, that most of my fans were devoted to cooling something specific, while the big one on top was ~mainly for decoration~. In fact, it's the top one that actually expels air from the case.


Uh-oh. That doesn't sound good. If the top fan is the only one that expels air and it is not the fan attached to the rad then it's blowing the heat from the CPU back into the case. In his post in this thread Retvari kind of figured that might be the case and now again it sounds like maybe that's what's happening. You don't want to do that, the fan on the rad should expel air.


It might be ideal, if I had the mechanical skills - and the will - to make that happen. But for now my CPU temperatures look good, fluctuating between 30C and 40C (while crunching). I don't really know why they fluctuate, but they've always done so.

Please don't forget that Yes, 'the radiator fan' does blow the slightly heated air into the case, which means it's cooling the liquid with the colder air from outside. But there's a limit to how much damage merely-lukewarm air inside my case can actually do, since the CPU doesn't have a fan on it, only the liquid cooling pump. As long as the heat doesn't build up in there I should be fine.

And the GPU was steady at 63C just now.

Thanks for all the attention. Now let's just see how good it looks, after 48 hours of crunching.

Dirk

Chilean
Joined: 8 Oct 12
Posts: 98
Credit: 385,652,461
RAC: 0
Message 34785 - Posted: 23 Jan 2014 | 12:31:31 UTC

My CPU is constantly at around 85-90 C (sometimes I have to throttle back the frequency)
GPU constantly at around 75 C.

(Laptop)
____________

dirkmittler
Joined: 13 Mar 12
Posts: 21
Credit: 8,773,573
RAC: 0
Message 34812 - Posted: 25 Jan 2014 | 4:03:40 UTC

Well I've been number-crunching for more than 48 hours again now, with the settings to use at most 25% of my processors, and at most 50% of CPU time. This is for non-GPU WU's.

My CPU temperatures have remained between 35C and 40C.

And I've completed two "long run" GPU WU's, while work is underway on a following "short run" GPU WU.

BTW, the description of the WU states that it uses 0.401 of a CPU, while using 1 GPU.

And, I did not buy this PC. It was a present which was given to me.

Dirk

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 34813 - Posted: 25 Jan 2014 | 11:44:40 UTC - in response to Message 34812.

That's a very nice gift, one I'd love to receive myself. It was a mistake for me to assume you bought it.

The fluctuations in temperature you mentioned in a previous post are normal and nothing to worry about unless the high end of the swing takes the device over its max temp. You are well within the limit. There are 2 influences on the swings. The first is that certain parts of the project's application (and other software running on your computer) make the CPU work harder which generates more heat. The other influence is hysteresis which is the time lag between a cause and an effect/result. The cooling system is supposed to respond to the increase in temperature and it does but not immediately. When it eventually responds, it takes time for the coolant to circulate out through the radiator and back again. Likewise, when the temperature returns to the target temperature the cooling system should respond by decreasing the cooling effect. It does but not immediately, and there is a delay that results from the time required for coolant to flow. The result is the temperature fluctuates up and down from the target temperature. You can reduce hysteresis and the magnitude of the swings but of course it will cost you something. Not saying you should reduce it, just saying one usually can in any system if one can pay the price. Personally, I don't like temperature hysteresis because it causes expansion and contraction which is not good but there is only so much I will do to reduce it.

Again, you're well below the max temp and if that's where you want to stay then that's your choice. If you're looking to see if you can force another coolant leak, it probably won't happen unless you get the temperature up a little higher.

You're using the throttle built into BOINC to slow the CPU down. It works, but even the BOINC devs admit it's not a very good throttle (it's a compromise between time, what works on Windows, what works on Linux and what works on OS X). The recommended throttle is TThrottle and I believe it would work very well for you. I don't use it because I don't need a throttle, but apparently it allows you to set a target CPU temperature, and if the CPU goes beyond the target, TThrottle slows the CPU down until the temperature drops below the target. In addition, TThrottle's on-off time granularity is much finer than that of BOINC's built-in throttle.
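
The idea of a temperature-target throttle can be sketched roughly as follows. This is not TThrottle's actual algorithm, which isn't described in this thread; the function name, the 10% step and the 10% floor are all assumptions chosen for illustration.

```python
def next_duty(duty, temp_c, target_c, step=10, floor=10):
    """Return the next CPU duty cycle in percent.

    Hypothetical rule: back off by `step` while the reading is above the
    target, otherwise creep back up toward 100%. Never go below `floor`.
    """
    if temp_c > target_c:
        return max(floor, duty - step)
    return min(100, duty + step)


if __name__ == "__main__":
    duty = 100
    # Made-up temperature readings, one per control interval.
    for temp in (42, 47, 49, 46, 44, 43):
        duty = next_duty(duty, temp, target_c=45)
        print(f"{temp} C: run at {duty}%")
```

The difference from a fixed "use at most X% CPU time" setting is that the duty cycle follows the measured temperature instead of staying at one value regardless of how hot the chip is.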


____________
BOINC <<--- credit whores, pedants, alien hunters

dirkmittler
Joined: 13 Mar 12
Posts: 21
Credit: 8,773,573
RAC: 0
Message 34836 - Posted: 28 Jan 2014 | 1:14:56 UTC

I've just come to an important realization about those apparent CPU-core temperature swings. I've been using RealTemp-3.70 to monitor my CPU periodically, and apparently this little program can't quite tell whether I have 4 CPU cores or 8. That's alright, because I've had a hard time explaining it to my friends personally. :-D

But apparently, the core temperatures were not really fluctuating. Instead, this program, when running, was alternating between 2 sets of 4 core-temperatures each.

My CPU must have a total of 8 sensors, to go with the '8 virtual cores, which are threaded as 4'. But the program registers only 4 cores, at any one given time.

What this means is that my coolest core was around 30C for now, while my hottest core has been around 45C. But that's about all there is to it. After I break the blank-screen screensaver, the hottest core comes down from its 45C to closer to 40C.

Also, I understand that when I throttle the BOINC client, it intentionally alternates between full CPU use and zero. I suspect the reason is that the CPU only comes up to full clock speed when one of the cores is near 100% used. I.e., it would make little sense for the BOINC client to throttle the core only to 50% usage, if doing so also meant that the clock speed would be only part-way up.

But the time-frame in which this seems to happen is much too fast for a CPU core to change 10C or more in temperature.
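
The on/off behaviour described above, where a throttle alternates between full load and zero instead of running at a steady partial load, can be sketched like this. The 10-second period is an assumption for illustration only and is not BOINC's actual timing.

```python
def run_pause(percent, period_s=10.0):
    """Split one throttle period into (run_seconds, pause_seconds).

    `period_s` is a hypothetical value, not taken from BOINC.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    run_s = period_s * percent / 100
    return run_s, period_s - run_s


def load_at(t_s, percent, period_s=10.0):
    """Instantaneous load at time t_s: always 100 or 0, never in between."""
    run_s, _ = run_pause(percent, period_s)
    return 100 if (t_s % period_s) < run_s else 0


if __name__ == "__main__":
    print(run_pause(30))      # run and pause lengths for a 30% setting
    print(load_at(2.0, 30))   # inside the run window
    print(load_at(5.0, 30))   # inside the pause window
```

The average load over a period equals the percentage, but at any instant the core is either fully busy or idle.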

Bye,
Dirk

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 34837 - Posted: 28 Jan 2014 | 6:17:04 UTC - in response to Message 34836.

I've never heard the expression "break the screensaver" before. I assume it means the screensaver is on because you're not using the computer, then you move the mouse or use the keyboard and the screensaver turns off. If that's what you mean, and if the temperature of a CPU drops from 45 to 40 when you break the screensaver, then I am pretty sure you have BOINC configured to suspend computing while you are using the computer. BOINC defines "using the computer" as keyboard or mouse activity. You could be seeing temperature swings for that reason too; in fact, that and hysteresis may be the only reasons for 95% of the swings you see.

I just don't buy the "8 sensors, one for each virtual core, which are threaded to 4". I don't understand what you mean by "threaded to 4". It's a usage and/or concept I am unfamiliar with and I doubt Intel would have 1 sensor per virtual core. To me, that would make no sense. Maybe someone else can make sense of it?

Also, I've never used RealTemp so I don't know what it is capable of or incapable of but I kind of suspect it isn't switching from one set of sensors to another. I could be wrong about that but at the moment it just sounds too odd to be true, IMHO.

I suspect there is some other reason for why you're seeing what you're seeing, I don't think you've got it figured out yet and I highly recommend that you turn off BOINC's throttle and forget that it even exists. You've mentioned that it wouldn't make sense for BOINC to do <whatever> but trust me when I tell you there is not one single lick of sense in BOINC's throttle, it is as dumb as a bag full of broken hammers and doesn't make any decisions about what it ought to do or not do. By removing it from the mix of things that are affecting your temperatures you'll have one less thing to try to explain and will understand what's happening with your temps that much sooner. I highly doubt you need a throttle anyway if you're going to use only 50% of the cores for crunching and suspend crunching when you use the computer (if that is what you are doing).

____________
BOINC <<--- credit whores, pedants, alien hunters

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 34889 - Posted: 4 Feb 2014 | 0:11:30 UTC - in response to Message 34837.
Last modified: 4 Feb 2014 | 0:33:29 UTC

I'll chime in, but only to say this, as clearly as possible, so as to help you guys distinguish real cores, hyperthreading, and logical cores.

I have an Intel i7 965 eXtreme Edition CPU.
It has 4 real cores, with hyperthreading (enabled in the BIOS).
Hyperthreading allows 2 threads to run simultaneously on a single core, offering potential performance gains.
This means that Windows (and other programs) are capable of showing 8 logical cores. I still only have 4 real cores, but the "usage" of each "hyperthreaded core" can be tracked, such that I can track usage on all 8 of my logical cores.
But temperature readings can only come from the cores themselves; as such, CPU Temp shows 4 temperatures, 1 for each core.

Dirk, I strongly recommend setting "Use at most X% CPU Time" to 100%. BOINC doesn't handle this setting too well, and some applications also don't handle it very well, especially GPU applications such as GPUGrid.

Then, if you want to limit how many concurrent processes BOINC is allowed to run (maybe for heat concerns?), set "On multiprocessor systems, use at most X% of the processors" to a value. I recommend 100%, but on an 8-virtual-core system, other valid values are: 88% (7 CPU), 75% (6 CPU), 63% (5 CPU), 50% (4 CPU), 38% (3 CPU), 25% (2 CPU), 13% (1 CPU).
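
The percentages listed above follow from simple arithmetic. Here is a small sketch that reproduces them, assuming the client multiplies the number of logical CPUs by the percentage and rounds the result down; that rounding rule is an assumption, and the function name is made up for illustration.

```python
def percent_for_cpus(wanted, logical_cpus=8):
    """Smallest whole percentage p for which logical_cpus * p / 100,
    rounded down, is at least `wanted` CPUs."""
    if not 1 <= wanted <= logical_cpus:
        raise ValueError("wanted must be between 1 and logical_cpus")
    # Integer ceiling of 100 * wanted / logical_cpus.
    return -((-100 * wanted) // logical_cpus)


if __name__ == "__main__":
    for n in range(8, 0, -1):
        print(f"{n} CPU: {percent_for_cpus(n)}%")
```

For example, 7 of 8 CPUs is 87.5%, which has to be rounded up to 88% so that the rounded-down CPU count is still 7.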

In regards to temperatures, I routinely run all cores at 78-85*C, and all 3 GPUs at 65-80*C. Hot temps are normal for hot systems under full load.

Hope that helps.
- Jacob

dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 817,865,789
RAC: 0
Message 34914 - Posted: 5 Feb 2014 | 15:22:15 UTC - in response to Message 34785.

My CPU is constantly at around 85-90 C

(Laptop)


*hehe* nice
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 34919 - Posted: 5 Feb 2014 | 16:40:34 UTC - in response to Message 34889.

I'll chime in, but only to say this, as clearly as possible, so as to help you guys distinguish real cores, hyperthreading, and logical cores.


Gee thanks, Jake, but I'm quite aware of the diff between real cores, logical (virtual) cores and what hyperthreading is. I suspect Dirk is too but maybe an unfortunate choice of words on his part led to my not understanding him. I don't see any sense in my responding with recommendations to something I don't understand unless it's just to say "I don't understand" or "you're talking gibberish".

____________
BOINC <<--- credit whores, pedants, alien hunters

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 34920 - Posted: 5 Feb 2014 | 16:58:20 UTC - in response to Message 34919.
Last modified: 5 Feb 2014 | 17:01:55 UTC

I'm sorry. I knew there was a chance that what I'd post may come across as "talking down", and that certainly wasn't the intent.

I was merely trying to make it clear, for Dirk especially, that a quad-core hyperthreaded CPU has 4 real cores, 8 virtual cores, and likely only 4 CPU temperature sensors. I use CoreTemp to monitor the temp and load on each real core.

And I set BOINC up to schedule 8 CPUs' worth of tasks, to ensure I can take advantage of any hyperthreading gains that might occur when running 2 threads on a single core. As such, CoreTemp always shows all 4 cores as 100% loaded.

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 34922 - Posted: 5 Feb 2014 | 17:20:47 UTC - in response to Message 34920.
Last modified: 5 Feb 2014 | 17:21:14 UTC

I'm sorry. I knew there was a chance that what I'd post may come across as "talking down", and that certainly wasn't the intent.


No problem :-) Been doing this since early FidoNet days, I usually don't take offence and don't hold grudges. Due to limitations in this form of comms as well as user time constraints I take it all with a grain or two of salt.
____________
BOINC <<--- credit whores, pedants, alien hunters
