Re: Max CPU threads tweak for Core i7 and AMD FX's CPU's - Testing & Results

by RohanVonDragon

[ Edited ]
★★ Apprentice

Edit: Thanks to everyone who tested; go to the accepted solution if you want the technical details. To increase your max CPU threads and measure performance, follow the post below. Most people will likely find a sweet spot, with a thread count that is neither too high nor too low. Your results depend heavily on whether or not your GPU is the bottleneck. NPC-crowded areas are the most demanding on the CPU and should show the most improvement. Note: this won't do anything for 4-core Intel CPUs without HT or for 4-core AMD CPUs, since both use 3 threads by default and can't go past that.

 

Original post (Edited slightly):

 

I found a way to increase the number of "Job threads" the game will use. A friend of mine reported that it improved his performance a lot in NPC-crowded areas.

 

Instructions:

1. Create a file named "user.cfg" in the DA:I install directory (the same folder where dragonageinquisition.exe is located).

2. Add the following lines:

Render.DrawScreenInfo 1

Thread.MaxProcessorCount 6

3. Save the file and start the game. You should see something similar to this: https://i.imgur.com/KFnvAA4.png (taken pre-patch; it reports 3 for me now, post-patch).

4. Raise Thread.MaxProcessorCount until the game stops increasing the number of job threads; that is your maximum. Once you know it, you may want to lower the value to whatever sweet spot you find. After you have the information you need, remove the Render.DrawScreenInfo line to hide the overlay at the top of the screen. (Edit: max threads should be 7 for an Intel 4-core with HT, and also 7 for AMD 8-"core" FX CPUs, but 6 may be more optimal.)
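The steps above can also be scripted. Here is a minimal Python sketch: the two setting names are the ones from this post, but the function names are made up for illustration, and you have to supply your own install folder since it differs per machine.

```python
from pathlib import Path


def build_user_cfg(max_processor_count, show_screen_info=True):
    """Return the text of a user.cfg using the settings named in this post."""
    lines = []
    if show_screen_info:
        # Overlay that reports "CPU Cores" and "Job Threads" at the top of the screen.
        lines.append("Render.DrawScreenInfo 1")
    lines.append(f"Thread.MaxProcessorCount {max_processor_count}")
    return "\n".join(lines) + "\n"


def write_user_cfg(install_dir, max_processor_count, show_screen_info=True):
    """Write user.cfg into install_dir (the folder with the game .exe) and return its path."""
    cfg_path = Path(install_dir) / "user.cfg"
    cfg_path.write_text(build_user_cfg(max_processor_count, show_screen_info))
    return cfg_path


if __name__ == "__main__":
    # Only prints the file contents; call write_user_cfg(<your install folder>, 6) to save it.
    print(build_user_cfg(6), end="")
```

Re-run it with a higher number each time until the overlay stops showing more job threads, then pass show_screen_info=False once you have settled on a value.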

 

I have a 4-core i5 without HT, so I can't get the number past 3, but a friend of mine with a 6-core i7 with HT got it up to a maximum of 10 job threads. So I'm thinking:

A: the game will always leave one "physical" core free (in his case, that reserved 2 of his 12 logical CPUs)

or

B: the game is limited to 10 threads max, while also trying to leave at least one logical CPU free.
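To make the two guesses concrete, here is a small Python sketch. The formulas are only my reading of A and B above, not anything taken from the engine, and the function names are made up for illustration.

```python
def jobs_hypothesis_a(physical_cores, threads_per_core):
    """A: one whole physical core (all of its logical CPUs) is left free."""
    return (physical_cores - 1) * threads_per_core


def jobs_hypothesis_b(physical_cores, threads_per_core, cap=10):
    """B: one logical CPU is left free, and job threads are capped at `cap`."""
    logical_cpus = physical_cores * threads_per_core
    return min(cap, logical_cpus - 1)


# (physical cores, hardware threads per core)
cpus = {
    "4-core, no HT": (4, 1),
    "6-core with HT": (6, 2),
    "4-core with HT": (4, 2),
}

for name, (cores, tpc) in cpus.items():
    a = jobs_hypothesis_a(cores, tpc)
    b = jobs_hypothesis_b(cores, tpc)
    print(f"{name}: A={a}, B={b}")
```

Both guesses give 3 for my i5 and 10 for the 6-core i7, so those two results can't tell them apart. They only disagree on a 4-core with HT (6 under A, 7 under B), which is why that test case matters.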

 

I'm particularly interested in results from 4-core i7s with HT (8 logical CPUs), because I'm thinking of upgrading, but it would also be interesting to see how high an 8-core (16 logical CPUs with HT) or an AMD CPU with 8 or more cores could go.

 

Also interesting to note: the patch raised the default number of job threads (for me at least, from 2 to 3). That could explain why some people are reporting worse performance after the patch, although in my experience it has improved performance.

 

Another tweak you can try if you want is adding this line to that file:

gametime.maxvariablefps 60

 

This tells the game not to render (or pre-render) past 60 fps, which seemed to reduce stutter a lot and even raised my fps without moving the camera. (This is not the 60 fps cutscene hack; cutscenes will still run at 30.) My conclusion is that the game issues too many draw calls ahead of time, i.e. too many pre-rendered frames, which causes CPU spikes and frame dips. That would also explain why some people report that setting max pre-rendered frames to 1 improved their performance; unfortunately, SLI users can't control this value. I suspect that both the game and the GPU driver are pre-rendering frames, which creates a bottleneck. To investigate the poor performance I used "Perfoverlay.DrawGraph 1", which adds a CPU/GPU usage graph.

 

To show your current FPS, add this line:

perfoverlay.drawfps 1 

 

Please post results. To revert changes, delete/rename the user.cfg file.

 

Thanks.

 

Message 1 of 16 (10,032 Views)

Accepted Solution

Re: I found a CPU threads tweak, can anyone with a HT CPU test it? may work for

[ Edited ]
★★ Apprentice

Edited! New information, new conclusion.

 

Thanks for testing; this confirms my suspicions about how the game engine works. I've come to the following conclusions:

 

On Intel CPUs, the engine always tries to reserve one logical CPU (a core, or an HT thread when HT is enabled): e.g. 7 jobs max on a 4-core with HT (8 logical CPUs), and 3 jobs max on an i5 without HT.

On AMD CPUs, the engine always tries to reserve one core or CMT core: e.g. 7 jobs max on an 8-"core" FX CPU, and presumably 5 threads max on a 6-"core".

The engine may have a 10-thread limit, since a 6-core with HT only goes up to a max of 10.

 

It would be interesting to see some tests with an Intel 8-core/16-thread HT CPU or an AMD Opteron with 16 "cores" to see if there really is a hard 10-job limit in the engine.

 

Technical details and speculations/opinions:

AMD FX 8-core CPUs have 4 core clusters, each containing two dedicated integer cores but a single shared FPU. AMD calls this design CMT, and Windows treats it similarly to Intel's HT (also called SMT). According to AMD documentation, the FPU shared between the two integer cores can perform independent operations when operating in 128-bit mode, or be combined for 256-bit instructions. Intel i7 CPUs have single but powerful cores that can each run 2 threads, although the two HT threads share most of the core's execution resources. Due to this design, the cores are presumably better suited to running fewer threads with advanced SSE and AVX instruction sets, or more threads with HT when a program is coded with HT in mind. (I'm unsure about Intel's HT best coding practices; there's probably documentation on how to code 2 threads to run with minimal contention on each core.) The AMD CPUs seem to be better suited to more threads if the engine is programmed properly. So more threads may not always be better, but could be in some circumstances. Apples and oranges, each with its own strengths and weaknesses. I'm not trying to start a CPU war here.

 

In the AMD case here, the engine seems to be better optimized to take advantage of more threads, given that it will enable 7 threads on their 8-core CPUs, presumably because of console optimizations (since the Xbox One and PS4 both have AMD CPUs). If it were more Intel-optimized, it would make better use of the latest SSE and AVX technologies, with HT optimizations. I'd imagine it does to some degree, but probably more work has been put into optimizing for AMD, and rightly so, since AMD game performance is mediocre in comparison when a game isn't properly optimized to use AMD's clustered cores and shared core resources efficiently (assuming there isn't a GPU bottleneck first).

 

My opinion on max jobs: if you're running a lot of background apps and/or have high OS overhead, maxing out the threads may not be optimal. The game does attempt to save one core for background stuff, but if background apps are running on that last core and it shares resources with a core the game is using, it could cause thread contention.

 

 

Next up... benchmarks to find optimal thread counts, then tackle the GeForce/Radeon debate: this game clearly lacks NVIDIA optimizations, but there are many workarounds. I will save that can of worms for a new topic.

Message 6 of 16 (11,242 Views)

All Replies

Re: I found a CPU threads tweak, can anyone with a HT CPU test it? may work for

[ Edited ]
★★ Apprentice

bump, can anyone please test this? I really need to know how the max thread jobs scale with a HT CPU.

Message 2 of 16 (9,957 Views)

Re: I found a CPU threads tweak, can anyone with a HT CPU test it? may work for

[ Edited ]
★★★★★ Novice

I have an i7-4790K, and with Thread.MaxProcessorCount 8 I got 6 job threads.

Edit: I also get the same fps with 4 or 6 job threads.

Message 3 of 16 (9,934 Views)

Re: I found a CPU threads tweak, can anyone with a HT CPU test it? may work for

[ Edited ]
★★ Pro

I have an AMD FX-8350 8-Core

Dual R9-270 DCUII OC 2GB in crossfirex

 

I changed Thread.MaxProcessorCount to 8 and got 7 threads in game.

Pre-patch I was getting 8 cores, 4 threads.

-update- With dual GTX 760s in SLI I get 8 cores, 6 threads with the same command.

-update2- I get a different thread count with AMD vs. NVIDIA cards: the AMD setup uses one more thread than the NVIDIA one (8 cores, 7 threads) with the same command.

 

Thank you for the information. I was looking for a way to increase thread usage.

 

+1

 

 

-As an afterthought: the game gets really stuttery if you enable multi-GPU with this command:

RenderDevice.MultiGpuEnable 1

so I removed it from my user.cfg.

The game still stutters a little from time to time, with large fps drops (e.g. from 200 fps down to 2 or 3) in some places, usually when spawning in after a fast travel, or when memory usage goes really high or pagefile usage goes over 12 GB. So there are still some memory leaks and processor issues.

Message 4 of 16 (9,923 Views)

Re: I found a CPU threads tweak, can anyone with a HT CPU test it? may work for

★★★★ Apprentice

FX-8370, 2x R9 290 Tri-X

 

"Thread.MaxProcessorCount 8" seems to have fixed the single-player rubberbanding I had been getting since the hotfix when running around and steering with the right mouse button.

Message 5 of 16 (9,886 Views)

Re: I found a CPU threads tweak, can anyone with a HT CPU test it? may work for

★★ Novice
LeetMiniWheat, so for example, when I put my max/min at 8 with an i7 3770K @ 4.4, the game says 7 threads. Would the optimal configuration be 6 threads, leaving the other two threads free (one for background and one just in case)?

The game originally told me that I was only using 4 threads, but I couldn't discern a difference. However, that could be because I set too high a thread count, as you said.

I'm just trying to understand whether I should leave it at the default or try messing with it.
Message 7 of 16 (9,658 Views)

Re: Max CPU threads tweak for Core i7 and AMD FX's CPU's - Testing & Results

[ Edited ]
★ Apprentice

Here are my results

 

Renderer: Direct3D 11.0  1080x720 (scaled: 320x180) @ 0.00Hz  Vsync: On  GPUs: 1  GPU RAM: 0/1011  GPUs: 1  GPU: [0x6739\  AMD Radeon HD 6800 Series    CPU Cores: 4  Job Threads: 3  CPU: Intel Core i5 CPU    661 @ 3.33GHz
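If you want to compare several pasted overlay lines like the one above, the two numbers that matter for this thread can be pulled out with a short Python sketch. The field labels ("CPU Cores", "Job Threads") are copied from that line; that other game versions print the same labels is an assumption, and the function name is made up for illustration.

```python
import re


def parse_screen_info(text):
    """Pull the 'CPU Cores' and 'Job Threads' numbers out of pasted overlay text."""
    cores = re.search(r"CPU Cores:\s*(\d+)", text)
    jobs = re.search(r"Job Threads:\s*(\d+)", text)
    return {
        "cpu_cores": int(cores.group(1)) if cores else None,
        "job_threads": int(jobs.group(1)) if jobs else None,
    }


# Shortened copy of the overlay line pasted in this post.
sample = "GPUs: 1  GPU RAM: 0/1011  CPU Cores: 4  Job Threads: 3  CPU: Intel Core i5 CPU    661 @ 3.33GHz"
print(parse_screen_info(sample))
```

A missing field comes back as None instead of raising an error, so partial pastes still work.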

 

CPU is maxed out while trying to play.

 

 

Question: the GPU RAM says 0/1011. Does that mean the game is not using any of the GPU's onboard RAM?

Message 8 of 16 (9,622 Views)

Re: Max CPU threads tweak for Core i7 and AMD FX's CPU's - Testing & Results

[ Edited ]
★★★ Apprentice

Hey, good job ^^ but I'm not an English or hardware expert... can someone translate it into German ^^ ? A short video tutorial on YouTube would be good :D

Message 9 of 16 (9,583 Views)

Re: I found a CPU threads tweak, can anyone with a HT CPU test it? may work for

[ Edited ]
★★ Apprentice

@Nadajohna wrote:
@Sh0tGunAnGeL, so for example, when I put my max/min at 8 with an i7 3770K @ 4.4, the game says 7 threads. Would the optimal configuration be 6 threads, leaving the other two threads free (one for background and one just in case)?

The game originally told me that I was only using 4 threads, but I couldn't discern a difference. However, that could be because I set too high a thread count, as you said.

I'm just trying to understand whether I should leave it at the default or try messing with it.


Are you saying it actually shows 7 threads on a 4-core (8-thread) CPU? I thought someone above confirmed it would only go to a max of 6. This changes things, then.

Message 10 of 16 (9,500 Views)