@OrioStorm actually, forget everything I said about triple buffering. It didn't work; it just didn't crash for a long time.
However I made an interesting discovery.
I have an i7 7700K with a Noctua cooler (runs very cool) and an MSI 1080 Ti Sea Hawk; these were top-shelf parts a couple of years ago.
Now, I did have some overclocking on the CPU, not a big increase, just from 4.2 GHz to 4.8 GHz.
When I read here about the CPU problems, I reset my bios settings to default so it no longer runs an overclock.
I also disabled Intel SpeedStep in the BIOS.
I played all night without any crash now, longer than with my earlier triple buffering theory. I can even play with V-sync off and whatever settings I like.
I'm pretty sure the game is fixed for me now, because it used to crash several times every hour.
This time I played for many, many hours (all night, actually) without even a single crash.
So I recommend to everyone here, if you can: put your CPU speed back to default and disable Intel SpeedStep.
I will keep on playing all day tomorrow to ensure this in fact had something to do with the crashes.
Like I said, with the OC and Intel SpeedStep on, it would crash almost every match for me.
Now that's all gone, it seems.
Tried disabling the Meltdown/Spectre fixes just to test, and it didn't change anything. Here is another crash log. It's starting to crash a lot without producing a crash log, which is getting quite annoying. I've reset my BIOS and tried everything I could think of. My PC is brand new and I have never had an issue in other games. I even did a fresh install of Windows 10 with the latest 1809 update. It clearly has to be triggering something on the higher-end Intel processors, mainly the 7700K and 8700K. Haven't got the typical EXCEPTION_BREAKPOINT crash yet, though.
April 2019 - last edited April 2019
Thanks for the answer.
The fact is that my CPU is all new, I installed it like 3-4 weeks ago, and I'm 100% sure that it's not overheating. I never had the bug before in 300-400 hours played with the same processor overclocked; I only had it once, right when update 1.1.1 came out. But maybe the overclock is causing trouble, yeah.
I know what you mean, actually I am a system administrator. However I only experienced the crash one time. After a clean reboot I was able to play for 4 hours without any crashes. As you said it must be rare.
Is it also possible that the autoexec command "cl_forcepreload" played a role in this? Mine is set to 0 at the moment (for the last 2 days); it was 1 before I experienced this crash.
Do you recommend a value for this command ?
I have an idea what is causing these crashes.
Are you using the CPU's L3 cache in a nonstandard way?
I made two changes last night which fixed all crashes on an overclocked 9900K AND removed all "Internal Parity Errors". I still need to test again today (it's 3 AM right now) to confirm, but there were no errors in two hours of testing. I have to be 100% sure before I jump the gun.
And no, it was NOT CPU Vcore at all. This might also explain why some people fixed the issue by using an AVX offset (Apex Legends does not use AVX).
@eXe_NIBIRU, the game doesn't use or even create "cl_forcepreload", so my only recommendation is to not set it.
@TEZZ0FIN0, thanks for the experimental results. It's interesting that disabling speedstep seems to have stopped the issue for you. You shouldn't HAVE to disable a CPU feature to get the CPU to work. If the CPU says it will go up to a certain frequency as long as the temperature stays in range, then that frequency should work.
@Falkentyne, the function that is crashing is really quite boring. It's just doing some simple math in a loop. There's nothing that stands out as strange or tricky compared to any other part of the code. If there was anything remotely suspicious in this code, I'd probably change it just to see if "stirring the pot" caused the problem to go away.
Which brings up another interesting point. This function is templated with two nearly-identical versions (one for shadows and one for cameras). The crash has only ever happened in the one for cameras. Why doesn't the other one ever crash? I'll have to compare their disassembly soon...
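To make that concrete, here's a minimal sketch of what such a function could look like: one template, two instantiations that differ only in a compile-time tag, doing simple scalar math in a loop. All names here (PassKind, CullModels, Model) are hypothetical illustrations, not actual game code:

```cpp
#include <vector>

// Hypothetical sketch: one templated function compiled into two
// nearly identical versions, distinguished only by a compile-time tag.
enum class PassKind { Camera, Shadow };

struct Model { float x, y, z, radius; };

template <PassKind Kind>
int CullModels(const std::vector<Model>& models,
               float px, float py, float pz, float maxDist) {
    int visible = 0;
    // Simple scalar math in a loop -- the kind of code the crash
    // reportedly lands in. Nothing exotic: subtract, multiply, compare.
    for (const Model& m : models) {
        float dx = m.x - px, dy = m.y - py, dz = m.z - pz;
        float d2 = dx * dx + dy * dy + dz * dz;
        float r = maxDist + m.radius;
        if (d2 <= r * r)
            ++visible;
    }
    return visible;
}

// Two instantiations with (in this sketch) identical bodies; the
// compiler may still assign registers differently in each one.
template int CullModels<PassKind::Camera>(const std::vector<Model>&, float, float, float, float);
template int CullModels<PassKind::Shadow>(const std::vector<Model>&, float, float, float, float);
```

Since both instantiations come from the same source, a crash that only ever hits one of them is exactly as puzzling as described: any difference has to come from register allocation, code placement, or workload, not from the logic itself.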
You said that it's doing a loop, right?
Aren't loops heavily cache intensive, since they keep re-fetching instructions that have already been executed?
I did a test last night.
I set my 9900K to 5.2 GHz (Hyper-Threading off), and I set my CPU voltage -extremely- high: 1.385v. High enough to be Prime95 AVX stable at 1344K fixed FFTs.
(I even did a 30-minute stress test; temps were reaching 90C but no problems, though it's not safe to test at those voltages.)
That's what you would consider stable, right?
Nope. Apex crashed to desktop with no error.
Then I set it to 1.390v. Apex crashed with the "usual" 2DFA hex error (forgot exactly, but you know it) that many people are getting.
Ok something isn't right if I'm prime AVX 1344K stable but Apex is crashing.
So then I set it to 1.395v. This is getting in dangerous territory.
Guess what happened?
There were no crashes, but an "Internal Parity Error" was logged on CPU core #2 (going from cores 0 to 7).
So then I set the voltage down to 1.335v and got a "memory could not be read" error (attached here).
I then increased CPU PLL Overvoltage (+mV) to +160mv. (I don't even know what this setting does; it has something to do with "clipping" the PLL so the CPU receives more than other devices that use the PLL.) Tested 1.335v again and got three Internal Parity Errors.
So if voltage isn't helping (Apex should NOT be crashing when Prime95 isn't crashing!) and you keep mentioning "loops", I realized it had to be SOMETHING else--possibly CACHE related.
Downclocking my RAM to 2133 MHz (3200 MHz CAS 14 G.Skill) didn't do anything.
So I increased VCCIO and VCCSA voltages to 1.25v and put the CPU voltage back at the unstable 1.335v.
1 hour at 5.2 GHz, HT off (1.335v): no crashes or parity errors.
So then I tried 5.1 GHz, HT on (1.340v). No errors.
So this means that the crashes are related to the L3 cache, not the L2 cache. I know that the VCCIO voltage controls both the memory controller and the shared L3 cache. Since downclocking the RAM didn't help, it isn't the memory controller that is the issue. It also isn't the cache speed, as I set my system earlier to 4.7 GHz core, 4.7 GHz cache, 1.230v (Load-Line Calibration = High), which would cause a clock watchdog timeout in Prime95 FMA3, but Apex ran just fine.
I'm guessing that at higher clockspeeds, the L3 cache needs to run faster because the core is running faster. So if Apex Legends is making heavy use of the cache (you mentioned looping instructions), then that means VCCIO needs to be increased, not CPU Vcore. I'm still testing this though. I think this can explain why other users are passing stress tests but Apex is crashing. As far as I know, many stress tests make HEAVY use of the L1 and L2 caches, which are directly tied to CPU Vcore. L3 cache isn't tied to CPU Vcore, but to VCCIO.
Stock VCCIO is 0.95v and stock VCCSA is 1.05v.
I think people can try increasing both VCCIO and VCCSA to 1.25v and see if their crashes stop.
I looked at the diff of the disassembly of the shadow and camera versions of this templated function. For most of the function, the only difference in the disassembly is what registers the compiler assigned to various things. However, the shadow version always uses a specific lod, so it skips a bunch of scalar floating point math (which is done on SSE hardware).
Also, the shadow version is actually only used for the really distant shadows, which only do partial updates when things like the drop ship are moving. Many frames we don't update them at all. Plus, the nature of the incremental updates means that it processes far fewer models. So, the shadow version does a lot less work, and does it less often. That reduced workload may be enough to explain why it never seems to crash.
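A sketch of the one real difference described above: the shadow instantiation pins the lod to a constant, so the per-model scalar floating-point math is compiled out of it entirely. Again, every name here (SelectLod, kShadowLod, lodScale) is illustrative, not the actual code:

```cpp
#include <cmath>

enum class PassKind { Camera, Shadow };

constexpr int kShadowLod = 0;  // hypothetical fixed lod for the distant-shadow pass
constexpr int kMaxLod = 3;

template <PassKind Kind>
int SelectLod(float distance, float lodScale) {
    if constexpr (Kind == PassKind::Shadow) {
        // Shadow pass always uses one lod, so the scalar math
        // below never even appears in this instantiation.
        return kShadowLod;
    } else {
        // Camera pass: scalar (SSE) floating-point math to pick
        // a lod from distance.
        float t = std::log2(1.0f + distance * lodScale);
        int lod = static_cast<int>(t);
        return lod > kMaxLod ? kMaxLod : lod;
    }
}
```

In a shape like this, the camera version both executes more instructions per model and runs on many more models per frame, which fits the observation that only the camera instantiation ever crashes.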
@Falkentyne, it is doing a loop, so the data will be coming through L3 cache. Some of the data never changes, and some is being produced in another thread.
However, the crashes are all related to instructions, not data. On an i9-9900K the L1 instruction cache is 32 KiB per core, backed by a 256 KiB per-core L2, which is more than enough to hold this function. So normally the whole function will be in the instruction cache after the first iteration. This crash always occurs after a different number of iterations, but the fewest I've seen is still 137 iterations.
Now, that doesn't mean the problem CAN'T be the cache. Windows is a preemptive OS, so it can switch to another program and/or migrate this thread to another core. If either of those happens, it effectively flushes the L1 instruction cache, so it has to refill, which will probably go through the L3 cache.
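One way to test the migration half of that hypothesis would be to pin the thread to a single core and see whether the crash rate changes. A minimal sketch, assuming a Linux/pthreads test box (on Windows, where the game actually runs, the equivalent call is SetThreadAffinityMask); pin_current_thread is an illustrative helper name:

```cpp
#include <pthread.h>
#include <sched.h>

// Pin the calling thread to one core so the scheduler cannot migrate it.
// Windows equivalent: SetThreadAffinityMask(GetCurrentThread(), 1ULL << core).
// Returns 0 on success, an errno-style code on failure.
int pin_current_thread(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &set);
}
```

With the thread pinned, any L1i refill can only come from preemption by another program, not from core migration, so if the crash rate drops noticeably that would point at migration as a contributing factor.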
On the other hand, if it was the cache and not related to the actual instruction sequence, I would expect this bug to show up throughout the executable. Instead, they're always in this one function. So even if L3 cache is a factor, it seems like it may just affect timing, which also appears to be a factor. Also, if it was cache related, I'd expect