id Tech Forums

id Tech 4 (Doom3/Prey/Q4) => id Tech 4 Engine Coding => Topic started by: revelator on December 12, 2019, 02:57:03 AM

Title: updated code for getting system memory
Post by: revelator on December 12, 2019, 02:57:03 AM
little tidbit for systems like mine with more than 4 GB of system memory.

in win_shared.cpp put this at the top ->

/* functions for the below */
typedef BOOL( WINAPI *PGetPhysicallyInstalledSystemMemory )( PULONGLONG TotalMemoryInKilobytes );
#define GPA( module, func ) ( PGetPhysicallyInstalledSystemMemory ) GetProcAddress( GetModuleHandle( module ), func )

/*
================
Sys_GetSystemMemory

description:
retrieves installed physical memory from bios
================
*/
static uint64_t Sys_GetSystemMemory( void )
{
/* fallback for builds targeting XP or older, where GetPhysicallyInstalledSystemMemory does not exist */
#if ( _WIN32_WINNT <= 0x501 )
MEMORYSTATUSEX statex;
statex.dwLength = sizeof( statex );
if ( !GlobalMemoryStatusEx( &statex ) )
{
return 0;
}
uint64_t mem = static_cast<uint64_t>( statex.ullTotalPhys ) / 1024; /* bytes to kb, plain integer division */
return mem;
#else
/* this works on vista SP1 and up */
ULONGLONG mem = 0; /* physical memory installed (kb) */

PGetPhysicallyInstalledSystemMemory pFGetPhysicallyInstalledSystemMemory = GPA("kernel32.dll", "GetPhysicallyInstalledSystemMemory");

/* Uh oh... */
if ( !pFGetPhysicallyInstalledSystemMemory || !pFGetPhysicallyInstalledSystemMemory( &mem ) )
{
return 0;
}

/* Couldnt get system ram */
if ( !mem )
{
return 0;
}
return static_cast<uint64_t>( mem );
#endif
}


and replace Sys_GetSystemRam with this ->

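uint64_t Sys_GetSystemRam( void )
{
    /* kb to MB, integer division avoids a needless round trip through double */
    uint64_t physRam = Sys_GetSystemMemory() / 1024;
    /* round to the nearest multiple of 16 MB */
    physRam = ( physRam + 8 ) & ~15;
    return physRam;
}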


remember to change it in the header as well since the original used int as a return value.

code will now correctly report your system ram on newer windows versions, the old code could only report 4 GB max.
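
To make the 4 GB limit and the rounding step a bit more concrete, here is a small standalone sketch. The helper names KilobytesToRoundedMB and FitsIn32Bits are made up for illustration and are not part of the engine; only the rounding expression is taken from Sys_GetSystemRam above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the engine): converts a size in KB to MB and
// rounds it to the nearest multiple of 16 MB, like Sys_GetSystemRam does.
static uint64_t KilobytesToRoundedMB( uint64_t kilobytes ) {
    uint64_t mb = kilobytes / 1024;          // integer division, no floating point
    return ( mb + 8 ) & ~uint64_t( 15 );     // nearest multiple of 16
}

// Hypothetical helper: true if a byte count still fits in a 32 bit unsigned value.
// Anything at or above 4 GB does not, which is where a 32 bit counter gives up.
static bool FitsIn32Bits( uint64_t bytes ) {
    return bytes <= std::numeric_limits<uint32_t>::max();
}
```

With 8 GB installed (8388608 KB) the first helper gives 8192, while the same amount expressed in bytes no longer fits in 32 bits, so a 32 bit return value has to be widened to uint64_t.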

and get rid of the video ram detection code, it no longer works correctly and also suffers from the same limit.
you can only get video ram on modern gfx cards by using card specific codepaths like this ->

void idVertexCache::Show( void )
{
GLint  mem[4];

if ( GLEW_NVX_gpu_memory_info && ( glConfig.vendor == glvNVIDIA ) )
{
common->Printf( "\nNvidia specific memory info:\n" );
common->Printf( "\n" );
glGetIntegerv( GL_GPU_MEMORY_INFO_DEDICATED_VIDMEM_NVX , mem );
common->Printf( "dedicated video memory %i MB\n", mem[0] >> 10 );
glGetIntegerv( GL_GPU_MEMORY_INFO_TOTAL_AVAILABLE_MEMORY_NVX , mem );
common->Printf( "total available memory %i MB\n", mem[0] >> 10 );
glGetIntegerv( GL_GPU_MEMORY_INFO_CURRENT_AVAILABLE_VIDMEM_NVX , mem );
common->Printf( "currently unused GPU memory %i MB\n", mem[0] >> 10 );
glGetIntegerv( GL_GPU_MEMORY_INFO_EVICTION_COUNT_NVX , mem );
common->Printf( "count of total evictions seen by system %i\n", mem[0] ); /* a plain count, not a size in kb */
glGetIntegerv( GL_GPU_MEMORY_INFO_EVICTED_MEMORY_NVX , mem );
common->Printf( "total video memory evicted %i MB\n", mem[0] >> 10 );
}
else if ( GLEW_ATI_meminfo && ( glConfig.vendor == glvAMD ) )
{
common->Printf( "\nATI/AMD specific memory info:\n" );
common->Printf( "\n" );
glGetIntegerv( GL_VBO_FREE_MEMORY_ATI, mem );
common->Printf( "VBO: total memory free in the pool %i MB\n", mem[0] >> 10 );
common->Printf( "VBO: largest available free block in the pool %i MB\n", mem[1] >> 10 );
common->Printf( "VBO: total auxiliary memory free %i MB\n", mem[2] >> 10 );
common->Printf( "VBO: largest auxiliary free block %i MB\n", mem[3] >> 10 );
glGetIntegerv( GL_TEXTURE_FREE_MEMORY_ATI, mem );
common->Printf( "Texture: total memory free in the pool %i MB\n", mem[0] >> 10 );
common->Printf( "Texture: largest available free block in the pool %i MB\n", mem[1] >> 10 );
common->Printf( "Texture: total auxiliary memory free %i MB\n", mem[2] >> 10 );
common->Printf( "Texture: largest auxiliary free block %i MB\n", mem[3] >> 10 );
glGetIntegerv( GL_RENDERBUFFER_FREE_MEMORY_ATI, mem );
common->Printf( "RenderBuffer: total memory free in the pool %i MB\n", mem[0] >> 10 );
common->Printf( "RenderBuffer: largest available free block in the pool %i MB\n", mem[1] >> 10 );
common->Printf( "RenderBuffer: total auxiliary memory free %i MB\n", mem[2] >> 10 );
common->Printf( "RenderBuffer: largest auxiliary free block %i MB\n", mem[3] >> 10 );
}
else
{
common->Printf( "MemInfo not available for your video card or driver!\n" );
}
}


and sadly this only works on nvidia and AMD.
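
As a side note on the units: the ATI queries write four values in KB per pool, which is why the code above shifts each one right by 10 to print MB. A minimal sketch of that unpacking, with int32_t standing in for GLint and with the struct and function names invented for this example:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical struct (not from the engine): one GL_ATI_meminfo pool, in MB.
struct PoolInfoMB {
    int totalFree;        // total memory free in the pool
    int largestBlock;     // largest available free block in the pool
    int auxFree;          // total auxiliary memory free
    int auxLargestBlock;  // largest auxiliary free block
};

// Converts the four KB values of one pool query to MB.
// int32_t stands in for GLint so the sketch builds without GL headers.
static PoolInfoMB PoolInfoFromKB( const int32_t kb[4] ) {
    PoolInfoMB info;
    info.totalFree       = kb[0] >> 10;
    info.largestBlock    = kb[1] >> 10;
    info.auxFree         = kb[2] >> 10;
    info.auxLargestBlock = kb[3] >> 10;
    return info;
}
```

The NVX queries only write a single value each, so for those only the first element matters.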
Title: Re: updated code for getting system memory
Post by: revelator on December 12, 2019, 09:10:29 AM
A barebones (no editors) engine using the above and some fixes from darkmod like AVX optimizations and fps untangling (com_fixedtic)
as well as a hybrid GLSL and ARB2 renderer is available if anyone wants to try it.

It runs smooth as butter on my AMD card  :))

thread code also had a few fixes as well as the VBO cache.

If someone wants to try toying with depth access, set r_skipDepthCapture to 0 and provide a shader (darkmod has a working one).
You will need to add it to progDef_t in draw_interactions.cpp. The support code is already there, but you would need to supply code for using the depth access (like soft particles, SSAO etc).
The support code will get your card's depthbuffer capabilities and set the depth access bits automatically (most cards today can do 24 bit depth anyway, but in case yours cannot and the driver reports it correctly, it will autoset the bitdepth to whatever your card supports).

I have a github site where i will upload the code, still toying with openal soft and adding framebuffers.
Title: Re: updated code for getting system memory
Post by: revelator on December 17, 2019, 04:34:38 PM
Well this is wonderful, github destroyed my project by overwriting all my changes with the old code despite being told not to  >:(

So this project will go no further and I'm done with git.
Title: Re: updated code for getting system memory
Post by: caedes on December 18, 2019, 06:46:31 AM
did you try to restore the old (or rather latest unbroken) state locally with git reflog?
Title: Re: updated code for getting system memory
Post by: revelator on December 25, 2019, 08:25:07 PM
tried everything, the error is not recoverable: on the next push it decided that my local copy was not in sync with the online one,
so it merged my local copy with the broken online commit and that's all she wrote.
Title: Re: updated code for getting system memory
Post by: caedes on January 06, 2020, 04:49:23 PM
Do you still have that git repo on your harddisk?
It might still be possible to get the last (committed) state before it merged in garbage.
(If you had uncommitted changes when the merge happened those might indeed be gone, but everything that was locally committed might be salvageable)
Title: Re: updated code for getting system memory
Post by: revelator on January 10, 2020, 05:43:57 AM
deleted both the repo and the sources I'm afraid.

Too many changes to the source in comparison with the old one to recreate the missing parts, I might as well have started from scratch  :-[
Title: Re: updated code for getting system memory
Post by: caedes on January 10, 2020, 09:19:28 PM
That's a pity :-(

From what you posted here and in comments on github you had some awesome changes, some of which I would have loved to integrate in dhewm3 like SMP multithreading changes and microsecond timing which could (or even did?) help with support for high refreshrates..

Well, shit happens, I guess :-/
Title: Re: updated code for getting system memory
Post by: revelator on January 14, 2020, 09:44:10 AM
darkmod's smp changes are easy to get at, they are all commented in the idlib math source code.
There is however one snag: since they removed the old MMX and ALTIVEC paths, there are a few places with commented code for those which need to be reactivated.

The timing code makes use of some std:: functions which may not be in all versions of MSVC, so be careful there.
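
To illustrate the kind of std:: dependency meant here: a microsecond timer can be sketched with std::chrono, which is C++11 and therefore missing from older compilers. This is only an assumption about what such timing code might look like, not the darkmod implementation, and Sys_MicrosecondsSketch is a made-up name.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical sketch (not the darkmod code): microseconds elapsed since the
// first call, based on a monotonic clock so it never runs backwards.
static uint64_t Sys_MicrosecondsSketch() {
    using namespace std::chrono;
    // the time base is captured once, on the first call
    static const steady_clock::time_point base = steady_clock::now();
    return static_cast<uint64_t>(
        duration_cast<microseconds>( steady_clock::now() - base ).count() );
}
```

Replacing this with an SDL or OS-specific counter would keep the same interface, only the body changes.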

The multithreading code however was lost which indeed sucks  :-\  and also unfortunately the hybrid GLSL backend.

The smp and timing changes are probably the ones you would be most interested in, since they provide the biggest noticeable performance boost.

The AVX and AVX2 smp functions are used for stencil shadow volumes, which have always been a major resource hog, so the boost is quite welcome there.
Title: Re: updated code for getting system memory
Post by: revelator on January 16, 2020, 01:56:41 AM
I added AVX and AVX2 to the dhewm code, but you need to revisit the SDL2 libraries provided since they do not support AVX2 and they should (it has been available since version 2.0.4 but this build of 2.0.4 does not support it).
The timing changes will be added later because of differences in the code related to SDL, which I'm not used to working with.
Title: Re: updated code for getting system memory
Post by: revelator on January 16, 2020, 06:15:57 AM
Added the base code for the darkmod SMP changes to dhewm, still a lot of work needed but it's closer now.
Lots of stuff from darkmod runs through here so it's easy to make mistakes and add stuff that does not belong there (lightgems, framebuffers, ad nauseam).
Since dhewm relies on SDL threads I'm not even certain if I can do it, the main async thread runs at a different ticrate than normal, 3 vs 60.
Title: Re: updated code for getting system memory
Post by: caedes on January 18, 2020, 06:34:42 PM
Thanks for porting those changes to dhewm3, that's awesome!

If you need help with the dhewm3 codebase or SDL feel free to ask :)
Title: Re: updated code for getting system memory
Post by: caedes on January 20, 2020, 11:01:46 PM
oh yeah, regarding the SMP stuff, I haven't looked at TDM but if they're using the same interface as D3BFG the following code I wrote ages ago might be helpful:
https://github.com/DanielGibson/DOOM-3-BFG/commit/edfa0ded31cb6cab76eb3eb2db27cbecf62e8d55
Title: Re: updated code for getting system memory
Post by: revelator on January 21, 2020, 05:01:38 AM
thanks, I'll have a look at it  ;)
The simd changes needed a lot more headers in the math sources than what was originally in darkmod, since those were pulled from precompiled.h originally.
The SMP changes need some std functions that I cannot guarantee will exist in all versions of msvc or even gcc, so I will need to be careful there.
Title: Re: updated code for getting system memory
Post by: caedes on January 21, 2020, 06:57:03 PM
Maybe we can replace the std:: stuff used for SMP with SDL functions or OS-specific functions eventually for better portability (and because right now dhewm3 doesn't use anything from std:: ), but don't worry about it for now, once we have something that works well, iterating on it should be easy(er) :)

(Also, I might be able to do the std:: replacements myself even in the limited time I currently have for dhewm3)
Title: Re: updated code for getting system memory
Post by: argoon on February 27, 2020, 05:35:24 PM
Quote from: revelator on December 12, 2019, 02:57:03 AM
...

and get rid of the video ram detection code it no longer works correctly and also suffers from the same limit,
you can only get videoram on modern gfx cards by using card specific codepaths like this ->

void idVertexCache::Show( void )
{
GLint  mem[4];

if ( GLEW_NVX_gpu_memory_info && ( glConfig.vendor == glvNVIDIA ) )
{
common->Printf( "\nNvidia specific memory info:\n" );
common->Printf( "\n" );
glGetIntegerv( GL_GPU_MEMORY_INFO_DEDICATED_VIDMEM_NVX , mem );
common->Printf( "dedicated video memory %i MB\n", mem[0] >> 10 );
glGetIntegerv( GL_GPU_MEMORY_INFO_TOTAL_AVAILABLE_MEMORY_NVX , mem );
common->Printf( "total available memory %i MB\n", mem[0] >> 10 );
glGetIntegerv( GL_GPU_MEMORY_INFO_CURRENT_AVAILABLE_VIDMEM_NVX , mem );
common->Printf( "currently unused GPU memory %i MB\n", mem[0] >> 10 );
glGetIntegerv( GL_GPU_MEMORY_INFO_EVICTION_COUNT_NVX , mem );
common->Printf( "count of total evictions seen by system %i\n", mem[0] ); /* a plain count, not a size in kb */
glGetIntegerv( GL_GPU_MEMORY_INFO_EVICTED_MEMORY_NVX , mem );
common->Printf( "total video memory evicted %i MB\n", mem[0] >> 10 );
}
else if ( GLEW_ATI_meminfo && ( glConfig.vendor == glvAMD ) )
{
common->Printf( "\nATI/AMD specific memory info:\n" );
common->Printf( "\n" );
glGetIntegerv( GL_VBO_FREE_MEMORY_ATI, mem );
common->Printf( "VBO: total memory free in the pool %i MB\n", mem[0] >> 10 );
common->Printf( "VBO: largest available free block in the pool %i MB\n", mem[1] >> 10 );
common->Printf( "VBO: total auxiliary memory free %i MB\n", mem[2] >> 10 );
common->Printf( "VBO: largest auxiliary free block %i MB\n", mem[3] >> 10 );
glGetIntegerv( GL_TEXTURE_FREE_MEMORY_ATI, mem );
common->Printf( "Texture: total memory free in the pool %i MB\n", mem[0] >> 10 );
common->Printf( "Texture: largest available free block in the pool %i MB\n", mem[1] >> 10 );
common->Printf( "Texture: total auxiliary memory free %i MB\n", mem[2] >> 10 );
common->Printf( "Texture: largest auxiliary free block %i MB\n", mem[3] >> 10 );
glGetIntegerv( GL_RENDERBUFFER_FREE_MEMORY_ATI, mem );
common->Printf( "RenderBuffer: total memory free in the pool %i MB\n", mem[0] >> 10 );
common->Printf( "RenderBuffer: largest available free block in the pool %i MB\n", mem[1] >> 10 );
common->Printf( "RenderBuffer: total auxiliary memory free %i MB\n", mem[2] >> 10 );
common->Printf( "RenderBuffer: largest auxiliary free block %i MB\n", mem[3] >> 10 );
}
else
{
common->Printf( "MemInfo not available for your video card or driver!\n" );
}
}


and sadly this only works on nvidia and AMD.

Hello guys, long time no see.

I know this thread is a tad old but wanted to give my small contribution to it.

I'm using the fhdoom engine (not the latest version unfortunately, I made too many changes to merge) and the version I'm using also didn't detect GPU vram well, the value was always zero.

Fortunately, I was able to tweak the original code to detect vram, but like Revelator said it only detects max 4GB, and my GPU has 8GB for example, so that's not good. Trying to solve that, I googled some stuff and was able to find some info online (never saw this thread at the time btw) on how to detect vram for AMD and Nvidia GPUs. Now that I see this thread, and because the version I found looks different from what is shown, I decided to post my version here as well.
Btw I'm not saying this is a better way, or even stable code. I'm still learning coding and C++, so this could be slower or buggy; better coders may review this and comment, but in my case it worked. And so, for those that could be trying to solve the same problem, perhaps it is helpful information next to Revelator's info (thanks for it btw).
But this is important to say: it worked ON MY AMD GPU, and I didn't test the Nvidia path at all (I have no nvidia GPUs to test). So use this code at your own risk. Don't blame me if it breaks on your side! :P

Now for those that still want to try this... in fhdoom the code worked for me at the end of the function GLW_InitDriver in win_glimp.cpp, where win32.hGLRC and wgl are already initialized.

StringsAreEqual() is a custom MACRO just for my own reading:   #define StringsAreEqual(x, y) (idStr::Icmp(x, y) == 0)

The same goes for the types s32 and u32, signed int32 and unsigned int32 respectively.

The gl vendor strings code was inside R_InitOpenGL in the RenderSystem_Init.cpp file. I had to move it inside GLW_InitDriver for the code to run, because GLW_InitDriver is called before the vendor strings are set inside R_InitOpenGL, so that info was only available later.


// START print GPU info
#if 1
common->Printf("... Getting GPU ID and RAM:\n");
common->Printf("\n");
// get our config strings
glConfig.vendor_string = (const char *)glGetString(GL_VENDOR);
glConfig.renderer_string = (const char *)glGetString(GL_RENDERER);
glConfig.version_string = (const char *)glGetString(GL_VERSION);

glConfig.vendorisAMD = false;
glConfig.vendorisNVIDIA = false;
if (StringsAreEqual(glConfig.vendor_string, "AMD") ||
StringsAreEqual(glConfig.vendor_string, "ATI Technologies Inc.") ||
StringsAreEqual(glConfig.vendor_string, "Advanced Micro Devices Inc."))
{
glConfig.vendorisAMD = true;
}
else if (StringsAreEqual(glConfig.vendor_string, "NVIDIA") ||
StringsAreEqual(glConfig.vendor_string, "NVIDIA Corporation")) // the driver reports this without a trailing period
{
glConfig.vendorisNVIDIA = true;
}

if (glConfig.vendorisAMD)
{
u32 total_mem_mb = 0;
u32 gpuid = wglGetContextGPUIDAMD(win32.hGLRC);
wglGetGPUInfoAMD(gpuid,
WGL_GPU_RAM_AMD,
GL_UNSIGNED_INT,
sizeof(u32),
&total_mem_mb);

common->Printf("^3%s\n ^1VRAM ^3= ^2%dMB\n", glConfig.renderer_string, total_mem_mb);
}
else if (glConfig.vendorisNVIDIA)
{
#define GL_GPU_MEM_INFO_TOTAL_AVAILABLE_MEM_NVX 0x9048
#define GL_GPU_MEM_INFO_CURRENT_AVAILABLE_MEM_NVX 0x9049
s32 total_mem_kb = 0;
glGetIntegerv(GL_GPU_MEM_INFO_TOTAL_AVAILABLE_MEM_NVX,
&total_mem_kb);

s32 cur_avail_mem_kb = 0;
glGetIntegerv(GL_GPU_MEM_INFO_CURRENT_AVAILABLE_MEM_NVX,
&cur_avail_mem_kb);

common->Printf("%s Available VRAM = %dMB\n", glConfig.renderer_string, cur_avail_mem_kb / 1024);
common->Printf("%s Total VRAM = %dMB\n", glConfig.renderer_string, total_mem_kb / 1024);
}
else { common->Printf("^3Warning:^0Unable to find GPU ID and RAM!.......\n"); }
common->Printf("\n");
common->Printf("...End GPU ID and RAM\n");
#endif
// END print GPU info

  }


Hope this code is useful. Please review it and let me know if it has problems.
Title: Re: updated code for getting system memory
Post by: revelator on March 13, 2020, 05:57:48 AM
I already use a variant of this codepiece from darkmod using enumerators instead of booleans, still useful though :)
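
For anyone curious what the enumerator approach could look like, here is a rough standalone sketch. It is not the darkmod code; GpuVendor and VendorFromString are names invented for this example, and the substring matching is an assumption chosen so that small differences in the driver's vendor string do not matter.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical enumerator (not the darkmod one): one value per known vendor.
enum class GpuVendor { Unknown, AMD, NVIDIA };

// Lower-cases a C string so the comparison below is case-insensitive.
static std::string ToLowerCopy( const char *s ) {
    std::string out( s );
    for ( char &c : out ) {
        c = static_cast<char>( std::tolower( static_cast<unsigned char>( c ) ) );
    }
    return out;
}

// Classifies a GL_VENDOR style string by substring instead of exact match.
static GpuVendor VendorFromString( const char *vendorString ) {
    if ( !vendorString ) {
        return GpuVendor::Unknown;
    }
    const std::string v = ToLowerCopy( vendorString );
    if ( v.find( "nvidia" ) != std::string::npos ) {
        return GpuVendor::NVIDIA;
    }
    if ( v.find( "ati technologies" ) != std::string::npos ||
         v.find( "advanced micro devices" ) != std::string::npos ||
         v.find( "amd" ) != std::string::npos ) {
        return GpuVendor::AMD;
    }
    return GpuVendor::Unknown;
}
```

A single enum value also rules out the state where both vendor booleans end up true at the same time.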
Title: Re: updated code for getting system memory
Post by: argoon on March 13, 2020, 08:53:47 PM
Happy you think it is useful.  :)
Title: Re: updated code for getting system memory
Post by: revelator on March 15, 2020, 08:33:12 AM
Tbh we could do away with it completely these days, pretty much any card today can run idtech4 with ok fps.
The original code was used as a way to detect older cards' capabilities and set performance settings based on that,
but back then the biggest card we had was a geforce 3 ultra and even that struggled running it on ultra.

New capabilities added by developers have pushed idtech further ofc,
but still the biggest problems with performance these days stem from the engine being from a time where a lot of the grunt work was done on the cpu.
fhDoom might actually change that if and when someone completes codepaths for GPU skinning.
Title: Re: updated code for getting system memory
Post by: revelator on September 15, 2020, 04:28:06 AM
Slowly recreating my lost work, but since I'm officially retired development is slow.
New changes: Thread affinity is now used but in a different way than the original.
Doom3 only uses 2 threads, one for the main engine and one for the server, so I created a function to auto delegate the main thread on core 1 and 3 and the server on core 2 and 4.

Thread exit was never used in the windows version since win does not like threads being killed, so I had to create a function that makes sure all threads have run to completion; then it checks for stuck threads and ends each one at sys exit (and only at sys exit, doing it anywhere else might actually crash your PC). It basically never gets that far though, as windows handles thread exit internally pretty well, but in case it gets stuck it will kill the stuck thread handles safely.

The hybrid ARB/GLSL backend has also had a major overhaul and now works perfectly, you can seamlessly switch between ARB and GLSL interactions at runtime without any hiccups now.
The GLSL backend uses half lambertian so looks a fair bit better than ARB. It also works just fine with sikkmod, though you cannot use sikkmod's parallax occlusion shader as that one is an interaction shader (world drawing), so if you want to toy with that you need to turn off GLSL. All the other effects work though. Or you could rewrite sikkmod's POM shader in GLSL :) as the backend supports loading external GLSL as well.

Also added AVX and AVX2 support again, still missing the SMP changes from darkmod but eventually I'll get there.

Planned: Adding Denton's bloom code to the game code for base doom3 and the expansion (so that the broken bloom cvar actually gets used).
Denton's bloom is pretty sleek and does not look over the top like many other implementations, so it would probably fit well.

Sadly my main engine does not have the internal editors anymore as it was based on MH's Doom3 port. For the linux guys this is pretty moot as the editors are MFC based and so not portable anyway.
Title: Re: updated code for getting system memory
Post by: revelator on September 15, 2020, 04:36:26 AM
If someone is good with gui code i could use a hand in fixing some old bugs with venoms doom3 menu.
I once had most of them fixed but it was long ago when idtech4 had just been released and i have since lost the fixed code.
One bug in particular was pretty annoying, you could not overwrite savegames with venoms menu and there was another one that screwed up game skill selection.

Also need a special version for sikkmod's options and i plan on removing the broken features from sikkmod like SSAO (breaks skyportals and alpha entities) and soft shadows (huge FPS sink and looks ghastly on some AMD cards with black and or blue outlines).
Title: Re: updated code for getting system memory
Post by: revelator on September 16, 2020, 08:29:04 AM
So here's TDM's thread routine used in my project.

/*
===================
CreateThreadStartRoutine
===================
*/
typedef std::pair<xthread_t, void *> CreateThreadStartParams;
DWORD WINAPI CreateThreadStartRoutine( LPVOID lpThreadParameter ) {
    std::pair<xthread_t, void *> arg = *( ( CreateThreadStartParams * )lpThreadParameter );
    delete ( ( CreateThreadStartParams * )lpThreadParameter );
    return arg.first( arg.second );
}

/*
==================
Sys_Createthread
==================
*/
void Sys_CreateThread( xthread_t function, void *parms, xthreadPriority priority, xthreadInfo &info, const char *name, xthreadInfo *threads[MAX_THREADS], int *thread_count ) {
    Sys_EnterCriticalSection();
    LPVOID threadParam = new CreateThreadStartParams( function, parms );
    HANDLE temp = CreateThread( NULL, // LPSECURITY_ATTRIBUTES lpsa,
                                0, // DWORD cbStack,
                                CreateThreadStartRoutine, // LPTHREAD_START_ROUTINE lpStartAddr,
                                threadParam, // LPVOID lpvThreadParm,
                                0, // DWORD fdwCreate,
                                &info.threadId );

    info.threadHandle = ( intptr_t )temp;

    if ( priority == THREAD_HIGHEST ) {
        SetThreadPriority( ( HANDLE )info.threadHandle, THREAD_PRIORITY_HIGHEST ); //  we better sleep enough to do this
    } else if ( priority == THREAD_ABOVE_NORMAL ) {
        SetThreadPriority( ( HANDLE )info.threadHandle, THREAD_PRIORITY_ABOVE_NORMAL );
    } else {
        // THREAD_NORMAL or anything unknown: run at the default priority.
        // (the old expression passed the result of a comparison, i.e. 0 or 1, as the priority)
        SetThreadPriority( ( HANDLE )info.threadHandle, THREAD_PRIORITY_NORMAL );
    }
    info.name = name;

    if ( *thread_count < MAX_THREADS ) {
        threads[( *thread_count )++] = &info;
    } else {
        common->DPrintf( "WARNING: MAX_THREADS reached\n" );
    }
    Sys_LeaveCriticalSection();
}
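
The start routine above is a small trampoline: the function pointer and its argument are packed into a heap allocated pair, and the entry point unpacks the pair, frees it and calls through. The same pattern can be tried without any Windows headers. In this sketch xthread_t is redefined as a plain function pointer and DoubleIt is a dummy worker, both assumptions made only for the example.

```cpp
#include <cassert>
#include <utility>

// Stand-in for the engine typedef, assumed for this sketch.
typedef unsigned int ( *xthread_t )( void * );
typedef std::pair<xthread_t, void *> ThreadStartParams;

// Same shape as CreateThreadStartRoutine: copy the pair, free it, call through.
static unsigned int ThreadTrampoline( void *rawParams ) {
    ThreadStartParams *heapParams = static_cast<ThreadStartParams *>( rawParams );
    ThreadStartParams params = *heapParams;   // copy before the block is freed
    delete heapParams;                        // the trampoline owns the allocation
    return params.first( params.second );
}

// Dummy worker used only to exercise the trampoline.
static unsigned int DoubleIt( void *arg ) {
    return *static_cast<unsigned int *>( arg ) * 2;
}
```

Calling ThreadTrampoline( new ThreadStartParams( DoubleIt, &value ) ) runs the worker and releases the pair exactly once, which is the part that is easy to get wrong when the thread creation itself fails.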


And here is the function to kill the threads at exit.

void Sys_DestroyThread( xthreadInfo &info ) {
    DWORD dwExitCode = 0, dwWaitResult;
    HANDLE threadHandle = ( HANDLE )info.threadHandle;

    // no thread running so nothing to kill.
    if ( !threadHandle ) {
        return;
    }
    Sys_EnterCriticalSection();

    // give it a little time
    Sys_Sleep( 1000 );

    // wait for the thread handle to be signaled, which happens once the thread has run to completion.
    dwWaitResult = WaitForSingleObject( threadHandle, INFINITE );

    switch ( dwWaitResult ) {
    case WAIT_OBJECT_0:
        // The condition we want.
        idLib::common->Printf( "The child thread state was signaled!\n" );
        break;
    case WAIT_TIMEOUT:
        // Thread might be busy (only reachable if INFINITE is replaced by a finite time-out).
        idLib::common->Printf( "Time-out interval elapsed, and the child thread's state is nonsignaled.\n" );
        break;
    case WAIT_FAILED:
        // Invalid handle or similar, nothing more we can do with it.
        idLib::common->Printf( "WaitForSingleObject() failed, error %u\n", ( unsigned int )::GetLastError() );
        Sys_LeaveCriticalSection();
        return; // get the hell outta here!
    }

    // Get thread exit status and close the handle.
    // ExitThread() is deliberately not called here: it ends the calling thread, not the one being waited on.
    if ( ::GetExitCodeThread( threadHandle, &dwExitCode ) != FALSE ) {
        idLib::common->Printf( "Thread '%s' exited with code %u\n", info.name, ( unsigned int )dwExitCode );
    }
    if ( CloseHandle( threadHandle ) != FALSE ) {
        info.threadHandle = 0;
    }
    Sys_LeaveCriticalSection();
}


it is called in one place only at the end of Sys_Quit just before ExitProcess(0);
This is done to make sure nothing is still hooked when exiting the thread and only runs at game shutdown.

The entercriticalsection and leavecriticalsection additions are there to make sure the process owns the running thread.

The last function delegates the running threads on multicore machines eg. > 2 cores.

void Sys_SetThreadAffinity( bool mainthread ) {
    SYSTEM_INFO info;

    // check number of processors
    GetSystemInfo( &info );

    // single core machine so bail out.
    if ( info.dwNumberOfProcessors < 2 ) {
        return;
    }

    // the affinity mask is a bit field: bit 0 is core 1, bit 1 is core 2 and so on.
    // main thread goes on core 1 and 3, other threads on core 2 and 4.
    DWORD_PTR mask = mainthread ? 0x5 : 0xA;

    // drop the cores this machine does not have, else SetThreadAffinityMask fails.
    if ( info.dwNumberOfProcessors < 4 ) {
        mask &= ( ( DWORD_PTR )1 << info.dwNumberOfProcessors ) - 1;
    }

    if ( mainthread ) {
        SetThreadAffinityMask( GetCurrentThread(), mask );
    } else {
        // threadInfo is the xthreadInfo of the secondary thread, assumed to be declared elsewhere.
        SetThreadAffinityMask( ( HANDLE )threadInfo.threadHandle, mask );
    }
}
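
The core mapping is easy to get wrong because the affinity mask is a bit field and not a core number, so here is the mask calculation on its own as a sketch that runs without Windows. AffinityMaskFor is a hypothetical helper, not an engine function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: affinity bit mask for the main thread or for the
// secondary thread. Bit 0 is the first core, bit 1 the second, and so on.
static uint64_t AffinityMaskFor( bool mainThread, unsigned int numCores ) {
    if ( numCores < 2 ) {
        return 0;                                    // single core: leave the affinity alone
    }
    uint64_t mask = mainThread ? 0x5 : 0xA;          // cores 1+3 or cores 2+4
    if ( numCores < 4 ) {
        mask &= ( uint64_t( 1 ) << numCores ) - 1;   // drop cores that do not exist
    }
    return mask;
}
```

On a quad core this gives 0x5 for the main thread and 0xA for the other one; on a dual core it falls back to 0x1 and 0x2.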


If say we have a quad core it would delegate the main thread on core 1 and 3 and the second thread on core 2 and 4.
Doom3 uses only 2 threads, one runs the game and another for background file reads.

void idFileSystemLocal::StartBackgroundDownloadThread() {
    if ( !backgroundThread.threadHandle ) {
        Sys_CreateThread( ( xthread_t )BackgroundDownloadThread, NULL, THREAD_NORMAL, backgroundThread, "backgroundDownload", g_threads, &g_thread_count );
        if ( !backgroundThread.threadHandle ) {
            common->Warning( "idFileSystemLocal::StartBackgroundDownloadThread: failed" );
        }
    } else {
        common->Printf( "background thread already running\n" );
    }

    // give the async thread an affinity for the 2nd or 4th core.
    Sys_SetThreadAffinity();
}


and here

int WINAPI WinMain( HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow ) {
#ifdef _DEBUG
    SetCurrentDirectory( "C:\\Doom\\Doom 3" );
#endif
    const HCURSOR hcurSave = ::SetCursor( LoadCursor( 0, IDC_WAIT ) );

    // tell windows we're high dpi aware, otherwise display scaling screws up the game
    Sys_SetHighDPIMode();
    Sys_SetPhysicalWorkMemory( 192 << 20, 1024 << 20 );
    Sys_GetCurrentMemoryStatus( exeLaunchMemoryStats );

    win32.hInstance = hInstance;
    idStr::Copynz( sys_cmdline, lpCmdLine, sizeof( sys_cmdline ) );

    // done before Com/Sys_Init since we need this for error output
    Sys_CreateConsole();

    // no abort/retry/fail errors
    SetErrorMode( SEM_FAILCRITICALERRORS );

    for ( int i = 0; i < MAX_CRITICAL_SECTIONS; i++ ) {
        InitializeCriticalSection( &win32.criticalSections[i] );
    }

    // get the initial time base
    Sys_Milliseconds();

#ifdef DEBUG
    // disable the painfully slow MS heap check every 1024 allocs
    _CrtSetDbgFlag( 0 );
#endif

    Sys_FPU_SetPrecision( FPU_PRECISION_DOUBLE_EXTENDED );
    common->Init( 0, NULL, lpCmdLine );

#ifndef ID_DEDICATED
    if ( win32.win_notaskkeys.GetInteger() ) {
        DisableTaskKeys( TRUE, FALSE, FALSE );
    }
#endif
    Sys_StartAsyncThread();

    // hide or show the early console as necessary
    if ( win32.win_viewlog.GetInteger() || com_skipRenderer.GetBool() || idAsyncNetwork::serverDedicated.GetInteger() ) {
        Sys_ShowConsole( 1, true );
    } else {
        Sys_ShowConsole( 0, false );
    }

    // give the main thread an affinity for the first or 3'rd core.
    Sys_SetThreadAffinity( true );

    ::SetCursor( hcurSave );
    ::SetFocus( win32.hWnd );

    // main game loop
    while ( 1 ) {
        Win_Frame();

        // run the game
        common->Frame();
    }

    // never gets here
    return 0;
}


Also need an extern for it in Sys_Public.h

void Sys_SetThreadAffinity( bool mainthread = false );

Title: Re: updated code for getting system memory
Post by: revelator on September 19, 2020, 03:19:40 PM
More thread work.

Removed thread affinity again, windows handles this just fine itself.

Event triggers have been fixed so that they now actually work on the background thread and do not require Sys_Sleep anymore ( fix from TDM ).

My thread exit code turned out to be safe enough, after some small modification, to actually run before recreating the thread handles in case of map change or reload, so it is used there as well now.
If the handle already exists it will simply reuse it :) else it will wait for the thread to exit and then recreate it.
Found one small use for Sys_ThreadName in the event trigger: it was normally set to NULL to create an unnamed event handle, instead I'm now feeding it the current thread's name so we can keep track of it.