Author Topic: updated code for getting system memory  (Read 4503 times)


argoon

  • Sr. Member
  • ****
  • Posts: 279
  • Karma: +21/-81
  • Doom Newbie
    • View Profile
Re: updated code for getting system memory
« Reply #15 on: February 27, 2020, 05:35:24 PM »
...

Also, get rid of the video RAM detection code: it no longer works correctly and it suffers from the same limit.
On modern gfx cards you can only get the video RAM by using card-specific codepaths like this ->

Code: [Select]
void idVertexCache::Show( void )
{
    GLint mem[4];

    if ( GLEW_NVX_gpu_memory_info && ( glConfig.vendor == glvNVIDIA ) )
    {
        // GL_NVX_gpu_memory_info reports sizes in KB, so >> 10 converts to MB
        common->Printf( "\nNvidia specific memory info:\n" );
        common->Printf( "\n" );
        glGetIntegerv( GL_GPU_MEMORY_INFO_DEDICATED_VIDMEM_NVX, mem );
        common->Printf( "dedicated video memory %i MB\n", mem[0] >> 10 );
        glGetIntegerv( GL_GPU_MEMORY_INFO_TOTAL_AVAILABLE_MEMORY_NVX, mem );
        common->Printf( "total available memory %i MB\n", mem[0] >> 10 );
        glGetIntegerv( GL_GPU_MEMORY_INFO_CURRENT_AVAILABLE_VIDMEM_NVX, mem );
        common->Printf( "currently unused GPU memory %i MB\n", mem[0] >> 10 );
        glGetIntegerv( GL_GPU_MEMORY_INFO_EVICTION_COUNT_NVX, mem );
        // this one is a plain count, not a size, so no unit conversion
        common->Printf( "count of total evictions seen by system %i\n", mem[0] );
        glGetIntegerv( GL_GPU_MEMORY_INFO_EVICTED_MEMORY_NVX, mem );
        common->Printf( "total video memory evicted %i MB\n", mem[0] >> 10 );
    }
    else if ( GLEW_ATI_meminfo && ( glConfig.vendor == glvAMD ) )
    {
        // GL_ATI_meminfo returns four values per query, also in KB
        common->Printf( "\nATI/AMD specific memory info:\n" );
        common->Printf( "\n" );
        glGetIntegerv( GL_VBO_FREE_MEMORY_ATI, mem );
        common->Printf( "VBO: total memory free in the pool %i MB\n", mem[0] >> 10 );
        common->Printf( "VBO: largest available free block in the pool %i MB\n", mem[1] >> 10 );
        common->Printf( "VBO: total auxiliary memory free %i MB\n", mem[2] >> 10 );
        common->Printf( "VBO: largest auxiliary free block %i MB\n", mem[3] >> 10 );
        glGetIntegerv( GL_TEXTURE_FREE_MEMORY_ATI, mem );
        common->Printf( "Texture: total memory free in the pool %i MB\n", mem[0] >> 10 );
        common->Printf( "Texture: largest available free block in the pool %i MB\n", mem[1] >> 10 );
        common->Printf( "Texture: total auxiliary memory free %i MB\n", mem[2] >> 10 );
        common->Printf( "Texture: largest auxiliary free block %i MB\n", mem[3] >> 10 );
        glGetIntegerv( GL_RENDERBUFFER_FREE_MEMORY_ATI, mem );
        common->Printf( "RenderBuffer: total memory free in the pool %i MB\n", mem[0] >> 10 );
        common->Printf( "RenderBuffer: largest available free block in the pool %i MB\n", mem[1] >> 10 );
        common->Printf( "RenderBuffer: total auxiliary memory free %i MB\n", mem[2] >> 10 );
        common->Printf( "RenderBuffer: largest auxiliary free block %i MB\n", mem[3] >> 10 );
    }
    else
    {
        common->Printf( "MemInfo not available for your video card or driver!\n" );
    }
}

Sadly, this only works on Nvidia and AMD.

Hello guys, long time no see.

I know this thread is a tad old, but I wanted to make my small contribution to it.

I'm using the fhDoom engine (not the latest version, unfortunately; I made too many changes to merge), and the version I'm using also didn't detect GPU VRAM correctly: the value was always zero.

Fortunately, I was able to tweak the original code to detect VRAM, but as Revelator said, it only detects a maximum of 4GB. My GPU has 8GB, for example, so that's not good. Trying to solve that, I googled around and found some info online (I hadn't seen this thread at the time, btw) on how to detect VRAM on AMD and Nvidia GPUs. Now that I have seen this thread, and because the version I found looks different from what is shown here, I decided to post my version as well.
Btw, I'm not saying this is a better way, or even stable code. I'm still learning coding and C++, so this could be slower or buggy; better coders may review it and comment, but in my case it worked. So, for those trying to solve the same problem, it is perhaps helpful information alongside Revelator's (thanks for it, btw).
One important thing: it worked ON MY AMD GPU, and I didn't test the Nvidia path at all (I have no Nvidia GPUs to test with). So use this code at your own risk. Don't blame me if it breaks on your side! :P
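
To illustrate the 4GB limit: a minimal sketch, assuming the old detection path keeps the VRAM size in bytes in a 32-bit value (that is a guess at one plausible cause; the original detection code is not shown in this thread). The helper names are made up for the example. The vendor queries report KB or MB, so the same 32 bits are plenty there.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: what a 32-bit byte counter ends up holding for a
// given real VRAM size. Only the low 32 bits survive, so 4 GiB and above wrap.
inline std::uint32_t BytesAsStoredIn32Bits( std::uint64_t realBytes ) {
    return static_cast<std::uint32_t>( realBytes );
}

// Hypothetical helper: KB to MB, the same ">> 10" used in the code above.
inline std::uint32_t KilobytesToMegabytes( std::uint32_t kb ) {
    return kb >> 10;
}

// Usage example with an 8GB card like the one mentioned in the post.
inline void Example8GBCard() {
    const std::uint64_t eightGiB = 8ull << 30;
    // a 32-bit byte count wraps all the way around to zero
    assert( BytesAsStoredIn32Bits( eightGiB ) == 0u );
    // the same size expressed in KB fits easily and converts to 8192 MB
    assert( KilobytesToMegabytes( 8u * 1024u * 1024u ) == 8192u );
}
```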

Now, for those who still want to try this: in fhDoom the code worked for me at the end of the function GLW_InitDriver in win_glimp.cpp, at the point where win32.hGLRC and wgl are already initialized.

StringsAreEqual() is a custom macro, just for my own readability:   #define StringsAreEqual(x, y) (idStr::Icmp(x, y) == 0)

The same goes for the types s32 and u32: signed and unsigned 32-bit integers respectively.

The code that gets the GL vendor strings was inside R_InitOpenGL in the RenderSystem_Init.cpp file. I had to move it into GLW_InitDriver for this code to run, because GLW_InitDriver is called before R_InitOpenGL sets the vendor strings, so that info was only available later.

Code: [Select]
// START print GPU info
#if 1
    common->Printf("... Getting GPU ID and RAM:\n");
    common->Printf("\n");
    // get our config strings
    glConfig.vendor_string = (const char *)glGetString(GL_VENDOR);
    glConfig.renderer_string = (const char *)glGetString(GL_RENDERER);
    glConfig.version_string = (const char *)glGetString(GL_VERSION);

    // exact (case-insensitive) matches only; a driver that spells its
    // vendor string differently falls through to the warning below
    glConfig.vendorisAMD = false;
    glConfig.vendorisNVIDIA = false;
    if (StringsAreEqual(glConfig.vendor_string, "AMD") ||
        StringsAreEqual(glConfig.vendor_string, "ATI Technologies Inc.") ||
        StringsAreEqual(glConfig.vendor_string, "Advanced Micro Devices Inc."))
    {
        glConfig.vendorisAMD = true;
    }
    else if (StringsAreEqual(glConfig.vendor_string, "NVIDIA") ||
             StringsAreEqual(glConfig.vendor_string, "NVIDIA Corporation") ||
             StringsAreEqual(glConfig.vendor_string, "NVidia Corporation."))
    {
        glConfig.vendorisNVIDIA = true;
    }

    if (glConfig.vendorisAMD)
    {
        // WGL_GPU_RAM_AMD reports the total memory in MB
        u32 total_mem_mb = 0;
        u32 gpuid = wglGetContextGPUIDAMD(win32.hGLRC);
        wglGetGPUInfoAMD(gpuid,
                         WGL_GPU_RAM_AMD,
                         GL_UNSIGNED_INT,
                         sizeof(u32),
                         &total_mem_mb);

        common->Printf("^3%s\n ^1VRAM ^3= ^2%uMB\n", glConfig.renderer_string, total_mem_mb);
    }
    else if (glConfig.vendorisNVIDIA)
    {
        // untested path (no Nvidia GPU here); values are reported in KB
#define GL_GPU_MEM_INFO_TOTAL_AVAILABLE_MEM_NVX 0x9048
#define GL_GPU_MEM_INFO_CURRENT_AVAILABLE_MEM_NVX 0x9049
        s32 total_mem_kb = 0;
        glGetIntegerv(GL_GPU_MEM_INFO_TOTAL_AVAILABLE_MEM_NVX,
                      &total_mem_kb);

        s32 cur_avail_mem_kb = 0;
        glGetIntegerv(GL_GPU_MEM_INFO_CURRENT_AVAILABLE_MEM_NVX,
                      &cur_avail_mem_kb);

        common->Printf("%s Available VRAM = %dMB\n", glConfig.renderer_string, cur_avail_mem_kb / 1024);
        // was printing cur_avail_mem_kb a second time
        common->Printf("%s Total VRAM = %dMB\n", glConfig.renderer_string, total_mem_kb / 1024);
    }
    else { common->Printf("^3Warning:^0Unable to find GPU ID and RAM!.......\n"); }
    common->Printf("\n");
    common->Printf("...End GPU ID and RAM\n");
#endif
// END print GPU info

  }

Hope the code is useful. Please review it, and let me know if it has problems.
« Last Edit: February 27, 2020, 05:38:41 PM by argoon »

revelator

  • Jr. Member
  • **
  • Posts: 62
  • Karma: +5/-0
  • Doom Newbie
    • View Profile
« Reply #16 on: March 13, 2020, 05:57:48 AM »
I already use a variant of this code piece from darkmod that uses enumerators instead of booleans. Still useful though :)
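
For anyone wondering what the enumerator variant could look like, here is a rough sketch, not the darkmod code. The names glvAMD and glvNVIDIA appear in the quoted snippet earlier in the thread; glvAny, glvIntel and the helper functions are assumptions for this example. It matches substrings instead of whole strings, so small differences in how a driver spells its vendor string don't matter.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical vendor enum; only glvAMD and glvNVIDIA come from the thread.
enum glVendor_t { glvAny, glvAMD, glvNVIDIA, glvIntel };

// Lower-case copy so the comparison ignores case.
inline std::string ToLower( std::string s ) {
    std::transform( s.begin(), s.end(), s.begin(),
                    []( unsigned char c ) { return static_cast<char>( std::tolower( c ) ); } );
    return s;
}

// Classify a GL_VENDOR string by substring.
inline glVendor_t DetectVendor( const char *vendorString ) {
    if ( vendorString == nullptr ) {
        return glvAny;
    }
    const std::string v = ToLower( vendorString );
    // check nvidia first; note "corporation" contains "ati", so never match on "ati" alone
    if ( v.find( "nvidia" ) != std::string::npos ) {
        return glvNVIDIA;
    }
    if ( v.find( "amd" ) != std::string::npos ||
         v.find( "ati technologies" ) != std::string::npos ||
         v.find( "advanced micro devices" ) != std::string::npos ) {
        return glvAMD;
    }
    if ( v.find( "intel" ) != std::string::npos ) {
        return glvIntel;
    }
    return glvAny;
}

// Usage example: one enum replaces the vendorisAMD / vendorisNVIDIA pair.
inline void ExampleVendorCheck() {
    assert( DetectVendor( "ATI Technologies Inc." ) == glvAMD );
    assert( DetectVendor( "NVIDIA Corporation" ) == glvNVIDIA );
}
```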
« Last Edit: March 13, 2020, 06:04:05 AM by revelator »

argoon

« Reply #17 on: March 13, 2020, 08:53:47 PM »
Happy you think it is useful.  :)

revelator

« Reply #18 on: March 15, 2020, 08:33:12 AM »
Tbh we could do away with it completely these days; pretty much any card today can run idtech4 with OK fps.
The original code was used as a way to detect older cards' capabilities and set performance settings based on that,
but back then the biggest card we had was a GeForce 3 Ultra, and even that struggled running it on ultra.

New capabilities added by developers have pushed idtech further, of course,
but the biggest performance problems these days still stem from the engine being from a time when a lot of the grunt work was done on the CPU.
fhDoom might actually change that if and when someone completes codepaths for GPU skinning.

revelator

« Reply #19 on: September 15, 2020, 04:28:06 AM »
Slowly recreating my lost work, but since I'm officially retired, development is slow.
New changes: thread affinity is now used, but in a different way than the original.
Doom3 only uses 2 threads, one for the main engine and one for the server. I created a function to automatically assign the main thread to cores 1 and 3 and the server thread to cores 2 and 4.
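
As a rough illustration of that core split (a sketch only, not the engine code): in a Windows affinity mask, bit N selects logical processor N counting from zero, so "core 1" is bit 0. The function name and the 64-bit return type are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: build the affinity mask for the main thread
// (cores 1 and 3) or the other thread (cores 2 and 4), limited to the
// cores that actually exist. Returns 0 on a single-core machine, meaning
// "leave the affinity alone".
inline std::uint64_t AffinityMaskFor( bool mainThread, unsigned numProcessors ) {
    if ( numProcessors < 2 ) {
        return 0;
    }
    std::uint64_t mask = 0;
    // zero-based: main thread starts at bit 0, the other thread at bit 1
    const unsigned first = mainThread ? 0u : 1u;
    // take every second core, but never go past the fourth one
    for ( unsigned core = first; core < numProcessors && core < 4u; core += 2u ) {
        mask |= std::uint64_t( 1 ) << core;
    }
    return mask;
}

// Usage example for a quad core.
inline void ExampleQuadCore() {
    assert( AffinityMaskFor( true, 4 ) == 0x5 );  // bits 0 and 2: cores 1 and 3
    assert( AffinityMaskFor( false, 4 ) == 0xA ); // bits 1 and 3: cores 2 and 4
}
```

On Windows the result would be passed to SetThreadAffinityMask after casting to DWORD_PTR, skipping the call when the mask is 0.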

Thread exit was never used in the Windows version, since Windows does not like threads being killed. So I had to create a function that makes sure all threads have run to completion; it then checks for stuck threads and ends each one at sys exit (and only at sys exit; doing it anywhere else might actually crash your PC). It basically never gets that far, though, as Windows handles thread exit internally pretty well, but in case a thread gets stuck it will kill the stuck thread handles safely.
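
The "let it run to completion instead of killing it" idea can be sketched in portable C++ like this. It is an illustration under assumptions, not the engine code: the Worker class and its members are invented for the example, and it uses std::thread rather than Win32 handles.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical worker: the thread is never killed from outside. It is asked
// to stop and the owner waits until it has really finished.
class Worker {
public:
    ~Worker() { StopAndWait(); } // a joinable std::thread must not be destroyed

    void Start() {
        stopRequested = false;
        finished = false;
        thread = std::thread( [this] {
            while ( !stopRequested.load() ) {
                // stand-in for real work
                std::this_thread::sleep_for( std::chrono::milliseconds( 1 ) );
            }
            finished = true;
        } );
    }

    // Ask the thread to stop, then block until it has run to completion.
    void StopAndWait() {
        stopRequested = true;
        if ( thread.joinable() ) {
            thread.join();
        }
    }

    bool Finished() const { return finished.load(); }
    bool Running() const { return thread.joinable(); }

private:
    std::thread thread;
    std::atomic<bool> stopRequested{ false };
    std::atomic<bool> finished{ false };
};

// Usage example: the shutdown path only ever waits, it never terminates.
inline void ExampleShutdown() {
    Worker w;
    w.Start();
    w.StopAndWait();
    assert( w.Finished() );
}
```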

The hybrid ARB/GLSL backend has also had a major overhaul and now works perfectly; you can seamlessly switch between ARB and GLSL interactions at runtime without any hiccups.
The GLSL backend uses half Lambertian lighting, so it looks a fair bit better than ARB. It also works just fine with sikkmod, though you cannot use sikkmod's parallax occlusion shader, as that one is an interaction shader (world drawing), so if you want to toy with that you need to turn off GLSL. All the other effects work though. Or you could rewrite sikkmod's POM shader in GLSL :) as the backend supports loading external GLSL as well.

Also added AVX and AVX2 support again. Still missing the SMP changes from darkmod, but eventually I'll get there.

Planned: adding Denton's bloom code to the game code for base Doom3 and the expansion (so that the broken bloom cvar actually gets used).
Denton's bloom is pretty sleek and does not look over the top like many other implementations, so it would probably fit well.

Sadly my main engine does not have the internal editors anymore, as it was based on MH's Doom3 port. For the Linux guys this is pretty moot, as the editors are MFC based and so not portable anyway.

revelator

« Reply #20 on: September 15, 2020, 04:36:26 AM »
If someone is good with GUI code, I could use a hand in fixing some old bugs in Venom's Doom3 menu.
I once had most of them fixed, but that was long ago, when idtech4 had just been released, and I have since lost the fixed code.
One bug in particular was pretty annoying: you could not overwrite savegames with Venom's menu. There was another one that screwed up game skill selection.

I also need a special version for sikkmod's options, and I plan on removing the broken features from sikkmod, like SSAO (breaks skyportals and alpha entities) and soft shadows (a huge FPS sink that looks ghastly on some AMD cards, with black and/or blue outlines).

revelator

« Reply #21 on: September 16, 2020, 08:29:04 AM »
So here's TDM's thread routine as used in my project.

Code: [Select]
/*
===================
CreateThreadStartRoutine
===================
*/
typedef std::pair<xthread_t, void *> CreateThreadStartParams;
DWORD WINAPI CreateThreadStartRoutine( LPVOID lpThreadParameter ) {
    // copy the heap-allocated parameters, then free them before running the thread function
    CreateThreadStartParams arg = *( ( CreateThreadStartParams * )lpThreadParameter );
    delete ( ( CreateThreadStartParams * )lpThreadParameter );
    return arg.first( arg.second );
}

/*
==================
Sys_CreateThread
==================
*/
void Sys_CreateThread( xthread_t function, void *parms, xthreadPriority priority, xthreadInfo &info, const char *name, xthreadInfo *threads[MAX_THREADS], int *thread_count ) {
    Sys_EnterCriticalSection();
    LPVOID threadParam = new CreateThreadStartParams( function, parms );
    HANDLE temp = CreateThread( NULL, // LPSECURITY_ATTRIBUTES lpsa,
                                0, // DWORD cbStack,
                                CreateThreadStartRoutine, // LPTHREAD_START_ROUTINE lpStartAddr,
                                threadParam, // LPVOID lpvThreadParm,
                                0, // DWORD fdwCreate,
                                &info.threadId );

    info.threadHandle = ( intptr_t )temp;

    if ( temp == NULL ) {
        // the thread never started, so the start routine will not free the parameters
        delete ( CreateThreadStartParams * )threadParam;
        Sys_LeaveCriticalSection();
        return;
    }

    if ( priority == THREAD_HIGHEST ) {
        SetThreadPriority( ( HANDLE )info.threadHandle, THREAD_PRIORITY_HIGHEST ); //  we better sleep enough to do this
    } else if ( priority == THREAD_ABOVE_NORMAL ) {
        SetThreadPriority( ( HANDLE )info.threadHandle, THREAD_PRIORITY_ABOVE_NORMAL );
    } else {
        // THREAD_NORMAL or anything else: use the default priority.
        // (the old expression here passed a 0/1 comparison result as the priority)
        SetThreadPriority( ( HANDLE )info.threadHandle, THREAD_PRIORITY_NORMAL );
    }
    info.name = name;

    if ( *thread_count < MAX_THREADS ) {
        threads[( *thread_count )++] = &info;
    } else {
        common->DPrintf( "WARNING: MAX_THREADS reached\n" );
    }
    Sys_LeaveCriticalSection();
}

And here is the function to kill the threads at exit.

Code: [Select]
void Sys_DestroyThread( xthreadInfo &info ) {
    // no thread running so nothing to kill.
    if ( !info.threadHandle ) {
        return;
    }
    HANDLE hThread = ( HANDLE )info.threadHandle;
    DWORD dwExitCode = 0;

    // give it a little time to run to completion. There is only one handle
    // here, so WaitForSingleObject is enough (WaitForMultipleObjects fails
    // on an array that holds the same handle more than once).
    DWORD dwWaitResult = WaitForSingleObject( hThread, 1000 );

    // signal handlers for the wait.
    switch ( dwWaitResult ) {
    case WAIT_OBJECT_0:
        // The condition we want.
        idLib::common->Printf( "The child thread state was signaled!\n" );
        break;
    case WAIT_TIMEOUT:
        // Thread might be busy or stuck; ExitProcess ends it right after this.
        idLib::common->Printf( "Time-out interval elapsed, and the child thread's state is nonsignaled.\n" );
        break;
    default:
        // WAIT_FAILED: the handle is probably invalid, so leave it alone.
        idLib::common->Printf( "WaitForSingleObject() failed, error %lu\n", ::GetLastError() );
        return; // get the hell outta here!
    }

    Sys_EnterCriticalSection();

    // Get thread exit status and close the handle. Do not call ExitThread
    // here: it ends the calling thread, not the one the handle refers to.
    if ( ::GetExitCodeThread( hThread, &dwExitCode ) != FALSE ) {
        idLib::common->DPrintf( "thread '%s' exit code %lu\n", info.name, dwExitCode );
    }
    if ( ::CloseHandle( hThread ) != FALSE ) {
        info.threadHandle = 0;
    }
    Sys_LeaveCriticalSection();
}

It is called in one place only: at the end of Sys_Quit, just before ExitProcess(0);
This makes sure nothing is still hooked when the thread exits, and it only runs at game shutdown.

The Sys_EnterCriticalSection and Sys_LeaveCriticalSection calls are there to make sure the process owns the running thread.

The last function distributes the running threads across cores on multicore machines, e.g. > 2 cores.

Code: [Select]
void Sys_SetThreadAffinity( bool mainthread ) {
    SYSTEM_INFO info;

    // check number of processors
    GetSystemInfo( &info );

    // single core machine so bail out.
    if ( info.dwNumberOfProcessors < 2 ) {
        return;
    }

    // bit N of the affinity mask selects logical processor N (zero based),
    // so core 1 is bit 0, core 2 is bit 1 and so on.
    DWORD_PTR mask;

    if ( mainthread ) {
        // set thread affinity for main thread on core 1, plus core 3 if it exists
        mask = ( DWORD_PTR )1 << 0;
        if ( info.dwNumberOfProcessors >= 3 ) {
            mask |= ( DWORD_PTR )1 << 2;
        }
        SetThreadAffinityMask( GetCurrentThread(), mask );
    } else {
        // set affinity for the other thread on core 2, plus core 4 if it exists
        mask = ( DWORD_PTR )1 << 1;
        if ( info.dwNumberOfProcessors >= 4 ) {
            mask |= ( DWORD_PTR )1 << 3;
        }
        // threadInfo is assumed to be the xthreadInfo of the async thread in this file
        if ( threadInfo.threadHandle ) {
            SetThreadAffinityMask( ( HANDLE )threadInfo.threadHandle, mask );
        }
    }
}

Say we have a quad core: it would put the main thread on cores 1 and 3 and the second thread on cores 2 and 4.
Doom3 uses only 2 threads: one runs the game and the other does background file reads.

Code: [Select]
void idFileSystemLocal::StartBackgroundDownloadThread() {
    if ( !backgroundThread.threadHandle ) {
        Sys_CreateThread( ( xthread_t )BackgroundDownloadThread, NULL, THREAD_NORMAL, backgroundThread, "backgroundDownload", g_threads, &g_thread_count );
        if ( !backgroundThread.threadHandle ) {
            common->Warning( "idFileSystemLocal::StartBackgroundDownloadThread: failed" );
        }
    } else {
        common->Printf( "background thread already running\n" );
    }

    // give the async thread an affinity for the 2 or 4'th core.
    Sys_SetThreadAffinity();
}

and here

Code: [Select]
int WINAPI WinMain( HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow ) {
#ifdef _DEBUG
    SetCurrentDirectory( "C:\\Doom\\Doom 3" );
#endif
    const HCURSOR hcurSave = ::SetCursor( LoadCursor( 0, IDC_WAIT ) );

    // tell windows we're high dpi aware, otherwise display scaling screws up the game
    Sys_SetHighDPIMode();
    Sys_SetPhysicalWorkMemory( 192 << 20, 1024 << 20 );
    Sys_GetCurrentMemoryStatus( exeLaunchMemoryStats );

    win32.hInstance = hInstance;
    idStr::Copynz( sys_cmdline, lpCmdLine, sizeof( sys_cmdline ) );

    // done before Com/Sys_Init since we need this for error output
    Sys_CreateConsole();

    // no abort/retry/fail errors
    SetErrorMode( SEM_FAILCRITICALERRORS );

    for ( int i = 0; i < MAX_CRITICAL_SECTIONS; i++ ) {
        InitializeCriticalSection( &win32.criticalSections[i] );
    }

    // get the initial time base
    Sys_Milliseconds();

#ifdef DEBUG
    // disable the painfully slow MS heap check every 1024 allocs
    _CrtSetDbgFlag( 0 );
#endif

    Sys_FPU_SetPrecision( FPU_PRECISION_DOUBLE_EXTENDED );
    common->Init( 0, NULL, lpCmdLine );

#ifndef ID_DEDICATED
    if ( win32.win_notaskkeys.GetInteger() ) {
        DisableTaskKeys( TRUE, FALSE, FALSE );
    }
#endif
    Sys_StartAsyncThread();

    // hide or show the early console as necessary
    if ( win32.win_viewlog.GetInteger() || com_skipRenderer.GetBool() || idAsyncNetwork::serverDedicated.GetInteger() ) {
        Sys_ShowConsole( 1, true );
    } else {
        Sys_ShowConsole( 0, false );
    }

    // give the main thread an affinity for the first or 3'rd core.
    Sys_SetThreadAffinity( true );

    ::SetCursor( hcurSave );
    ::SetFocus( win32.hWnd );

    // main game loop
    while ( 1 ) {
        Win_Frame();

        // run the game
        common->Frame();
    }

    // never gets here
    return 0;
}

It also needs a declaration in Sys_Public.h

Code: [Select]
void Sys_SetThreadAffinity( bool mainthread = false );
« Last Edit: September 19, 2020, 03:11:03 PM by revelator »

revelator

« Reply #22 on: September 19, 2020, 03:19:40 PM »
More thread work.

Removed thread affinity again; Windows handles this just fine by itself.

Event triggers have been fixed, so they now actually work on the background thread and no longer require Sys_Sleep (fix from TDM).

After some small modifications, my thread exit code turned out to be safe enough to run before recreating the thread handles on a map change or reload, so it is used there as well now.
If the handle already exists it will simply reuse it :) else it will wait for the thread to exit and then recreate it.
Found one small use for Sys_ThreadName in the event trigger. It was normally set to NULL to create an unnamed event handle; instead I'm now feeding it the current thread's name so we can keep track of it.
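
To show the difference between an event trigger and polling with Sys_Sleep, here is a small portable sketch. It is not the TDM fix itself: the NamedEvent class is invented for the example and only loosely mimics an auto-reset Win32 event that carries a name for tracking.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical auto-reset event. A waiter blocks until Signal() is called
// instead of sleeping and re-checking, and a successful wait consumes the signal.
class NamedEvent {
public:
    explicit NamedEvent( std::string eventName ) : name( std::move( eventName ) ) {}

    // Wake one waiter (or let the next wait return immediately).
    void Signal() {
        {
            std::lock_guard<std::mutex> lock( mutex );
            signaled = true;
        }
        cond.notify_one();
    }

    // Returns true if the event was signaled before the timeout ran out.
    bool WaitFor( std::chrono::milliseconds timeout ) {
        std::unique_lock<std::mutex> lock( mutex );
        if ( !cond.wait_for( lock, timeout, [this] { return signaled; } ) ) {
            return false;
        }
        signaled = false; // auto-reset
        return true;
    }

    const std::string &Name() const { return name; }

private:
    std::string name;
    std::mutex mutex;
    std::condition_variable cond;
    bool signaled = false;
};

// Usage example: the event carries the thread's name so it can be tracked.
inline void ExampleEvent() {
    NamedEvent ev( "backgroundDownload" );
    ev.Signal();
    assert( ev.WaitFor( std::chrono::milliseconds( 1 ) ) );
    assert( ev.Name() == "backgroundDownload" );
}
```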