Inside TCMalloc

TCMalloc is a fast malloc replacement from Google. Unlike many other malloc implementations, it can take over memory management without any change to the target application's source code. That is interesting, so how does it work?

// inside tcmalloc.cc

static int tcmallocguard_refcount = 0;  // no lock needed: runs before main()
TCMallocGuard::TCMallocGuard() {
  if (tcmallocguard_refcount++ == 0) {
    ReplaceSystemAlloc();    // defined in libc_override_*.h
    tc_free(tc_malloc(1));
    ThreadCache::InitTSD();
    tc_free(tc_malloc(1));
    // Either we, or debugallocation.cc, or valgrind will control memory
    // management.  We register our extension if we're the winner.
#ifdef TCMALLOC_USING_DEBUGALLOCATION
    // Let debugallocation register its extension.
#else
    if (RunningOnValgrind()) {
      // Let Valgrind uses its own malloc (so don't register our extension).
    } else {
      MallocExtension::Register(new TCMallocImplementation);
    }
#endif
  }
}

TCMallocGuard::~TCMallocGuard() {
  if (--tcmallocguard_refcount == 0) {
    const char* env = NULL;
    if (!RunningOnValgrind()) {
      // Valgrind uses it's own malloc so we cannot do MALLOCSTATS
      env = getenv("MALLOCSTATS");
    }
    if (env != NULL) {
      int level = atoi(env);
      if (level < 1) level = 1;
      PrintStats(level);
    }
  }
}
#ifndef WIN32_OVERRIDE_ALLOCATORS
static TCMallocGuard module_enter_exit_hook;
#endif

// inside libc_override.h

// For windows, there are two ways to get tcmalloc.  If we're
// patching, then src/windows/patch_function.cc will do the necessary
// overriding here.  Otherwise, we doing the 'redefine' trick, where
// we remove malloc/new/etc from mscvcrt.dll, and just need to define
// them now.
#if defined(_WIN32) && defined(WIN32_DO_PATCHING)
void PatchWindowsFunctions();   // in src/windows/patch_function.cc
static void ReplaceSystemAlloc() { PatchWindowsFunctions(); }

#elif defined(_WIN32) && !defined(WIN32_DO_PATCHING)
#include "libc_override_redefine.h"

That is why tcmalloc can be used without modifying source code: once tcmalloc is loaded, it hooks the allocation functions and takes control of memory management.

1. If used as a static library, module_enter_exit_hook does the hooking when the target module is initializing.

2. If used as a dynamic library (a DLL file), module_enter_exit_hook runs when tcmalloc.dll is initializing.
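Both cases rest on the same C++ rule: an object with static storage duration is constructed before main() is entered (or, for a DLL, while the DLL is being initialized). Below is a minimal, portable sketch of that reference-counted guard pattern. InitGuard, InstallHooks and the flag names are hypothetical stand-ins for illustration only; they are not tcmalloc code.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; none of these names exist in tcmalloc.
static int guard_refcount = 0;        // constant-initialized before any constructor runs
static bool hooks_installed = false;
static int install_calls = 0;

static void InstallHooks() {          // plays the role of ReplaceSystemAlloc()
  hooks_installed = true;
  ++install_calls;
}

class InitGuard {
 public:
  InitGuard() {
    // Only the first guard in the process performs the installation.
    if (guard_refcount++ == 0) InstallHooks();
  }
  ~InitGuard() {
    assert(guard_refcount > 0);
    // Only the last guard to be destroyed performs the teardown.
    if (--guard_refcount == 0) hooks_installed = false;
  }
};

// Constructed during static initialization, that is, before main() is entered.
static InitGuard module_guard;

// Small accessors so a caller can observe the state from main().
bool HooksInstalled() { return hooks_installed; }
int InstallCalls() { return install_calls; }
```

If HooksInstalled() is called as the very first statement of main(), it already returns true, and creating further InitGuard objects does not run the installation again.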

OK, now we know how it initializes. Let's explore how it works in different scenarios.

Scenario 1. Why is it able to take control when it is a static library and the application is built with the flag /MT or /MTd?

const GenericFnPtr LibcInfo::static_fn_[] = {
  (GenericFnPtr)&::malloc,
  (GenericFnPtr)&::free,
  (GenericFnPtr)&::realloc,
  (GenericFnPtr)&::calloc,
#ifdef __MINGW32__
  NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
#else
  (GenericFnPtr)(void*(*)(size_t))&::operator new,
  (GenericFnPtr)(void*(*)(size_t))&::operator new[],
  (GenericFnPtr)(void(*)(void*))&::operator delete,
  (GenericFnPtr)(void(*)(void*))&::operator delete[],
  (GenericFnPtr)
  (void*(*)(size_t, struct std::nothrow_t const &))&::operator new,
  (GenericFnPtr)
  (void*(*)(size_t, struct std::nothrow_t const &))&::operator new[],
  (GenericFnPtr)
  (void(*)(void*, struct std::nothrow_t const &))&::operator delete,
  (GenericFnPtr)
  (void(*)(void*, struct std::nothrow_t const &))&::operator delete[],
#endif
  (GenericFnPtr)&::_msize,
  (GenericFnPtr)&::_expand,
  (GenericFnPtr)&::calloc,
};

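Because tcmalloc is linked into the same binary, the addresses it takes here (&::malloc, &::operator new and so on) resolve to the very same functions our application code calls (through macros in some cases). So even in a debug build, it still knows exactly which functions to hook.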
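The table works because taking the address of a function inside the same binary yields the function the rest of that binary calls, and because a cast can pick one overload of operator new or operator delete. Here is a small portable sketch of the same idea. The index names and helper functions are hypothetical, and only four entries are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

typedef void (*GenericFnPtr)();  // one pointer type that can hold any function address

// Hypothetical slot indices, for illustration only.
enum { kMalloc, kFree, kNew, kDelete, kNumEntries };

// The inner cast selects a specific overload; the outer cast erases the signature.
static const GenericFnPtr kStaticFns[kNumEntries] = {
  (GenericFnPtr)&::malloc,
  (GenericFnPtr)&::free,
  (GenericFnPtr)(void* (*)(std::size_t))&::operator new,
  (GenericFnPtr)(void (*)(void*))&::operator delete,
};

// To call an entry, cast it back to its real signature first.
void* AllocViaTable(std::size_t n) {
  assert(kStaticFns[kMalloc] != nullptr);
  return reinterpret_cast<void* (*)(std::size_t)>(kStaticFns[kMalloc])(n);
}

void FreeViaTable(void* p) {
  reinterpret_cast<void (*)(void*)>(kStaticFns[kFree])(p);
}

// The stored addresses are the same ones the rest of this binary links against.
bool TableMatchesLinkedFunctions() {
  return kStaticFns[kMalloc] == (GenericFnPtr)&::malloc &&
         kStaticFns[kFree] == (GenericFnPtr)&::free &&
         kStaticFns[kNew] == (GenericFnPtr)(void* (*)(std::size_t))&::operator new &&
         kStaticFns[kDelete] == (GenericFnPtr)(void (*)(void*))&::operator delete;
}
```

A patcher that holds such a table knows where each function starts in the current module, which is the information it needs before it can overwrite anything.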

Scenario 2. Why is it able to take control even when the target application is built with the CL flag /MT or /MTd while tcmalloc itself is a DLL?

Once an application is compiled with one of these flags, malloc/free/new/delete become part of the target binary, so the tcmalloc DLL cannot know these functions' addresses. However, it does work in this case.

Use a debugger and single-step through the application.

In a release build:

malloc -> HeapAlloc

In a debug build:

malloc -> _nh_malloc_dbg -> _nh_malloc_dbg_impl -> _heap_alloc_dbg_impl -> _heap_alloc_base -> HeapAlloc

>    ntdll.dll!_RtlAllocateHeap@12()    Unknown
test_tcmalloc.exe!_heap_alloc_base(unsigned int size) Line 58    C
test_tcmalloc.exe!_heap_alloc_dbg_impl(unsigned int nSize, int nBlockUse, const char * szFileName, int nLine, int * errno_tmp) Line 431    C++
test_tcmalloc.exe!_nh_malloc_dbg_impl(unsigned int nSize, int nhFlag, int nBlockUse, const char * szFileName, int nLine, int * errno_tmp) Line 239    C++
test_tcmalloc.exe!_nh_malloc_dbg(unsigned int nSize, int nhFlag, int nBlockUse, const char * szFileName, int nLine) Line 302    C++
test_tcmalloc.exe!malloc(unsigned int nSize) Line 56    C++

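No matter how we call malloc, it ends up in HeapAlloc (kernel32 forwards HeapAlloc to RtlAllocateHeap in ntdll, which is why the top frame of the trace shows that name).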
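The consequence of this funnel is that redirecting the single lowest layer catches every path above it. The sketch below models that with ordinary function pointers. It is a portable simulation, not real Windows code: all function names are hypothetical, and the "hook" is a plain pointer swap rather than a patch of machine code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

typedef void* (*HeapAllocFn)(std::size_t);

static int system_heap_calls = 0;
static int replacement_calls = 0;

// Stand-in for the operating system's HeapAlloc.
static void* SystemHeapAlloc(std::size_t n) {
  ++system_heap_calls;
  return std::malloc(n);
}

// Stand-in for the allocator that takes over.
static void* ReplacementHeapAlloc(std::size_t n) {
  ++replacement_calls;
  return std::malloc(n);
}

// The one low-level entry point that every allocation path reaches.
static HeapAllocFn g_heap_alloc = &SystemHeapAlloc;

// Release-style path: malloc -> HeapAlloc.
void* ReleaseMalloc(std::size_t n) { return g_heap_alloc(n); }

// Debug-style path: several wrapper layers, same final call.
static void* HeapAllocBase(std::size_t n) { return g_heap_alloc(n); }
static void* HeapAllocDbgImpl(std::size_t n) { return HeapAllocBase(n); }
static void* NhMallocDbgImpl(std::size_t n) { return HeapAllocDbgImpl(n); }
void* DebugMalloc(std::size_t n) { return NhMallocDbgImpl(n); }

// Redirect the bottom layer once; both paths above follow automatically.
void HookHeapAlloc() {
  assert(g_heap_alloc != nullptr);
  g_heap_alloc = &ReplacementHeapAlloc;
}

int SystemHeapCalls() { return system_heap_calls; }
int ReplacementCalls() { return replacement_calls; }
```

Neither ReleaseMalloc nor DebugMalloc is touched by the hook, yet both reach the replacement afterwards, which mirrors why the patcher does not need the addresses of the application's own malloc.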

/*static*/ WindowsInfo::FunctionInfo WindowsInfo::function_info_[] = {
  { "HeapAlloc", NULL, NULL, (GenericFnPtr)&Perftools_HeapAlloc },
  { "HeapFree", NULL, NULL, (GenericFnPtr)&Perftools_HeapFree },
  { "VirtualAllocEx", NULL, NULL, (GenericFnPtr)&Perftools_VirtualAllocEx },
  { "VirtualFreeEx", NULL, NULL, (GenericFnPtr)&Perftools_VirtualFreeEx },
  { "MapViewOfFileEx", NULL, NULL, (GenericFnPtr)&Perftools_MapViewOfFileEx },
  { "UnmapViewOfFile", NULL, NULL, (GenericFnPtr)&Perftools_UnmapViewOfFile },
  { "LoadLibraryExW", NULL, NULL, (GenericFnPtr)&Perftools_LoadLibraryExW },
  { "FreeLibrary", NULL, NULL, (GenericFnPtr)&Perftools_FreeLibrary },
};

void WindowsInfo::Patch() {
  HMODULE hkernel32 = ::GetModuleHandleA("kernel32");
  CHECK_NE(hkernel32, NULL);

  // Unlike for libc, we know these exist in our module, so we can get
  // and patch at the same time.
  for (int i = 0; i < kNumFunctions; i++) {
    function_info_[i].windows_fn = (GenericFnPtr)
        ::GetProcAddress(hkernel32, function_info_[i].name);
    // If origstub_fn is not NULL, it's left around from a previous
    // patch.  We need to set it to NULL for the new Patch call.
    // Since we've patched Unpatch() not to delete origstub_fn_ (it
    // causes problems in some contexts, though obviously not this
    // one), we should delete it now, before setting it to NULL.
    // NOTE: casting from a function to a pointer is contra the C++
    //       spec.  It's not safe on IA64, but is on i386.  We use
    //       a C-style cast here to emphasize this is not legal C++.
    delete[] (char*)(function_info_[i].origstub_fn);
    function_info_[i].origstub_fn = NULL;  // Patch() will fill this in
    CHECK_EQ(sidestep::SIDESTEP_SUCCESS,
             PreamblePatcher::Patch(function_info_[i].windows_fn,
                                    function_info_[i].perftools_fn,
                                    &function_info_[i].origstub_fn));
  }
}

As a result, once it hooks the Windows heap management functions, it is able to take control.
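The shape of that loop (look up each function by name, remember how to reach the original, then redirect it to the replacement) can be tried out without Windows. The sketch below is a simulation under stated assumptions: the export table, FindExport, and the integer-valued functions are all hypothetical, and the redirection is a pointer swap instead of the preamble patching that PreamblePatcher performs on real machine code.

```cpp
#include <cassert>
#include <cstring>

typedef int (*Fn)(int);

// Simulated system functions. Integer results keep the redirection easy to see.
static int SystemAlloc(int n) { return n; }
static int SystemFree(int n) { return -n; }

// Simulated export table: callers always go through these slots.
struct Slot { const char* name; Fn entry; };
static Slot g_exports[] = { {"Alloc", &SystemAlloc}, {"Free", &SystemFree} };

// Plays the role of GetProcAddress in this simulation.
static Slot* FindExport(const char* name) {
  for (Slot& s : g_exports) {
    if (std::strcmp(s.name, name) == 0) return &s;
  }
  return nullptr;
}

struct FunctionInfo {
  const char* name;
  Slot* slot;        // where the system function lives (compare windows_fn)
  Fn origstub_fn;    // how to still reach the original after patching
  Fn perftools_fn;   // the replacement
};

static int HookedAlloc(int n);
static int HookedFree(int n);

static FunctionInfo g_info[] = {
  {"Alloc", nullptr, nullptr, &HookedAlloc},
  {"Free", nullptr, nullptr, &HookedFree},
};

// Replacements call the saved original and add 1000 so the detour is visible.
static int HookedAlloc(int n) { return g_info[0].origstub_fn(n) + 1000; }
static int HookedFree(int n) { return g_info[1].origstub_fn(n) + 1000; }

bool PatchAll() {
  for (FunctionInfo& fi : g_info) {
    fi.slot = FindExport(fi.name);
    if (fi.slot == nullptr) return false;
    fi.origstub_fn = fi.slot->entry;    // save the original
    fi.slot->entry = fi.perftools_fn;   // redirect to the replacement
  }
  return true;
}

void UnpatchAll() {
  for (FunctionInfo& fi : g_info) {
    if (fi.slot != nullptr && fi.origstub_fn != nullptr) {
      fi.slot->entry = fi.origstub_fn;  // restore the original
      fi.origstub_fn = nullptr;
    }
  }
}

int CallExport(const char* name, int arg) {
  Slot* s = FindExport(name);
  assert(s != nullptr);
  return s->entry(arg);
}
```

Keeping origstub_fn around is what lets a replacement fall back to the system behaviour when it needs to, and it is also what makes unpatching possible.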
