Say Hello to Moduledata

The moduledata structure is a data table that was first introduced in Go 1.5. It records the layout of the executable and holds information that is important when statically analyzing Go binaries. In ELF binaries, the structure is located in the .noptrdata section. In PE files it is much harder to find: sometimes it is located in the .text section and sometimes in the .data section. If the binary has been stripped, there is no symbol pointing to the structure, and a brute-force search is needed. To know what to search for, we first need to know what the structure looks like.

The structure

The structure as it was added in version 1.5 is shown below. It is defined in the symtab.go file in the runtime package. Several of its entries are useful when analyzing Go binaries.

type moduledata struct {
	pclntable    []byte
	ftab         []functab
	filetab      []uint32
	findfunctab  uintptr
	minpc, maxpc uintptr

	text, etext           uintptr
	noptrdata, enoptrdata uintptr
	data, edata           uintptr
	bss, ebss             uintptr
	noptrbss, enoptrbss   uintptr
	end, gcdata, gcbss    uintptr

	typelinks []*_type

	modulename   string
	modulehashes []modulehash

	gcdatamask, gcbssmask bitvector

	next *moduledata
}

The first entry in the structure is the pclntable. This table holds mappings between program counter values and source code line numbers. It exists so that the runtime can produce meaningful stack traces during a panic. The table is not removed when the binary is stripped or when debugging information is removed. Since it holds information that maps back to the source code, this table can be a gold mine for malware analysts. The information includes the full path and name of each source file at compile time and the name of each function. Using the mapping from program counter to source code line number, it is also possible to estimate which source lines each function covers. Since this table holds a lot of information, it is one of the big contributors to the large file size of Go binaries, particularly in larger applications.

The next entry in the structure is the ftab, short for function table. The functab structure is shown below. The runtime package uses this table to determine which function is currently being executed, for example when the runtime function FuncForPC is called. That function takes a program counter value and returns a representation of the function that contains it. Using that representation, the filename and line number for a given program counter can be determined.

type functab struct {
	entry   uintptr
	funcoff uintptr
}
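
As a small illustration of the lookup described above, the sketch below resolves a program counter with runtime.FuncForPC and then asks the returned *runtime.Func for its name, file, and line. The helper names describe, target, and targetPC are made up for this example, and taking the entry address of a function through reflect is only a convenient way to obtain a valid program counter; it is not how the runtime itself obtains one.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
)

// describe resolves a program counter to a function name, source file and
// line number using runtime.FuncForPC. ok is false if no function is found.
func describe(pc uintptr) (name, file string, line int, ok bool) {
	fn := runtime.FuncForPC(pc)
	if fn == nil {
		return "", "", 0, false
	}
	file, line = fn.FileLine(pc)
	return fn.Name(), file, line, true
}

// target exists only so that there is a function to look up.
func target() {}

// targetPC returns the entry address of target, which serves as a valid
// program counter for the lookup (hypothetical helper for this example).
func targetPC() uintptr {
	return reflect.ValueOf(target).Pointer()
}

func main() {
	name, file, line, ok := describe(targetPC())
	if !ok {
		fmt.Println("no function found for pc")
		return
	}
	fmt.Printf("%s is defined at %s:%d\n", name, file, line)
}
```

A static analysis tool has to do the same lookup by parsing ftab and the pclntable from the file, since it cannot call into the runtime of the binary under analysis.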

The moduledata structure also holds pointers to the beginning and the end of useful data areas, for example the lowest and highest program counter values (minpc and maxpc) and the bounds of the text, data, and bss areas. When analyzing ELF binaries, most of these values are easily found because they correspond to ELF sections. The data is more important when analyzing PE files because the areas are not marked as sections. Instead, they are located in either the .text section or the .data section.

The last entry of interest is the typelinks. This slice holds pointers to the internal descriptions of the types in the binary. These type structures hold information for all types, including user-defined types and internal primitives. By parsing this list it is possible to recover type information.

1.7 changes

With the release of 1.7, the moduledata structure was changed. The structure is shown below. A new “section” called types was added. Instead of being a slice of pointers to _type structures, typelinks is now a slice of 32-bit integers, each of which is an offset from the beginning of the types section to the location of a _type structure. A table for interfaces, called itablinks, was also added.

type moduledata struct {
	pclntable    []byte
	ftab         []functab
	filetab      []uint32
	findfunctab  uintptr
	minpc, maxpc uintptr

	text, etext           uintptr
	noptrdata, enoptrdata uintptr
	data, edata           uintptr
	bss, ebss             uintptr
	noptrbss, enoptrbss   uintptr
	end, gcdata, gcbss    uintptr
	types, etypes         uintptr

	typelinks []int32 // offsets from types
	itablinks []*itab

	modulename   string
	modulehashes []modulehash

	gcdatamask, gcbssmask bitvector

	typemap map[typeOff]*_type // offset to *_rtype in previous module

	next *moduledata
}
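
To make the 1.7 change concrete, here is a minimal sketch of how an analysis tool might turn typelinks entries into addresses: each offset is added to the value of the types field. The function name resolveTypelinks and the addresses used in main are invented for illustration, and treating negative or out-of-range offsets as errors is an assumption of this sketch rather than documented runtime behavior.

```go
package main

import "fmt"

// resolveTypelinks converts typelinks offsets into absolute addresses by
// adding each offset to types, the start of the types region. Offsets that
// are negative or land at or beyond etypes are rejected (sketch assumption).
func resolveTypelinks(types, etypes uint64, typelinks []int32) ([]uint64, error) {
	addrs := make([]uint64, 0, len(typelinks))
	for i, off := range typelinks {
		if off < 0 {
			return nil, fmt.Errorf("typelinks[%d]: negative offset %d", i, off)
		}
		addr := types + uint64(off)
		if addr >= etypes {
			return nil, fmt.Errorf("typelinks[%d]: %#x is outside the types region", i, addr)
		}
		addrs = append(addrs, addr)
	}
	return addrs, nil
}

func main() {
	// Made-up values standing in for fields read from a moduledata structure.
	const types, etypes = 0x4a0000, 0x4c0000
	addrs, err := resolveTypelinks(types, etypes, []int32{0x100, 0x2340})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, a := range addrs {
		fmt.Printf("type structure at %#x\n", a)
	}
}
```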

Current format

The current format of the moduledata was introduced in version 1.8. The main changes to the structure were made to support the plugin functionality. The structure is shown below.

type moduledata struct {
	pclntable    []byte
	ftab         []functab
	filetab      []uint32
	findfunctab  uintptr
	minpc, maxpc uintptr

	text, etext           uintptr
	noptrdata, enoptrdata uintptr
	data, edata           uintptr
	bss, ebss             uintptr
	noptrbss, enoptrbss   uintptr
	end, gcdata, gcbss    uintptr
	types, etypes         uintptr

	textsectmap []textsect
	typelinks   []int32 // offsets from types
	itablinks   []*itab

	ptab []ptabEntry

	pluginpath string
	pkghashes  []modulehash

	modulename   string
	modulehashes []modulehash

	hasmain uint8 // 1 if module contains the main function, 0 otherwise

	gcdatamask, gcbssmask bitvector

	typemap map[typeOff]*_type // offset to *_rtype in previous module

	bad bool // module failed to load and should be ignored

	next *moduledata
}

How to find the moduledata

It is easiest to find the moduledata structure in ELF binaries. Because it is not found at a predetermined offset and has neither its own section nor, in a stripped binary, a symbol pointing to it, it needs to be searched for. It is located in the .noptrdata section, and since .noptrdata is its own section in an ELF binary, the search can be limited to that section. One way of finding it is to search for known addresses that should be in the data structure. For example, the address of the pclntable can be used. If a match is found, the location is a good candidate for the beginning of the structure, since the address of the pclntable is the first entry in the moduledata structure. Since the pclntable is its own section in ELF binaries, its address is easy to extract. The match can be verified by checking for other known addresses in the candidate. For example, the addresses of the .text and .data sections can be used.
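
The search and verification steps can be sketched as follows. This is an illustration under several assumptions, not a complete tool: the section bytes and addresses are synthetic stand-ins for what would be read from the ELF section headers (for instance with the debug/elf package), the binary is assumed to be little-endian, and all helper names are hypothetical. The position of the text field is derived from the structure definitions above: three slice headers of three words each, followed by findfunctab, minpc, and maxpc, put text at word 12.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// textFieldWord is the index, in pointer-sized words, of the text field:
// three slice headers (3 words each) plus findfunctab, minpc and maxpc.
const textFieldWord = 3*3 + 3

// Synthetic demo addresses (made up for this example).
const (
	demoSectionAddr uint64 = 0x500000
	demoPclntabAddr uint64 = 0x4c0000
	demoTextAddr    uint64 = 0x401000
)

// findPointer returns the virtual addresses inside section at which the
// pointer-sized little-endian value target occurs on a pointer boundary.
func findPointer(section []byte, sectionAddr, target uint64, ptrSize int) []uint64 {
	needle := make([]byte, 8)
	binary.LittleEndian.PutUint64(needle, target)
	needle = needle[:ptrSize]
	var hits []uint64
	for off := 0; off+ptrSize <= len(section); off += ptrSize {
		if bytes.Equal(section[off:off+ptrSize], needle) {
			hits = append(hits, sectionAddr+uint64(off))
		}
	}
	return hits
}

// looksLikeModuledata checks that the text field of the candidate holds the
// known start address of the text area.
func looksLikeModuledata(section []byte, sectionAddr, candidate, textAddr uint64, ptrSize int) bool {
	off := int(candidate-sectionAddr) + textFieldWord*ptrSize
	if off < 0 || off+ptrSize > len(section) {
		return false
	}
	if ptrSize == 4 {
		return uint64(binary.LittleEndian.Uint32(section[off:])) == textAddr
	}
	return binary.LittleEndian.Uint64(section[off:]) == textAddr
}

// demoSection builds a fake 64-bit .noptrdata with one decoy pointer at
// offset 16 and one plausible moduledata start at offset 64.
func demoSection() []byte {
	section := make([]byte, 256)
	binary.LittleEndian.PutUint64(section[16:], demoPclntabAddr)
	binary.LittleEndian.PutUint64(section[64:], demoPclntabAddr)
	binary.LittleEndian.PutUint64(section[64+textFieldWord*8:], demoTextAddr)
	return section
}

func main() {
	section := demoSection()
	for _, c := range findPointer(section, demoSectionAddr, demoPclntabAddr, 8) {
		ok := looksLikeModuledata(section, demoSectionAddr, c, demoTextAddr, 8)
		fmt.Printf("candidate %#x verified=%v\n", c, ok)
	}
}
```

Checking more fields, such as etext or data, in the same way would reduce the chance of a false positive further.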

The process for PE files is slightly harder. For starters, the .noptrdata area is not located in its own section. Secondly, neither is the pclntable. Consequently, a rather brute-force method is required. Luckily, the pclntable has a header that starts with a set of magic bytes, which can be used to search for the table. Once the table has been found, the address of its header can be used to find the moduledata structure in either the .text or .data section, using the same process as for ELF binaries.
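
A hedged sketch of the magic-byte search is shown below. It assumes the header layout introduced with Go 1.2: the little-endian magic value 0xfffffffb, two zero bytes, a one-byte instruction size quantum (1, 2, or 4), and a one-byte pointer size (4 or 8). Later Go releases changed the magic value, so this should be checked against the Go version being analyzed. The names findPclntab and demoImage are made up for this example. The function returns file offsets, which still have to be translated to virtual addresses using the PE section headers before searching for the moduledata.

```go
package main

import (
	"bytes"
	"fmt"
)

// pclntabMagic is 0xfffffffb in little-endian byte order (Go 1.2-style
// header; an assumption of this sketch).
var pclntabMagic = []byte{0xfb, 0xff, 0xff, 0xff}

// findPclntab scans image for the magic bytes and keeps only the offsets
// where the remaining header bytes are plausible.
func findPclntab(image []byte) []int {
	var hits []int
	start := 0
	for {
		i := bytes.Index(image[start:], pclntabMagic)
		if i < 0 {
			break
		}
		off := start + i
		start = off + 1
		if off+8 > len(image) {
			break // not enough bytes left for a full header
		}
		h := image[off : off+8]
		quantum, ptrSize := h[6], h[7]
		if h[4] != 0 || h[5] != 0 {
			continue // padding bytes must be zero
		}
		if quantum != 1 && quantum != 2 && quantum != 4 {
			continue
		}
		if ptrSize != 4 && ptrSize != 8 {
			continue
		}
		hits = append(hits, off)
	}
	return hits
}

// demoImage builds a synthetic file image with a decoy (non-zero padding)
// at offset 4 and a plausible header at offset 32.
func demoImage() []byte {
	image := make([]byte, 64)
	copy(image[4:], []byte{0xfb, 0xff, 0xff, 0xff, 0x12, 0x34, 0x01, 0x08})
	copy(image[32:], []byte{0xfb, 0xff, 0xff, 0xff, 0x00, 0x00, 0x01, 0x08})
	return image
}

func main() {
	for _, off := range findPclntab(demoImage()) {
		fmt.Printf("possible pclntab header at file offset %#x\n", off)
	}
}
```

Validating the bytes after the magic value keeps the number of false candidates low, which matters because the whole .text and .data sections have to be scanned.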

Conclusion

The moduledata structure holds information about the layout of the executable. It can be somewhat hard to find, but once it has been found it makes analyzing the Go binary easier, including recovering type information.