Say Hello to Moduledata
The moduledata structure is a data table that was first introduced in version
1.5 of Go. It holds important information that is needed when statically
analyzing Go binaries, because it records the layout of the executable. For ELF
binaries, the structure can be found in the .noptrdata section. In PE files it
is much harder to find: sometimes it is located in the .text section and
sometimes in the .data section. If the binary has been stripped, there is no
symbol pointing to the structure, and a brute-force search is needed. To know
what to search for, we first need to know what the structure looks like.
The structure
The structure that was added in version 1.5 is shown below. It is defined in the symtab.go file in the runtime package, and many of its fields are useful when analyzing Go binaries.
type moduledata struct {
    pclntable []byte
    ftab []functab
    filetab []uint32
    findfunctab uintptr
    minpc, maxpc uintptr
    text, etext uintptr
    noptrdata, enoptrdata uintptr
    data, edata uintptr
    bss, ebss uintptr
    noptrbss, enoptrbss uintptr
    end, gcdata, gcbss uintptr
    typelinks []*_type
    modulename string
    modulehashes []modulehash
    gcdatamask, gcbssmask bitvector
    next *moduledata
}
The first entry in the structure is the pclntable. This table holds mappings
between program counter values and source code line numbers. It exists so that
the runtime can produce meaningful stack traces during a panic. The table is
kept even when the binary is stripped and debugging information is removed.
Since it holds information that maps back to the source code, this table can be
a gold mine for malware analysts. The information includes the full path and
name of each source file at compile time and the name of each function. Using
the mapping from program counter to source code line number, it is also
possible to estimate which source code lines each function spans. Because this
table holds a lot of information, it is one of the big contributors to the
large file size of Go binaries, particularly in larger applications.
The next entry in the structure is the ftab, short for function table. The
functab structure is shown below. The runtime package uses this table to
determine which function is currently being executed. For example, it is
consulted when the runtime function FuncForPC is called. That function takes a
program counter value and returns a representation of the function containing
it. From that representation, the filename and line number for a given program
counter can be determined.
type functab struct {
    entry uintptr
    funcoff uintptr
}
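To see this lookup from the Go side, here is a minimal sketch that resolves a program counter with runtime.FuncForPC. The helpers describePC and currentPC are hypothetical names introduced for this illustration; only runtime.Caller, runtime.FuncForPC, Func.Name and Func.FileLine come from the standard library.

```go
package main

import (
	"fmt"
	"runtime"
)

// describePC resolves a program counter to a function name, source file and
// line number using the tables the runtime keeps for this purpose.
// (Hypothetical helper, not part of the runtime package.)
func describePC(pc uintptr) (name, file string, line int, ok bool) {
	fn := runtime.FuncForPC(pc)
	if fn == nil {
		// The program counter does not belong to any known function.
		return "", "", 0, false
	}
	file, line = fn.FileLine(pc)
	return fn.Name(), file, line, true
}

// currentPC returns a program counter that lies inside the calling function.
// (Hypothetical helper.)
func currentPC() uintptr {
	pc, _, _, _ := runtime.Caller(1)
	return pc
}

func main() {
	name, file, line, ok := describePC(currentPC())
	fmt.Println(name, file, line, ok)
}
```

The function name and file path printed here are the same strings that remain recoverable from a stripped binary, which is why the tables are so useful during analysis.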
The moduledata structure also holds pointers to the beginning and the end of
useful data areas, for example the minimum and maximum program counter values
(minpc and maxpc) and the bounds of the text, data, and bss areas. When
analyzing ELF binaries, most of these values are easily found because they
correspond to ELF sections. The data is more important when analyzing PE files
because there the areas are not marked as sections. Instead, they are located
in either the .text section or the .data section.
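As a small illustration of how these start and end pairs can be used once the structure has been parsed, the sketch below classifies an address by the area that contains it. The area type, the areaOf function and all addresses are made up for this example; real bounds would be read from the moduledata fields.

```go
package main

import "fmt"

// area is one half-open address range [start, end) taken from a pair of
// moduledata fields such as text/etext. (Hypothetical type.)
type area struct {
	name       string
	start, end uint64
}

// areaOf returns the name of the first area containing addr, or "" if no
// area contains it.
func areaOf(areas []area, addr uint64) string {
	for _, a := range areas {
		if addr >= a.start && addr < a.end {
			return a.name
		}
	}
	return ""
}

func main() {
	// Made-up bounds for illustration only.
	areas := []area{
		{"text", 0x401000, 0x4a0000},
		{"noptrdata", 0x54e000, 0x550000},
		{"data", 0x550000, 0x556000},
		{"bss", 0x556000, 0x570000},
	}
	fmt.Println(areaOf(areas, 0x54e100)) // prints "noptrdata"
}
```

In a PE file, where these areas share the .text and .data sections, a lookup like this is one way to tell which kind of data a pointer refers to.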
The last entry of interest is the typelinks. This slice holds pointers to the
internal descriptions of the types in the binary. These type structures hold
information for all types, including user-defined types and internal
primitives. By parsing this list it is possible to recover type information.
1.7 changes
With the release of 1.7, the moduledata structure was changed. The new
structure is shown below. A new “section” called types was added. Instead of
typelinks being a slice of pointers to _type structures, it is now a slice of
32-bit integers, each an offset from the beginning of the types section to the
location of a type structure. A table for interfaces, called itablinks, was
also added.
type moduledata struct {
    pclntable []byte
    ftab []functab
    filetab []uint32
    findfunctab uintptr
    minpc, maxpc uintptr
    text, etext uintptr
    noptrdata, enoptrdata uintptr
    data, edata uintptr
    bss, ebss uintptr
    noptrbss, enoptrbss uintptr
    end, gcdata, gcbss uintptr
    types, etypes uintptr
    typelinks []int32 // offsets from types
    itablinks []*itab
    modulename string
    modulehashes []modulehash
    gcdatamask, gcbssmask bitvector
    typemap map[typeOff]*_type // offset to *_rtype in previous module
    next *moduledata
}
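The offset arithmetic for the 1.7 typelinks can be sketched as follows. This is an illustration under stated assumptions rather than runtime code: resolveTypelinks is a hypothetical helper, the addresses in main are made up, and entries that fall outside the types area are simply skipped.

```go
package main

import "fmt"

// resolveTypelinks converts 1.7-style typelinks entries, which are offsets
// relative to the types field, into absolute virtual addresses. Entries that
// would land outside [types, etypes) are skipped. (Hypothetical helper.)
func resolveTypelinks(types, etypes uint64, typelinks []int32) []uint64 {
	addrs := make([]uint64, 0, len(typelinks))
	for _, off := range typelinks {
		if off < 0 {
			continue // would point before the start of the types area
		}
		addr := types + uint64(off)
		if addr >= etypes {
			continue // would point past the end of the types area
		}
		addrs = append(addrs, addr)
	}
	return addrs
}

func main() {
	// Made-up values for illustration; the last offset is out of range.
	addrs := resolveTypelinks(0x4a0000, 0x4b0000, []int32{0x120, 0x9c0, 0x20000})
	for _, a := range addrs {
		fmt.Printf("%#x\n", a) // prints 0x4a0120, then 0x4a09c0
	}
}
```

Each resulting address is where a type structure would be parsed from; for the 1.5 format no such step is needed because the slice already holds pointers.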
Current format
The current format of the moduledata was introduced in version 1.8. The main changes to the structure were made to support the plugin functionality. The structure is shown below.
type moduledata struct {
    pclntable []byte
    ftab []functab
    filetab []uint32
    findfunctab uintptr
    minpc, maxpc uintptr
    text, etext uintptr
    noptrdata, enoptrdata uintptr
    data, edata uintptr
    bss, ebss uintptr
    noptrbss, enoptrbss uintptr
    end, gcdata, gcbss uintptr
    types, etypes uintptr
    textsectmap []textsect
    typelinks []int32 // offsets from types
    itablinks []*itab
    ptab []ptabEntry
    pluginpath string
    pkghashes []modulehash
    modulename string
    modulehashes []modulehash
    hasmain uint8 // 1 if module contains the main function, 0 otherwise
    gcdatamask, gcbssmask bitvector
    typemap map[typeOff]*_type // offset to *_rtype in previous module
    bad bool // module failed to load and should be ignored
    next *moduledata
}
How to find the moduledata
The moduledata structure is easiest to find in ELF binaries. Because it is not
located at a predetermined offset, does not have its own section, and has no
symbol pointing to it in stripped binaries, it needs to be searched for. It is
located in the .noptrdata area, and since this area has its own section in an
ELF binary, the search scope can be limited to that section. One way of finding
the structure is to search for known addresses that should be stored in it. For
example, the address of the pclntable can be used. If a match is found, the
location is a good candidate for the beginning of the structure, since the
address of the pclntable is the first entry in the moduledata structure. Since
the pclntable is its own section in ELF binaries, its address is easy to
extract. The match can be verified by checking for other known addresses in the
candidate. For example, the addresses of the .text and .data sections can be
used.
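The search and verification steps above can be sketched as follows. This is a minimal illustration under several assumptions: the section contents are already loaded into a byte slice that starts at a pointer-aligned address, the target is little-endian, and the field positions are counted from the struct definitions shown earlier (a slice header takes three words, so text is word 12 and data is word 16). The names findModuledata and word are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Word indexes of the fields used for matching, counted from the struct
// definitions: pclntable, ftab and filetab are slices of three words each
// (pointer, length, capacity), followed by findfunctab, minpc and maxpc.
const (
	wordPclntable = 0
	wordText      = 12
	wordData      = 16
)

// word reads the little-endian, pointer-sized word number idx of a structure
// that starts at byte offset off. ok is false if the read would leave buf.
func word(buf []byte, off, idx, ptrSize int) (v uint64, ok bool) {
	start := off + idx*ptrSize
	if start+ptrSize > len(buf) {
		return 0, false
	}
	if ptrSize == 8 {
		return binary.LittleEndian.Uint64(buf[start:]), true
	}
	return uint64(binary.LittleEndian.Uint32(buf[start:])), true
}

// findModuledata returns the byte offsets in buf at which the first word
// equals the pclntable address and the text and data fields match the
// expected section addresses.
func findModuledata(buf []byte, ptrSize int, pclntab, text, data uint64) []int {
	if ptrSize != 4 && ptrSize != 8 {
		return nil
	}
	var hits []int
	for off := 0; off+ptrSize <= len(buf); off += ptrSize {
		if v, ok := word(buf, off, wordPclntable, ptrSize); !ok || v != pclntab {
			continue
		}
		t, okT := word(buf, off, wordText, ptrSize)
		d, okD := word(buf, off, wordData, ptrSize)
		if okT && okD && t == text && d == data {
			hits = append(hits, off)
		}
	}
	return hits
}

func main() {
	// Synthetic 64-bit section contents with a fake structure at offset 16.
	// All addresses are made up for illustration.
	buf := make([]byte, 256)
	put := func(idx int, v uint64) { binary.LittleEndian.PutUint64(buf[16+idx*8:], v) }
	put(wordPclntable, 0x4d2000)
	put(wordText, 0x401000)
	put(wordData, 0x54e000)
	fmt.Println(findModuledata(buf, 8, 0x4d2000, 0x401000, 0x54e000)) // prints [16]
}
```

Adding the returned offset to the section's virtual address gives the address of the candidate structure. Checking more fields, such as etext or the bss bounds, would reduce the chance of a false match further.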
The process for PE files is slightly harder. For starters, the .noptrdata area
is not located in its own section. Secondly, neither is the pclntable.
Consequently, a rather brute-force method is required. Luckily, the pclntable
has a header that starts with a set of magic bytes, which can be used to search
for the table. Once it has been found, the address of this header can be used
to find the moduledata structure in either the .text or the .data section using
the same process as for ELF binaries.
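A brute-force scan for the header could look like the sketch below. The article does not spell out the magic bytes, so the values here are an assumption taken from the pclntab format introduced in Go 1.2, which is the format used by the Go versions discussed in this article: the 32-bit magic 0xfffffffb, two zero bytes, the instruction size quantum (1, 2 or 4) and the pointer size (4 or 8). Other Go releases may use a different magic value. The name findPclntab is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// pclntabMagic is the little-endian encoding of 0xfffffffb.
// Assumption: Go 1.2-style pclntab header.
var pclntabMagic = []byte{0xfb, 0xff, 0xff, 0xff}

// findPclntab returns the offsets in data at which a plausible pclntab header
// starts. Besides the magic, the next four header bytes are checked to weed
// out random matches.
func findPclntab(data []byte) []int {
	var hits []int
	for base := 0; ; {
		i := bytes.Index(data[base:], pclntabMagic)
		if i < 0 {
			return hits
		}
		off := base + i
		base = off + 1 // continue searching after this match
		if off+8 > len(data) {
			continue // not enough bytes left for a full header
		}
		h := data[off : off+8]
		quantumOK := h[6] == 1 || h[6] == 2 || h[6] == 4
		ptrSizeOK := h[7] == 4 || h[7] == 8
		if h[4] == 0 && h[5] == 0 && quantumOK && ptrSizeOK {
			hits = append(hits, off)
		}
	}
}

func main() {
	data := []byte{
		0xfb, 0xff, 0xff, 0xff, 0x90, 0x90, 0x90, 0x90, // magic, invalid header
		0xfb, 0xff, 0xff, 0xff, 0x00, 0x00, 0x01, 0x08, // magic, valid header
	}
	fmt.Println(findPclntab(data)) // prints [8]
}
```

The offsets returned are positions in the scanned buffer. They have to be converted to virtual addresses before being used as the search value for the first entry of the moduledata structure.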
Conclusion
The moduledata structure holds information about the layout of the executable. It can be somewhat hard to find, but once it has been found it makes analysis of the Go binary much easier, including the recovery of type information.