module Virt_mem_mmap: sig .. end
Memory map.
Memory maps represent the virtual memory of a virtual machine.
We are mostly interested in the kernel memory and kernel data structures. In Linux, kernel memory stays at the same virtual address whichever task is actually running (e.g. on i386 machines, the kernel is often found at virtual address 0xC0100000). Kernel memory is spread out over several ranges of addresses, with gaps of uninteresting or non-existent virtual addresses in between, and this structure captures that.
A memory map is a range of 64 bit addresses from 0 to 2^64-1.
(Note that 64 bit addresses are used even for 32 bit virtual
machines - just ignore everything above 0xFFFFFFFF).
A memory map consists of zero or more mappings of data. A
mapping starts at some address and has some size, and the data for
a mapping can come from some source such as a file or OCaml
string. Use Virt_mem_mmap.of_file, Virt_mem_mmap.of_string, Virt_mem_mmap.add_file, Virt_mem_mmap.add_string
to create a memory map from mappings.
If mappings overlap, then the mapping which was added later overrides/overwrites earlier mappings at any addresses which coincide.
Where there is no mapping for a particular address, the memory map
is said to have a hole. (Typically almost all of a memory map is
holes.) In general, the searching functions such as Virt_mem_mmap.find skip
over holes, while the accessor functions such as Virt_mem_mmap.get_bytes
raise an error if you try to read a hole. Check the documentation of each
function for its exact behavior.
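The overlap and hole rules above can be illustrated with a toy model. This is only a sketch under stated assumptions: the names toy_map, toy_add_string and toy_get_byte are hypothetical, and the list representation is chosen for brevity, not taken from Virt_mem_mmap.

```ocaml
(* Toy model only: a memory map as a list of (start address, data) pairs,
   newest mapping first. This is NOT how Virt_mem_mmap stores mappings. *)
type toy_map = (int64 * string) list

let toy_empty : toy_map = []

(* Adding conses onto the front, so later mappings are consulted first
   and therefore override earlier ones where they overlap. *)
let toy_add_string (m : toy_map) (data : string) (addr : int64) : toy_map =
  (addr, data) :: m

(* Return the byte at [addr], or raise Invalid_argument if it is a hole. *)
let toy_get_byte (m : toy_map) (addr : int64) : int =
  let covers (start, data) =
    let off = Int64.sub addr start in
    Int64.compare off 0L >= 0
    && Int64.compare off (Int64.of_int (String.length data)) < 0
  in
  match List.find_opt covers m with
  | Some (start, data) -> Char.code data.[Int64.to_int (Int64.sub addr start)]
  | None -> invalid_arg "toy_get_byte"

(* "AAAA" covers 0x1000..0x1003; "BB" is added later and covers 0x1002..0x1003. *)
let example_map =
  toy_add_string (toy_add_string toy_empty "AAAA" 0x1000L) "BB" 0x1002L
```

In this model, reading 0x1001 from example_map yields the byte for 'A', reading 0x1002 yields the byte for 'B' because the later mapping wins, and reading 0x1004 raises Invalid_argument because that address is a hole.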
Memory maps may (or may not) have an associated word size and
endianness for the whole map. These are used when we look at
integers and pointers in the memory. See Virt_mem_mmap.get_endian,
Virt_mem_mmap.set_endian, Virt_mem_mmap.get_wordsize and Virt_mem_mmap.set_wordsize, and accessor
functions such as Virt_mem_mmap.get_int32 and Virt_mem_mmap.follow_pointer.
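To see why the endianness must be known before integers can be read, here is a minimal, self-contained sketch that decodes four bytes of a string as a 32-bit integer. The type toy_endian and the function toy_get_int32 are hypothetical stand-ins (the real module uses Bitstring.endian and reads from a memory map, not a string).

```ocaml
(* Hypothetical stand-in for an endianness setting. *)
type toy_endian = Little | Big

(* Decode the 4 bytes of [s] starting at offset [off] as an int32. *)
let toy_get_int32 (e : toy_endian) (s : string) (off : int) : int32 =
  if off < 0 || off + 4 > String.length s then invalid_arg "toy_get_int32";
  let byte i = Int32.of_int (Char.code s.[off + i]) in
  (* [combine b3 b2 b1 b0] puts b3 in the most significant byte. *)
  let combine b3 b2 b1 b0 =
    Int32.logor (Int32.shift_left b3 24)
      (Int32.logor (Int32.shift_left b2 16)
         (Int32.logor (Int32.shift_left b1 8) b0))
  in
  match e with
  | Little -> combine (byte 3) (byte 2) (byte 1) (byte 0)
  | Big -> combine (byte 0) (byte 1) (byte 2) (byte 3)

(* The same four bytes give different integers under each byte order. *)
let as_little = toy_get_int32 Little "\x78\x56\x34\x12" 0  (* 0x12345678l *)
let as_big = toy_get_int32 Big "\x78\x56\x34\x12" 0        (* 0x78563412l *)
```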
Mappings' data are stored in 1D Bigarrays. The advantages of using a Bigarray are that it is (a) hidden from the garbage collector, (b) easily accessible from C, and (c) able to use mmap(2) where possible.
Some low level functions are written in C for speed.
Mappings are stored in a segment tree for efficient access, but
the segment tree has to be rebuilt from scratch each time you add
a new mapping. It is not known if there is a more efficient way
to incrementally update a segment tree. In any case, as long as
you are mainly doing lookups / searches / getting bytes, this is
very fast.
type ('a, 'b, 'c) t
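The three type parameters (shown above as 'a, 'b and 'c, and referred to
here as 'ws, 'e and 'hm) are phantom types. In order, they record whether
a word size has been set, whether an endianness has been set, and whether
the map has a mapping. They are designed to ensure you don't try illegal
operations before initializing certain parts of the memory map. If you are
not familiar with phantom types, you can just ignore them.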
See also this posting about the phantom types used in virt-mem.
The memory map structure is an example of a persistent data structure.
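The phantom type idea can be sketched in miniature. The module Toy below is hypothetical and tracks only the word size and endianness parameters; it is an illustration of the technique, not the implementation used by Virt_mem_mmap. Note also that set_wordsize returns a new value and leaves its argument unchanged, which is what makes the structure persistent.

```ocaml
(* Hypothetical miniature; field and function names are illustrative only. *)
module Toy : sig
  type ('ws, 'e) t
  val create : unit -> ([ `NoWordsize ], [ `NoEndian ]) t
  val set_wordsize : ([ `NoWordsize ], 'e) t -> int -> ([ `Wordsize ], 'e) t
  val get_wordsize : ([ `Wordsize ], 'e) t -> int
end = struct
  (* The type parameters do not appear in the record: they are phantom. *)
  type ('ws, 'e) t = { wordsize : int option }

  let create () = { wordsize = None }

  (* Returns a fresh record; the argument is not modified. *)
  let set_wordsize _m bits = { wordsize = Some bits }

  let get_wordsize m =
    match m.wordsize with
    | Some bits -> bits
    | None -> assert false (* unreachable: the signature demands `Wordsize *)
end

let m = Toy.set_wordsize (Toy.create ()) 64
let bits = Toy.get_wordsize m
(* Toy.get_wordsize (Toy.create ()) is rejected by the type checker,
   because the first parameter is [ `NoWordsize ], not [ `Wordsize ]. *)
```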
type addr = int64
val create : unit -> ([ `NoWordsize ], [ `NoEndian ], [ `NoMappings ]) t
val of_file : Unix.file_descr ->
addr ->
([ `NoWordsize ], [ `NoEndian ], [ `HasMapping ]) t
Create a memory map that maps the file fd at address addr.
val add_file : ('a, 'b, 'c) t ->
Unix.file_descr ->
addr -> ('a, 'b, [ `HasMapping ]) t
Add a mapping of the file fd at address addr to an existing memory map.
The new mapping can overwrite all or part of an existing mapping.
val of_string : string ->
addr ->
([ `NoWordsize ], [ `NoEndian ], [ `HasMapping ]) t
Create a memory map that maps a string at address addr.
val add_string : ('a, 'b, 'c) t ->
string -> addr -> ('a, 'b, [ `HasMapping ]) t
Add a mapping of a string at address addr to an existing memory map.
The new mapping can overwrite all or part of an existing mapping.
val set_wordsize : ([ `NoWordsize ], 'a, 'b) t ->
Virt_mem_utils.wordsize -> ([ `Wordsize ], 'a, 'b) t
val set_endian : ('a, [ `NoEndian ], 'b) t ->
Bitstring.endian -> ('a, [ `Endian ], 'b) t
val get_wordsize : ([ `Wordsize ], 'a, 'b) t -> Virt_mem_utils.wordsize
val get_endian : ('a, [ `Endian ], 'b) t -> Bitstring.endian
val find : ('a, 'b, [ `HasMapping ]) t ->
?start:addr -> string -> addr option
Search the memory map for a string and return its address, or
None (if not found). You can pass an optional starting
address. If no start address is given, we begin searching at
the beginning of the first mapping.
Any holes in the memory map are skipped automatically.
Note that this doesn't find strings which straddle the boundary of two adjacent or overlapping mappings.
Note that because the string being matched is an OCaml
string it may contain NULs (zero bytes) and those are matched
properly.
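The boundary behavior described above follows naturally if each mapping is searched on its own. The sketch below models that with plain strings; find_in_string and toy_find are hypothetical names, the naive search is chosen for clarity, and the model assumes the mappings are listed in address order. It is not the search code used by the library.

```ocaml
(* Naive substring search: index of the first occurrence of [pat] in [s].
   NUL bytes need no special handling because OCaml strings carry a length. *)
let find_in_string (s : string) (pat : string) : int option =
  let n = String.length s and k = String.length pat in
  let rec go i =
    if i + k > n then None
    else if String.sub s i k = pat then Some i
    else go (i + 1)
  in
  go 0

(* Search each mapping independently and return the first match.
   Assumes [mappings] is sorted by start address. *)
let toy_find (mappings : (int64 * string) list) (pat : string) : int64 option =
  List.fold_left
    (fun acc (start, data) ->
      match acc with
      | Some _ -> acc
      | None ->
        (match find_in_string data pat with
         | Some i -> Some (Int64.add start (Int64.of_int i))
         | None -> None))
    None mappings

(* Two adjacent mappings: 0x1000..0x1005 and 0x1006..0x1008. *)
let adjacent = [ (0x1000L, "abc\000de"); (0x1006L, "fgh") ]
```

With these mappings, searching for "c\000d" finds a match inside the first mapping (the embedded NUL is matched like any other byte), while searching for "efg" finds nothing, because those bytes straddle the boundary between the two mappings.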
val find_align : ([ `Wordsize ], 'a, [ `HasMapping ]) t ->
?start:addr -> string -> addr option
Like Virt_mem_mmap.find, but the string must be aligned to the word size of
the memory map.
val find_all : ('a, 'b, [ `HasMapping ]) t ->
?start:addr -> string -> addr list
Like Virt_mem_mmap.find, but returns all occurrences of a string in a memory map.
val find_all_align : ([ `Wordsize ], 'a, [ `HasMapping ]) t ->
?start:addr -> string -> addr list
Like Virt_mem_mmap.find_all, but the strings must be aligned to the word size.
val find_pointer : ([ `Wordsize ], [ `Endian ], [ `HasMapping ]) t ->
?start:addr -> addr -> addr option
val find_pointer_all : ([ `Wordsize ], [ `Endian ], [ `HasMapping ]) t ->
?start:addr -> addr -> addr list
val get_byte : ('a, 'b, [ `HasMapping ]) t -> addr -> int
This will raise Invalid_argument "get_byte" if the address is
a hole (not mapped).
val get_bytes : ('a, 'b, [ `HasMapping ]) t ->
addr -> int -> string
This will raise Invalid_argument "get_bytes" if the address range
contains holes.
val get_int32 : ('a, [ `Endian ], [ `HasMapping ]) t ->
addr -> int32
Return the int32 at address addr.
val get_int64 : ('a, [ `Endian ], [ `HasMapping ]) t ->
addr -> int64
Return the int64 at address addr.
val get_C_int : ([ `Wordsize ], [ `Endian ], [ `HasMapping ]) t ->
addr -> int32
Return the C int at address addr.
val get_C_long : ([ `Wordsize ], [ `Endian ], [ `HasMapping ]) t ->
addr -> int64
Return the C long at address addr.
val get_string : ('a, 'b, [ `HasMapping ]) t -> addr -> string
Return the bytes starting at addr up to (but not
including) the first ASCII NUL character. In other words, this
returns a C-style string.
This may raise Invalid_argument "get_string" if we reach a
hole (unmapped address) before finding the end of the string.
See also Virt_mem_mmap.get_bytes, Virt_mem_mmap.is_string and Virt_mem_mmap.is_C_identifier.
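The C-style string rule can be sketched on a single buffer. The function toy_get_string is hypothetical and works on an OCaml string rather than a memory map; running off the end of the buffer stands in for reaching a hole.

```ocaml
(* Return the bytes of [data] from offset [off] up to, but not including,
   the first NUL byte. Raises Invalid_argument if no NUL is found, which
   models reaching a hole before the end of the string. *)
let toy_get_string (data : string) (off : int) : string =
  if off < 0 || off > String.length data then invalid_arg "toy_get_string";
  match String.index_from_opt data off '\000' with
  | Some nul -> String.sub data off (nul - off)
  | None -> invalid_arg "toy_get_string"

(* Two NUL-terminated strings followed by unterminated bytes. *)
let buffer = "init\000swapper\000tail"
let first = toy_get_string buffer 0   (* "init" *)
let second = toy_get_string buffer 5  (* "swapper" *)
```

Calling toy_get_string buffer 13 raises Invalid_argument, since "tail" has no terminating NUL inside the buffer.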
val is_string : ('a, 'b, [ `HasMapping ]) t -> addr -> bool
val is_C_identifier : ('a, 'b, [ `HasMapping ]) t -> addr -> bool
val is_mapped : ('a, 'b, 'c) t -> addr -> bool
Return true if addr is mapped.
val is_mapped_range : ('a, 'b, 'c) t -> addr -> int -> bool
Return true if all addresses in the range addr to addr+size-1
are mapped.
val follow_pointer : ([ `Wordsize ], [ `Endian ], [ `HasMapping ]) t ->
addr -> addr
Follow (dereference) the pointer at addr and return
the address pointed to.
val succ_long : ([ `Wordsize ], 'a, [ `HasMapping ]) t ->
addr -> addr
Add the word size in bytes to addr and return it.
val pred_long : ([ `Wordsize ], 'a, [ `HasMapping ]) t ->
addr -> addr
Subtract the word size in bytes from addr and return it.
val align : ([ `Wordsize ], 'a, [ `HasMapping ]) t ->
addr -> addr
Align addr to the next wordsize boundary. If it is already
aligned, this just returns addr.
Invalid_argument.
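Rounding an address up to the next word boundary can be sketched with int64 arithmetic. The helper toy_align is hypothetical and takes the word size in bytes directly (the real function reads it from the memory map); it assumes the word size is a power of two, such as 4 or 8.

```ocaml
(* Round [addr] up to the next multiple of [bytes].
   Assumes [bytes] is a power of two (for example 4 or 8). *)
let toy_align (bytes : int) (addr : int64) : int64 =
  let mask = Int64.of_int (bytes - 1) in
  (* Adding the mask pushes any unaligned address past the next boundary;
     clearing the low bits then lands exactly on that boundary. *)
  Int64.logand (Int64.add addr mask) (Int64.lognot mask)

let unaligned = toy_align 4 0x1001L       (* 0x1004L *)
let already_aligned = toy_align 4 0x1004L (* 0x1004L, unchanged *)
let wide = toy_align 8 0x1001L            (* 0x1008L *)
```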