MEMSCAN 1.4

So, admittedly, it’s been a while since I’ve done anything with MEMSCAN. It’s been neglected for a couple of reasons.

Firstly, I’ve been super busy inside and outside of Cigital.

Secondly, many of the most recent iOS jailbreaks did not include the task_for_pid 0 patch, on which MEMSCAN depends.

The latter was true until the most recent Pangu jailbreak for 9.2-9.3.3 (arm64 only). Pangu included the necessary patch in the jailbreak, allowing MEMSCAN to work again.

I was actually just starting to prepare version 2 of MEMSCAN, which I’ll describe a little further down. Essentially, it would be a total rewrite that piggybacks on the awesome Frida.re project by NowSecure.

Since MEMSCAN now works again, I decided to complete my initial plan of tidying up the code and fixing a couple of issues.

There are, however, still some outstanding issues that I’d like to work on in version 1. Namely, MEMSCAN is pretty inefficient when it comes to scanning memory, although dumping memory to disk is fairly trivial. Below are the changes I’d like to make before finalising version 1:

  1. When dumping memory, I’d like to offer the option of only including the app’s memory/heap, rather than the entire memory space of the process.
  2. When scanning memory for a given secret, I’d like to make things more efficient via threading.
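
As a rough illustration of the threading idea in item 2, the sketch below splits one already-read buffer into slices and searches each slice on its own POSIX thread. It is not MEMSCAN code: `scan_job`, `scan_worker`, `scan_buffer_threaded` and the `SCAN_THREADS` worker count are all hypothetical names chosen for the example. Each worker owns a range of start offsets but may compare bytes past the end of its range, so a secret that straddles two slices is still counted exactly once.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define SCAN_THREADS 4  /* hypothetical worker count, tune per device */

/* One worker's share of the buffer (hypothetical structure). */
struct scan_job {
  const unsigned char *buf;     /* the whole region buffer */
  size_t first;                 /* first start offset this worker owns */
  size_t last;                  /* one past the last start offset it owns */
  const unsigned char *needle;  /* bytes being searched for */
  size_t needle_len;
  size_t hits;                  /* result: matches starting in [first, last) */
};

/* Thread entry point: count matches that start inside this worker's range. */
static void *scan_worker(void *arg)
{
  struct scan_job *j = arg;
  j->hits = 0;
  for (size_t off = j->first; off < j->last; off++) {
    /* The comparison may read past 'last', but never past the buffer,
       because the caller only hands out valid start offsets. */
    if (memcmp(j->buf + off, j->needle, j->needle_len) == 0)
      j->hits++;
  }
  return NULL;
}

/* Returns the number of (possibly overlapping) matches, or -1 if a thread
   could not be created. */
static long scan_buffer_threaded(const void *buf, size_t buf_len,
                                 const void *needle, size_t needle_len)
{
  assert(buf != NULL && needle != NULL);
  if (needle_len == 0 || needle_len > buf_len)
    return 0;

  pthread_t tids[SCAN_THREADS];
  struct scan_job jobs[SCAN_THREADS];
  size_t starts = buf_len - needle_len + 1;  /* number of valid start offsets */
  size_t per = (starts + SCAN_THREADS - 1) / SCAN_THREADS;
  int started = 0;
  int failed = 0;
  long total = 0;

  for (int i = 0; i < SCAN_THREADS; i++) {
    size_t first = (size_t)i * per;
    if (first >= starts)
      break;  /* fewer start offsets than workers */
    size_t last = first + per;
    if (last > starts)
      last = starts;
    jobs[i] = (struct scan_job){ buf, first, last, needle, needle_len, 0 };
    if (pthread_create(&tids[i], NULL, scan_worker, &jobs[i]) != 0) {
      failed = 1;
      break;
    }
    started++;
  }

  /* Always join what was started, even on failure. */
  for (int i = 0; i < started; i++) {
    pthread_join(tids[i], NULL);
    total += (long)jobs[i].hits;
  }
  return failed ? -1 : total;
}
```

Splitting by start offset rather than by raw byte range avoids both missed and double-counted matches at slice boundaries. Whether threads actually help depends on how much time goes on the search itself versus reading the target's memory, which this sketch does not measure.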

Both of these are likely to feature as part of version 2, so you may be wondering: why not just work on version 2 and release that?

I am still a big fan of the way MEMSCAN 1 works. Unlike other tools and scripts, which rely on Frida, Cycript, injection or debuggers, MEMSCAN neither attaches to nor injects into the target process. This is particularly useful for apps that have a very deep network of layered tamper protections. If I want to dump the memory of the target at a specific time, I don’t need to worry about circumventing all of these protections. This was ultimately the reason I wrote MEMSCAN in the first place.

Clean Up

The main source code of MEMSCAN has already been refactored; however, I still have to refactor and include the scanning code.

The current source is shown below:

    /*
      @Author:  Grant Douglas <@Hexploitable>
      @Title:   MEMSCAN
      @Desc:    A tool for memory analysis of iOS and OSX applications
    */

    //Lib includes
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdbool.h>
    #include <dlfcn.h>
    #include <mach-o/dyld_images.h>
    #include <string.h>
    #include <limits.h>
    #include <errno.h>

    //Local includes
    #include "colour_defs.h"
    #include "vm_defs.h"
    #include "banner.h"

    //  "Globals"
    static int g_pid = 0;
    static char *o_File = NULL;
    static int pid = 0;
    static int verbose = 0;
    static int count = 0;


    #define MAX_REGIONS 1000
    //  Prints the usage of memscan
    #define USAGE \
    printf("Usage:\n-------\n"); \
    printf("Verbose mode: -v\n"); \
    printf("Dump memory to a file: memscan [-p <PID>] -d [-o <outputFile>]\n"); \
    printf("Search memory for a sequence of bytes: memscan [-p <PID>] -s <INPUT_FILE>\n");


    /*
      TODO LIST
      [X] Task/PID stuff
      [X] Get regions for task
      [X] Get memory for region
      [X] Dump memory
      [ ] Scan memory
    */


    /*
      Structure to hold all region data
    */
    struct {
      new_vm_address_t begin;
      new_vm_address_t end;
    } MS_Regions[MAX_REGIONS];


    /*
      Get the mach task for a given PID.
      Also verifies that the PID exists.
      @stores the task port in *t.
      @returns 0 on success, -1 if the PID is invalid, -2 if task_for_pid fails.
    */
    static int get_task_for_pid(int pid, task_t *t)
    {
      //Check PID is valid
      if (getpgid(pid) < 0)
      {
        printf(
          "PID: %d, is invalid\n",
          pid
        );
        return -1;
      }

      //Attempt to get task
      kern_return_t kret;
      kret = task_for_pid(mach_task_self(), pid, t);
      if (kret != KERN_SUCCESS)
      {
        printf(
          "Failed to get task for PID: %d. Error: %s\n",
          pid,
          mach_error_string(kret)
        );
        return -2;
      }
      return 0;
    }


    /*
      Using the task reference, get all regions in the process space.
      @stores the region data in the MS_Regions array.
    */
    static void get_regions_for_task(mach_port_t *t)
    {
    #ifdef __arm64__
      mach_vm_size_t size;
    #else
      vm_size_t size;
    #endif

      kern_return_t kret;
      vm_region_extended_info_data_t info;
      mach_msg_type_number_t info_count = VM_REGION_EXTENDED_INFO_COUNT;
      mach_port_t object_name;
      new_vm_address_t address;

      address = 0;

      do {
        //Attempt to get the next region
        kret = new_vm_region(*t,
          &address,
          &size,
          VM_REGION_BASIC_INFO,
          (vm_region_info_t) &info,
          &info_count,
          &object_name
        );

        //If we got the region OK
        if (kret == KERN_SUCCESS) {
          //Store region data
          MS_Regions[count].begin = address;
          MS_Regions[count].end = address+size;

          count++;
          address += size;
        }
        //Unused entries stay zeroed (static storage), so no else branch is
        //needed. Stop on any error, not just KERN_INVALID_ADDRESS, to avoid
        //looping forever, and never write past the end of MS_Regions.
      } while (kret == KERN_SUCCESS && count < MAX_REGIONS);
      return;
    }


    /*
      Basic checks to see if region was read OK.
    */
    static bool region_is_valid(int region)
    {
      //Check region is valid
      if (region < 0 || MAX_REGIONS <= region)
      {
        if (verbose)
          printf("Index out of bounds (Exceeds MAX_REGIONS): %d.\n", region);
        return false;
      }
      //Check if struct entry is valid
      if (MS_Regions[region].begin == 0 || MS_Regions[region].end == 0)
        return false;

      return true;
    }


    /*
      Dump buffer to disk
    */
    static void dump_buffer_to_disk(int region, const void *buffer, size_t sz)
    {
      if (verbose)
        printf("Dumping region %d, buffer size: %zu\n", region+1, sz);

      //Create section break header. Zero-filled so that the padding bytes
      //written below are deterministic rather than stack garbage.
      const char *name = "Region: ";
      char cated_string[17] = {0};
      snprintf(cated_string, sizeof(cated_string), "%s%d", name, region+1);

      //Get file output handler
      FILE *f = fopen(o_File, "a+");
      if (f == NULL)
      {
        printf("Failed to open %s: %s\n", o_File, strerror(errno));
        return;
      }

      //Write section break to output file.
      fwrite(cated_string, sizeof(cated_string), 1, f);

      //Write the data out to the file handler
      fwrite(buffer, sz, 1, f);
      fclose(f);
    }


    /*
      Read memory for region, using the mach_port_t ref
      Also dumps to disk, if so desired
      @returns a heap copy of the region (caller frees), or NULL on failure.
    */
    static void *read_memory_for_region(mach_port_t *t,
      int region,
      bool should_dump_to_disk
    ) {
    #ifdef __arm64__
      vm_offset_t strt = 0;
      mach_msg_type_number_t sz = 0;
      mach_vm_size_t size;
    #else
      pointer_t strt = 0;
      uint32_t sz = 0;
      vm_size_t size;
    #endif

      kern_return_t kr_val;

      size = MS_Regions[region].end - MS_Regions[region].begin;
      //Attempt to read the memory from the region
      kr_val = new_vm_read(*t,
        (new_vm_address_t)MS_Regions[region].begin,
        size,
        &strt,
        &sz
      );

      //If read was successful
      if (kr_val == KERN_SUCCESS)
      {
        //If verbose, print the read size
        if (verbose)
          printf("\t--->Size of read: %u\n\n", sz);

        //Dump the region to disk if the user has asked us to do so
        if (should_dump_to_disk)
          dump_buffer_to_disk(region, (const void *)strt, sz);

        //Create a new buffer to hand back and copy the data read into it
        void *buffer = calloc(1, sz);
        if (buffer != NULL)
          memcpy(buffer, (const void *)strt, sz);

        //The kernel mapped the read data into our address space; release it
        //now that we have our own copy, otherwise every region leaks.
        vm_deallocate(mach_task_self(), (vm_address_t)strt, sz);

        //Return our pointer (NULL if the allocation failed)
        return buffer;
      }
      //Return NULL if failure, this is bad, we need to restructure this func.
      //Should return status code, and store result in received reference.
      return NULL;
    }


    /*
      Application main() -> Handles arguments and invoking the other functions
    */
    int main(int argc, char** argv) {
      BANNER

      //Set some defaults
      bool shouldDump = false;
      char *i_File = NULL;

      //Backing storage for the default output name. Declared at function
      //scope so o_File never points at an array that has gone out of scope.
      char default_name[64];

      //Get and process all of the command line options
      while (1)
      {
        //getopt returns an int; storing it in a char breaks the -1 check
        //on platforms where char is unsigned
        int c = getopt(argc, argv, "ds:o:vp:");
        if (c == -1)
          break;
        switch (c)
        {
          case 'p':
            pid = atoi(optarg);
            break;
          case 'd':
            shouldDump = true;
            break;
          case 'v':
            verbose = 1;
            break;
          case 'o':
            o_File = optarg;
            break;
          case 's':
            //Accepted so the option matches USAGE; the scan is still a TODO
            i_File = optarg;
            break;
          default:
            USAGE
            break;
        }
      }

      argc -= optind;
      argv += optind;

      //If they are not dumping, and do not have an input file for searching.
      if (!shouldDump && i_File == NULL)
      {
        USAGE
        exit(-1);
      }

      //If user didnt specify a file, default to memscan_output_<PID>.bin
      if (o_File == NULL)
      {
        snprintf(default_name, sizeof(default_name),
          "memscan_output_%d.bin", pid);
        o_File = default_name;
      }

      //Create the file fresh, truncates it if exists.
      FILE *f = fopen(o_File, "w+");
      if (f == NULL)
      {
        printf("Failed to create %s: %s\n", o_File, strerror(errno));
        exit(-1);
      }
      fclose(f);

      g_pid = pid;

      /*
            ========= Code for main program begins =========
      */

      // First get the mach task for the pid
      mach_port_t t;
      int status = get_task_for_pid(g_pid, &t);

      //Bail out rather than reporting a finished dump when we have no task
      if (status != 0)
        exit(-1);

      //Get all of the regions in memory.
      get_regions_for_task(&t);

      //For each region, check the validity and then dump/search
      //Note that MAX_REGIONS is statically defined, it will usually be
      //higher than the true number of regions.
      for (int i = 0; i<MAX_REGIONS; i++)
      {
        //Skip invalid regions
        if (!region_is_valid(i))
          continue;

        //Print region data if verbose
        if (verbose)
        {
          printf("Obtained Region %d.\n", i+1);
          printf(
            "\t--->Begins: %p.\n\t--->Ends: %p.\n",
            (const void *)MS_Regions[i].begin,
            (const void *)MS_Regions[i].end
          );
        }
        else
        {
          //If not verbose do a funky percentage.
          int percentage = ((double)i / (double)count) * 100;
          printf("\rProgress: %d%%", percentage);
          fflush(stdout);
        }

        //Get pointer to memory for region.
        //We should know its size from the item in MS_Regions
        void *memory_blob = read_memory_for_region(&t, i, shouldDump);

        //If the read memory was successful
        if (memory_blob != NULL)
        {
          //TODO - Check for needle if desired
          //Free the memory once done
          free(memory_blob);
        }
      }

      printf("\rProgress: 100%%");
      fflush(stdout);
      printf("\nFinished.\n");
      if (shouldDump)
        printf("Dump written to: %s\n", o_File);
      return 0;
    }
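
The one unchecked item in the TODO list is the scan itself. A minimal sketch of how the `-s <INPUT_FILE>` path could work is shown here, assuming the needle is read from the file as raw bytes and compared against each region buffer. `load_needle` and `find_needle` are hypothetical helpers, not functions from the MEMSCAN source, and the sketch uses only the C standard library so that it stays independent of the Mach calls above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
  Read the whole needle file into a heap buffer (binary safe).
  Returns NULL on failure or for an empty file; the caller frees the result.
  Hypothetical helper.
*/
static unsigned char *load_needle(const char *path, size_t *len_out)
{
  FILE *f = fopen(path, "rb");
  if (f == NULL)
    return NULL;

  unsigned char *data = NULL;
  size_t len = 0, cap = 0;
  for (;;) {
    //Grow the buffer whenever it is full
    if (len == cap) {
      size_t new_cap = cap ? cap * 2 : 256;
      unsigned char *tmp = realloc(data, new_cap);
      if (tmp == NULL) {
        free(data);
        fclose(f);
        return NULL;
      }
      data = tmp;
      cap = new_cap;
    }
    size_t n = fread(data + len, 1, cap - len, f);
    len += n;
    if (n == 0)
      break;  //EOF or read error
  }

  int had_error = ferror(f);
  fclose(f);
  if (had_error || len == 0) {
    free(data);
    return NULL;
  }
  *len_out = len;
  return data;
}

/*
  Find the next occurrence of needle in buf at or after offset 'from'.
  Returns the offset of the match, or (size_t)-1 when there are none left.
  Hypothetical helper.
*/
static size_t find_needle(const unsigned char *buf, size_t buf_len,
                          const unsigned char *needle, size_t needle_len,
                          size_t from)
{
  assert(buf != NULL && needle != NULL);
  if (needle_len == 0 || needle_len > buf_len)
    return (size_t)-1;

  size_t last_start = buf_len - needle_len;
  while (from <= last_start) {
    //Jump to the next candidate first byte, then compare the full needle
    const unsigned char *p =
      memchr(buf + from, needle[0], last_start - from + 1);
    if (p == NULL)
      break;
    size_t off = (size_t)(p - buf);
    if (memcmp(p, needle, needle_len) == 0)
      return off;
    from = off + 1;
  }
  return (size_t)-1;
}
```

In the loop in main(), the TODO could then call `find_needle` repeatedly on `memory_blob`, passing the previous match offset plus one as `from`, and report `MS_Regions[i].begin + offset` for each hit. Note that read_memory_for_region does not currently hand the read size back to its caller, so that would need to change first.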

Version 2

As noted above, version 2 of MEMSCAN will be written in JavaScript because that’s apparently how you do things now. Well, actually, I’m going to use JavaScript so that I can make use of the Frida engine and API. I might actually write it in Swift, but if I go the JavaScript route it will work for Android applications too.

Below is the general plan for version 2:

  • It needs to perform better than v1.
    • Threading
    • ARM64 massively increases the address space so this is important.
  • Should outright dump regions or entire memory space as v1 does
    • Should allow selective dumping of memory, e.g. ignore libraries etc.
  • Should allow the researcher to look for a specific thing (like v1)
  • Should look for generic things
    • Can be clever, looking for common patterns such as keys/password terminology
    • Can look for byte patterns exhibited by formats such as DER/PEM etc. (if possible).
      • Going a step further, it would be interesting to be able to replace these, which seems achievable with the API too. (Think replacing certificates in memory with your own.)
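
To make the "generic things" idea a little more concrete, here is a hedged sketch of a format heuristic, written in C to match the listing above rather than in the JavaScript planned for version 2. It flags two kinds of candidate: the ASCII marker `-----BEGIN ` that opens a PEM block, and a DER SEQUENCE header (tag 0x30) with a two-byte long-form length (0x82) whose declared body fits in the remaining buffer. `blob_kind`, `classify_at` and `next_candidate` are hypothetical names, and a hit is only a candidate: plenty of unrelated data will match the DER check by chance.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What a candidate offset looks like (hypothetical enum). */
enum blob_kind { BLOB_NONE = 0, BLOB_PEM, BLOB_DER_SEQ };

/* Classify whatever starts at buf[off]. Heuristic only. */
static enum blob_kind classify_at(const unsigned char *buf, size_t len,
                                  size_t off)
{
  static const char pem_marker[] = "-----BEGIN ";
  const size_t pem_len = sizeof pem_marker - 1;

  assert(buf != NULL);
  if (off >= len)
    return BLOB_NONE;
  size_t remaining = len - off;

  /* PEM: textual armour line, e.g. "-----BEGIN CERTIFICATE-----". */
  if (remaining >= pem_len && memcmp(buf + off, pem_marker, pem_len) == 0)
    return BLOB_PEM;

  /* DER: SEQUENCE tag 0x30, then 0x82 meaning "length is in the next two
     bytes", stored big-endian. */
  if (remaining >= 4 && buf[off] == 0x30 && buf[off + 1] == 0x82) {
    size_t body = ((size_t)buf[off + 2] << 8) | buf[off + 3];
    /* DER uses the shortest length form, so 0x82 implies at least 256
       bytes of body. The body must also fit in what is left. */
    if (body >= 256 && body <= remaining - 4)
      return BLOB_DER_SEQ;
  }
  return BLOB_NONE;
}

/*
  Find the first candidate at or after 'from'.
  Returns its offset and sets *kind, or returns (size_t)-1 with
  *kind == BLOB_NONE when nothing is found.
*/
static size_t next_candidate(const unsigned char *buf, size_t len,
                             size_t from, enum blob_kind *kind)
{
  for (size_t off = from; off < len; off++) {
    enum blob_kind k = classify_at(buf, len, off);
    if (k != BLOB_NONE) {
      *kind = k;
      return off;
    }
  }
  *kind = BLOB_NONE;
  return (size_t)-1;
}
```

Checking the declared length against the remaining bytes cheaply filters out many false positives. Objects shorter than 256 bytes use a different length encoding and are deliberately ignored here.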