MEMSCAN is back

MEMSCAN 1.4

So, admittedly it’s been a while since I’ve done anything with MEMSCAN. It’s been neglected for a couple of reasons.

Firstly, I’ve been super busy inside and outside of Cigital.

Secondly, many of the most recent iOS jailbreaks did not include the task_for_pid 0 patch, which MEMSCAN depends on.

The latter was true until the most recent Pangu jailbreak for 9.2-9.3.3 (arm64 only). Pangu included the necessary patch in the jailbreak, allowing MEMSCAN to work again.

I was actually just starting to prepare a version 2 of MEMSCAN, which I’ll describe a little further down. Essentially it would be a total rewrite that piggybacks on the awesome Frida.re project by NowSecure.

Since MEMSCAN now works again I decided that I would complete my initial plan of tidying up the code and fixing a couple of issues.

There are, however, still some outstanding issues that I’d like to work on in version 1. Namely, MEMSCAN is pretty inefficient when it comes to scanning memory. Dumping memory to disk is fairly trivial though. Below are the changes I’d like to make before finalising version 1:

  1. When dumping memory, I’d like to offer the option of only including the app’s memory/heap, rather than the entire memory space of the process.
  2. When scanning memory for a given secret, I’d like to make things more efficient via threading.
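
To illustrate the threading idea in item 2, here is a minimal sketch, assuming a region has already been read into a local buffer. It is not MEMSCAN code: `scan_job`, `scan_worker`, `threaded_scan` and `SCAN_THREADS` are hypothetical names, and the worker count of 4 is arbitrary. Each thread owns a range of match start offsets but is allowed to read past the end of its range, so a secret that straddles two chunks is still counted exactly once.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define SCAN_THREADS 4  /* Arbitrary worker count chosen for this sketch */

/* Work item for one thread: it checks match start offsets in [start, stop) */
struct scan_job {
  const unsigned char *buf;
  size_t buf_len;
  const unsigned char *needle;
  size_t needle_len;
  size_t start;
  size_t stop;
  size_t hits;
};

void *scan_worker(void *arg)
{
  struct scan_job *job = arg;
  job->hits = 0;
  for (size_t i = job->start; i < job->stop; i++)
  {
    /* A match may run past stop, but never past the end of the buffer */
    if (i + job->needle_len > job->buf_len)
      break;
    if (memcmp(job->buf + i, job->needle, job->needle_len) == 0)
      job->hits++;
  }
  return NULL;
}

/* Returns total occurrences, or (size_t)-1 if a thread could not be created */
size_t threaded_scan(const unsigned char *buf, size_t buf_len,
                     const unsigned char *needle, size_t needle_len)
{
  if (needle_len == 0 || buf_len < needle_len)
    return 0;

  pthread_t threads[SCAN_THREADS];
  struct scan_job jobs[SCAN_THREADS];
  size_t chunk = (buf_len + SCAN_THREADS - 1) / SCAN_THREADS;
  size_t total = 0;
  int started = 0;
  int failed = 0;

  for (int t = 0; t < SCAN_THREADS; t++)
  {
    size_t start = (size_t)t * chunk;
    if (start >= buf_len)
      break;
    size_t stop = start + chunk;
    if (stop > buf_len)
      stop = buf_len;

    jobs[t] = (struct scan_job){ buf, buf_len, needle, needle_len,
                                 start, stop, 0 };
    if (pthread_create(&threads[t], NULL, scan_worker, &jobs[t]) != 0)
    {
      failed = 1;
      break;
    }
    started++;
  }

  /* Wait for every worker that was started, then add up the hits */
  for (int t = 0; t < started; t++)
  {
    pthread_join(threads[t], NULL);
    total += jobs[t].hits;
  }
  return failed ? (size_t)-1 : total;
}
```

Splitting by start offset rather than by raw byte range avoids both missed and double-counted matches at chunk boundaries. Whether this is actually faster than a single pass would need measuring, since reading the memory out of the target may well dominate the run time.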

Both of these are likely to feature as part of version 2, so you may be wondering why I don’t just work on version 2 and release that.

I am still a big fan of the way MEMSCAN 1 works. Unlike other tools/scripts which rely on Frida/Cycript/injection/debuggers, MEMSCAN neither attaches to nor injects into a target process. This is particularly useful for apps that have many layers of tamper protection. If I want to dump the memory of the target at a specific time, I don’t need to worry about circumventing all of these protections. This was ultimately the reason I wrote MEMSCAN in the first place.

Clean Up

The main source code of MEMSCAN has already been refactored; however, I still have to refactor and include the scanning code.

The current source is shown below:

/*
  @Author:  Grant Douglas <@Hexploitable>
  @Title:   MEMSCAN
  @Desc:    A tool for memory analysis of iOS and OSX applications
*/

//Lib includes
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
#include <mach-o/dyld_images.h>
#include <string.h>
#include <limits.h>
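#include <errno.h>
#include <stdbool.h>    //bool, true, false
#include <mach/mach.h>  //task_for_pid, mach_error_string, vm_deallocate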

//Local includes
#include "colour_defs.h"
#include "vm_defs.h"
#include "banner.h"

//  "Globals"
static int g_pid = 0;
static char *o_File = NULL;
static int pid = 0;
static int verbose = 0;
static int count = 0;


#define MAX_REGIONS 1000
//  Prints the usage of memscan
#define USAGE \
printf("Usage:\n-------\n"); \
printf("Verbose mode: -v\n"); \
printf("Dump memory to a file: memscan [-p <PID>] -d [-o <outputFile>]\n"); \
printf("Search memory for a sequence of bytes: memscan [-p <PID>] -s <INPUT_FILE>\n");


/*
  TODO LIST
  [X] Task/PID stuff
  [X] Get regions for task
  [X] Get memory for region
  [X] Dump memory
  [ ] Scan memory
*/


/*
  Structure to hold all region data
*/
struct {
  new_vm_address_t begin;
  new_vm_address_t end;
} MS_Regions[MAX_REGIONS];


/*
  Get the mach task for a given PID.
  Also verifies that the PID exists.
  @returns 0 on success and stores the task in *t. Returns -1 if the PID is
  invalid, or -2 if task_for_pid() fails.
*/
static int get_task_for_pid(int pid, task_t *t)
{
  //Check PID is valid
  if (!(getpgid(pid) >= 0))
  {
    printf(
      "PID: %d, is invalid\n",
      pid
    );
    return -1;
  }

  //Attempt to get task
  kern_return_t kret;
  kret = task_for_pid(mach_task_self(), pid, t);
  if (kret != KERN_SUCCESS)
  {
    printf(
      "Failed to get task for PID: %d. Error: %s\n",
      pid,
      mach_error_string(kret)
    );
    return -2;
  }
  return 0;
}


/*
  Using the task reference, get all regions in the process space.
  @stores the region data in the MS_Regions array.
*/
static void get_regions_for_task(mach_port_t *t)
{
#ifdef __arm64__
   mach_vm_size_t size;
#else
   vm_size_t size;
#endif

  kern_return_t kret;
  vm_region_extended_info_data_t info;
  mach_msg_type_number_t info_count = VM_REGION_EXTENDED_INFO_COUNT;
  mach_port_t object_name;
  new_vm_address_t address;

  address = 0;

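  //Stop once the table is full so we never write past MS_Regions
  while (count < MAX_REGIONS) {
    //info_count is an in/out parameter, so reset it before every call
    info_count = VM_REGION_EXTENDED_INFO_COUNT;

    //Attempt to get the next region
    kret = new_vm_region(*t,
      &address,
      &size,
      VM_REGION_BASIC_INFO,
      (vm_region_info_t) &info,
      &info_count,
      &object_name
    );

    //KERN_INVALID_ADDRESS means we are past the last region. Any other
    //error also ends the walk, as address would not advance and we would
    //loop forever. Unused entries stay zeroed because MS_Regions is global.
    if (kret != KERN_SUCCESS)
      break;

    //Store region data
    MS_Regions[count].begin = address;
    MS_Regions[count].end = address+size;

    count++;
    address += size;
  }
  return;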
}


/*
  Basic checks to see if region was read OK.
*/
static bool region_is_valid(int region)
{
  //Check region is valid
  if (MAX_REGIONS <= region)
  {
    if (verbose)
      printf("Index out of bounds (Exceeds MAX_REGIONS): %d.\n", region);
    return false;
  }
  //Check if struct entry is valid
  if (MS_Regions[region].begin == 0 || MS_Regions[region].end == 0)
    return false;

  return true;
}


/*
  Dump buffer to disk
*/
static void dump_buffer_to_disk(int region, const void *buffer, size_t sz)
{
  if (verbose)
    printf("Dumping region %d, buffer size: %zu\n", region+1, sz);

  //Create section break header. Zero-fill it so the fixed-width header
  //never contains uninitialised stack bytes.
  const char *name = "Region: ";
  char cated_string[17] = {0};
  snprintf(cated_string, sizeof(cated_string), "%s%d", name, region+1);

  //Get file output handler
  FILE *f = fopen(o_File, "a+");
  if (f == NULL)
  {
    printf("Failed to open %s: %s\n", o_File, strerror(errno));
    return;
  }

  //Write section break to output file.
  fwrite(cated_string, sizeof(cated_string), 1, f);

  //Write the data out to the file handler
  fwrite(buffer, sz, 1, f);
  fclose(f);
}


/*
  Read memory for region, using the mach_port_t ref
  Also dumps to disk, if so desired
*/
static void *read_memory_for_region(mach_port_t *t,
  int region,
  bool should_dump_to_disk
) {
#ifdef __arm64__
    vm_offset_t strt = 0;
    mach_msg_type_number_t sz = 0;
    mach_vm_size_t size;
#else
    pointer_t strt = 0;
    uint32_t sz = 0;
    vm_size_t size;
#endif

  kern_return_t kr_val;

  size = MS_Regions[region].end - MS_Regions[region].begin;
  //Attempt to read the memory from the region
  kr_val = new_vm_read(*t,
    (new_vm_address_t)MS_Regions[region].begin,
    size,
    &strt,
    &sz
  );

  //Return NULL if failure, this is bad, we need to restructure this func.
  //Should return status code, and store result in received reference.
  if (kr_val != KERN_SUCCESS)
    return NULL;

  //If verbose, print the read size
  if (verbose)
    printf("\t--->Size of read: %u\n\n", sz);

  //Dump the region to disk if the user has asked us to do so
  if (should_dump_to_disk)
    dump_buffer_to_disk(region, (const void *)strt, sz);

  //Create a new buffer to hand back and copy the data read into it
  void *buffer = calloc(1, sz);
  if (buffer != NULL)
    memcpy(buffer, (const void *)strt, sz);

  //The kernel mapped the data into our own address space. Release that
  //mapping now that we have a copy, otherwise every region read leaks.
  vm_deallocate(mach_task_self(), (vm_address_t)strt, sz);

  //Return our pointer (NULL if the allocation failed)
  return buffer;
}


/*
  Application main() -> Handles arguments and invoking the other functions
*/
int main(int argc, char** argv) {
  BANNER

  //Set some defaults
  bool shouldDump = false;
  char *i_File = NULL;

  //Get and process all of the command line options
  while (1)
  {
      int c;  //getopt returns an int; an unsigned char never equals -1
      c = getopt(argc, argv, "ds:o:vp:");
      if (c == -1)
          break;
      switch (c)
      {
          case 'p':
              pid = atoi(optarg);
              break;
          case 'd':
              shouldDump = true;
              break;
          case 'v':
              verbose = 1;
              break;
          case 'o':
              o_File = optarg;
              break;
          default:
              USAGE
              break;
      }
  }

  argc -= optind;
  argv += optind;

  //If they are not dumping, and do not have an input file for searching.
  if (!shouldDump && i_File == NULL)
  {
      USAGE
      exit(-1);
  }

  //If user didnt specify a file, default to memscan_output_<PID>.bin
  if (o_File == NULL)
  {
    //Static storage, because o_File is used long after this block ends.
    //64 bytes is ample for the fixed text plus any int PID.
    static char default_name[64];
    snprintf(default_name, sizeof(default_name),
      "memscan_output_%d.bin", pid);

    o_File = default_name;
  }

  //Create the file fresh, truncates it if exists.
  FILE *f = fopen(o_File, "w+");
  if (f == NULL)
  {
    printf("Failed to create %s: %s\n", o_File, strerror(errno));
    exit(-1);
  }
  fclose(f);

  g_pid = pid;

  /*
        ========= Code for main program begins =========
  */

  // First get the mach task for the pid
  mach_port_t t;
  int status = get_task_for_pid(g_pid, &t);

  //If we got the task ref OK...
  if (status == 0)
  {
    //Get all of the regions in memory.
    get_regions_for_task(&t);

    //For each region, check the validity and then dump/search
    //Note that MAX_REGIONS is statically defined, it will usually be
    //higher than the true number of regions.
    for (int i = 0; i<MAX_REGIONS; i++)
    {
      //Skip invalid regions
      if (!region_is_valid(i))
        continue;

      //Print region data if verbose
      if (verbose)
      {
        printf("Obtained Region %d.\n", i+1);
        printf(
          "\t--->Begins: %p.\n\t--->Ends: %p.\n",
          (const void *)MS_Regions[i].begin,
          (const void *)MS_Regions[i].end
        );
      }
      else
      {
        //If not verbose do a funky percentage.
        int percentage = ((double)i / (double)count) * 100;
        printf("\rProgress: %d%%", percentage);
        fflush(stdout);
      }

      //Get pointer to memory for region.
      //We should know its size from the item in MS_Regions
      void *memory_blob = read_memory_for_region(&t, i, shouldDump);

      //If the read memory was successful
      if (memory_blob != NULL)
      {
        //TODO - Check for needle if desired
        //Free the memory once done
        free(memory_blob);
        continue;
      }
    }
  }
  printf("\rProgress: 100%%");
  fflush(stdout);
  printf("\nFinished.\n");
  if (shouldDump)
    printf("Dump written to: %s\n", o_File);
  return 0;
}
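
The one unchecked item in the TODO list is the scan itself. As a rough sketch of what could go where the "Check for needle" comment sits, the helpers below search a buffer for a byte sequence. They are hypothetical and not part of the source above: `find_needle` and `count_needle` are names made up for this example, and the needle is assumed to have already been loaded from the input file into memory.

```c
#include <stddef.h>
#include <string.h>

/*
  Hypothetical helper: returns the offset of the first occurrence of needle
  in haystack at or after `from`, or -1 if there is none.
*/
long find_needle(const unsigned char *haystack, size_t hay_len,
                 const unsigned char *needle, size_t needle_len,
                 size_t from)
{
  if (needle_len == 0 || hay_len < needle_len)
    return -1;

  for (size_t i = from; i + needle_len <= hay_len; i++)
  {
    //Cheap first-byte check before the full comparison
    if (haystack[i] == needle[0] &&
        memcmp(haystack + i, needle, needle_len) == 0)
      return (long)i;
  }
  return -1;
}

/*
  Hypothetical helper: counts every occurrence of needle in haystack,
  including overlapping ones.
*/
size_t count_needle(const unsigned char *haystack, size_t hay_len,
                    const unsigned char *needle, size_t needle_len)
{
  size_t hits = 0;
  long at = find_needle(haystack, hay_len, needle, needle_len, 0);
  while (at >= 0)
  {
    hits++;
    at = find_needle(haystack, hay_len, needle, needle_len, (size_t)at + 1);
  }
  return hits;
}
```

In main(), the buffer returned by read_memory_for_region() could be passed as the haystack, with the region's end minus begin as its length. A hit's address in the target would then be the region's begin plus the returned offset. Raw bytes are compared rather than C strings because memory dumps are full of NUL bytes.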

Version 2

As noted above, version 2 of MEMSCAN will be written in JavaScript because that’s apparently how you do things now. Well, actually, I’m going to use JavaScript to make use of the Frida engine and API. I might actually write it in Swift, but if I go the JavaScript route it will work for Android applications too.

Below is the general plan for version 2:

  • It needs to perform better than v1.
    • Threading
    • ARM64 massively increases the address space so this is important.
  • Should outright dump regions or entire memory space as v1 does
    • Should allow selective dumping of memory, e.g. ignore libraries etc.
  • Should allow the researcher to look for a specific thing (like v1)
  • Should look for generic things
    • Can be clever, looking for common patterns such as keys/password terminology
    • Can look for byte patterns exhibited by formats such as DER/PEM etc (if possible).
      • Going a step further, it would be interesting to be able to replace these, which seems achievable with the API too. (Think replacing certificates in memory with your own.)
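
To make the DER/PEM idea a little more concrete, here is a rough sketch of the kind of check involved. It is written in C to match the v1 source above, even though v2 is planned for JavaScript, and the logic should port directly. `find_pem_header` and `find_der_sequence` are hypothetical names. The DER check is only a heuristic: it looks for a SEQUENCE tag (0x30) followed by a two-byte long-form length (0x82), which is how X.509 certificates typically begin, and it will produce false positives on arbitrary memory.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first PEM header at or after `from`, else -1 */
long find_pem_header(const unsigned char *buf, size_t len, size_t from)
{
  static const char marker[] = "-----BEGIN ";
  const size_t mlen = sizeof(marker) - 1;

  for (size_t i = from; i + mlen <= len; i++)
  {
    if (memcmp(buf + i, marker, mlen) == 0)
      return (long)i;
  }
  return -1;
}

/*
  Returns the offset of the first byte run that looks like a DER SEQUENCE
  with a two-byte length (0x30 0x82 hi lo) whose declared body fits inside
  the buffer, else -1. This is a heuristic, not a parser.
*/
long find_der_sequence(const unsigned char *buf, size_t len, size_t from)
{
  for (size_t i = from; i + 4 <= len; i++)
  {
    if (buf[i] != 0x30 || buf[i + 1] != 0x82)
      continue;

    size_t body = ((size_t)buf[i + 2] << 8) | buf[i + 3];

    /* DER uses the shortest length form, so 0x82 implies body >= 256 */
    if (body >= 256 && body <= len - (i + 4))
      return (long)i;
  }
  return -1;
}
```

Checking that the declared length fits in the remaining buffer discards a lot of coincidental 0x30 0x82 pairs. A stricter version could go on to validate the nested structure before reporting a hit.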