The Windows Driver Developer's Digest
Security Notebook

Numeric Overflows in Complex IOCTL Handlers
December 15, 2003

 

As a conscientious reader of WD-3, you do know you're supposed to check the length of any buffers you get from user mode calls to DeviceIoControl, don't you. DON'T YOU? In this edition of the WD-3 Security Notebook, I want to discuss a wrinkle in length checking logic your driver may have to overcome when you add together two integers that both come from user mode. You can cut right to the chase by following this link, or you can read the suspenseful buildup...

Stuff you should already know

Let's say you define an IOCTL that requires the user-mode program to create a INSTRUCT input parameter. Your driver will process the data in the input structure and then give back a OUTSTRUCT as the output. The user-mode program is supposed to do something like this:

HANDLE hDevice;
INSTRUCT in;
OUTSTRUCT out;
DWORD junk;
[initialize "in" structure]
DeviceIoControl(hDevice, IOCTL_MYDRIVER_DO_SOMETHING, &in, sizeof(in), &out, sizeof(out), &junk, NULL);

To keep things simple, let's suppose that you declared IOCTL_MYDRIVER_DO_SOMETHING to use METHOD_BUFFERED. Your driver dispatch routine for IRP_MJ_DEVICE_CONTROL might then look something like this:

NTSTATUS DispatchControl(PDEVICE_OBJECT fdo, PIRP Irp)
  {
  PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
  ULONG cbin = stack->Parameters.DeviceIoControl.InputBufferLength;
  ULONG cbout = stack->Parameters.DeviceIoControl.OutputBufferLength;
  ULONG code = stack->Parameters.DeviceIoControl.IoControlCode;
  NTSTATUS status = STATUS_SUCCESS;
  ULONG info = 0;

  switch (code)1
    {

  case IOCTL_MYDRIVER_DO_SOMETHING:
    {
    PINSTRUCT pin = (PINSTRUCT) Irp->AssociatedIrp.SystemBuffer;2
    POUTSTRUCT pout = (POUTSTRUCT) Irp->AssociatedIrp.SystemBuffer;
    if (cbin < sizeof(*pin))3
      {
      status = STATUS_INVALID_PARAMETER;
      break;
      }
    if (cbout < sizeof(*pout))
      {
      status = STATUS_BUFFER_TOO_SMALL;4
      break;
      }


    . . .

    info = sizeof(*pout);5
    break;
    }

  default:
    status = STATUS_INVALID_DEVICE_REQUEST;6
    break;
    }

  Irp->IoStatus.Status = status;
  Irp->IoStatus.Information = info;
  IoCompleteRequest(Irp, IO_NO_INCREMENT);
  return status;
  }

Validating the buffer lengths, as shown here, is an essential step in making the system secure against errant or malicious programs. The I/O Manager will make sure that the DeviceIoControl arguments are self-consistent. That is, the I/O Manager will make sure that the input and output buffer pointers are valid and point to user-mode memory buffers that are at least as long as the respective byte counts. But the I/O Manager cannot know how long these buffers should be. Only you know that, which is why your driver has to make these checks. If the user gave you too small an input buffer, you might access memory past the end of the allocated system buffer. That could cause a page fault (and thence a BSOD), or it could allow the user-mode program to get a copy of some secret stuff that it wasn't supposed to see. If the user gave you too small an output buffer, you might try to write past the end of the allocated system buffer. That could cause a page fault or corruption of memory that belongs to some other kernel component.

Stuff you might not know yet

Now let's make the example more complicated. Suppose that, instead of passing a fixed-size input buffer, the user-mode application needs to give you some variable-length data. For example, let's suppose that the input parameter to a particular IOCTL consists of a fixed-size structure header followed by a string. One poor way to design the interface for this IOCTL would be to declare an input structure this way:

typedef struct _BADINSTRUCT {
  ULONG SomeStuff;
  ULONG SomeOtherStuff;
  LPSTR TheString;
  ULONG StillMoreStuff;
  } BADINSTRUCT, *PBADINSTRUCT;

There are at least four things wrong with a design based on this structure:

  1. Since the kernel works with UNICODE strings, we should really be passing a wide string value instead of an ANSI string. So TheString ought to be a LPWSTR.

  2. We (probably) don't expect the driver to change the string, so we ought to declare TheString as a constant string pointer: LPCWSTR. Yes, this is a bit picky.

  3. The driver is going to have to setup a structured exception frame to (a) call ProbeForRead on the value of TheString, and (b) access the string data itself. This is because a driver must never just trust a user-mode pointer. It would be better to make TheString be an integer giving the offset from the beginning of the input structure to the start of the string.

  4. Giving a driver a null-terminated string argument is a horrible idea because the driver has no way to locate the null terminator that doesn't risk a page fault. It would be better to specify the length of the string explicitly.

To cure these defects, I would declare the input structure like this:

typedef struct _INSTRUCT {
  ULONG SomeStuff;
  ULONG SomeOtherStuff;
  ULONG StringLength;
  ULONG StringOffset;
  ULONG StillMoreStuff;
  WCHAR TheString[1];7
  } INSTRUCT, *PINSTRUCT;

The application code to construct the input parameter for this hypothetical IOCTL is a bit tortured:

LPCWSTR s; // <== someone gives us this
ULONG size = sizeof(INSTRUCT) + wcslen(s) * sizeof(WCHAR);
PINSTRUCT pin = (PINSTRUCT) malloc(size);
[initialize fixed-length parts of pin]
pin->StringLength = wcslen(s) * sizeof(WCHAR);
pin->StringOffset = FIELD_OFFSET(INSTRUCT, TheString);
wcscpy((LPWSTR) ((PUCHAR) pin + pin->StringOffset), s);8
. . .
DeviceIoControl(. . ., pin, size, . . .);

free(pin);

Tortured it may be, but this code generates an input buffer that's entirely self-contained: it has no pointers to memory outside the boundaries of the buffer. Furthermore, the input buffer is self-relocating because it uses numeric offsets instead of pointers.

So now let's write the driver code that verifies the input parameter. Here's my first cut at this code:

case IOCTL_MYDRIVER_DO_SOMETHING:
  {
  PINSTRUCT pin = (PINSTRUCT) Irp->AssociatedIrp.SystemBuffer;
  if (cbin < sizeof(*pin)
    || pin->StringOffset >= cbin
    || pin->StringOffset + pin->StringLength > cbin)     // <== don'
t do this
    {
    status = STATUS_INVALID_PARAMETER;
    break;
    }
  . . .
  break;
  }

The idea behind this code is to first verify that I've been given something that's big enough to hold all the fixed fields of the input structure. If so, I then verify that the variable length string data commences inside the input structure (but not at or beyond the end). If so, I verify finally that the string ends no later than the end of the input buffer.9

While this code looks bulletproof, there's a subtle error in it. I didn't see the error until Micky Snir of Microsoft pointed it out. We're adding together two integers that we got from user mode (StringOffset and StringLength). It's entirely possible that the sum will overflow the integer accumulator and yield a result that's smaller than whatever we're comparing it against. As a concrete example, suppose that StringOffset is 0x00000014 and cbin is 0x00001000, but that StringLength is the gigantic value 0xFFFFFFF0. The sum of StringOffset and StringLength overflows and yields 0x00000004, which is plainly less than cbin. Our driver is therefore going to think that it can safely address 0xFFFFFFF0 bytes of string data beginning at offset 0x00000014 into the input structure. But most of that memory doesn't exist, so we're quite likely to page fault.

We can cure this particular problem by simply not adding the two values, as follows:

case IOCTL_MYDRIVER_DO_SOMETHING:
  {
  PINSTRUCT pin = (PINSTRUCT) Irp->AssociatedIrp.SystemBuffer;
  if (cbin < sizeof(*pin)
    || pin->StringOffset >= cbin
    || pin->StringLength > cbin - pin->StringOffset)    // <== this is okay
    {
    status = STATUS_INVALID_PARAMETER;
    break;
    }
  . . .
  break;
  }

This is okay because, by the time we do the subtraction "cbin - pin->StringOffset", we already know that pin->StringOffset is strictly less than cbin. The difference cannot, therefore, underflow.

You'll note that all we had to do to solve this particular problem was to move a term from the left side of the inequality to the right side. You know from algebra that moving terms doesn't affect the inequality. All we've really done, therefore, is to make sure that the calculation can't overflow or underflow when carried out in the limited word size of our computer.

Moral

The chances of being eaten alive by an integer overflow on Main Street are high enough to worry about.


1 -- Always switch on the full 32 bits of the control code. If you don't, a malicious or buggy program could (a) specify a control code that contains the wrong buffering method, which would cause you to crash or corrupt kernel memory, or (b) specify a control code containing unintended access control flags, which would allow them to use a function for which they don't actually have the right privileges.

2 -- With METHOD_BUFFERED, the system buffer has the input data when you get called. You're supposed to put the output data into this same buffer and set IoStatus.Information to the number of bytes you return.

3 -- Saying "sizeof(*pin)" is the same as saying "sizeof(INSTRUCT)" because pin points to a INSTRUCT. Less typing is good.

4 -- STATUS_BUFFER_TOO_SMALL is an error. If you complete an IRP with this status, the I/O Manager won't copy any data back to user mode.The similar code STATUS_BUFFER_OVERFLOW is just a warning. If you complete a METHOD_BUFFERED control with STATUS_BUFFER_OVERFLOW, the I/O Manager will copy IoStatus.Information bytes back to the user-mode buffer.

5 -- Especially with METHOD_BUFFERED, make sure to set IoStatus.Information to the number of bytes you've actually altered in the system buffer. If Information is too large, the I/O Manager might copy uninitialized kernel data -- possibly including passwords or the like -- back to user mode. If it's too small, the I/O Manager won't copy all of the data back to user mode, which may cause the user-mode program to crash.

6 -- Always fail an IRP_MJ_DEVICE_CONTROL that presents an invalid control code. Not only is this a good programming practice, but the HCT tests will also send you randomly chosen control codes to make sure you do it.

7 -- In this version of the structure, TheString is a placeholder for a variable-length array of characters.

8 -- Note that size includes a null terminator for TheString -- that's why I gave TheString an explicit dimension of "1". Thus, it's safe to call wcscpy to copy the string. Since this particular input structure has just one variable-length component, I could as well have coded "wcscpy(pin->TheString, s)". I wanted, however, to show you how to use the byte offset in a more general case.

9 -- It would also be a good idea to verify that StringLength is an even number. We're quite likely to use this value as the BufferLength and Length values for a UNICODE_STRING structure, but it's potentially a big error to use an odd number that way. I mean, what does it mean to say "this buffer has three and one-half characters in it" when we set Length equal to 7?