19 November, 2020

EurekaLog causes Integer Overflow?

We were contacted by a person who complained that his application worked fine until he added EurekaLog to it. Once EurekaLog was added to the project, an Integer Overflow exception was raised. The exception occurred inside the _UStrCatN function (the RTL function for concatenating multiple strings).

The _UStrCatN function is written in assembler, but if you delve into the meaning of the checks, you get something like this:
  DestLen := {... the length of the resulting string is calculated ...};
  if DestLen < 0 then
    _IntOver;
where _IntOver is the RTL's function that raises the Integer Overflow exception.

What's happening? How can a string length be negative? Is this a bug in EurekaLog?

The check inside _UStrCatN is designed to limit strings to 2 GB of memory: if the concatenation result would be longer than 2 GB, a literal integer overflow occurs while the lengths are summed, so the final length becomes negative. Thus, an Integer Overflow exception can be raised when concatenating large strings (when the result is too large).
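To make the wrap-around concrete, here is a minimal sketch of summing lengths in a signed 32-bit Integer. The helper SumLengths is hypothetical and only mimics the arithmetic; it is not the RTL code. Overflow checking is switched off so that the wrapped value can be printed instead of raising an exception.

```delphi
program LengthOverflow;
{$APPTYPE CONSOLE}
{$Q-}{$R-} // checks off: we want to observe the wrapped value

// Hypothetical helper: adds lengths in a signed 32-bit accumulator.
function SumLengths(const Lengths: array of Integer): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := Low(Lengths) to High(Lengths) do
    Result := Result + Lengths[I];
end;

begin
  // 2.5 billion does not fit into a signed 32-bit Integer, so the sum wraps
  Writeln(SumLengths([1500000000, 1000000000]));  // -1794967296
  // Short strings produce an ordinary positive length
  Writeln(SumLengths([5, 3, 12]));                // 20
end.
```

A negative total is exactly the condition that the DestLen < 0 check looks for.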

But what does EurekaLog have to do with it then? And how can the check trigger when we concatenate short strings? (The client confirmed this with a log.)

Such a "false-positive" response is possible if you perform an operation on an already deleted string.

Look at this code:
  var
    Marker: String;

  function ReadLine: String;
  begin
    // ...
    Marker := { ... };
    // ...
  end;

  begin
    // ...
    Data := Data + Marker + ReadLine;
    // ...
  end;
Do you see a problem in this code?

To understand the problem, you need to know how the "Data := Data + Marker + ReadLine;" line is executed. It looks something like this in pseudo-code:
  Param0 := Pointer(Data);
  Param1 := Pointer(Marker);
  Param2 := Pointer(ReadLine);
  _UStrCatN(Data, [Param0, Param1, Param2]);
In other words, the operator stores pointers to arguments sequentially before calling the function.

So, here's the bug: the statement stores a pointer to the Marker string, but the Marker string is changed inside the ReadLine function. This means that the stored pointer will point to the old string. Thus, the already deleted string will be sent to the _UStrCatN function.

Note that this bug is not a "problem" without EurekaLog in the project. Deleted memory is simply marked as "free", but its content is not cleared. This means that _UStrCatN will successfully concatenate with the already deleted string, and the result of the operation will most likely be correct. That is, there is a bug in the code, but it is completely invisible, since the program appears to function perfectly.

The situation changes radically if EurekaLog (or any other tool for debugging memory problems) is added to the project. By default, memory checks are enabled in EurekaLog. This means that deleted memory is overwritten with a fill pattern, typically something like DEADBEEF. The length field of the deleted string is overwritten too, so _UStrCatN reads the pattern as the string's length. Note that the Integer representation of DEADBEEF is negative (equal to -559038737). Therefore adding the lengths of several short strings to this number also produces a negative number, and the overflow check fires.
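The arithmetic can be checked with a small sketch. The helper AsSignedLength and the sample lengths 4 and 20 are made up for illustration; the only value taken from the text is the DEADBEEF pattern.

```delphi
program DeadBeefLength;
{$APPTYPE CONSOLE}
{$Q-}{$R-} // checks off: we want to see the raw values

// Hypothetical helper: reinterprets a 32-bit fill pattern as a signed length.
function AsSignedLength(Pattern: Cardinal): Integer;
begin
  Result := Integer(Pattern);
end;

var
  GarbageLen, DestLen: Integer;
begin
  // What the length field of the deleted string looks like after the fill
  GarbageLen := AsSignedLength($DEADBEEF);
  // Assumed lengths: Data = 4 chars, deleted Marker = garbage, ReadLine = 20 chars
  DestLen := 4 + GarbageLen + 20;

  Writeln(GarbageLen);   // -559038737
  Writeln(DestLen);      // -559038713
  Writeln(DestLen < 0);  // TRUE
end.
```

No realistic pair of short strings can pull the total back above zero, which is why the exception shows up even though nothing is anywhere near 2 GB.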

In other words, if EurekaLog is added to the project, then the operation with the already deleted string will no longer be successful. A previously hidden bug is now visible.
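One possible way to remove the bug, as a minimal sketch rather than a definitive fix, is to keep side effects out of the concatenation expression: call ReadLine first and store its result in a local variable. The body of ReadLine and the AppendLine procedure below are invented for illustration. This version deliberately keeps the old Marker value by copying it into a local, which holds a counted reference and keeps the old string alive. If the new value is what you want, simply read Marker after the ReadLine call instead.

```delphi
program SafeConcat;
{$APPTYPE CONSOLE}

var
  Marker: String;  // global that ReadLine modifies, as in the article
  Data: String;

// Stand-in for the article's ReadLine; the body is an assumption.
function ReadLine: String;
begin
  Marker := Marker + '!';  // Marker now refers to a different string
  Result := 'line';
end;

procedure AppendLine;
var
  OldMarker, Line: String;
begin
  OldMarker := Marker;  // counted reference: the old text cannot be freed under us
  Line := ReadLine;     // the side effect on Marker happens here, before concatenation
  Data := Data + OldMarker + Line;
end;

begin
  Marker := '#';
  Data := 'data';
  AppendLine;
  Writeln(Data);    // data#line
  Writeln(Marker);  // #!
end.
```

With the call hoisted out, every operand passed to the concatenation is owned by a live variable for the whole statement.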