Friday 1 September 2017

Smart pointers

Introduction

Today, I had a discussion on StackOverflow (I hope no one deleted it yet) on the usefulness of smart pointers instead of try...finally.

My arguments are as follows:

1. try...finally decreases the readability of a function, especially if they are nested, e.g. when you instantiate a TStream and then a TStringList. Then you would need two nested try...finally constructs, thoroughly decreasing readability.

2. The code that destroys (frees) the object can be several lines removed from the place where it is created. Smart pointers allow you to take care of lifetime management right where you create the object.

3. Even if you are very good at manual memory management, taking care of it is still a chore, and it is always nice to have something, like ARC, or smart pointers, that can take that away from you. It greatly reduces the number of situations where you can make mistakes.

4. No matter how good you are, you can always forget to call Free, or call LongNamedObjectA.Free when you actually meant LongNamedObjectAB.Free. Such errors are hard to find, especially if they are several lines away from the actual creation spot. Using smartpointers, you don't have to repeat yourself this way, so you can't make such mistakes.

So if people tell you that only lazy or incompetent people would use smart pointers, or that people won't be able to understand your code if you use them, don't believe it. They greatly reduce the chance to make mistakes, even if you are competent at manual memory and lifetime management. They remove a lot of clutter and make your code more readable and let you concentrate on what you want to do, not on lifetime management anymore.

Implementation

I have seen quite a few smart pointer implementations. The nice thing about them is that they take care of automatically destroying the "guarded" object, so you don't have to take care of that yourself. So instead of something like:

var
  X: TMyObject;
begin
  X := TMyObject.Create(17);
  try
    // Several lines of code using X
  finally
    X.Free;
  end;
end;

Now should be able to do something like:

var
  X: TMyObject;
begin
  X := SmartPointer<TMyObject>(TMyObject.Create(17));
  // Several lines of code using X
end;

And at the end of the function, X is freed automatically.

Awkward to use

Unlike what I want to have, most implementations I have seen require you to declare the smart pointers as extra variables:

var
  X: TMyObject;
  SP: SmartPointer<TMyObject>;
begin
  X := SmartPointer<TMyObject>(SP, TMyObject.Create(17));
  // Several lines of code using X
end;

This is so for the ObjectHandle by Barry Kelly, as well as the ISafeGuard types in the JCL. But I never liked that. It makes you repeat yourself needlessly and makes the code less readable than it could be, taking away one of the advantages smart pointers provide, in my opinion.

Trick

So I employ a few tricks. The first one is that if a function or constructor returns a record, and this record is never assigned to a variable, the compiler still puts an anonymous record on the stack. That is because, for any record that is larger than a register, a record that is returned is actually passed as a var parameter. So the following:

function Bla(I, J: Integer): TMyRecord;

is actually compiled as:

procedure Bla(I, J: Integer; var Result: TMyRecord);

But that means that an actual record must be passed. If, in your code that uses the function, you don't assign the result of the function, the compiler creates an anonymous record on the stack and passes that, so inside your function you can assign values to the fields of Result.

Note that the use of the anonymous record mechanism is safe. It is also employed for intermediate results of class operators like + or *. If I use my BigIntegers, or Delphi's sample TComplex, then D := A + B * C; actually produces the code AnonymousIntermediate := op_Multiply(B, C); D := op_Add(A, AnonymousIntermediate); It will not change anytime soon.

Here comes the second trick: I want to return a TMyObject, and not a smart pointer record, so I can assign the result directly to X. For this, I (ab)use the Implicit operators you can define for a record. I define a

class operator Implicit(A: TMyObject): SmartPointer<TMyObject>;
and a
class operator Implicit(A: SmartPointer<TMyObject>): TMyObject;
So if I do the following:

X := SmartPointer<TMyObject>(TMyObject.Create(17));

then first, the first Implicit operator returns a record (a smart pointer). This is an anonymous record. But since X is a TMyObject, subsequently the other Implicit operator converts the anonymous record into a TMyObject again. How this "conversion" (actually just some passing around of a reference) is done can be seen in the code a bit further on.

So now I don't have to explicitly declare any smart pointers and can return and assign the created object in one fell swoop. The only thing to take care of is making this generic. That is what I have done in the following simple unit.

Many years ago, I implemented the original smart pointers in the JCL and called them Guards, and I want to remain faithful to that name. But since they are a little "smarter" and easier to use then those early implementations (which did not have overloaded operators or records with methods yet), I called them SmartGuards instead.

The unit

unit Velthuis.SmartGuards;

interface

type
  IGuard = interface
  ['{CE522D5D-41DE-4C6F-BC84-912C2AEF66B3}']
  end;

  TGuard = class(TInterfacedObject, IGuard)
  private
    FObject: TObject;
  public
    constructor Create(AObject: TObject);
    destructor Destroy; override;
  end;

  SmartGuard<T: class> = record
  private
    FGuard: IGuard;
    FGuardedObject: T;
  public
    class operator Implicit(GuardedObject: T): SmartGuard<T>;
    class operator Implicit(Guard: SmartGuard<T>): T;
  end;

implementation

{ TGuard }

constructor TGuard.Create(AObject: TObject);
begin
  FObject := AObject;
end;

destructor TGuard.Destroy;
begin
  FObject.Free;
  inherited;
end;

{ SmartGuard }

class operator SmartGuard<T>.Implicit(GuardedObject: T): SmartGuard<T>;
begin
  Result.FGuard := TGuard.Create(GuardedObject);
  Result.FGuardedObject := GuardedObject;
end;

class operator SmartGuard<T>.Implicit(Guard: SmartGuard<T>): T;
begin
  Result := Guard.FGuardedObject;
end;

end.

Sample code

Here is a simple (braindead?) program that uses them. It defines a simple but talkative class that shows when it is being destroyed, and which has a name. The demo shows how it can be used with the SmartGuard.

program SmartGuardDemo;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils,
  Velthuis.SmartGuards in 'Velthuis.SmartGuards.pas';

type
  TTalker = class
  private
    FName: string;
  public
    constructor Create(const Name: string);
    destructor Destroy; override;
    procedure Talk;
  end;

{ TTalker }

constructor TTalker.Create(const Name: string);
begin
  FName := Name;
end;

destructor TTalker.Destroy;
begin
  Writeln(FName, ' is being destroyed');
  inherited;
end;

procedure TTalker.Talk;
begin
  Writeln(FName, ' is talking');
end;

procedure Test;
var
  A, B: TTalker;
  I: Integer;
begin
  Writeln('Creating object London...');
  A := SmartGuard<TTalker>(TTalker.Create('London'));
  Writeln('Creating object Paris...');
  B := SmartGuard<TTalker>(TTalker.Create('Paris'));
  A.Talk;
  B.Talk;
  Writeln('OK, so they talked');
  for I := 1 to 100 do
    Write(I:8);
  Writeln;
  Writeln;
  Writeln('Leaving the test');
end;

begin
  Test;
  Readln;
end.

Conclusion

I am sure that these smart pointers could do with some enhancements, and that there could be smart pointers for plain pointers/memory blocks too, or smart pointers that can be stored in lists, arrays, etc. and keep their guarded objects alive for longer than a function. This is just the base frame I have been thinking of.

As was said in the StackOverflow discussion, smart pointers are not a known Delphi idiom. I think that is a pity. But that would require a standard implementation (or a number of them) in the Delphi runtime library. Perhaps one day, we will see them there. Or they will be superseded by ARC. Who knows?

Rudy Velthuis

9 comments:

  1. Of your 4 itemised points in the introduction section, they can all be refuted to varying degrees.

    **Point 1**

    Start with readability, especially of nested try blocks. For sure smart pointers are one way to reduce the clutter, but it can readily be done without. For instance, you might have code like this:

    obj1 := nil;
    obj2 := nil;
    try
    obj1 := TObj1.Create;
    obj2 := TObj2.Create;
    obj1.Foo(obj2);
    finally
    obj2.Free;
    obj1.Free;
    end;

    You can readily make some helper methods to simplify this further.

    InitializeNil(obj1, obj2);
    try
    obj1 := TObj1.Create;
    obj2 := TObj2.Create;
    obj1.Foo(obj2);
    finally
    FreeAndNil(obj2, obj1);
    end;

    Make your own choice whether or not to set the reference to nil if you don't like the FreeAndNil concept.

    Of course, one might criticise the latest block of code for using home grown methods and creating another pattern for object creation.

    **Point 2**

    That's a reasonable point. It only really applies to code review and maintenance though. When you write the code you write:

    obj := TObj.Create;
    try
    finally
    obj.Free;
    end;

    Then you fill in the body of the try. So when you first author the code the call to Free is right there. It's only when reviewing that it might be far away because you've filled out the body. But in that case, when reviewing, you'd most likely be looking at the whole body of the try in any case. For sure there is a benefit in the smart pointer approach, but I don't think it is as significant as you imply.

    **Point 3**

    Yes it is a chore, but so is wrapping your constructor in SmartPointer(...). Frustrating also that you need to repeat the generic type at that point. Can't the type inference mechanism infer that?

    **Point 4**

    Yes you can always forget to call Free, or destroy the wrong instance. But you can equally forget to wrap your constructor in a smart pointer. In any of these cases though the consequences are insignificant, because when developing code we always use memory leak detection. So if we forget to free an object (be it by omitting a Free or forgetting a smart pointer wrapper), we will be notified of the mistake as soon as we run the code.



    So, I think that all of your arguments are a little flaky.


    TBC

    ReplyDelete
    Replies
    1. To point 1: As you said, the code still gets less readable, and, as you admitted too, using homegrown functions (especially a multi-param FreeAndNil) also confuses the reader. I also don't like the repetition of the class parameter in the generic, but it seems type inference can't handle such a cast (causing an implicit operator to be called). But I do maintain that the lifetime management is handeld in the same line, and you do not have to repeat the object name later on, in the finally clause.

      To point 2: Sure, I also do it that way, i.e. I immediately code the try finally after the creation. But especially in the case of multiple objects, that means you must modify your original code to nil outside the try and construct inside the try. Another chance to make mistakes. Or you really nest the try-finally's (I would do that), increasing the line count of the function or procedure.

      To point 3 and 4: Manual memory management, not only inside a simple scope like a function, is a chore, and handing that to a smart pointer is far less of a chore, IMO. Anything that can take this away from me is a Good Thing(tm). Smart pointers, or ARC, or garbage collectors are much better at handling this than humans are. If they are designed well, they don't make the mistakes we, humans, can make. That also answers point 4, to an extent.

      I use leak detection, but once a leak is detected, it must be traced and debugged, and that is not always so simple, and it often takes valuable time. I rather avoid leaks and other invalid pointer operations. As I said, if done well, automatic mechanisms can be much better at that than fallible humans are.

      Delete
    2. Your comment shows a lack of practical experience. Debugging leaks isn't hard at all. Invariably you simply look at the latest code you worked on. Seldom takes more than a few seconds.

      And I have explained, but you don't seem to believe, it happens very infrequently with experienced developers. Perhaps you don't believe me, and perhaps my team of developers is super human or something. But that's my experience.

      And as I explained, if you can forget to free an object, you can forget to wrap it in a smart pointer. No reason why one is more or less likely than the other.

      All I have to go on is my experience, and there simply is no problem. This is a solution in search of a problem. Me and my team simply do not spend time dealing with leaks and memory management. It is a non-issue.

      This is a classic case of misidentifying a problem and proposing the wrong solution. The problem is poorly trained developers. The solution is better training. Simple as that.

      Delete
  2. CONTINUED:


    And then there is this quote:

    > They greatly reduce the chance to make mistakes, even if you are competent at manual memory and lifetime management.

    That's ridiculous. I have small team (6 devs) and we haven't had a leak in our continuously developed software since we started using FastMM leak detection. And actually it's pretty rare for any of us to forget a call to Free when developing and see the FastMM leak dialog. A competent programmer is perfectly capable of doing this properly. Perhaps you have a different idea of what a competent programmer is. The idea that anything could greatly reduce the chance of something that is already has vanishingly small chance of occurrence is preposterous.



    One thing that your post also fails to tackle are the disadvantages of smart pointers, at least as you have implemented them. Right off the bat, here are three that spring to my mind:

    1. You double the number of heap allocations. Each object instantiation also requires a guard object to be created. Likewise at time of destruction. That will have significant performance implications, especially in multi threaded scenarios non scalable memory managers.

    2. You lose control of order of destruction. Often this does not matter, but sometimes it does.

    3. You introduce another pattern, another way to do something. As a rule, large projects are easier to maintain when they are written in a uniform style. Introducing alternate ways to achieve the same goal hinders comprehension.



    As far as smart pointers not being well known in Delphi, I would dispute that. They appear quite regularly on SO, G+ etc. They are certainly well known by those Delphi programmers that interact with the rest of the Delphi community. That they aren't more widely used is perhaps an indication that those same programmers don't see the benefits that you see, but that's speculation.



    Personally, I can see some attractions of smart pointers, but the pros seem quite limited to me, and are, in my view outweighed by the cons. The idea that using them can greatly reduce the chance of defects that in practise never arise seems especially laughable.

    ReplyDelete
    Replies
    1. The idea if to avoid leaks, not to detect and debug them. FastMM can be a great tool to detect them, no doubt, but I rather avoid them. And I have seen even many "competent" programmers make mistakes doing manual memory allocations, so I don't buy that argument.

      1. The heap allocation for a guard object is not that big, compared to the memory allocated by a proper object (i.e. one that has more functions than simply guarding an object), like a stream or stringlist or any VCL or FMX component.

      2. The order of destruction is seldom a problem. In those seldom cases, use an explicitly declared guard, or if you like, use try-finally. Then you can control the lifetime exactly.

      3. I see no big problem in "introducing another pattern". If, as you say, smart pointers are well know in Delphi, that is not a problem at all. But I don't see them that often, so I doubt they are well known, except by a select few who actually communicate with many others in the Delphi community.

      And perhaps those others don't know about the benefits, or have the same irrational fear about a loss of control that they seem to have for, say, using ARC. Smart pointers have the advantage that they are optional, and you can avoid them, unlike ARC.

      And indeed, I would not use them in tight loops, but in tight loops, I also don't use many other features of the language and use a lower level, e.g. pointers instead of indexing, manual aloocation instead of dynamic arrays, etc. In the estimated 99.5% of code that is not time critical, that won't make a big difference, though.

      Delete
    2. You didn't read what I said about leak detection. Try again. Do you doubt that my team releases software with no leaks?

      The size of the heap allocation is not relevant. It's how many you do. If you had practical experience of an application which was sensitive to this then you'd understand.

      Order of destruction is indeed rarely significant. But when it is you need to switch to a different pattern. And then you have to leave a comment for future maintainers. All of which adds complexity.

      Anyway, I've made my points. I know you don't agree. That's fine.

      By the way, amateur means unpaid. By definition you are indeed an amateur programmer. Your bio should use replace that word with inexperienced. There's nothing wrong with being amateur, but a hobbyist programmer is amateur.

      Delete
  3. Hmmm... I see some typos in my replies, but there seems to be no way to edit them. Clicking the thing (I assume it is a pencil) in the dark circle beside the date/time doesn't seem to do anything.

    ReplyDelete
  4. This comment has been removed by the author.

    ReplyDelete
  5. Well now, 4.5 years later, I'm curious as to what the angry Brit said...

    ReplyDelete