Thursday, January 10, 2008

Random thoughts on Unicode

Recently CodeGear engineers started to talk about Unicode support [1][2][3][4][5] in the next major release of Delphi (codenamed Tiburon [6]). I am excited to hear about it, although it should have been done ten years ago.

Is Unicode support an NP problem?

You might ask the same question. In fact, a set of wide-string types was introduced many years ago, but the RTL, VCL and IDE have stayed at the Ansi stage. Why not update them? IMO, 50% of the reason is technical and 50% is market focus and operational issues. On the one hand, a smooth migration is a big challenge that must be tested completely and seriously. On the other hand, their human resources might be limited, so Unicode support has been postponed several times.

How about the days without Unicode support?

Fortunately, some nice guys created some nice components for us in the most difficult days. TntControls [7], a collection of basic Unicode-enabled RTL and VCL replacements, is one of them. The concept behind TntControls is to override all Ansi versions of string properties and routines with wide versions. On Win9x platforms, all Wide-something is cast to Ansi-something at runtime, so that everything works on all Windows platforms. But you have to create wide versions of components and functions one by one. This is time-consuming, boring and sometimes a little bit difficult. Furthermore, WideString is not reference counted, so it performs much slower [8] than AnsiString. Unfortunately, nowadays there are few options...

What will change in the next Delphi?

First, a new reference-counted string type, UnicodeString, will be introduced. Second, all type aliases and function aliases will be switched from the Ansi version to the Wide version, so that most existing projects can be upgraded to Unicode without difficulties or performance loss. It sounds very simple, doesn't it? Thanks to those geniuses in advance.

But there are also two things I do not like:
  1. The new type name is inconsistent with the existing types. The counterpart of AnsiString is UnicodeString, but the counterpart of AnsiChar is WideChar. I suggest deprecating WideChar as well and introducing a new type UnicodeChar.
  2. Type aliases and function aliases are not switchable, which means that you have to make sure that UnicodeString will NOT break your code. Unfortunately, if your project is not test driven, that is very hard to say...
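One pattern worth checking before the switch is code that treats Length(S) as a byte count. The following is only a minimal sketch, assuming that Char becomes a two-byte WideChar once string maps to UnicodeString; StringByteSize and CloneByMove are hypothetical helper names, not RTL routines.

```pascal
// Hypothetical helper: size of the character data in bytes.
// Length counts Char elements, so it must be scaled by SizeOf(Char).
function StringByteSize(const S: string): Integer;
begin
  Result := Length(S) * SizeOf(Char);
end;

// Hypothetical helper: copies a string with a raw Move.
// Passing Length(S) as the byte count would copy only half of the
// data if Char were two bytes wide (an assumption about Tiburon).
function CloneByMove(const S: string): string;
begin
  SetLength(Result, Length(S));
  if Length(S) > 0 then
    Move(Pointer(S)^, Pointer(Result)^, StringByteSize(S));
end;
```

Written this way, the code does not depend on the width of Char, so it should behave the same before and after the aliases are switched.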

Conclusion

There are so many questions, discussions and requests about the new UnicodeString. No doubt, CodeGear engineers will be quite busy this year. Full Unicode support is a big challenge for everyone. Are you ready? [9]

References

  1. Chris Bensen: Unicode
  2. Chris Bensen: Unicode: SizeOf is Different than Length Part II
  3. Chris Bensen: Unicode: SizeOf is Different than Length
  4. The Oracle at Delphi: DPL & Unicode - a toss up
  5. The Oracle at Delphi: More FAQs about Unicode in Tiburón
  6. Delphi and C++ Builder Roadmap
  7. TMSUnicode Components (formerly TntControls)
  8. Tobias Gurock: What’s wrong with Delphi’s WideString?
  9. The Chinese version of this article (on my blog @csdn)
 

Potential memory leaks by initializing a record

Delphi uses reference counting with copy-on-write semantics [1][2] to reduce memory allocation for strings (but not for WideString). I found a kind of memory leak by accident. Let us first look at the following example:

type
  TFoo = record
    StringField: string;
  end;

procedure CreateLeakTest;
var
  Foo: TFoo;
begin
  FillChar(Foo, SizeOf(Foo), 0);
  Foo.StringField := 'Leak Test';
  FillChar(Foo, SizeOf(Foo), 0); //<--- A leak!
end;

Initializing records with FillChar() is quite common in Delphi. By calling FillChar() twice, you might create a memory leak. Note that ShortString fields do not leak. My assumption: FillChar() is unsafe for cleaning up records with ref-counted fields.

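// Assumes a non-empty S and the 32-bit AnsiString layout, where the
// reference count is stored 8 bytes before the character data.
function StringStatus(const S: string): string;
begin
  Result := Format('Addr: %p, Refc: %d, Val: %s',
    [Pointer(S), PInteger(Integer(S) - 8)^, S]);
end;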

procedure Diagnose;
var
  S: string;
  Foo: TFoo;
begin
  S := Copy('Leak Test', 1, 9); // Force to allocate a new string
  WriteLn(StringStatus(S));
  Foo.StringField := S;
  WriteLn(StringStatus(Foo.StringField));
  FillChar(Foo, SizeOf(Foo), 0);
  WriteLn(StringStatus(S));
end;

The output of the above code looks as follows:
Addr: 00E249E8, Refc: 1, Val: Leak Test // A string buffer is allocated
Addr: 00E249E8, Refc: 2, Val: Leak Test // Its Refc is incremented
Addr: 00E249E8, Refc: 2, Val: Leak Test // Its Refc should equal 1 (unexpected)

After calling FillChar(), StringField points to nil. However, the reference count of its previous string buffer has not been decremented, so that count will NEVER go back to 0. In other words, this string buffer will not be deallocated before your program terminates. This is a leak.

How to initialize a record in a safe way?

Ref-counted fields are not handled correctly by FillChar(), so the usual way of initializing a record looks more like an abuse of FillChar. I suggest declaring a const record with initial values instead of using FillChar.

const
  EmptyRecordX: TRecordX = (
    Field1: InitVal1;
    Field2: InitVal2;
    ...
    FieldN: InitValN
  );

// In your application
var
  Foo: TRecordX;
begin
  Foo := EmptyRecordX; // instead of FillChar(Foo, SizeOf(Foo), 0);
  //...

It is quite safe to initialize a record in this way, isn't it?
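A concrete instance of this pattern might look as follows. It is only a sketch: TPerson, EmptyPerson and ResetByAssignment are hypothetical names chosen for illustration. The point is that a record assignment goes through the compiler's ref-count handling, so the old string is released instead of being overwritten.

```pascal
type
  // Hypothetical record, used only for illustration.
  TPerson = record
    Name: string;
    Age: Integer;
  end;

const
  // The "empty" value, declared once and reused everywhere.
  EmptyPerson: TPerson = (Name: ''; Age: 0);

function ResetByAssignment: Boolean;
var
  P: TPerson;
begin
  P.Name := Copy('Leak Test', 1, 9); // heap-allocated, ref-counted string
  P.Age := 42;
  P := EmptyPerson;                  // releases the old Name properly
  Result := (P.Name = '') and (P.Age = 0);
end;
```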

Alternative Solution

If you are too lazy to declare such empty record constants, the following procedure can help as well. Note that it is a little bit tricky.

procedure InitRecord(out R; RecordSize: Integer);
begin
  FillChar(R, RecordSize, 0);
end;

Thanks to the magic word "out". As described in the Help: "An out parameter, like a variable parameter, is passed by reference. With an out parameter, however, the initial value of the referenced variable is discarded by the routine it is passed to. The out parameter is for output only; that is, it tells the function or procedure where to store output, but doesn't provide any input."

Let us see what the code actually does to a record.

mov edx,[$0040c904]
mov eax,ebx
call @FinalizeRecord //<----- cleanup
mov edx,$0000000c
call InitializeRecord

The compiler calls FinalizeRecord() first, so the record is completely finalized before it is filled.

UPDATE #1: As Jonas Maebe recently described: "If you have local record which was declared but not yet used, a simple fillchar(rec,sizeof(rec),0) will set everything to 0/nil/empty. If it may have been used earlier and contains ref-counted fields, you first have to call finalize(rec)." [3] His explanation is easier to understand and closer to the point of the issue.
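The advice quoted above can be sketched as a small helper. TSample, ResetSample and ResetDemo are hypothetical names used only for this illustration; Finalize and FillChar are the standard System routines.

```pascal
type
  // Hypothetical record with one ref-counted and one plain field.
  TSample = record
    Text: string;
    Count: Integer;
  end;

// Resets a record that may already have been used.
procedure ResetSample(var R: TSample);
begin
  Finalize(R);                // releases Text, decrementing its ref count
  FillChar(R, SizeOf(R), 0);  // now safe: no live reference is overwritten
end;

function ResetDemo: Boolean;
var
  R: TSample;
begin
  R.Text := Copy('Leak Test', 1, 9); // force a heap-allocated string
  R.Count := 7;
  ResetSample(R);
  Result := (R.Text = '') and (R.Count = 0);
end;
```

The order matters: calling FillChar first would erase the only pointer to the string buffer, which is exactly the leak described in this article.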

References

  1. Wikipedia: Copy-on-write
  2. A Brief History of Strings
  3. fpc-pascal mailing list
 

Way back into native: Tooltip

I love Delphi because of its components, and I hate Delphi because of its components too. If you really care about the user interface, you may find many minor differences between a standard Windows application and a Delphi application. For instance: the shortcuts of all menu items are right-aligned when the menu is associated with an image list; pressing ALT does not select the first button on the menu bar as it should; toolbars have no chevron support; and the tooltip (known as HintWindow in Delphi) looks different from the native tooltip. In this article I will start with the tooltip and show you how to make it nicer.

Background

In Windows XP, even if you are very careful, you will probably not notice the difference. A Delphi-styled tooltip has a gray edge (it should actually be black), and there is no shadow under the tooltip either. This is not a real issue until you upgrade to Windows Vista. A native tooltip in Vista [1][2] has rounded corners, and its background is gradient filled. (See the picture below.)


Delphi style   vs.   Windows native style

Whether or not there is a manifest, you cannot change a tooltip to its native style. Why? I suppose that the Delphi engineers wanted to make the HintWindow more customizable, but that is quite difficult to do starting from the standard TOOLTIPS_CLASS, so they created the HintWindow as a WS_EX_TOOLWINDOW window. Now it is easy to build your own styled tooltips, but it is difficult to reproduce the native style, isn't it?

Solution

The idea is simple: if you do not intend to customize your tooltip, you can replace WS_EX_TOOLWINDOW with TOOLTIPS_CLASS. I have made a patch. You just have to include NativeHintWindow.pas in your application and build your project again. Done, your nice tooltip is back. If you are using TntControls (or TMSUnicode Controls), there is an extra edition for you. Both are available for download. The issue has been reported to QualityCentral [3] as well.

Conclusion

In this article [4], I have shown a visual issue of the tooltip control (THintWindow) and implemented a patch to fix it. I hope the Delphi engineers will pay more attention to this kind of issue. More posts about user interface inconsistencies are coming soon. So stay tuned ;-)

References

  1. MSDN: Tooltips and Infotips
  2. MSDN: Top Guidelines Violations
  3. Related discussion at QualityCentral
  4. The Chinese version of this article (on my blog @csdn)

Performance issue of TAction

If you have never heard about TAction or used it, you are definitely not a Delphi developer ^^) This component simplifies the UI work. I once noticed by accident that the CPU usage became abnormally high when the mouse was moved quickly over the main form. Spy++ showed that a massive number of WM_UPDATE messages were sent while the mouse was moving fast over the main form. So I took a closer look at the details and found out that TContainedAction.Update() was executed many times by the TActionManager. As described in the Help: "this method triggers the OnUpdate event handler. ... When the application is idle, the OnUpdate event occurs for every action." The idle status changes very frequently. I have more than 100 TAction components on the main form, which means TContainedAction.Update() was executed more than 100 times in a very short time. This explains why my application became a CPU usage monster.
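To get a feeling for how often the handler runs, the calls can be counted. The following is a rough sketch, assuming the VCL units Classes and ActnList are available and that the action is not attached to any action list; TUpdateCounter and CountUpdateCalls are hypothetical names introduced only for this illustration.

```pascal
uses
  Classes, ActnList;

type
  // Hypothetical helper that counts how often OnUpdate fires.
  TUpdateCounter = class
  public
    Count: Integer;
    procedure HandleUpdate(Sender: TObject);
  end;

procedure TUpdateCounter.HandleUpdate(Sender: TObject);
begin
  Inc(Count);
end;

// Calls Update on a single action Rounds times and returns
// how many times the OnUpdate handler was executed.
function CountUpdateCalls(Rounds: Integer): Integer;
var
  Counter: TUpdateCounter;
  Action: TAction;
  I: Integer;
begin
  Counter := TUpdateCounter.Create;
  try
    Action := TAction.Create(nil);
    try
      Action.OnUpdate := Counter.HandleUpdate;
      for I := 1 to Rounds do
        Action.Update;
      Result := Counter.Count;
    finally
      Action.Free;
    end;
  finally
    Counter.Free;
  end;
end;
```

With more than 100 actions on a form, every idle cycle repeats this work once per action, whether or not a handler is assigned.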

Solution

If your application does not handle any OnUpdate events, it really makes sense to accelerate TContainedAction.Update().

My solution is to replace the original method with an empty method.

uses
  FastcodePatch {MPL http://fastcode.sourceforge.net/};

procedure TContainedActionUpdateStub;
asm
  call TContainedAction.Update;
end;

type
  TContainedActionPatch = class(TContainedAction)
  public
    function Update: Boolean; override;
  end;

function TContainedActionPatch.Update: Boolean;
begin
  Result := False;
end;

// Disallows the TContainedAction.Update to trigger TAction.OnUpdate()
procedure DisableTContainedActionUpdate;
begin
  FastcodeAddressPatch(
    FastcodeGetAddress(@TContainedActionUpdateStub),
    @TContainedActionPatch.Update);
end;

The best place to run this patch is the YourForm.OnCreate() event. If you want to make the patch permanent, you can either modify TContainedAction.Update in ActnList.pas directly or submit it to CodeGear's QualityCentral.

Conclusion

In this article [1], I have shown a potential performance issue caused by using a large number of TxxxxAction components, and implemented the patch shown above to fix it.

References

  1. The Chinese version of this article (on my blog @csdn)