The Mystery of the Duqu Framework

Igor Soumenkov
Kaspersky Lab Expert
Posted March 07, 15:58  GMT
Tags: Duqu

While analyzing the components of Duqu, we discovered an interesting anomaly in the main component that is responsible for its business logic, the Payload DLL. We would like to share our findings and ask for help identifying the code.

Code layout

At first glance, the Payload DLL looks like a regular Windows PE DLL file compiled with Microsoft Visual Studio 2008 (linker version 9.0). The entry point code is absolutely standard, and there is one function exported by ordinal number 1 that also looks like MSVC++ output. This function is called from the PNF DLL and is actually the “main” function that implements all the logic of contacting C&C servers, receiving additional payload modules and executing them. The most interesting part is how this logic was programmed and what tools were used.

The code section of the Payload DLL is typical of a binary built from several pieces of code. It consists of “slices” of code that may have been initially compiled in separate object files before they were linked into a single DLL. Most of them can be found in any C++ program, like the Standard Template Library (STL) functions, run-time library functions and user-written code. The exception is the biggest slice, which contains most of the C&C interaction code.


Layout of the code section of the Payload DLL file

This slice is different from others, because it was not compiled from C++ sources. It contains no references to any standard or user-written C++ functions, but is definitely object-oriented. We call it the Duqu Framework.

The Framework

Features

The code that implements the Duqu Framework has several distinctive properties:

  • Everything is wrapped into objects
  • Function table is placed directly into the class instance and can be modified after construction
  • There is no distinction between utility classes (linked lists, hashes) and user-written code
  • Objects communicate using method calls, deferred execution queues and event-driven callbacks
  • There are no references to run-time library functions, native Windows API is used instead

Objects

All objects are instances of some class; we identified 60 classes. Each object is constructed with a “constructor” function that allocates memory, fills in the function table and initializes members.


Constructor function for the linked list class.

The layout of each object depends on its class. Some classes appear to have binary compatible function tables, but there is no indication that they have any common parent classes (as in other OO languages). Furthermore, the location of the function table is not fixed: some classes have it at offset 0 of the instance, but some do not.


Layout of the linked list object. First 10 fields are pointers to member functions.

Objects are destroyed by corresponding “destructor” functions. These functions usually destroy all objects referenced by member fields and free any memory used.
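
The object model described above can be approximated in plain C. The following is only a minimal sketch under stated assumptions, not the framework's actual code: all names (List, list_new, list_patch_count and so on) are hypothetical, the field order is illustrative, and malloc/free stand in for the native Windows API allocation calls that the real code uses. It shows a constructor that allocates the instance and fills in function pointers stored inside the instance itself, a destructor that frees everything referenced by member fields, and a per-instance replacement of one function pointer after construction.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout: the analyzed binary carries no symbols. */
typedef struct ListNode {
    struct ListNode *next;
    int value;
} ListNode;

typedef struct List List;
struct List {
    /* Function pointers live inside the instance itself, not in a shared
       per-class table, so each object can be re-pointed after construction. */
    int  (*push)(List *self, int value);
    int  (*count)(const List *self);
    void (*destroy)(List *self);
    /* Data members follow the function pointers. */
    ListNode *head;
    int       length;
};

static int list_push(List *self, int value) {
    ListNode *node = malloc(sizeof *node);  /* assumption: malloc replaces a Windows heap call */
    if (node == NULL) return 0;
    node->value = value;
    node->next = self->head;
    self->head = node;
    self->length++;
    return 1;
}

static int list_count(const List *self) { return self->length; }

/* "Destructor": frees everything referenced by member fields, then the object. */
static void list_destroy(List *self) {
    ListNode *node = self->head;
    while (node != NULL) {
        ListNode *next = node->next;
        free(node);
        node = next;
    }
    free(self);
}

/* "Constructor": allocates memory, fills in the function table, initializes members. */
List *list_new(void) {
    List *self = malloc(sizeof *self);
    if (self == NULL) return NULL;
    self->push = list_push;
    self->count = list_count;
    self->destroy = list_destroy;
    self->head = NULL;
    self->length = 0;
    return self;
}

/* Replacement method used only to demonstrate per-instance patching. */
static int list_count_doubled(const List *self) { return self->length * 2; }

/* Re-points one slot of one instance; other instances are unaffected. */
void list_patch_count(List *self) {
    assert(self != NULL);
    self->count = list_count_doubled;
}
```

Because the pointers are stored per instance, patching one object leaves every other object of the same "class" untouched, which a shared C++ vtable would not allow without swapping the whole table pointer.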

Member functions can be referenced by the object’s function table (like “virtual” functions in C++) or they can be called directly. In most object-oriented languages, member functions receive the “this” parameter that references the instance of the object, and there is a calling convention that defines the location of the parameter: either in a register or on the stack. This is not the case for the Duqu Framework classes, which can receive the “this” parameter in any register or on the stack.


Member function of the linked list, receives “this” parameter on stack

Event driven framework

The layout and implementation of objects in the Duqu Framework is definitely not native to the C++ that was used to program the rest of the Trojan. There is an even more interesting feature of the framework that is used extensively throughout the code: it is event driven.

There are special objects that implement the event-driven model:

  • Event objects, based on native Windows API handles
  • Thread context objects that hold lists of events and deferred execution queues
  • Callback objects that are linked to events
  • Event monitors, created by each thread context for monitoring events and executing callback objects
  • Thread context storage manages the list of active threads and provides access to per-thread context objects

This event-driven model resembles Objective C and its message passing features, but the code does not have any direct references to the language, nor does it look like it was compiled with any known Objective C compiler.


Event-driven model of the Duqu Framework

Every thread context object can start a “main loop” that looks for and processes new items in the lists. Most of the Duqu code follows the same principle: create an object, bind several callbacks to internal or external events and return. Callback handlers are then executed by the event monitor object that is created within each thread context.
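
The deferred execution part of this model can be illustrated with a small C sketch. This is an assumption-laden illustration, not a reconstruction: the names ThreadContext, context_defer and context_run are hypothetical, the queue is a fixed-size ring buffer chosen for brevity, and the loop returns when the queue is empty, whereas a real main loop would presumably block waiting for events.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 16

/* A deferred call: a function plus the argument it will receive later. */
typedef struct {
    void (*func)(void *arg);
    void *arg;
} DeferredCall;

/* Hypothetical per-thread context holding a fixed-size FIFO of deferred calls. */
typedef struct {
    DeferredCall queue[QUEUE_CAPACITY];
    size_t head;   /* index of the next call to run */
    size_t count;  /* number of queued calls */
} ThreadContext;

void context_init(ThreadContext *ctx) {
    ctx->head = 0;
    ctx->count = 0;
}

/* Queues a call for later; returns 0 when the queue is full. */
int context_defer(ThreadContext *ctx, void (*func)(void *), void *arg) {
    assert(ctx != NULL && func != NULL);
    if (ctx->count == QUEUE_CAPACITY) return 0;
    size_t tail = (ctx->head + ctx->count) % QUEUE_CAPACITY;
    ctx->queue[tail].func = func;
    ctx->queue[tail].arg = arg;
    ctx->count++;
    return 1;
}

/* "Main loop": runs queued calls in FIFO order until the queue is empty.
   Handlers may queue further calls; those run in the same loop.
   Returns the number of calls executed. */
int context_run(ThreadContext *ctx) {
    int executed = 0;
    while (ctx->count > 0) {
        DeferredCall call = ctx->queue[ctx->head];
        ctx->head = (ctx->head + 1) % QUEUE_CAPACITY;
        ctx->count--;
        call.func(call.arg);
        executed++;
    }
    return executed;
}

/* Example handler: increments the integer it is given. */
void increment(void *arg) {
    (*(int *)arg)++;
}
```

Calling context_defer only records the work; nothing runs until context_run is entered, which matches the "create, bind, return" pattern described above.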

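Here is example pseudocode for a socket object:

SocketObjectConstructor {
    // Create the native socket and wrap it in an event the monitor can watch.
    NativeSocket = socket();
    SocketEvent = new MonitoredEvent(NativeSocket);
    // Bind the handler to the event, start connecting and return immediately.
    SocketObjectCallback = new ObjectCallback(this, SocketEvent, OnCallbackFunc);
    connect(NativeSocket, ...);
}

// Invoked later by the event monitor when SocketEvent fires.
OnCallbackFunc {
    switch (GetType(Event)) {
    case Connected: ...
    case ReadData: ...
    ...
    }
}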
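
For readers who prefer something compilable, the same pattern can be sketched in C with simulated events. Everything here is an assumption made for illustration: no real socket is opened, the event types are limited to the two named in the pseudocode, and the names SocketObject, socket_object_init and monitor_dispatch are hypothetical rather than taken from the analyzed code.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { EVENT_CONNECTED, EVENT_READ_DATA } EventType;

typedef struct {
    EventType type;
} Event;

/* Binds a target object and a handler, mirroring ObjectCallback in the pseudocode. */
typedef struct {
    void *target;
    void (*handler)(void *target, const Event *event);
} ObjectCallback;

/* Hypothetical socket object; no real network I/O takes place. */
typedef struct {
    int connected;
    int reads;
    ObjectCallback callback;
} SocketObject;

/* Counterpart of OnCallbackFunc: dispatches on the event type. */
static void socket_on_event(void *target, const Event *event) {
    SocketObject *self = target;
    switch (event->type) {
    case EVENT_CONNECTED:
        self->connected = 1;
        break;
    case EVENT_READ_DATA:
        self->reads++;
        break;
    }
}

/* Counterpart of SocketObjectConstructor: set up state, bind the callback, return. */
void socket_object_init(SocketObject *self) {
    self->connected = 0;
    self->reads = 0;
    self->callback.target = self;
    self->callback.handler = socket_on_event;
}

/* Stand-in for the event monitor: delivers one event to the bound callback. */
void monitor_dispatch(const ObjectCallback *callback, const Event *event) {
    assert(callback != NULL && callback->handler != NULL);
    callback->handler(callback->target, event);
}
```

The object itself never polls; it only reacts when the monitor delivers an event, so all state changes happen inside the handler.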

Conclusions

  • The Duqu Framework appears to have been written in an unknown programming language.
  • Unlike the rest of the Duqu body, it's not C++ and it's not compiled with Microsoft's Visual C++ 2008.
  • The highly event-driven architecture points to code that was designed to be used in pretty much any kind of conditions, including asynchronous communications.
  • Given the size of the Duqu project, it is possible that the framework was created by a different team from the one that created the drivers and wrote the system infection code and exploits.
  • The mysterious programming language is definitely NOT C++, Objective C, Java, Python, Ada, Lua or any of the many other languages we have checked.
  • Compared to Stuxnet (entirely written in MSVC++), this is one of the defining particularities of the Duqu framework.

The Duqu Framework: What was that?

After having performed countless hours of analysis, we are 100% confident that the Duqu Framework was not programmed with Visual C++. It is possible that its authors used an in-house framework to generate intermediary C code, or they used another completely different programming language.

We would like to appeal to the programming community and ask anyone who recognizes the framework, toolkit or programming language that can generate similar code constructions to contact us or leave a comment on this blog post. We are confident that with your help we can solve this deep mystery in the Duqu story.


161 comments

Satoshi Nakamoto

2012 Mar 10, 02:57

What about these.....

http://ldeniau.web.cern.ch/ldeniau/html/oopc.html - Nov 2006 and revised in Aug 2007

http://sourceforge.net/projects/cos/

http://ldeniau.home.cern.ch/ldeniau/html/oopc/oopc.html - 2001

Nick Argall

2012 Mar 10, 03:26

Re: Destructors may be a clue

Delphi also has a destructor architecture like this. Anything that derives from TComponent will be asked to pass a reference to an Owner in the constructor, and the owner calls Free on every child. Free is written as follows:
if Self <> nil then
  Destroy;

Delphi compiles directly to native Windows machine code, and was the best of the Windows programming languages in the late 90s (up until Microsoft hired the Delphi architect, who went on to design .NET)

leith

2012 Mar 10, 09:19

Re: Destructors may be a clue

If you're looking for a Python that generates lean C code, look no further than Pyrex and Cython.

Obscure... but you'd have all the calling semantics of Python with the lightness of directly compiled C if you focused on it.

My assembly is a tad rusty, so if this wastes anyone's time, I apologize, but I thought it was worth pointing out.

hossein

2012 Mar 10, 10:14

Languages Mixing

I think there are two possible explanations.
First, when you use inline assembly in your C/C++ code, this can happen because the overall look of the code will be different.
Second, when you integrate another DLL or library into your code, this can also happen, as the DLL or library may be written in assembly language. As you know, this is an old trick: for protection reasons, programmers embed their DLL dependencies in the final product.
So I say they used C++ with one or both of the above methods to complicate the reversing.

mrlozer

2012 Mar 10, 14:19

kbasic

Maybe it is kbasic. http://kbasic.com/

exeman

2012 Mar 10, 22:16

Re: Re: Re: Other C/C++ compiler?

How about GObject OO framework for C? It is old, stable, common and focused on signals.

Edited by exeman, 2012 Mar 10, 22:53

Igor Soumenkov

2012 Mar 11, 00:15

Re: Re: Re: Re: Other C/C++ compiler?

Code generated with the GObject type system looks similar, but it tends to be more verbose.

sote

2012 Mar 11, 00:23

Eiffel +1

Why waste time designing and implementing something that can be had out of the box? Maybe you should take a look at Eiffel?
- has OO features
- agents mechanism (which would influence event-driven part of the framework)
- can be compiled into native code

http://en.wikipedia.org/wiki/Eiffel_(programming_language)

\x41\x6c\x62\x65\x72\x74

2012 Mar 11, 00:28

O´Caml with ocamlc +1

I suspect OCaml, since it is also OO and generates native code, and its functional characteristics can produce machine code that is a bit different from the usual, which is a bit confusing during reverse engineering.

https://en.wikipedia.org/wiki/OCaml

\x41\x6c\x62\x65\x72\x74

2012 Mar 11, 00:45

Re: O´Caml with ocamlc +1

...and it could be just an OO language making intensive use of COM, but I need to take a closer look.

Elizacat

2012 Mar 11, 04:12

C code with use of COM?

It looks to me like C code with a COM vtable.

GeralltF

2012 Mar 11, 07:50

Re: Microsoft based, native code only

What about the IL2CPU compiler developed by the Cosmos team?
Only problem is that the library compiles pure CIL, so no platform invokes. But their X# feature allows embedding raw X86 operations.

Igor Soumenkov

2012 Mar 11, 17:49

Re: C code with use of COM?

There are actually no 'vtables' in the COM / C++ sense. There is only a list of function pointers inside the instance; it is not even separated from the data.

mrlozer

2012 Mar 11, 19:52

Open object rexx

http://www.oorexx.org/about.html
This is my last guess...

alibadrelsayed

2012 Mar 12, 00:43

QNX compiler

Maybe it's the compiler used for QNX Neutrino, which is the GNU compiler (gcc). Currently, development can be done from these hosts:

QNX Neutrino
MS-Windows
Solaris

If you have the QNX Momentics Professional Edition, you can create anything using the Integrated Development Environment (IDE) from any host. Alternatively, you can use command-line tools that are based on the GNU compiler.

For MS-Windows hosts, you also have the option of the CodeWarrior tools from Metrowerks. Currently, the CodeWarrior IDE also uses gcc.

Edited by alibadrelsayed, 2012 Mar 12, 01:08

MJPMJP

2012 Mar 12, 02:27

Hello, assembler?

A86 / A386 assembler?

infernalmachine

2012 Mar 12, 03:44

Re: May I ask...

Most likely the consistency and perhaps optimisations that would only be conceivably possible if done by machine along with a lack of optimisations that could only be conceivably made by hand. Hard to tell without looking at the full code but generally such a large amount of code done by hand will have hints (Edit: Actually, 95663 bytes is quite small, but it should be big enough still to offer good hints).

Also, coding it in assembly by hand would have many significant weaknesses. It's inconceivable someone would do that because the disadvantages of using pure assembly outweigh the advantages immensely.

Even if the coder were able to create it in pure assembly, it is highly likely they would create something such as a set of macros or their own basic higher level language, most likely inspired by features encountered in other higher level languages. Doing everything in assembly is not a good thing. A human will usually work out something more efficient than copying and pasting the same thing hundreds of times and editing it slightly each time.

It might be a hand made language but tactically, it's worth investigating anyway. Precisely for the fact that it appears to be obscure. It could potentially give clues about the identity of the creators.

I suggest they release the DLL and commented disassembly so that it isn't such a shot in the dark.

Edited by infernalmachine, 2012 Mar 13, 02:50

Aulis

2012 Mar 12, 03:48

Old framework, creating small sized executables

Delphi has already been suggested a couple of times. I'll add to this by suggesting that the tool could have been either Delphi OR Lazarus.

And the framework that was used is not the standard Delphi VCL but the Key Object Library, KOL. http://kolmck.net/
This was once called the "Bonanza" library, but KOL may be the official name nowadays.

I do not think this combination leaves behind any standard Delphi strings that would make it recognizable as Delphi compiled application.

infernalmachine

2012 Mar 12, 05:10

Re: Old framework, creating small sized executables

By Delphi strings, you mean length supplied rather than null terminated strings?

Kiaro

2012 Mar 12, 05:30

Lotus

hmph... looks like my lost 1992 multi-dimensional / environmentally conditional feedback structure... it was originally in Lotus command language, where the code itself changed depending on non-relevant variables, such as time of day, day of month, the second of last routine... it pulled source from embedded segments of other sub-routines... it would have taken a team years to decipher... I often wondered if that little monster survived.......

iamgk

2012 Mar 12, 06:54

psather ?

Des O'Brien

2012 Mar 12, 07:50

The way it may have been created.

A method used by systems programmers some time back was to utilise a multi-phase compiler for a high level language and intercept the resultant intermediate code generated. This was then modified to pure assembly, removing NOP and comment instructions. The resultant code ran a lot faster. It is also possible to hide the originating code with such a method.

Have you looked into this type of origin?

Igor Soumenkov

2012 Mar 12, 11:01

Re: Re: Old framework, creating small sized executables

It is very easy to identify code generated by the Delphi compiler - it generates very specific code.

Sondreal

2012 Mar 12, 16:03

I

VC++ with OLE objects, I believe. Have you tested this?

StanTheMan

2012 Mar 12, 17:04

possible contender

http://www.ionicwind.com/aurora.html

1. "At first glance, the Payload DLL looks like a regular Windows PE DLL file compiled ...."
1. compiler was written in MSVC++

2. "This slice is different from others, because it was not compiled from C++ sources"
2. Aurora features a C/C++ like syntax - no inheritance

3. "There is no distinction between utility classes (linked lists, hashes) and user-written code"
3. written in Aurora or IWBasic (same site)

4. "Objects communicate using method calls, deferred execution queues and event-driven callbacks"
4. http://www.ionicwind.com/forums/index.php?topic=4594.0
Ionic Wind Network Client/Server Library
- Sends a Windows message when a connection is ready to accept.
- Sends a Windows message when incoming data is ready to be read

5. "There are no references to run-time library functions, native Windows API is used instead"
5. Completely stand-alone. Makes calls to core Windows API functions only

Kam

2012 Mar 12, 20:17

Pike ?

Maybe it's written in Pike? Some of the features reminded me of what I have read about Pike, and according to Wikipedia it is used for Server/Gateway code for Opera Mini.
The syntax is also C-like. I'm not sure how it uses the Windows API, though.

xamble

2012 Mar 12, 21:44

Another suggestion for Forth

As two others put forward - might be Forth.
Haven't done serious code in years but back when I was doing Forth (nearly 30 years ago now):
- we started by making our own version of Fig with the Nautilus cross-compiler
- if you remove the headers for target compilation then things down at the metal can look very strange to a non-forth coder
- we built our own Assembler in the Forth itself
- using that we could drop down into assembler and back to high level forth within the same word def
What can one person do?
Noodling about on my own I was able to get event driven, tight code that could patch itself to be a COM file for DOS or CMD file for CP/M. Same actual code doing this at run time.
And I was only a mediocre coder with time on my hands - what a good coder could do with the right software and motivation might look pretty damn alien to someone else.
We used to have a saying: Using C you can make a kludgy Forth but using Forth you can make a fine C.

srhubb

2012 Mar 13, 01:29

Re: Re: That code looks familiar

You forgot one other source for the military and intelligence community. UNISYS, formerly Burroughs (who supplied a lot of the contracted personnel) and Univac (Sperry+Univac) who supplied the bulk of the hardware and development software, for decades to the military and intelligence communities within our government (includes NSA, CIA, Army, Navy, Air Force, IRS, SSA, etc.).

It may be a tool developed by Sperry or Unisys as well as the ones you've mentioned.

An Old Univacer,
Srhubb

jhkaper

2012 Mar 13, 03:08

How about Visual Prolog

It is event-driven, uses the native Windows API, and has unique objects.

sazlm

2012 Mar 13, 03:27

This is very intriguing!

Have you considered comparing it to RPG II? It somewhat resembles that.
