The Mystery of the Duqu Framework

Igor Soumenkov
Kaspersky Lab Expert
Posted March 07, 15:58  GMT
Tags: Duqu

While analyzing the components of Duqu, we discovered an interesting anomaly in the main component that is responsible for its business logic, the Payload DLL. We would like to share our findings and ask for help identifying the code.

Code layout

At first glance, the Payload DLL looks like a regular Windows PE DLL file compiled with Microsoft Visual Studio 2008 (linker version 9.0). The entry point code is absolutely standard, and there is one function exported by ordinal number 1 that also looks like MSVC++. This function is called from the PNF DLL and it is actually the “main” function that implements all the logic of contacting C&C servers, receiving additional payload modules and executing them. The most interesting part is how this logic was programmed and what tools were used.

The code section of the Payload DLL is typical of a binary that was built from several pieces of code. It consists of “slices” of code that may have been initially compiled into separate object files before they were linked into a single DLL. Most of them can be found in any C++ program, such as the Standard Template Library (STL) functions, run-time library functions and user-written code. The exception is the biggest slice, which contains most of the C&C interaction code.


Layout of the code section of the Payload DLL file

This slice is different from others, because it was not compiled from C++ sources. It contains no references to any standard or user-written C++ functions, but is definitely object-oriented. We call it the Duqu Framework.

The Framework

Features

The code that implements the Duqu Framework has several distinctive properties:

  • Everything is wrapped into objects
  • The function table is placed directly in the class instance and can be modified after construction
  • There is no distinction between utility classes (linked lists, hashes) and user-written code
  • Objects communicate using method calls, deferred execution queues and event-driven callbacks
  • There are no references to run-time library functions; the native Windows API is used instead

Objects

All objects are instances of some class; we identified 60 classes. Each object is constructed with a “constructor” function that allocates memory, fills in the function table and initializes members.


Constructor function for the linked list class.
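
As a rough illustration of the pattern described above (not code recovered from Duqu), here is a minimal sketch in C of a linked list object whose function table lives inside the instance. All names (List, list_new, list_demo and so on) are hypothetical, and malloc/free are used only to keep the sketch portable; the article states that the Framework itself calls the native Windows API rather than a run-time library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is an illustration, not Duqu code. */
typedef struct ListNode {
    struct ListNode *next;
    int value;
} ListNode;

typedef struct List List;
struct List {
    /* Function table stored in the instance itself, not in a shared vtable. */
    int  (*push)(List *self, int value);
    int  (*count)(const List *self);
    void (*destroy)(List *self);
    /* Data members follow the function pointers. */
    ListNode *head;
    int length;
};

static int list_push(List *self, int value) {
    ListNode *node = malloc(sizeof *node);
    if (node == NULL) return -1;
    node->value = value;
    node->next = self->head;
    self->head = node;
    self->length++;
    return 0;
}

static int list_count(const List *self) { return self->length; }

/* "Destructor": frees everything the object references, then the object. */
static void list_destroy(List *self) {
    ListNode *node = self->head;
    while (node != NULL) {
        ListNode *next = node->next;
        free(node);
        node = next;
    }
    free(self);
}

/* "Constructor": allocates memory, fills in the function table,
   initializes members. */
List *list_new(void) {
    List *self = malloc(sizeof *self);
    if (self == NULL) return NULL;
    self->push = list_push;
    self->count = list_count;
    self->destroy = list_destroy;
    self->head = NULL;
    self->length = 0;
    return self;
}

/* Usage: every call goes through the pointers held by the instance. */
int list_demo(void) {
    List *list = list_new();
    if (list == NULL) return -1;
    list->push(list, 10);
    list->push(list, 20);
    int n = list->count(list);
    assert(n == 2);
    list->destroy(list);
    return n;
}
```

Because the pointers are ordinary fields, each instance carries its own copy of the table, which costs memory but allows per-object changes after construction.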

The layout of each object depends on its class. Some classes appear to have binary-compatible function tables, but there is no indication that they have any common parent classes (as in other OO languages). Furthermore, the location of the function table is not fixed: some classes have it at offset 0 of the instance, but others do not.


Layout of the linked list object. First 10 fields are pointers to member functions.

Objects are destroyed by corresponding “destructor” functions. These functions usually destroy all objects referenced by member fields and free any memory used.

Member functions can be referenced by the object’s function table (like “virtual” functions in C++) or they can be called directly. In most object-oriented languages, member functions receive the “this” parameter that references the instance of the object, and there is a calling convention that defines the location of the parameter: either in a register or on the stack. This is not the case for the Duqu Framework classes: they can receive the “this” parameter in any register or on the stack.


Member function of the linked list, receives “this” parameter on stack
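
The following sketch, again in C with hypothetical names (Counter, counter_new, counter_demo), shows three of the properties mentioned so far: a function slot that is not at offset 0 of the instance, a member function called both through the slot and directly, and a slot replaced after construction. It is only an illustration under those assumptions. Plain C cannot choose the register used for the instance pointer, so here it is passed as an ordinary first argument and its location is left to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical class used only for illustration. */
typedef struct Counter Counter;
struct Counter {
    int value;                   /* data first: the slot is not at offset 0 */
    int (*next)(Counter *self);  /* per-instance function slot */
};

static int counter_next_by_one(Counter *self) {
    self->value += 1;
    return self->value;
}

static int counter_next_by_ten(Counter *self) {
    self->value += 10;
    return self->value;
}

Counter *counter_new(void) {
    Counter *self = malloc(sizeof *self);
    if (self == NULL) return NULL;
    self->value = 0;
    self->next = counter_next_by_one;
    return self;
}

int counter_demo(void) {
    Counter *c = counter_new();
    if (c == NULL) return -1;
    assert(offsetof(Counter, next) != 0);

    int total = 0;
    total += c->next(c);              /* call through the slot: value is 1 */
    total += counter_next_by_one(c);  /* direct call: value is 2 */
    c->next = counter_next_by_ten;    /* replace the slot after construction */
    total += c->next(c);              /* value is 12 */

    free(c);
    return total;                     /* 1 + 2 + 12 */
}
```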

Event driven framework

The layout and implementation of objects in the Duqu Framework is definitely not native to the C++ that was used to program the rest of the Trojan. There is an even more interesting feature of the framework that is used extensively throughout the code: it is event driven.

There are special objects that implement the event-driven model:

  • Event objects, based on native Windows API handles
  • Thread context objects that hold lists of events and deferred execution queues
  • Callback objects that are linked to events
  • Event monitors, created by each thread context for monitoring events and executing callback objects
  • Thread context storage, which manages the list of active threads and provides access to per-thread context objects

This event-driven model resembles Objective C and its message-passing features, but the code does not have any direct references to the language, nor does it look like it was compiled with any known Objective C compiler.


Event-driven model of the Duqu Framework

Every thread context object can start a “main loop” that looks for and processes new items in the lists. Most of the Duqu code follows the same principle: create an object, bind several callbacks to internal or external events, and return. Callback handlers are then executed by the event monitor object that is created within each thread context.

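Here is example pseudocode for a socket object:

SocketObjectConstructor {
    // Create the native socket and wrap it in an event that can be monitored
    NativeSocket = socket();
    SocketEvent = new MonitoredEvent(NativeSocket);
    // Bind a callback to the event, start connecting and return
    SocketObjectCallback = new ObjectCallback(this, SocketEvent, OnCallbackFunc);
    connect(NativeSocket, ...);
}
OnCallbackFunc {
    // Executed later by the event monitor of the thread context
    switch(GetType(Event)) {
    case Connected: ...
    case ReadData: ...
    ...
    }
}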
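
To make the control flow more concrete, here is a small, portable sketch of the same idea in C. It is an assumption-laden illustration, not a reconstruction of the Framework: all names (ThreadContext, context_bind, context_signal, context_run, SocketObject, event_demo) are hypothetical, events are signalled by hand instead of coming from native Windows handles, and no real socket is opened.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CALLBACKS 8
#define MAX_PENDING   16

typedef enum { EVENT_CONNECTED, EVENT_READ_DATA } EventType;

/* Stand-in for an event object (the real ones reportedly wrap native handles). */
typedef struct { int id; } Event;

typedef void (*CallbackFn)(void *object, EventType type);

/* Callback object: links an event to a handler and its owning object. */
typedef struct {
    void *object;
    const Event *event;
    CallbackFn fn;
} Callback;

typedef struct {
    const Event *event;
    EventType type;
} Pending;

/* Thread context: bound callbacks plus a queue of signalled events. */
typedef struct {
    Callback callbacks[MAX_CALLBACKS];
    size_t callback_count;
    Pending pending[MAX_PENDING];
    size_t pending_head;
    size_t pending_count;
} ThreadContext;

static int context_bind(ThreadContext *ctx, void *object,
                        const Event *event, CallbackFn fn) {
    if (ctx->callback_count == MAX_CALLBACKS) return -1;
    Callback *cb = &ctx->callbacks[ctx->callback_count++];
    cb->object = object;
    cb->event = event;
    cb->fn = fn;
    return 0;
}

/* Queue an event occurrence for later processing by the main loop. */
static int context_signal(ThreadContext *ctx, const Event *event, EventType type) {
    if (ctx->pending_count == MAX_PENDING) return -1;
    size_t slot = (ctx->pending_head + ctx->pending_count) % MAX_PENDING;
    ctx->pending[slot].event = event;
    ctx->pending[slot].type = type;
    ctx->pending_count++;
    return 0;
}

/* "Main loop": drain the queue and run every callback bound to each event. */
static int context_run(ThreadContext *ctx) {
    int dispatched = 0;
    while (ctx->pending_count > 0) {
        Pending item = ctx->pending[ctx->pending_head];
        ctx->pending_head = (ctx->pending_head + 1) % MAX_PENDING;
        ctx->pending_count--;
        for (size_t i = 0; i < ctx->callback_count; i++) {
            if (ctx->callbacks[i].event == item.event) {
                ctx->callbacks[i].fn(ctx->callbacks[i].object, item.type);
                dispatched++;
            }
        }
    }
    return dispatched;
}

typedef struct {
    Event event;
    int connected;
    int reads;
} SocketObject;

static void socket_on_event(void *object, EventType type) {
    SocketObject *self = object;
    switch (type) {
    case EVENT_CONNECTED: self->connected = 1; break;
    case EVENT_READ_DATA: self->reads++; break;
    }
}

/* Constructor-like step: create the object, bind its callback, return. */
static int socket_object_init(SocketObject *self, ThreadContext *ctx) {
    self->event.id = 1;
    self->connected = 0;
    self->reads = 0;
    return context_bind(ctx, self, &self->event, socket_on_event);
}

int event_demo(void) {
    ThreadContext ctx = {0};
    SocketObject sock;
    if (socket_object_init(&sock, &ctx) != 0) return -1;
    context_signal(&ctx, &sock.event, EVENT_CONNECTED);
    context_signal(&ctx, &sock.event, EVENT_READ_DATA);
    context_signal(&ctx, &sock.event, EVENT_READ_DATA);
    int dispatched = context_run(&ctx);
    assert(sock.connected == 1 && sock.reads == 2);
    return dispatched;
}
```

The object does no work when it is created; all of its logic runs later, when the loop in the thread context dispatches queued events to the bound callback.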

Conclusions

  • The Duqu Framework appears to have been written in an unknown programming language.
  • Unlike the rest of the Duqu body, it's not C++ and it's not compiled with Microsoft's Visual C++ 2008.
  • The highly event-driven architecture points to code that was designed to be used in pretty much any conditions, including asynchronous communications.
  • Given the size of the Duqu project, it is possible that the framework was written by a different team from the one that created the drivers and wrote the system infection and exploit code.
  • The mysterious programming language is definitely NOT C++, Objective C, Java, Python, Ada, Lua or any of the many other languages we have checked.
  • Compared to Stuxnet (entirely written in MSVC++), this is one of the defining particularities of the Duqu framework.

The Duqu Framework: What was that?

After having performed countless hours of analysis, we are 100% confident that the Duqu Framework was not programmed with Visual C++. It is possible that its authors used an in-house framework to generate intermediary C code, or they used another completely different programming language.

We would like to appeal to the programming community and ask anyone who recognizes the framework, toolkit or programming language that can generate similar code constructions to contact us or leave a comment on this blog post. We are confident that with your help we can solve this deep mystery in the Duqu story.


161 comments


StanTheMan

2012 Mar 12, 17:04
0
 

possible contender

http://www.ionicwind.com/aurora.html

1. "At first glance, the Payload DLL looks like a regular Windows PE DLL file compiled ...."
1. compiler was written in MSVC++

2. "This slice is different from others, because it was not compiled from C++ sources"
2. Aurora features a C/C++ like syntax - no inheritance

3. "There is no distinction between utility classes (linked lists, hashes) and user-written code"
3. Written in Aurora or IWBasic (same site)

4. "Objects communicate using method calls, deferred execution queues and event-driven callbacks"
4. http://www.ionicwind.com/forums/index.php?topic=4594.0
Ionic Wind Network Client/Server Library
- Sends a Windows message when a connection is ready to accept.
- Sends a Windows message when incoming data is ready to be read

5. "There are no references to run-time library functions, native Windows API is used instead"
5. Completely stand-alone. Makes calls to core Windows API functions only


Sondreal

2012 Mar 12, 16:03
0
 

I

VC++ with OLE objects, I believe. Have you tested this?


Des O'Brien

2012 Mar 12, 07:50
0
 

The way it may have been created.

A method used by systems programmers some time back was to use a multi-phase compiler for a high-level language and intercept the resulting intermediate code. This was then modified to pure assembly, removing NOP and comment instructions. The resulting code ran a lot faster. It is also possible to hide the originating code with such a method.

Have you looked into this type of origin?


iamgk

2012 Mar 12, 06:54
0
 

psather ?


Kiaro

2012 Mar 12, 05:30
0
 

Lotus

hmph... looks like my lost 1992 multi-dimensional / environmentally conditional feedback structure... it was originally in Lotus command language, where the code itself changed depending on non-relevant variables, such as time of day, day of month, the second of last routine... it pulled source from embedded segments of other sub-routines... it would have taken a team years to decipher... I often wondered if that little monster survived.......


Aulis

2012 Mar 12, 03:48
0
 

Old framework, creating small sized executables

Delphi has already been suggested a couple of times. I'll add to this by suggesting that the tool could have been either Delphi OR Lazarus.

And the framework that has been used is not the standard Delphi VCL but the Key Object Library, KOL. http://kolmck.net/
This was once called the "Bonanza" library, but KOL may be the official name nowadays.

I do not think this combination leaves behind any standard Delphi strings that would make it recognizable as a Delphi-compiled application.


infernalmachine

2012 Mar 12, 05:10
0
 

Re: Old framework, creating small sized executables

By Delphi strings, you mean length supplied rather than null terminated strings?


Igor Soumenkov

2012 Mar 12, 11:01
0
 

Re: Re: Old framework, creating small sized executables

It is very easy to identify code generated by the Delphi compiler - it generates very specific code.


MJPMJP

2012 Mar 12, 02:27
0
 

Hello, assembler?

A86 / A386 assembler?


alibadrelsayed

2012 Mar 12, 00:43
0
 

QNX compiler

Maybe it's the compiler used for QNX Neutrino, which is the GNU compiler (gcc). Currently, development can be done from these hosts:

QNX Neutrino
MS-Windows
Solaris

If you have the QNX Momentics Professional Edition, you can create anything using the Integrated Development Environment (IDE) from any host. Alternatively, you can use command-line tools that are based on the GNU compiler.

For MS-Windows hosts, you also have the option of the CodeWarrior tools from Metrowerks. Currently, the CodeWarrior IDE also uses gcc.

Edited by alibadrelsayed, 2012 Mar 12, 01:08


mrlozer

2012 Mar 11, 19:52
0
 

Open object rexx

http://www.oorexx.org/about.html
This is my last guess...


Elizacat

2012 Mar 11, 04:12
1
 

C code with use of COM?

It looks to me like C code with a COM vtable.


Igor Soumenkov

2012 Mar 11, 17:49
0
 

Re: C code with use of COM?

There are actually no 'vtables' in the COM / C++ sense. There is only a list of function pointers inside the instance, and it is not even separated from the data.


\x41\x6c\x62\x65\x72\x74

2012 Mar 11, 00:28
0
 

O´Caml with ocamlc +1

I suspect O'Caml, since it is also OO and generates native code, and its functional characteristics can produce machine code that is a bit different from usual and a bit confusing during RCE.

https://en.wikipedia.org/wiki/OCaml


\x41\x6c\x62\x65\x72\x74

2012 Mar 11, 00:45
1
 

Re: O´Caml with ocamlc +1

...and there is the possibility of it being just an OO language making intensive use of COM, but I need to take a closer look.


sote

2012 Mar 11, 00:23
0
 

Eiffel +1

Why waste time on designing and implementing something which can be achieved out of the box? Maybe you should take a look at Eiffel?
- has OO features
- agents mechanism (which would influence event-driven part of the framework)
- can be compiled into native code

http://en.wikipedia.org/wiki/Eiffel_(programming_language)


mrlozer

2012 Mar 10, 14:19
0
 

kbasic

Maybe it is kbasic. http://kbasic.com/


hossein

2012 Mar 10, 10:14
0
 

Languages Mixing

I think there are two possible explanations.
First, when you use inline assembly in your C++/C code this can happen, because the overall look of the code will be different.
Second, when you integrate another DLL or library into your code, this can also happen, as the DLL or library may be written in assembly language. As you know, this is an old trick: for protection reasons, programmers inject their DLL dependencies into their final product.
So I say they used C++ with one or both of the above-mentioned methods to complicate the reversing.


Satoshi Nakamoto

2012 Mar 10, 02:57
1
 

What about these.....

http://ldeniau.web.cern.ch/ldeniau/html/oopc.html - Nov 2006 and revised in Aug 2007

http://sourceforge.net/projects/cos/

http://ldeniau.home.cern.ch/ldeniau/html/oopc/oopc.html - 2001


bl00d

2012 Mar 10, 02:40
0
 

What about quantum leaps framework ?
A year ago, when I was searching for an RTOS, I thought this was a pretty cool, small and innovative OS based on events, state machines and ...UML. I don't know much about it but you should look into it.


juryben

2012 Mar 10, 02:34
0
 

Wild guess

I'm gonna take a wild guess and say it was coded in C and compiled with the WDK in the Visual Studio IDE. The author could have enabled the /FAs flag, taken the resulting .ASM file(s) and recompiled them with the WDK in Visual Studio.

And I can't see any author mixing and matching languages. I would say it's probably C, as that's just logical, and then converted to ASM.

Edited by juryben, 2012 Mar 10, 08:40


aria.banacha

2012 Mar 10, 01:55
0
 

How about...

Haskell ?
Scala ?


Painkiller

2012 Mar 10, 00:57
0
 

Maybe Pic language

It looks like the Pic language, but that's strange because this language is only for transistor programming...


miki

2012 Mar 10, 00:14
0
 

Tcl language

I think it is Tcl or Incr Tcl. It looks like assembler, but it is object-oriented.


FatherStorm

2012 Mar 09, 23:55
0
 

Compile PHP?

could it be compiled PHP with (newish) Traits? that would explain a global $this even in attached sections.


Chiloane RK

2012 Mar 09, 21:46
0
 

NASM

en.m.wikipedia.org/wiki/Netwide_Assembler


Zarck

2012 Mar 09, 20:11
0
 

Forth ?

In Forth it is possible to create its own instructions, to add them to the core, to recompile core, program the robots, etc... http://en.wikipedia.org/wiki/Forth_(programming_language)


Andreas Bogk

2012 Mar 09, 20:58
0
 

Re: Forth ?

Nope, not a forth. I can tell by the pixels.


MJB

2012 Mar 09, 19:16
0
 

Compiler list?

Is there a public list you are keeping of languages/compilers that you have been able to check against?

I'd also not read too much into the lack of compiler identification in the binaries. There are many ways to obscure the binary after compilation.

Many years ago, when I was writing laptop tracking software that was supposed to be as hidden as possible on the system, I wrote in Power Basic. Its binaries are very tight, it allows in-line assembly, everything is done dynamically, and it generates binaries that use only the native Windows APIs instead of run-times, even when you use its most sophisticated built-in functions. We'd also encode all string constants (including the names of the APIs we called) and make all API calls via LoadLibrary, as part of the binary obfuscation. Last, just before putting together the installation package, every exe and dll (all binaries) was run through a filter which specifically stripped out everything that identified the compiler and language. The end result was that examining the binary without disassembling it made it impossible to tell what APIs we were calling, what any string constants were, or what compiler we used. With disassembling, you'd get the strings and APIs, but not the compiler. (We also did other things seen in viruses to make the laptop tracking software hard to detect, and nearly impossible to remove once detected, etc.)

My point is that the binaries you are examining might have gone through some type of post-compiler process to thwart attempts to trace their origin.

Edited by MJB, 2012 Mar 09, 19:32


G

2012 Mar 09, 20:14
0
 

This is definitely brainfuck!

This is definitely brainfuck!
I just couldn't resist writing this after reading all the comments. People here have mentioned every language I have heard of.
Still, I find this topic very interesting in terms of what language/tools were used. Igor, please let us know the end of the story.

Again, sorry for the trolling :)


MMandrake

2012 Mar 09, 18:50
0
 

New Compiler

It's probably LISP with a re-edited compiler that replaced the syntax of the commands with new ones.


Andreas Bogk

2012 Mar 09, 20:59
0
 

Re: New Compiler

Manual memory management, no type tags: this is not a Lisp.


MMandrake

2012 Mar 09, 22:13
0
 

Re: Re: New Compiler

I had a HP48G (hewlett packard calculator) that used something similar for programming


Robert M

2012 Mar 09, 18:50
0
 

Other C/C++ compiler?

Isn't it possible that the original language is still C/C++, but the code generation is done by something other than MSC++/Visual Studio?

Could it be the Intel C/C++ compiler (available for evaluation from intel.com), Clang or some older version of a compiler, e.g. gcc?


igorsk

2012 Mar 09, 19:07
1
 

Re: Other C/C++ compiler?

I'm 99% sure the machine code was generated by MSVC. It's something you get a feel for with experience, but I can point out two things that are quite characteristic of MSVC: 1) it uses esi as the first candidate for temporary storage; 2) "pop ecx" instead of "add esp, 4".


Igor Soumenkov

2012 Mar 10, 00:11
0
 

Re: Re: Other C/C++ compiler?

igorsk, thanks for the hint. It turns out that almost the same code can be produced by the MSVC compiler for a "hand-made" C class. This means that a custom OO C framework is the most probable answer to our question.
We kept this (OO C) version as a "worst-case" explanation, because that would mean that the amount of time and effort invested in development of the Framework is enormous compared to other languages/toolkits.


exeman

2012 Mar 10, 22:16
0
 

Re: Re: Re: Other C/C++ compiler?

How about GObject OO framework for C? It is old, stable, common and focused on signals.

Edited by exeman, 2012 Mar 10, 22:53


Igor Soumenkov

2012 Mar 11, 00:15
0
 

Re: Re: Re: Re: Other C/C++ compiler?

Code generated with the GObject type system looks similar, but it tends to be more verbose.


jonwil

2012 Mar 09, 18:57
0
 

Re: Other C/C++ compiler?

I have seen how GCC works internally and its ABI (for a number of different versions) and I can confirm that the Duqu code is definitely not generated by GCC. I don't know how other C++ compilers work, but the things I see in the ASM (like where the pointers to the functions go, the way the "this" pointer is passed, etc.) do not suggest C++ to me but something else entirely (such as the aforementioned "object-oriented" frameworks for C that exist).

We know that it has to be 32-bit Windows (and probably modern) and that it's not a payload for some embedded system, because it's calling Windows APIs. We know that whatever it is, it is spitting out .obj files compatible with the Microsoft compiler.

More information is needed (such as any strings in the file or the ASM for the memory allocate/free functions or more about exactly which dlls this imports from) to truly figure this out IMO.


mrlozer

2012 Mar 09, 18:30
0
 

Another guess

It is ActionScript.


ZuZ

2012 Mar 09, 17:55
0
 

Synon

Looks very much like Synon Code to me.


igorsk

2012 Mar 09, 16:20
0
 

Simple Object Orientation (for C)

It seems someone over at reddit (http://www.reddit.com/r/ReverseEngineering/) hit the jackpot: the code snippets look _very_ similar to what this would produce:

http://daifukkat.su/wiki/index.php/SOO

There are a few other OO frameworks for C, but they don't match as well:
http://ooc-coding.sourceforge.net/
http://sooc.sourceforge.net/


Igor Soumenkov

2012 Mar 10, 00:03
0
 

Re: Simple Object Orientation (for C)

SOO may be the correct answer! But there are still two things to figure out:
1) When was SOO C created? I see Oct 2010 in git. That's too late; Duqu was already out there.
2) If SOO is the toolkit, then the event-driven model was created by the authors of Duqu. Given the size of the framework-based code, they must have spent 1+ year making everything work correctly.


dooqoo

2012 Mar 14, 00:49
0
 

Re: Re: Simple Object Orientation (for C)

I see a SourceForge project for SOOC which dates back at least 5 years: http://sourceforge.net/projects/sooc/


acsMike

2012 Mar 09, 17:10
0
 

Re: Simple Object Orientation (for C)

If this is so, what benefits do you think the author was after?


eternity

2012 Mar 09, 16:09
0
 

Rational Rose compiler

Using oriented programming...
Old school.
