
The Mystery of the Duqu Framework

Igor Soumenkov
Kaspersky Lab Expert
Posted March 07, 15:58  GMT
Tags: Duqu

While analyzing the components of Duqu, we discovered an interesting anomaly in the main component that is responsible for its business logic, the Payload DLL. We would like to share our findings and ask for help identifying the code.

Code layout

At first glance, the Payload DLL looks like a regular Windows PE DLL file compiled with Microsoft Visual Studio 2008 (linker version 9.0). The entry point code is absolutely standard, and there is one function exported by ordinal number 1 that also looks like MSVC++. This function is called from the PNF DLL and it is actually the “main” function that implements all the logic of contacting C&C servers, receiving additional payload modules and executing them. The most interesting part is how this logic was programmed and what tools were used.

The code section of the Payload DLL is typical of a binary built from several pieces of code. It consists of “slices” of code that may have been compiled into separate object files before being linked into a single DLL. Most of them can be found in any C++ program: Standard Template Library (STL) functions, run-time library functions and user-written code. The exception is the biggest slice, which contains most of the C&C interaction code.


Layout of the code section of the Payload DLL file

This slice is different from the others, because it was not compiled from C++ sources. It contains no references to any standard or user-written C++ functions, but it is definitely object-oriented. We call it the Duqu Framework.

The Framework

Features

The code that implements the Duqu Framework has several distinctive properties:

  • Everything is wrapped into objects
  • The function table is placed directly into the class instance and can be modified after construction
  • There is no distinction between utility classes (linked lists, hashes) and user-written code
  • Objects communicate using method calls, deferred execution queues and event-driven callbacks
  • There are no references to run-time library functions; the native Windows API is used instead

Objects

All objects are instances of some class; we identified 60 classes. Each object is constructed with a “constructor” function that allocates memory, fills in the function table and initializes members.


Constructor function for the linked list class.
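
To make the description concrete, here is a minimal C sketch of what such a constructor could look like at source level. It is only an illustration under our own assumptions, not the recovered code: every name (List, list_new, list_demo and so on) is hypothetical, and the sketch uses malloc/free for brevity, whereas the framework itself calls the native Windows API rather than the run-time library. It shows a function table stored inside the instance, a destructor that frees everything the object owns, and a table slot being replaced after construction.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the recovered Duqu code. */
typedef struct List List;
typedef struct Node { struct Node *next; int value; } Node;

struct List {
    /* The function table lives in the instance itself, not in a shared vtable. */
    void (*push)(List *self, int value);
    int  (*count)(const List *self);
    void (*destroy)(List *self);
    /* Data members follow the function pointers. */
    Node *head;
    int   length;
};

static void list_push(List *self, int value) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->value = value;
    n->next = self->head;
    self->head = n;
    self->length++;
}

static int list_count(const List *self) { return self->length; }

/* "Destructor": frees everything the object owns, then the object itself. */
static void list_destroy(List *self) {
    Node *n = self->head;
    while (n != NULL) {
        Node *next = n->next;
        free(n);
        n = next;
    }
    free(self);
}

/* "Constructor": allocates memory, fills in the function table, initializes members. */
List *list_new(void) {
    List *self = malloc(sizeof *self);
    assert(self != NULL);
    self->push = list_push;
    self->count = list_count;
    self->destroy = list_destroy;
    self->head = NULL;
    self->length = 0;
    return self;
}

/* A replacement method, used to show that a slot can be swapped after construction. */
static int count_doubled(const List *self) { return 2 * self->length; }

int list_demo(void) {
    List *l = list_new();
    l->push(l, 10);
    l->push(l, 20);
    int before = l->count(l);   /* 2 */
    l->count = count_doubled;   /* per-instance override of one table slot */
    int after = l->count(l);    /* 4 */
    l->destroy(l);
    return before * 10 + after; /* 24 */
}
```

Because the table is part of the instance, overriding a slot affects only that one object. A shared C++ vtable would not normally allow this.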

The layout of each object depends on its class. Some classes appear to have binary compatible function tables, but there is no indication that they have any common parent classes (like in other OO languages). Furthermore, the location of the function table is not fixed: some classes have it at offset 0 of the instance, but others do not.


Layout of the linked list object. First 10 fields are pointers to member functions.
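
The following C sketch illustrates the two layout properties described above: function tables that are binary compatible without any shared parent class, and a table that does not always sit at offset 0. The Ops, Stack and Range types and the describe helper are assumptions made up for this example; they do not correspond to classes found in the Payload DLL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table shape that two unrelated classes happen to share. */
typedef struct Ops {
    int (*size)(void *self);
    int (*first)(void *self);
} Ops;

/* Class A keeps its function table at offset 0 of the instance. */
typedef struct Stack {
    Ops ops;
    int items[4];
    int used;
} Stack;

/* Class B keeps its data first and the function table at a later offset. */
typedef struct Range {
    int lo;
    int hi;
    Ops ops;
} Range;

static int stack_size(void *self)  { return ((Stack *)self)->used; }
static int stack_first(void *self) { return ((Stack *)self)->items[0]; }
static int range_size(void *self)  { Range *r = self; return r->hi - r->lo; }
static int range_first(void *self) { return ((Range *)self)->lo; }

void stack_init(Stack *s) {
    s->ops.size = stack_size;
    s->ops.first = stack_first;
    s->used = 0;
}

void range_init(Range *r, int lo, int hi) {
    r->lo = lo;
    r->hi = hi;
    r->ops.size = range_size;
    r->ops.first = range_first;
}

/* Generic code takes the instance and its table separately,
   because the table offset differs from class to class. */
int describe(void *self, const Ops *ops) {
    return ops->size(self) * 100 + ops->first(self);
}

int layout_demo(void) {
    Stack s;
    Range r;
    stack_init(&s);
    s.items[0] = 7;
    s.used = 1;
    range_init(&r, 3, 8);
    assert(offsetof(Stack, ops) == 0);  /* table at the start of the instance */
    assert(offsetof(Range, ops) != 0);  /* table after the data members */
    /* 107 for the stack, 503 for the range */
    return describe(&s, &s.ops) * 1000 + describe(&r, &r.ops);
}
```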

Objects are destroyed by corresponding “destructor” functions. These functions usually destroy all objects referenced by member fields and free any memory used.

Member functions can be referenced by the object’s function table (like “virtual” functions in C++) or they can be called directly. In most object-oriented languages, member functions receive the “this” parameter that references the instance of the object, and there is a calling convention that defines the location of the parameter – either in a register or on the stack. This is not the case for the Duqu Framework classes – they can receive the “this” parameter in any register or on the stack.


Member function of the linked list, receives “this” parameter on stack
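
One way such code can arise, offered only as a hedged illustration, is when the instance pointer is an ordinary, explicitly passed argument. In the C sketch below (all names hypothetical), “this” is the first parameter of one function and the last parameter of another; where the compiler finally places it, in a register or on the stack, then depends on the calling convention and optimization settings in effect. The sketch also shows the two call styles mentioned above: through a function pointer and directly.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Counter { int total; } Counter;

/* "this" as the first argument, the usual convention. */
static int counter_add(Counter *self, int n) {
    assert(self != NULL);
    self->total += n;
    return self->total;
}

/* "this" as the last argument: equally valid when the caller passes it explicitly. */
static int counter_scale(int factor, int offset, Counter *self) {
    assert(self != NULL);
    self->total = self->total * factor + offset;
    return self->total;
}

int this_demo(void) {
    Counter c = { 1 };
    int (*via_table)(Counter *, int) = counter_add; /* call through a pointer */
    via_table(&c, 4);                               /* total = 1 + 4 = 5 */
    return counter_scale(3, 2, &c);                 /* direct call: 5 * 3 + 2 = 17 */
}
```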

Event driven framework

The layout and implementation of objects in the Duqu Framework is definitely not native to the C++ that was used to program the rest of the Trojan. There is an even more interesting feature of the framework that is used extensively throughout the whole code: it is event driven.

There are special objects that implement the event-driven model:

  • Event objects, based on native Windows API handles
  • Thread context objects that hold lists of events and deferred execution queues
  • Callback objects that are linked to events
  • Event monitors, created by each thread context for monitoring events and executing callback objects
  • Thread context storage, which manages the list of active threads and provides access to per-thread context objects

This event-driven model resembles Objective C and its message passing features, but the code does not have any direct references to the language, nor does it look like it was compiled with any known Objective C compiler.


Event-driven model of the Duqu Framework

Every thread context object can start a “main loop” that looks for and processes new items in the lists. Most of the Duqu code follows the same principle: create an object, bind several callbacks to internal or external events and return. Callback handlers are then executed by the event monitor object that is created within each thread context.
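
The main loop described above can be approximated with a short, portable C sketch. This is our simplification, not the recovered implementation: the Event type is a plain flag standing in for a native Windows handle, the queues are fixed-size arrays, and all names (ThreadContext, ctx_bind, ctx_defer, ctx_run_once) are hypothetical. One pass of the loop first drains the deferred execution queue and then runs every callback whose event is signaled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITEMS 8

/* Hypothetical stand-in: a real event would wrap a native Windows handle. */
typedef struct Event { int signaled; } Event;

typedef struct Callback {
    Event *event;                 /* event this callback is linked to (NULL if deferred) */
    void (*handler)(void *owner); /* function to run */
    void *owner;                  /* object that registered the callback */
} Callback;

typedef struct ThreadContext {
    Callback callbacks[MAX_ITEMS]; /* callbacks bound to monitored events */
    int callback_count;
    Callback deferred[MAX_ITEMS];  /* deferred execution queue */
    int deferred_count;
} ThreadContext;

int ctx_bind(ThreadContext *ctx, Event *ev, void (*handler)(void *), void *owner) {
    if (ctx->callback_count == MAX_ITEMS) return -1;
    ctx->callbacks[ctx->callback_count++] = (Callback){ ev, handler, owner };
    return 0;
}

int ctx_defer(ThreadContext *ctx, void (*handler)(void *), void *owner) {
    if (ctx->deferred_count == MAX_ITEMS) return -1;
    ctx->deferred[ctx->deferred_count++] = (Callback){ NULL, handler, owner };
    return 0;
}

/* One pass of the "main loop". Returns the number of handlers executed. */
int ctx_run_once(ThreadContext *ctx) {
    assert(ctx != NULL);
    int ran = 0;

    /* Copy the queue first so that handlers may safely defer more work. */
    Callback batch[MAX_ITEMS];
    int pending = ctx->deferred_count;
    for (int i = 0; i < pending; i++) batch[i] = ctx->deferred[i];
    ctx->deferred_count = 0;
    for (int i = 0; i < pending; i++) {
        batch[i].handler(batch[i].owner);
        ran++;
    }

    /* Event monitor: run callbacks whose event is signaled (auto-reset). */
    for (int i = 0; i < ctx->callback_count; i++) {
        Callback *cb = &ctx->callbacks[i];
        if (cb->event->signaled) {
            cb->event->signaled = 0;
            cb->handler(cb->owner);
            ran++;
        }
    }
    return ran;
}

static void bump(void *owner) { (*(int *)owner)++; }

int loop_demo(void) {
    ThreadContext ctx = {0};
    Event ready = {0};
    int hits = 0;
    ctx_bind(&ctx, &ready, bump, &hits);
    ctx_defer(&ctx, bump, &hits);
    int first = ctx_run_once(&ctx);  /* deferred item only: 1 */
    ready.signaled = 1;
    int second = ctx_run_once(&ctx); /* event callback only: 1 */
    int third = ctx_run_once(&ctx);  /* nothing pending: 0 */
    return hits * 1000 + first * 100 + second * 10 + third;
}
```

A real monitor would block on the native handles instead of polling flags, but the control flow is the same: objects register callbacks and return, and the loop invokes them later.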

Here is an example pseudocode for a socket object:

SocketObjectConstructor {
    NativeSocket = socket();
    SocketEvent = new MonitoredEvent(NativeSocket);
    SocketObjectCallback = new ObjectCallback(this, SocketEvent, OnCallbackFunc);
    connect(NativeSocket, ...);
}
OnCallbackFunc {
    switch(GetType(Event)) {
    case Connected: ...
    case ReadData: ...
    ...
    }
}
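
For readers who prefer something they can compile, the pseudocode above can be approximated in plain C. This is a sketch under stated assumptions: no real network socket is opened, the events are delivered by hand through a hypothetical monitor_dispatch function, and every identifier is ours rather than a name taken from the binary.

```c
#include <assert.h>
#include <string.h>

typedef enum { EV_CONNECTED, EV_READ_DATA } EventType;

/* Hypothetical event record; the real object wraps a native handle. */
typedef struct MonitoredEvent {
    EventType type;
    const char *data; /* payload for EV_READ_DATA, otherwise NULL */
} MonitoredEvent;

typedef struct SocketObject SocketObject;
typedef void (*CallbackFn)(SocketObject *self, const MonitoredEvent *ev);

struct SocketObject {
    CallbackFn on_event; /* callback pointer stored in the instance */
    int connected;
    char received[32];
};

/* Counterpart of OnCallbackFunc: dispatch on the event type. */
static void on_callback(SocketObject *self, const MonitoredEvent *ev) {
    assert(ev != NULL);
    switch (ev->type) {
    case EV_CONNECTED:
        self->connected = 1;
        break;
    case EV_READ_DATA:
        strncpy(self->received, ev->data, sizeof self->received - 1);
        self->received[sizeof self->received - 1] = '\0';
        break;
    }
}

/* Counterpart of SocketObjectConstructor: bind the callback and return. */
void socket_object_init(SocketObject *self) {
    self->on_event = on_callback;
    self->connected = 0;
    self->received[0] = '\0';
}

/* Stand-in for the event monitor, which would call this when an event fires. */
void monitor_dispatch(SocketObject *target, const MonitoredEvent *ev) {
    target->on_event(target, ev);
}

int socket_demo(void) {
    SocketObject s;
    MonitoredEvent connected = { EV_CONNECTED, NULL };
    MonitoredEvent data = { EV_READ_DATA, "hello" };
    socket_object_init(&s);
    monitor_dispatch(&s, &connected);
    monitor_dispatch(&s, &data);
    return s.connected && strcmp(s.received, "hello") == 0; /* 1 on success */
}
```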

Conclusions

  • The Duqu Framework appears to have been written in an unknown programming language.
  • Unlike the rest of the Duqu body, it's not C++ and it's not compiled with Microsoft's Visual C++ 2008.
  • The highly event-driven architecture points to code that was designed to be used in almost any conditions, including asynchronous communications.
  • Given the size of the Duqu project, it is possible that the framework was written by a different team than the one that created the drivers and wrote the system infection and exploits.
  • The mysterious programming language is definitely NOT C++, Objective C, Java, Python, Ada, Lua or any of the many other languages we have checked.
  • Compared to Stuxnet (entirely written in MSVC++), this is one of the defining particularities of the Duqu framework.

The Duqu Framework: What was that?

After countless hours of analysis, we are 100% confident that the Duqu Framework was not programmed with Visual C++. It is possible that its authors used an in-house framework to generate intermediary C code, or that they used a completely different programming language.

We would like to appeal to the programming community and ask anyone who recognizes the framework, toolkit or programming language that can generate similar code constructions to contact us or leave a comment on this blog post. We are confident that with your help we can solve this deep mystery in the Duqu story.


161 comments


nbtaekbfgt

2012 Mar 08, 23:11
0
 

realbasic

Could it be realbasic?


As400tech

2012 Mar 09, 00:03
0
 

That code looks familiar

The code you're referring to .. the unknown C++ looks like the older IBM compilers found in OS400, SYS38 and the oldest SYS36.

The C++ code was used to write the tcp/ip stack for the operating system and all of the communications. The protocols used were the following x.21(async) all modes, Sync SDLC, x.25 Vbiss5 10 15 and 25. CICS. RSR232. This was a very small and powerful communications framework. The IBM system 36 had only 300MB hard drive and one megabyte of memory; the operating system came on diskettes.

This would be very useful in this virus. It can track and monitor all types of communications. It can connect to everything and anything.


Igor Soumenkov

2012 Mar 09, 00:11
0
 

Re:

Thank you!


Igor Soumenkov

2012 Mar 09, 00:12
0
 

Re: Language ideas

It is definitely not a CLR based language. Native code only.


Igor Soumenkov

2012 Mar 09, 00:13
0
 

Re: Its Iron Python

No traces of the .NET framework or JIT.


Igor Soumenkov

2012 Mar 09, 00:14
0
 

Re: Guess

We've tried D, too.


Igor Soumenkov

2012 Mar 09, 00:40
0
 

Re: WEB Guess

The Duqu Framework shares many principles of libevent, but it is completely object-oriented, even all events and callbacks are wrapped in objects.
Some APIs that are called by the Duqu event monitor object are not present in sources of libevent.
Anyway, we should study the sources of libevent again, to be 100% sure. Thanks!


Igor Soumenkov

2012 Mar 09, 00:42
0
 

Re: My first thought...

We tried Vala, too. Unfortunately, the generated code is completely different.


Igor Soumenkov

2012 Mar 09, 00:45
0
 

Re: Google's Go language?

Go was one of the first languages to check. That's definitely not Go.


Hans

2012 Mar 09, 01:25
0
 

Different

Rather than listing all programming languages .. when I was much younger I took assembler output from compilers and modified it.
Now for a serious amount of code that would be far too much effort, I'll be the first to admit that.
What if the programming team is not a commercial organisation, and doesn't care about spending a few weeks on this..?? the resulting code would have an uncertain, and unknown signature, somewhat close to an as yet unknown programming language.


Igor Soumenkov

2012 Mar 09, 01:40
0
 

Re: It's most probably Lisp, inspired by Mosquito Lisp

Thank you Wes! Could you please suggest a Lisp implementation that we should check in the first place?


Ross Smith

2012 Mar 09, 02:35
0
 

Destructors may be a clue

The presence of destructor functions in all classes, which (if I'm reading your sample correctly) deallocate memory as well as releasing anything else owned by the object, may be an important clue. Deterministic destruction, as opposed to garbage collection, is a feature very few languages have. It certainly rules out several of the suggestions in the comments, such as Lisp or JavaScript.

C++ is the only language in which it's a commonly used idiom; the only other languages I know of that make it available (but not often used in practice) are Python (only in CPython), Perl 5, and the most recent incarnation of Objective-C. (Maybe D, I'm not sure.) You seem to have checked all those.


puff65537

2012 Mar 09, 02:53
0
 

SCIL in microSCADA

Have you tried looking at this? It drags its own event loop because of its history.


jgeorge44

2012 Mar 09, 04:27
0
 

It's been a looooong time since I've worked on them, but this does smell to me a little bit like something that'd come out of an AS/400 compiler as well. RPG/400? But not sure why that'd be the way to go to code for Windows.


clojuredev

2012 Mar 09, 04:37
0
 

Re: Re: It's most probably Lisp, inspired by Mosquito Lisp

It's probably the Ferret Lisp to C++ compiler ( http://nakkaya.com/2011/06/29/ferret-an-experimental-clojure-compiler/ )

"Ferret: An Experimental Clojure Compiler

Ferret is an experimental Lisp to C++ compiler, the idea was to compile code that is written in a very small subset of Clojure to be automatically translated to C++ so that I can program stuff in Clojure where JVM or any other Lisp dialect is not available. "


thebill

2012 Mar 09, 06:03
0
 

Re: Destructors may be a clue

Good thought on destructors. One other possibility: In Ada, overriding the Finalize procedure creates a destructor for an object. Described here: http://en.wikibooks.org/wiki/Ada_Programming/Object_Orientation#Destructors


thebill

2012 Mar 09, 07:04
0
 

Re: Guess: Ada tasks?

It's also interesting to note:
- The GNU Ada reference library (GNARL) has a function InitializeCriticalSection. Critical sections are commonly used in Ada tasks, such as to implement synchronization by monitors and semaphores.
- You can build Windows DLLs with the Ada GNAT compiler. See http://www.adacore.com/wp-content/files/auto_update/gnat-unw-docs/html/gnat_ugn_38.html.
- See later post: Object-oriented Ada does allow you to implement destructors for your objects:
http://en.wikibooks.org/wiki/Ada_Programming/Object_Orientation
- Ada is not used by many people, but is used widely in government and defense.


WilliamOckham

2012 Mar 09, 07:05
0
 

Embedded systems compiler?

The description of the features identified makes me think that you are looking at code from a compiler that targets devices. The giveaway is the lack of inheritance in an object based system and the ability to pass the "this" pointer in multiple ways. Those are features that can be useful in a resource constrained system. I would look at something like QNX or a competitor.


Wes Brown

2012 Mar 09, 08:12
0
 

Re: Re: It's most probably Lisp, inspired by Mosquito Lisp

Igor,

Your first mistake is assuming that they are using an off the shelf compiler. Scott Dunlop wrote Mosquito Lisp, which is an entire virtual machine with a byte code language and a dialect of Scheme combined with Lisp in about nine months or so.

Someone who is smart and motivated, as the Duqu people were, could dedicate someone to writing a compiler in-house in the same timeframe, but targeted towards x86 object systems -- we could have done this, but we wanted to transmit byte code and be portable across multiple architectures. Different goals here. You also presume that they are using an off the shelf linker. Mosquito Lisp and Wasp Lisp append byte code to the end of the VM stub.

-Wes


Wes Brown

2012 Mar 09, 08:17
0
 

Re: Re: Re: It's most probably Lisp, inspired by Mosquito Lisp

Probably not. Duqu and Stuxnet components date to 2007, predating this. I would also point out that as of 2006, this particular technique of a virtual machine to evade detection and reverse engineering was known.


pdw

2012 Mar 09, 11:09
0
 

The object system matches what I remember of Wirth's Oberon. In Oberon, an object was really just a struct with method pointers. Conventionally you'd place them at the beginning of the struct, but you didn't have to. The "this" pointer had to be passed manually (normally as the first parameter, but this was not required). Inheritance was done by specifying that struct B should start with the same members as struct A.

However using Oberon today would be very anachronistic, and the object system is so minimal that it must have been many independent creations.


Matthias Braun

2012 Mar 09, 12:33
2
 

object oriented C

Is there any reason why this isn't simply C code written in an object oriented fashion? Putting function pointers into structs looks like classes. When you are explicitly passing "this" around as a function parameter, you might sometimes choose to use the 2nd or 3rd argument for it, and if the compiler only puts the first 2 arguments into registers, the other arguments end up on the stack...

At least I like to code in C and often find myself using function pointers and an object-oriented style where it makes sense.


SJP

2012 Mar 09, 12:59
0
 

Small Talk

Could this be a variant of SmallTalk?


acsMike

2012 Mar 09, 14:34
0
 

Go

Is it... Google Go?


acsMike

2012 Mar 09, 14:39
0
 

Re: Go

The destructors may point in another direction though.


acsMike

2012 Mar 09, 14:48
0
 

Re: Re: Go

Or has anyone mentioned REALbasic yet? It would probably plug in well in Visual Studio. It would also be prestigious to create a potent virus in Basic.


jonwil

2012 Mar 09, 15:06
0
 

Possibilities

Borland Delphi? (although if it was Delphi, there would be specific strings in there that would identify it)
C code written to fake object orientation using structures? (glib for one does stuff like this and it's not unheard of in other circles)
Whatever it is, it's clearly not some sort of blah-to-C++ converter, otherwise the output would look like the output of a C++ compiler.

I think that posting any strings from the exe might help identify the compiler, as would identifying any signatures of the linker or the compilers used for other bits of code. If it's known that Microsoft Visual C++ was used for other parts of the code, that would probably rule out compilers where the output isn't compatible with Visual C++.

Also examining the memory allocation functions (labeled new and free in the above images) might narrow down if they match any known language or if they are custom written.

EDIT:
I didn't see the mention that the Visual C++ 2008 linker was used.

That rules out Delphi and probably Borland C++/Borland C++ Builder.

I still think there may be strings in there that hint at the compiler, or that the memory allocate/free functions could give clues (or that details of which DLLs it imports from and which APIs it imports could rule in or out possible options).

I can also rule out GCC G++ from the look of the ASM; it's definitely not any version of GCC G++ that I know of.


lizardluser

2012 Mar 09, 15:37
0
 

Interesting

Considering it uses destructors, plus the payload had something to do with breaking non windows systems in a nuclear facility (windows would be exclusively for reading data), plus real-time implications, I would suggest the assembler for a HEX/ROM file generated by a tool like Paradigm C++ destined for an old 386 or 486. If they had a windows box reading data through RS232 from a critical subsystem, and the port wasn't configured exclusively for output, it would be very possible to drop a new ROM into the subsystem. Lots of really weird implications if this is true.

Thanks for the post Igor, this was a lot of fun!

