The Mystery of the Duqu Framework

Igor Soumenkov
Kaspersky Lab Expert
Posted March 07, 15:58  GMT
Tags: Duqu

While analyzing the components of Duqu, we discovered an interesting anomaly in the main component responsible for its business logic, the Payload DLL. We would like to share our findings and ask for help identifying the code.

Code layout

At first glance, the Payload DLL looks like a regular Windows PE DLL file compiled with Microsoft Visual Studio 2008 (linker version 9.0). The entry point code is absolutely standard, and there is one function exported by ordinal number 1 that also looks like MSVC++. This function is called from the PNF DLL and it is actually the “main” function that implements all the logic of contacting C&C servers, receiving additional payload modules and executing them. The most interesting part is how this logic was programmed and what tools were used.

The code section of the Payload DLL is typical of a binary that was made from several pieces of code. It consists of “slices” of code that may have been initially compiled in separate object files before they were linked into a single DLL. Most of them can be found in any C++ program, like the Standard Template Library (STL) functions, run-time library functions and user-written code. The exception is the biggest slice, which contains most of the C&C interaction code.


[Figure: Layout of the code section of the Payload DLL file]

This slice is different from others, because it was not compiled from C++ sources. It contains no references to any standard or user-written C++ functions, but is definitely object-oriented. We call it the Duqu Framework.

The Framework

Features

The code that implements the Duqu Framework has several distinctive properties:

  • Everything is wrapped into objects
  • The function table is placed directly into the class instance and can be modified after construction
  • There is no distinction between utility classes (linked lists, hashes) and user-written code
  • Objects communicate using method calls, deferred execution queues and event-driven callbacks
  • There are no references to run-time library functions; the native Windows API is used instead

Objects

All objects are instances of some class; we identified 60 classes. Each object is constructed with a “constructor” function that allocates memory, fills in the function table and initializes members.


[Figure: Constructor function for the linked list class]
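The constructor behaviour described above can be modeled roughly as follows. This is only an illustrative sketch in Python, not recovered code: a dict stands in for the raw instance memory, and all names (`linked_list_constructor`, `list_append`, `list_count`, `patched_count`) are hypothetical. It shows the function table living inside each instance, so one object can be patched after construction without affecting another.

```python
# Illustrative model only: names and layout are assumptions, not recovered code.

def list_append(this, value):
    """Append a value; 'this' is passed explicitly as the first argument."""
    this["items"].append(value)


def list_count(this):
    """Return the number of stored items."""
    return len(this["items"])


def linked_list_constructor():
    """Allocate an instance, fill in its function table, initialize members."""
    this = {}                      # stands in for "allocate memory"
    this["append"] = list_append   # function table stored in the instance itself
    this["count"] = list_count
    this["items"] = []             # ordinary data member
    return this


def patched_count(this):
    """Hypothetical replacement method used to show per-instance patching."""
    return -1


a = linked_list_constructor()
b = linked_list_constructor()
a["append"](a, "x")
a["count"] = patched_count   # modifies only a's function table
```

After this runs, calling `a["count"](a)` goes through the patched entry while `b["count"](b)` still reaches the original function, which is the property a shared, read-only C++ vtable would not give.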

The layout of each object depends on its class. Some classes appear to have binary compatible function tables but there is no indication that they have any common parent classes (like in other OO languages). Furthermore, the location of the function table is not fixed: some classes have it at offset 0 of the instance, but some do not.


[Figure: Layout of the linked list object. First 10 fields are pointers to member functions.]

Objects are destroyed by corresponding “destructor” functions. These functions usually destroy all objects referenced by member fields and free any memory used.
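A minimal sketch of that destructor pattern, under the same modeling assumptions (dicts as instances, hypothetical names such as `context_destructor` and `event_destructor`). Python releases memory on its own, so "freeing" is simulated here by clearing the instance and recording the call in a log.

```python
# Hypothetical model; "free" is simulated by clearing fields and logging.

destroyed = []  # log of destroyed objects, for demonstration only


def event_destructor(this):
    """Destroy a leaf object that owns no other objects."""
    destroyed.append(this["name"])
    this.clear()


def event_constructor(name):
    return {"name": name, "destroy": event_destructor}


def context_destructor(this):
    """Destroy every object referenced by member fields, then the instance."""
    for member in this["events"]:
        member["destroy"](member)
    destroyed.append(this["name"])
    this.clear()


def context_constructor(name):
    return {
        "name": name,
        "destroy": context_destructor,
        "events": [event_constructor("ev1"), event_constructor("ev2")],
    }


ctx = context_constructor("ctx")
ctx["destroy"](ctx)
```

The owned objects are torn down before the owner, so the log ends up listing the two events first and the context last.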

Member functions can be referenced by the object’s function table (like “virtual” functions in C++) or they can be called directly. In most object-oriented languages, member functions receive the “this” parameter that references the instance of the object, and there is a calling convention that defines the location of the parameter: either in a register or on the stack. This is not the case for the Duqu Framework classes. They can receive the “this” parameter in any register or on the stack.


[Figure: Member function of the linked list, receives “this” parameter on stack]
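The two call paths mentioned above (through the instance's function table, or directly) can be illustrated with a small Python model. Python cannot express register-versus-stack passing, so `this` is simply an explicit first argument here; the names `list_push`, `make_list` and `push_logged` are assumptions made for the example.

```python
def list_push(this, value):
    """Original member function: store a value, return the new length."""
    this["items"].append(value)
    return len(this["items"])


def make_list():
    """Constructor: the only table entry points at list_push."""
    return {"push": list_push, "items": []}


def push_logged(this, value):
    """Hypothetical replacement installed in one instance's table."""
    this["log"] = this.get("log", 0) + 1
    return list_push(this, value)


lst = make_list()
lst["push"] = push_logged

via_table = lst["push"](lst, 1)   # indirect call: sees the patched entry
direct = list_push(lst, 2)        # direct call: bypasses the table entirely
```

Only the indirect call increments the `log` counter, which is why a binary that mixes both call styles behaves differently once a table entry has been replaced.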

Event driven framework

The layout and implementation of objects in the Duqu Framework is definitely not native to the C++ that was used to program the rest of the Trojan. There is an even more interesting feature of the framework that is used extensively throughout the whole code: it is event driven.

There are special objects that implement the event-driven model:

  • Event objects, based on native Windows API handles
  • Thread context objects that hold lists of events and deferred execution queues
  • Callback objects that are linked to events
  • Event monitors, created by each thread context for monitoring events and executing callback objects
  • Thread context storage, which manages the list of active threads and provides access to per-thread context objects

This event-driven model resembles Objective C and its message passing features, but the code does not have any direct references to the language, nor does it look like it was compiled with known Objective C compilers.


[Figure: Event-driven model of the Duqu Framework]

Every thread context object can start a “main loop” that looks for and processes new items in the lists. Most of the Duqu code follows the same principle: create an object, bind several callbacks to internal or external events and return. Callback handlers are then executed by the event monitor object that is created within each thread context.
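The "create, bind callbacks, return" principle and the per-thread main loop can be sketched as below. This is a simplified model, not the framework's implementation: the class names (`Event`, `Callback`, `ThreadContext`) are hypothetical, events are plain flags rather than Windows handles, and the loop exits when idle instead of blocking on a wait.

```python
from collections import deque


class Event:
    """Stand-in for an event object that would wrap a native handle."""
    def __init__(self, name):
        self.name = name
        self.signaled = False

    def signal(self):
        self.signaled = True


class Callback:
    """Links an event to a handler."""
    def __init__(self, event, handler):
        self.event = event
        self.handler = handler


class ThreadContext:
    """Holds the callback list and deferred execution queue for one thread."""
    def __init__(self):
        self.callbacks = []
        self.deferred = deque()

    def bind(self, event, handler):
        self.callbacks.append(Callback(event, handler))

    def defer(self, func):
        self.deferred.append(func)

    def run_once(self):
        """One monitor pass: fire signaled events, then run deferred items."""
        ran = 0
        for cb in list(self.callbacks):
            if cb.event.signaled:
                cb.event.signaled = False
                cb.handler(cb.event)
                ran += 1
        while self.deferred:
            self.deferred.popleft()()
            ran += 1
        return ran

    def main_loop(self):
        """Keep processing until a pass finds nothing to do."""
        while self.run_once():
            pass


trace = []
ctx = ThreadContext()
connected = Event("connected")


def on_connected(event):
    trace.append(event.name)
    ctx.defer(lambda: trace.append("deferred"))


ctx.bind(connected, on_connected)
connected.signal()
ctx.main_loop()
```

The handler itself schedules more work on the deferred queue, so control always returns to the monitor between steps rather than nesting calls.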

Here is an example pseudocode for a socket object:

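SocketObjectConstructor {
    NativeSocket = socket();                          // create the native socket
    SocketEvent = new MonitoredEvent(NativeSocket);   // wrap it in an event object
    // bind a callback to the event; the event monitor invokes it later
    SocketObjectCallback = new ObjectCallback(this, SocketEvent, OnCallbackFunc);
    connect(NativeSocket, ...);                       // start connecting, then return
}
OnCallbackFunc {
    switch(GetType(Event)) {
    case Connected: ...
    case ReadData: ...
    ...
    }
}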
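For readers who want something runnable, the same shape can be approximated with Python's standard `selectors` and `socket` modules. This is a loose analogue under stated assumptions, not a reconstruction: `SocketObject` and `run_monitor` are hypothetical names, only the ReadData case is handled, and a local `socket.socketpair()` replaces the real `connect()` to a remote server.

```python
import selectors
import socket


class SocketObject:
    """Rough Python counterpart of the socket object in the pseudocode."""

    def __init__(self, selector, sock):
        self.sock = sock
        self.received = []
        sock.setblocking(False)
        # Bind the callback to the monitored event, then return immediately.
        selector.register(sock, selectors.EVENT_READ, self.on_callback)

    def on_callback(self, mask):
        """Handle the ReadData case; other event types are omitted."""
        if mask & selectors.EVENT_READ:
            data = self.sock.recv(4096)
            if data:
                self.received.append(data)


def run_monitor(selector, timeout=1.0):
    """One pass of the event monitor: dispatch ready events to callbacks."""
    handled = 0
    for key, mask in selector.select(timeout):
        key.data(mask)   # key.data holds the callback registered above
        handled += 1
    return handled


sel = selectors.DefaultSelector()
left, right = socket.socketpair()
obj = SocketObject(sel, left)
right.sendall(b"ping")
handled = run_monitor(sel)
sel.unregister(left)
left.close()
right.close()
sel.close()
```

The constructor does no reading itself. The data only reaches `obj.received` when the monitor pass dispatches the readiness event to the bound callback.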

Conclusions

  • The Duqu Framework appears to have been written in an unknown programming language.
  • Unlike the rest of the Duqu body, it's not C++ and it's not compiled with Microsoft's Visual C++ 2008.
  • The highly event-driven architecture points to code which was designed to be used in pretty much any kind of conditions, including asynchronous communications.
  • Given the size of the Duqu project, it is possible that a different team was responsible for the framework than the one that created the drivers and wrote the system infection and exploits.
  • The mysterious programming language is definitively NOT C++, Objective C, Java, Python, Ada, Lua and many other languages we have checked.
  • Compared to Stuxnet (entirely written in MSVC++), this is one of the defining particularities of the Duqu framework.

The Duqu Framework: What was that?

After having performed countless hours of analysis, we are 100% confident that the Duqu Framework was not programmed with Visual C++. It is possible that its authors used an in-house framework to generate intermediary C code, or that they used a completely different programming language.

We would like to make an appeal to the programming community and ask anyone who recognizes the framework, toolkit or programming language that can generate similar code constructions to contact us or leave a comment on this blog post. We are confident that with your help we can solve this deep mystery in the Duqu story.


161 comments

lokoalextremo

2012 Aug 08, 01:33

I'm more than sure it's protollua?

If it is the same, it was created by a group of students to make object orientation easier. Write to me...

lokoalextremo

2012 Aug 08, 01:31

I'm more than sure it's protollua?

This is a new programming language created by the Universidad Nacional de Colombia, but it has not been released yet.
The syntax is the same and it can be compiled to Lua. Please write to me on Gmail and I will send you some images, so you can see whether it is true.

Arvind Singh

2012 Apr 01, 15:27

ZeroMQ? http://www.zeromq.org/

mrlozer

2012 Mar 27, 17:05

Some languages to check

http://en.wikipedia.org/wiki/List_of_programming_languages_by_type#Object-oriented_class-based_languages

You should take a look at the fancy programming language http://en.wikipedia.org/wiki/Fancy_%28programming_language%29

vishwanath99

2012 Mar 26, 09:55

iscoD

Knows well the basics of microcontrollers and processors. It is designed in several languages and uses effective parts of assembler code.

sorry v poor eng

thaeick

2012 Mar 21, 02:20

Since the language is C compiled with Microsoft C++, perhaps the library is the Microsoft SDK, a C library.

Andreas

2012 Mar 19, 21:08

I know what the language is!

I've programmed in that language. It's assembly language.
100% sure.

Andreas

xentorex

2012 Mar 19, 15:18

The Mystery of the Duqu Framework

I think that is Squirrel or very similar.

DarkArchon

2012 Mar 17, 20:44

Re: Re: OOOAC ?

Could we see the entire source code (asm, of course) of Duqu for reverse engineering? That could help you more than expecting to find a hypothetical framework. I see that no known framework is the same as Duqu's. If they made it in-house, it is probable that no one will tell you how they did it!

Mont

2012 Mar 15, 23:48

Re: What about nesC

Another thing to consider may be Cilk, I think it is...

Mont

2012 Mar 15, 23:29

What about nesC

I've been looking into TinyOS and Sensor Network VMs and came across this nesC thing that seems like it would be right up the alley for someone doing this type of work...

It looks like it's got much of what you've been talking about. I'm not sure about its direct access to WinAPIs, but what about adaptability?

Kochise

2012 Mar 15, 19:00

Re: Check-out older possibilities

Has Ch been mentioned? Here it is: http://www.softintegration.com/products/

rt15

2012 Mar 15, 18:49

Re: An interesting aside

Ken just seems to have the feeling that there are similarities. And I really don't find anything obvious.

On one side:
Win32 OO programming.
On the other side:
Procedural programming with C runtime.

Kochise

2012 Mar 15, 18:25

Check-out older possibilities

http://www.sics.se/~adam/lwip/ : IP stack
http://openthreads.sourceforge.net/ : threading framework
http://directory.fsf.org/wiki/Lightweight_C++ : intermediate language (get the code using webarchive)
http://bellard.org/tcc/ : low-level compiler

Perhaps that's the toolchain used...

david heath

2012 Mar 15, 17:23

An interesting aside

Of passing interest is the comment from Ken of Caffeine Security suggesting some level of similarity between Duqu and the recent Linux malware Linux/Bckdr-RKC. He claims to have sent material to Kaspersky, but it may have fallen through the cracks.

http://caffeinesecurity.blogspot.com.au/2012/03/linuxbckdr-rkc-and-duqu-links-food-for.html

rt15

2012 Mar 15, 14:54

Hand written asm

Like eyenot, I think that it is hand written asm.

That would explain the different locations of the function table and the different ways the "this" pointer is passed. Human "mistakes".

(Speaking of mistakes, shouldn't DeleteCriticalSection be called instead of InitializeCriticalSection in the destructor?)

That would also explain the non-usage of C runtime while it is painful in some cases (For example, whereas CopyMemory is "part" of Win32, it is not an actual function exported by a win32 dll. You have to code it again or use msvcrt.dll memcpy or another implementation).

Using asm + Win32 would not be too strange for people searching for vulnerabilities in a TrueType font parsing engine.

Putting function tables in the instance is a naive and easy way of doing object-oriented programming. Even a basic OO framework would have put the function tables in a separate place with the other class-level data (static fields and so on) and would have put a pointer to it in the instance.

The framework looks like HLA standard library or ObjAsm32. But it is none of these two.

Anyway, my main point is that disassembly appears very clean and simple to me. Something pretty hard to obtain with traditional compilers which are often adding some weird stuff (Less clear instructions doing the same thing, particular stack frames, alignment stuff, strange instructions order, "mov edi, edi"...)

david heath

2012 Mar 15, 13:42

Re: Re: looking wider

But many SCADAs (most?) run under Windows.

PLCs on the other hand don't have Windows anywhere near them.

Igor Soumenkov

2012 Mar 15, 01:26

Re: haXe

Yes, we checked haXe, too.

Igor Soumenkov

2012 Mar 15, 01:25

Re: OOOAC ?

The type system and the code look completely different from the ones in Duqu.

andydude

2012 Mar 14, 22:55

haXe

Have you eliminated haXe from your list of options?

2esoskwahom4

2012 Mar 14, 17:52

sniffing from wrong direction, what does history tell you?

both As400tech and SCooke handed you the best hints.

A few years back I worked at East Fishkill long enough to meet eggs rubbing elbows with the 'black' GSA guys working down in Endicott and Watson (mostly the latter). The big topic at the time was exorbitantly high-priced memory being frantically consumed post-9/11 (we knew it was NSA; we realized later it was for upgrading Echelon to make its data more transparent for future TIA transactions).

A cyberop like this would inevitably end up at big blooze' shop for the reasons scooke mentions: NOTHING gets thrown away by Endicott's hacks (a somewhat frustrating problem for workers needing access to boxes), their library of tools is as incomprehensibly massive as it is old. Indeed, Watson has not infrequently sent researchers there first to get their feet wet.

This probably initiated at Watson under NSA aegis, followed by research of tools at Endicott's library, then a handover to Haifa after payload completion. It's unrecognizable because NSA would demand that; any self-respecting beemer hack would know to hit up Endicott's libraries to make it so.

That said, it might be a little naive thinking any ibm'er you ask is gonna be successful convincing one of the mustier Endicott hacks to pony up from their libraries. scooke is right none of it is officially secret - but it frequently is VERY proprietary for some of them. A handful of old Endicott hacks still spend more time there than at home. That should tell you something about their priorities. It's all who you know. 'n no, I don't.

n

2012 Mar 14, 17:48

>they can receive “this” parameter in any register or in stack.
aggressive Whole Program Optimization?

I suggest checking these languages.
* XPCOM API -> pseudocode looks like this. but ABI is not standard xptcall.
* .Net (compiled to native code w/ Mono, LLVM CIL or GCC CIL back-end)
* Scheme-variant
* Haskell (GHC) -> it's not OO, so maybe not.
* OpenCOBOL -> IIRC COBOL 2002 has OO feature.
* Go Programming Language
* D Programming Language
* Vala

DarkArchon

2012 Mar 14, 14:21

OOOAC ?

Could the C framework be an extension of the "OOOAC" homemade framework? It implements a class system, inheritance and event management in ANSI C.

spikeysnack

2012 Mar 14, 11:02

I typed in "; lpMem" into google

and was then in a hunt for a win32 compiler ==>PowerBasic.

http://www.powerbasic.com/support/help/pbcc/index.htm#protected_mode_programming.htm

It looks to be very nearly it. They have a set of objects including LinkedList with seemingly the same API names as above.
Looking around their site, it seems to be a serious programming platform for heavy win32 COM, complete with inline assembler.

they have some of the following idioms:

CLASS MyClass
INSTANCE MyVar AS LONG

CLASS METHOD CREATE()
' Do initialization
END METHOD

CLASS METHOD Destroy()
' Do cleanup
END METHOD

INTERFACE MyInterface
INHERIT IUNKNOWN
METHOD MyMethod()
' Do things
END METHOD
END INTERFACE
END CLASS

------------------------------------------------
EnterCriticalSection ByVal VarPtr(dStatus())
For i = 0 To UBound(gSoundStatus)
.... do stuff to data members of array
Next
LeaveCriticalSection ByVal VarPtr(dStatus())
------------------------------------------------

#COMPILE DLL "EvServer.dll"

$EvIFaceGuid = GUID$("{00000098-0000-0000-0000-000000000002}")
$MyClassGuid = GUID$("{00000098-0000-0000-0000-000000000003}")
$MyIFaceGuid = GUID$("{00000098-0000-0000-0000-000000000004}")

INTERFACE Status $EvIFaceGuid AS EVENT
INHERIT IUNKNOWN
METHOD Done
END INTERFACE

CLASS MyClass $MyClassGuid AS COM
INTERFACE MyMath $MyIFaceGuid
INHERIT IUNKNOWN
METHOD DoMath
MSGBOX "Calculating..." ' Do some math calculations here
RAISEEVENT Status.Done()
END METHOD
END INTERFACE

EVENT SOURCE Status

END CLASS
------------------------------------------------

This or something similar -- many scientists use this kind of interface for programming experimental machines and automating processes. Thinking like a sneakypants, it would be a good way to take advantage of COM/win32 API 0-day exploits. TTF engine? .doc files -- très faux pas!

M-Boy

2012 Mar 14, 05:34

Brainstorming

For some reason - call it a hunch - I delved into the AI related programming languages after seeing the output.

First up was Lisp, specifically Common Lisp, but it seems it has been mentioned plenty already (under the assumption that dialects like Scheme and Clojure have also been tested). Maybe I am not too far off?

I lack deeper programming knowledge, but other AI programming languages like Strips, Planner and Prolog seem fundamentally too different to logically produce the same result. But then we have IPL, not that distant, and there is the IBM connection with IPL-V. But it feels way too legacy to be used today?

Then it felt like I had seen this code already, sometime, somewhere. And the only thing I could think of was FORTRAN, from my mother's studies way back. Considering the previously mentioned dates, Fortran 2003 could be of interest, especially since mixing C++ and Fortran is not too unheard of.

Edit:
Thought while trying to sleep - TCL?

Edited by M-Boy, 2012 Mar 14, 06:32

dcedilotte

2012 Mar 14, 01:55

Could be one of these.

Could it be made in L (http://www.bitmover.com/lm/L/L.html)
Or in Ceylon
Or in Rust (by Mozilla).

Shalogrim

2012 Mar 14, 00:56

Re:

I stumbled upon this site: http://autodiff.piotrbania.com/get_function_listing.php?diff_id=84&module_id=167&np_module_id=168&function_rva=0x0001f9b8&os=1#

What is AutoDiff?

AutoDiff is a project which performs automated binary differential analysis between two executable files. This is especially useful for reverse engineering vulnerability patches and spotting other additional code updates. AutoDiff allows to find executable code similarities and differences among two executable files. Additionally it also includes some heuristics methods for matching variables (objects) between two executable files. AutoDiff is ultra fast, standalone tool. It was especially designed to diff Portable Executable files released by Microsoft every time in the security bulletin.

More about the AutoDiff story:

http://blog.piotrbania.com/2010/12/rebootless-windows-updates-ksplice-for.html

That's my contribution to the list of possibilities.

dooqoo

2012 Mar 14, 00:49

Re: Re: Simple Object Orientation (for C)

I see a SourceForge project for SOOC which dates back at least 5 years http://sourceforge.net/projects/sooc/

diskjunky

2012 Mar 13, 20:33

Re: The facts

There are various ways of adding runnable code to a PE file, ranging from linking at compile time to embedding as a runnable resource to even injecting code (think buffer overflow security exploits). One does not necessarily need to link to a compilable resource - although it's probably one of the easier ways.

Given the install base of SCADA systems, any runnable file would have to assume that all necessary runtime libraries were not available and must be included somewhere in the running PE file. This rules out all interpreted languages (non-natively compiled java, basic, etc), any language requiring an external runtime (all CLR based languages, VB 5/6, etc), to name but a few. Some languages allow native compilation, eg, delphi and java but they have already been investigated, or so I understand.

diskjunky

2012 Mar 13, 20:19

Re: Obfuscated ASM ?

It's possible, but looking at the naming conventions used in the disassembly, it looks more like a dedicated tool created the payload from an OO based language structure. Of course, there's nothing stopping someone creating a tool to deliberately obfuscate code to make it look like OO, but that's an order of magnitude more difficult than creating a straight compiler. And if you're going to the trouble of creating and maintaining a custom compiler, you're going to keep it pretty simple, which is probably why it looks 'old'. Older systems were a little more direct in their compiled code. This being an event-driven architecture and therefore capable of being used for multi and single-threaded applications (not quite, but bear with me), a lot of effort went into making it. If it was a custom compiler and deliberately obfuscating code, it'd make it very hard to maintain and debug. The level of sophistication of the existing Stuxnet and Duqu code and their use of the VS C++ library suggests they were using an off-the-shelf compiler, albeit an obscure one.

Following Occam's razor, "The most simple explanation is usually the correct one" (actually it's not in all circumstances, but I digress), the 'obvious' answer is that a tool was used to compile from a standard OO language or a variant thereof.
