
The Mystery of the Encrypted Gauss Payload

GReAT
Kaspersky Lab Expert
Posted August 14, 13:00  GMT
Tags: Data Encryption, Cyber espionage, Gauss
 

There are many remaining mysteries in the Gauss and Flame stories. For instance, how do people get infected with the malware? Or, what is the purpose of the uniquely named “Palida Narrow” font that Gauss installs?

Perhaps the most interesting mystery is Gauss’ encrypted warhead. Gauss contains a module named “Godel” that features an encrypted payload. The malware tries to decrypt this payload using several strings from the system and, upon success, executes it. Despite our best efforts, we were unable to break the encryption. So today we are presenting all the available information about the payload in the hope that someone can find a solution and unlock its secrets. We are asking anyone interested in cryptology and mathematics to join us in solving the mystery and extracting the hidden payload.

The containers

Infected USB sticks have two files that contain several encrypted sections. Named “System32.dat” and “System32.bin”, they are 32-bit and 64-bit versions of the same code. These files are loaded from infected drives using the well-known LNK exploit introduced by Stuxnet. Their primary goal is to extract a lot of information about the victim system and write it back to a file on the drive named “.thumbs.db”. Several known versions of the files contain three encrypted sections (one code section, two data sections).

The decryption key for these sections is generated dynamically and depends on the features of the victim system, preventing anyone except the designated target(s) from extracting the contents of the sections.

By the way, the 64-bit version of the module has some debug information left in it. The module contains debug assertion strings and names of the modules:

.\loader.cpp
NULL != encSection
Path
NULL != pathVar && curPos < pathVarSize
NULL != progFilesDirs && curPos < progFilesDirsSize
NULL != isExpected
NULL != key
(NULL != result) && (NULL !=str1) && (NULL != str2)
.\encryption_funcs.cpp

The data

The mysterious encrypted data is stored in three sections: “.exsdat”, “.exrdat” and “.exdat”.

The files also contain an encrypted resource “100” that seems to be the actual payload, given the relatively small size of the encrypted sections. It is most likely that the section “.exsdat” contains the code for decrypting the resource and executing its contents.

The algorithm

The code that decrypts the sections is very complex compared to any regular routine we usually find in malware. Here is a brief description of the algorithm:

Validation

1. Make a list of all entries from GetEnvironmentVariableW(“Path”), split by separator “;”
2. Append the list with all entries returned by FindFirstFileW / FindNextFileW by mask “%PROGRAMFILES%\*”, where cFileName[0] > 0x007A (UNICODE ‘z’)

Note: in essence, this means that the specific program installed in “%PROGRAMFILES%” has a name that either starts with a special character such as “~”, as in our example, or is written in a Unicode script such as Arabic or Hebrew, where all characters are higher than 0x007A.

3. Make all possible pairs from the entries of the resulting list.
4. For each pair, append the first hard-coded 16-byte salt and calculate MD5 hash.

Example of the string pair, second string starting from “~dir” and first salt

5. Calculate MD5 hash from the hash ( i.e. hash = md5(hash) ), 10000 times.
6. Compare the resulting MD5 hash with the hard-coded value. If it does not match, exit.
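The validation steps above can be sketched roughly as follows. This is a minimal illustration, not the malware's code: the function names are hypothetical, the strings are assumed to be hashed as UTF-16LE without a terminator, empty PATH entries are assumed to be skipped, and "all possible pairs" is read as every ordered pair of list entries.

```python
import hashlib
from itertools import product

ITERATIONS = 10000  # step 5: extra md5(hash) rounds after the first hash


def iterated_md5(buffer, rounds=ITERATIONS):
    """MD5 of the buffer, then hash = md5(hash) repeated `rounds` times."""
    digest = hashlib.md5(buffer).digest()
    for _ in range(rounds):
        digest = hashlib.md5(digest).digest()
    return digest


def candidate_entries(path_value, program_files_names):
    """Steps 1-2: PATH entries plus %PROGRAMFILES% names starting above 'z'."""
    # Assumption: empty PATH entries (from ";;") are dropped.
    entries = [p for p in path_value.split(";") if p]
    entries += [n for n in program_files_names if n and ord(n[0]) > 0x007A]
    return entries


def find_matching_pair(entries, salt, expected):
    """Steps 3-6: hash every ordered pair and return the first match, if any."""
    for first, second in product(entries, repeat=2):
        # Assumption: strings are concatenated as UTF-16LE, then the salt follows.
        buffer = (first + second).encode("utf-16-le") + salt
        if iterated_md5(buffer) == expected:
            return first, second
    return None
```

The order in which pairs are tried does not affect the outcome, since every pair is hashed independently of the others.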

Decryption

The sections are decrypted in the following order: .exsdat, .exrdat, .exdat

1. Use the PATH/PROGRAMFILES pair that was used to generate the expected MD5 hash in the validation code above.
2. Append the pair with the second hard-coded 16-byte salt and bytes 0x15, 0x00

Example of the string pair, second string starting from “~dir” and first salt

3. Calculate MD5 hash from the resulting buffer.
4. Calculate MD5 hash from the hash ( i.e. hash = md5(hash) ), 10000 times.
5. Derive the RC4 key from the resulting hash using WinAPI’s CryptDeriveKey(hProv, CALG_RC4, hBaseData, 0, &hKey).
6. Decrypt the section (RC4), treating its first DWORD as the length of the buffer to decrypt and encrypted buffer starting at offset 4 of the section.
7. Compare the DWORDs in the decrypted buffer at positions 0 and 7 with the magic value “0x20332137”. Proceed only if either DWORD matches.
8. Increase the last WORD in the pair+salt buffer (the one initially set to 0x0015) by 1.
9. Decrypt another section, goto 3.

After all the sections are decrypted: call the function at the beginning of the .exsdat section.
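The buffer layout and the section handling in the decryption steps can be sketched as below. This is an illustration under stated assumptions: the helper names are hypothetical, integers are taken to be little-endian, and the two checked positions are interpreted as DWORD indices rather than byte offsets. The RC4 key is passed in as a parameter because the exact behavior of CryptDeriveKey (step 5) is not reproduced here.

```python
import struct

MAGIC = 0x20332137  # value checked in step 7


def rc4(key, data):
    """Plain RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def key_material(pair, salt2, section_index):
    """Steps 2 and 8: pair + second salt + 16-bit counter starting at 0x0015."""
    counter = struct.pack("<H", 0x0015 + section_index)  # 0x15, 0x00 for index 0
    return pair.encode("utf-16-le") + salt2 + counter


def decrypt_section(section, rc4_key):
    """Steps 6-7: length-prefixed RC4 buffer, kept only if the magic is found."""
    (length,) = struct.unpack_from("<I", section, 0)
    plain = rc4(rc4_key, section[4:4 + length])
    dwords = struct.unpack_from("<%dI" % (len(plain) // 4), plain)
    # Assumption: the checked positions are DWORD indices 0 and 7.
    if any(idx < len(dwords) and dwords[idx] == MAGIC for idx in (0, 7)):
        return plain
    return None
```

Under this reading, the three sections would use `key_material(pair, salt2, 0)`, `key_material(pair, salt2, 1)` and `key_material(pair, salt2, 2)` as hash inputs, in the order .exsdat, .exrdat, .exdat.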

Sample data for validating the algorithm:

The string pair is created by concatenating the strings. The strings and the salt buffer are not separated by any character.

Sample test Strings, Unicode (without quotes):

  • “C:\Documents and Settings\john\Local Settings\Application Data\Google\Chrome\Application”
  • “~dir1”

First salt, hex dump : 97 48 6C AA 22 5F E8 77 C0 35 CC 03 73 23 6D 51
MD5 at validation step 6: 76405ce7f4e75e352c1cd4d9aeb6be41
Second salt, hex dump : BB 49 4E 77 F9 25 EE C0 3B 89 FC ED C2 22 4A 21
MD5 at decryption step 5: 00916031b3e9513044436ee42b6aa273
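A small harness for the sample data might look like the sketch below. It assumes the strings are encoded as UTF-16LE without a terminator and that "10000 times" means 10000 re-hashes after the first MD5; the variable names are illustrative only. The two printed digests can be compared with the MD5 values listed above, and a mismatch would suggest that one of these assumptions is off.

```python
import hashlib

# Sample strings from above, concatenated without any separator.
PAIR = (r"C:\Documents and Settings\john\Local Settings\Application Data"
        r"\Google\Chrome\Application" + "~dir1")
SALT1 = bytes.fromhex("97 48 6C AA 22 5F E8 77 C0 35 CC 03 73 23 6D 51")
SALT2 = bytes.fromhex("BB 49 4E 77 F9 25 EE C0 3B 89 FC ED C2 22 4A 21")


def iterated_md5(buffer, rounds=10000):
    """First MD5 of the buffer, followed by `rounds` re-hashes of the digest."""
    digest = hashlib.md5(buffer).digest()
    for _ in range(rounds):
        digest = hashlib.md5(digest).digest()
    return digest


# Validation buffer: pair + first salt.
validation_buffer = PAIR.encode("utf-16-le") + SALT1
# Decryption buffer: pair + second salt + bytes 0x15, 0x00.
decryption_buffer = PAIR.encode("utf-16-le") + SALT2 + b"\x15\x00"

print("validation:", iterated_md5(validation_buffer).hex())
print("decryption:", iterated_md5(decryption_buffer).hex())
```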

Join the quest

We have tried millions of combinations of known names in %PROGRAMFILES% and Path, without success. The check for the first character of the folder in %PROGRAMFILES% indicates that the attackers are looking for a very specific program with the name written in an extended character set, such as Arabic or Hebrew, or one that starts with a special symbol such as “~”.

Of course, it is not feasible to break the encryption with a simple brute-force attack. We are asking anyone interested in breaking the code and figuring out the mysterious payload to join us.
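As a rough back-of-the-envelope sketch, the cost of testing a list of known names can be estimated as follows. The function name and the hashing rate are hypothetical, and the estimate only covers lists of names that are already known; it says nothing about guessing an unknown folder name.

```python
MD5_PER_CANDIDATE = 1 + 10000  # first hash plus 10000 re-hashes per pair


def search_cost(num_entries, hashes_per_second):
    """Return (pairs, MD5 calls, seconds) for testing every ordered pair."""
    pairs = num_entries ** 2
    md5_calls = pairs * MD5_PER_CANDIDATE
    return pairs, md5_calls, md5_calls / hashes_per_second


# Example with assumed numbers: 1000 candidate names, 1e9 MD5 calls per second.
pairs, md5_calls, seconds = search_cost(1000, 1e9)
print(pairs, md5_calls, seconds)
```

The iteration count multiplies the work per guess by about ten thousand, so the search is only practical when the correct pair is already among the candidates.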

The resource section is big enough to contain a Stuxnet-like SCADA targeted attack code and all the precautions used by the authors indicate that the target is indeed high profile.

We are providing the first 32 bytes of encrypted data and hashes from known variants of the modules. If you are a world class cryptographer or if you can help us with decrypting them, please contact us by e-mail: theflame@kaspersky.com.

Source data

We are providing up to 32 bytes from the beginning of each encrypted section, skipping the DWORD that contains the length of the encrypted buffer. Please contact us by e-mail theflame@kaspersky.com if you need more encrypted data.

Sample 56e4fb972828fafbbdc11158a1b5fa72
Salt 1 97 48 6C AA 22 5F E8 77 C0 35 CC 03 73 23 6D 51
Reference MD5 758EA09A147DCBCAD6BD558BE30774DE
Salt 2 BB 49 4E 77 F9 25 EE C0 3B 89 FC ED C2 22 4A 21
Exsdat 4C CC BA E2 E0 BA 2E 44 C7 60 17 9A 72 F4 2F 27 DD FD DB 11 03 94 E3 4B 0A 16 66 F3 36 97 6C D8
Exrdat C9 27 BE 67 4D 3B 39 36 AB 14 44 32 88 60 7A 64 B0 92 9B 3A A1 5B C5 21 A7 6E 09 0C F8 71 84 87
Exdat B8 EB 6D 61 2B 4F 70 65 75 A2 1C 03 1C DF 26 2F

Sample 695056ffacef1fdaa326d7c8bb0f88ba
Salt 1 6E E3 47 2C 06 A5 C8 59 BD 16 42 D1 D4 F5 BB 3E
Reference MD5 EB2F172398261ED94C8D05216650919B
Salt 2 8F 42 B5 87 E8 9A B2 32 C8 1C 1A EC B5 2D 55 19
Exsdat CE 31 D0 5D 7D CB 57 9A 83 06 09 8D 42 2B 44 34 24 13 B2 39 22 48 8F F3 76 E5 9C DA 87 8F BC 42
Exrdat 50 1F F8 BA 18 1B 3E 36 23 9D 95 DC 5A 07 E4 EC 76 38 78 79 BA 84 A5 4E 24 BA 0E 27 94 63 F7 3D
Exdat 9D 5B B8 3B B2 17 00 DC 76 81 1D 4E 54 80 9B 31

Sample 089d45e4c3bb60388211aa669deab26a
Salt 1 0E A5 01 D1 24 71 CD CD 0E 9E AC 6E 48 5A F9 32
Reference MD5 52DD4D6B792D84C422E6A08E4272ACB8
Salt 2 38 F9 A6 5B 82 08 E7 61 1D 10 73 53 50 BC B4 F0
Exsdat D3 CA 9D 9F 87 FB 25 43 7E C6 57 7C D9 06 10 8D D2 5B B2 88 18 6E FD B4 C4 30 12 2E 1E EC E0 64
Exrdat B4 43 8F B8 0A 67 7D 88 C1 CD F3 E8 D9 61 1B E9 5A 8A 41 16 8B 8A 18 AD 25 5A 81 87 8F 8D 1A 40
Exdat F6 C9 81 C9 86 27 16 0C B7 33 93 AB 3E 71 5B E2

Sample 8d90e3c68030fbb91ad5b920d5e17b32
Salt 1 C3 23 4D 51 5D 52 A5 8E 81 46 FA 8A 6D 93 DF 7D
Reference MD5 53B3FAEA53CC1B90AA2C5FCF831EF9E2
Salt 2 21 9D 04 35 7B 96 74 53 B0 9C CD 7F 2F E6 63 AA
Exsdat AB 01 6A 8E 42 F0 F2 92 1D F1 4A 42 01 63 72 78 D6 F7 A5 0C 54 37 21 2C B8 59 6A D0 7E 68 19 2D
Exrdat 6C 2D D7 E4 F6 08 15 C0 69 D9 9E FF EA 68 63 4F 56 59 DA 28 E5 2E A1 EF 21 FB F9 2B C2 BC E7 CE
Exdat 55 A7 F3 93 E0 AF 5B 7E 17 22 7E 82 8A 6F 25 21 3D 64 D7 E8
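A candidate pair can be checked against all four samples with a sketch like the one below. The salts and reference hashes are copied from the listing above; the function name is hypothetical, and the same assumptions as in the algorithm description apply (UTF-16LE strings, no separators, 10000 re-hashes after the first MD5).

```python
import hashlib

# (sample MD5, salt 1, reference MD5) copied from the source data listing
SAMPLES = [
    ("56e4fb972828fafbbdc11158a1b5fa72",
     "97 48 6C AA 22 5F E8 77 C0 35 CC 03 73 23 6D 51",
     "758EA09A147DCBCAD6BD558BE30774DE"),
    ("695056ffacef1fdaa326d7c8bb0f88ba",
     "6E E3 47 2C 06 A5 C8 59 BD 16 42 D1 D4 F5 BB 3E",
     "EB2F172398261ED94C8D05216650919B"),
    ("089d45e4c3bb60388211aa669deab26a",
     "0E A5 01 D1 24 71 CD CD 0E 9E AC 6E 48 5A F9 32",
     "52DD4D6B792D84C422E6A08E4272ACB8"),
    ("8d90e3c68030fbb91ad5b920d5e17b32",
     "C3 23 4D 51 5D 52 A5 8E 81 46 FA 8A 6D 93 DF 7D",
     "53B3FAEA53CC1B90AA2C5FCF831EF9E2"),
]


def matching_samples(first, second):
    """Return the sample MD5s whose reference hash this string pair reproduces."""
    pair = (first + second).encode("utf-16-le")  # assumed encoding
    hits = []
    for name, salt_hex, reference_hex in SAMPLES:
        digest = hashlib.md5(pair + bytes.fromhex(salt_hex)).digest()
        for _ in range(10000):
            digest = hashlib.md5(digest).digest()
        if digest == bytes.fromhex(reference_hex):
            hits.append(name)
    return hits
```

An empty result means the pair does not unlock any of the known variants under these assumptions.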


163 comments

 

sisyphus

2012 Aug 16, 18:42
0
 

Re: Re: Re: sp00ff

I think the 10k iterations of md5 hashing is very significant as well. It seems the emphasis thus far has been on the target having some static pair in their path/program files as part of some known configuration of the target, but is it not possible that the combo is written dynamically by an as yet undiscovered companion exploit?

It seems to me that the hashing iteration is done to widen the number of possible collisions (various inputs lead to the same key, such that there isn't a single path/file pair leading to the key, but rather any number of a dynamically generated set leading to the final hash). Note that the pair itself isn't used as the encryption key, but rather a statically salted, iterated hash.

The attacker now has an intentionally imprecise method for retrieving the key, such that there may be many, many combinations leading to the iterated hash, but perhaps only a smaller subset of those that lead to the 2nd iterated hash once the second salt is used (perhaps someone with better knowledge of md5 can chime in on whether that is a property of md5 collisions, i.e. two inputs resulting in collision when appended with a suffix not yielding another collision).

If it is, cracking by generating a collision for the static hashes will be more difficult, but it seems like that's the route to go down at the moment to this amateur.


sisyphus

2012 Aug 16, 19:10
0
 

Perhaps I'm being a bit naive

From everyone's favorite source:

On December 24, 2010, Tao Xie and Dengguo Feng announced the first published single-block (512 bit) MD5 collision.[16] Previous collision discoveries relied on multi-block attacks. For "security reasons", Xie and Feng did not disclose the new attack method. They have issued a challenge to the cryptographic community, offering a US$ 10,000 reward to the first finder of a different 64-byte collision before January 1, 2013.

Maybe someone should talk to Xie and Feng.


Blood of Avatar

2012 Aug 16, 19:49
0
 

Question...?

After reading the objective and the comments, I still feel everything is too vague for an actual understanding.

So I have a question in order to help fill the jargon questions left over in my mind.

Basically we are being told that once the proper path/file concatenation + salt is found to obtain the MD5 hash we can use to obtain a session key from CryptDeriveKey() which is actually in return the key for RC4 that decrypts a specific part of the encrypted code?

If so, then in theory a different RC4 key was used to encrypt each part in sequence... (if I'm not confused yet).

I just need this elaborated more before I move on to my next question :(

Thanks...?


hhhobbit

2012 Aug 16, 23:20
0
 

Missing point

I have read all of your remarks and most have a consistent ISO-Latin bias. The people that are getting infected are in a given region of the earth. It cannot be just the language since Arabic is script based. Farsi adopted the Arabic script with some changes to write their language in a Perso-Arabic script. Mostly each of the symbols in Arabic (cannot speak for Farsi) map to either a morpheme or most likely a phoneme. Hebrew is an alphabetic system. I suspect what the authors of Gauss keyed in on is something that specifies the locale which seems to have got broken at Windows 7 SP1 with the development being done on XP. Most modules don't work at all on Windows 64. Most of you didn't read Kaspersky's PDF file on Gauss, did you?

http://www.securelist.com/en/downloads/vlpdfs/kaspersky-lab-gauss.pdf

The existence of many AV and especially firewall packages also brought most modules to a screeching halt. Why 10,000 iterations, other than it being the square of the number 100, as in Gauss adding up the numbers from 1 ... 100 (101 * 50 = 5050)? It does a great job of obfuscating what they used for the start, doesn't it? Whatever it is works for both Arabic and Hebrew but nothing else. Key in on something locale specific. From the map it looks like it started with a first infection vector in Lebanon (remember, they also use French) with perhaps a secondary infection in Israel and then a third one in Palestine. I am a bit puzzled by Jordan, Syria and Egypt having almost nothing - maybe using flash drives for initiation limited the spread. So key in on something in Windows itself that specifies the locale, not the language. Whatever they used seems to have been changed with the introduction of SP1 for Windows 7 32-bit. Windows 7 64-bit was immune to everything but the flash drive module from the start. I enjoyed the bash script - slow as coal tar with xxd and awk. I wrote it in C but I have always been a speed demon. For those that didn't know, much of Stuxnet, Flame, Duqu, and Gauss is coded in Lua, the script language that is cross platform. This makes it possible for Linux and Macintosh programmers to contribute to the packages for Windows.

https://en.wikipedia.org/wiki/Lua_%28programming_language%29


lightswitch05

2012 Aug 17, 07:01
0
 

Strange Results trying to recreate

I'm trying to write an open-source program that people can run to check whether they are the target of Gauss or not (link below). But I'm getting strange results validating the algorithm. In one case I create a QByteArray::fromHex() and input all of the hex values from the screenshot. I calculate the MD5 10,001 times, and my end result is the same as "MD5 at decryption step 5: 00916031b3e9513044436ee42b6aa273" - but I was expecting "MD5 at validation step 6: 76405ce7f4e75e352c1cd4d9aeb6be41". Are these perhaps switched? Or maybe I'm just getting a collision.

code: https://github.com/lightswitch05/gaussCrack

EDIT: problem solved. working correctly now

Edited by lightswitch05, 2012 Aug 18, 00:45


Michael_Mike

2012 Aug 17, 10:04
0
 

Re: Strange Results trying to recreate

You might want to start by double-checking the values of all your variables by printing them to standard output or to a file, as-is in hex, and validating both the data and the values. There is no reason to calculate the MD5 hash 10,000 times if you start with the wrong result from the first hash. If you are under Linux/Unix, use xxd.

To avoid confusion with the previous thread (though I suggest that you read that thread, started by LSD4me), I made the script output the MD5 result of each pass:

http://pastebin.com/b5J9MvS4

In any case, if you have really produced a collision, while it will not solve the problem, I am quite intrigued how you made it.


Hans Adams

2012 Aug 17, 10:07
0
 

Re: Strange Results trying to recreate --- FINE and interesting

1) "Wouldn't it be wise to publish (GPLed?) source code in many programming languages, from Scala and F# to C and gcc down to ASM and gas?

So everyone feeling in danger could compile his own tools on his own, hopefully trustworthy, systems and check these.
"

Thanks that you started.... FINE!

2) Your code seems to be correct. Kaspersky's algorithm is to be discussed.....

Your example:
"* Example: A = [{1},{2},{3}]
* A^2 = [{1,1},{1,2},{1,3}
* {2,1},{2,2},{2,3} ..."

My analysis:
"
APPENDED_SQUARE=APPENDED X APPENDED =
{A, B, C ....a, b, c ...} X {A, B, C ... a, b, c ...}
={AA, AB, ...CC,...Aa, Ab, ..., aA, aB, ..,Cc, .,aa,ab,..cc..}

Shouldn't the first element be AA, with A being the very first element of (list) APPENDED, which in turn must be the very first element of (list) PATH?"

As your example also confirms the concatenated list MUST start with the symmetric pair (1,1) or (A,A)....

BUT the screen dump in topic four of section "Validation" by Kaspersky does not show a symmetric pair at all..... Their proposed algorithm does not compute the expected result.

wonders HA

Edited by Hans Adams, 2012 Aug 17, 10:18


Mr. T

2012 Aug 17, 10:11
0
 

Re: Missing point

About the number of iterations (10,000).

The OWASP recommends minimum 1000 iterations for anything that is being hashed even in web applications. 10000 iterations makes the hashing 10000 times slower, which is significant when trying to crack the hash, but usually quite meaningless compared to overall performance of the application.

So the 10000 might not be any magic number that has some significant meaning.


Michael_Mike

2012 Aug 17, 10:42
0
 

Re: Missing point

>Most modules don't work at all on Windows 64. Most of you didn't read Kaspersky's PDF file on Gauss, did you?

I don't see any mention of Gauss not working on 64-bit Windows. 64-bit Windows seems to have dropped support for 16-bit software, but not 32-bit. There are exceptions, but I am not very knowledgeable about Windows.

All I see is a mention that some modules don't work under Windows 7 SP1, but it turns out that Gauss is modular. There is also a specific executable for 64-bit machines that is loaded on the USB key.

> From the map it looks like it started with a first infection vector in Lebanon (remember, they also use French)

I am having difficulty seeing the connection (like many people commenting on web articles about Gauss) with the fact that the Lebanese speak French. A lot of countries in North Africa (actually most African countries) speak French, including Algeria, Morocco and Egypt, and yet they are not infected much or at all.

Languages as filters would simply not follow geographical borders, as Lebanese would be infected all across the globe. It might be used as a post-infection filter, I don't know.

The most straightforward way to geographically control the virus's spread would be to, well, get the geographic position. Looking up the IP address can give that capability, and it's pretty simple to do.


Michael_Mike

2012 Aug 17, 11:01
0
 

Re: Re: Strange Results trying to recreate --- FINE and interesting

Note that I have not read the source code.

Pardon my lack of terminology and mathematical knowledge, but isn't the algorithm described as forming tuples? (if by any chance it's the right term)

(P:Path, F:File/Folder) + salt | (P:Path, F:File/Folder, Salt)

where the list(cartesian square) would be something like:
(p1,f1,salt)(p1,f2,salt)(p2,f1,salt)...(PX,fY,salt)(PX,fX,salt)(pY,fX,salt)...

But yes, the algorithm should be more detailed, as most of us do not seem to be experts at analyzing threats and everything that comes with it. Speaking for myself, I have way more interest than I have knowledge. Since it's a call for solving a cryptographic problem and not for debugging, more information should be provided in order to reach a larger pool of people skilled in cryptanalysis.

For my part, that includes being sure of the behavior of the program when it meets an exception. As I asked previously, what happens if the code finds no entry 'where cFileName[0] > 0x007A': does it halt, append 0x00, or append nothing and continue?


Mr. T

2012 Aug 17, 11:01
0
 

Re: Re: Missing point

The 64-bit can be x86_64 or IA64. If it is IA64 it cannot be run on windows 7 64-bit as it is x86_64. The 32-bit version could be run on most windows machines anyway, so if the virus is not accessing memory space, the x86_64 version could be quite useless.

In the examples the target.lnk points to the System32.dat which is the 32bit version. Maybe on IA64 systems it points to System32.bin?

If it is IA64, it must mean that the GAUSS is targeted to server side also.


Michael_Mike

2012 Aug 17, 11:20
0
 

Re: Re: Re: Missing point

That would then, once again, make a very specific attack: machine on itanium hardware, running windows.

I mean, if I would target Itanium machine, I would look to compromise Hp-UX; if I would target bank, I would probably look to compromise a lot of unix flavor, and I would give a particular attention to Oracle/SUN and IBM products; like SPARC, POWER and Z series mainframe. That would make Gauss the smallest part of the equation.

A specific version for IA64 is not impossible, gauss being exceptionally geo-specific. Wouldn't it be specified in the analysis?

Edit
By the way, Itanium got x86 emulation. Can't windows make x86 application directly running on IA64?


Mr. T

2012 Aug 17, 11:45
0
 

Re: Re: Re: Re: Missing point

The x86 version covers most Windows machines and IA64 would cover the rest of them. So would the coverage be wider than with x86 and x86_64? But as the 32-bit version cannot access the complete memory space of a 64-bit machine, it makes sense to build an x64 version also, instead of IA64.

Which 64-bit version it is is not specified in this article nor in the PDF version of Kaspersky's Gauss analysis.

At least some Itaniums are capable of running 32bit code with some sort of hardware emulation etc, I'm not sure if all has support for this.

If the target is bank, the backend servers are most likely non-windows machines.


jvd

2012 Aug 17, 14:54
1
 

A different approach

Hi,

Imho you are heading the wrong way. Sure you can ask the community for help breaking cryptography, but it is a little bit naive to assume:
1) The malware authors do not know how to use cryptography, since they found a way to fake the md5sum for MS certificates with flame (finding Chosen-prefix collisions)
2) If someone knows a faster way for this, or someone knows how to break rc4 he/she would share it with you.

Imho it is a good idea to look at things from a different perspective and try to answer questions which you can answer, thus helping to find the solution, e.g. find the machine and you have the key.

First of all, as you already mentioned, they are looking for a specific target. This is confirmed by the way the key is generated. Pairs are made, concatenated, etc.

Also note that FindFirstFileW and FindNextFile return the result in a non specified order.

Thus the system is very specific and, possibly, the system is more or less static. No programs matching the prefix are installed or removed.

Another interesting fact is that the attackers have knowledge about this system. They know the results of the algorithm used to calculate the key. In short, they must have tested their code on the target machine or they must have a replica of the target machine.

Since they try to infect the target machine, it is more likely that they have a replica.

According to Iranian Scientist their nuclear "powerplant" has been infected again. From this you can conclude:
1) They found a way to attack the nuclear site directly
2) They wrote a virus that infects the site that is spreading somehow (like they did with stuxnet)

This means that the authors of the malware do not have the possibility to infect the target machine directly. Why else would you put it in some malware and go through great lengths to protect the content? It is thus highly likely it is located at a location with no direct internet connection, or one that is heavily firewalled/etc.

In short, you need to look for a machine that
1) Is (semi) offline
2) Has software installed that matches the prefix
3) That gets no software installed
4) Is located in Lebanon
5) That is pre-installed

Try to make a list of these machines, possibly with the help of the community, and I think the chances of finding the key are much higher than when trying to break the cryptography.


lightswitch05

2012 Aug 17, 18:04
0
 

Re: Re: Re: Strange Results trying to recreate --- FINE and interesting

Since they don't specify the behavior when no "cFileName[0] > 0x007A" are found, I assume the virus doesn't care about it - meaning it keeps going. In that case, you still have a list from PATH to form pairs from (since its a cartesian square). I hadn't thought about appending an empty string, I should add that... you can never have too many tests, only too few.


lightswitch05

2012 Aug 17, 18:12
0
 

Re: Re: Strange Results trying to recreate --- FINE and interesting

Yes, I have elements like AA in my list, I talked to a friend working on his math doctorate to verify I was doing cartesian square correctly. From my understanding the screen shot is just an example for people to use for verification, its not meant to represent the first key pair in the list. Also, the algorithm doesn't care in what order your list is in, meaning it wouldn't matter if AA is at the start (like in my example), or even in the end or randomly in the middle.


Michael_Mike

2012 Aug 17, 20:42
0
 

Re: Re: Re: Strange Results trying to recreate --- FINE and interesting

There is something that is not clear. Did your program get the right MD5 hash with the test pair (C:\Documents and Settings\john\Local Settings\Application Data\Google\Chrome\Application, ~dir1, $salt)? If so, it means that your program has created and handled this pair correctly.

If you get a different result, like the hash at decryption step 5 **00916031b3e9513044436ee42b6aa273**, can you paste the string that got hashed, in hexadecimal?


Michael_Mike

2012 Aug 17, 21:20
0
 

Re: Re: Re: Re: Re: Missing point

They support emulation at least up to Itanium 2; for the most recent ones I don't know. To open a parenthesis on IA64, I would not be surprised if there are many of them in Israel, but it's unclear whether the version that struck Israel contains that payload (only ~800 detected with that payload, if I read correctly).

There is an Intel development lab in Israel (I think they are behind Sandy Bridge), and I have already heard about some megalomaniac Israelis building a storage solution using Itanium, one of which ended up literally catching fire. :-)


pooh

2012 Aug 18, 00:44
0
 

RC4 key and decrypted exsdat for sample test strings

Can you confirm the following results which I get, when run against your sample test strings:

Processing token-pair "C:\Documents and Settings\john\Local Settings\Application Data\Google\Chrome\Application~dir1"

Algorithm validation sample
First hash=3a 18 8 2a 8d c 87 27 c8 43 aa b3 4a 1b 46 96
10000th hash=76 40 5c e7 f4 e7 5e 35 2c 1c d4 d9 ae b6 be 41
MD5 matches at validation step 6
Attempting decryption of sample "Algorithm validation sample"'s exsdat
First hash using salt2=16 a5 13 d8 c7 3c bf 98 9 5f b9 9e e7 80 6e c7
10000th hash using salt2=0 91 60 31 b3 e9 51 30 44 43 6e e4 2b 6a a2 73
MD5 at decryption step 5 matches sample data
RC4 key derived from MD5 hash is:63 73 be ea b6 96 ce d4 9e 48 6d 80 1 2b 2c 89

Decrypted plaintext is:4c 64 bb b1 e0 ba 2e cd c7 60 2d 9a bc f4 2f 27 dd 82 a 7d 3 94 fb 33 f1 73 66 e1 36 e5 f4 d8

If these do not match, would you please provide the values you do obtain?

Thanks.


lightswitch05

2012 Aug 18, 00:48
0
 

Open Source app to check for targets

I've created an open source app to check your computer and determine if it has the right configuration that Gauss is looking for. It doesn't require any install or active internet connections. Just unzip and run.

Source Code: https://github.com/lightswitch05/gaussCrack
Download: https://github.com/downloads/lightswitch05/gaussCrack/GaussCheck.zip


d3ad0ne

2012 Aug 18, 01:00
0
 

Generate validation key

For anyone still struggling to generate a TEST validation key here is a simple perl script to do so. - http://pastebin.com/QXg65x2x

I also have versions 56e4fb972828fafbbdc11158a1b5fa72 and 695056ffacef1fdaa326d7c8bb0f88ba of the exe. For anyone that would like a copy email me at - "d3 ad 0n e@ hashcat.net" without the spaces/quotes.

I hope to finish some work this weekend to ensure that Kaspersky's validation scheme is correct.


Myth

2012 Aug 18, 01:15
0
 

Re: A different approach

According to what is known from the different pieces of documentation from Kaspersky, it's actually a bit more "weird".

From: http://www.securelist.com/en/analysis/204792238/Gauss_Abnormal_Distribution

We have some clues about what the known part of the virus gathers: information about banking, credit cards, and some social media. The known version only targets specific banks, though.

And it does not seem to reproduce, but it can collect information from other computers by infecting a USB drive. It's a bit unclear though, since the USB part contains the mysterious payload, which might infect a system matching the specific system they are searching for. But since they use a .lnk exploit to start the USB part, it can probably only run on Windows XP SP3, Server 2003 SP2, Vista SP1 and SP2, Server 2008 SP2 and R2, and Windows 7, according to NIST.

So we have something that spreads how? And how do they control the area of infection?

And if the virus is searching for a specific environment to run in, and then just in a specific area. But it still gathers a lot of information about the systems it encounters - so perhaps they simply search for information about specific persons, and especially people who have some specific software installed?


curious

2012 Aug 18, 02:13
1
 

lebanon: the basics

Folk are making this a lot harder than it needs to be.

Most of this malware is focused on passwords for money transfers, credit cards and bank accounts in Lebanon, not stuxnet-like industrial process controllers of which Lebanon has none of interest. Follow the money, then destroy it. That's what the mystery packet is doing instead of centrifuges.

Of the 49 significant banks in Lebanon, only 6 are targeted (http://en.wikipedia.org/wiki/List_of_banks_in_Lebanon) according to the Kaspersky pdf. These all lie among "Lebanese based banks with a significant presence domestically and overseas" but 5 others in this category are not targeted. The largest, Bank Audi, may be off-limits in view of Deutsche Bank Trust Company America's controlling interest.

Banking in Lebanon falls along sectarian lines like everything else there. The perps already knew -- after decades of previous surveillance -- a great deal about which banks do what for whom. They also knew static file names on a particular computer of great interest to them to which they have no physical or online access.

If my computer were the target, from access to Visa sales receipts or by Flashback on Arizona distributors you would know that it has a unique UUID similar to B2E598F1-35F1-52CC-BF0B-B4E2A20D3F58. A key built around that would trigger the mystery packet uniquely on my computer.

The intrusion to date has not involved monetary theft from the accounts, though the perps have been perfectly positioned to do so for almost a year. However this is surely the purpose of the mystery packet -- to create monetary havoc in the accounts of the specific target: bankrupt the account owner, drain funders of the account, cut off recipients of account disbursements, and best of all, bring down the enabling bank. (Not to mention the byproduct, a giant off-the-books black fund for later perpetrator use, like ContraGate, with stolen money passing untrackably through an offshore or non-cooperating country (eg the perpetrators).)

To date the malware has succeeded in capturing a vast amount of transactional data but has failed to get on the computer with the activating key.

Because this would backfire on a colossal scale if anyone were able to crack open the encrypted tool and redirect it elsewhere, notably unleash it on the entire banking system of the perpetrator's country. A lesson was no doubt learned from Stuxnet getting loose in 2010, a year prior to the release of Gauss.

Since many malware programs already steal passwords, that horse was already out of the barn. However none to date has unleashed a weapon of mass financial destruction. For this reason -- not to mention legal repercussions and blowback to homeland security, the perps went the extra mile to encrypt their compilation of expensively developed and carefully tested mayhem.

Let's face it: the only plausible target of interest in Lebanon is Hezbollah. Any large organization like that has a substantial annual budget, requiring tracking of inflows (donations from Iran?) and outflows (salaries, benefits, weapon purchases, social programs), investing cash on hand, and accounting controls.

Given the curiosity level, the Hezbollah central financial record-keeping computer is surely offline. So for the obligatory daily backup, do you print it out (manual re-entry?!?) or just stick in a thumb drive? To view transactions and issue instructions to the banking system, is that done by compromised email or by walking a thumb drive over to the bank?

I just came across a thoughtful piece by Katherine Maher along these same lines (spyware crossing over into cyberwar) in the Atlantic. She writes:

"...Why Lebanon? Why banks? Stealing financial transaction data is traditionally the province of, say, shadowy underground criminal gangs. Lebanon is a small country better known for its vibrant nightlife and perpetual domestic volatility. Neither its banking sector nor the state itself are obvious targets for the U.S. or Israeli intelligence services, which, though they haven't been connected to Gauss, are the only groups with both the know-how and, if they truly were behind Stuxnet and Flame, the track record.

However, Lebanon's size belies its importance as a regional entrepôt and banking haven; its cosmopolitan libertarianism, along with old-world discretion, have long made the country a popular choice for foreign depositors of all profiles and persuasions. Think of it as something like the Switzerland of the modern Middle East. More than 60 banks manage nearly $120 billion in private deposits in a country of 4.3 million people, and account for roughly 35 percent of the country's economic activity.

These are not mere corner retail banks serving up loans, mortgages, and checking accounts to Lebanese citizens. They are among the most private banks in the world, bound by genteel conventions of secrecy long since abandoned elsewhere. Since 1956, domestic and foreign banks operating in Lebanon have been legally required to protect the names and assets of their clients from all inquiring authorities.

U.S. financial regulators, concerned with money laundering and terrorism financing, have long given special attention to the opacity and reach of the Lebanese banking system. A 2000 advisory by the U.S. Department of Treasury Financial Crimes Enforcement Network instructed all U.S. banks to "give enhanced scrutiny to all financial transactions originating in or routed to or through Lebanon." In 2011, the Lebanese Canadian Bank was shuttered after the U.S. revealed that the Lebanese militant group Hezbollah was using the bank to launder money from cocaine profits, Mexican cartels, and African conflict diamonds. This year, the entire national banking system has come under scrutiny, accused of assisting members of the Syrian and Iranian regimes evade international sanctions and launder money that's also being funneled to Syria's ongoing conflict...."

http://www.theatlantic.com/international/archive/2012/08/did-the-bounds-of-cyber-war-just-expand-to-banks-and-neutral-states/261230/

http://www.nytimes.com/2011/12/14/world/middleeast/beirut-bank-seen-as-a-hub-of-hezbollahs-financing.html?_r=1&pagewanted=all

Edited by curious, 2012 Aug 18, 15:35


Dennis Farr

2012 Aug 19, 07:05

md5crk, pollard, and this malware

The 10000 iterations of MD5 are similar to an old attempt to create MD5 collisions called MD5CRK, which was a distributed computing effort relying on an adaptation of the Pollard rho factoring algorithm for detecting collisions. The project ended after another method to create collisions was found and before it found any. See Wikipedia for more.

This malware may be susceptible to a similar attack.
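To make the rho idea concrete, here is a minimal sketch of a cycle-finding collision search. It is not MD5CRK's code and not an attack on Gauss: it assumes MD5 truncated to 3 bytes so the search finishes quickly, and the names `truncated_md5` and `find_collision` are hypothetical.

```python
import hashlib


def truncated_md5(data: bytes, n: int = 3) -> bytes:
    """Toy hash: the first n bytes of MD5 (an assumption made to keep the search small)."""
    return hashlib.md5(data).digest()[:n]


def find_collision(seed: bytes, n: int = 3):
    """Find two distinct inputs with the same truncated hash using Floyd cycle-finding."""
    if len(seed) == n:
        # A seed of a different length can never lie on the cycle itself,
        # which guarantees the walk has a tail and therefore a collision.
        raise ValueError("seed length must differ from n")

    # Phase 1: the tortoise moves one step, the hare two, until they meet on the cycle.
    tortoise = truncated_md5(seed, n)
    hare = truncated_md5(truncated_md5(seed, n), n)
    while tortoise != hare:
        tortoise = truncated_md5(tortoise, n)
        hare = truncated_md5(truncated_md5(hare, n), n)

    # Phase 2: restart the tortoise at the seed and advance both in lock step.
    # Just before they would meet, they hold two different inputs with the same hash.
    tortoise = seed
    while truncated_md5(tortoise, n) != truncated_md5(hare, n):
        tortoise = truncated_md5(tortoise, n)
        hare = truncated_md5(hare, n)
    return tortoise, hare


a, b = find_collision(b"gauss-thread-seed")
print(a.hex(), b.hex(), truncated_md5(a).hex())
```

With a 24-bit output the walk needs only a few thousand steps. The same walk on the full 128-bit MD5 would need on the order of 2^64 steps, which is why the original effort was distributed.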

MC5 was a Detroit rock band.


Michael_Mike

2012 Aug 20, 09:24

Re: md5crk, pollard, and this malware

That's a nice find; it could be an interesting way to generate a collision. Finding the initial value of the MD5 sum before it was re-computed 10,000 times would definitely be handy.

I am not sure it's going to yield the expected result. This is just my irrational feeling, from my uneducated ass, but it would be a mistake for all of us to go in the same direction in parallel. I also strongly believe that with enough effort, every method will work. That one will probably do it, with the appropriate implementation. I might test something to see how it turns out.

Edited by Michael_Mike, 2012 Aug 20, 18:16


Ed

2012 Aug 20, 13:54

Jerk

@HA. Why do you behave like a jerk? Everybody is stupid except you. Come with the solution or shut up. Get a life.


d3ad0ne

2012 Aug 20, 21:26

Re: md5crk, pollard, and this malware

Obtaining a collision is not useful in this instance. Even if you were to find a collision for the original hash it is irrelevant to the second step where a different salt is added to the plain used to build the decryption key. The only options are to find the original path + program files + salt1 that creates the correct validation hash so that it can be used with salt2 for decryption, or directly attack the RC4 encryption. The only shortcut I see is to statically define the path variable and brute force the program files variable. However with "where cFileName[0] > 0x007A" my money is on it being a GUID. Brute forcing a 32 character GUID of 0-9A-F (16^32) just isn't feasible.
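The two-step check described above can be sketched roughly as follows. This is only an illustration of the scheme as discussed in this thread, not Gauss' actual code: the salt values, the UTF-16 encoding, the concatenation order, and the names `stretched_md5` and `try_pair` are all assumptions.

```python
import hashlib

ITERATIONS = 10_000  # iteration count quoted in this thread


def stretched_md5(data: bytes, rounds: int = ITERATIONS) -> bytes:
    """MD5 of the input, then `rounds` further MD5 passes over the 16-byte digest."""
    digest = hashlib.md5(data).digest()
    for _ in range(rounds):
        digest = hashlib.md5(digest).digest()
    return digest


def try_pair(path_entry, program_files_name, salt1, salt2, expected):
    """Return a candidate decryption key if the pair validates, otherwise None.

    Step 1: pair + salt1 must reproduce the validation hash.
    Step 2: the same pair + salt2 gives the key material.
    """
    # Encoding and ordering are assumptions made for illustration only.
    material = (path_entry + program_files_name).encode("utf-16-le")
    if stretched_md5(material + salt1) != expected:
        return None
    return stretched_md5(material + salt2)


# Hypothetical salts and target pair, purely to exercise the functions.
salt1, salt2 = b"\x01" * 16, b"\x02" * 16
expected = stretched_md5("C:\\Windows".encode("utf-16-le") + "SomeApp".encode("utf-16-le") + salt1)
print(try_pair("C:\\Windows", "SomeApp", salt1, salt2, expected) is not None)
print(try_pair("C:\\Windows", "OtherApp", salt1, salt2, expected) is None)
```

Because the key comes from a second hash with a different salt, an unrelated input that merely collides with the validation hash would yield a different key. As for the brute-force estimate, 16^32 equals 2^128, roughly 3.4 × 10^38 candidates, each costing about 10,000 MD5 operations to test.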


Bat

2012 Aug 21, 00:18

Questions and remarks

Gauss takes "path" and "filename/dirname" plus salt1 and hashes them.
It then takes the result of the hash (16 bytes) and rehashes it. Internally, MD5 pads the message with a single 1 bit, then 0 bits, and the message size at the end (128 bits). This is done 10,000 times.

So on this page we have the "Reference MD5" that Gauss checks against. Basically, if we could invert the MD5 transform 10,000 times, we would have the original MD5, the one produced right after the first hash. With it, we could deduce the size of the original message (that is, the size of path+filename/dirname+salt1).

But reversing an MD5 hash takes some time (depending on computing power and/or the method used). There are online databases of MD5 hashes. So if we try each hash, perhaps we can reverse one or two steps (out of 10,000), or possibly no step at all!
A question for cryptographers: are all MD5 hashes "valid"? Are there some MD5 values that can never come out of the hash function?

If I check the 4 MD5 hashes presented here with
http://www.cmd5.org/
(when an MD5 is not in the database, it says: "Not Found, it is being cracked by our background system. done:xx%")

For the first and last MD5, which someone has probably already checked with this site, the background system is at 95% ... and there is no solution so far! Strange.
If it's not a valid MD5, the payload will never unlock.

Are we sure that the MD5 hashing algorithm is the true RSA MD5?
Is the initial state the same? (a=0x67452301, b=0xefcdab89, ...)
Are the rotations the same?
Is the order of the permutations the same?

I suppose this has been tested with known patterns? That is, we feed some pattern to Gauss and check the computed MD5 against a known MD5 implementation.
Are there no "subtle" modifications?

=============================================================
Let's look at the problem from the other side.
Kaspersky stated that Stuxnet and Flame share some code, or at least a module (if I'm not wrong).
On the other hand, here:
http://www.nytimes.com/2012/06/01/world/middleeast/obama-ordered-wave-of-cyberattacks-against-iran.html?pagewanted=all
it is supposed that Stuxnet came from a government that did not want the "show to become public". But even after the "show became public", they continued to use the tools. The problem for these people is: "how can we prevent everyone who is searching from finding out what exactly we are attacking?" Just lock the payload to the target.

For Stuxnet, they began by spying for months, which gave them some information. Here it is probably the same. They found THE computer(s) to attack. How do you lock the attack? Encrypt the payload with a key that only this computer can have and that no one else can find. So my best guess is that the folder name/file name is purely random (e.g. a temporary folder that landed there, with a good old timestamp; a GUID generated by the machine is another example), but the path can also be anything that a machine can have, so if some broken program added a purely garbage path to %PATH%, they use it.
The goal is to rely on a random string that is already on THE machine and that the outside world can't know about. A combination of 2 garbage values is hard enough to find. If that is the case, the dictionary method will fail and only a brute-force attack is usable. It can even be refined a bit, with file name rules (for the file/dir name) and path name rules for "path".
But in the worst case, Gauss may also have (or have had in the past) the power to set anything in %PATH% and/or create an invalid file name in Program Files. In that case, only a true brute-force attack (or something related, i.e. one that can find anything) would work.

And even if we find the MD5, if these people did their job correctly, we will just have the first hash, not the original message.

So attack should focus on RC4 part ...


ycombinator1

2012 Aug 21, 02:19

Mystery (partially) solved


Gauss is retired now...

http://blogs.wsj.com/corruption-currents/2012/08/20/u-s-seizes-150-million-in-alleged-hezbollah-linked-cash/


Michael_Mike

2012 Aug 21, 06:15

Re: Questions and remarks

> Internally, MD5 pads the message with a single 1 bit, then 0 bits, and the message size at the end (128 bits). This is done 10,000 times.

Not exactly: all four words described in the RFC (A, B, C, D) are reset to their original values for each hash, so the result is not just a matter of appending 0s and 1s 10,000 times. Unless a mathematical shortcut is found, the only way to speed up the process is software optimisation, i.e. better managing the creation and destruction of variables.
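To illustrate the point about each pass starting afresh, here is a small sketch of the padding RFC 1321 prescribes. For a 16-byte digest the padded input is exactly one 64-byte block, so every one of the 10,000 passes is a single compression step starting from the fixed initial A, B, C, D. The helper name `md5_pad` is hypothetical.

```python
def md5_pad(message: bytes) -> bytes:
    """Apply RFC 1321 padding: a 0x80 byte, zero bytes, then the bit length (64-bit little-endian)."""
    bit_length = (len(message) * 8) % (1 << 64)
    zero_count = (55 - len(message)) % 64  # pad so the length is 56 mod 64 before the length field
    return message + b"\x80" + b"\x00" * zero_count + bit_length.to_bytes(8, "little")


digest = bytes(16)  # placeholder for any 16-byte MD5 digest
block = md5_pad(digest)
print(len(block))          # one block of 64 bytes
print(block[16:17].hex())  # the 0x80 marker right after the digest
print(block[-8:].hex())    # length field: 128 bits, little-endian
```

Since the padding bytes are the same for every 16-byte input, they carry no secret; only the 16 digest bytes change from one pass to the next.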

AFAIK we cannot really invert an MD5 hash; all we can do is find a collision. There must be more than one possible collision for a given hash value, but in the end only one will be the right one, and even that won't be able to decrypt the message, since the key is not generated from that series of MD5 hashes: it is only the first stage, a key validation.
