Wednesday, January 29, 2014

Target POS Malware vs. Open Source Tools

The popularity of my last blog post on "how" POS malware functions was honestly a bit of a surprise to me, so I am following it up with another post covering the "advanced Target malware" (as I have heard many other "experts" call it) and seeing what information I can pull out of the sample using open source tools. A large portion of my post and research was inspired by a very good write-up from Mark Yason, which you should definitely read if you have not already done so. After reading his post, I wondered how the findings of Mark, and others, would compare to the results of freely available tools. In other words, if you do not have much malware reverse engineering experience, exactly how much information can you discover with a little bit of time and access to free tools?

The first thing that I did was acquire the malware sample itself (md5sum: ce0296e2d77ec3bb112e270fc260f274), which took quite a bit of digging. Once I finally did find it, I uploaded the sample to VirusShare, where you can download it for yourself if you have an account (and if you do not have an account, you should definitely request one). Once that was done, I took a look at the file using PeStudio (download) from Marc Ochsenmeier to see what information I could pull out of the malware.
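If you grab a copy of the sample yourself, it is worth confirming that it matches the md5sum quoted above before analyzing it. The following is a minimal Python sketch of that check; the function names and the chunk size are my own choices for illustration and are not part of any tool mentioned in this post.

```python
import hashlib

# MD5 of the sample as quoted in this post.
KNOWN_SAMPLE_MD5 = "ce0296e2d77ec3bb112e270fc260f274"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_known_sample(path):
    """True if the file's MD5 matches the hash quoted in the post."""
    return md5_of_file(path) == KNOWN_SAMPLE_MD5
```

Reading in chunks keeps memory use flat, which matters little for a small executable but is a good habit when the same helper gets pointed at memory dumps.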

Using PeStudio I was able to determine there were several indicators that the file was malicious, the compile time of "Thu Nov 28 18:08:01 2013", some strings of interest, and even the possible original project path of the malware.
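The "possible compile time" that PeStudio reports comes from the TimeDateStamp field in the PE file's COFF header. As a rough illustration of where that value lives, here is a minimal Python sketch that reads it directly. This is not how PeStudio is implemented (I have not seen its source), and the function name is hypothetical. Keep in mind that the field can be forged, and that a tool may display it in local time rather than UTC.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data):
    """Return the COFF TimeDateStamp of a PE image as a UTC datetime."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # The COFF header follows the 4-byte signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)
```

To use it, read the file with `open(path, "rb").read()` and pass the bytes in.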

PeStudio initial findings
Possible compile time
A couple strings of interest, including possible project path
From these strings, I had the information that I needed to move onto the next phase, which meant setting up a fake POS environment. For my "fake POS environment" I used a stand-alone Windows XP desktop that I picked up at a second-hand store over Christmas vacation (acquired specifically to be my "Malware Box of Evil") that I put Brian Baskin's Noriben malware sandbox analyzer on. Brian's tool is free (you can donate money to him for adult beverages if you wish).

Brian Baskin's Noriben (background)(download)
 - Requires ActiveState Python (download) and Sysinternals Process Monitor (download)

I created some fake Track1 and Track2 data in a text file. Since the Target malware was specifically targeting the executable "pos.exe", I copied notepad.exe from the C:\Windows folder to the desktop, renamed it "pos.exe", and opened my text file of fake card data by dragging it onto the "pos.exe" executable. Although the executable functioned exactly the same as Notepad, this method made it appear that "pos.exe" was running and contained Track1 and Track2 data in memory. Once that was done, it was time to use Noriben to see what data I could extract from the system. Having already installed ActiveState Python, it was simply a matter of running Noriben and waiting to see what the Noriben/Process Monitor combination would reveal. I entered the command to kick off Noriben and, once the program told me it was ready, I double-clicked on the malware, which I had named "kaptoxa.exe". Surprisingly, the malware did not delete itself, "hide" itself from the Windows GUI, or anything of the sort; it remained sitting on the desktop. (This was my first definitive indication that this piece of malware was by no means "advanced".)

After about a minute I saved the fake card data to two files, foo.txt and foo2.txt, in an effort to coax more of the data into the fake "pos.exe" process, just in case. Once I was satisfied, I exited Noriben and waited for the results. Not surprisingly, Noriben extracted ALL of the information that I was hoping to see. It confirmed all of the findings that I have seen published on the Target malware to date. Every single piece of it, including the exfiltration commands sending the data to another computer with the "domain/username password" credentials "ttcopscli3acs\Best1_user BackupU$r" (surely I am not the only one who finds it ironic that the most complex string, by far, in the domain-username-password combination is the domain name). I was even able to easily find the "winxml.dll" file in the "C:\Windows\System32" folder, where the malware stored the Base64 encoded Track1 and Track2 data!
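Since the log file holds nothing more exotic than Base64, getting the Track data back out takes only a few lines. The sketch below is a minimal Python example that ASSUMES one Base64 record per line; I am not documenting the exact layout of "winxml.dll" here, so treat that layout and the function name as illustrative assumptions.

```python
import base64
import binascii


def decode_base64_lines(text):
    """Decode every non-empty line as Base64, skipping lines that are not valid.

    Assumes one Base64 record per line, which may not match the real file layout.
    """
    decoded = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            raw = base64.b64decode(line, validate=True)
        except (binascii.Error, ValueError):
            # Not valid Base64, so ignore it rather than guess.
            continue
        decoded.append(raw.decode("ascii", "replace"))
    return decoded
```

Feeding it the text of a copied log file returns a list of decoded strings that can then be searched for Track1 and Track2 patterns.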

Noriben created processes (note net use, internal IP addresses, paths, domain-username-passwords, and file naming scheme)
Noriben file activity
Noriben registry activity (note POSWDS\Image Path is the full path to where the malware resides)
Noriben network traffic
winxml.dll in C:\Windows\System32 (file logging Base64 encoded Track data)

Contents of "winxml.dll" file (Base64 encoded Track data)
So there you have it. You can take the Target POS malware and two free tools, which cost a staggering FREE, and gather pretty much all of the information that seasoned malware reverse engineers were able to find while digging through the malware sample. The research that I did on the piece of malware was accomplished in about two hours using these tools, and I am most definitely NOT a malware reverse engineer. I know some RE basics, but anyone could have used these tools and gathered the same results. The freely available tools highlight the hard work put in by tool developers like Brian and Marc, whose work allows us to automate processes and tasks that previously would have taken us many hours to perform. It also highlights the fact that this piece of malware wasn't particularly advanced by any means; it was simply created with the information needed to target the Target environment specifically (see what I did there?). Cyber criminals will continue to use malware that is only as advanced as it needs to be to allow the compromise, collection, and exfiltration of data. While companies continue to employ weak security practices such as basic username and password combinations, the attackers have to make very few, if any, modifications to existing families of malware.

Sunday, January 26, 2014

Quick overview of how some RAM scrapers work

(EDITING NOTE, 27 January 2014. After looking into the regular expressions more, the first regular expression in the malware sample matches Track1 data, so I updated the post with a table, also from tech-faq, which details the data stored in Track1. This piece of malware scans for patterns that match either Track1 or Track2 data. This method is more cumbersome (aka more resource intensive) than searching for only Track2 data, as often both Track1 and Track2 are stored in memory for a brief period of time, and extracting both sets of information means a probable duplication of data. In order to lessen the impact on a system, as well as to make the eventual exfiltration of data smaller, most attackers use malware that grabs only Track1 or only Track2 (most commonly Track2), or use another tool to "clean up" the collected data in order to take the smallest amount needed.)

It has been in the news all over the place lately. "Credit cards from major US retailer stolen", "Cyber criminals use RAM scraping malware to lift credit card transactions", etc. Every week there are stories emerging about another breach of credit card information. I have even heard some of the recent breaches referred to as "economic terrorism" (my own opinion is that comment might be on the extreme side, but it is definitely deserving of our interest).

This blog post is going to walk through some of the details that I've noticed with some of the RAM scraping malware that I have encountered, including one instance of a piece of malware that VirusTotal currently lists as non-malicious. (NOTE: The last analysis was performed on 09 November 2011 and the first analysis was performed on 20 April 2011. Until I chose to reanalyze the malware while writing this post, the 0/43 rate was what the VirusTotal result returned. The new result had 30 out of 50 AV vendors detecting the file as possibly malicious, but that is still only 60%.)

Original RAM scraper detection rate was 0/43
A much better 30/50 detection rate on rescan. But that is still only 60% of AV programs

So what makes a RAM scraper? Well, in most cases the RAM scrapers monitor processes running in memory for items that look like credit card transaction data. Credit card numbers actually do follow length requirements and beginning numbering schemes, and most follow the "Luhn algorithm" (the Wikipedia write-ups on bank card numbers and the Luhn algorithm are excellent explanations, so I will not explain them again).
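For readers who prefer code to a write-up, the Luhn check itself is tiny. Below is a minimal Python sketch of the standard algorithm; the function name is my own, and the card number in the usage note is a commonly used test number, not real card data.

```python
def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in str(number) if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0
```

For example, `luhn_valid("4111111111111111")` returns True, while changing the last digit to 2 makes it return False. A check like this is how a scraper (or a defender hunting for leaked data) can discard random 16-digit strings that merely look like card numbers.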

UPDATE: Track1 data contains more information than Track2 data, and can usually be identified by the presence of names as well as the "^" character, whereas Track2 data usually contains the "=" character. The Track1 data breakdown table is also taken from tech-faq.

Breakdown of Track1 data

What we will cover in more detail is Track2 data, which is part of the data that resides on the magnetic stripe on the back of a credit card. Track2 data is what the majority of the RAM scrapers that I have encountered monitor memory for. The table below, copied from tech-faq, lists a brief breakdown of the data that is present in Track2 data.

Breakdown of Track2 data
In most cases Track2 data is present, unencrypted in memory, for a very brief period of time, as it is against PCI requirements to store Track2 data. That is where the RAM scraping malware comes into play. It usually monitors processes that are running on the system, and when it sees data that matches the pattern that fits the Track2 data requirements, it grabs that data and either saves it to a file on the device or exfiltrates the data from the compromised system. In this example I loaded the malware into PeStudio (a great, and FREE, tool put together by Marc Ochsenmeier that can be downloaded here). The main thing to highlight is that this, and many other RAM scrapers, rely on regular expressions that fit either Track1 or Track2 data.

Track data regular expressions in this piece of RAM scraping malware. 18B46 will grab Track1 data (note the ^ character) and 18F92 will grab Track2 data (note the = character)
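To make the idea concrete, here is a minimal Python sketch of pattern matching for Track1 and Track2 data in a buffer of bytes. These are NOT the regular expressions from the malware sample (those appear only in the screenshot above); they are simplified patterns that I wrote for illustration based on the general track layouts, keyed on the "^" and "=" separators. The names are hypothetical.

```python
import re

# Simplified, illustrative patterns (not the ones from the malware sample).
# Track1: %B<13-19 digit PAN>^<name>^<YYMM + 3-digit service code>...?
TRACK1_RE = re.compile(rb"%B\d{13,19}\^[^\^\r\n]{2,26}\^\d{7}[^?\r\n]*\?")
# Track2: ;<13-19 digit PAN>=<YYMM + 3-digit service code>...?
TRACK2_RE = re.compile(rb";\d{13,19}=\d{7}\d*\?")


def find_track_data(buffer):
    """Return all Track1-like and Track2-like byte strings found in `buffer`."""
    return {
        "track1": TRACK1_RE.findall(buffer),
        "track2": TRACK2_RE.findall(buffer),
    }
```

The same patterns are useful defensively: running them over a suspicious file or a memory dump shows very quickly whether unencrypted track data is sitting where it should not be.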

Possible malware indicators detected automatically by PeStudio

If only there was a way that AV (or other) solutions could monitor memory and/or processes for other executables that search for Track data... Of course the cyber criminals would then likely change their methodologies in order to circumvent that as well, but this is likely the next stage of a very long battle between those trying to steal information and those trying to protect it.

Monday, January 20, 2014

Identifying TrueCrypt Volumes For Fun (and Profit?)

The volatility crew posted a couple of new plug-ins that can search through a memory dump and extract TrueCrypt passwords (please read about it here). But what about TrueCrypt volumes themselves, how can you find those on a system? There are some tools (like TCHunt) and EnScripts (TrueCrypt File Locator and Encrypted Data Finder) that can help you to find TrueCrypt volumes. (DISCLAIMER: I am no longer a user of EnCase and the scripts may/may not still reside at those locations so I cannot verify how well the EnScripts work). But what if you do not have access to those tools, or what if you want to know more about WHAT these scripts are actually doing?

You can choose to encrypt your entire file system or drive, which protects the contents of your hard drive, but without a decryption key, your drive is essentially nothing more than a paperweight. You can also create individually encrypted file containers on a non-encrypted file system/drive.

This post will not cover the full disk encryption. However, it is important to note that some of the TrueCrypt characteristics discussed in this post, particularly the character distribution, will apply to anything created with TrueCrypt. For this post, we are going to focus on individually encrypted files created with TrueCrypt.

The first thing to note about files created with TrueCrypt is that the encrypted file is essentially a file container. On a Windows system, you can go through the process and you are presented with an option of either choosing a FAT or NTFS formatted container (if the file size is less than 3792KB, you can only choose FAT or None). If you choose "None" you will have to format the file after it is mounted, but usually this option is not chosen.

TrueCrypt container format options

Now that we have a little bit of background information, we can move on to the fun part, trying to identify TrueCrypt volumes on your drive. One of the tell-tale signs of a file being a TrueCrypt volume is anything with the extension ".tc". This is the default association of a file as being a TrueCrypt file, but to be honest, if you find ".tc" files, the level of sophistication of the user(s) of the drive may not be very high. Or perhaps there are hidden volumes created within the TrueCrypt volumes meant to throw you off (that will be a post for another time!)

Usually TrueCrypt volumes can be found in folders where "large" (several hundred MB and larger) files are commonly stored and are renamed to look normal among those files. For example, some locations I have encountered are folders associated with Outlook.pst files and downloaded videos.

Another thing to note with TrueCrypt volumes is the file sizes that the file "has" to be. The smallest size that a volume can be is 292KB, so we probably want to look for files that have a minimum size of 292KB (299,008 bytes). Because TrueCrypt volumes can increase in size in 1KB (1024 byte) increments, we could search only for files whose sizes are cleanly divisible by 1024 bytes. However, as the smallest allocation unit on a disk is a sector (which is typically 512 bytes), we will search for any file sizes that are cleanly divisible by 512 bytes. I would rather have more possible false positives than potentially miss some possible TrueCrypt volumes.

Three 292KB TC volume file properties

For example, one of the files referenced above "" has a file size (not size on disk, although with TrueCrypt volumes they are usually the same) of 299,008 bytes. We can take the size of that data and divide it by 512. If the result comes back clean, aka a whole number, our file has passed the size check test, so the file is possibly a TrueCrypt file.

299,008 / 512 = 584 = Pass!

Test-document.docx file properties

For another example, here's a test document that I created - "Test-document.docx" which has a file size of 461,359 bytes. Once again, we take the size and divide it by 512.

461,359 / 512 = 901.091796875 = Fail!

From this, we have learned that we really don't have to look any further at the file "Test-document.docx" because it failed one of the easiest tests to perform when trying to find a TrueCrypt volume. Great success!!
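The size test is easy to script across a whole folder tree. Below is a minimal Python sketch using the two thresholds from this post (a 292KB minimum and a multiple of 512 bytes); the function names are my own and are not taken from TCHunt or the EnScripts mentioned earlier.

```python
import os

MIN_VOLUME_SIZE = 292 * 1024  # 299,008 bytes, the smallest volume size
SECTOR_SIZE = 512             # typical sector size


def passes_size_check(size_in_bytes):
    """True if the size is at least 292KB and a clean multiple of 512 bytes."""
    return size_in_bytes >= MIN_VOLUME_SIZE and size_in_bytes % SECTOR_SIZE == 0


def candidate_files(root):
    """Yield (path, size) for every file under `root` that passes the size check."""
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file; skip it
            if passes_size_check(size):
                yield path, size
```

With the two examples above, `passes_size_check(299008)` is True and `passes_size_check(461359)` is False. Expect plenty of false positives from this test alone, which is why the next two checks matter.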

The next step that we can take to determine if a file is possibly TrueCrypt or not is to try to discern the file signature. There are MANY ways to do this; the example I am going to demonstrate is running "file" on a Windows system using GnuWin32.

Running "file" against our two files of interest

Running "file" recognizes the possible TrueCrypt volume as "data", but the Office document as a "ZIP" file (which it is and has been since Office 2007). So now we know that the file "" has passed the file size test and that it does not have a discernible file header. High five!!
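If "file" is not available, a very rough stand-in is to compare the first few bytes of a file against a handful of well-known signatures. The Python sketch below does this with a deliberately tiny table that I picked for illustration; it is nowhere near as thorough as the magic database that "file" uses, and the function name is hypothetical.

```python
# A tiny, illustrative signature table; "file" knows thousands more.
KNOWN_SIGNATURES = [
    (b"PK\x03\x04", "ZIP"),  # also Office 2007+ documents
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2"),  # older .doc/.xls/.ppt
    (b"%PDF", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF8", "GIF"),
    (b"MZ", "MZ executable"),
]


def identify_header(first_bytes):
    """Return a label for a recognized header, or "data" if nothing matches."""
    for signature, label in KNOWN_SIGNATURES:
        if first_bytes.startswith(signature):
            return label
    return "data"


def identify_file(path):
    """Read the first 16 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify_header(handle.read(16))
```

A result of "data" only means that no signature in this short table matched, so it narrows the field rather than proving anything.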

High Five!

Now for the last step in trying to determine if our file is possibly a TrueCrypt file or not. There are once again several ways to do this, but I prefer to use Hex Workshop for this blog post as it makes prettier pictures than running a script of some sort. What we are looking for now is "entropy", aka "randomness", aka "character distribution". File data is stored as an arranged series of bytes on disk. What we are going to look for is how many times certain characters occur in the file. In most files, some characters will occur far more often than others (\x00, \xFF, etc.). In a TrueCrypt volume, the program tries to distribute the characters in a completely "random" (i.e. equally occurring) manner.

To explain this in simpler terms, if a "normal" file is made up of "01189998819991197253" (7 occurrences of 9), a TrueCrypt file will be made up of "01234567890123456789" (each character occurring only twice). With smaller TrueCrypt volumes the percentage of character occurrence is close to being "random", but it is not perfect. The highest percentage of character occurrences that I have seen in testing is 0.47%. As the size of the volume increases though, the percentage levels off at about 0.39% (with 256 possible byte values, a perfectly even distribution gives each value 1/256, or roughly 0.39%, of the file). I have seen tools demonstrate the "randomness" on a scale of 1 to 10, 1 to 8, and even 0 to 1, with different values correlating to how random the data is. Once again, for the purposes of this blog post, I prefer to use Hex Workshop to demonstrate the distribution. Below are three examples from random Word documents I downloaded from the web, three examples from smaller TrueCrypt files (292KB), and two examples from larger TrueCrypt files (1GB). Even though the actual number of times the characters appear varies, what we are looking for is the percentage of how many times those characters occur.
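For those who would rather run a script than look at pictures, the same measurement is simple to compute. The Python sketch below is a minimal illustration with function names of my own; it is not what Hex Workshop does internally. It returns the highest per-byte percentage (the number discussed above) and the Shannon entropy in bits per byte, which runs from 0 to 8 and is one common "randomness" scale.

```python
import math
from collections import Counter


def max_byte_percentage(data):
    """Return the percentage of the file taken up by its most common byte value."""
    if not data:
        raise ValueError("no data to analyze")
    counts = Counter(data)
    return max(counts.values()) / len(data) * 100


def shannon_entropy(data):
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        raise ValueError("no data to analyze")
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        probability = count / total
        entropy -= probability * math.log2(probability)
    return entropy
```

A perfectly even buffer such as `bytes(range(256)) * 4` gives a maximum of 0.390625% and an entropy of 8.0, while a buffer of all zero bytes gives 100% and an entropy of 0. Note that compressed files (including the ZIP containers used by newer Office documents) also score high, so this test works best after the size and signature checks have already narrowed things down.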

finalreport.authcheckdam.doc character mapping

PilotProjectFinalReportRev2.doc character mapping

TCR-AL252005_Wilma.doc character mapping

(292KB) character mapping

(292KB) character mapping

(292KB) character mapping

TC-hidden-test (1GB) character mapping

TC-test (1GB) character mapping

To summarize all of the topics covered in this post, there are three main characteristics of TrueCrypt volumes that we can test for:

- Minimum File size / File size being a multiple of 512 bytes
- File Signature check
- Character distribution

I sincerely hope that you enjoyed this post on identifying TrueCrypt volumes for fun (and profit?). Hopefully you did indeed have a little fun reading this and can eventually turn that into profit of some sort!

Tuesday, January 14, 2014

All memory dumping tools are not the same

<DISCLAIMER: I am not an in-depth technical expert on memory analysis, and your results and analysis may vary>

A few days ago, Takahiro made a blog post regarding some issues that he discovered while processing a 16GB memory dump from a Windows 7 machine (if you have not read the post, you can find it here). Naturally the findings Takahiro described in his post were concerning since we, as first responders, are often required to gather memory (through a variety of methods), and if Microsoft is encrypting* portions of RAM in a way that will ultimately hamper our analysis, the methods that we use to gather RAM should be adjusted accordingly.

*COMMENT: The volatility crew released a blog post yesterday that did confirm encoding (not encrypting) does occur in some 64 bit dumps, and the method that your tool of choice uses to create the memory dump will determine if the debug block will be decoded or not.

I had access to one Windows 7 machine with 16GB of RAM, but in order to cover as many bases as possible (as so many of us do), I relied on collaboration (in this case with Mari DeGrazia), who had access to Windows 7 devices with 16GB of memory as well. Mari acquired her memory dumps using the free version of DumpIt (v1.3.2.20110401), Belkasoft RAMCapture 64, and FTK Imager. I acquired memory from my system with the paid version of Moonsols DumpIt (v2.0.0.20130823), the free version of DumpIt (v1.3.2.20110401), Belkasoft RAMCapture 64, and FTK Imager.

On all of the memory dumps, regardless of the tool that was used to create the dump, the "imageinfo" plugin ran for a long period of time without finishing (I finally cancelled the processes after 2 hours; Mari had similar results). This is likely because of how volatility currently processes the image information scan, which involves some initial kdbg (KdDebuggerDataBlock) searching, since kdbgscan itself worked fairly quickly on all of the images except for the one created by the free DumpIt tool.

"imageinfo" running for over two hours, with no results = sad panda :(

On the Windows 7 machines, we found that, when we used the appropriate profile (Win7SP1x64 in this case), FTK Imager and RAMCapture64 created memory dumps that were processed with no issues using volatility 2.3, and my findings were that the paid version of the Moonsols tool produced a dump that was processed with no issues as well. However, the free Moonsols tool produced a memory dump that could not be processed with the default volatility settings, nor when giving volatility some of its more robust options. (COMMENT: If you are manually inputting the kdbg offset, make sure you are using the "virtual" location(s), as the "physical" location will not provide good (if any) results.)
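For reference, the kind of volatility 2.x command line described above can be assembled with a few lines of Python. This is a minimal sketch: it assumes "vol.py" is on your PATH, the image file name and the kdbg address in the usage note are made-up placeholders, and the helper names are hypothetical. The "-f", "--profile", and "--kdbg" options are standard volatility 2.x options.

```python
import subprocess


def volatility_command(image_path, plugin, profile="Win7SP1x64", kdbg=None, vol="vol.py"):
    """Build a volatility 2.x argument list; `kdbg` is a VIRTUAL address (int) or None."""
    command = [vol, "-f", image_path, "--profile=" + profile]
    if kdbg is not None:
        # Manually specify the kdbg location instead of letting volatility scan for it.
        command.append("--kdbg=" + hex(kdbg))
    command.append(plugin)
    return command


def run_volatility(image_path, plugin, **options):
    """Run the command and return its text output (requires volatility to be installed)."""
    result = subprocess.run(
        volatility_command(image_path, plugin, **options),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, `volatility_command("memdump.raw", "kdbgscan")` builds the scan command, and `volatility_command("memdump.raw", "pslist", kdbg=0xf80002a3c0a0)` builds a pslist run with a placeholder kdbg address filled in from the kdbgscan output.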

The reason for this is likely that volatility, in its current state, attempts to find the kdbg and, as soon as it finds a signature match, uses that location to determine the profile type and other items within the memory dump. It also uses the operating system profile that the examiner gives to volatility when extracting information with the various volatility plugins. It is possible for volatility to detect the "wrong" profile, as it currently searches for the kdbg information based on a signature search (in fact, this has been identified as a possible issue and is currently being worked on by the volatility team (one example is here)).

With the free DumpIt tool, even with the in-depth option of manually specifying the kdbg location, volatility still did not produce any results from the memory dump. This may be because the free DumpIt reads the memory differently than other tools (that is likely, as all memory dumping tools are not the same), and it aligns perfectly with the results that Takahiro, Mari, and I encountered. Below are the results of kdbgscan and pslist from the memory dumps created by each tool:

kdbgscan from free Moonsols DumpIt memory dump

kdbgscan from paid Moonsols DumpIt memory dump

kdbgscan from FTK Imager memory dump

kdbgscan from Belkasoft RAMCapture memory dump

pslist from paid Moonsols DumpIt memory dump

pslist from FTK Imager memory dump

pslist from Belkasoft RAMCapture memory dump

pslist: Free DumpIt, kdbg specified, no good results

netscan: Free DumpIt, kdbg specified, no good results

So, for now, the best recommendation that we can make is to know and understand your tools and their limitations. While you may not need to know the exact API and methodology that each tool uses, you should be aware that in some cases some tools produce better results than others. I personally like the Belkasoft RAMCapture tools, but always make sure that I have additional tools (Moonsols, FTK Imager, f-response, etc.) on hand if I encounter issues while creating a memory dump. As when it comes to imaging a hard drive, the "usual" way is good, but always make sure that you have some other options, just in case.

One more item of interest that I encountered is that, on a properly patched/updated Windows 7 system and later (including Windows 8), the OS now prevents writing to the Windows/System32 folder from a command line entry (even with administrative privileges) without other, external factors. So, for example, a batch scripted collection using DumpIt (free and paid version, although the free version does not work on Windows 8) will cause an error and the memory will not be collected. If you click on the icon for the DumpIt executable the operation works perfectly, but be aware if you are trying to use DumpIt with a batch script or running the program from the command-line.

Cannot find C:\Windows\system32\DumpIt.sys

This is likely an issue that will continue to pop up as the size of RAM increases and the tools that we currently use to acquire memory, as well as analyze memory, continue to evolve. We are just scratching the surface on this topic and there are sure to be more posts on this subject in the future!