You may have stripped Symbols…

Posted by Eric | General | Wednesday 28 January 2009 9:22 pm

When programming, it is always nice to be verbose. Being verbose in your code comments and your error messages is something taught in most academic programming courses and even in the good programming books. Today, however, I saw a new level of verbosity that cut my reversing time by a large amount.

As with all firmware images, the developers stripped the symbols before release. As I mentioned in my previous post, this makes things difficult because all you have is assembly and not much else to go on. You get a series of sub_xxxxx calls and that’s about it. With regular binary images, stripped symbols are not that big a problem because you can use FLIRT signatures (amongst other methods) on ELF or PE binaries to fill in a few gaps. You can occasionally use FLIRT signatures on firmware too, but I’ve found it only works on x86-based images, unless of course you can develop your own signatures for the architecture you are working with.

Often you will find your print functions fairly quickly because they are called the most. You can also use string xrefs to follow what’s going on, and after a while of doing it you notice patterns in how the arguments are passed. You pretty much just go: OK, that’s fprintf, or printf, or some homebrew print_to_term.

Today, as I was tracking down some common functions before really tackling things, I stumbled into something interesting. It’s more luck than anything else since it rarely happens, but cool for me nonetheless. I started seeing function names in strings!

The developers were so verbose in their errors that the messages for failed mallocs and other failures stated the name of the function they occurred in. So as you are scrolling through the strings you suddenly find something like:

“Malloc Failed in: parseHttpRqst()\n”

As I mapped the cross reference to the string and followed the code for a minute, I realized it was part of the error checking for malloc in the function… parseHttpRqst. So far I have cleared out about 12,000 of the 67,000 functions this way.
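
For anyone who wants to automate the first pass, here is a minimal sketch of pulling candidate function names out of a string dump. It is not the process I actually used; the message shape is assumed from the single example above, and `candidate_names`, the regex, and the sample addresses are all made up for illustration. You still have to follow each xref by hand (or in your disassembler's scripting layer) to confirm the string really lives in the function it names.

```python
import re

# Assumed message shape, based only on the one example above:
# "<something> in: funcName()". Other firmware will use other formats.
NAME_IN_MESSAGE = re.compile(r"\bin:\s*([A-Za-z_][A-Za-z0-9_]*)\s*\(\)")


def candidate_names(strings):
    """Map string address -> function name embedded in the message text.

    `strings` is an iterable of (address, text) pairs, e.g. exported
    from a disassembler's strings window.
    """
    found = {}
    for address, text in strings:
        match = NAME_IN_MESSAGE.search(text)
        if match:
            found[address] = match.group(1)
    return found


# Hypothetical sample data; the addresses are invented.
sample = [
    (0x1000, "Malloc Failed in: parseHttpRqst()\n"),
    (0x1040, "Usage: %s <file>\n"),
]
names = candidate_names(sample)
```

Each hit is only a candidate: the name applies to whichever function references that string, so the xref still decides what gets renamed.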

They may have made my job just a little more difficult but… I got more verbose function names in 5 hours than what I could have gotten on my own within 3 weeks! w00t

AVG And Nessus

Posted by Eric | General | Friday 23 January 2009 8:06 pm

Not sure if anyone else is seeing this, but it has been steadily vexing me for the past 6 months. One of my clients uses Nessus on a regular basis, and about 6 months ago he started seeing “No significant Problems” reported for almost every single host. Knowing the networks he’s been scanning pretty well, this perplexed me. At first I kept wanting to believe the problem was him forgetting to disable his firewall. Finally he was close enough to town and called me up after hitting the same problem again. I drove out, and as I was digging through the Nessus logs I started seeing numerous pop-ups from AVG.

Once the antivirus is disabled, results start populating for the hosts almost instantly. I haven’t been able to figure out how AVG is influencing the scans, but I know it is. I also googled the shit out of the topic with no results, so am I the only person having this problem?

The things I’ve learned

Posted by Eric | General | Saturday 17 January 2009 5:54 pm

In late 2007 I took a position doing reverse engineering, mostly on embedded systems. RE was something I had wanted to get into professionally for some time, but I could never find a segue into it. Now that I’m in the thick of it, I’ve learned quite a bit through experience. Reverse engineering takes a special breed. It takes a lot of patience to stare at a debugger or disassembler all day long. There are times I walk out of work with my eyes bloodshot from staring at bindiff or IDA all day. This is the primary reason my blog has fallen off course. By the time I get home lately, my desire to sit behind the computer isn’t always there. I mean, I want to do it but my brain tells me no! Here are a few items off the top of my head about reverse engineering embedded systems. Sometimes I’d rather take obfuscated malware than this stuff…

Learning
The ability to learn fast and get spun up on something such as an architecture is essential for doing this. Quite often, reverse engineering positions deal with malware, specifically malware on x86. Although malware can be seen on Mac and Linux, the majority of it is found on Intel-based Windows systems. There, a concrete knowledge of a single operating system and architecture will generally serve you well. When it comes to embedded systems, however, you are talking about dozens of operating systems over a handful of architectures. You need to be able to pick up the core concepts of operating systems and architectures really fast.

Architectures
For some reason the developers of the systems I have worked on can’t make up their minds. One device is x86 and then the next version is ARM, then they hopped over to PPC for last year’s release, and this year’s device is x86 again. WTF! It gets confusing hopping back and forth between assembly languages. What makes it worse is when you have to find differences in certain features, such as protocol implementations or the way the device reads in data. It makes diffing a little bit trickier.

For embedded reversing your major architectures are x86, MIPS, ARM, and PPC. Despite what some of my amigos think, PowerPC is far from dying. I say this because it is the predominant architecture in the devices I’ve worked on.

Algorithms
Knowing data structures and algorithms helps because you start to see patterns in the disassembly and can tell what’s going on a lot faster. Aside from that, just being able to identify structures in disassembly will often bring large portions of code together for you and make your life a lot easier.

Symbols
Every so often a vendor’s development team will screw up and forget to strip symbols from an updated firmware image they push out. When this happens you’d better be on your game, because they will pull it from their site within minutes of realizing what they did. Most often firmware images do not contain symbols, which makes life a lot harder. When you have 40,000-60,000 unnamed functions, no imports, no exports… nothing, it makes life a bitch. Sometimes you can get around pretty well with just string references and figure out what’s going on. Any little bit helps, but sometimes it would just be so much easier to have symbols ):

Slow Roll Your Analysis
In July of last year Cody Pierce wrote a blog post on DVLABS about cross references. One of the things he brought up was identifying common functions and clearing them out early on. As you are going through a firmware image that has 60,000 functions, would you prefer to repeatedly see CALL loc_67499, or would you rather see CALL print_to_term? Instead of going straight to my objective, I run an IDAPython script that loops through and counts the number of xrefs to each function. Then I start at the top and work my way down, usually analyzing the first 20 or so functions because they are the most xrefed functions in the image. Later you realize the payoff as you are going through code and see named functions instead of CALL loc_addr/sub_addr names.
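
The counting step itself is simple. Below is a rough standalone sketch of the ranking logic, not my actual IDAPython script: it takes (caller, callee) pairs from wherever you can export them, and `rank_by_xrefs` and the sample names are made up for illustration. Inside IDA you would build the pairs from the database instead; I believe `idautils.Functions()` and `idautils.CodeRefsTo(ea, 0)` are the usual starting points, but that part is not shown here.

```python
from collections import Counter


def rank_by_xrefs(call_edges, top=20):
    """Return the `top` most-referenced callees as (name, count) pairs.

    `call_edges` is an iterable of (caller, callee) pairs, one pair per
    cross reference.
    """
    counts = Counter(callee for _caller, callee in call_edges)
    return counts.most_common(top)


# Hypothetical call edges; the names are invented for the example.
edges = [
    ("sub_1000", "sub_67499"),
    ("sub_1100", "sub_67499"),
    ("sub_1200", "sub_67499"),
    ("sub_1000", "sub_2000"),
    ("sub_1100", "sub_2000"),
    ("sub_1200", "sub_3000"),
]
ranking = rank_by_xrefs(edges, top=2)
for name, count in ranking:
    print(f"{name}: {count} xrefs")
```

The functions at the top of that list are the ones worth naming first, since every name you assign there shows up all over the rest of the image.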

Magic Numbers
Get intimately familiar with magic numbers! From ELF to compression formats, they will come in handy if you have them embedded in your brain. Most firmware images are compressed with some algorithm, and in some cases you will see numerous compression blocks. Being able to identify these numbers in a hex editor will save you a lot of time when trying to find what you are really looking for. Many times the first few segments are not compressed and consist of bootloader code and the decompression routine. The meat of what you are looking for is most likely compressed!
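
Until the numbers are burned into your brain, a few lines of Python can do the scanning for you. This is only a sketch: the table is a small sample of well-known signatures, not a complete list, and `MAGICS`, `find_magics`, and the test blob are names and data I made up for the example. Short signatures like the bzip2 one will produce false positives on real images, so treat every hit as a lead to check in the hex editor.

```python
# A small sample of common signatures; real images may use others.
MAGICS = {
    "ELF": b"\x7fELF",
    "gzip": b"\x1f\x8b\x08",          # gzip header with deflate method
    "bzip2": b"BZh",
    "xz": b"\xfd7zXZ\x00",
    "uImage": b"\x27\x05\x19\x56",    # U-Boot legacy image header
    "squashfs (little-endian)": b"hsqs",
}


def find_magics(data):
    """Return a sorted list of (offset, name) for every signature hit."""
    hits = []
    for name, magic in MAGICS.items():
        start = 0
        while True:
            offset = data.find(magic, start)
            if offset == -1:
                break
            hits.append((offset, name))
            start = offset + 1
    return sorted(hits)


# Synthetic blob standing in for a firmware image.
blob = b"\x00" * 16 + b"\x7fELF" + b"\x00" * 12 + b"\x1f\x8b\x08" + b"\x00" * 5
hits = find_magics(blob)
for offset, name in hits:
    print(f"0x{offset:08x}  {name}")
```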


Hex Editor
This is another thing you need to have a good relationship with. Unlike PE and ELF files, you can’t just drop a raw binary image into IDA Pro and get magical results. You will need to open the binary in a hex editor first and identify structs and code from the bootloader and decompression routine. Your hex editor will also come in very handy when mapping the general layout of the image. The next logical step in the hex editor is to remove the data before your compressed code so you can decompress it.
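
That last step can also be scripted once you know the offset. Here is a minimal sketch that assumes the compressed block is a plain gzip stream, which is only one of the formats you might run into; `carve_gzip` and the synthetic image are invented for the example. A vendor-specific or headerless format needs its own handling.

```python
import gzip
import zlib


def carve_gzip(image, offset):
    """Decompress the gzip stream that starts at `offset` in `image`.

    Everything before `offset` is skipped. Bytes after the end of the
    stream are ignored by the decompressor (they land in `unused_data`).
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    return decompressor.decompress(image[offset:])


# Synthetic image: fake bootloader bytes, a gzip block, then padding.
payload = b"firmware body " * 8
image = b"BOOT" * 4 + gzip.compress(payload) + b"\xff" * 32

offset = image.find(b"\x1f\x8b\x08")  # locate the gzip magic number
recovered = carve_gzip(image, offset)
```

Using a decompressor object rather than a one-shot call is deliberate: firmware images usually have more data after the compressed block, and this way the trailing bytes do not cause an error.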