Monday, July 24, 2017

Beginning x86 disassembly – Understanding "For" statements with Visual Studio 2017

In this series of posts, I’m going through the Open Security Training material for beginning assembly language and putting my own spin on things to enhance my knowledge of x86 disassembly. To make the most of these tutorials, however, you may be better off reviewing the material from Open Security Training directly.

Let’s get started!

To understand for statements, let’s start with the code below:

/*
* This file focuses on flow control via conditional and unconditional jumps using the "for" statement
* The objective is to get a better understanding of how C "for" loops are disassembled
* Author Nik Alleyne
* Blog: securitynik.blogspot.com
* File: flow_control_for.c
*
*/
#include <stdio.h>

int main()
{
  int i;
  for (i = 0; i < 10; i++)
    {
      printf("Hey We Are Making Progress!!\n");
    }
}

… and here is the disassembled code

int main()
{
00401000  push        ebp 
00401001  mov         ebp,esp 
00401003  push        ecx 
  int i;
  for (i = 0; i < 10; i++)
00401004  mov         dword ptr [ebp-4],0 
0040100B  jmp         00401016 
0040100D  mov         eax,dword ptr [ebp-4] 
00401010  add         eax,1 
00401013  mov         dword ptr [ebp-4],eax 
00401016  cmp         dword ptr [ebp-4],0Ah 
0040101A  jge         0040102B 
    {
      printf("Hey We Are Making Progress!!\n");
0040101C  push        404000h 
00401021  call        00401080 
00401026  add         esp,4 
    }
00401029  jmp         0040100D 

}
0040102B  xor         eax,eax 
0040102D  mov         esp,ebp 
0040102F  pop         ebp 
00401030  ret   
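Before stepping through the listing, it may help to see its control flow written back in C. The sketch below restructures the same loop with “goto” so that each label lines up with a jump target in the disassembly. The function name “run_loop”, the labels and the “body_runs” counter are my own illustrative additions, not something the compiler produces.

```c
#include <stdio.h>

/* The loop from main(), restructured with goto so each label matches a
 * jump target in the listing. Returns how many times the body ran and
 * stores the final value of i through final_i. Names are illustrative. */
int run_loop(int *final_i)
{
    int i;
    int body_runs = 0;

    i = 0;                  /* 00401004  mov dword ptr [ebp-4],0          */
    goto check;             /* 0040100B  jmp 00401016                     */
increment:
    i = i + 1;              /* 0040100D..00401013  load, add 1, store     */
check:
    if (i >= 10)            /* 00401016  cmp ...,0Ah / 0040101A  jge      */
        goto done;
    printf("Hey We Are Making Progress!!\n");  /* push, call, add esp,4   */
    body_runs++;
    goto increment;         /* 00401029  jmp 0040100D                     */
done:
    *final_i = i;           /* execution continues at 0040102B            */
    return body_runs;
}
```

Called as “run_loop(&n)”, this prints the line ten times and leaves “n” at 10, the first value for which the comparison sends execution out of the loop.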

Let’s now get into the code. As always, the first instructions, “push ebp” and “mov ebp,esp”, form the function prologue, which you can learn more about in this post.

The next instruction, “push ecx”, pushes 4 bytes onto the stack to make room for the local variable “int i”. The push is used here simply to move ESP down by 4; the value being pushed does not matter. If we take a look at the registers before the push, we see “ecx” has a value of “1”. This matches the argument count when the program is run from the command line with no extra arguments, since the only argument is the name of the executable itself.

EAX = 0F291944 EBX = 0023F000 ECX = 00000001 EDX = 004043B4 ESI = 00401440 EDI = 00401440 EIP = 00401003 ESP = 0019FF04 EBP = 0019FF04 EFL = 00000202

If we take a quick glance at the stack we see the following:

0x0019FF04  0019ff18  .ÿ.. – EBP pointing to the top of the stack
0x0019FF08  0040142e  ..@. – Return Pointer
0x0019FF0C  00000001  .... – EBP+8 points to the argument count
0x0019FF10  005a43c0  ÀCZ.
0x0019FF14  005ac940  @ÉZ.
0x0019FF18  0019ff70  pÿ..
0x0019FF1C  00401310  ..@.


As a result, when “push ecx” is executed, the value “1” is placed at the top of the stack. This is not a concern, however, as it will be overwritten with “0” when the loop is initialized.

The next instruction is “mov dword ptr [ebp-4],0”. This moves “0” into “[ebp-4]”, which is the memory address being used for “int i”.

Looking at the registers after this instruction is executed we see

EAX = 0F291944 EBX = 0023F000 ECX = 00000001 EDX = 004043B4 ESI = 00401440 EDI = 00401440 EIP = 0040100B ESP = 0019FF00 EBP = 0019FF04 EFL = 00000202

Looking at the stack we see

0x0019FF00  00000000  .... – EBP-4 – Now has a value of “0”
0x0019FF04  0019ff18  .ÿ.. - EBP
0x0019FF08  0040142e  ..@.
0x0019FF0C  00000001  ....
0x0019FF10  005a43c0  ÀCZ.
0x0019FF14  005ac940  @ÉZ.
0x0019FF18  0019ff70  pÿ..


So the current value of “i” is “0”.
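The two steps so far (“push ecx” reserving the slot, then “mov dword ptr [ebp-4],0” initializing it) can be replayed with a tiny stack model in C. This is only a sketch: the “Machine” struct, the helper names and the modeled address range are assumptions of mine, while the register values are the ones shown in the dumps above.

```c
#include <stdint.h>

#define LOW_ADDR  0x0019FEE0u   /* hypothetical lowest address modeled  */
#define NUM_SLOTS 16u           /* covers 0x0019FEE0 .. 0x0019FF1C      */

typedef struct {
    uint32_t esp, ebp, ecx;
    uint32_t mem[NUM_SLOTS];    /* one entry per 4-byte stack slot      */
} Machine;

/* Map a modeled stack address to its slot in mem[]. */
static uint32_t *slot(Machine *m, uint32_t addr)
{
    return &m->mem[(addr - LOW_ADDR) / 4u];
}

/* push: move ESP down by 4, then store the value at the new top. */
static void push32(Machine *m, uint32_t value)
{
    m->esp -= 4u;
    *slot(m, m->esp) = value;
}

/* Replays "push ecx" and "mov dword ptr [ebp-4],0" starting from the
 * register state in the post (ESP = EBP = 0019FF04, ECX = 1). */
void reserve_and_init_i(uint32_t *after_push, uint32_t *after_mov,
                        uint32_t *esp_out)
{
    Machine m = {0};
    m.esp = 0x0019FF04u;
    m.ebp = 0x0019FF04u;
    m.ecx = 1u;

    push32(&m, m.ecx);                    /* push ecx                   */
    *after_push = *slot(&m, m.ebp - 4u);  /* [ebp-4] holds leftover 1   */

    *slot(&m, m.ebp - 4u) = 0u;           /* mov dword ptr [ebp-4],0    */
    *after_mov = *slot(&m, m.ebp - 4u);   /* [ebp-4] now holds 0        */

    *esp_out = m.esp;                     /* 0019FF00, same as ebp-4    */
}
```

The point of the model is that the pushed “1” is just whatever ECX happened to contain; the slot only becomes “i” once the “mov” writes 0 into it.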

The next instruction, “jmp 00401016”, moves execution to the instruction at memory address “00401016”, skipping over the increment at “0040100D” on the first pass. The instruction there is “cmp dword ptr [ebp-4],0Ah”. It compares the value at “[ebp-4]”, which is “i” and is currently “0”, with 10. Internally it computes 0 - 10, discards the result and sets the “EFLAGS” accordingly.

Let’s take a look at these flags after the “cmp” has executed:

OV = 0 UP = 0 EI = 1 PL = 1 ZR = 0 AC = 1 PE = 1 CY = 1

The next instruction is “jge 0040102B”. This jumps if the value at “[ebp-4]” is greater than or equal to “0Ah”, which is 10 in hex, with both treated as signed values. More specifically, the jump is taken if “SF = OF”. In Visual Studio parlance, this is “PL = OV”. As we can see above, these two flags are not the same: “OV = 0” and “PL = 1”. As a result, the “jge” is not taken and execution falls through to the next instruction in the sequence.
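To make the flag logic concrete, here is a minimal C sketch that derives the flags Visual Studio displays from a 32-bit “cmp a, b” and then applies the “jge” rule. The “Flags” struct and the function names are hypothetical; the formulas are the standard ones for a 32-bit subtraction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag names follow the Visual Studio register window:
 * OV = overflow, PL = sign, ZR = zero, AC = auxiliary carry,
 * PE = parity even, CY = carry. */
typedef struct { bool ov, pl, zr, ac, pe, cy; } Flags;

/* Flags produced by "cmp a, b", which computes a - b and discards it. */
Flags cmp_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    uint32_t low = r & 0xFFu;
    int ones = 0;
    int k;
    Flags f;

    for (k = 0; k < 8; k++)               /* count 1 bits in low byte   */
        ones += (int)((low >> k) & 1u);

    f.ov = ((((a ^ b) & (a ^ r)) >> 31) & 1u) != 0; /* signed overflow  */
    f.pl = ((r >> 31) & 1u) != 0;                   /* result negative  */
    f.zr = (r == 0u);                               /* operands equal   */
    f.ac = (((a ^ b ^ r) >> 4) & 1u) != 0;          /* borrow from bit 4 */
    f.pe = (ones % 2 == 0);                         /* even parity      */
    f.cy = (a < b);                                 /* unsigned borrow  */
    return f;
}

/* jge is taken when the sign and overflow flags agree (SF = OF). */
bool jge_taken(Flags f)
{
    return f.pl == f.ov;
}
```

For “cmp_flags(0, 10)” this gives OV = 0 and PL = 1, so “jge_taken” is false, matching the flag dump above.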

The next instruction is “push 404000h”. This pushes the immediate value “404000h”, which is the address of our string, onto the top of the stack. It is the argument that the “printf()” function will print.

Looking at the registers first:

EAX = 0F291944 EBX = 0023F000 ECX = 00000001 EDX = 004043B4 ESI = 00401440 EDI = 00401440 EIP = 00401021 ESP = 0019FEFC EBP = 0019FF04 EFL = 00000297

Now looking at the stack, we see that the value “404000h” which is the pointer to our string has been placed at the top of the stack

0x0019FEFC  00404000  .@@.
0x0019FF00  00000000  ....
0x0019FF04  0019ff18  .ÿ..
0x0019FF08  0040142e  ..@.
0x0019FF0C  00000001  ....
0x0019FF10  005a43c0  ÀCZ.
0x0019FF14  005ac940  @ÉZ.


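Looking at the memory at address “404000h”, the value at the top of the stack, we see the string itself. Each row is displayed as a 4-byte little-endian value, so the bytes 48 65 79 20 (“Hey ”) show up as 20796548: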

0x00404000  20796548  Hey
0x00404004  41206557  We A
0x00404008  4d206572  re M
0x0040400C  6e696b61  akin
0x00404010  72502067  g Pr
0x00404014  6572676f  ogre
0x00404018  21217373  ss!!
0x0040401C  0000000a  ....
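The hex column in the dump above looks reversed because the debugger is showing 4-byte little-endian values. A small C sketch reproduces those values from the string; the helper name “dword_le” is my own, and the caller is assumed to pass an offset that leaves at least 4 bytes in the string.

```c
#include <stdint.h>

/* Assemble the 4 bytes starting at s[offset] into one value the way a
 * little-endian dword view shows them: first byte in the lowest 8 bits.
 * The caller must make sure offset + 3 is still inside the buffer. */
uint32_t dword_le(const char *s, unsigned offset)
{
    const unsigned char *p = (const unsigned char *)s + offset;

    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

With the string "Hey We Are Making Progress!!\n", “dword_le(s, 0)” gives 0x20796548 and “dword_le(s, 4)” gives 0x41206557, the first two rows of the dump.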

We will now “step over” the next instruction “call 00401080” which is calling the “printf()” function. Once we “step over” this, we see the following on our console (screen) “Hey We Are Making Progress!!”.

This brings us to the end of our first pass through the loop body. In this case “i = 0”.

Continuing the loop!

The next instruction is “add esp,4”. If we remember from the previous register output and memory dump, ESP was “0019FEFC” after the push. Adding “4” gives a new ESP value of “0019FF00”, which removes the “printf()” argument from the stack; under the cdecl calling convention this cleanup is the caller’s job. If we look back at the earlier memory dump, we see that this location holds the value “0” and corresponds to “[ebp-4]”, the location being used for the “int i” variable. Once this instruction is executed, the next instruction is “jmp 0040100D”.
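The ESP arithmetic around the call can be followed step by step in C. This is a sketch under one stated assumption: whatever stack space “printf()” uses internally is balanced by the time it returns, so only the push, the call, the return and the cleanup are modeled. The function names are mine.

```c
#include <stdint.h>

/* ESP immediately after "push 404000h". */
uint32_t esp_after_push(uint32_t esp)
{
    return esp - 4u;
}

/* ESP after the whole push / call / ret / add esp,4 sequence. */
uint32_t esp_after_call_sequence(uint32_t esp)
{
    esp -= 4u;   /* push 404000h  : argument placed on the stack        */
    esp -= 4u;   /* call 00401080 : return address 00401026 is pushed   */
    esp += 4u;   /* ret in printf : return address is popped            */
    esp += 4u;   /* add esp,4     : caller discards the argument        */
    return esp;
}
```

Starting from 0019FF00, the push takes ESP to 0019FEFC and the full sequence brings it back to 0019FF00, so the top of the stack is once again the slot holding “i”.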

The “jmp 0040100D” instruction sends execution back to memory location “0040100D” which holds the instruction “mov eax,dword ptr [ebp-4]”.

The instruction “mov eax,dword ptr [ebp-4]” moves the value at the “[ebp-4]” memory address, which from above we know is “0”, into the “eax” register. Let’s look at the registers before execution:

EAX = 0000001D EBX = 0023F000 ECX = 398E4B0C EDX = 0F290BA4 ESI = 00401440 EDI = 00401440 EIP = 0040100D ESP = 0019FF00 EBP = 0019FF04 EFL = 00000216

… and after execution:
EAX = 00000000 EBX = 0023F000 ECX = 398E4B0C EDX = 0F290BA4 ESI = 00401440 EDI = 00401440 EIP = 00401010 ESP = 0019FF00 EBP = 0019FF04 EFL = 00000216
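As an aside, the 0000001D that EAX held before the “mov” is consistent with the return value of “printf()”, which returns the number of characters written and, like other cdecl return values, comes back in EAX. A minimal check in C (the wrapper name “print_progress” is hypothetical):

```c
#include <stdio.h>

/* Prints the same string as the post and hands back printf's return
 * value, i.e. the number of characters written. */
int print_progress(void)
{
    return printf("Hey We Are Making Progress!!\n");
}
```

The string is 28 visible characters plus the newline, so a successful call returns 29, which is 1D in hex.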

Now that the value “0” at “[ebp-4]” has been moved to the “eax” register, the next instruction states “add eax,1”. This now means that the value in “eax” will be “1” after execution. This is because the current value in “eax” is 0, so 0+1 which is 1. Let’s execute this.

EAX = 00000001 EBX = 0023F000 ECX = 398E4B0C EDX = 0F290BA4 ESI = 00401440 EDI = 00401440 EIP = 00401013 ESP = 0019FF00 EBP = 0019FF04 EFL = 00000202

The next instruction is “mov dword ptr [ebp-4],eax”. This takes the “1” which is in “eax” and stores it in “int i”, which is found at “[ebp-4]”. After executing the instruction, we see the stack looks like:

0x0019FF00  00000001  .... – [EBP-4] now has a value of 1.
0x0019FF04  0019ff18  .ÿ.. - EBP
0x0019FF08  0040142e  ..@.
0x0019FF0C  00000001  ....
0x0019FF10  005a43c0  ÀCZ.
0x0019FF14  005ac940  @ÉZ.
0x0019FF18  0019ff70  pÿ..


The next instruction, “cmp dword ptr [ebp-4],0Ah”, is the same as above and compares the value 1 with 10. Taking a look again at the “EFLAGS” after execution we see:

OV = 0 UP = 0 EI = 1 PL = 1 ZR = 0 AC = 1 PE = 0 CY = 1

The next instruction “jge 0040102B” will not jump to memory address “0040102B” as 1 is not greater than or equal to 10. More specifically, the value in “[ebp-4]” is not greater than or equal to 0Ah (10 decimal). Therefore the next instruction in this sequence will be “push 404000h”. If we remember from above, this will push a pointer to our string "Hey We Are Making Progress!!\n" onto the stack and this string will then be used by “printf()” which is being called via “call 00401080”.

Once these “push 404000h” and “call 00401080” are executed we see the following on our screen …
Hey We Are Making Progress!!
Hey We Are Making Progress!!

At this point our loop count is 1. Now you may be wondering why there are two lines of output if “i” is only 1. This is because counting begins at “0”: the body has now run once for “i = 0” and once for “i = 1”, so two lines have been printed.
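The relationship between the counter and the number of lines printed can be stated in a couple of lines of C. This is only an illustration; “lines_printed_through” is a made-up helper that counts instead of printing.

```c
/* Number of lines printed once the body has finished running for
 * i == last_i, given that counting starts at 0. */
int lines_printed_through(int last_i)
{
    int lines = 0;
    int i;

    for (i = 0; i <= last_i; i++)
        lines++;            /* stands in for one printf call */
    return lines;
}
```

So “lines_printed_through(1)” is 2, and by the time the body has run for “i = 9” there are 10 lines on the screen.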

For brevity, I’m skipping the rest of the loop for 2,3,4,5,6,7,8,9 and will go straight to what happens after 9 is completed.

Here is the printout once the loop body has run for “i = 9”.

Hey We Are Making Progress!!
Hey We Are Making Progress!!
Hey We Are Making Progress!!
Hey We Are Making Progress!!
Hey We Are Making Progress!!
Hey We Are Making Progress!!
Hey We Are Making Progress!!
Hey We Are Making Progress!!
Hey We Are Making Progress!!
Hey We Are Making Progress!!

At this point we are once again at the instruction “mov eax,dword ptr [ebp-4]”. Let’s see what is at “[ebp-4]” now, first by looking at the registers to see where EBP is:

EAX = 0000001D EBX = 0023F000 ECX = 398E4B0C EDX = 0F290BA4 ESI = 00401440 EDI = 00401440 EIP = 0040100D ESP = 0019FF00 EBP = 0019FF04 EFL = 00000216

… and now our memory dump from the stack
0x0019FF00  00000009  .... – [ebp-4] – now has a value of 9
0x0019FF04  0019ff18  .ÿ.. - EBP
0x0019FF08  0040142e  ..@.
0x0019FF0C  00000001  ....
0x0019FF10  005a43c0  ÀCZ.
0x0019FF14  005ac940  @ÉZ.
0x0019FF18  0019ff70  pÿ..

From above we see the value of “9” is at “[ebp-4]”. When the instruction “mov eax,dword ptr [ebp-4]” is executed, 9 will be moved into the “eax” register. Let’s execute the instruction and see what we get:

EAX = 00000009 EBX = 0023F000 ECX = 398E4B0C EDX = 0F290BA4 ESI = 00401440 EDI = 00401440 EIP = 00401010 ESP = 0019FF00 EBP = 0019FF04 EFL = 00000216

Now that we have “9” in the “eax” register, the next instruction says to “add eax,1”. This means after execution, the value in “EAX” will equal “A”. Let’s execute the instruction and look at the registers again:

EAX = 0000000A EBX = 0023F000 ECX = 398E4B0C EDX = 0F290BA4 ESI = 00401440 EDI = 00401440 EIP = 00401013 ESP = 0019FF00 EBP = 0019FF04 EFL = 00000206

Good stuff, EAX = 0000000A.

Next instruction “mov dword ptr [ebp-4],eax” says to move the value in “eax” which is “10” to “[ebp-4]”. If we remember from the most recent stack printout above [ebp-4] has a value of 9. This instruction will now change that to 10 (A). Let’s execute the instruction and print out the first few bytes on the stack:

0x0019FF00  0000000a  .... [ebp-4] – now has a value of 10
0x0019FF04  0019ff18  .ÿ.. ebp
0x0019FF08  0040142e  ..@.
0x0019FF0C  00000001  ....
0x0019FF10  005a43c0  ÀCZ.
0x0019FF14  005ac940  @ÉZ.
0x0019FF18  0019ff70  pÿ..



Next instruction “cmp dword ptr [ebp-4],0Ah”. Once again we are comparing the value in “[ebp-4]” which is “a” or 10 to see if it is greater than or equal to “0Ah” (10). At this point the instruction is executed with the “EFLAGS” showing as follows:

OV = 0 UP = 0 EI = 1 PL = 0 ZR = 1 AC = 0 PE = 1 CY = 0

If we remember from above, “jge” requires that “SF = OF”. In this case that would be “PL = OV”. From above we can see that “PL = 0” and “OV = 0”. Since the two flags match, our next instruction “jge 0040102B” will jump to the instruction at memory address “0040102B”, which is “xor eax,eax”.


The instruction “xor eax,eax” zeroes out the eax register, which holds our return value. If you remember from above, the code does not have a “return” statement, so “main” returns 0 by default. Let’s look at the registers after this execution.
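Why “xor eax,eax” always yields zero is easy to confirm in C: XOR-ing any value with itself clears every bit, regardless of what the register held before. The function name below is hypothetical.

```c
#include <stdint.h>

/* Models "xor eax,eax": every bit is XOR-ed with itself, giving 0
 * whatever the previous contents were. */
uint32_t xor_self(uint32_t eax)
{
    return eax ^ eax;
}
```

Here EAX held 0000000A from the last increment, and “xor_self(0x0000000A)” is 0. Compilers tend to prefer this form over “mov eax,0” because the encoding is shorter.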

EAX = 00000000 EBX = 0023F000 ECX = 398E4B0C EDX = 0F290BA4 ESI = 00401440 EDI = 00401440 EIP = 0040102D ESP = 0019FF00 EBP = 0019FF04 EFL = 00000246

The next set of instructions, “mov esp,ebp” and “pop ebp”, is the epilogue, which you can learn more about in this post.

That’s it for understanding “for” loops via disassembly.
