.oO Phrack 49 Oo.

Volume Seven, Issue Forty-Nine

10 of 16

A Steganography Implementation Improvement Proposal

by: cjm1@concentric.net
[ For those of you who do not know, steganography is a cryptographic
technique that simply hides messages inside of messages. The sender composes
an innocuous message and then, using one of many tactics, injects the secret
message into it. Some techniques involve: invisible inks, character
distortion, handwriting differences, word/letter frequency doping, bit
flipping, etc... The method the author discusses hinges upon a well-known
steganographic implementation, low-order bit flipping in graphic images. -d9 ]
Steganography is a technique for hiding data in other data. The general
method is to flip low-order bits so that reading the low-order bit of each of
8 bytes yields one character. This allows one to use a picture or a sound
file to hide data, resulting in a small amount of hopefully unnoticeable noise
in the carrier and a safely hidden cache of data that can later be extracted.
This paper details a method for making steganographically hidden data safer
by using pseudo-random dispersion.
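
A minimal Python sketch of the general method described above, assuming the
cover is a flat run of raw sample bytes (a real GIF would first have to be
decoded to its pixel data, which is not shown). The names embed_lsb and
extract_lsb, the most-significant-bit-first ordering, and passing the message
length to the extractor are all assumptions made for illustration.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytearray:
    """Hide message in the low-order bits of cover, one bit per cover byte."""
    if len(message) * 8 > len(cover):
        raise ValueError("cover too small: 8 cover bytes are needed per character")
    stego = bytearray(cover)
    for i, char in enumerate(message):
        for bit in range(8):
            value = (char >> (7 - bit)) & 1          # most significant bit first
            pos = i * 8 + bit
            stego[pos] = (stego[pos] & 0xFE) | value  # overwrite the low-order bit
    return stego


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Read length characters back out of the low-order bits."""
    out = bytearray()
    for i in range(length):
        char = 0
        for bit in range(8):
            char = (char << 1) | (stego[i * 8 + bit] & 1)
        out.append(char)
    return bytes(out)
```

Each cover byte changes by at most 1, which is the "small amount of noise"
the text refers to.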
Ordinarily, if someone suspects that you have data hidden in, say, a GIF
file, they can simply run the appropriate extractor and find the data. If
the data is not encrypted, it will be plain for anyone to see. This can be
ameliorated by using a simple password protection scheme: hide the password
in the GIF as a header, encrypting it first with itself. If someone does not
know the password, they cannot extract the data. This is of course reasonably
safe, depending on the encryption scheme used, and I recommend it. But the
hidden data can be made even safer.
Pseudo-random dispersion works by hiding a password and a seed for a
random-number generator in the encrypted header. Then, a random number of
bytes are passed by before each low-order bit is flipped.
To do this, one must first calculate how many bytes each bit can take up
for itself. For instance, hiding an 800 character message in a GIF the
ordinary way means each character needs 8 bytes (8 bits per character, 1 byte
per low-order bit), so you need 6,400 bytes of data to hide the message in.
Let's say we have a GIF that is 10 times this size: 64,000 bytes. Thus we
have 80 bytes per character to hide data in. Since each bit takes a byte, we
have 10 bytes per bit to hide data in! Therefore, if we take a pseudo-random
number between 1 and 10, and use that byte to hide our low-order bit in, we
have achieved a message dispersed through the GIF in a pseudo-random fashion,
much harder to extract. A message in which every byte carries a bit of the
steganographically hidden message can be extracted with ease relative to a
message in which there are 10 possible bytes for each bit of each character.
The latter is exponentially harder to extract, given no esoteric knowledge.
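
The arithmetic and the dispersion step above can be sketched in Python as
follows. This is one reading of the scheme, not a definitive implementation:
it assumes the "pass by a random number of bytes" interpretation, a flat cover
of raw bytes, and Python's random.Random as a stand-in generator (it is not
cryptographically strong). The names bytes_per_bit, disperse_fixed and
gather_fixed are hypothetical, and the extractor is assumed to be told the
message length.

```python
import random


def bytes_per_bit(cover_len: int, message_len: int) -> int:
    """Cover bytes available to each hidden bit (10 in the example above)."""
    return cover_len // (message_len * 8)


def disperse_fixed(cover: bytes, message: bytes, seed: int) -> bytearray:
    """Hide each bit after passing by 1..stride cover bytes, stride held fixed."""
    stride = bytes_per_bit(len(cover), len(message))
    if stride < 1:
        raise ValueError("cover too small for this message")
    rng = random.Random(seed)
    stego = bytearray(cover)
    pos = -1                                   # index of the last byte used
    for char in message:
        for shift in range(7, -1, -1):         # most significant bit first
            pos += rng.randint(1, stride)      # skip ahead 1..stride bytes
            stego[pos] = (stego[pos] & 0xFE) | ((char >> shift) & 1)
    return stego


def gather_fixed(stego: bytes, length: int, seed: int) -> bytes:
    """Replay the same pseudo-random walk to read the bits back."""
    stride = bytes_per_bit(len(stego), length)
    rng = random.Random(seed)
    out = bytearray()
    pos = -1
    for _ in range(length):
        char = 0
        for _ in range(8):
            pos += rng.randint(1, stride)
            char = (char << 1) | (stego[pos] & 1)
        out.append(char)
    return bytes(out)
```

Because every skip is at most stride bytes, the walk can never run past the
end of the cover. A wrong seed still yields output of the right length, just
not the message.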
A slight improvement can be made to this algorithm. By re-calculating the
number of available bytes left for each bit after each bit is hidden, the
data is dispersed more evenly throughout the file, instead of being bunched up
at the start, which would be the normal occurrence. If you use a pseudo-random
number generator to pick numbers from 1 to 10, the values will average out to
between 5 and 6 over time, so the hidden message ends up clustered in little
more than the first half of the GIF. By re-calculating the number of
available bytes left each time, we spread the data out through the whole
file, with the added bonus that later bits tend to be spread further apart
than earlier ones, resulting in possible search spaces of 20, 30, 100, or
even 1,000 possible bytes per bit. This too serves to make the data much
harder to extract.
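
A sketch of the re-calculation in Python, under the same assumptions as
before: a flat cover of raw bytes, random.Random standing in for the
generator, and the message length known to the extractor. The function names
are hypothetical. After every bit, the window is recomputed as the cover bytes
left divided by the bits left, rounded down.

```python
import random


def dispersed_positions(cover_len: int, n_bits: int, seed: int) -> list:
    """Byte offsets for each hidden bit, re-calculating the window every time."""
    rng = random.Random(seed)
    positions = []
    pos = -1                                    # index of the last byte used
    for i in range(n_bits):
        bytes_left = cover_len - (pos + 1)
        bits_left = n_bits - i
        window = bytes_left // bits_left        # bytes this bit may choose from
        if window < 1:
            raise ValueError("cover too small for this message")
        pos += rng.randint(1, window)
        positions.append(pos)
    return positions


def embed_dispersed(cover: bytes, message: bytes, seed: int) -> bytearray:
    bits = [(char >> s) & 1 for char in message for s in range(7, -1, -1)]
    stego = bytearray(cover)
    for pos, bit in zip(dispersed_positions(len(cover), len(bits), seed), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return stego


def extract_dispersed(stego: bytes, length: int, seed: int) -> bytes:
    positions = dispersed_positions(len(stego), length * 8, seed)
    out = bytearray()
    for i in range(length):
        char = 0
        for pos in positions[i * 8:(i + 1) * 8]:
            char = (char << 1) | (stego[pos] & 1)
        out.append(char)
    return bytes(out)
```

Since a skip never exceeds bytes_left // bits_left, at least one cover byte
always remains for every bit still to be hidden, so the size check can only
fail on the very first bit.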
I recommend a header large enough for an 8 character ASCII password, an
integral random-number seed, an integral version number, and a placeholder
left for future uses. The version number allows us to tweak the algorithm
and still be compatible with past versions of the program. The header should
be encrypted and undispersed (i.e., 1 byte per bit of data), since we haven't
seeded the random-number generator yet for dispersion purposes.
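
One possible layout for such a header, sketched with Python's struct module.
Only the 8 character password field comes from the text; the field widths
(4 byte seed, 2 byte version, 2 byte reserved field), the little-endian byte
order, and the names below are assumptions. Encryption is deliberately left
out, since the choice of cipher is left open above.

```python
import struct

# Assumed layout: 8-byte password, 4-byte seed, 2-byte version, 2-byte reserved.
HEADER_FORMAT = "<8sIHH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)    # 16 bytes, so 128 cover bytes


def pack_header(password: str, seed: int, version: int = 1) -> bytes:
    """Build the plaintext header; it would be encrypted before hiding."""
    raw = password.encode("ascii")
    if len(raw) > 8:
        raise ValueError("password is limited to 8 ASCII characters")
    # struct pads the password field with NUL bytes up to 8 bytes.
    return struct.pack(HEADER_FORMAT, raw, seed, version, 0)


def unpack_header(blob: bytes):
    """Split a decrypted header back into (password, seed, version)."""
    raw, seed, version, _reserved = struct.unpack(HEADER_FORMAT, blob[:HEADER_SIZE])
    return raw.rstrip(b"\0").decode("ascii", errors="replace"), seed, version
```

Hidden undispersed at 1 byte per bit, a 16 byte header of this shape would
occupy the first 128 bytes of the cover.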
It is useful to make the extractor in such a way that it always extracts
something, regardless of whether the password is correct. Doing this means
that it is impossible to tell if you have guessed a correct password and
gotten encrypted data out, or merely gotten out garbage that looks like
encrypted data. Use of a password can also be made optional, so that none is
necessary for extraction. A simple default password can be used in these
cases. When hiding encrypted data, there is no difference to the naked eye
between what is extracted and what is garbage, so no password is strictly
necessary. This means no password has to be remembered or transmitted to
other parties. A third party cannot tell if a real password has been used or
not. It is important for safety purposes not to hide the default password in
the header if no password is used. Otherwise, a simple match can be made by
anyone who knows the default password.
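
A rough Python sketch of an extractor front end that never rejects a
password. Everything here is an assumption made for illustration: the XOR
"cipher" keyed from SHA-256 is a toy placeholder and not a secure scheme, the
12 byte header holds only a password field and a seed, and DEFAULT_PASSWORD,
seal_header and open_header are hypothetical names. When no password is
given, the password field is filled with random bytes so that the default
password never appears in the header.

```python
import hashlib
import os
import struct

DEFAULT_PASSWORD = "default"    # hypothetical built-in password
HEADER_FORMAT = "<8sI"          # assumed layout: password field, then seed


def _xor(data: bytes, password: str) -> bytes:
    """Toy placeholder cipher: XOR with a SHA-256 digest of the password."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    key = (digest * (len(data) // len(digest) + 1))[:len(data)]
    return bytes(a ^ b for a, b in zip(data, key))


def seal_header(seed: int, password=None) -> bytes:
    if password is None:
        # No real password: do not store the default one, store filler instead.
        key, field = DEFAULT_PASSWORD, os.urandom(8)
    else:
        key, field = password, password.encode("ascii")[:8]
    return _xor(struct.pack(HEADER_FORMAT, field, seed), key)


def open_header(sealed: bytes, password=None) -> int:
    """Return a seed for whatever password is supplied; never report failure."""
    key = DEFAULT_PASSWORD if password is None else password
    _field, seed = struct.unpack(HEADER_FORMAT, _xor(sealed, key))
    return seed    # garbage if the password was wrong, but still usable
```

The seed that comes back would then drive the dispersed extraction. With a
wrong password the extractor simply walks the file with a garbage seed and
returns garbage, with no error to give the guess away.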