Hi, I'm the author of ccrypt. Sorry for responding so late to this thread, but I was only made aware of it recently. As always, questions about ccrypt can also be directed to me (and are likely to receive a faster response) in the ccrypt forums on SourceForge.

To address the original question: let me state unequivocally that ccrypt, along with every other password-based encryption program, is subject to brute-force attacks, including dictionary attacks. As Thomas Pornin aptly put it, "Password-based encryption is inherently vulnerable to dictionary attacks". Nothing prevents an attacker from trying every possible password. This is not a flaw, but a basic fact of life, and one that every user of cryptography must understand. If you use a weak password such as "poodle", an attacker will be able to guess your password and decrypt your data. If you use a strong password such as "hVztmdz28fNemDZnxj5YLjXz", your data will remain secure for the foreseeable future, to the best of current knowledge.

I would like to clear up a few other points raised by the original poster and in the answers and comments.

In the original post, there seems to be a misunderstanding about what ccrypt is a replacement for. This is probably due to the fact that I haven't updated that part of the documentation in a long time. When I wrote (in the year 2001) that ccrypt is a replacement for the old Unix crypt program, I was referring to the crypt(1) program, not the crypt(3) library function. Crypt(1) was a password-based file encryption program that has existed in Unix at least since 1979 (it was in Version 7: http://man.cat-v.org/unix_7th/1/crypt), and was well known throughout the 1980s and 1990s. It used a notoriously weak algorithm based on the Enigma cipher, which had already been broken in World War II. Crypt(3), on the other hand, is a library function that hashes passwords to the format used in /etc/passwd. As far as I know, there is nothing specifically wrong with crypt(3), and it is still in use today. Ccrypt(1) is a replacement for crypt(1), not for crypt(3). It does not make sense to ask whether ccrypt is "better" than crypt(3), since they do completely different things. I should probably update that part of the documentation, given that thankfully nobody remembers the crypt(1) program today.

The original post also asks about the fact that ccrypt will inform the user when they have typed a wrong password, and whether this makes dictionary attacks faster. The answer is that it does not, except perhaps by a very small constant factor. Thomas already explained this well in his answer, but just to emphasize the point, consider this: suppose that the encrypted file is a JPEG file. Every JPEG file starts with the same 3 bytes, namely ff d8 ff. So there is already a very simple test to check the likely correctness of your key: just decrypt the first block and check whether the file starts with ff d8 ff. Most other file formats also contain this type of known plaintext; for example, every PDF file starts with "%PDF". It is also easy to check whether a file is a plain text file. So the fact that ccrypt provides the password-matching feature does not make attacks any easier than they already are. As a matter of fact, ccrypt's check for a "matching" key is not even 100% accurate: there is a 1 in 4 billion chance that a random key will pass the test even though it is not the correct key.
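To make the JPEG example concrete, here is a minimal sketch of the kind of test an attacker can always build for themselves. It is not code from ccrypt; `derive_key` and `decrypt_first_block` are hypothetical placeholders for whatever key derivation and block decryption the attacker implements from the published specifications.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical placeholders: the cipher and file format are public,
   so an attacker can implement these without any help from ccrypt. */
void derive_key(const char *password, uint8_t key[32]);
void decrypt_first_block(const uint8_t *ciphertext, const uint8_t key[32],
                         uint8_t plaintext[32]);

/* Return 1 if the candidate password is (probably) correct for an
   encrypted JPEG file, 0 otherwise.  No "wrong password" message from
   the encryption tool is needed; the known plaintext is enough. */
int looks_like_jpeg(const uint8_t *ciphertext, const char *candidate)
{
    static const uint8_t jpeg_magic[3] = { 0xff, 0xd8, 0xff };
    uint8_t key[32], block[32];

    derive_key(candidate, key);
    decrypt_first_block(ciphertext, key, block);
    return memcmp(block, jpeg_magic, sizeof jpeg_magic) == 0;
}
```

The point is that such a filter exists for essentially every file format, whether or not the encryption program reports wrong passwords.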
The feature exists to prevent people from accidentally decrypting a file with the wrong key (and thereby overwriting all the data in it with garbage).

There are some troubling points raised by Thomas in his answer, and I would like to address them.

> "The second method is to convert the password into an encryption key using a good, properly configured password hashing function; the "proper configuration" there means that a salt will be used to prevent parallel attacks (when there are several encrypted files with distinct passwords, and the attacker would like to break one or several of them), and that the password hashing function is made deliberately expensive through many iterations to make dictionary attacks harder."

No, this is a red herring. It is technically true that the methods Thomas recommends make dictionary attacks "harder". But the problem is that they only make dictionary attacks harder by a constant factor, and this is not good enough. For example, if I make the hashing function a million times harder to compute, as Thomas suggests, this makes the attack a million times more difficult. This is equivalent to adding about 20 bits of entropy to the password (i.e., increasing the length of the password by about 4 random letters or numbers). Now consider this: a typical hacker's botnet contains between 100,000 and 10,000,000 CPUs. We can assume that the NSA probably has hundreds of millions of CPUs at their disposal. So making the hashing function a million times slower is basically useless against a powerful attacker. The correct response is not to do the equivalent of adding 4 more characters to your password. A much more effective response, providing a real increase in security, is to *double* the length of the password. Users should be aware of this.
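To put some rough numbers on this constant-factor argument, here is a small back-of-the-envelope calculation. It is my own illustration, not part of ccrypt; a slowdown factor f is worth log2(f) bits of entropy, while each extra random character from a 62-character alphabet is worth log2(62) bits.

```c
#include <math.h>
#include <stdio.h>

int main(void)
{
    double slowdown = 1e6;              /* hash made 10^6 times slower      */
    double bits_per_char = log2(62.0);  /* random letters and digits        */

    /* A 10^6-fold slowdown is worth log2(10^6) ~ 19.9 bits, i.e. roughly
       the same as a handful of extra random characters. */
    printf("slowdown of 10^6     = %.1f bits\n", log2(slowdown));
    printf("one random character = %.2f bits\n", bits_per_char);

    /* Doubling a 12-character random password adds 12 * 5.95 ~ 71 bits,
       far more than any realistic constant-factor slowdown. */
    printf("12 extra characters  = %.1f bits\n", 12.0 * bits_per_char);
    return 0;
}
```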
A similar argument applies to salting as well; again, this only makes the attack harder by at most a constant factor. Moreover, ccrypt already uses a form of salting (but for a different reason). The purpose of salting is to prevent "rainbow table" attacks. Given n passwords and m hashes, the purpose of salting is to ensure that the cost of checking whether any password matches any hash is proportional to nm and not to n+m. But in ccrypt this is already the case. Since ccrypt uses a random initialization vector in each encryption, identical files encrypted with identical keys will never be more similar to each other than random files. This implies, among other things, that the cost of trying n keys on m encrypted files is proportional to nm and not n+m - exactly what salting is designed to achieve. This is true no matter whether the n keys are prepared in the form of user-level passwords or hashed passwords.

So why not add a bit of salting and a tougher hash function anyway? It can't hurt, can it? As a matter of fact, it can. There is a very important design principle in cryptography that ccrypt adheres to: do not add any feature that gives the impression of increased security without actually increasing security. I feel strongly that adding such a feature is not just neutral, but actually detrimental. Suppose I added a bell and a whistle to ccrypt to make the dictionary attack slightly harder. Suppose the man page stated "this program contains a bell and a whistle to make a dictionary attack slightly harder". We can be sure that a significant percentage of users would take this as permission to use weak passwords, which can of course be easily broken by the (only slightly harder) dictionary attack. I feel strongly that a program should not contain such features, as they encourage and seemingly condone dumb behavior. Instead, the user manual should clearly state (as it does) that the encryption is subject to exhaustive search of the key space, and is only secure if long and secure keys are used. ("Longer keywords provide better security than short ones, since they are less likely to be discovered by exhaustive search." ... "an exhaustive search of the key space is not feasible, at least as long as sufficiently long keys are actually used in practice.")

Finally, Thomas has some other concerns about the hash function used by ccrypt. Of course, I welcome the technical analysis - the more scrutiny, the better. But unfortunately, his analysis of the remaining points is flawed, and I'd like to respond to them. Let me start by agreeing with the general point that using a custom hash function is not usually a good idea. The reason that ccrypt uses such a function is not that I like home-made things. Rather, it is that ccrypt has existed since before AES was standardized, and certainly long before any hash functions were standardized for use with AES. The first public release of ccrypt was made in October 2001, about a month before AES was announced. Ccrypt has remained backward compatible since its initial release, which I think is an important part of its usability. However, despite the fact that ccrypt uses a custom hash function, I will argue that to this day, there are no specific known flaws in ccrypt's design, and in particular, that the supposed flaws that Thomas pointed out are not valid.

> XOR every byte of K with c

This statement is not correct. Perhaps this is a nitpick, but it seems to be the basis of much of the argument that follows. The "++" operator in C is easy to overlook, but it does mean that K is XORed with the next 32 password characters, not with 32 copies of the same character. This means that each of the individual encryptions taking place during the hashing uses one of 2^256 possible keys (not 256 possible keys).

> Moreover, the author talks about the function needing to be "collision-free", which is completely irrelevant for password-based key derivation

This is plain nonsense. If a hash function contains predictable collisions, it always decreases the cost of brute-force attacks, because there are fewer keys that need to be tested. For a simple illustration, consider a hash function that treats upper- and lowercase characters as equivalent. There would be obvious collisions, as "password" and "PaSSwOrD" would hash to the same internal key. This would speed up any dictionary attack not just by a constant factor, but exponentially, since only lowercase passwords would need to be checked. Another example, which I have actually encountered in real life, is a hash function that XORs the password written left-to-right with the password written right-to-left. This creates a huge number of passwords that all hash to the same value, for example "ABBA" and "0000". So being free of exploitable collisions is definitely a must for any password hashing function. Since nobody knows which collisions might be "exploitable", a password hashing function should ideally be collision-free in the cryptographic sense.
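To make that last example concrete, here is a small stand-alone demonstration of the reversal-XOR construction and its collisions. This is illustrative code of my own, not taken from ccrypt or from the product I encountered; both "ABBA" and "0000" hash to the all-zero string.

```c
#include <stdio.h>
#include <string.h>

/* A deliberately bad "hash": XOR the password written left-to-right
   with the password written right-to-left.  Every palindrome (such as
   "ABBA") and every repeated-character password (such as "0000")
   hashes to all zero bytes, so they all collide. */
static void bad_hash(const char *pw, unsigned char *out)
{
    size_t n = strlen(pw);   /* caller must provide a buffer of n bytes */
    for (size_t i = 0; i < n; i++)
        out[i] = (unsigned char)pw[i] ^ (unsigned char)pw[n - 1 - i];
}

int main(void)
{
    unsigned char h1[16] = {0}, h2[16] = {0};

    bad_hash("ABBA", h1);
    bad_hash("0000", h2);

    /* Both print 00 00 00 00: a huge class of passwords is equivalent. */
    printf("ABBA -> %02x %02x %02x %02x\n", h1[0], h1[1], h1[2], h1[3]);
    printf("0000 -> %02x %02x %02x %02x\n", h2[0], h2[1], h2[2], h2[3]);
    return 0;
}
```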
On the other hand, as Thomas correctly states, preimage resistance is not a necessity for password hashing (and consequently, the ccrypt password hashing function was not designed to be preimage resistant). Given a Rijndael-256 key, it would be acceptable for it to be possible to find a corresponding password that hashes to that key. In fact, if the user's password is at most 32 bytes long, it would be perfectly acceptable to use the zero-padded password itself as the Rijndael-256 key, without any hashing. One way to look at it is that the hashing is simply a convenience that allows users to specify keys of arbitrary length, while still benefiting from the resulting increase in entropy (up to the maximum of 256 bits).

> The FAQ says "AES" but it is not AES.

This was a typo, thanks for pointing it out. I must admit it took me a while to find the relevant part of the FAQ, since the FAQ contains many places where I was very careful to explain the difference between Rijndael and AES. But you found the one place where I carelessly wrote "based on AES" instead of "based on Rijndael". It has now been fixed.

**Summary:** Like all password-based encryption software, ccrypt is subject to dictionary attacks. Users must be aware of this and choose strong passwords. However, it would be counterproductive to make insecure passwords only slightly less insecure by increasing the cost of these attacks by a constant factor, as has been suggested. Instead, the only correct response is to use secure passwords in the first place. Much of the remaining analysis presented in Thomas Pornin's answer is flawed, and no actual weaknesses in ccrypt have been uncovered.

-- Peter