> The real hackers will know that as soon as I found evidence of sqlite3_key_v2 in the Zyzzyva dylib file that getting the key was inevitable. I don’t actually know the steps for removing debug symbols from compiled code off the top of my head, but I bet if this had been done, this would have made my job much, much harder.
I'm not entirely sure about OS X, but at least on Linux, system-assisted dynamic linking (i.e. not mmap(PROT_EXEC)) requires that all required symbols are exposed so that relocation can be done in the original executable; in other words, the OS needs to know where the functions in the library are so that it can tell the program how to call them.
Of course, you could obfuscate the function names, but then tracebacks wouldn't work properly and at that point you'd be better off just statically linking the whole program.
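To illustrate why exported names matter, here is a minimal Python sketch using ctypes, which asks the dynamic loader to resolve a function purely by its symbol name. It assumes a POSIX system (Linux or OS X); `has_symbol` and the nonexistent symbol name are hypothetical helpers made up for this example, and libc's `strlen` just stands in for any exported library function.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. If find_library returns None, CDLL(None)
# opens the running process itself, which already has libc mapped in.
libc = ctypes.CDLL(ctypes.util.find_library("c"))


def has_symbol(library, name):
    """Return True if the dynamic loader can resolve `name` in `library`."""
    try:
        getattr(library, name)  # ctypes looks the name up in the loaded library
    except AttributeError:
        return False
    return True


# The lookup works only because libc exports the name "strlen".
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

print(has_symbol(libc, "strlen"))
print(has_symbol(libc, "no_such_function_hypothetical"))
print(strlen(b"zyzzyva"))
```

A library that renamed or hid its entry points would fail the same lookup, which is roughly the trade-off described above.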
Debug symbols are completely different; if you have those, you can simply run "frame variable" in LLDB, which shows the arguments with their names.
> Yesss. Time to get out the x86 assembly hats.
You don't even really need to do that. Since you know the function signature, you can assume (since it is in a separate library) that the function uses the standard System V AMD64 ABI where "the first six integer or pointer arguments are passed in registers RDI, RSI, RDX, RCX, R8, and R9" [0], meaning that the pKey pointer is probably in RDX. I know that the author said that it was in RAX, but since that is caller-saved, there must have been some copying or processing done to it inside the function.
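As a rough sketch of that reasoning, the register assignment can be worked out mechanically from the parameter list. The parameter names below come from the sqlite3_key_v2 declaration in SQLite's header (only pKey is mentioned above), and `assign_registers` is a hypothetical helper that handles only integer and pointer arguments, not floats or structs.

```python
# System V AMD64: the first six integer/pointer arguments, in order.
SYSV_INT_ARG_REGS = ["RDI", "RSI", "RDX", "RCX", "R8", "R9"]


def assign_registers(params):
    """Map integer/pointer parameter names to their argument registers.

    Anything past the sixth such argument is passed on the stack.
    """
    assignment = {}
    for index, name in enumerate(params):
        if index < len(SYSV_INT_ARG_REGS):
            assignment[name] = SYSV_INT_ARG_REGS[index]
        else:
            assignment[name] = "stack"
    return assignment


# int sqlite3_key_v2(sqlite3 *db, const char *zDbName,
#                    const void *pKey, int nKey);
sqlite3_key_v2_params = ["db", "zDbName", "pKey", "nKey"]
registers = assign_registers(sqlite3_key_v2_params)

for name, reg in registers.items():
    print(f"{name}: {reg}")
```

So a breakpoint at function entry should find the key pointer in RDX and its length in the low 32 bits of RCX, assuming the library really does follow the standard ABI.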
"Dictionaries enjoy copyright protection for two main reasons: Their creators make judgments about what words to include, and entries feature definitions and other original material. (Just last week, a federal court in Massachusetts ruled against[2] a plaintiff who wanted to copy and repurpose the bulk of Merriam-Webster’s Collegiate, including definitions, for his own dictionary.) But in 1991, in Feist Publications Inc. v. Rural Telephone Service Co.[3], the Supreme Court decided that a phone company wasn’t entitled to a copyright on its white pages. That’s because the list of names and numbers lacked an important requirement: originality."
Isn't this a violation of the DMCA's anti-circumvention section? This seems to be explicitly describing how to circumvent protection measures for a copyrighted work.
This assumes it's validly copyrighted.
I wonder if the wordlist is even registered with the copyright office (I can't imagine it is, they are pretty good about not accepting stuff like this).
Additionally, to the degree that hasbro/whoever the heck claims a copyright on the work of other people, they are themselves violating various parts of the DMCA dealing with rights management info, etc.
Hasbro/whoever should know that it is not possible to effect a transfer of copyright without an explicit signed agreement. Thus, if all these people contributed, and then they slapped a copyright on it, they own exactly nothing.
(There is such a thing as a compilation copyright, but it is a very minimalistic copyright, and assumes they actually did anything creative or original to the compiled list)
If someone was to press this point against the scrabble players, they would
A. likely lose as the list will be considered non-copyrightable subject matter
B. If the list was somehow found copyrightable, and this story is accurate, they would be opening themselves up to copyright infringement lawsuits from the scrabble players who contributed to the wordlist.
The fact that the DMCA could criminalize the act of inspecting the contents of an executable file acquired legally and running on your personal computer and then telling other people about it is pretty good evidence that the DMCA is an immoral law that should be violated as much as possible. Kudos to the article's author.
It is sometimes surprising how accurate some of Stallman's dystopian visions were and it is frightening because some of them have not become true. Yet.
The article probably does violate that section of the DMCA. But it is also a research/scientific piece subject to the protection of the First Amendment. I could be mistaken, but I suspect if one or the other had to go -- the First Amendment or this law -- the First Amendment would win.
Said another way, even as a very pro-copyright judge, I would have a hard time saying the author did not have a First Amendment right to publish his research. Now if he wrote a program to make it easy to crack these databases and sold it for $5 each, that would be a different matter.
That would presume that the dictionary in question was a copyrightable work. The US has weak database copyright protection due to Feist. There is also Assessment Technologies to consider, but I don't believe that involved the DMCA.
It seems that Linux version has a version of the database as plain text (including their meanings); wc -l tells me that OSPD4.1.txt has 178378 entries, which seems about right. edit: seems like that's only true for v4 and below, while the article could be about v5 (it doesn't say)
How can you copyright a wordlist anyways? I'm really not familiar with Scrabble, but doesn't it just contain plain English words without any context? They could of course copyright the order, but couldn't you just shuffle the list and publish it?
An explanation of how copyright works in this case would be great.
Some countries have a database right besides normal copyright. Copyright seems like a stretch, but I can understand why hobbyists don't want to take the risk of getting sued even if the law should be on their side.
Thank you very much for the link, but this sounds really stupid. Couldn't I just create a database with all possible additions ranging from 1,000,000 to 10,000,000 and sue everyone who publishes a book/paper etc. because of the "investment that is made in compiling a database"? And it wouldn't matter if they calculated the result themselves, because they could've just used my database.
This is really stupid. I mean I get copyright but I think copyright should only apply to "the 'creative' aspect[s]" if the author wants that.
We have had a similar discussion on Hacker News about a company that claims to be mechanically generating all possible arrangements of words of a certain length and copyrighting those, as well as all possible images of a certain size, and all possible musical melodies of a certain length, and so on.
The reason this is not considered copyrightable is that there must be some evidence of creative effort. Owning an infinite number of monkeys and typewriters does not entitle you to copyright everything they generate.
No, you couldn't, because (at least for most of the implementations listed in the article) it DOES matter if they actually used your database or not. Also note that database rights are not necessarily copyrights.
Excellent work .. and of course, a salient reminder of why we all, individually, should copyright our own works, even if it is something done for free and/or for volunteer basis with no commercial interest. A right not exercised is one lost.
I think it's preposterous that someone is able to trademark a word list. I bet it's not even complete.
All works that can be copyrighted are automatically copyrighted from the moment of creation, unless you live in one of the dozen countries which have ratified neither the Berne Convention nor TRIPS. The US signed Berne in '89, by the way.
This is actually a better argument for copyleft licensing for any community effort like this. It would not have been possible for any company to lock down the efforts of this community if all contributions to the wordlist in question had been under a license like Creative Commons or the GPL.
Sick system. In other developed countries you have copyright if you do it. You don't need to register that right anywhere. Of course it's more difficult to prove, but that's another story...
When I found out you had to register your stuff for copyright in certain countries, I was actually surprised ... I just thought having copyright for your work by default was normal.
If I were inclined to twist the copyright tiger's tail, the way I would do it would be to encrypt the plaintext with a one-time pad and then publish the ciphertext and the pad anonymously in two different locations (preferably on two different domains). The key and the ciphertext in a one-time pad are mathematically indistinguishable, so both publishing parties have plausible deniability that what they published was the key, i.e. just a string of random bits, which of course they have every right to do.
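The symmetry being relied on here is easy to see in a small Python sketch. The sample plaintext and the `xor_bytes` helper are made up for illustration; the pad is drawn from the standard library's `secrets` module.

```python
import secrets


def xor_bytes(a, b):
    """XOR two equal-length byte strings together."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical stand-in for the word list.
plaintext = b"AA\nAAH\nAAHED\n"

# A one-time pad is random bytes exactly as long as the message.
pad = secrets.token_bytes(len(plaintext))
ciphertext = xor_bytes(plaintext, pad)

# Decryption is the same XOR, and the two inputs are interchangeable:
# nothing in the math marks one of them as "the key".
recovered_a = xor_bytes(ciphertext, pad)
recovered_b = xor_bytes(pad, ciphertext)
print(recovered_a == plaintext, recovered_b == plaintext)
```

Since XOR is commutative, either published blob can play the role of the pad, which is the basis for the deniability argument.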
An even more interesting experiment would be to copyright the resulting key and the ciphertext, and put in the TOS for getting either one that you will not sue the publisher for any copyright violations.
> Treating Colour as a function is almost the same as attaching tags to the bits - the difference is that when the Colour is a function of the bits, we don't have to worry about the tags being detached; on the other hand, when the Colour is a function of the bits, we can never have more than one possible Colour for a given sequence of bits. Monolith depends on exploiting this problem: it assumes that one file can only ever have one Colour, asserts that the Colour of its output file is the "you may copy this" Colour because of the (correct) claim that fixing any other single unchangeable Colour would raise legal problems, and then follows the logic to a claim that it can produce what would otherwise be an illegal copy of the copyrighted input, without breaking copyright law. One Colour per file was never one of the lawyers' rules of Colour; it's merely a consequence of "Colour is a function", and Colour being a function is just something we computer people decided to believe because functions make sense to our training and Colour doesn't. Colour is not actually a function at all.
so like.......did they put this decrypted database online, or something? true, we could just perform the same operations they did, but if you're going to go through the trouble of putting your crack in public, might as well spread its fruits too
I'm pretty sure Cesar wants to avoid such a direct copyright violation. Sure, just breaking the encryption might be considered a violation of the DMCA anti-DRM stuff, but that is a much more controversial law that many people oppose. A lawsuit over the anti-DRM provisions would likely pull the EFF in and make big news, while a lawsuit over plainly spreading copyrighted information would be much more straightforward and likely for him to lose.
By only publishing the steps, he gets the benefit of the publicity of breaking the encryption. Then anonymous people can easily break it themselves and spread the actual list, free from worry of being sued.
[0] https://en.wikipedia.org/wiki/X86_calling_conventions#System...