Exercise 1
原文:http://blog.csdn.net/qq_34861102/article/details/78307672
平台:x86_64 mac
编译器:clang
实验任务:
You have just intercepted an encoded message. The message is a sequence of bits which reads as follows in hexadecimal:
6363636363636363724646636F6D6F72 466D203A65693A7243646E206F54540A 5920453A54756F0A6F6F470A21643A6F 594E2020206F776F797275744563200A 6F786F686E6963736C206765796C656B 2C3365737420346E20216F74726F5966 7565636F202061206C61676374206C6F 20206F74747865656561727632727463 6E617920680A64746F69766120646E69 21687467630020656C6C786178742078 6578206F727478787863617800783174
You have no idea how to decode it, but you know that your grade depends on it, so you are willing to do anything to extract the message. Fortunately, one of your many agents on the field has stolen the source code for the decoder. This agent (007) has put the code and the message in the file secret.cpp, which you can download from the laboratory of your technical staff (Q).
Q has noticed that the decoder takes four integers as arguments. Executing the decoder with various arguments seems to either crash the program or produce unintelligible output. It seems that the correct four integers have to be chen in order for the program to produce the decoded message. These four integers are the "secret keys."
007 has been unable to find the keys, but from the desk of the encrypting personnel he was able to cunningly retrieve the first five characters of the unencoded message. These characters are:
From:Assignment
Your assignment is to decode the message, and find the keys.
Reminders
This exercise is not extremely difficult. However, the strategy of trying things until something works will be ineffective. Try to understand the material in the course, particularly the following:
? Memory contains nothing but bits. Bits are interpreted as integers, characters, or instructions by the compiler, but they have no intrinsic type in memory.? The compiler can be strong-armed into interpreting integers as characters, or even as instructions, and vice versa.
? Every group of 8 bits (a byte) has an address.
? A pointer in C is merely a stored memory address.
? The activation records for each function call are all together in memory, and they are
organized in a stack that grows downwards and shrinks upwards on function calls and
returns respectively.
? The return address of one function as well as the addresses of all of its local variables are
allocated within one activation record.
Strategy
The designers of this decoder weren't very good. They made it psible for us to attack the keys in two independent parts. Try to break the first two keys first, and do not try to break the third and fourth keys until you have succeeded with the first two.
You can do the first part by specifying only two integer arguments when you execute the decoder. If you get the first and second keys right, a message that starts with From: will appear. This message is not the true message, but a decoy. It is useful, however,
to let you know that you have indeed broken the first two keys.
In breaking the first two keys, realize that the function process_keys12 must be somehow changing the value of the dummy variable. This must be so, because the variables start and stride control the extraction of the message, and they are calculated from the value of dummy.
In breaking the third and fourth keys, try to get the code to invoke extract_message2 instead of extract_message1. This modification must somehow be controlled from within the function process_keys34.
Files
When you are done, write a brief report that includes at least the following:
1. The secret message.
2. The secret keys.
3. One paragraph describing, in your own pre, what process_keys12 does. For example, you might say that it modifies a specific program variable.
4. The meaning of the first two keys in terms of variables and addresses in the decoder program. For example, you might describe key2 by saying that its X-Y bits contain the value to which variable start is set. Or you might describe key1 by saying, for example, that it must be set equal to the number of memory addresses separating the address of two specific variables. These are only examples.
5. One paragraph describing, in your own pre, what process_keys34 does.
6. One paragraph describing the line of source code that is executed when the first call
to process_keys34 returns.
7. The meaning of the third and fourth keys in terms of variables and addresses in the
decoder program.
Be precise, clear, and brief in each of the points above. Your report should not, in any case, be
longer than one page. Do not get frustrated if this takes a little longer than you expected: brief and clear text often requires more time to write than rambling pre.
Your teacher can tell you what word processors you may use to write your report. Chances are that you can write your report in a number of formats, and for simplicity's sake, you might even want to write it using Notepad.
Enjoy!
可以看到,当函数只有两个参数输入的时候,这个时候函数是只会输出
msg1
的信息,其中msg1
是由函数extract_message1
定义实现的。if (*msg1 == '\0') {process_keys34(&key3, &key4);msg2 = extract_message2(start, stride);printf("%s\n", msg2);}else {printf("%s\n", msg1);}
函数
extract_message1
定义:可以看到是将data
这一int
型的数组中的每一个int数值进行转换,其转化规则是从data[start+1]
开始,向前输出stride-1
个字符。隔一个不读,那接下来问题是观察start
和stride
的值。char * extract_message1(int start, int stride) {int i, j, k;int done = 0;for (i = 0, j = start + 1; ! done; j++) {for (k = 1; k < stride; k++, j++, i++) {if (*(((char *) data) + j) == '\0') {done = 1;break;}message[i] = *(((char *) data) + j);}}message[i] = '\0';return message;}
关于
start
和stride
的值的确定中,start
等于dummy
最低地址那一字节中的数值。stride
等于dummy
第二低地址那一字节中的数值。而main
函数有信息打印,所以可以发现stride
的值肯定不是初始值,所以函数process_keys12
一定修改了dummy
的值。在函数
process_keys12
中,key1
表示key1
的地址,*key1
表示main
函数中的指针。在我们所知的内存分配中,这里的dummy
和key1
的值相差了3个int
型的变量,因此key1
的值为 3process_keys12(&key1, &key2);start = (int)(*(((char *) &dummy)));stride = (int)(*(((char *) &dummy) + 1));
void process_keys12 (int * key1, int * key2) {*((int *) (key1 + *key1)) = *key2; }
对于
key2
的值确定,我使用的是平台是x86_64 mac
中的gcc
编译器。这个时候猜想信息的打印来自于data
数组,其是由ASC
码组成的。根据初始信息From:
结合函数extract_message1
可以看出:stride = 3
、start = 9
。使用代码判断编译平台,发现内存格式是小端格式。而我们知道,在这种格式之中,数据以8
bit
为单位(64位格式),因此dummy
的值为3 * 256 + 9 = 777
,因此key2 = 777
对
key1
和key2
的值进行检测,实验成功:和之前的分析类似,对函数
process_keys34
和extract_message2
进行理解。函数process_keys34
: 获取process_keys34
的执行堆栈中的key3
的地址,并向其添加作为主函数的局部变量的key3
的值,以获取地址其中存储process_keys34
的返回地址。 然后将主函数的局部变量key4
的值添加到返回地址以更改控制流,并直接跳转到执行extract_message2
。
而对main
函数进行分析,如果是按照顺序执行,则需要修改start
和stride
的值进行修改,但是要修改成满足if
条件的情况,则start
和stride
的值都不会让extract_message2
得到满足已有信息From:
的条件,所以这里一定发生了函数的跳转。而这里函数返回和局部变量之间还存在一个寄存器rbp,而在process_keys34
函数中,地址被修改成了int *
进行操作,在64位系统中,这里则要要求key3
的值为4,而不是在内存布局中的2。*(((int *)&key3) + *key3) += *key4;
同时,对于
key4
的确定,使用lldb
方法查看process_keys34
的修改值,可以得到地址差值即key4
的值为44char * extract_message2(int start, int stride) {int i, j;for (i = 0, j = start; *(((char *) data) + j) != '\0';i++, j += stride) {message[i] = *(((char *) data) + j);}message[i] = '\0';return message;}