logo资料库

浙江大学高级数据库技术考试卷.doc

第1页 / 共4页
第2页 / 共4页
第3页 / 共4页
第4页 / 共4页
资料共4页,全文预览结束
Advanced Database Technology
Baseline Test
(Sample)
Problem 1 : Index and Query (20 points)
Problem 3. Query Optimization (12 points)
Please answer the following questions:
Advanced Database Technology Baseline Test (Sample) Problem 1 : Index and Query (20 points) Consider the following relational schemas: Create table emp(id char(20) primary key, name char(30), age int, salary real, address char(50), dname char(30) , foreign key (dname) references dept); Create table dept(dname char(30) primary key, location char(50) ); Assume table emp has 6000 records, which are stored sequentially based on their ids; table dept has 200 records, which are stored sequentially based on their dnames; each (data or index) block can hold 2K (2048) bytes, in which 48 bytes are reserved for the block header. The attribute with real type is 8 bytes and int type is 4bytes. No records span two or more blocks. Each block pointer(which points to a block) is 6 bytes. Each record pointer (which points a data record) is 8 bytes ( includes a block pointer + 2bytes offset within a block). Index entries do not span blocks. (1) Suppose we want to build a conventional primary index INDEX1 on emp.id using a dense index. What is the minimum number of blocks needed for INDEX1? (2) For the same primary index INDEX1 on emp.id, we now want to use a sparse index. What is the minimum number of blocks needed for INDEX1? (3) Suppose we want to build a secondery index INDEX2 on dept.location. What type of INDEX2 makes sense, sparse or dense? What is the minimum number of blocks needed for INDEX2? (4) Suppose now we want to do the natural join emp dept. The available memory buffer size is 10 blocks. Estimate the cost for the join, and explain the algorithm to get the lowest cost. 1
Problem 2 : B+ Tree (24 points) For the following B+ tree (n=4): 31 20 37 51 61 10 13 20 22 31 33 35 37 38 41 51 53 57 61 65 71 Answer the following questions: (1) Draw the B+ tree after insert an index item with key ‘55’ to the above tree. (2) Draw the B+ tree after delete an index item with key ‘13’ from the original tree. (3) Suppose the B+ tree contains 900 index items, please estimate the height of the B+ tree. (Consider the tree with half-filled and full-filled respectively.) (4) Suppose that the B+ tree contains 900 index items, please estimate the size (i.e. the number of nodes) of the B+ tree. (Consider the tree with half-filled and full-filled respectively.) Problem 3. Query Optimization (12 points) Suppose we have the database tables R(ID, A) and S(ID, B) in which IDs are keys, and we would like to execute the following query: SELECT A, B FROM R, S WHERE R.ID = S.ID AND A = 'a1' AND B = 'b1' ; We have the following statistics and assumptions: R stored contiguously in br = 1000 blocks; S stored contiguously in bs = 600 blocks; V (R, ID) = 15000; // V(R, ID) is the number of distinct values that appear in the relation R for attribute ID . V (R, A) = 50; V (S, ID) = 10000; V (S, B) = 100; Assume that the values in R.A and S.B are uniformly distributed, in which value 'c1' and 'b1' exist. The available memory buffer size is 30 blocks. Please answer the following questions: (1) Please give out an execution plan with the lowest cost for the query at the relational-algebra level. 2
(2) Estimate the cost for the query with your optimized algorithm.(Note: the cost is measured with the number of blocks transferred between disk and memory buffer and the seek times of disk. As usual, do not count the cost to write out the final query result.) Problem 4: Concurrent Control (24 points) There are two concurrent schedules: S1 : w1(Y );w2(X); r3(X); r1(Z);w4(Z); r4(X);r2(Y ). S2 : r1(X); r3(X);r1(Z);r4(V);w3(Z);w2(Y );w2(X); w4(V ); r3(Y ). Here:wi(X) means Transaction i writes data X, ri(X) means Transaction i reads data X. For each of the above schedules S1 and S2, answer the following questions respectively: (1) Draw the precedence graph for the schedule. (2) Is the schedule conflict-serializable? If yes, describe all the conflict-equivalent serial schedules. (You do not need to explicitly write out the serial schedules, just use the form like < T1, T3, T2 > .) If no, explain why it is not. (3) Could the schedule be created by the 2PL protocol? If yes, put the lock and unlock actions (e.g. use ‘Ls(Q)’ for Lock-s(Q), ‘Lx(Q)’ for Lock-x(Q), ‘UL(Q)’ for Unlock(Q)) into the schedule to show it can be generated by 2PL. explain briefly why. If not, Problem 5 Crash Recovery (20 points) Consider the following transaction log of a database system. Suppose the system crashes just after the last log record. 1) < T1 START > 2) < T1, A, 20, 30 > 3) < CKPT (T1) > 4) < T2 START > 5) < T1, A, 30, 50 > 6) < T2, B, 52, 66 > 7) < T3 START > 8) < T3, E, 0, 15 > 9) < T2 COMMIT > 10) < CKPT (T1,T3) > 11) < T3, E, 15, 40 > 12) < T1, C, 45, 90 > 13) < T1 COMMIT > 14) < T3 ABORT > 15) < T4 START > 16) < T4, D, 39, 0 > 3
Assume the log entries are in the format . Please answer each of the following questions: (1) What are the values of A, B, C, D and E in the database just after system crash? (2) Which transactions should be undone? Which transactions should be redone? (3) What are the start and end points for undo and redo processes respectively? (4) What are the values of A, B, C, D and E after system recovery ? 4
分享到:
收藏