Comps-2010-Fall
Contents |
Theory
Question 1 Sort Language
We have a function sort which can be applied to a language and sorts the characters in each word by alphabetical order.
- Give an example of a regular language which sort makes non-regular.
- Consider the language of <math>(ab)^n</math> this language is regular and can be represented by a 3 state machine. when sorted this becomes <math> a^nb^n </math>.
- Give an example of a non-regular language which sort makes regular.
- What about <math> b^nab^n </math>? which turns to <math> ab^{2n} </math> after sort is applied.
- This is just an a followed by an even number of b's which I can pretty easily create a DFA for.
- But first I need to prove that <math> b^nab^n </math> is non-regular.
- We can prove that the number of equivalence classes is infinite by showing that for a u and v in the language when we append a w one is in the language while the other is not.
- Let u = <math> b^ia </math> and v = <math> b^i-1a </math>
- Let v = <math> b^i </math> then uw is in the language while vw is not.
Most non-regular languages have one or more letters tied together by some relationship, e.g. <math> a^nb^n\; |\; n\; >\; 0 </math>.
Question 3 Biased Coin
- Biased coin flips heads with probability p tails with probability (1-p).
- How could you use this coin to simulate a fair coin?
Flip this coin repeatedly until you get a heads followed by a tails or tails followed by a heads, then select the first flip.
The probability of this occurring is 2*p*(1-p).
- <math>X\;=\;</math> # of rounds
- Pr<math>(HH\;\cup\;TT)\; = \;b\;=\;2p^2-2p+1</math>
- Pr<math>(HT\;\cup\;TH)\; = \;a\;=\;2p(1-p)</math>
- E<math>[x]\;=1a+2ab+3ab^2+.\;.\;.</math>
Question 4 Match
- Deck of 27 Cards
- Each Card has color, shape and number
- {red,blue,green}, {square, circle, heart}, {1,2,3}
- A match for any 3 cards is when each of the three cards share a property or are all different.
What is the probability of finding a match from drawing three cards.
- We draw one card which matches 9 other cards in exactly one property, 9 other cards in two properties, and 8 cards in zero properties.
- In case one there remains a single card that produces a single property match.
- In case two there is 1 remaining cards that produce a double property match.
- In case 3 there remains a single card which does not match any property of the other cards.
- (27/27)*(12/26)*(1/25)+(27/27)*(6/26)*(1/25)+(27/27)*(8/26)*(1/25) = 0.04
A simpler, key insight is to realize for any 2 cards there is only 1 card of 25 that creates a match.
If we shuffle the deck and turn over n cards where <math> n\; \leq= \; 27</math>, what is the expected number of matches where we count each match separately?
- (n choose 3) * (1/25)
The Saia approved way uses indicator variables:
- <math>X_{ijk}=\{1\; if\; i,j,k\; match,\; 0\; otherwise</math>
- <math>X_{ijk}=Pr(i,j,k\; match) = 1/25</math>
- <math>E[X_{ijk}]=1*(1/25)+0*(1-(1/25))</math>
- Define a new indicator random variable
- <math>X_n\;=\;#\; of\; matches\; in\; n\; card\; draws</math>
- <math>X_n\;=\; \sum_{ijk} X_{ijk}</math>
- <math>E[X_n]\;=\;E[\sum_{ijk}X_{ijk}]</math>
- Mention linearity of expectation.
- <math>E[X_n]\;=\;\sum_{ijk}E[X_{ijk}]</math>
- Also it's a good idea to get familiar with Markov Inequality.
- And Union Bounds.
Question 5 Chromatic Coloring
Show that determining if the largest Clique size equals the smallest k that you can color a graph <math>G</math> in is an NP complete problem. Reduce from the following: Graph 3-Coloring, Clique, Vertex Cover, Independent Set, Max Cut, Subset Sum, and 3-SAT.
- In short, let k be the size of the maximum clique for the graph G.
- Does a k-coloring of the graph G exist?
- Prove that it is in NP, i.e. we can verify a solution in polynomial time.
- Verify that the clique given is actually a clique. Then verify that the coloring is a valid coloring.
- If the coloring could be smaller, the clique given is not the largest.
- If the clique given could be larger, then the coloring given is not a valid coloring.
- Prove that it is as hard as another NP complete problem from the list.
- Show that it is a polynomial time reduction.
Question 6 Halting Reduction
Given two programs <math>\Pi_1,\; \Pi_2</math> and asked whether they are equivalent in the sense they compute the same partial function. That is, for any input x either neither <math>\Pi_1(x)</math> nor <math>\Pi_2(x)</math> halts, or they both halt and return the same answer. Prove that this question is undecidable.
Halting(Pi, x){
Pi_1(x'){
halt
}
Pi_2(x'){
Pi(x)
}
if equiv(Pi_1, Pi_2) then
return True
else
return False
}
Programming Languages
Question 1.1 Matrix
Part 1
- This was my first attempt. Sadly it doesnt work. An exception is thrown when a map tail is attempted on an empty list-list.
transpose :: [[Int]] -> [[Int]]
transpose [] = []
transpose xs = map head xs ++ transpose (map tail xs)
- This is a correct implementation
transpose :: [[Int]] -> [[Int]]
transpose xs = case length (head xs) of
0 -> []
1 -> map head xs : []
otherwise -> map head xs : (transpose (map tail xs))
Part 2
transpose :: Int -> Int transpose xs = case length (head xs) of
0 -> [] 1 -> map head xs : [] otherwise -> map head xs : (transpose (map tail xs))
Part 3
Question 2.3 Abstract Machines
We need to define fix within the function eval:
eval :: Term -> Env -> Val
eval t e = case t of
Var x -> e x
...
Fix (Abs f (Abs x t12)) -> eval (Abs x t12) (extend f (eval Fix (abs f (Abs x t12))
Systems
Question 1 Short Answer
Why and when must an OS occasionally flush the system Translation Lookaside Buffer (TLB)?
- On context switch of processes that have separate address spaces.
RAID 0 has no redundancy (parity bits or mirroring). What is the main reason to consider using RAID 0?
- Speed, by striping.
An IP packet is sent from L.A. to New York along a path comprised of routers that fully comply with all RFCs regarding IP routing. State two (2) reasons why the IP packet received in New York will differ from the one sent in L.A.
- TTL
- Checksum since the checksum is over the header and affected by the TTL.
Give one reason why you might initialize a semaphore to something other than zero.
What are the main differences between paging and segmentation?
- Pages are a set size determined by the OS.
- Segments are variable size and affected by external fragmentation.
- Typically segments might span multiple pages.
What are the key concepts of an exokernel-based operating system?
- Libraries provide system services and you link the system services you will use.
- The kernel only ensures that the requested resource is free, and the application is allowed to access it. This low-level hardware access allows the programmer to implement custom abstractions, and omit unnecessary ones, most commonly to improve a program's performance.
What is process migration? What state should be moved in order to completely migrate a process?
- Moving a process from one processor to another.
- Everything required in a context switch.
- In addition memory needs to mapped or copied, depending on whether the processors share memory.
Question 2.1 Scheduling
Given a shared system scheduler built for CPU bound processes which uses static priorities and round-robin policy with large, fixed
quanta. Preemption only occurs when a quantum expires.
Now, the developers of DinosaurOS wish to target the desktop and laptop markets. These days, people mostly use their computers to connect devices, transfer files, use their cell phones as a bluetooth modem, organize mp3s on their mp3 player, etc.
- What general modifications to the DinosaurOS scheduler can you hypothesize might help with it’s performance on I/O-heavy tasks.
- You need a scheduler which can strive to provide fairness to both CPU bound and I/O bound processes. This means it should not be Round Robin, and it should have the option to preempt processes before a quantum expires. Round Robin won't work because the CPU bound processes will effectively starve the I/O bound processes, since CPU bound uses its full quantum, while I/O bound will frequently relinquish control while it waits for I/O. A traditional approach gives a higher priority to I/O bound tasks so that they can do their small amount of work and then go back to waiting on I/O. While the I/O bound tasks are waiting -- CPU bound tasks can get access to the processor(s). Sometimes you might have multiple queues one for CPU bound processes and another for I/O bound processes. Or you might use Red Black trees for sorting processes on other metrics.
- What kinds of experiments would you run to find out if your hypotheses are correct?
I might gather a set of benchmark applications both CPU and I/O bound and test it across both the old scheduler and the new one. I would want to examine the following characteristics.
- Latency - The time it takes the scheduler to switch processes
- Throughput - The number of processes completed per unit time
- Wait Time - How long each process has to wait before given access to the processor
- Turnaround time - How long the time to completion is for each process
- CPU saturation - How active the CPU's are.
Question 2.3 IPC
(a) Describe the pros and cons of using shared memory vs. message passing abstractions.
- Pay attention to the abstractions keyword.
- The approach taken by distributed systems is generally message passing, e.g. MPI.
- OpenMP - shared memory IPC is used on single nodes.
- Shared Memory problems
- Cache coherency: a program will update memory and this needs to be reflected in all processes involved in the share.
- Shared Memory Advantage
- Speed - maybe, depending on the abstraction.
- Native data structures bundled up.
- Message passing makes shared data structures more complicated.
(b) Describe an implementation of send() and recv() message passing abstractions built on top of a shared memory mechanism in a manner that involves data/buffer copying. What are the pros and cons of this implementation?
- You've made it more difficult to utilize distributed systems.
- Kernel messages, etc would be faster in general.
Describe an implementation of send() and recv() message passing abstractions built on top of a shared memory mechanism in a manner that does not involve any data/buffer copying. What are the pros and cons of this implementation?
- One approach would be swapping pages.
- Synchronization issues
Question 3.1 Design
On a dual booting system, how might simply hibernating the unused OS, rather than shutting down the system result in corrupt files?
Describe the likely inconsistencies and their causes.
- I imagine this depends where the hibernating OS was written to disk. If this is a shared/raw partition, then without knowledge of the other hibernating OS on disk, there is nothing to prevent the active OS from overwriting blocks which store the hibernating OS.
- Phil says that hibernation is typically written to swap space.
- Scott says it's possible that the OS modifies superblock
Describe how design modifications to the OS and file-system format can solve this problem. You may assume both OSes are open source and you can make arbitrary changes to their kernels and filesystem formats.
- Creation of a file-system which maintains a shared file organization module is a possible solution.
- Or when you come out of hibernation make sure you reload the superblock off the disk.
- Another thought is that a lot of disc operations are buffered waiting to actually be applied to disk. We would need to ensure that these changes are applied before the system goes into hibernation to prevent inconsistencies and conflicts such as the hibernating OS writing to occupied space after being woken.
Describe a second design to solve this problem. This time, assume one OS is open source and arbitrarily modifiable, while the other OS, which uses the proprietary and unmodifiable NTFS filesystem format, keeps getting corrupted.
- Read only for the open source OS, while NTFS is mounted.
For both scenarios, try to maintain the capability of mounting file-systems with both read and write permissions. If this is not possible explain why not. Your answer should include a general discussion of relevant file-system design issues and give some ideas about what a file-system designed specifically to be shared among several live systems might look like. Also, you should consider both the file-system format on disk and the implementation of caching, buffering and everything that the OS does.
The layered file-system per The Dinosaur book:
- application programs
- logical file system (file control blocks)
- file-organization module
- basic file-system - unchanged
- I/O control - unchanged
- devices - unchanged
Question 3.2 Checkpointing
a) It depends, one thread is running at a time, there might be some lightweight synchronization for memory.
b) There are timing issues