Recent Changes - Search:

Research

Notes

Architecture

Faults

System

Planning

Background

OS

Misc

edit SideBar

SysPapers

Systems is fairly broad, so I'm not sure what is going to land in here. A fair amount of systems-ish papers can be found in the main Papers page.



Need to Categorize

Priority Inheritance Protocols: An Approach to Real-Time Synchronization

  • Sha, Lui and Rajkumar, Ragunathan and Lehoczky, John P
  • Deals with priority inversion (Others and I cite this paper). TODO: Read / Summarize
  • Attach:Sha1990Priority.pdf

@article{sha1990priority, title={Priority inheritance protocols: An approach to real-time synchronization}, author={Sha, Lui and Rajkumar, Ragunathan and Lehoczky, John P}, journal={Computers, IEEE Transactions on}, volume={39}, number={9}, pages={1175--1185}, year={1990}, publisher={IEEE} }


RTOS

A Comparison of Scheduling Latency in Linux, PREEMPT_RT, and LITMUSRT

  • Cerqueira, Felipe and Brandenburg, Bj{\"o}rn
  • Compares the groups custom real time Linux solution to the preempt_rt patch by using the standard cyclictest.
  • Attach:Cerqueira13SchedulingLatency.pdf

@article{brandenburg2013comparison, title={A Comparison of Scheduling Latency in Linux, PREEMPT RT, and LITMUSRT}, author={Cerqueira, Felipe and Brandenburg, Bj{\"o}rn}, journal={OSPERT 2013}, pages={20}, year={2013} }

Real-Time Operating Systems

  • Stankovic, John A and Rajkumar, Raj
  • A useful survey of the RTOS field. Breaks up current systems into an appropriate taxonomy, discusses each category, and then describes a few examples. Most examples are of research kernels, but unfortunately the paper is a bit dated. Also discusses common paradigms in the domain, but this section seems to be focused upon the authors' areas of research.
  • Attach:Stankovic04RTOS.pdf

@article{stankovic2004real, title={Real-time operating systems}, author={Stankovic, John A and Rajkumar, Raj}, journal={Real-Time Systems}, volume={28}, number={2-3}, pages={237--253}, year={2004}, publisher={Springer} }

Cost and Benefit of Separate Address Spaces in Real-Time Operating Systems

  • Mehnert, Frank and Hohmuth, Michael and Hartig, Hermann
  • This paper adequately shows the cost of using separate address spaces for RT tasks in mixed systems; the benefit is assumed (but the extra protection of separate address spaces is a reasonable assumption). RTLinux is used as the baseline; it runs RT threads within kernel space along with the Linux kernel and a RT executive which runs underneath everything. L4RTL is the new system which provides separate address spaces for RT tasks. It is based on DROPS (Dresden Real-time OPerating System (annoying acronym)), which uses Fiasco as the kernel (an L4 derivative)). L4Linux (user level) provides for regular Linux applications, and a user level real time library allows RTLinux applications to be run at user level. Bottom line, giving RT tasks separate address spaces adds to the WCET, average case and jitter. But it isn't that bad.
  • Attach:Mehnert02Cost.pdf

@inproceedings{mehnert2002cost, title={Cost and benefit of separate address spaces in real-time operating systems}, author={Mehnert, Frank and Hohmuth, Michael and Hartig, Hermann}, booktitle={Real-Time Systems Symposium, 2002. RTSS 2002. 23rd IEEE}, pages={124--133}, year={2002}, organization={IEEE} }



µ-kernel

Reorganizing UNIX for reliability

  • Herder, Jorrit N and Bos, Herbert and Gras, Ben and Homburg, Philip and Tanenbaum, Andrew S
  • Discusses the MINIX 3 microkernel, which is Unix-like. The reliability part is the ability to recover from errors in device drivers. TODO: Finish summary.
  • Attach:Herder2006Reorganizing.pdf

@incollection{herder2006reorganizing, title={Reorganizing UNIX for reliability}, author={Herder, Jorrit N and Bos, Herbert and Gras, Ben and Homburg, Philip and Tanenbaum, Andrew S}, booktitle={Advances in Computer Systems Architecture}, pages={81--94}, year={2006}, publisher={Springer} }

Keep Net Working-On a Dependable and Fast Networking Stack

  • Hruby, Tomas and Vogt, Dirk and Bos, Herbert and Tanenbaum, Andrew S
  • TODO: Read. Looks interesting: exploiting SMP archs for better micro-kernel performance.
  • Attach:Hruby2012Keep.pdf

@inproceedings{hruby2012keep, title={Keep net working-on a dependable and fast networking stack}, author={Hruby, Tomas and Vogt, Dirk and Bos, Herbert and Tanenbaum, Andrew S}, booktitle={Dependable Systems and Networks (DSN), 2012 42nd Annual IEEE/IFIP International Conference on}, pages={1--12}, year={2012}, organization={IEEE} }

Much of this is related to seL4 below, but is general enough to consider separately.

On Micro-Kernel Construction

  • Liedtke, Jochen
  • Mostly here due to historical significance. Showed that µ-kernels could work by greatly reducing IPC costs.
  • Attach:Liedtke95Micro.pdf

@article{liedtke1995micro, title={On micro-kernel construction}, author={Liedtke, Jochen}, journal={ACM SIGOPS Operating Systems Review}, volume={29}, number={5}, pages={237--250}, year={1995}, publisher={ACM} }

Correct, fast, maintainable: choose any three!

  • Blackham, Bernard and Heiser, Gernot
  • A short, informal paper with a simple thesis and only a single anecdote to back it up: IPC fastpath code should be written and optimized in c, instead of assembly. The performance is almost as good, and the development and maintenance is easier (with the exception of being sensitive to compiler changes). Most useful as a good description of fastpaths and IPC in µ-kernels.
  • Attach:Blackham12Fastpath.pdf

@inproceedings{blackham2012correct, title={Correct, fast, maintainable: choose any three!}, author={Blackham, Bernard and Heiser, Gernot}, booktitle={Proceedings of the Asia-Pacific Workshop on Systems}, pages={13}, year={2012}, organization={ACM} }


Composite

Predictable, Efficient System-Level Fault Tolerance in C3

@inproceedings{song2013predictable, title={Predictable, Efficient System-Level Fault Tolerance in C\^{} 3}, author={Song, Jiguo and Wittrock, John and Parmer, Gabriel}, booktitle={Real-Time Systems Symposium (RTSS), 2013 IEEE 34th}, pages={21--32}, year={2013}, organization={IEEE} }

C’MON: a Predictable Monitoring Infrastructure for System-Level Latent Fault Detection and Recovery

  • Song, Jiguo and Parmer, Gabriel
  • Looks to prevent latent faults in system level components by inspecting them at an interface level. TODO: Finish reading and summarizing. The most interesting aspect to me at the moment is Appendix A, which lays out a justification for treating SEUs as periodic, and calculating their rate. Basically, a statement is made that there is a 99.999999% chance that two errors will not happen within x amount of time. They come up with a rate of 2 Hz.
  • Attach:Song15CMON.pdf

@inproceedings{song15cmon, title={C’MON: a Predictable Monitoring Infrastructure for System-Level Latent Fault Detection and Recovery}, author={Song, Jiguo and Parmer, Gabriel} booktitle={Real-Time and Embedded Technology and Applications Symposium (RTAS), 2015 IEEE 21st}, year={2015}, organization={IEEE} }


Fiasco

Where Have all the Cycles Gone? Investigating Runtime Overheads of OS-Assisted Replication

  • Döbel, Björn and Härtig, Hermann
  • Skimmed. Attempts to assess the where the performance costs of Romain (a redundant threading library for Fiasco.OC) come from. Interesting experimental notes, as well as the best description of Romain I have seen thus far. Mentions the use of compiler-based fault tolerance to harden the kernel, but authors were not able to apply it because existing techniques have only been applied to user level applications (investigate?).
  • Attach:Dobel13Cycles.pdf

@inproceedings{dobel2013have, title={Where Have all the Cycles Gone? Investigating Runtime Overheads of OS-Assisted Replication}, author={D{\"o}bel, Bj{\"o}rn and H{\"a}rtig, Hermann}, booktitle={Workshop on Software-Based Methods for Robust Embedded Systems, SOBRES}, volume={13}, year={2013} }

Stay Strong, Stay Safe: Enhancing Reliability of a Secure Operating System

  • Vogt, Dirk and Döbel, Björn and Lackorzynski, Adam
  • Skimmed. Introduces l4ReAnimator as a framework to restart applications running on top of Fiasco.OC (kernel) and L4Re (api). Fiasco.OC is capability based so much of the paper is concerned with proper management of capabilities. Should read further if of interest.
  • Attach:Vogt10Reliability.pdf

@inproceedings{vogt2010stay, title={Stay strong, stay safe: Enhancing reliability of a secure operating system}, author={Vogt, Dirk and D{\"o}bel, Bj{\"o}rn and Lackorzynski, Adam}, booktitle={Proceedings of the Workshop on Isolation and Integration for Dependable Systems (IIDS 2010), Paris, France, April 2010}, year={2010} }


seL4

This L4 derivative is intensely interesting as it is a µ-kernel that was designed to be verified and has also undergone WCET analysis.

Timing Analysis of a Protected Operating System Kernel

  • Blackham, Bernard and Shi, Yao and Chattopadhyay, Sudipta and Roychoudhury, Abhik and Heiser, Gernot
  • This was the first WCET analysis for interrupts in seL4. They lay the foundation for the static analysis, but are only able to achieve very pessimistic estimates. They also divided syscalls into two categories: open and closed. The open syscalls are long running and since all syscalls are non-preemptable (for now) the WCET is horrible. The closed calls are sufficient for a system that has been initialized and will not be creating or deleting kernel objects (ie. closed to new tasks).
  • Attach:Blackham11Timing.pdf

@inproceedings{blackham2011timing, title={Timing analysis of a protected operating system kernel}, author={Blackham, Bernard and Shi, Yao and Chattopadhyay, Sudipta and Roychoudhury, Abhik and Heiser, Gernot}, booktitle={Real-Time Systems Symposium (RTSS), 2011 IEEE 32nd}, pages={339--348}, year={2011}, organization={IEEE} }

Improving Interrupt Response Time in a Verifiable Protected Microkernel

  • Blackham, Bernard and Shi, Yao and Heiser, Gernot
  • This paper follows the formal verification and the original WCET analysis of seL4. To tackle the problem of long running syscalls, preemption-points are introduced. These points allow progress to be made, save the syscall state, check for any interrupts, and then resume the syscall. Other changes are made to sacrifice average case execution time for shorter WCET. This seems to be mostly accomplished by eliminating all lazy-processing and making a few structural changes to allow constant time operations. An important observation is that the time to complete a longer syscall is now potentially abysmal, as interrupts may slow progress to almost (but not quite) a halt. Re-verification has not yet been undertaken.
  • Attach:Blackham12ImprovingInterrupt.pdf

@inproceedings{Blackham:2012:IIR:2168836.2168869, author = {Blackham, Bernard and Shi, Yao and Heiser, Gernot}, title = {Improving interrupt response time in a verifiable protected microkernel}, booktitle = {Proceedings of the 7th ACM european conference on Computer Systems}, series = {EuroSys '12}, year = {2012}, isbn = {978-1-4503-1223-3}, location = {Bern, Switzerland}, pages = {323--336}, numpages = {14}, url = {http://doi.acm.org/10.1145/2168836.2168869}, doi = {10.1145/2168836.2168869}, acmid = {2168869}, publisher = {ACM}, address = {New York, NY, USA}, keywords = {formal verification, hard real-time systems, microkernels, trusted systems, worst-case execution time}, }

From L3 to seL4 What Have We Learnt in 20 Years of L4 Microkernels?

  • Elphinstone, Kevin and Heiser, Gernot
  • Really cool paper, and a good introduction to L4. Discusses the major changes to design over the years, and what was axed vs. what has survived. One does get the feeling that many of the original features were discarded; leaving a rather generic µ-kernel, but this is to be expected. After years of research and experimentation, the design has settled into what now may be considered best practices for a µ-kernel (or at least the gold standard to which new designs will be held). seL4 represents a major departure, and is covered in more detail in later papers.
  • Attach:Elphinstone13L3toseL4.pdf

@article{elphinstonel3, title={From L3 to seL4 What Have We Learnt in 20 Years of L4 Microkernels?}, author={Elphinstone, Kevin and Heiser, Gernot} }

seL4: Formal Verification of an Operating-System Kernel

  • Klein, Gerwin and Andronick, June and Elphinstone, Kevin and Heiser, Gernot and Cock, David and Derrin, Philip and Elkaduwe, Dhammika and Engelhardt, Kai and Kolanski, Rafal and Norrish, Michael and others
  • Kinda a big deal. The team was able to prove the Functional Correctness of seL4 (that it is a true implementation of the abstract specification). The section "implications" lists some of the derived guarantees such as no buffer-overflows, always well-formed data structures, and no memory leaks. The typical trusted computing base is still there (hello compiler), and the assembly code (100's of lines) could not be verified. So only the c part of the kernel (approx. 7500 lines) was verified. But this is still a huge step forward. The verification effort was massive.
  • Attach:Klein10seL4Formal.pdf

@article{klein2010sel4, title={seL4: formal verification of an operating-system kernel}, author={Klein, Gerwin and Andronick, June and Elphinstone, Kevin and Heiser, Gernot and Cock, David and Derrin, Philip and Elkaduwe, Dhammika and Engelhardt, Kai and Kolanski, Rafal and Norrish, Michael and others}, journal={Communications of the ACM}, volume={53}, number={6}, pages={107--115}, year={2010}, publisher={ACM} }

Edit - History - Print - Recent Changes - Search
Page last modified on October 27, 2015, at 01:29 PM