Arch Decision #1
Should the reference kernel use a single virtual address space?
Added by Phil Garcia about 3 years ago. Updated by Phil Garcia about 3 years ago.
| Status: | Under Discussion | ||
| Assignee: | - |
Problem Description
Should all processes share a single virtual address space, or should each process have a separate one?
Resolution
Factors
| # | Status | Factors (in order of importance) | |
|---|---|---|---|
| 1 | Under Discussion | ||
Strategies
| # | Rejected? | Short name | Summary |
|---|---|---|---|
Issues
| # | Relation to AD | Status | Redmine Issue or External URL |
|---|---|---|---|
Discussion
Re: Should the reference kernel use a single virtual address space? - Added by Simon Wollwage about 3 years ago
I would vote for it, as it takes a lot of pressure off the TLB and seems easy to implement (compared to the other models).
Re: Should the reference kernel use a single virtual address space? - Added by Ben Kloosterman about 3 years ago
A single address space is easier until you have to implement shared assemblies; then it is probably about the same difficulty, but shared assemblies are a long way off. Another debate is whether you patch addresses when loading apps into the user address space even if you use VM, since it means no remapping of pages on context switch.
Note that the cost is not just the full TLB and cache flush.
As a reference model, there are benefits to simplicity.
Obviously I'm biased here, since my kernel will not have VM :-)
It also depends heavily on your OS and especially its IPC design. My OS is a deeply asynchronous, kernel-less design running lots of single-threaded services, which means that you are context switching VERY often and have very little static data. A more traditional monolithic OS would have far fewer switches.
Anyway, there is a strong case for a flexible MM interface so you can do both in the long term.
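A minimal sketch of what load-time address patching could look like, assuming a hypothetical flat image linked at base 0 that carries a list of offsets where 32-bit little-endian absolute addresses are stored. The image layout and the name `patch_image` are illustrative assumptions, not a format anyone in this thread has proposed.

```python
import struct


def patch_image(image: bytearray, reloc_offsets: list, load_base: int) -> None:
    """Add load_base to every 32-bit absolute address listed in reloc_offsets."""
    for offset in reloc_offsets:
        # Read the address as it was linked (relative to base 0).
        (linked_address,) = struct.unpack_from("<I", image, offset)
        # Write it back rebased to where the image was actually loaded.
        patched = (linked_address + load_base) & 0xFFFFFFFF
        struct.pack_into("<I", image, offset, patched)


# Hypothetical 8-byte image with one absolute address (0x10) stored at offset 4.
image = bytearray(8)
struct.pack_into("<I", image, 4, 0x10)
patch_image(image, [4], 0x00400000)
print(hex(struct.unpack_from("<I", image, 4)[0]))
```

Once patched this way, the image is tied to its load address, which is what lets every process live at a distinct range without remapping pages on a context switch.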
Issue is not just TLB / cache flush - Added by Ben Kloosterman about 3 years ago
The price is not just the TLB / cache flush:
1. TLB misses need to be handled. This requires getting the process ID. I think TLB hit rates are 95-98%, with misses often costing 100 cycles or more. Highly asynchronous systems will have much higher TLB miss rates.
2. Copying memory from the private pages of one process to another (without shared memory) requires an intermediate buffer. This is a big issue for IPC and kernel calls.
3. Page table memory usage.
4. Normally, the entries in the x86 TLBs are not associated with any address space. Hence, every time the address space changes, such as on a context switch, the entire TLB has to be flushed. Maintaining a tag in software that associates each TLB entry with an address space, and comparing this tag during TLB lookup and TLB flush, is very expensive, especially since the x86 TLB is designed to operate with very low latency and completely in hardware. Without such a tag you cannot run two processes on the same core at the same time (you need to stage them and flush the TLB in between).
Note that Hyper-Threading shares the same TLB, so you are limited to running two threads within a single process (unless you can solve the get-PID / tag issue above), whereas in a system with no VMM each HT thread can run a different process.
5. Managing VM and page tables between cores is a pain.
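To put rough numbers on point 1, the sketch below works out the average extra cycles per memory access from a hit rate and a miss penalty. The 95-98% range and the 100-cycle penalty are the estimates quoted above; the 80% row is a made-up figure standing in for a highly asynchronous workload, and the function name is hypothetical.

```python
def avg_tlb_overhead_cycles(hit_rate: float, miss_penalty_cycles: float) -> float:
    """Expected extra cycles per memory access caused by TLB misses."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Only the fraction of accesses that miss pays the penalty.
    return (1.0 - hit_rate) * miss_penalty_cycles


for rate in (0.98, 0.95, 0.80):
    overhead = avg_tlb_overhead_cycles(rate, 100)
    print(f"hit rate {rate:.0%}: about {overhead:.1f} extra cycles per access")
```

Under these assumptions the overhead grows from about 2 to about 5 cycles per access across the quoted range, and to about 20 if the hit rate drops to 80%.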
Also note:
In 2008, both Intel (Nehalem)[6] and AMD (SVM)[7] introduced tags as part of the TLB entry, along with dedicated hardware that checks the tag during lookup. Even though these are not fully exploited, it is envisioned that in the future these tags will identify the address space to which every TLB entry belongs. A context switch will then not flush the TLB, but just change the current address-space tag to that of the new task. (But the HT issue still exists.)
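As a toy illustration of why the tag matters, the following model compares an untagged TLB, flushed on every address-space switch, with a tagged one. It assumes unlimited TLB capacity and two processes that alternate and touch the same four virtual pages; the class and function names are hypothetical, and it does not model any real CPU.

```python
class ToyTLB:
    """Toy TLB model: caches translations keyed by (address-space tag, virtual page)."""

    def __init__(self, tagged: bool):
        self.tagged = tagged
        self.entries = set()
        self.current_asid = 0
        self.hits = 0
        self.misses = 0

    def switch_to(self, asid: int) -> None:
        if asid == self.current_asid:
            return
        if not self.tagged:
            # Untagged entries cannot be told apart, so all of them must go.
            self.entries.clear()
        self.current_asid = asid

    def access(self, vpage: int) -> None:
        key = (self.current_asid if self.tagged else 0, vpage)
        if key in self.entries:
            self.hits += 1
        else:
            self.misses += 1
            self.entries.add(key)  # refilled after a page walk


def run(tagged: bool, rounds: int = 10, pages: int = 4) -> tuple:
    """Alternate between two address spaces and return (hits, misses)."""
    tlb = ToyTLB(tagged)
    for _ in range(rounds):
        for asid in (1, 2):
            tlb.switch_to(asid)
            for vpage in range(pages):
                tlb.access(vpage)
    return tlb.hits, tlb.misses


print("untagged (hits, misses):", run(tagged=False))  # (0, 80)
print("tagged   (hits, misses):", run(tagged=True))   # (72, 8)
```

In this deliberately worst-case workload the untagged TLB never hits, because every time slice starts from an empty TLB, while the tagged one only misses on the first touch of each page per address space.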
Re: Should the reference kernel use a single virtual address space? - Added by Ben Kloosterman about 3 years ago
I think it's viable not to share the code that accesses static data, and then patch it.
E.g.:
A shared assembly is compiled into two x86 libs made up of:
Shared code
Static user access stub
Methods that access static data are therefore not shared.
This would remove the performance overhead for shared statics at the cost of increased memory usage. It may be more appropriate for a traditional / monolithic OS, which has more static code, than for a Minix-style microkernel.
Issue is not just TLB / cache flush - Added by Phil Garcia about 3 years ago
Ben Kloosterman wrote:
The price is not just the TLB / cache flush:
1. TLB misses need to be handled. This requires getting the process ID. I think TLB hit rates are 95-98%, with misses often costing 100 cycles or more. Highly asynchronous systems will have much higher TLB miss rates.
Just a correction: on the x86/x64 platform, TLB misses are handled completely by the hardware. The cost is two to three additional memory lookups by the CPU in the page directory and page tables.
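For illustration, here is a software sketch of the two-lookup case, assuming classic 32-bit x86 paging with 4 KiB pages (10-bit directory index, 10-bit table index, 12-bit offset). Real hardware does this walk itself; the dictionaries standing in for the page directory and page tables, and the function name, are hypothetical, and present bits and permissions are left out.

```python
PAGE_SHIFT = 12   # 4 KiB pages
INDEX_BITS = 10   # 1024 entries per directory or table
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def walk(page_directory: dict, vaddr: int) -> tuple:
    """Translate vaddr; return (physical address, number of table lookups).

    Raises KeyError where real hardware would raise a page fault.
    """
    dir_index = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    table_index = (vaddr >> PAGE_SHIFT) & INDEX_MASK
    offset = vaddr & OFFSET_MASK

    page_table = page_directory[dir_index]  # lookup 1: page directory entry
    frame = page_table[table_index]         # lookup 2: page table entry
    return (frame << PAGE_SHIFT) | offset, 2


# Hypothetical mapping: directory slot 1, table slot 3 -> physical frame 0x1234.
directory = {1: {3: 0x1234}}
physical, lookups = walk(directory, 0x00403ABC)
print(hex(physical), lookups)
```

Each extra paging level adds one more lookup of the same shape to the chain.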