Status: Tech Preview
Architecture(s): Intel x86
Component(s): Hypervisor, toolstack
Hardware: MBA is supported on Skylake Server and beyond
The Memory Bandwidth Allocation (MBA) feature provides indirect and approximate control over the memory bandwidth available per core. It gives the OS/hypervisor the ability to slow misbehaving apps/domains by using a credit-based throttling mechanism.
Feature Enabling:
Add “psr=mba” to the boot line parameters to enable the MBA feature.
xl interfaces:
psr-mba-show [domain-id|domain-name]:
Show the memory bandwidth throttling for a domain. The type of data shown depends on the mode.
There are two modes:
- Linear mode: the input precision is defined as 100-(MBA_MAX). For instance, if the MBA_MAX value is 90, the input precision is 10%. Values that are not an exact multiple of the precision (e.g., 12%) are rounded down by HW automatically (e.g., to a 10% delay). The response to the throttling value is linear.
- Non-linear mode: input delay values are powers of two from zero to the MBA_MAX value from CPUID. Any value that is not a power of two is rounded down to the nearest power of two by HW automatically. The response to the throttling value is non-linear.
For linear mode, it shows the decimal value. For non-linear mode, it shows hexadecimal value.
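The rounding rules for the two modes can be illustrated with a small model. This is only a sketch of the behaviour described above, not code from Xen or a description of the actual HW logic; the function name `mba_round_thrtl` and its parameters are hypothetical, and the caller is assumed to have already checked the value against MBA_MAX.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of how a requested delay value is rounded.
 * thrtl:   requested throttling (delay) value
 * mba_max: MBA_MAX as reported by CPUID
 * linear:  true for linear mode, false for non-linear mode
 */
unsigned int mba_round_thrtl(unsigned int thrtl, unsigned int mba_max,
                             bool linear)
{
    unsigned int pow2 = 1;
    unsigned int precision;

    /* Assumption: out-of-range values are rejected before this point. */
    assert(thrtl <= mba_max);

    if (linear) {
        /* Precision is 100 - MBA_MAX, e.g. 10 when MBA_MAX is 90. */
        assert(mba_max < 100);
        precision = 100 - mba_max;
        /* Round down to a multiple of the precision. */
        return thrtl - (thrtl % precision);
    }

    /* Non-linear mode: zero stays zero. */
    if (thrtl == 0)
        return 0;

    /* Round down to the largest power of two not above thrtl. */
    while (pow2 <= thrtl / 2)
        pow2 *= 2;
    return pow2;
}
```

With MBA_MAX equal to 90 in linear mode, a request of 12 yields 10, matching the example in the text.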
psr-mba-set [OPTIONS] <domain-id|domain-name> <throttling>:
Set the memory bandwidth throttling for a domain.
Options: ‘-s’: Specify the socket to process, otherwise all sockets are processed.
The throttling value set in the register indicates the approximate amount of delay applied to the traffic between core and memory. A higher throttling value results in lower bandwidth. The max throttling value supported (MBA_MAX) can be obtained through CPUID inside the hypervisor. Users can fetch the MBA_MAX value using the psr-hwinfo xl command.
MBA is a member of the Intel PSR features and shares the base PSR infrastructure in Xen.
MBA defines a range of MSRs to support specifying a delay value (Thrtl) per COS, with details below.
+----------------------------+----------------+
| MSR (per socket) | Address |
+----------------------------+----------------+
| IA32_L2_QOS_Ext_BW_Thrtl_0 | 0xD50 |
+----------------------------+----------------+
| ... | ... |
+----------------------------+----------------+
| IA32_L2_QOS_Ext_BW_Thrtl_n | 0xD50+n |
+----------------------------+----------------+
When a context switch happens, the COS ID of the domain is written to the per-hyper-thread MSR IA32_PQR_ASSOC, and the hardware then enforces bandwidth allocation according to the throttling value stored in the Thrtl MSR.
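The MSR layout and the context-switch step can be sketched as follows. The Thrtl base address 0xD50 comes from the table above; the IA32_PQR_ASSOC address (0xC8F) and the position of the COS field (bits 63:32) are taken from the Intel SDM as best understood here and should be checked there. The helper names are hypothetical and the code only computes values; it does not access any MSR.

```c
#include <stdint.h>

#define MSR_IA32_PQR_ASSOC   0xC8Fu  /* assumption: address per the SDM */
#define MSR_MBA_THRTL_BASE   0xD50u  /* IA32_L2_QOS_Ext_BW_Thrtl_0 */
#define PQR_ASSOC_COS_SHIFT  32      /* assumption: COS is in bits 63:32 */

/* Address of the Thrtl MSR that holds the delay value for a given COS. */
uint32_t mba_thrtl_msr(unsigned int cos)
{
    return MSR_MBA_THRTL_BASE + cos;
}

/*
 * Value to write to IA32_PQR_ASSOC on a context switch: replace the COS
 * field and keep the low 32 bits (which carry the RMID) unchanged.
 */
uint64_t pqr_assoc_with_cos(uint64_t old_val, uint32_t cos)
{
    return (old_val & 0xFFFFFFFFull) |
           ((uint64_t)cos << PQR_ASSOC_COS_SHIFT);
}
```

Keeping the low bits intact matters because the same register is shared with the monitoring features.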
Generally speaking, MBA is completely independent of CAT/CDP, and any combination may be applied at any time, e.g. enabling MBA with CAT disabled.
Note, however, that MBA shares the COS infrastructure with CAT, although MBA is enumerated by a different CPUID leaf from CAT (which means that the max COS of MBA may differ from that of CAT). In some cases, a domain is permitted to have a COS that is beyond the range of one (or more) of the PSR features but within the range of the others. For instance, assume the max COS of MBA is 8 but the max COS of L3 CAT is 16. When a domain is assigned COS 9, the L3 CAT CBM associated with COS 9 is enforced, but for MBA the HW behaves as if the default value were set, since COS 9 is beyond the max COS (8) of MBA.
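The out-of-range case can be modelled as a simple lookup. This is a minimal sketch under the assumptions stated in the paragraph above (the max COS is inclusive, and a COS beyond it falls back to the default value); `mba_effective_thrtl` and its parameters are hypothetical names, not Xen functions.

```c
/*
 * Hypothetical model: which throttling value takes effect for a COS.
 * thrtl_by_cos must hold mba_cos_max + 1 entries (COS 0..mba_cos_max).
 */
unsigned int mba_effective_thrtl(unsigned int cos,
                                 unsigned int mba_cos_max,
                                 const unsigned int *thrtl_by_cos,
                                 unsigned int default_thrtl)
{
    /* A COS beyond the MBA range behaves as if the default were set. */
    if (cos > mba_cos_max)
        return default_thrtl;

    return thrtl_by_cos[cos];
}
```

With a max COS of 8 for MBA, COS 9 yields the default value while a feature with a larger max COS, such as L3 CAT in the example, would still apply its own setting for COS 9.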
Core COS/Thrtl association
When enforcing Memory Bandwidth Allocation, all cores of all domains use the same default Thrtl MSR (COS0), which stores the same Thrtl (0). The default Thrtl MSR is used only in the hypervisor and is transparent to the tool stack and the user.
System administrators can change the PSR allocation policy at runtime by using the tool stack. Since MBA shares the COS ID with CAT/CDP, a COS ID corresponds to a 2-tuple such as [CBM, Thrtl] when only CAT is enabled; when CDP is enabled, it corresponds to a 3-tuple such as [Code_CBM, Data_CBM, Thrtl]. If neither CAT nor CDP is enabled, things are simpler, since one COS ID corresponds to one Thrtl.
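One way to picture the COS-to-tuple relationship is a table indexed by COS ID. The sketch below is illustrative only: `struct cos_entry` and `find_cos` are hypothetical, and looking up an existing COS with the same tuple is an assumption about how such a table might be used, not a statement about the Xen implementation.

```c
#include <stdint.h>

/* Hypothetical per-COS tuple; unused fields are assumed to be zero. */
struct cos_entry {
    uint32_t code_cbm;  /* CBM with CAT only, Code_CBM with CDP */
    uint32_t data_cbm;  /* Data_CBM, meaningful only with CDP */
    uint32_t thrtl;     /* MBA throttling value */
};

/*
 * Return the COS ID whose tuple equals *want, or -1 if none matches.
 * table holds n entries, indexed by COS ID.
 */
int find_cos(const struct cos_entry *table, unsigned int n,
             const struct cos_entry *want)
{
    unsigned int i;

    for (i = 0; i < n; i++) {
        if (table[i].code_cbm == want->code_cbm &&
            table[i].data_cbm == want->data_cbm &&
            table[i].thrtl == want->thrtl)
            return (int)i;
    }
    return -1;
}
```

Because the whole tuple is compared, two domains with the same CBM but different Thrtl values map to different COS IDs.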
VCPU schedule
This part reuses CAT COS infrastructure.
Multi-sockets
Different sockets may have different MBA capabilities (like max COS), although the capability is consistent within a socket. So the MBA capability is specified per socket.
This part reuses CAT COS infrastructure.
Hypervisor interfaces:
Boot line param: “psr=mba” to enable the feature.
SYSCTL:
- XEN_SYSCTL_PSR_MBA_get_info: Get system MBA information.
DOMCTL:
- XEN_DOMCTL_PSR_MBA_OP_GET_THRTL: Get throttling for a domain.
- XEN_DOMCTL_PSR_MBA_OP_SET_THRTL: Set throttling for a domain.
xl interfaces:
psr-mba-show [domain-id] Show system/domain runtime MBA throttling value. For linear mode, it shows the decimal value. For non-linear mode, it shows hexadecimal value. => XEN_SYSCTL_PSR_MBA_get_info/XEN_DOMCTL_PSR_MBA_OP_GET_THRTL
psr-mba-set [OPTIONS]
psr-hwinfo Show PSR HW information, including L3 CAT/CDP/L2 CAT/MBA. => XEN_SYSCTL_PSR_MBA_get_info
Key data structure:
Feature HW info
```
struct {
    unsigned int thrtl_max;
    bool linear;
} mba;
```
- Member `thrtl_max` is the max throttling value that can be set, i.e. MBA_MAX.
- Member `linear` indicates whether the response of the delay value is linear or not.
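To show how these two members might be consumed, here is a small sketch that range-checks a requested value against `thrtl_max` and formats a value the way the text describes for ‘psr-mba-show’ (decimal in linear mode, hexadecimal in non-linear mode). The named type `struct mba_hw_info` and both helpers are hypothetical and are not taken from the Xen sources.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical named copy of the HW info structure shown above. */
struct mba_hw_info {
    unsigned int thrtl_max;  /* MBA_MAX */
    bool linear;             /* true if the delay response is linear */
};

/* A requested value is acceptable only if it does not exceed MBA_MAX. */
bool mba_thrtl_in_range(const struct mba_hw_info *mba, unsigned int thrtl)
{
    return thrtl <= mba->thrtl_max;
}

/* Format a throttling value: decimal for linear, hex for non-linear. */
int mba_format_thrtl(const struct mba_hw_info *mba, unsigned int thrtl,
                     char *buf, size_t len)
{
    if (mba->linear)
        return snprintf(buf, len, "%u", thrtl);

    return snprintf(buf, len, "0x%x", thrtl);
}
```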
As mentioned above, MBA is a member of the Intel PSR features and shares the base PSR infrastructure in Xen. For example, ‘cos_max’ is a common HW property for all features. For other data structure details, please refer to ‘intel_psr_cat_cdp.pandoc’.
MBA can only work on HW which supports it (check CPUID).
The following commands can be executed to verify MBA on HW that supports it.
For example:
1. The user can get the MBA hardware info through the ‘psr-hwinfo’ command. From the result, the user can tell whether the hardware works in linear or non-linear mode, the max throttling value (MBA_MAX), and so on.
root@:~$ xl psr-hwinfo --mba
Memory Bandwidth Allocation (MBA):
Socket ID : 0
Linear Mode : Enabled
Maximum COS : 7
Maximum Throttling Value: 90
Default Throttling Value: 0
2. The user can then set a throttling value for a domain with ‘psr-mba-set’ and check the result with ‘psr-mba-show’:
root@:~$ xl psr-mba-set 1 10
root@:~$ xl psr-mba-show 1
Socket ID : 0
Default THRTL : 0
ID NAME THRTL
1 ubuntu14 10
N/A
N/A
“INTEL RESOURCE DIRECTOR TECHNOLOGY (INTEL RDT) ALLOCATION FEATURES” Intel 64 and IA-32 Architectures Software Developer Manuals, vol3
Date | Revision | Version | Notes |
---|---|---|---|
2017-01-10 | 1.0 | Xen 4.9 | Design document written |
2017-07-10 | 1.1 | Xen 4.10 | Changes: 1. Modify data structure according to latest codes; 2. Add content for ‘Areas for improvement’; 3. Other minor changes. |
 | | | Changes: 1. Remove a special character to avoid error when building pandoc. |
 | | | Changes: 1. Add terminology ‘HW’. 2. Change ‘COS ID of VCPU’ to ‘COS ID of domain’. 3. Change ‘COS register’ to ‘Thrtl MSR’. 4. Explain the value shown for ‘psr-mba-show’ under different modes. 5. Remove content in ‘Areas for improvement’. |
 | | | Changes: 1. Add ‘<>’ for mandatory argument. |
 | | | Changes: 1. Modify words in ‘Overview’ to make it easier to understand. 2. Explain ‘linear/non-linear’ modes before mention them. 3. Explain throttling value more accurate. 4. Explain ‘MBA_MAX’. 5. Correct some words in ‘Design Overview’. 6. Change ‘mba_info’ to ‘mba’ according to code changes. Also, modify contents of it. 7. Add context in ‘Testing’ part to make things more clear. 8. Remove ‘n<64’ to avoid out-of-sync. |
 | | | Changes: 1. Add ‘domain-name’ as parameter of ‘psr-mba-show/psr-mba-set’. 2. Fix some wordings. 3. Explain how user can know the MBA_MAX. 4. Move the description of ‘Linear mode/Non-linear mode’ into section of ‘psr-mba-show’. 5. Change ‘per-thread’ to ‘per-hyper-thread’. |
 | | | Changes: 1. Correct some words. 2. Change ‘xl psr-mba-set 1 0xa’ to ‘xl psr-mba-set 1 10’ |
 | | | Changes: 1. Correct some words. |