Discussion:
[Xen-devel] [PATCH v6 00/14] arm/mem_access: Walk guest page tables in SW if mem_access is active
Sergej Proskurin
2017-07-06 11:50:03 UTC
Hi all,

The function p2m_mem_access_check_and_get_page is called from the function
get_page_from_gva if mem_access is active and the hardware-assisted translation
of the given guest virtual address (gva) into a machine address fails. That is,
if the stage-2 translation tables restrict access to the guest's page tables,
the hardware-assisted translation will fail. The purpose of the function
p2m_mem_access_check_and_get_page is thus to translate the given gva and check
the requested access rights in software. However, as the current implementation
of p2m_mem_access_check_and_get_page itself relies on the hardware-assisted gva
to ipa translation, that translation might fail for the same reason. This
limitation will become equally relevant for the altp2m implementation on ARM.
We therefore provide a software guest translation table walk to address the
above mentioned issue.

The current version of the implementation supports translation of both the
short-descriptor as well as the long-descriptor translation table format on
ARMv7 and ARMv8 (AArch32/AArch64).

This revised version incorporates the comments on the previous patch series. In
this version we (i) move the PAGE_*_* macros to xen/page-defs.h instead of
xen/lib.h and (ii) move the function vgic_access_guest_memory to guestcopy.c
(and asm/guest_access.h) and subsequently rename it to access_guest_memory_by_ipa,
as its functionality is not specific to the vgic. Some additional changes
improving code readability and correct type usage have been made and are stated
in the individual commits.

The following patch series can be found on Github[0].

Cheers,
~Sergej

[0] https://github.com/sergej-proskurin/xen (branch arm-gpt-walk-v6)

Sergej Proskurin (14):
arm/mem_access: Add and cleanup (TCR_|TTBCR_)* defines
arm/mem_access: Move PAGE_*_* macros to xen/page-defs.h
arm/mem_access: Add defines supporting PTs with varying page sizes
arm/lpae: Introduce lpae_is_page helper
arm/mem_access: Add short-descriptor pte typedefs and macros
arm/mem_access: Introduce GV2M_EXEC permission
arm/mem_access: Introduce BIT_ULL bit operation
arm/mem_access: Introduce GENMASK_ULL bit operation
arm/guest_access: Move vgic_access_guest_memory to guest_access.h
arm/guest_access: Rename vgic_access_guest_memory
arm/mem_access: Add software guest-page-table walk
arm/mem_access: Add long-descriptor based gpt
arm/mem_access: Add short-descriptor based gpt
arm/mem_access: Walk the guest's pt in software

xen/arch/arm/Makefile | 1 +
xen/arch/arm/guest_walk.c | 629 +++++++++++++++++++++++++++++++++++++
xen/arch/arm/guestcopy.c | 50 +++
xen/arch/arm/mem_access.c | 31 +-
xen/arch/arm/vgic-v3-its.c | 37 +--
xen/arch/arm/vgic.c | 49 ---
xen/include/asm-arm/bitops.h | 1 +
xen/include/asm-arm/config.h | 2 +
xen/include/asm-arm/guest_access.h | 3 +
xen/include/asm-arm/guest_walk.h | 19 ++
xen/include/asm-arm/lpae.h | 67 ++++
xen/include/asm-arm/page.h | 1 +
xen/include/asm-arm/processor.h | 69 +++-
xen/include/asm-arm/short-desc.h | 130 ++++++++
xen/include/asm-arm/vgic.h | 3 -
xen/include/xen/bitops.h | 5 +-
xen/include/xen/iommu.h | 15 +-
xen/include/xen/page-defs.h | 24 ++
18 files changed, 1046 insertions(+), 90 deletions(-)
create mode 100644 xen/arch/arm/guest_walk.c
create mode 100644 xen/include/asm-arm/guest_walk.h
create mode 100644 xen/include/asm-arm/short-desc.h
create mode 100644 xen/include/xen/page-defs.h

--
2.13.2
Sergej Proskurin
2017-07-06 11:50:13 UTC
This commit renames the function vgic_access_guest_memory to
access_guest_memory_by_ipa. As the function name suggests, the function
expects an ipa as argument. Thus, to make the function's purpose more
clear, we have also renamed the argument gva into ipa. All invocations
of this function have been adapted accordingly.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
---
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
v6: We added this patch to our patch series.
---
xen/arch/arm/guestcopy.c | 8 ++++----
xen/arch/arm/vgic-v3-its.c | 36 ++++++++++++++++++------------------
xen/include/asm-arm/guest_access.h | 4 ++--
3 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index 938ffe2668..9ea8cb79a4 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -123,11 +123,11 @@ unsigned long raw_copy_from_guest(void *to, const void __user *from, unsigned le
* Temporarily map one physical guest page and copy data to or from it.
* The data to be copied cannot cross a page boundary.
*/
-int vgic_access_guest_memory(struct domain *d, paddr_t gpa, void *buf,
- uint32_t size, bool is_write)
+int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
+ uint32_t size, bool is_write)
{
struct page_info *page;
- uint64_t offset = gpa & ~PAGE_MASK; /* Offset within the mapped page */
+ uint64_t offset = ipa & ~PAGE_MASK; /* Offset within the mapped page */
p2m_type_t p2mt;
void *p;

@@ -139,7 +139,7 @@ int vgic_access_guest_memory(struct domain *d, paddr_t gpa, void *buf,
return -EINVAL;
}

- page = get_page_from_gfn(d, paddr_to_pfn(gpa), &p2mt, P2M_ALLOC);
+ page = get_page_from_gfn(d, paddr_to_pfn(ipa), &p2mt, P2M_ALLOC);
if ( !page )
{
printk(XENLOG_G_ERR "d%d: vITS: Failed to get table entry\n",
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 1af6820cab..72a5c70656 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -131,9 +131,9 @@ static int its_set_collection(struct virt_its *its, uint16_t collid,
if ( collid >= its->max_collections )
return -ENOENT;

- return vgic_access_guest_memory(its->d,
- addr + collid * sizeof(coll_table_entry_t),
- &vcpu_id, sizeof(vcpu_id), true);
+ return access_guest_memory_by_ipa(its->d,
+ addr + collid * sizeof(coll_table_entry_t),
+ &vcpu_id, sizeof(vcpu_id), true);
}

/* Must be called with the ITS lock held. */
@@ -149,9 +149,9 @@ static struct vcpu *get_vcpu_from_collection(struct virt_its *its,
if ( collid >= its->max_collections )
return NULL;

- ret = vgic_access_guest_memory(its->d,
- addr + collid * sizeof(coll_table_entry_t),
- &vcpu_id, sizeof(coll_table_entry_t), false);
+ ret = access_guest_memory_by_ipa(its->d,
+ addr + collid * sizeof(coll_table_entry_t),
+ &vcpu_id, sizeof(coll_table_entry_t), false);
if ( ret )
return NULL;

@@ -171,9 +171,9 @@ static int its_set_itt_address(struct virt_its *its, uint32_t devid,
if ( devid >= its->max_devices )
return -ENOENT;

- return vgic_access_guest_memory(its->d,
- addr + devid * sizeof(dev_table_entry_t),
- &itt_entry, sizeof(itt_entry), true);
+ return access_guest_memory_by_ipa(its->d,
+ addr + devid * sizeof(dev_table_entry_t),
+ &itt_entry, sizeof(itt_entry), true);
}

/*
@@ -189,9 +189,9 @@ static int its_get_itt(struct virt_its *its, uint32_t devid,
if ( devid >= its->max_devices )
return -EINVAL;

- return vgic_access_guest_memory(its->d,
- addr + devid * sizeof(dev_table_entry_t),
- itt, sizeof(*itt), false);
+ return access_guest_memory_by_ipa(its->d,
+ addr + devid * sizeof(dev_table_entry_t),
+ itt, sizeof(*itt), false);
}

/*
@@ -236,7 +236,7 @@ static bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
if ( addr == INVALID_PADDR )
return false;

- if ( vgic_access_guest_memory(its->d, addr, &itte, sizeof(itte), false) )
+ if ( access_guest_memory_by_ipa(its->d, addr, &itte, sizeof(itte), false) )
return false;

vcpu = get_vcpu_from_collection(its, itte.collection);
@@ -270,7 +270,7 @@ static bool write_itte(struct virt_its *its, uint32_t devid,
itte.collection = collid;
itte.vlpi = vlpi;

- if ( vgic_access_guest_memory(its->d, addr, &itte, sizeof(itte), true) )
+ if ( access_guest_memory_by_ipa(its->d, addr, &itte, sizeof(itte), true) )
return false;

return true;
@@ -415,8 +415,8 @@ static int update_lpi_property(struct domain *d, struct pending_irq *p)

addr = d->arch.vgic.rdist_propbase & GENMASK(51, 12);

- ret = vgic_access_guest_memory(d, addr + p->irq - LPI_OFFSET,
- &property, sizeof(property), false);
+ ret = access_guest_memory_by_ipa(d, addr + p->irq - LPI_OFFSET,
+ &property, sizeof(property), false);
if ( ret )
return ret;

@@ -920,8 +920,8 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its)
{
int ret;

- ret = vgic_access_guest_memory(d, addr + its->creadr,
- command, sizeof(command), false);
+ ret = access_guest_memory_by_ipa(d, addr + its->creadr,
+ command, sizeof(command), false);
if ( ret )
return ret;

diff --git a/xen/include/asm-arm/guest_access.h b/xen/include/asm-arm/guest_access.h
index 49716501a4..e321c8a144 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -10,8 +10,8 @@ unsigned long raw_copy_to_guest_flush_dcache(void *to, const void *from,
unsigned long raw_copy_from_guest(void *to, const void *from, unsigned len);
unsigned long raw_clear_guest(void *to, unsigned len);

-int vgic_access_guest_memory(struct domain *d, paddr_t gpa, void *buf,
- uint32_t size, bool_t is_write);
+int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
+ uint32_t size, bool_t is_write);

#define __raw_copy_to_guest raw_copy_to_guest
#define __raw_copy_from_guest raw_copy_from_guest
--
2.13.2
Julien Grall
2017-07-17 15:43:31 UTC
Hi Sergej,
Post by Sergej Proskurin
This commit renames the function vgic_access_guest_memory to
access_guest_memory_by_ipa. As the function name suggests, the functions
expects an ipa as argument. Thus, to make the function's purpose more
s/ipa/IPA/
Post by Sergej Proskurin
clearly, we have also renamed the argument gva into ipa. All invocations
The argument is called gpa, not gva. gpa stands for "Guest Physical
Address", which is the name commonly used in Xen. IPA is the ARM naming.

So I am not convinced of the usefulness of this rename.
Post by Sergej Proskurin
of this function have been adapted accordingly.
---
---
v6: We added this patch to our patch series.
---
xen/arch/arm/guestcopy.c | 8 ++++----
xen/arch/arm/vgic-v3-its.c | 36 ++++++++++++++++++------------------
xen/include/asm-arm/guest_access.h | 4 ++--
3 files changed, 24 insertions(+), 24 deletions(-)
diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index 938ffe2668..9ea8cb79a4 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -123,11 +123,11 @@ unsigned long raw_copy_from_guest(void *to, const void __user *from, unsigned le
* Temporarily map one physical guest page and copy data to or from it.
* The data to be copied cannot cross a page boundary.
*/
-int vgic_access_guest_memory(struct domain *d, paddr_t gpa, void *buf,
- uint32_t size, bool is_write)
+int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
+ uint32_t size, bool is_write)
{
struct page_info *page;
- uint64_t offset = gpa & ~PAGE_MASK; /* Offset within the mapped page */
+ uint64_t offset = ipa & ~PAGE_MASK; /* Offset within the mapped page */
p2m_type_t p2mt;
void *p;
@@ -139,7 +139,7 @@ int vgic_access_guest_memory(struct domain *d, paddr_t gpa, void *buf,
return -EINVAL;
}
- page = get_page_from_gfn(d, paddr_to_pfn(gpa), &p2mt, P2M_ALLOC);
+ page = get_page_from_gfn(d, paddr_to_pfn(ipa), &p2mt, P2M_ALLOC);
if ( !page )
{
printk(XENLOG_G_ERR "d%d: vITS: Failed to get table entry\n",
You want to remove any mention of vITS in all the printks.

Cheers,
--
Julien Grall
Sergej Proskurin
2017-07-18 08:42:50 UTC
Hi Julien,
Post by Julien Grall
Hi Sergej,
Post by Sergej Proskurin
This commit renames the function vgic_access_guest_memory to
access_guest_memory_by_ipa. As the function name suggests, the functions
expects an ipa as argument. Thus, to make the function's purpose more
s/ipa/IPA/
Post by Sergej Proskurin
clearly, we have also renamed the argument gva into ipa. All invocations
The argument is call gpa not gva. gpa stands for "Guest Physical
Address" which is the name commonly used in Xen. IPA is the ARM naming.
Thanks. I have mistyped that one.
Post by Julien Grall
So I am not convinced of the usefulness of this rename.
I agree. As you suggested using access_guest_memory_by_ipa as the
function name, I just wanted to be consistent. How about renaming the
function to access_guest_memory_by_gpa instead? Then we would remain
consistent with Xen's naming conventions at the function level as well.
Post by Julien Grall
Post by Sergej Proskurin
of this function have been adapted accordingly.
---
---
v6: We added this patch to our patch series.
---
xen/arch/arm/guestcopy.c | 8 ++++----
xen/arch/arm/vgic-v3-its.c | 36
++++++++++++++++++------------------
xen/include/asm-arm/guest_access.h | 4 ++--
3 files changed, 24 insertions(+), 24 deletions(-)
diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index 938ffe2668..9ea8cb79a4 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -123,11 +123,11 @@ unsigned long raw_copy_from_guest(void *to,
const void __user *from, unsigned le
* Temporarily map one physical guest page and copy data to or from it.
* The data to be copied cannot cross a page boundary.
*/
-int vgic_access_guest_memory(struct domain *d, paddr_t gpa, void *buf,
- uint32_t size, bool is_write)
+int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
+ uint32_t size, bool is_write)
{
struct page_info *page;
- uint64_t offset = gpa & ~PAGE_MASK; /* Offset within the mapped page */
+ uint64_t offset = ipa & ~PAGE_MASK; /* Offset within the mapped page */
p2m_type_t p2mt;
void *p;
@@ -139,7 +139,7 @@ int vgic_access_guest_memory(struct domain *d,
paddr_t gpa, void *buf,
return -EINVAL;
}
- page = get_page_from_gfn(d, paddr_to_pfn(gpa), &p2mt, P2M_ALLOC);
+ page = get_page_from_gfn(d, paddr_to_pfn(ipa), &p2mt, P2M_ALLOC);
if ( !page )
{
printk(XENLOG_G_ERR "d%d: vITS: Failed to get table entry\n",
You want to remove any mention of vITS in all the printks.
Thanks,
~Sergej
Julien Grall
2017-07-18 10:28:33 UTC
Post by Sergej Proskurin
Hi Julien,
Post by Julien Grall
Hi Sergej,
Post by Sergej Proskurin
This commit renames the function vgic_access_guest_memory to
access_guest_memory_by_ipa. As the function name suggests, the functions
expects an ipa as argument. Thus, to make the function's purpose more
s/ipa/IPA/
Post by Sergej Proskurin
clearly, we have also renamed the argument gva into ipa. All invocations
The argument is call gpa not gva. gpa stands for "Guest Physical
Address" which is the name commonly used in Xen. IPA is the ARM naming.
Thanks. I have mistyped that one.
Post by Julien Grall
So I am not convinced of the usefulness of this rename.
I Agree. As you have suggested to use access_guest_memory_by_ipa as
function name, I just wanted to be consistent. How about renaming the
function to access_guest_memory_by_gpa instead? Then, we would remain
consistent with Xen's naming conventions also on the function level.
Either name is fine by me. I just don't see any reason to rename GPA to
IPA as both have the same meaning.

Cheers,
--
Julien Grall
Sergej Proskurin
2017-07-06 11:50:05 UTC
The following commits introduce a software implementation of the guest
page table walk that supports varying guest page size granularities.
This commit moves the already existing
PAGE_(SHIFT|SIZE|MASK|ALIGN)_(4K|64K) defines to a common place in
xen/page-defs.h and introduces corresponding defines for the 16K page
granularity, so that the following commits can use the consolidated defines.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
---
Cc: Jan Beulich <***@suse.com>
---
v6: Move in addition to PAGE_SHIFT_* also
PAGE_(SIZE|MASK|ALIGN)_(4K|64K) macros and introduce the
corresponding macros for 16K. Also, move the macros mentioned above
into xen/page-defs.h instead of xen/lib.h.
---
xen/include/xen/iommu.h | 15 +--------------
xen/include/xen/page-defs.h | 24 ++++++++++++++++++++++++
2 files changed, 25 insertions(+), 14 deletions(-)
create mode 100644 xen/include/xen/page-defs.h

diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 5803e3f95b..08f43c5daf 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -20,6 +20,7 @@
#define _IOMMU_H_

#include <xen/init.h>
+#include <xen/page-defs.h>
#include <xen/spinlock.h>
#include <xen/pci.h>
#include <public/hvm/ioreq.h>
@@ -37,20 +38,6 @@ extern bool_t amd_iommu_perdev_intremap;

extern unsigned int iommu_dev_iotlb_timeout;

-#define IOMMU_PAGE_SIZE(sz) (1UL << PAGE_SHIFT_##sz)
-#define IOMMU_PAGE_MASK(sz) (~(u64)0 << PAGE_SHIFT_##sz)
-#define IOMMU_PAGE_ALIGN(sz, addr) (((addr) + ~PAGE_MASK_##sz) & PAGE_MASK_##sz)
-
-#define PAGE_SHIFT_4K (12)
-#define PAGE_SIZE_4K IOMMU_PAGE_SIZE(4K)
-#define PAGE_MASK_4K IOMMU_PAGE_MASK(4K)
-#define PAGE_ALIGN_4K(addr) IOMMU_PAGE_ALIGN(4K, addr)
-
-#define PAGE_SHIFT_64K (16)
-#define PAGE_SIZE_64K IOMMU_PAGE_SIZE(64K)
-#define PAGE_MASK_64K IOMMU_PAGE_MASK(64K)
-#define PAGE_ALIGN_64K(addr) IOMMU_PAGE_ALIGN(64K, addr)
-
int iommu_setup(void);

int iommu_add_device(struct pci_dev *pdev);
diff --git a/xen/include/xen/page-defs.h b/xen/include/xen/page-defs.h
new file mode 100644
index 0000000000..5e2a944db2
--- /dev/null
+++ b/xen/include/xen/page-defs.h
@@ -0,0 +1,24 @@
+#ifndef __XEN_PAGE_DEFS_H__
+#define __XEN_PAGE_DEFS_H__
+
+/* Helpers for different page granularities. */
+#define PAGE_SIZE_GRAN(gran) (1UL << PAGE_SHIFT_##gran)
+#define PAGE_MASK_GRAN(gran) (~(0ULL) << PAGE_SHIFT_##gran)
+#define PAGE_ALIGN_GRAN(gran, addr) (((addr) + ~PAGE_MASK_##gran) & PAGE_MASK_##gran)
+
+#define PAGE_SHIFT_4K (12)
+#define PAGE_SIZE_4K PAGE_SIZE_GRAN(4K)
+#define PAGE_MASK_4K PAGE_MASK_GRAN(4K)
+#define PAGE_ALIGN_4K(addr) PAGE_ALIGN_GRAN(4K, addr)
+
+#define PAGE_SHIFT_16K (14)
+#define PAGE_SIZE_16K PAGE_SIZE_GRAN(16K)
+#define PAGE_MASK_16K PAGE_MASK_GRAN(16K)
+#define PAGE_ALIGN_16K(addr) PAGE_ALIGN_GRAN(16K, addr)
+
+#define PAGE_SHIFT_64K (16)
+#define PAGE_SIZE_64K PAGE_SIZE_GRAN(64K)
+#define PAGE_MASK_64K PAGE_MASK_GRAN(64K)
+#define PAGE_ALIGN_64K(addr) PAGE_ALIGN_GRAN(64K, addr)
+
+#endif /* __XEN_PAGE_DEFS_H__ */
--
2.13.2
Jan Beulich
2017-07-06 12:10:12 UTC
Post by Sergej Proskurin
--- /dev/null
+++ b/xen/include/xen/page-defs.h
@@ -0,0 +1,24 @@
+#ifndef __XEN_PAGE_DEFS_H__
+#define __XEN_PAGE_DEFS_H__
+
+/* Helpers for different page granularities. */
+#define PAGE_SIZE_GRAN(gran) (1UL << PAGE_SHIFT_##gran)
+#define PAGE_MASK_GRAN(gran) (~(0ULL) << PAGE_SHIFT_##gran)
Stray parentheses. I'm also unhappy about the type difference
between size and mask. I guess both would best be paddr_t.
That'll then also allow mask to be defined as -size. Another
alternative would be to use 1L for size, thus guaranteeing
suitable sign extension when used in contexts requiring a width
wider than long.
Post by Sergej Proskurin
+#define PAGE_ALIGN_GRAN(gran, addr) (((addr) + ~PAGE_MASK_##gran) & PAGE_MASK_##gran)
+
+#define PAGE_SHIFT_4K (12)
Stray parentheses again.

Also, with you adding a new header that'll fall under REST
maintainership, you should have Cc-ed all the REST maintainers
imo.

Jan
Sergej Proskurin
2017-07-06 14:53:37 UTC
Hi Jan,
Post by Jan Beulich
Post by Sergej Proskurin
--- /dev/null
+++ b/xen/include/xen/page-defs.h
@@ -0,0 +1,24 @@
+#ifndef __XEN_PAGE_DEFS_H__
+#define __XEN_PAGE_DEFS_H__
+
+/* Helpers for different page granularities. */
+#define PAGE_SIZE_GRAN(gran) (1UL << PAGE_SHIFT_##gran)
+#define PAGE_MASK_GRAN(gran) (~(0ULL) << PAGE_SHIFT_##gran)
Stray parentheses. I'm also unhappy about the type difference
between size and mask. I guess both would best be paddr_t.
That'll then also allow mask to be defined as -size. Another
alternative would be to use 1L for size, thus guaranteeing
suitable sign extension when used in contexts requiring a width
wider than long.
Sounds reasonable. How about using 1L for PAGE_SIZE_GRAN to ensure a
suitable sign extension for types wider than long and ~((paddr_t)0) for
PAGE_MASK_GRAN?
Post by Jan Beulich
Post by Sergej Proskurin
+#define PAGE_ALIGN_GRAN(gran, addr) (((addr) + ~PAGE_MASK_##gran) & PAGE_MASK_##gran)
+
+#define PAGE_SHIFT_4K (12)
Stray parentheses again.
Also, with you adding a new header that'll fall under REST
maintainership, you should have Cc-ed all the REST maintainers
imo.
Thank you for adding the REST maintainers in Cc. In addition, I have
included Konrad Rzeszutek Wilk and Julien Grall.

Cheers,
~Sergej
Jan Beulich
2017-07-06 15:24:09 UTC
Post by Sergej Proskurin
Post by Jan Beulich
Post by Sergej Proskurin
--- /dev/null
+++ b/xen/include/xen/page-defs.h
@@ -0,0 +1,24 @@
+#ifndef __XEN_PAGE_DEFS_H__
+#define __XEN_PAGE_DEFS_H__
+
+/* Helpers for different page granularities. */
+#define PAGE_SIZE_GRAN(gran) (1UL << PAGE_SHIFT_##gran)
+#define PAGE_MASK_GRAN(gran) (~(0ULL) << PAGE_SHIFT_##gran)
Stray parentheses. I'm also unhappy about the type difference
between size and mask. I guess both would best be paddr_t.
That'll then also allow mask to be defined as -size. Another
alternative would be to use 1L for size, thus guaranteeing
suitable sign extension when used in contexts requiring a width
wider than long.
Sounds reasonable. How about using 1L for PAGE_SIZE_GRAN to ensure a
suitable sign extension for types wider than long and ~((paddr_t)0) for
PAGE_MASK_GRAN?
Once again - I really think the two should be of identical type.
Post by Sergej Proskurin
Post by Jan Beulich
Post by Sergej Proskurin
+#define PAGE_ALIGN_GRAN(gran, addr) (((addr) + ~PAGE_MASK_##gran) & PAGE_MASK_##gran)
+
+#define PAGE_SHIFT_4K (12)
Stray parentheses again.
Also, with you adding a new header that'll fall under REST
maintainership, you should have Cc-ed all the REST maintainers
imo.
Thank you for adding the REST maintainers in Cc. In addition, I have
included Konrad Rzeszutek Wilk and Julien Grall.
Except that Konrad was Cc-ed; it is just that his list settings result in
you not seeing that he was.

Jan
Sergej Proskurin
2017-07-06 15:34:16 UTC
Hi Jan,
Post by Jan Beulich
Post by Sergej Proskurin
Post by Jan Beulich
Post by Sergej Proskurin
--- /dev/null
+++ b/xen/include/xen/page-defs.h
@@ -0,0 +1,24 @@
+#ifndef __XEN_PAGE_DEFS_H__
+#define __XEN_PAGE_DEFS_H__
+
+/* Helpers for different page granularities. */
+#define PAGE_SIZE_GRAN(gran) (1UL << PAGE_SHIFT_##gran)
+#define PAGE_MASK_GRAN(gran) (~(0ULL) << PAGE_SHIFT_##gran)
Stray parentheses. I'm also unhappy about the type difference
between size and mask. I guess both would best be paddr_t.
That'll then also allow mask to be defined as -size. Another
alternative would be to use 1L for size, thus guaranteeing
suitable sign extension when used in contexts requiring a width
wider than long.
Sounds reasonable. How about using 1L for PAGE_SIZE_GRAN to ensure a
suitable sign extension for types wider than long and ~((paddr_t)0) for
PAGE_MASK_GRAN?
Once again - I really think the two should be of identical type.
Alright, then I will use paddr_t in both cases. Thank you.
Post by Jan Beulich
Post by Sergej Proskurin
Post by Jan Beulich
Post by Sergej Proskurin
+#define PAGE_ALIGN_GRAN(gran, addr) (((addr) + ~PAGE_MASK_##gran) & PAGE_MASK_##gran)
+
+#define PAGE_SHIFT_4K (12)
Stray parentheses again.
Also, with you adding a new header that'll fall under REST
maintainership, you should have Cc-ed all the REST maintainers
imo.
Thank you for adding the REST maintainers in Cc. In addition, I have
included Konrad Rzeszutek Wilk and Julien Grall.
Except that Konrad was Cc-ed, just that his list settings result in
you not seeing that he was.
Cheers,
~Sergej
Jan Beulich
2017-07-06 16:20:46 UTC
Post by Sergej Proskurin
Post by Jan Beulich
Post by Sergej Proskurin
Post by Jan Beulich
Post by Sergej Proskurin
--- /dev/null
+++ b/xen/include/xen/page-defs.h
@@ -0,0 +1,24 @@
+#ifndef __XEN_PAGE_DEFS_H__
+#define __XEN_PAGE_DEFS_H__
+
+/* Helpers for different page granularities. */
+#define PAGE_SIZE_GRAN(gran) (1UL << PAGE_SHIFT_##gran)
+#define PAGE_MASK_GRAN(gran) (~(0ULL) << PAGE_SHIFT_##gran)
Stray parentheses. I'm also unhappy about the type difference
between size and mask. I guess both would best be paddr_t.
That'll then also allow mask to be defined as -size. Another
alternative would be to use 1L for size, thus guaranteeing
suitable sign extension when used in contexts requiring a width
wider than long.
Sounds reasonable. How about using 1L for PAGE_SIZE_GRAN to ensure a
suitable sign extension for types wider than long and ~((paddr_t)0) for
PAGE_MASK_GRAN?
Once again - I really think the two should be of identical type.
Alright, then I will use paddr_t in both cases. Thank you.
Well, considering what I've said earlier, "in both cases" is a little
suspicious - when you define mask in terms of size, there's not
going to be any explicit 2nd use of the intended type.

Jan
Sergej Proskurin
2017-07-07 08:27:53 UTC
Hi Jan,
Post by Jan Beulich
Post by Sergej Proskurin
Post by Jan Beulich
Post by Sergej Proskurin
Post by Jan Beulich
Post by Sergej Proskurin
--- /dev/null
+++ b/xen/include/xen/page-defs.h
@@ -0,0 +1,24 @@
+#ifndef __XEN_PAGE_DEFS_H__
+#define __XEN_PAGE_DEFS_H__
+
+/* Helpers for different page granularities. */
+#define PAGE_SIZE_GRAN(gran) (1UL << PAGE_SHIFT_##gran)
+#define PAGE_MASK_GRAN(gran) (~(0ULL) << PAGE_SHIFT_##gran)
Stray parentheses. I'm also unhappy about the type difference
between size and mask. I guess both would best be paddr_t.
That'll then also allow mask to be defined as -size. Another
alternative would be to use 1L for size, thus guaranteeing
suitable sign extension when used in contexts requiring a width
wider than long.
Sounds reasonable. How about using 1L for PAGE_SIZE_GRAN to ensure a
suitable sign extension for types wider than long and ~((paddr_t)0) for
PAGE_MASK_GRAN?
Once again - I really think the two should be of identical type.
Alright, then I will use paddr_t in both cases. Thank you.
Well, considering what I've said earlier, "in both cases" is a little
suspicious - when you define mask in terms of size, there's not
going to be any explicit 2nd use of the intended type.
Apparently I misunderstood you before; now I see your point. I will
gladly define MASK through -SIZE and thus avoid the type difference at
this point.

Cheers,
~Sergej
Sergej Proskurin
2017-07-06 11:50:10 UTC
We introduce the BIT_ULL macro, which operates on unsigned long long
values, to enable setting bits of 64-bit registers on AArch32. In
addition, this commit adds a define holding the register width of
64-bit double-word registers. This define simplifies using the
associated constants in the following commits.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
Reviewed-by: Julien Grall <***@arm.com>
---
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
v4: We reused the previous commit with the msg "arm/mem_access: Add
defines holding the width of 32/64bit regs" from v3, as we can reuse
the already existing define BITS_PER_WORD.

v5: Introduce a new macro BIT_ULL instead of changing the type of the
macro BIT.

Remove the define BITS_PER_DOUBLE_WORD.

v6: Add Julien Grall's Reviewed-by.
---
xen/include/asm-arm/bitops.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/xen/include/asm-arm/bitops.h b/xen/include/asm-arm/bitops.h
index bda889841b..1cbfb9edb2 100644
--- a/xen/include/asm-arm/bitops.h
+++ b/xen/include/asm-arm/bitops.h
@@ -24,6 +24,7 @@
#define BIT(nr) (1UL << (nr))
#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_WORD))
#define BIT_WORD(nr) ((nr) / BITS_PER_WORD)
+#define BIT_ULL(nr) (1ULL << (nr))
#define BITS_PER_BYTE 8

#define ADDR (*(volatile int *) addr)
--
2.13.2
Sergej Proskurin
2017-07-06 11:50:16 UTC
This commit adds functionality to walk the guest's page tables using the
short-descriptor translation table format for both ARMv7 and ARMv8. The
implementation is based on ARM DDI 0487B-a J1-6002 and ARM DDI 0406C-b
B3-1506.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
---
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
v3: Move the implementation to ./xen/arch/arm/guest_copy.c.

Use defines instead of hardcoded values.

Cosmetic fixes & added more comments.

v4: Adjusted the names of short-descriptor data-types.

Adapt the function to the new parameter of type "struct vcpu *".

Cosmetic fixes.

v5: Make use of the function vgic_access_guest_memory to read page table
entries in guest memory. At the same time, eliminate the offsets
array, as there is no need for one. Instead, we apply the
associated masks to compute the GVA offsets directly in the code.

Use GENMASK to compute complex masks to ease code readability.

Use the type uint32_t for the TTBR register.

Make use of L2DESC_{SMALL|LARGE}_PAGE_SHIFT instead of
PAGE_SHIFT_{4K|64K} macros.

Remove {L1|L2}DESC_* defines from this commit.

Add comments and cosmetic fixes.

v6: Remove the variable level from the function guest_walk_sd as it is a
left-over from previous commits and is not used anymore.

Remove the erroneously introduced code that applied the mask to the
gva using the %-operator in the L1DESC_PAGE_TABLE case. Instead, use
the &-operator, as should have been done in the first place.

Make use of renamed function access_guest_memory_by_ipa instead of
vgic_access_guest_memory.
---
xen/arch/arm/guest_walk.c | 142 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 140 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/guest_walk.c b/xen/arch/arm/guest_walk.c
index 965264a9e6..3f5d28afb7 100644
--- a/xen/arch/arm/guest_walk.c
+++ b/xen/arch/arm/guest_walk.c
@@ -19,6 +19,7 @@
#include <xen/sched.h>
#include <asm/guest_access.h>
#include <asm/guest_walk.h>
+#include <asm/short-desc.h>

/*
* The function guest_walk_sd translates a given GVA into an IPA using the
@@ -31,8 +32,145 @@ static int guest_walk_sd(const struct vcpu *v,
vaddr_t gva, paddr_t *ipa,
unsigned int *perms)
{
- /* Not implemented yet. */
- return -EFAULT;
+ int ret;
+ bool disabled = true;
+ uint32_t ttbr;
+ paddr_t mask, paddr;
+ short_desc_t pte;
+ register_t ttbcr = READ_SYSREG(TCR_EL1);
+ unsigned int n = ttbcr & TTBCR_N_MASK;
+ struct domain *d = v->domain;
+
+ mask = GENMASK_ULL(31, (32 - n));
+
+ if ( n == 0 || !(gva & mask) )
+ {
+ /*
+ * Use TTBR0 for GVA to IPA translation.
+ *
+ * Note that on AArch32, the TTBR0_EL1 register is 32-bit wide.
+ * Nevertheless, we have to use the READ_SYSREG64 macro, as it is
+ * required for reading TTBR0_EL1.
+ */
+ ttbr = READ_SYSREG64(TTBR0_EL1);
+
+ /* If TTBCR.PD0 is set, translations using TTBR0 are disabled. */
+ disabled = ttbcr & TTBCR_PD0;
+ }
+ else
+ {
+ /*
+ * Use TTBR1 for GVA to IPA translation.
+ *
+ * Note that on AArch32, the TTBR1_EL1 register is 32-bit wide.
+ * Nevertheless, we have to use the READ_SYSREG64 macro, as it is
+ * required for reading TTBR1_EL1.
+ */
+ ttbr = READ_SYSREG64(TTBR1_EL1);
+
+ /* If TTBCR.PD1 is set, translations using TTBR1 are disabled. */
+ disabled = ttbcr & TTBCR_PD1;
+
+ /*
+ * TTBR1 translation always works like n==0 TTBR0 translation (ARM DDI
+ * 0487B.a J1-6003).
+ */
+ n = 0;
+ }
+
+ if ( disabled )
+ return -EFAULT;
+
+ /*
+ * The address of the L1 descriptor for the initial lookup has the
+ * following format: [ttbr<31:14-n>:gva<31-n:20>:00] (ARM DDI 0487B.a
+ * J1-6003). Note that the following GPA computation already considers that
+ * the first level address translation might comprise up to four
+ * consecutive pages and does not need to be page-aligned if n > 2.
+ */
+ mask = GENMASK(31, (14 - n));
+ paddr = (ttbr & mask);
+
+ mask = GENMASK((31 - n), 20);
+ paddr |= (gva & mask) >> 18;
+
+ /* Access the guest's memory to read only one PTE. */
+ ret = access_guest_memory_by_ipa(d, paddr, &pte, sizeof(short_desc_t), false);
+ if ( ret )
+ return -EINVAL;
+
+ switch ( pte.walk.dt )
+ {
+ case L1DESC_INVALID:
+ return -EFAULT;
+
+ case L1DESC_PAGE_TABLE:
+ /*
+ * The address of the L2 descriptor has the following format:
+ * [l1desc<31:10>:gva<19:12>:00] (ARM DDI 0487B.a J1-6004). Note that
+ * the following address computation already considers that the second
+ * level translation table does not need to be page aligned.
+ */
+ mask = GENMASK(19, 12);
+ paddr = (pte.walk.base << 10) | ((gva & mask) >> 10);
+
+ /* Access the guest's memory to read only one PTE. */
+ ret = access_guest_memory_by_ipa(d, paddr, &pte, sizeof(short_desc_t), false);
+ if ( ret )
+ return -EINVAL;
+
+ if ( pte.walk.dt == L2DESC_INVALID )
+ return -EFAULT;
+
+ if ( pte.pg.page ) /* Small page. */
+ {
+ mask = (1ULL << L2DESC_SMALL_PAGE_SHIFT) - 1;
+ *ipa = (pte.pg.base << L2DESC_SMALL_PAGE_SHIFT) | (gva & mask);
+
+ /* Set execute permissions associated with the small page. */
+ if ( !pte.pg.xn )
+ *perms |= GV2M_EXEC;
+ }
+ else /* Large page. */
+ {
+ mask = (1ULL << L2DESC_LARGE_PAGE_SHIFT) - 1;
+ *ipa = (pte.lpg.base << L2DESC_LARGE_PAGE_SHIFT) | (gva & mask);
+
+ /* Set execute permissions associated with the large page. */
+ if ( !pte.lpg.xn )
+ *perms |= GV2M_EXEC;
+ }
+
+ /* Set permissions so that the caller can check the flags by herself. */
+ if ( !pte.pg.ro )
+ *perms |= GV2M_WRITE;
+
+ break;
+
+ case L1DESC_SECTION:
+ case L1DESC_SECTION_PXN:
+ if ( !pte.sec.supersec ) /* Section */
+ {
+ mask = (1ULL << L1DESC_SECTION_SHIFT) - 1;
+ *ipa = (pte.sec.base << L1DESC_SECTION_SHIFT) | (gva & mask);
+ }
+ else /* Supersection */
+ {
+ mask = (1ULL << L1DESC_SUPERSECTION_SHIFT) - 1;
+ *ipa = gva & mask;
+ *ipa |= (paddr_t)(pte.supersec.base) << L1DESC_SUPERSECTION_SHIFT;
+ *ipa |= (paddr_t)(pte.supersec.extbase1) << L1DESC_SUPERSECTION_EXT_BASE1_SHIFT;
+ *ipa |= (paddr_t)(pte.supersec.extbase2) << L1DESC_SUPERSECTION_EXT_BASE2_SHIFT;
+ }
+
+ /* Set permissions so that the caller can check the flags by herself. */
+ if ( !pte.sec.ro )
+ *perms |= GV2M_WRITE;
+ if ( !pte.sec.xn )
+ *perms |= GV2M_EXEC;
+ }
+
+ return 0;
}

/*
--
2.13.2
Julien Grall
2017-07-17 16:26:57 UTC
Hi Sergej,
Post by Sergej Proskurin
This commit adds functionality to walk the guest's page tables using the
short-descriptor translation table format for both ARMv7 and ARMv8. The
implementation is based on ARM DDI 0487B-a J1-6002 and ARM DDI 0406C-b
B3-1506.
Acked-by: Julien Grall <***@arm.com>

Cheers,
--
Julien Grall
Sergej Proskurin
2017-07-06 11:50:07 UTC
This commit introduces a new helper that checks whether the target PTE
holds a page mapping or not. This helper will be used as part of the
following commits.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
Reviewed-by: Julien Grall <***@arm.com>
---
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
v6: Change the name of the lpae_page helper to lpae_is_page.

Add Julien Grall's Reviewed-by.
---
xen/include/asm-arm/lpae.h | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
index f0b3d21aa7..bad401baea 100644
--- a/xen/include/asm-arm/lpae.h
+++ b/xen/include/asm-arm/lpae.h
@@ -153,6 +153,11 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
return (level < 3) && lpae_mapping(pte);
}

+static inline bool lpae_is_page(lpae_t pte, unsigned int level)
+{
+ return (level == 3) && lpae_valid(pte) && pte.walk.table;
+}
+
/*
* The ARMv8 architecture supports pages with different sizes (4K, 16K, and
* 64K). To enable page table walks for various configurations, the following
--
2.13.2
Sergej Proskurin
2017-07-06 11:50:11 UTC
The current implementation of GENMASK is capable of creating bitmasks of
32-bit values on AArch32 and 64-bit values on AArch64. As we need to
create masks for 64-bit values on AArch32 as well, in this commit we
introduce the GENMASK_ULL bit operation. Please note that the
GENMASK_ULL implementation has been lifted from the Linux kernel source
code.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
---
Cc: Andrew Cooper <***@citrix.com>
Cc: George Dunlap <***@eu.citrix.com>
Cc: Ian Jackson <***@eu.citrix.com>
Cc: Jan Beulich <***@suse.com>
Cc: Julien Grall <***@arm.com>
Cc: Konrad Rzeszutek Wilk <***@oracle.com>
Cc: Stefano Stabellini <***@kernel.org>
Cc: Tim Deegan <***@xen.org>
Cc: Wei Liu <***@citrix.com>
---
v6: As similar patches have been already submitted and NACKED in the
past, we resubmit this patch with 'THE REST' maintainers in Cc to
discuss whether this patch shall be applied into common or put into
ARM related code.
---
xen/include/asm-arm/config.h | 2 ++
xen/include/xen/bitops.h | 5 ++++-
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
index 5b6f3c985d..7fa412f1b1 100644
--- a/xen/include/asm-arm/config.h
+++ b/xen/include/asm-arm/config.h
@@ -19,6 +19,8 @@
#define BITS_PER_LONG (BYTES_PER_LONG << 3)
#define POINTER_ALIGN BYTES_PER_LONG

+#define BITS_PER_LONG_LONG 64
+
/* xen_ulong_t is always 64 bits */
#define BITS_PER_XEN_ULONG 64

diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h
index bd0883ab22..3d85164bba 100644
--- a/xen/include/xen/bitops.h
+++ b/xen/include/xen/bitops.h
@@ -10,6 +10,9 @@
#define GENMASK(h, l) \
(((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))

+#define GENMASK_ULL(h, l) \
+ (((~0ULL) << (l)) & (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h))))
+
/*
* ffs: find first bit set. This is defined the same way as
* the libc and compiler builtin ffs routines, therefore
@@ -128,7 +131,7 @@ static inline int generic_fls64(__u64 x)
static __inline__ int get_bitmask_order(unsigned int count)
{
int order;
-
+
order = fls(count);
return order; /* We could be slightly more clever with -1 here... */
}
--
2.13.2
Jan Beulich
2017-07-06 12:18:35 UTC
Post by Sergej Proskurin
--- a/xen/include/asm-arm/config.h
+++ b/xen/include/asm-arm/config.h
@@ -19,6 +19,8 @@
#define BITS_PER_LONG (BYTES_PER_LONG << 3)
#define POINTER_ALIGN BYTES_PER_LONG
+#define BITS_PER_LONG_LONG 64
If we really want to be prepared for architectures where long long
is other than 64 bits wide, you'll have to also add this to x86.
Otherwise it's not needed here either.

Also how about BITS_PER_LLONG?
Post by Sergej Proskurin
@@ -128,7 +131,7 @@ static inline int generic_fls64(__u64 x)
static __inline__ int get_bitmask_order(unsigned int count)
{
int order;
-
+
order = fls(count);
return order; /* We could be slightly more clever with -1 here... */
}
If you really want to do cleanup here, please shrink the function
body to a single return statement. But then again I'm unconvinced
the function is actually correct (which could easily be the case for
an unused one), in particular for power-of-2 counts. Nor can I see
how this would be useful with anything more narrow than size_t
or unsigned long as parameter type.

Jan
Sergej Proskurin
2017-07-06 14:38:03 UTC
Hi Jan,
Post by Jan Beulich
Post by Sergej Proskurin
--- a/xen/include/asm-arm/config.h
+++ b/xen/include/asm-arm/config.h
@@ -19,6 +19,8 @@
#define BITS_PER_LONG (BYTES_PER_LONG << 3)
#define POINTER_ALIGN BYTES_PER_LONG
+#define BITS_PER_LONG_LONG 64
If we really want to be prepared for architectures where long long
is other than 64 bits wide, you'll have to also add this to x86.
Otherwise it's not needed here either.
Absolutely. I kinda got lost in the ARM world. Of course, I can (will)
do that as part of the next patch series. Thank you.
Post by Jan Beulich
Also how about BITS_PER_LLONG?
I am also fine with BITS_PER_LLONG. We used BITS_PER_LONG_LONG as it was
lifted from the Linux kernel (and suggested by Julien). It would be
great to know the opinion of the remaining maintainers about that.
Post by Jan Beulich
Post by Sergej Proskurin
@@ -128,7 +131,7 @@ static inline int generic_fls64(__u64 x)
static __inline__ int get_bitmask_order(unsigned int count)
{
int order;
-
+
order = fls(count);
return order; /* We could be slightly more clever with -1 here... */
}
If you really want to do cleanup here, please shrink the function
body to a single return statement. But then again I'm unconvinced
the function is actually correct (which could easily be the case for
an unused one), in particular for power-of-2 counts. Nor can I see
how this would be useful with anything more narrow than size_t
or unsigned long as parameter type.
Jan
Right. That whitespace elimination was actually unintended. However, if
you wish, I could do the cleanup in a separate patch.

Concerning the correctness of this function: I am not sure whether this
would be the right patch series to address that.

Thank you,
~Sergej
Jan Beulich
2017-07-06 15:22:42 UTC
Post by Sergej Proskurin
Post by Jan Beulich
Post by Sergej Proskurin
@@ -128,7 +131,7 @@ static inline int generic_fls64(__u64 x)
static __inline__ int get_bitmask_order(unsigned int count)
{
int order;
-
+
order = fls(count);
return order; /* We could be slightly more clever with -1 here... */
}
If you really want to do cleanup here, please shrink the function
body to a single return statement. But then again I'm unconvinced
the function is actually correct (which could easily be the case for
an unused one), in particular for power-of-2 counts. Nor can I see
how this would be useful with anything more narrow than size_t
or unsigned long as parameter type.
Right. That whitespace elimination was actually unintended. However, if
you wish, I could do the cleanup in a separate patch.
Concerning the correctness of this function: I am not sure whether this
would be the right patch series to address that.
In that case simply leave out the cleanup part?

Jan
Sergej Proskurin
2017-07-06 15:34:46 UTC
Hi Jan,
Post by Jan Beulich
Post by Sergej Proskurin
Post by Jan Beulich
Post by Sergej Proskurin
@@ -128,7 +131,7 @@ static inline int generic_fls64(__u64 x)
static __inline__ int get_bitmask_order(unsigned int count)
{
int order;
-
+
order = fls(count);
return order; /* We could be slightly more clever with -1 here... */
}
If you really want to do cleanup here, please shrink the function
body to a single return statement. But then again I'm unconvinced
the function is actually correct (which could easily be the case for
an unused one), in particular for power-of-2 counts. Nor can I see
how this would be useful with anything more narrow than size_t
or unsigned long as parameter type.
Right. That whitespace elimination was actually unintended. However, if
you wish, I could do the cleanup in a separate patch.
Concerning the correctness of this function: I am not sure whether this
would be the right patch series to address that.
In that case simply leave out the cleanup part?
Yeap, will do.

Cheers,
~Sergej
Sergej Proskurin
2017-07-06 11:50:15 UTC
This commit adds functionality to walk the guest's page tables using the
long-descriptor translation table format for both ARMv7 and ARMv8.
Similar to the hardware architecture, the implementation supports
different page granularities (4K, 16K, and 64K). The implementation is
based on ARM DDI 0487B.a J1-5922, J1-5999, and ARM DDI 0406C.b B3-1510.

Note that the current implementation lacks support for Large VA/PA on
ARMv8.2 architectures (LVA/LPA, 52-bit virtual and physical address
sizes). The associated location in the code is marked appropriately.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
---
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
v2: Use TCR_SZ_MASK instead of TTBCR_SZ_MASK for ARM 32-bit guests using
the long-descriptor translation table format.

Cosmetic fixes.

v3: Move the implementation to ./xen/arch/arm/guest_copy.c.

Remove the array strides and declare the array grainsizes as static
const instead of just const to reduce the function stack overhead.

Move parts of the function guest_walk_ld into the static functions
get_ttbr_and_gran_64bit and get_top_bit to reduce complexity.

Use the macro BIT(x) instead of (1UL << x).

Add more comments && Cosmetic fixes.

v4: Move functionality responsible for determining the configured IPA
output-size into a separate function get_ipa_output_size. In this
function, we remove the previously used switch statement, which was
responsible for distinguishing between different IPA output-sizes.
Instead, we retrieve the information from the introduced ipa_sizes
array.

Remove the defines GRANULE_SIZE_INDEX_* and TTBR0_VALID from
guest_walk.h. Instead, introduce the enums granule_size_index
active_ttbr directly inside of guest_walk.c so that the associated
fields don't get exported.

Adapt the function to the new parameter of type "struct vcpu *".

Remove support for 52bit IPA output-sizes entirely from this commit.

Use lpae_* helpers instead of p2m_* helpers.

Cosmetic fixes & Additional comments.

v5: Make use of the function vgic_access_guest_memory to read page table
entries in guest memory.

Invert the indices of the arrays "offsets" and "masks" and simplify
readability by using an appropriate macro for the entries.

Remove remaining CONFIG_ARM_64 #ifdefs.

Remove the use of the macros BITS_PER_WORD and BITS_PER_DOUBLE_WORD.

Use GENMASK_ULL instead of manually creating complex masks to ease
readability.

Also, create a macro CHECK_BASE_SIZE which simply reduces the code
size and simplifies readability.

Make use of the newly introduced lpae_page macro in the if-statement
to test for invalid/reserved mappings in the L3 PTE.

Cosmetic fixes and additional comments.

v6: Convert the macro CHECK_BASE_SIZE into a helper function
check_base_size. The use of the old CHECK_BASE_SIZE was confusing as
it affected the control-flow through a return as part of the macro.

Return the value -EFAULT instead of -EINVAL if access to the guest's
memory fails.

Simplify the check in the end of the table walk that ensures that
the found PTE is a page or a superpage. The new implementation
checks if the pte maps a valid page or a superpage and returns an
-EFAULT only if both conditions are not true.

Adjust the type of the array offsets to paddr_t instead of vaddr_t
to allow working with the changed *_table_offset_* helpers, which
return offsets of type paddr_t.

Make use of renamed function access_guest_memory_by_ipa instead of
vgic_access_guest_memory.
---
xen/arch/arm/guest_walk.c | 396 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 394 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/guest_walk.c b/xen/arch/arm/guest_walk.c
index 78badc2949..965264a9e6 100644
--- a/xen/arch/arm/guest_walk.c
+++ b/xen/arch/arm/guest_walk.c
@@ -15,7 +15,10 @@
* this program; If not, see <http://www.gnu.org/licenses/>.
*/

+#include <xen/domain_page.h>
#include <xen/sched.h>
+#include <asm/guest_access.h>
+#include <asm/guest_walk.h>

/*
* The function guest_walk_sd translates a given GVA into an IPA using the
@@ -33,6 +36,174 @@ static int guest_walk_sd(const struct vcpu *v,
}

/*
+ * Get the IPA output_size (configured in TCR_EL1) that shall be used for the
+ * long-descriptor based translation table walk.
+ */
+static int get_ipa_output_size(struct domain *d, register_t tcr,
+ unsigned int *output_size)
+{
+ unsigned int ips;
+
+ static const unsigned int ipa_sizes[7] = {
+ TCR_EL1_IPS_32_BIT_VAL,
+ TCR_EL1_IPS_36_BIT_VAL,
+ TCR_EL1_IPS_40_BIT_VAL,
+ TCR_EL1_IPS_42_BIT_VAL,
+ TCR_EL1_IPS_44_BIT_VAL,
+ TCR_EL1_IPS_48_BIT_VAL,
+ TCR_EL1_IPS_52_BIT_VAL
+ };
+
+ if ( is_64bit_domain(d) )
+ {
+ /* Get the intermediate physical address size. */
+ ips = (tcr & TCR_EL1_IPS_MASK) >> TCR_EL1_IPS_SHIFT;
+
+ /*
+ * Return an error on reserved IPA output-sizes and if the IPA
+ * output-size is 52bit.
+ *
+ * XXX: 52 bit output-size is not supported yet.
+ */
+ if ( ips > TCR_EL1_IPS_48_BIT )
+ return -EFAULT;
+
+ *output_size = ipa_sizes[ips];
+ }
+ else
+ *output_size = TCR_EL1_IPS_40_BIT_VAL;
+
+ return 0;
+}
+
+/* Normalized page granule size indices. */
+enum granule_size_index {
+ GRANULE_SIZE_INDEX_4K,
+ GRANULE_SIZE_INDEX_16K,
+ GRANULE_SIZE_INDEX_64K
+};
+
+/* Represent whether TTBR0 or TTBR1 is active. */
+enum active_ttbr {
+ TTBR0_ACTIVE,
+ TTBR1_ACTIVE
+};
+
+/*
+ * Select the TTBR(0|1)_EL1 that will be used for address translation using the
+ * long-descriptor translation table format and return the page granularity
+ * that is used by the selected TTBR. Please note that the TCR.TG0 and TCR.TG1
+ * encodings differ.
+ */
+static bool get_ttbr_and_gran_64bit(uint64_t *ttbr, unsigned int *gran,
+ register_t tcr, enum active_ttbr ttbrx)
+{
+ bool disabled;
+
+ if ( ttbrx == TTBR0_ACTIVE )
+ {
+ /* Normalize granule size. */
+ switch ( tcr & TCR_TG0_MASK )
+ {
+ case TCR_TG0_16K:
+ *gran = GRANULE_SIZE_INDEX_16K;
+ break;
+ case TCR_TG0_64K:
+ *gran = GRANULE_SIZE_INDEX_64K;
+ break;
+ default:
+ /*
+ * According to ARM DDI 0487B.a D7-2487, if the TCR_EL1.TG0 value
+ * is programmed to either a reserved value, or a size that has not
+ * been implemented, then the hardware will treat the field as if
+ * it has been programmed to an IMPLEMENTATION DEFINED choice.
+ *
+ * This implementation strongly follows the pseudo-code
+ * implementation from ARM DDI 0487B.a J1-5924 which suggests to
+ * fall back to 4K by default.
+ */
+ *gran = GRANULE_SIZE_INDEX_4K;
+ }
+
+ /* Use TTBR0 for GVA to IPA translation. */
+ *ttbr = READ_SYSREG64(TTBR0_EL1);
+
+ /* If TCR.EPD0 is set, translations using TTBR0 are disabled. */
+ disabled = tcr & TCR_EPD0;
+ }
+ else
+ {
+ /* Normalize granule size. */
+ switch ( tcr & TCR_EL1_TG1_MASK )
+ {
+ case TCR_EL1_TG1_16K:
+ *gran = GRANULE_SIZE_INDEX_16K;
+ break;
+ case TCR_EL1_TG1_64K:
+ *gran = GRANULE_SIZE_INDEX_64K;
+ break;
+ default:
+ /*
+ * According to ARM DDI 0487B.a D7-2486, if the TCR_EL1.TG1 value
+ * is programmed to either a reserved value, or a size that has not
+ * been implemented, then the hardware will treat the field as if
+ * it has been programmed to an IMPLEMENTATION DEFINED choice.
+ *
+ * This implementation strongly follows the pseudo-code
+ * implementation from ARM DDI 0487B.a J1-5924 which suggests to
+ * fall back to 4K by default.
+ */
+ *gran = GRANULE_SIZE_INDEX_4K;
+ }
+
+ /* Use TTBR1 for GVA to IPA translation. */
+ *ttbr = READ_SYSREG64(TTBR1_EL1);
+
+ /* If TCR.EPD1 is set, translations using TTBR1 are disabled. */
+ disabled = tcr & TCR_EPD1;
+ }
+
+ return disabled;
+}
+
+/*
+ * Get the MSB number of the GVA, according to "AddrTop" pseudocode
+ * implementation in ARM DDI 0487B.a J1-6066.
+ */
+static unsigned int get_top_bit(struct domain *d, vaddr_t gva, register_t tcr)
+{
+ unsigned int topbit;
+
+ /*
+ * IF EL1 is using AArch64 then addresses from EL0 using AArch32 are
+ * zero-extended to 64 bits (ARM DDI 0487B.a J1-6066).
+ */
+ if ( is_32bit_domain(d) )
+ topbit = 31;
+ else if ( is_64bit_domain(d) )
+ {
+ if ( ((gva & BIT_ULL(55)) && (tcr & TCR_EL1_TBI1)) ||
+ (!(gva & BIT_ULL(55)) && (tcr & TCR_EL1_TBI0)) )
+ topbit = 55;
+ else
+ topbit = 63;
+ }
+
+ return topbit;
+}
+
+/* Make sure the base address does not exceed its configured size. */
+static int check_base_size(unsigned int output_size, uint64_t base)
+{
+ paddr_t mask = GENMASK_ULL((TCR_EL1_IPS_48_BIT_VAL - 1), output_size);
+
+ if ( (output_size < TCR_EL1_IPS_48_BIT_VAL) && (base & mask) )
+ return -EFAULT;
+
+ return 0;
+}
+
+/*
* The function guest_walk_ld translates a given GVA into an IPA using the
* long-descriptor translation table format in software. This function assumes
* that the domain is running on the currently active vCPU. To walk the guest's
@@ -43,8 +214,229 @@ static int guest_walk_ld(const struct vcpu *v,
vaddr_t gva, paddr_t *ipa,
unsigned int *perms)
{
- /* Not implemented yet. */
- return -EFAULT;
+ int ret;
+ bool disabled = true;
+ bool ro_table = false, xn_table = false;
+ unsigned int t0_sz, t1_sz;
+ unsigned int level, gran;
+ unsigned int topbit = 0, input_size = 0, output_size;
+ uint64_t ttbr = 0;
+ paddr_t mask, paddr;
+ lpae_t pte;
+ register_t tcr = READ_SYSREG(TCR_EL1);
+ struct domain *d = v->domain;
+
+#define OFFSETS(gva, gran) \
+{ \
+ zeroeth_table_offset_##gran(gva), \
+ first_table_offset_##gran(gva), \
+ second_table_offset_##gran(gva), \
+ third_table_offset_##gran(gva) \
+}
+
+ const paddr_t offsets[3][4] = {
+ OFFSETS(gva, 4K),
+ OFFSETS(gva, 16K),
+ OFFSETS(gva, 64K)
+ };
+
+#undef OFFSETS
+
+#define MASKS(gran) \
+{ \
+ zeroeth_size(gran) - 1, \
+ first_size(gran) - 1, \
+ second_size(gran) - 1, \
+ third_size(gran) - 1 \
+}
+
+ static const paddr_t masks[3][4] = {
+ MASKS(4K),
+ MASKS(16K),
+ MASKS(64K)
+ };
+
+#undef MASKS
+
+ static const unsigned int grainsizes[3] = {
+ PAGE_SHIFT_4K,
+ PAGE_SHIFT_16K,
+ PAGE_SHIFT_64K
+ };
+
+ t0_sz = (tcr >> TCR_T0SZ_SHIFT) & TCR_SZ_MASK;
+ t1_sz = (tcr >> TCR_T1SZ_SHIFT) & TCR_SZ_MASK;
+
+ /* Get the MSB number of the GVA. */
+ topbit = get_top_bit(d, gva, tcr);
+
+ if ( is_64bit_domain(d) )
+ {
+ /* Select the TTBR(0|1)_EL1 that will be used for address translation. */
+
+ if ( (gva & BIT_ULL(topbit)) == 0 )
+ {
+ input_size = 64 - t0_sz;
+
+ /* Get TTBR0 and configured page granularity. */
+ disabled = get_ttbr_and_gran_64bit(&ttbr, &gran, tcr, TTBR0_ACTIVE);
+ }
+ else
+ {
+ input_size = 64 - t1_sz;
+
+ /* Get TTBR1 and configured page granularity. */
+ disabled = get_ttbr_and_gran_64bit(&ttbr, &gran, tcr, TTBR1_ACTIVE);
+ }
+
+ /*
+ * The current implementation supports intermediate physical address
+ * sizes (IPS) up to 48 bit.
+ *
+ * XXX: Determine whether the IPS_MAX_VAL is 48 or 52 in software.
+ */
+ if ( (input_size > TCR_EL1_IPS_48_BIT_VAL) ||
+ (input_size < TCR_EL1_IPS_MIN_VAL) )
+ return -EFAULT;
+ }
+ else
+ {
+ /* Granule size of AArch32 architectures is always 4K. */
+ gran = GRANULE_SIZE_INDEX_4K;
+
+ /* Select the TTBR(0|1)_EL1 that will be used for address translation. */
+
+ /*
+ * Check if the bits <31:32-t0_sz> of the GVA are set to 0 (DDI 0487B.a
+ * J1-5999). If so, TTBR0 shall be used for address translation.
+ */
+ mask = GENMASK_ULL(31, (32 - t0_sz));
+
+ if ( t0_sz == 0 || !(gva & mask) )
+ {
+ input_size = 32 - t0_sz;
+
+ /* Use TTBR0 for GVA to IPA translation. */
+ ttbr = READ_SYSREG64(TTBR0_EL1);
+
+ /* If TCR.EPD0 is set, translations using TTBR0 are disabled. */
+ disabled = tcr & TCR_EPD0;
+ }
+
+ /*
+ * Check if the bits <31:32-t1_sz> of the GVA are set to 1 (DDI 0487B.a
+ * J1-6000). If so, TTBR1 shall be used for address translation.
+ */
+ mask = GENMASK_ULL(31, (32 - t1_sz));
+
+ if ( ((t1_sz == 0) && !ttbr) || (t1_sz && (gva & mask) == mask) )
+ {
+ input_size = 32 - t1_sz;
+
+ /* Use TTBR1 for GVA to IPA translation. */
+ ttbr = READ_SYSREG64(TTBR1_EL1);
+
+ /* If TCR.EPD1 is set, translations using TTBR1 are disabled. */
+ disabled = tcr & TCR_EPD1;
+ }
+ }
+
+ if ( disabled )
+ return -EFAULT;
+
+ /*
+ * The starting level is the number of strides (grainsizes[gran] - 3)
+ * needed to consume the input address (ARM DDI 0487B.a J1-5924).
+ */
+ level = 4 - DIV_ROUND_UP((input_size - grainsizes[gran]), (grainsizes[gran] - 3));
+
+ /* Get the IPA output_size. */
+ ret = get_ipa_output_size(d, tcr, &output_size);
+ if ( ret )
+ return -EFAULT;
+
+ /* Make sure the base address does not exceed its configured size. */
+ ret = check_base_size(output_size, ttbr);
+ if ( ret )
+ return -EFAULT;
+
+ /*
+ * Compute the base address of the first level translation table that is
+ * given by TTBRx_EL1 (ARM DDI 0487B.a D4-2024 and J1-5926).
+ */
+ mask = GENMASK_ULL(47, grainsizes[gran]);
+ paddr = (ttbr & mask);
+
+ for ( ; ; level++ )
+ {
+ /*
+ * Add offset given by the GVA to the translation table base address.
+ * Shift the offset by 3 as it is 8-byte aligned.
+ */
+ paddr |= offsets[gran][level] << 3;
+
+ /* Access the guest's memory to read only one PTE. */
+ ret = access_guest_memory_by_ipa(d, paddr, &pte, sizeof(lpae_t), false);
+ if ( ret )
+ return -EFAULT;
+
+ /* Make sure the base address does not exceed its configured size. */
+ ret = check_base_size(output_size, pfn_to_paddr(pte.walk.base));
+ if ( ret )
+ return -EFAULT;
+
+ /*
+ * If page granularity is 64K, make sure the address is aligned
+ * appropriately.
+ */
+ if ( (output_size < TCR_EL1_IPS_52_BIT_VAL) &&
+ (gran == GRANULE_SIZE_INDEX_64K) &&
+ (pte.walk.base & 0xf) )
+ return -EFAULT;
+
+ /*
+ * Break if one of the following conditions is true:
+ *
+ * - We have found the PTE holding the IPA (level == 3).
+ * - The PTE is not valid.
+ * - If (level < 3) and the PTE is valid, we found a block descriptor.
+ */
+ if ( level == 3 || !lpae_valid(pte) || lpae_is_superpage(pte, level) )
+ break;
+
+ /*
+ * Temporarily store permissions of the table descriptor as they are
+ * inherited by page table attributes (ARM DDI 0487B.a J1-5928).
+ */
+ xn_table |= pte.pt.xnt; /* Execute-Never */
+ ro_table |= pte.pt.apt & BIT(1); /* Read-Only */
+
+ /* Compute the base address of the next level translation table. */
+ mask = GENMASK_ULL(47, grainsizes[gran]);
+ paddr = pfn_to_paddr(pte.walk.base) & mask;
+ }
+
+ /*
+ * According to ARM DDI 0487B.a J1-5927, we return an error if the found
+ * PTE is invalid or holds a reserved entry (PTE<1:0> == x0)) or if the PTE
+ * maps a memory block at level 3 (PTE<1:0> == 01).
+ */
+ if ( !lpae_is_page(pte, level) && !lpae_is_superpage(pte, level) )
+ return -EFAULT;
+
+ *ipa = pfn_to_paddr(pte.walk.base) | (gva & masks[gran][level]);
+
+ /*
+ * Set permissions so that the caller can check the flags by herself. Note
+ * that stage 1 translations also inherit attributes from the tables
+ * (ARM DDI 0487B.a J1-5928).
+ */
+ if ( !pte.pt.ro && !ro_table )
+ *perms |= GV2M_WRITE;
+ if ( !pte.pt.xn && !xn_table )
+ *perms |= GV2M_EXEC;
+
+ return 0;
}

int guest_walk_tables(const struct vcpu *v, vaddr_t gva,
--
2.13.2


_______________________________________________
Xen-devel mailing list
Julien Grall
2017-07-17 16:18:04 UTC
Hi Sergej,
Post by Sergej Proskurin
+/*
+ * Get the MSB number of the GVA, according to "AddrTop" pseudocode
+ * implementation in ARM DDI 0487B.a J1-6066.
+ */
+static unsigned int get_top_bit(struct domain *d, vaddr_t gva, register_t tcr)
+{
+ unsigned int topbit;
+
+ /*
+ * IF EL1 is using AArch64 then addresses from EL0 using AArch32 are
NIT: s/IF/If/
Post by Sergej Proskurin
+ * zero-extended to 64 bits (ARM DDI 0487B.a J1-6066).
+ */
+ if ( is_32bit_domain(d) )
+ topbit = 31;
+ else if ( is_64bit_domain(d) )
+ {
+ if ( ((gva & BIT_ULL(55)) && (tcr & TCR_EL1_TBI1)) ||
+ (!(gva & BIT_ULL(55)) && (tcr & TCR_EL1_TBI0)) )
+ topbit = 55;
+ else
+ topbit = 63;
+ }
+
+ return topbit;
+}
+
+/* Make sure the base address does not exceed its configured size. */
+static int check_base_size(unsigned int output_size, uint64_t base)
+{
+ paddr_t mask = GENMASK_ULL((TCR_EL1_IPS_48_BIT_VAL - 1), output_size);
+
+ if ( (output_size < TCR_EL1_IPS_48_BIT_VAL) && (base & mask) )
+ return -EFAULT;
+
+ return 0;
This function only return 0 or -EFAULT and the caller doesn't care of
the exact value. I would prefer if you return a boolean here.

[...]
Post by Sergej Proskurin
+ /*
+ * According to ARM DDI 0487B.a J1-5927, we return an error if the found
+ * PTE is invalid or holds a reserved entry (PTE<1:0> == x0)) or if the PTE
+ * maps a memory block at level 3 (PTE<1:0> == 01).
+ */
+ if ( !lpae_is_page(pte, level) && !lpae_is_superpage(pte, level) )
+ return -EFAULT;
+
+ *ipa = pfn_to_paddr(pte.walk.base) | (gva & masks[gran][level]);
I haven't noticed it until now. When using 16KB and 64KB, you rely on
the bottom bits to be zeroed. Although, the guest could purposefully put
wrong value here. So you want to mask it as you do just above.

Furthermore, as other part of the Xen ARM you rely on the page size of
Xen to always be 4KB. This is not really true and this code will break
as soon as we introduce 16KB/64KB page granularity support in Xen. I
will have a look on what to do here. No need to worry about that for now.
Post by Sergej Proskurin
+
+ /*
+ * Set permissions so that the caller can check the flags by herself. Note
+ * that stage 1 translations also inherit attributes from the tables
+ * (ARM DDI 0487B.a J1-5928).
+ */
+ if ( !pte.pt.ro && !ro_table )
+ *perms |= GV2M_WRITE;
+ if ( !pte.pt.xn && !xn_table )
+ *perms |= GV2M_EXEC;
+
+ return 0;
}
int guest_walk_tables(const struct vcpu *v, vaddr_t gva,
Cheers,
--
Julien Grall
Sergej Proskurin
2017-07-06 11:50:09 UTC
We extend the current implementation by an additional permission,
GV2M_EXEC, which will be used to describe execute permissions of PTE's
as part of our guest translation table walk implementation.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
Acked-by: Julien Grall <***@arm.com>
---
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
xen/include/asm-arm/page.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index cef2f28914..b8d641bfaf 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -90,6 +90,7 @@
/* Flags for get_page_from_gva, gvirt_to_maddr etc */
#define GV2M_READ (0u<<0)
#define GV2M_WRITE (1u<<0)
+#define GV2M_EXEC (1u<<1)

#ifndef __ASSEMBLY__
--
2.13.2
Sergej Proskurin
2017-07-06 11:50:08 UTC
The current implementation does not provide appropriate types for
short-descriptor translation table entries. As such, this commit adds new
types, which simplify managing the respective translation table entries.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
Acked-by: Julien Grall <***@arm.com>
---
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
v3: Add more short-descriptor related pte typedefs that will be used by
the following commits.

v4: Move short-descriptor pte typedefs out of page.h into short-desc.h.

Change the type unsigned int to bool of every bitfield in
short-descriptor related data-structures that holds only one bit.

Change the typedef names from pte_sd_* to short_desc_*.

v5: Add {L1|L2}DESC_* defines to this commit.

v6: Add Julien Grall's Acked-by.
---
xen/include/asm-arm/short-desc.h | 130 +++++++++++++++++++++++++++++++++++++++
1 file changed, 130 insertions(+)
create mode 100644 xen/include/asm-arm/short-desc.h

diff --git a/xen/include/asm-arm/short-desc.h b/xen/include/asm-arm/short-desc.h
new file mode 100644
index 0000000000..9652a103c4
--- /dev/null
+++ b/xen/include/asm-arm/short-desc.h
@@ -0,0 +1,130 @@
+#ifndef __ARM_SHORT_DESC_H__
+#define __ARM_SHORT_DESC_H__
+
+/*
+ * First level translation table descriptor types used by the AArch32
+ * short-descriptor translation table format.
+ */
+#define L1DESC_INVALID (0)
+#define L1DESC_PAGE_TABLE (1)
+#define L1DESC_SECTION (2)
+#define L1DESC_SECTION_PXN (3)
+
+/* Defines for section and supersection shifts. */
+#define L1DESC_SECTION_SHIFT (20)
+#define L1DESC_SUPERSECTION_SHIFT (24)
+#define L1DESC_SUPERSECTION_EXT_BASE1_SHIFT (32)
+#define L1DESC_SUPERSECTION_EXT_BASE2_SHIFT (36)
+
+/* Second level translation table descriptor types. */
+#define L2DESC_INVALID (0)
+
+/* Defines for small (4K) and large page (64K) shifts. */
+#define L2DESC_SMALL_PAGE_SHIFT (12)
+#define L2DESC_LARGE_PAGE_SHIFT (16)
+
+/*
+ * Comprises bits of the level 1 short-descriptor format representing
+ * a section.
+ */
+typedef struct __packed {
+ bool pxn:1; /* Privileged Execute Never */
+ bool sec:1; /* == 1 if section or supersection */
+ bool b:1; /* Bufferable */
+ bool c:1; /* Cacheable */
+ bool xn:1; /* Execute Never */
+ unsigned int dom:4; /* Domain field */
+ bool impl:1; /* Implementation defined */
+ unsigned int ap:2; /* AP[1:0] */
+ unsigned int tex:3; /* TEX[2:0] */
+ bool ro:1; /* AP[2] */
+ bool s:1; /* Shareable */
+ bool ng:1; /* Non-global */
+ bool supersec:1; /* Must be 0 for sections */
+ bool ns:1; /* Non-secure */
+ unsigned int base:12; /* Section base address */
+} short_desc_l1_sec_t;
+
+/*
+ * Comprises bits of the level 1 short-descriptor format representing
+ * a supersection.
+ */
+typedef struct __packed {
+ bool pxn:1; /* Privileged Execute Never */
+ bool sec:1; /* == 1 if section or supersection */
+ bool b:1; /* Bufferable */
+ bool c:1; /* Cacheable */
+ bool xn:1; /* Execute Never */
+ unsigned int extbase2:4; /* Extended base address, PA[39:36] */
+ bool impl:1; /* Implementation defined */
+ unsigned int ap:2; /* AP[1:0] */
+ unsigned int tex:3; /* TEX[2:0] */
+ bool ro:1; /* AP[2] */
+ bool s:1; /* Shareable */
+ bool ng:1; /* Non-global */
+ bool supersec:1; /* Must be 1 for supersections */
+ bool ns:1; /* Non-secure */
+ unsigned int extbase1:4; /* Extended base address, PA[35:32] */
+ unsigned int base:8; /* Supersection base address */
+} short_desc_l1_supersec_t;
+
+/*
+ * Comprises bits of the level 2 short-descriptor format representing
+ * a small page.
+ */
+typedef struct __packed {
+ bool xn:1; /* Execute Never */
+ bool page:1; /* ==1 if small page */
+ bool b:1; /* Bufferable */
+ bool c:1; /* Cacheable */
+ unsigned int ap:2; /* AP[1:0] */
+ unsigned int tex:3; /* TEX[2:0] */
+ bool ro:1; /* AP[2] */
+ bool s:1; /* Shareable */
+ bool ng:1; /* Non-global */
+ unsigned int base:20; /* Small page base address */
+} short_desc_l2_page_t;
+
+/*
+ * Comprises bits of the level 2 short-descriptor format representing
+ * a large page.
+ */
+typedef struct __packed {
+ bool lpage:1; /* ==1 if large page */
+ bool page:1; /* ==0 if large page */
+ bool b:1; /* Bufferable */
+ bool c:1; /* Cacheable */
+ unsigned int ap:2; /* AP[1:0] */
+ unsigned int sbz:3; /* Should be zero */
+ bool ro:1; /* AP[2] */
+ bool s:1; /* Shareable */
+ bool ng:1; /* Non-global */
+ unsigned int tex:3; /* TEX[2:0] */
+ bool xn:1; /* Execute Never */
+ unsigned int base:16; /* Large page base address */
+} short_desc_l2_lpage_t;
+
+/*
+ * Comprises the bits required to walk page tables adhering to the
+ * short-descriptor translation table format.
+ */
+typedef struct __packed {
+ unsigned int dt:2; /* Descriptor type */
+ unsigned int pad1:8;
+ unsigned int base:22; /* Base address of block or next table */
+} short_desc_walk_t;
+
+/*
+ * Represents page table entries adhering to the short-descriptor translation
+ * table format.
+ */
+typedef union {
+ uint32_t bits;
+ short_desc_walk_t walk;
+ short_desc_l1_sec_t sec;
+ short_desc_l1_supersec_t supersec;
+ short_desc_l2_page_t pg;
+ short_desc_l2_lpage_t lpg;
+} short_desc_t;
+
+#endif /* __ARM_SHORT_DESC_H__ */
--
2.13.2
Sergej Proskurin
2017-07-06 11:50:04 UTC
Permalink
This commit adds (TCR_|TTBCR_)* defines to simplify access to the
respective register contents. At the same time, we adjust the macros
TCR_T0SZ and TCR_TG0_* by using the newly introduced TCR_T0SZ_SHIFT and
TCR_TG0_SHIFT instead of the hardcoded values.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
Acked-by: Julien Grall <***@arm.com>
---
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
v2: Define TCR_SZ_MASK in such a way that it can also be applied to 32-bit guests
using the long-descriptor translation table format.

Extend the previous commit by further defines allowing a simplified access
to the registers TCR_EL1 and TTBCR.

v3: Replace the hardcoded value 0 in the TCR_T0SZ macro with the newly
introduced TCR_T0SZ_SHIFT. Also, replace the hardcoded value 14 in
the TCR_TG0_* macros with the introduced TCR_TG0_SHIFT.

Comment when to apply the defines TTBCR_PD(0|1), according to ARM
DDI 0487B.a and ARM DDI 0406C.b.

Remove TCR_TB_* defines.

Comment when certain TCR_EL2 register fields can be applied.

v4: Cosmetic changes.

v5: Remove the shift by 0 of the TCR_SZ_MASK as it can be applied to
both TCR_T0SZ and TCR_T1SZ (which reside at different offsets).

Adjust commit message to make clear that we do not only add but also
cleanup some TCR_* defines.

v6: Changed the comment of TCR_SZ_MASK as we falsely referenced a
section instead of a page.

Add Julien Grall's Acked-by.
---
xen/include/asm-arm/processor.h | 69 ++++++++++++++++++++++++++++++++++++++---
1 file changed, 65 insertions(+), 4 deletions(-)

diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 855ded1b07..898160ce00 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -94,6 +94,13 @@
#define TTBCR_N_2KB _AC(0x03,U)
#define TTBCR_N_1KB _AC(0x04,U)

+/*
+ * TTBCR_PD(0|1) can be applied only if LPAE is disabled, i.e., TTBCR.EAE==0
+ * (ARM DDI 0487B.a G6-5203 and ARM DDI 0406C.b B4-1722).
+ */
+#define TTBCR_PD0 (_AC(1,U)<<4)
+#define TTBCR_PD1 (_AC(1,U)<<5)
+
/* SCTLR System Control Register. */
/* HSCTLR is a subset of this. */
#define SCTLR_TE (_AC(1,U)<<30)
@@ -154,7 +161,20 @@

/* TCR: Stage 1 Translation Control */

-#define TCR_T0SZ(x) ((x)<<0)
+#define TCR_T0SZ_SHIFT (0)
+#define TCR_T1SZ_SHIFT (16)
+#define TCR_T0SZ(x) ((x)<<TCR_T0SZ_SHIFT)
+
+/*
+ * According to ARM DDI 0487B.a, TCR_EL1.{T0SZ,T1SZ} (AArch64, page D7-2480)
+ * comprises 6 bits and TTBCR.{T0SZ,T1SZ} (AArch32, page G6-5204) comprises 3
+ * bits following another 3 bits for RES0. Thus, the mask for both registers
+ * should be 0x3f.
+ */
+#define TCR_SZ_MASK (_AC(0x3f,UL))
+
+#define TCR_EPD0 (_AC(0x1,UL)<<7)
+#define TCR_EPD1 (_AC(0x1,UL)<<23)

#define TCR_IRGN0_NC (_AC(0x0,UL)<<8)
#define TCR_IRGN0_WBWA (_AC(0x1,UL)<<8)
@@ -170,9 +190,50 @@
#define TCR_SH0_OS (_AC(0x2,UL)<<12)
#define TCR_SH0_IS (_AC(0x3,UL)<<12)

-#define TCR_TG0_4K (_AC(0x0,UL)<<14)
-#define TCR_TG0_64K (_AC(0x1,UL)<<14)
-#define TCR_TG0_16K (_AC(0x2,UL)<<14)
+/* Note that the fields TCR_EL1.{TG0,TG1} are not available on AArch32. */
+#define TCR_TG0_SHIFT (14)
+#define TCR_TG0_MASK (_AC(0x3,UL)<<TCR_TG0_SHIFT)
+#define TCR_TG0_4K (_AC(0x0,UL)<<TCR_TG0_SHIFT)
+#define TCR_TG0_64K (_AC(0x1,UL)<<TCR_TG0_SHIFT)
+#define TCR_TG0_16K (_AC(0x2,UL)<<TCR_TG0_SHIFT)
+
+/* Note that the field TCR_EL2.TG1 exists only if HCR_EL2.E2H==1. */
+#define TCR_EL1_TG1_SHIFT (30)
+#define TCR_EL1_TG1_MASK (_AC(0x3,UL)<<TCR_EL1_TG1_SHIFT)
+#define TCR_EL1_TG1_16K (_AC(0x1,UL)<<TCR_EL1_TG1_SHIFT)
+#define TCR_EL1_TG1_4K (_AC(0x2,UL)<<TCR_EL1_TG1_SHIFT)
+#define TCR_EL1_TG1_64K (_AC(0x3,UL)<<TCR_EL1_TG1_SHIFT)
+
+/*
+ * Note that the field TCR_EL1.IPS is not available on AArch32. Also, the field
+ * TCR_EL2.IPS exists only if HCR_EL2.E2H==1.
+ */
+#define TCR_EL1_IPS_SHIFT (32)
+#define TCR_EL1_IPS_MASK (_AC(0x7,ULL)<<TCR_EL1_IPS_SHIFT)
+#define TCR_EL1_IPS_32_BIT (_AC(0x0,ULL)<<TCR_EL1_IPS_SHIFT)
+#define TCR_EL1_IPS_36_BIT (_AC(0x1,ULL)<<TCR_EL1_IPS_SHIFT)
+#define TCR_EL1_IPS_40_BIT (_AC(0x2,ULL)<<TCR_EL1_IPS_SHIFT)
+#define TCR_EL1_IPS_42_BIT (_AC(0x3,ULL)<<TCR_EL1_IPS_SHIFT)
+#define TCR_EL1_IPS_44_BIT (_AC(0x4,ULL)<<TCR_EL1_IPS_SHIFT)
+#define TCR_EL1_IPS_48_BIT (_AC(0x5,ULL)<<TCR_EL1_IPS_SHIFT)
+#define TCR_EL1_IPS_52_BIT (_AC(0x6,ULL)<<TCR_EL1_IPS_SHIFT)
+
+/*
+ * The following values correspond to the bit masks represented by
+ * TCR_EL1_IPS_XX_BIT defines.
+ */
+#define TCR_EL1_IPS_32_BIT_VAL (32)
+#define TCR_EL1_IPS_36_BIT_VAL (36)
+#define TCR_EL1_IPS_40_BIT_VAL (40)
+#define TCR_EL1_IPS_42_BIT_VAL (42)
+#define TCR_EL1_IPS_44_BIT_VAL (44)
+#define TCR_EL1_IPS_48_BIT_VAL (48)
+#define TCR_EL1_IPS_52_BIT_VAL (52)
+#define TCR_EL1_IPS_MIN_VAL (25)
+
+/* Note that the fields TCR_EL2.TBI(0|1) exist only if HCR_EL2.E2H==1. */
+#define TCR_EL1_TBI0 (_AC(0x1,ULL)<<37)
+#define TCR_EL1_TBI1 (_AC(0x1,ULL)<<38)

#ifdef CONFIG_ARM_64
--
2.13.2
Sergej Proskurin
2017-07-06 11:50:14 UTC
Permalink
The function p2m_mem_access_check_and_get_page in mem_access.c
translates a gva to an ipa by means of the hardware functionality of the
ARM architecture. This is implemented in the function gva_to_ipa. If
mem_access is active, hardware-based gva to ipa translation might fail,
as gva_to_ipa uses the guest's translation tables, access to which might
be restricted by the active VTTBR. To address this issue, in this commit
we add a software-based guest-page-table walk, which will be used by the
function p2m_mem_access_check_and_get_page to perform the gva to ipa
translation in software in one of the following commits.

Note: The introduced function guest_walk_tables assumes that the domain,
the gva of which is to be translated, is running on the currently active
vCPU. To walk the guest's page tables on a different vCPU, the following
registers would need to be loaded: TCR_EL1, TTBR0_EL1, TTBR1_EL1, and
SCTLR_EL1.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
---
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
v2: Rename p2m_gva_to_ipa to p2m_walk_gpt and move it to p2m.c.

Move the functionality responsible for walking long-descriptor based
translation tables out of the function p2m_walk_gpt. Also move
the long-descriptor based translation out of this commit.

Change function parameters in order to return access rights
to a requested gva.

Cosmetic fixes.

v3: Rename the introduced functions to guest_walk_(tables|sd|ld) and
move the implementation to guest_copy.(c|h).

Set permissions in guest_walk_tables even if the MMU is disabled.

Change the function parameter of type "struct p2m_domain *" to
"struct vcpu *" in the function guest_walk_tables.

v4: Change the function parameter of type "struct p2m_domain *" to
"struct vcpu *" in the functions guest_walk_(sd|ld) as well.

v5: Merge two if-statements in guest_walk_tables to ease readability.

Set perms to GV2M_READ as to avoid undefined permissions.

Add Julien Grall's Acked-by.

v6: Adjusted change-log of v5.

Remove Julien Grall's Acked-by as we have changed the initialization
of perms. This needs to be reviewed.

Comment why we initialize perms with GV2M_READ by default. This is
due to the fact that in the current implementation we assume a GVA
to IPA translation with EL1 privileges. Since valid mappings in the
first stage address translation table are readable by default for
EL1, we initialize perms with GV2M_READ and extend the permissions
according to the particular page table walk.
---
xen/arch/arm/Makefile | 1 +
xen/arch/arm/guest_walk.c | 99 ++++++++++++++++++++++++++++++++++++++++
xen/include/asm-arm/guest_walk.h | 19 ++++++++
3 files changed, 119 insertions(+)
create mode 100644 xen/arch/arm/guest_walk.c
create mode 100644 xen/include/asm-arm/guest_walk.h

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 49e1fb2f84..282d2c2949 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -21,6 +21,7 @@ obj-$(CONFIG_HAS_GICV3) += gic-v3.o
obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o
obj-y += guestcopy.o
+obj-y += guest_walk.o
obj-y += hvm.o
obj-y += io.o
obj-y += irq.o
diff --git a/xen/arch/arm/guest_walk.c b/xen/arch/arm/guest_walk.c
new file mode 100644
index 0000000000..78badc2949
--- /dev/null
+++ b/xen/arch/arm/guest_walk.c
@@ -0,0 +1,99 @@
+/*
+ * Guest page table walk
+ * Copyright (c) 2017 Sergej Proskurin <***@sec.in.tum.de>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/sched.h>
+
+/*
+ * The function guest_walk_sd translates a given GVA into an IPA using the
+ * short-descriptor translation table format in software. This function assumes
+ * that the domain is running on the currently active vCPU. To walk the guest's
+ * page table on a different vCPU, the following registers would need to be
+ * loaded: TCR_EL1, TTBR0_EL1, TTBR1_EL1, and SCTLR_EL1.
+ */
+static int guest_walk_sd(const struct vcpu *v,
+ vaddr_t gva, paddr_t *ipa,
+ unsigned int *perms)
+{
+ /* Not implemented yet. */
+ return -EFAULT;
+}
+
+/*
+ * The function guest_walk_ld translates a given GVA into an IPA using the
+ * long-descriptor translation table format in software. This function assumes
+ * that the domain is running on the currently active vCPU. To walk the guest's
+ * page table on a different vCPU, the following registers would need to be
+ * loaded: TCR_EL1, TTBR0_EL1, TTBR1_EL1, and SCTLR_EL1.
+ */
+static int guest_walk_ld(const struct vcpu *v,
+ vaddr_t gva, paddr_t *ipa,
+ unsigned int *perms)
+{
+ /* Not implemented yet. */
+ return -EFAULT;
+}
+
+int guest_walk_tables(const struct vcpu *v, vaddr_t gva,
+ paddr_t *ipa, unsigned int *perms)
+{
+ uint32_t sctlr = READ_SYSREG(SCTLR_EL1);
+ register_t tcr = READ_SYSREG(TCR_EL1);
+ unsigned int _perms;
+
+ /* We assume that the domain is running on the currently active vCPU. */
+ if ( v != current )
+ return -EFAULT;
+
+ /* Allow perms to be NULL. */
+ perms = perms ?: &_perms;
+
+ /*
+ * Currently, we assume a GVA to IPA translation with EL1 privileges.
+ * Since valid mappings in the first stage address translation table are
+ * readable by default for EL1, we initialize perms with GV2M_READ and
+ * extend the permissions as part of the particular page table walk. Please
+ * note that the current implementation does not consider further
+ * attributes that distinguish between EL0 and EL1 permissions (EL0 might
+ * not have permissions on the particular mapping).
+ */
+ *perms = GV2M_READ;
+
+ /* If the MMU is disabled, there is no need to translate the gva. */
+ if ( !(sctlr & SCTLR_M) )
+ {
+ *ipa = gva;
+
+ /* Memory can be accessed without any restrictions. */
+ *perms = GV2M_READ|GV2M_WRITE|GV2M_EXEC;
+
+ return 0;
+ }
+
+ if ( is_32bit_domain(v->domain) && !(tcr & TTBCR_EAE) )
+ return guest_walk_sd(v, gva, ipa, perms);
+ else
+ return guest_walk_ld(v, gva, ipa, perms);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/guest_walk.h b/xen/include/asm-arm/guest_walk.h
new file mode 100644
index 0000000000..4ed8476e08
--- /dev/null
+++ b/xen/include/asm-arm/guest_walk.h
@@ -0,0 +1,19 @@
+#ifndef _XEN_GUEST_WALK_H
+#define _XEN_GUEST_WALK_H
+
+/* Walk the guest's page tables in software. */
+int guest_walk_tables(const struct vcpu *v,
+ vaddr_t gva,
+ paddr_t *ipa,
+ unsigned int *perms);
+
+#endif /* _XEN_GUEST_WALK_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--
2.13.2
Julien Grall
2017-07-17 15:47:18 UTC
Permalink
Hi Sergej,
Post by Sergej Proskurin
The function p2m_mem_access_check_and_get_page in mem_access.c
translates a gva to an ipa by means of the hardware functionality of the
ARM architecture. This is implemented in the function gva_to_ipa. If
mem_access is active, hardware-based gva to ipa translation might fail,
as gva_to_ipa uses the guest's translation tables, access to which might
be restricted by the active VTTBR. To address this issue, in this commit
we add a software-based guest-page-table walk, which will be used by the
function p2m_mem_access_check_and_get_page perform the gva to ipa
translation in software in one of the following commits.
Note: The introduced function guest_walk_tables assumes that the domain,
the gva of which is to be translated, is running on the currently active
vCPU. To walk the guest's page tables on a different vCPU, the following
registers would need to be loaded: TCR_EL1, TTBR0_EL1, TTBR1_EL1, and
SCTLR_EL1.
Acked-by: Julien Grall <***@arm.com>

Cheers,
--
Julien Grall
Sergej Proskurin
2017-07-06 11:50:12 UTC
Permalink
This commit moves the function vgic_access_guest_memory to guestcopy.c
and the header asm/guest_access.h. No functional changes are made.
Please note that the function will be renamed in the following commit.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
---
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
v6: We added this patch to our patch series.
---
xen/arch/arm/guestcopy.c | 50 ++++++++++++++++++++++++++++++++++++++
xen/arch/arm/vgic-v3-its.c | 1 +
xen/arch/arm/vgic.c | 49 -------------------------------------
xen/include/asm-arm/guest_access.h | 3 +++
xen/include/asm-arm/vgic.h | 3 ---
5 files changed, 54 insertions(+), 52 deletions(-)

diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index 413125f02b..938ffe2668 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -118,6 +118,56 @@ unsigned long raw_copy_from_guest(void *to, const void __user *from, unsigned le
}
return 0;
}
+
+/*
+ * Temporarily map one physical guest page and copy data to or from it.
+ * The data to be copied cannot cross a page boundary.
+ */
+int vgic_access_guest_memory(struct domain *d, paddr_t gpa, void *buf,
+ uint32_t size, bool is_write)
+{
+ struct page_info *page;
+ uint64_t offset = gpa & ~PAGE_MASK; /* Offset within the mapped page */
+ p2m_type_t p2mt;
+ void *p;
+
+ /* Do not cross a page boundary. */
+ if ( size > (PAGE_SIZE - offset) )
+ {
+ printk(XENLOG_G_ERR "d%d: vITS: memory access would cross page boundary\n",
+ d->domain_id);
+ return -EINVAL;
+ }
+
+ page = get_page_from_gfn(d, paddr_to_pfn(gpa), &p2mt, P2M_ALLOC);
+ if ( !page )
+ {
+ printk(XENLOG_G_ERR "d%d: vITS: Failed to get table entry\n",
+ d->domain_id);
+ return -EINVAL;
+ }
+
+ if ( !p2m_is_ram(p2mt) )
+ {
+ put_page(page);
+ printk(XENLOG_G_ERR "d%d: vITS: memory used by the ITS should be RAM.",
+ d->domain_id);
+ return -EINVAL;
+ }
+
+ p = __map_domain_page(page);
+
+ if ( is_write )
+ memcpy(p + offset, buf, size);
+ else
+ memcpy(buf, p + offset, size);
+
+ unmap_domain_page(p);
+ put_page(page);
+
+ return 0;
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 9ef792f479..1af6820cab 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -39,6 +39,7 @@
#include <xen/sched.h>
#include <xen/sizes.h>
#include <asm/current.h>
+#include <asm/guest_access.h>
#include <asm/mmio.h>
#include <asm/gic_v3_defs.h>
#include <asm/gic_v3_its.h>
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 1e5107b9f8..7a4e3cdc88 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -638,55 +638,6 @@ void vgic_free_virq(struct domain *d, unsigned int virq)
}

/*
- * Temporarily map one physical guest page and copy data to or from it.
- * The data to be copied cannot cross a page boundary.
- */
-int vgic_access_guest_memory(struct domain *d, paddr_t gpa, void *buf,
- uint32_t size, bool is_write)
-{
- struct page_info *page;
- uint64_t offset = gpa & ~PAGE_MASK; /* Offset within the mapped page */
- p2m_type_t p2mt;
- void *p;
-
- /* Do not cross a page boundary. */
- if ( size > (PAGE_SIZE - offset) )
- {
- printk(XENLOG_G_ERR "d%d: vITS: memory access would cross page boundary\n",
- d->domain_id);
- return -EINVAL;
- }
-
- page = get_page_from_gfn(d, paddr_to_pfn(gpa), &p2mt, P2M_ALLOC);
- if ( !page )
- {
- printk(XENLOG_G_ERR "d%d: vITS: Failed to get table entry\n",
- d->domain_id);
- return -EINVAL;
- }
-
- if ( !p2m_is_ram(p2mt) )
- {
- put_page(page);
- printk(XENLOG_G_ERR "d%d: vITS: memory used by the ITS should be RAM.",
- d->domain_id);
- return -EINVAL;
- }
-
- p = __map_domain_page(page);
-
- if ( is_write )
- memcpy(p + offset, buf, size);
- else
- memcpy(buf, p + offset, size);
-
- unmap_domain_page(p);
- put_page(page);
-
- return 0;
-}
-
-/*
* Local variables:
* mode: C
* c-file-style: "BSD"
diff --git a/xen/include/asm-arm/guest_access.h b/xen/include/asm-arm/guest_access.h
index 251e935597..49716501a4 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -10,6 +10,9 @@ unsigned long raw_copy_to_guest_flush_dcache(void *to, const void *from,
unsigned long raw_copy_from_guest(void *to, const void *from, unsigned len);
unsigned long raw_clear_guest(void *to, unsigned len);

+int vgic_access_guest_memory(struct domain *d, paddr_t gpa, void *buf,
+ uint32_t size, bool_t is_write);
+
#define __raw_copy_to_guest raw_copy_to_guest
#define __raw_copy_from_guest raw_copy_from_guest
#define __raw_clear_guest raw_clear_guest
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index d4ed23df28..e489d0bf21 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -217,9 +217,6 @@ extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
int vgic_v2_init(struct domain *d, int *mmio_count);
int vgic_v3_init(struct domain *d, int *mmio_count);

-int vgic_access_guest_memory(struct domain *d, paddr_t gpa, void *buf,
- uint32_t size, bool_t is_write);
-
extern int domain_vgic_register(struct domain *d, int *mmio_count);
extern int vcpu_vgic_free(struct vcpu *v);
extern bool vgic_to_sgi(struct vcpu *v, register_t sgir,
--
2.13.2
Julien Grall
2017-07-17 15:38:27 UTC
Permalink
Hi Sergej,
Post by Sergej Proskurin
This commit moves the function vgic_access_guest_memory to guestcopy.c
and the header asm/guest_access.h. No functional changes are made.
Please note that the function will be renamed in the following commit.
Acked-by: Julien Grall <***@arm.com>

Cheers,
--
Julien Grall

_______________________________________________
Xen-devel mailing list
Sergej Proskurin
2017-07-18 09:49:42 UTC
Permalink
Hi Julien,
Post by Julien Grall
Hi Sergej,
Post by Sergej Proskurin
This commit moves the function vgic_access_guest_memory to guestcopy.c
and the header asm/guest_access.h. No functional changes are made.
Please note that the function will be renamed in the following commit.
Through Travis CI, I have noticed that clang had trouble compiling the
source, as it missed some types in guest_access.h. I fixed it by
including <xen/types.h> in guest_access.h. With this change, may I keep
your Acked-by or remove it in the next version?

Thanks,
~Sergej


Julien Grall
2017-07-18 10:43:13 UTC
Permalink
Post by Sergej Proskurin
Hi Julien,
Hello Sergej,
Post by Sergej Proskurin
Post by Julien Grall
Hi Sergej,
Post by Sergej Proskurin
This commit moves the function vgic_access_guest_memory to guestcopy.c
and the header asm/guest_access.h. No functional changes are made.
Please note that the function will be renamed in the following commit.
Through Travis CI, I have noticed that clang had troubles compiling the
source, as it missed some types in guest_access.h. I fixed it by
including <xen/types.h> in guest_access.h. With this change, may I keep
your Acked-by or remove it in the next version?
I can't tell whether this is the right thing to do without seeing the
error message.

But I am a bit surprised that Travis CI is trying to build Xen ARM with
clang... Last time I looked at it, I remember some missing patches in
Xen to use clang.

Cheers,
--
Julien Grall
Sergej Proskurin
2017-07-18 11:59:02 UTC
Permalink
Hi Julien,
Post by Julien Grall
Post by Sergej Proskurin
Hi Julien,
Hello Sergej,
Post by Sergej Proskurin
Post by Julien Grall
Hi Sergej,
Post by Sergej Proskurin
This commit moves the function vgic_access_guest_memory to guestcopy.c
and the header asm/guest_access.h. No functional changes are made.
Please note that the function will be renamed in the following commit.
Through Travis CI, I have noticed that clang had troubles compiling the
source, as it missed some types in guest_access.h. I fixed it by
including <xen/types.h> in guest_access.h. With this change, may I keep
your Acked-by or remove it in the next version?
I can't tell whether this is the right thing to do without seen the
error commit message.
I have just removed the include mentioned above from guest_access.h to
reproduce the errors. To my surprise, Travis now runs through without
generating any issues, even though it did not last week. It is likely that
the issues from last week were provoked by my tests, which I did not
immediately recognize as such. I am very sorry for the noise!
Post by Julien Grall
But I am a bit surprised that Travis CI is trying to build Xen ARM
with clang... Last time at looked at it, I remember some missing
patches in Xen to use clang.
I recently started using Travis CI with Xen again and was also surprised
that it worked with clang as well. I remember having troubles with it
about 6 months ago.

Cheers,
~Sergej
Julien Grall
2017-07-18 12:12:09 UTC
Permalink
Post by Sergej Proskurin
Hi Julien,
Post by Julien Grall
Post by Sergej Proskurin
Hi Julien,
Hello Sergej,
Post by Sergej Proskurin
Post by Julien Grall
Hi Sergej,
Post by Sergej Proskurin
This commit moves the function vgic_access_guest_memory to guestcopy.c
and the header asm/guest_access.h. No functional changes are made.
Please note that the function will be renamed in the following commit.
Through Travis CI, I have noticed that clang had troubles compiling the
source, as it missed some types in guest_access.h. I fixed it by
including <xen/types.h> in guest_access.h. With this change, may I keep
your Acked-by or remove it in the next version?
I can't tell whether this is the right thing to do without seen the
error commit message.
I have just removed the upper include from guest_access.h to reproduce
the mentioned errors. To my surprise, Travis works right through without
generating any issues; it did though last week. It is likely that the
mentioned issues from last week have been provoked by my tests, which I
have not immediately recognized as such. I am very sorry for the noise!
Post by Julien Grall
But I am a bit surprised that Travis CI is trying to build Xen ARM
with clang... Last time at looked at it, I remember some missing
patches in Xen to use clang.
I recently startet using Travic CI with Xen again and was also surprised
that it worked with clang as well. I remember having troubles with it
about 6 months ago.
Well clang is well supported for Xen x86. But for ARM... I would be
interested to see the full logs, clang version, and whether it
cross-compiles.

Cheers,
--
Julien Grall
Sergej Proskurin
2017-07-18 12:20:58 UTC
Permalink
Post by Julien Grall
Post by Sergej Proskurin
Hi Julien,
Post by Julien Grall
Post by Sergej Proskurin
Hi Julien,
Hello Sergej,
Post by Sergej Proskurin
Post by Julien Grall
Hi Sergej,
Post by Sergej Proskurin
This commit moves the function vgic_access_guest_memory to guestcopy.c
and the header asm/guest_access.h. No functional changes are made.
Please note that the function will be renamed in the following commit.
Through Travis CI, I have noticed that clang had troubles compiling the
source, as it missed some types in guest_access.h. I fixed it by
including <xen/types.h> in guest_access.h. With this change, may I keep
your Acked-by or remove it in the next version?
I can't tell whether this is the right thing to do without seen the
error commit message.
I have just removed the upper include from guest_access.h to reproduce
the mentioned errors. To my surprise, Travis works right through without
generating any issues; it did though last week. It is likely that the
mentioned issues from last week have been provoked by my tests, which I
have not immediately recognized as such. I am very sorry for the noise!
Post by Julien Grall
But I am a bit surprised that Travis CI is trying to build Xen ARM
with clang... Last time I looked at it, I remember some missing
patches in Xen to use clang.
I recently started using Travis CI with Xen again and was also surprised
that it worked with clang as well. I remember having trouble with it
about 6 months ago.
Well clang is well supported for Xen x86. But for ARM... I would be
interested to see the full logs, the clang version, and whether it
cross-compiles.
Sorry for the misunderstanding: as you said, Travis CI uses clang only
for x86 at the moment. That makes it especially strange that I got
issues last week, as this patch does not touch any code that is shared
between ARM and x86. I can't explain it right now. Again, sorry for the
noise.

Thanks,
~Sergej

Sergej Proskurin
2017-07-06 11:50:06 UTC
Permalink
The ARMv8 architecture supports pages with different (4K, 16K, and 64K) sizes.
To enable guest page table walks for various configurations, this commit
extends the defines and helpers of the current implementation.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
---
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
v3: Eliminate redundant macro definitions by introducing generic macros.

v4: Replace existing macros with ones that generate static inline
helpers so as to ease the readability of the code.

Move the introduced code into lpae.h

v5: Remove PAGE_SHIFT_* defines from lpae.h as we import them now from
the header xen/lib.h.

Remove *_guest_table_offset macros so as to reduce the number of
exported macros which are only used once. Instead, use the
associated functionality directly within the
GUEST_TABLE_OFFSET_HELPERS.

Add comment in GUEST_TABLE_OFFSET_HELPERS stating that a page table
with 64K page size granularity does not have a zeroeth lookup level.

Add #undefs for GUEST_TABLE_OFFSET and GUEST_TABLE_OFFSET_HELPERS.

Remove CONFIG_ARM_64 #defines.

v6: Rename *_guest_table_offset_* helpers to *_table_offset_* as they
are sufficiently generic to be applied not only to the guest's page
table walks.

Change the type of the parameter and return value of the
*_table_offset_* helpers from vaddr_t to paddr_t to enable applying
these helpers also for other purposes such as computation of IPA
offsets in second stage translation tables.
---
xen/include/asm-arm/lpae.h | 62 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+)

diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
index a62b118630..f0b3d21aa7 100644
--- a/xen/include/asm-arm/lpae.h
+++ b/xen/include/asm-arm/lpae.h
@@ -3,6 +3,8 @@

#ifndef __ASSEMBLY__

+#include <xen/page-defs.h>
+
/*
* WARNING! Unlike the x86 pagetable code, where l1 is the lowest level and
* l4 is the root of the trie, the ARM pagetables follow ARM's documentation:
@@ -151,6 +153,66 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
return (level < 3) && lpae_mapping(pte);
}

+/*
+ * The ARMv8 architecture supports pages with different sizes (4K, 16K, and
+ * 64K). To enable page table walks for various configurations, the following
+ * helpers enable walking the translation table with varying page size
+ * granularities.
+ */
+
+#define LPAE_SHIFT_4K (9)
+#define LPAE_SHIFT_16K (11)
+#define LPAE_SHIFT_64K (13)
+
+#define lpae_entries(gran) (_AC(1,U) << LPAE_SHIFT_##gran)
+#define lpae_entry_mask(gran) (lpae_entries(gran) - 1)
+
+#define third_shift(gran) (PAGE_SHIFT_##gran)
+#define third_size(gran) ((paddr_t)1 << third_shift(gran))
+
+#define second_shift(gran) (third_shift(gran) + LPAE_SHIFT_##gran)
+#define second_size(gran) ((paddr_t)1 << second_shift(gran))
+
+#define first_shift(gran) (second_shift(gran) + LPAE_SHIFT_##gran)
+#define first_size(gran) ((paddr_t)1 << first_shift(gran))
+
+/* Note that there is no zeroeth lookup level with a 64K granule size. */
+#define zeroeth_shift(gran) (first_shift(gran) + LPAE_SHIFT_##gran)
+#define zeroeth_size(gran) ((paddr_t)1 << zeroeth_shift(gran))
+
+#define GUEST_TABLE_OFFSET(offs, gran) (offs & lpae_entry_mask(gran))
+#define GUEST_TABLE_OFFSET_HELPERS(gran) \
+static inline paddr_t third_table_offset_##gran##K(paddr_t va) \
+{ \
+ return GUEST_TABLE_OFFSET((va >> third_shift(gran##K)), gran##K); \
+} \
+ \
+static inline paddr_t second_table_offset_##gran##K(paddr_t va) \
+{ \
+ return GUEST_TABLE_OFFSET((va >> second_shift(gran##K)), gran##K); \
+} \
+ \
+static inline paddr_t first_table_offset_##gran##K(paddr_t va) \
+{ \
+ return GUEST_TABLE_OFFSET((va >> first_shift(gran##K)), gran##K); \
+} \
+ \
+static inline paddr_t zeroeth_table_offset_##gran##K(paddr_t va) \
+{ \
+ /* Note that there is no zeroeth lookup level with a 64K granule size. */ \
+ if ( gran == 64 ) \
+ return 0; \
+ else \
+ return GUEST_TABLE_OFFSET((va >> zeroeth_shift(gran##K)), gran##K); \
+} \
+
+GUEST_TABLE_OFFSET_HELPERS(4);
+GUEST_TABLE_OFFSET_HELPERS(16);
+GUEST_TABLE_OFFSET_HELPERS(64);
+
+#undef GUEST_TABLE_OFFSET
+#undef GUEST_TABLE_OFFSET_HELPERS
+
#endif /* __ASSEMBLY__ */

/*
--
2.13.2
Julien Grall
2017-07-17 14:12:39 UTC
Permalink
Hi Sergej,
Post by Sergej Proskurin
The ARMv8 architecture supports pages with different (4K, 16K, and 64K) sizes.
NIT: ARMv8 supports both AArch32 and AArch64. However, only AArch64
supports different page granularities. To avoid confusion, please
s/ARMv8 architecture/AArch64/
Post by Sergej Proskurin
To enable guest page table walks for various configurations, this commit
extends the defines and helpers of the current implementation.
---
---
v3: Eliminate redundant macro definitions by introducing generic macros.
v4: Replace existing macros with ones that generate static inline
helpers so as to ease the readability of the code.
Move the introduced code into lpae.h
v5: Remove PAGE_SHIFT_* defines from lpae.h as we import them now from
the header xen/lib.h.
Remove *_guest_table_offset macros so as to reduce the number of
exported macros which are only used once. Instead, use the
associated functionality directly within the
GUEST_TABLE_OFFSET_HELPERS.
Add comment in GUEST_TABLE_OFFSET_HELPERS stating that a page table
with 64K page size granularity does not have a zeroeth lookup level.
Add #undefs for GUEST_TABLE_OFFSET and GUEST_TABLE_OFFSET_HELPERS.
Remove CONFIG_ARM_64 #defines.
v6: Rename *_guest_table_offset_* helpers to *_table_offset_* as they
are sufficiently generic to be applied not only to the guest's page
table walks.
Change the type of the parameter and return value of the
*_table_offset_* helpers from vaddr_t to paddr_t to enable applying
these helpers also for other purposes such as computation of IPA
offsets in second stage translation tables.
---
xen/include/asm-arm/lpae.h | 62 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+)
diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
index a62b118630..f0b3d21aa7 100644
--- a/xen/include/asm-arm/lpae.h
+++ b/xen/include/asm-arm/lpae.h
@@ -3,6 +3,8 @@
#ifndef __ASSEMBLY__
+#include <xen/page-defs.h>
+
/*
* WARNING! Unlike the x86 pagetable code, where l1 is the lowest level and
@@ -151,6 +153,66 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
return (level < 3) && lpae_mapping(pte);
}
+/*
+ * The ARMv8 architecture supports pages with different sizes (4K, 16K, and
Same here.
Post by Sergej Proskurin
+ * 64K). To enable page table walks for various configurations, the following
+ * helpers enable walking the translation table with varying page size
+ * granularities.
+ */
+
+#define LPAE_SHIFT_4K (9)
+#define LPAE_SHIFT_16K (11)
+#define LPAE_SHIFT_64K (13)
+
+#define lpae_entries(gran) (_AC(1,U) << LPAE_SHIFT_##gran)
+#define lpae_entry_mask(gran) (lpae_entries(gran) - 1)
+
+#define third_shift(gran) (PAGE_SHIFT_##gran)
+#define third_size(gran) ((paddr_t)1 << third_shift(gran))
+
+#define second_shift(gran) (third_shift(gran) + LPAE_SHIFT_##gran)
+#define second_size(gran) ((paddr_t)1 << second_shift(gran))
+
+#define first_shift(gran) (second_shift(gran) + LPAE_SHIFT_##gran)
+#define first_size(gran) ((paddr_t)1 << first_shift(gran))
+
+/* Note that there is no zeroeth lookup level with a 64K granule size. */
+#define zeroeth_shift(gran) (first_shift(gran) + LPAE_SHIFT_##gran)
+#define zeroeth_size(gran) ((paddr_t)1 << zeroeth_shift(gran))
+
+#define GUEST_TABLE_OFFSET(offs, gran) (offs & lpae_entry_mask(gran))
+#define GUEST_TABLE_OFFSET_HELPERS(gran) \
You renamed *_guest_table_offset to *_table_offset but not the
GUEST_TABLE_* ones.

Please drop GUEST from both macros.

With that fixed:

Reviewed-by: Julien Grall <***@arm.com>

Cheers,
Post by Sergej Proskurin
+static inline paddr_t third_table_offset_##gran##K(paddr_t va) \
+{ \
+ return GUEST_TABLE_OFFSET((va >> third_shift(gran##K)), gran##K); \
+} \
+ \
+static inline paddr_t second_table_offset_##gran##K(paddr_t va) \
+{ \
+ return GUEST_TABLE_OFFSET((va >> second_shift(gran##K)), gran##K); \
+} \
+ \
+static inline paddr_t first_table_offset_##gran##K(paddr_t va) \
+{ \
+ return GUEST_TABLE_OFFSET((va >> first_shift(gran##K)), gran##K); \
+} \
+ \
+static inline paddr_t zeroeth_table_offset_##gran##K(paddr_t va) \
+{ \
+ /* Note that there is no zeroeth lookup level with a 64K granule size. */ \
+ if ( gran == 64 ) \
+ return 0; \
+ else \
+ return GUEST_TABLE_OFFSET((va >> zeroeth_shift(gran##K)), gran##K); \
+} \
+
+GUEST_TABLE_OFFSET_HELPERS(4);
+GUEST_TABLE_OFFSET_HELPERS(16);
+GUEST_TABLE_OFFSET_HELPERS(64);
+
+#undef GUEST_TABLE_OFFSET
+#undef GUEST_TABLE_OFFSET_HELPERS
+
#endif /* __ASSEMBLY__ */
/*
--
Julien Grall

_______________________________________________
Xen-devel mailing list
Sergej Proskurin
2017-07-06 11:50:17 UTC
Permalink
In this commit, we make use of the gpt walk functionality introduced in
the previous commits. If mem_access is active, hardware-based gva to ipa
translation might fail, as gva_to_ipa uses the guest's translation
tables, access to which might be restricted by the active VTTBR. To
side-step potential translation errors in the function
p2m_mem_access_check_and_get_page due to restricted memory (e.g. to the
guest's page tables themselves), we walk the guest's page tables in
software.

Signed-off-by: Sergej Proskurin <***@sec.in.tum.de>
Acked-by: Tamas K Lengyel <***@tklengyel.com>
---
Cc: Razvan Cojocaru <***@bitdefender.com>
Cc: Tamas K Lengyel <***@tklengyel.com>
Cc: Stefano Stabellini <***@kernel.org>
Cc: Julien Grall <***@arm.com>
---
v2: Check the returned access rights after walking the guest's page tables in
the function p2m_mem_access_check_and_get_page.

v3: Adapt Function names and parameter.

v4: Comment why we need to fail if the permission flags that are
requested by the caller do not satisfy the mapped page.

Cosmetic fix that simplifies the if-statement checking for the
GV2M_WRITE permission.

v5: Move comment to ease code readability.
---
xen/arch/arm/mem_access.c | 31 ++++++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
index bcf49f5c15..a58611daed 100644
--- a/xen/arch/arm/mem_access.c
+++ b/xen/arch/arm/mem_access.c
@@ -22,6 +22,7 @@
#include <xen/vm_event.h>
#include <public/vm_event.h>
#include <asm/event.h>
+#include <asm/guest_walk.h>

static int __p2m_get_mem_access(struct domain *d, gfn_t gfn,
xenmem_access_t *access)
@@ -101,6 +102,7 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
const struct vcpu *v)
{
long rc;
+ unsigned int perms;
paddr_t ipa;
gfn_t gfn;
mfn_t mfn;
@@ -110,8 +112,35 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
struct p2m_domain *p2m = &v->domain->arch.p2m;

rc = gva_to_ipa(gva, &ipa, flag);
+
+ /*
+ * In case mem_access is active, hardware-based gva_to_ipa translation
+ * might fail. Since gva_to_ipa uses the guest's translation tables, access
+ * to which might be restricted by the active VTTBR, we perform a gva to
+ * ipa translation in software.
+ */
if ( rc < 0 )
- goto err;
+ {
+ /*
+ * The software gva to ipa translation can still fail, e.g., if the gva
+ * is not mapped.
+ */
+ if ( guest_walk_tables(v, gva, &ipa, &perms) < 0 )
+ goto err;
+
+ /*
+ * Check permissions that are assumed by the caller. For instance in
+ * case of guestcopy, the caller assumes that the translated page can
+ * be accessed with requested permissions. If this is not the case, we
+ * should fail.
+ *
+ * Please note that we do not check for the GV2M_EXEC permission. Yet,
+ * since the hardware-based translation through gva_to_ipa does not
+ * test for execute permissions this check can be left out.
+ */
+ if ( (flag & GV2M_WRITE) && !(perms & GV2M_WRITE) )
+ goto err;
+ }

gfn = gaddr_to_gfn(ipa);
--
2.13.2