OpenXR hand tracking
Introduction
Note
This page focuses on the features provided by OpenXR. Some of the functionality mentioned here also applies to WebXR and may be offered by other XR interfaces as well.
Before discussing hand tracking, it is important to realize that different systems approach this area differently, and different OpenXR runtime implementations therefore behave differently. During development you may find that certain hardware does not support a given feature, or that its behavior differs from other platforms to the point where you need to handle it separately.
That said, recent improvements to the OpenXR specification have narrowed this gap, and as platforms adopt the newer specification we are slowly moving towards a future where applications port cleanly between platforms, or at least have a clear way to detect what a platform supports.
The early major VR platforms were built primarily around tracked controller input. This system tracks a physical device that also carries a number of buttons for further input. The player's hand position can be inferred from the tracking data, but no further detail is available. It is usually up to the game to display the player's hands and animate the fingers based on that further input, which may come from buttons or proximity sensors. Often the finger placement is also inferred from context, such as the object the player is holding or the action currently being performed.
More recently, optical hand tracking has become increasingly popular. This approach uses cameras to track the user's hands and produces full positional data for the hand and fingers. Most vendors treat this as a completely separate system from controller tracking and expose a separate API to access hand and finger position and orientation data. For input handling, it is up to the game developer to implement a gesture detection system.
This split also exists in OpenXR, where controller tracking is primarily handled by the action map system, while optical hand tracking is primarily handled through the hand tracking API extension.
The world is not that black and white, however; there are several cases that do not fit neatly into either category:
Devices that fit both categories, such as tracked gloves, or controllers like the Index controller that also provide finger tracking.
XR runtimes that infer hand tracking data from controller input, solving the finger placement problem for a wide range of controllers.
XR applications that want to switch seamlessly between controller and hand tracking, offering the same user experience with both approaches.
OpenXR is answering this call by introducing further extensions that let us query the capabilities of the XR runtime and hardware, or that add further functionality across this divide. The problem that currently remains is that there are gaps in the adoption of these extensions, so some platforms do not report their capabilities to the full extent. As such, you may need to test for the features available on specific hardware and adjust your approach accordingly.
Demo project
The information presented on this page was used to create a demo project that can be found here.
The hand tracking API
As mentioned in our introduction, the hand tracking API is primarily used with optical hand tracking, and on many platforms it only works when the user is not holding a controller. Some platforms support controller-inferred hand tracking, meaning that you will get hand tracking data even if the user is holding a controller. This includes SteamVR and Meta Quest (currently native only, but Meta Link support is likely coming), and hopefully soon others as well.
The hand tracking implementation in Godot has been standardized around the Godot Humanoid Skeleton and works both in OpenXR and WebXR. The instructions below will thus work in both environments.
In order to use the hand tracking API with OpenXR you first need to enable it. This can be done in the project settings:

For some standalone XR devices you also need to configure the hand tracking extension in export settings, for instance for Meta Quest:

Now you need to add three components to your scene for each hand:
A tracked node to position the hand.
A properly skinned hand mesh with skeleton.
A skeleton modifier that applies finger tracking data to the skeleton.

Hand tracking node
The hand tracking system uses separate hand trackers to track the position of the player's hands within our tracking space.
This information has been separated out for the following use cases:
Tracking happens in the local space of the XROrigin3D node. This node must be a child of the XROrigin3D node in order to be correctly placed.
This node can be used as an IK target when an upper body mesh with arms is used instead of separate hand meshes.
Actual placement of the hands may be loosely bound to the tracking in scenarios such as avatar creation UIs, fake mirrors, or similar situations resulting in the hand mesh and finger tracking being localized elsewhere.
We'll concentrate on the first use case only.
For this you need to add an XRNode3D node to your XROrigin3D node.
On this node the tracker should be set to /user/hand_tracker/left or /user/hand_tracker/right for the left or right hand respectively.
The pose should remain set to default; no other option will work here.
The checkbox Show When Tracked will automatically hide this node if no tracking data is available, or make it visible when tracking data is available.
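If you prefer configuring this node from a script rather than the inspector, the same properties can be set in code. Below is a minimal sketch for the left hand, attached to the XRNode3D itself:

extends XRNode3D

# Sketch only: configure this node to follow the left hand tracker.
func _ready() -> void:
    tracker = "/user/hand_tracker/left"
    pose = "default"
    show_when_tracked = true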
Rigged hand mesh
In order to display our hand we need a hand mesh that is properly rigged and skinned. For this Godot uses the hand bone structure as defined for the Godot Humanoid, optionally extended with an extra tip bone for each finger.
The OpenXR hand tracking demo contains example glTF files of properly rigged hands.
We will be using those here, adding them as a child of our XRNode3D node.
We also need to enable editable children to gain access to our Skeleton3D node.
Hand skeleton modifier
Finally we need to add an XRHandModifier3D node as a child of our Skeleton3D node. This node will obtain the finger tracking data from OpenXR and apply it to the hand model.
You need to set the Hand Tracker property to either /user/hand_tracker/left or /user/hand_tracker/right, depending on whether you are applying the tracking data of the left or the right hand.
You can also set the Bone Update mode on this node.
Full applies the hand tracking data fully. This does mean that the skeleton positioning will potentially reflect the size of the user's actual hand, which can lead to a scrunching effect if meshes aren't weighted properly to account for this. Make sure you test your game with players of all hand sizes when optical hand tracking is used!
Rotation Only will only apply rotation to the bones of the hand and keep the bone lengths as they are. In this mode the size of the hand mesh doesn't change.
With this added, when we run the project we should see the hand correctly displayed if hand tracking is supported.
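If you prefer setting these properties from a script attached to the XRHandModifier3D node rather than the inspector, a minimal sketch could look like this (choosing Rotation Only here is just an example):

extends XRHandModifier3D

# Sketch only: configure this modifier for the left hand.
func _ready() -> void:
    hand_tracker = "/user/hand_tracker/left"
    # Keep the authored bone lengths and only apply rotations from tracking data.
    bone_update = XRHandModifier3D.BONE_UPDATE_ROTATION_ONLY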
Hand tracking data source
This is an OpenXR extension that provides information about the source of the hand tracking data. At this moment only a few runtimes implement it but if it is available, Godot will activate it.
If this extension is not supported and thus unknown is returned, you can make the following assumptions:
If you are using SteamVR (including Steam link), only controller based hand tracking is supported.
For any other runtime, if hand tracking is supported, only optical hand tracking is supported (note: Meta Link currently falls into this category).
In all other cases, no hand tracking is supported at all.
You can access this information through code:
var hand_tracker : XRHandTracker = XRServer.get_tracker('/user/hand_tracker/left')
if hand_tracker:
    if hand_tracker.has_tracking_data:
        if hand_tracker.hand_tracking_source == XRHandTracker.HAND_TRACKING_SOURCE_UNKNOWN:
            print("Hand tracking source unknown")
        elif hand_tracker.hand_tracking_source == XRHandTracker.HAND_TRACKING_SOURCE_UNOBSTRUCTED:
            print("Hand tracking source is optical hand tracking")
        elif hand_tracker.hand_tracking_source == XRHandTracker.HAND_TRACKING_SOURCE_CONTROLLER:
            print("Hand tracking data is inferred from controller data")
        else:
            print("Unknown hand tracking source ", hand_tracker.hand_tracking_source)
    else:
        print("Hand is currently not being tracked")
else:
    print("No hand tracker registered")
This example logs the state for the left hand.
If in this example no hand tracker is returned by get_tracker, this means the hand tracking API is not supported on the XR runtime at all.
If there is a tracker but has_tracking_data is false, the user's hand is currently not being tracked. This is likely caused by one of the following reasons:
The player's hand is not visible to any of the tracking cameras on the headset.
The player is currently using a controller and the headset only supports optical hand tracking.
The controller is turned off and only controller hand tracking is supported.
Handling user input
Reacting to actions performed by the user is handled through the XR action map if controllers are used. In the action map you can map various inputs, like the trigger or joystick on the controller, to an action. This can then drive logic in your game.
When hand tracking is used there were originally no such inputs; input is driven by gestures made by the user, such as making a fist to grab or pinching the thumb and index finger together to select something. It was up to the game developer to implement detection for these gestures.
Recognizing the increasing demand for applications that can switch seamlessly between controller and hand tracking, and the need for some form of basic input capability, a number of extensions were added to the specification that provide basic gesture recognition and can be used with the action map.
Hand interaction profile
The hand interaction profile extension is a new core extension which supports pinch, grasp, and poke gestures and related poses. There is still limited support for this extension but it should become available in more runtimes in the near future.

The pinch gesture is triggered by pinching your thumb and index finger together. This is often used as a select gesture for menu systems, similar to using your controller to point at an object and press the trigger to select and is thus often mapped as such.
The pinch pose is a pose positioned in the middle between the tip of the thumb and the tip of the index finger, oriented such that a ray cast can be used to identify a target.
The pinch float input is a value between 0.0 (the tips of the thumb and index finger are apart) and 1.0 (the tips of the thumb and index finger are touching).
The pinch ready input is true when the tips of these fingers are (close to) touching.
The grasp gesture is triggered by making a fist and is often used to pick items up, similar to engaging the squeeze input on controllers.
The grasp float input is a value between 0.0 (open hand) and 1.0 (fist).
The grasp ready input is true when the user has made a fist.
The poke gesture is triggered by extending your index finger. This one is a bit of an exception, as the pose at the tip of your index finger is often used to poke an interactable object. The poke pose is a pose positioned at the tip of the index finger.
Finally, the aim activate (ready) input is defined as an input that is 1.0/true when the index finger is extended and pointing at a target that can be activated. How runtimes interpret this is not entirely clear.
With this setup the normal left_hand and right_hand trackers are used and you can thus seamlessly switch between controller and hand tracking input.
Note
You need to enable the hand interaction profile extension in the OpenXR project settings.
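Once the extension is enabled and the pinch and grasp inputs are mapped to actions in your action map, you can read them from an XRController3D node just like regular controller input. Below is a minimal sketch, assuming hypothetical action names pinch_strength and grasp_strength that you mapped to the pinch and grasp float inputs:

extends XRController3D  # tracker set to left_hand or right_hand

# Sketch only: "pinch_strength" and "grasp_strength" are hypothetical action
# names mapped to the pinch and grasp inputs of the hand interaction profile.
func _process(_delta: float) -> void:
    var pinch := get_float("pinch_strength")
    var grasp := get_float("grasp_strength")

    if pinch > 0.8:
        print("Pinch gesture detected")
    if grasp > 0.8:
        print("Grasp gesture detected")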
Microsoft hand interaction profile
The Microsoft hand interaction profile extension was introduced by Microsoft and loosely mimics the simple controller profile. Meta has also added support for this extension but only on their native OpenXR client, it is currently not available over Meta Link.

Pinch support is exposed through the select input, the value of which is 0.0 when the tips of the thumb and index finger are apart and 1.0 when they are together. Note that in this profile the aim pose is redefined as a pose between thumb and index finger, oriented so a ray cast can be used to identify a target.
Grasp support is exposed through the squeeze input, the value of which is 0.0 when the hand is open and 1.0 when a fist is made.
With this setup the normal left_hand and right_hand trackers are used and you can thus seamlessly switch between controller and hand tracking input.
HTC hand interaction profile
The HTC hand interaction profile extension was introduced by HTC and is defined similarly to the Microsoft extension. It is only supported by HTC, for the Focus 3 and XR Elite headsets.

See the Microsoft hand interaction profile for the gesture support.
The defining difference is that this extension introduces two new trackers, /user/hand_htc/left and /user/hand_htc/right.
This means that extra logic needs to be implemented to switch between the default trackers and the HTC specific trackers when the user puts down, or picks up, their controller.
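What such switching logic could look like is sketched below. This is only a sketch under a few assumptions: the script sits on the XRNode3D that positions the left hand, both trackers expose the default pose through your action map, and preferring the HTC hand tracker whenever it delivers tracking data is an acceptable heuristic for your project.

extends XRNode3D  # the node that positions the left hand

# Sketch only: prefer the HTC hand tracker when it is delivering tracking
# data, otherwise fall back to the regular controller tracker.
func _process(_delta: float) -> void:
    var htc_tracker := XRServer.get_tracker("/user/hand_htc/left") as XRPositionalTracker
    if htc_tracker:
        var pose := htc_tracker.get_pose("default")
        if pose and pose.has_tracking_data:
            tracker = "/user/hand_htc/left"
            return
    tracker = "left_hand"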
Simple controller profile
The simple controller profile is a standard core profile defined as a fallback profile when a controller is used for which no profile exists.
There are a number of OpenXR runtimes that will mimic controllers through the simple controller profile when hand tracking is used.
Unfortunately there is no sound way to determine whether an unknown controller is used or whether hand tracking is emulating a controller through this profile.

XR runtimes are free to define how the simple controller profile operates, so there is also no certainty about how this profile is mapped to gestures.
The most common mapping seems to be that select click is true when the tips of the thumb and index finger are touching while the user's palm is facing away from the user. menu click will be true when the tips of the thumb and index finger are touching while the user's palm is facing towards the user.
With this setup the normal left_hand and right_hand trackers are used and you can thus seamlessly switch between controller and hand tracking input.
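Because these inputs behave like buttons, you can react to them through the standard XRController3D signals once they are mapped to actions. In the sketch below the action names select and menu are assumptions; use the names defined in your own action map.

extends XRController3D  # tracker set to left_hand or right_hand

# Sketch only: "select" and "menu" are hypothetical action names mapped to the
# simple controller profile's select click and menu click inputs.
func _ready() -> void:
    button_pressed.connect(_on_button_pressed)

func _on_button_pressed(action_name: String) -> void:
    match action_name:
        "select":
            print("Pinch with palm facing away from the user")
        "menu":
            print("Pinch with palm facing towards the user")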
Note
As some of these interaction profiles overlap, it is important to know that you can add each profile to your action map and the XR runtime will choose the best fitting profile.
For instance, a Meta Quest supports both the Microsoft hand interaction profile and the simple controller profile. If both are specified, the Microsoft hand interaction profile will take precedence and will be used.
The expectation is that once Meta supports the core hand interaction profile extension, that profile will take precedence over both the Microsoft and simple controller profiles.
Gesture based input
If the platform doesn't support any interaction profiles when hand tracking is used, or if you're building an application where you need more complicated gesture support, you're going to need to build your own gesture recognition system.
You can obtain the full hand tracking data through the XRHandTracker resource for each hand. You can obtain the hand tracker by calling XRServer.get_tracker and using either /user/hand_tracker/left or /user/hand_tracker/right as the tracker.
This resource provides access to all the joint information for the given hand.
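As a starting point, the sketch below detects a simple pinch on the left hand by measuring the distance between the thumb tip and index finger tip joints. The 2 cm threshold is an arbitrary assumption; a more robust implementation would also check the joint flags and add some hysteresis.

# Sketch only: detects a pinch on the left hand by measuring the distance
# between the thumb tip and index finger tip joints.
func is_left_hand_pinching() -> bool:
    var hand_tracker := XRServer.get_tracker("/user/hand_tracker/left") as XRHandTracker
    if not hand_tracker or not hand_tracker.has_tracking_data:
        return false

    var thumb_tip := hand_tracker.get_hand_joint_transform(XRHandTracker.HAND_JOINT_THUMB_TIP)
    var index_tip := hand_tracker.get_hand_joint_transform(XRHandTracker.HAND_JOINT_INDEX_FINGER_TIP)

    # Both transforms are expressed in the same tracking space, so the distance
    # between their origins is a real-world distance in meters.
    return thumb_tip.origin.distance_to(index_tip.origin) < 0.02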
Detailing out a full gesture recognition algorithm goes beyond the scope of this manual; however, there are a number of community projects you can look at: