From 7abbe6241e06b23a35d57f5cb2d77f3a569b9e57 Mon Sep 17 00:00:00 2001
From: Zheyuan Wu <60459821+Trance-0@users.noreply.github.com>
Date: Wed, 22 Oct 2025 11:10:53 -0500
Subject: [PATCH] Update CSE5519_I3.md

---
 content/CSE5519/CSE5519_I3.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/content/CSE5519/CSE5519_I3.md b/content/CSE5519/CSE5519_I3.md
index 0ebf170..3d265a9 100644
--- a/content/CSE5519/CSE5519_I3.md
+++ b/content/CSE5519/CSE5519_I3.md
@@ -8,3 +8,8 @@
 
 VLA, vision-language-action models.
 
+> [!TIP]
+>
+> This paper presents a new way to transfer web knowledge to robotic control: a vision-language-action model serves as the bridge between web-scale knowledge and robot actions.
+>
+> I'm considering how this framework could be migrated to two-hand robotic control. In the general case the action is performed by a single hand, but in most real-world applications actions require two hands. Could this framework be extended to bimanual control?
\ No newline at end of file