Nonparametric Online Learning Control for Soft Continuum Robot: An Enabling Technique for Effective Endoscopic Navigation

Abstract

Bioinspired robotic structures comprising soft actuation units have attracted increasing research interest. Taking advantage of their inherent compliance, soft robots can assure safe interaction with external environments, provided that precise and effective manipulation can be achieved. Endoscopy is a typical application. However, previous model-based control approaches often require simplified geometric assumptions on the soft manipulator, which can be very inaccurate in the presence of unmodeled external interaction forces. In this study, we propose a generic control framework based on nonparametric, online, and local training to learn the inverse model directly, without prior knowledge of the robot's structural parameters. Detailed experimental evaluation was conducted on a soft robot prototype with control redundancy, performing trajectory tracking in dynamically constrained environments. Advanced element formulation of finite element analysis is employed to initialize the control policy, hence eliminating the need for random exploration in the robot's workspace. The proposed control framework enabled a soft fluid-driven continuum robot to follow a 3D trajectory precisely, even under dynamic external disturbance. Such enhanced control accuracy and adaptability would facilitate effective endoscopic navigation in complex and changing environments.

Introduction

Design of nature-inspired manipulators actuated based on soft material properties has become one of the most active research areas in robotics.¹ Soft robots embedded with delicate chambers can be driven by fluidic input,^1–4 resulting in functional deformations such as bending and elongation/shortening.⁵ Owing to the limber robotic structure, their manipulation assures high compliance within a confined region, facilitating versatile interaction with surrounding objects.^6,7 These features could benefit many robotic applications demanding safe interaction within a dynamic environment, such as soft tissue in minimally invasive surgery.^8,9 Endoscopy is therefore a timely application.

Conventional endoscopes predominantly comprise a metallic skeleton driven by steel cables, governing the kinematics of a series of bending mechanisms. This design inevitably induces high friction and is susceptible to fatigue failure after prolonged service. These metallic structures also come with high rigidity at the scope tip, which may increase the risk of causing trauma or even perforation when the scope is forcefully pushed against the wall of a confined lumen or cavity.¹⁰ This has motivated the development of soft robotic instruments for surgical interventions,^11–14 which can also be disposable to ensure zero risk of endoscopy-related infection transmission. Endotics^11,12 was the first system developed for the purpose of pain-free colonoscopy. Its novel locomotion scheme attempted to prevent the formation of complicated looping at the sigmoid/descending colon. As a result, its single-segment bending is capable of omnidirectional endoscopic exploration along the colon. Aer-O-Scope¹³ was another commercial colonoscope relying on a simple approach that combines single-segment bending with effective locomotion. The STIFF-FLOP soft robot^9,14 was another milestone in keyhole surgery, offering intracavitary exploration using a soft-material robot validated in a cadaveric trial for the first time.

Soft robotic endoscopes have brought several research directions into the limelight. Various control approaches have also been developed to master the dexterity of such manipulators, giving rise to agile and responsive telemanipulation. Reliable control performance in a confined and dynamic environment is also essential to surgical safety. Therefore, much research effort^15–18 has been devoted to deriving analytical models that describe or predict the robot kinematic/dynamic behavior,¹⁹ akin to controlling conventional rigid-link robots. However, these analytical models are complex due to the intrinsic nonlinear hyperelastic property of the soft elastomeric materials that constitute the robot body. Any additional control dimensionality of the soft robot would further exacerbate the complexity of such kinematic equations.¹⁶

To simplify the modeling process, the piecewise constant curvature (PCC) assumption is one of the widely used techniques^15,16,18,20 to obtain closed-form solutions.^21,22 This enables real-time kinematic control of curvature discrepancy to attain the desired pose²³ and to perform dynamic motion primitives²⁴ for fluidically driven soft continuum robots. The parameters that govern the analytical models can also be estimated online.²⁵ Other model-based methods have been proposed without the PCC assumption, such as approximation of trunk-like structures as an infinite degree-of-freedom (DoF) system²⁶ and spring–mass modeling techniques,^27,28 which can be incorporated in a hierarchical controller for generating stereotyped motions of an octopus-like manipulator.²⁷ Recently, the Cosserat theory²⁹ of elasticity has been used to predict underwater motion of a cable-driven, octopus-like soft robot³⁰ by deducing its geometrically exact formulations.

Yet, external disturbance to the robot, such as gravity, payload, and external interaction, can promptly invalidate those assumptions. These oversimplified assumptions would substantially degrade the model's reliability in real applications. Moreover, structural parameters in the kinematics have to be determined before the modeling process. The search for these invariant coefficients is heuristic in nature. This might induce further complications when mapping the robot motion analytically. In addition, such invariants can only hold upon slight modification of the robot as they possess strong correlation with the robot's mechanical structure. Inevitably, the analytical model has to be revisited after any major change to the robot structure, further diminishing the effectiveness of such an approach.

With the foreseen difficulty of developing the analytical/kinematic model, research attempts were made to control the soft pliable robot using nonparametric learning-based approaches. The idea is to obtain forward/inverse mappings for kinematic/dynamic robot control based on measurement data only. Model-free control methods can also be developed based on direct modeling architecture,³¹ where the inverse mapping is directly obtained. This mapping depicts the inverse transition model of the robot, which could be a changing function due to the contact between the robot and the environments, such as soft tissue.

The use of neural networks (NNs) has been proposed to globally approximate the inverse mapping between end-effector and robot actuation.^32,33 Such an approach can compensate for uncertainties in robot dynamics³² and has been demonstrated to yield even more reliable solutions when compared with using an analytical model of a cable-driven soft robot.³³ Previous studies of NNs mostly consider simplified scenarios, such as a nonredundant manipulator and contact-free situation.^32,33 Although redundantly actuated robotic systems can be controlled in lower dimensionality in a hierarchical manner, it may require predefined movement patterns (primitives) for specific task goals.²⁷

Moreover, there has been a great demand for machine learning approaches that address the change in inverse mapping of the hyperelastic robot upon contact.¹ A Jacobian-based model-free controller has shown its capability to manipulate a planar, cable-driven continuum robot in an environment with static constraints.³⁴ However, there is still no example that demonstrates manipulation of a redundantly actuated soft continuum robot in three-dimensional (3D) space that is adaptive to unknown external disturbance.

In this article, we propose a control framework based on a nonparametric local learning technique. Nonparametric local learning methods, such as those described by Nguyen et al. and Peters et al.,^35,36 possess the ability to learn the high-dimensional inverse transition of rigid-link robots. The essence of nonparametric local methods is to construct a batch of locally weighted models that collectively approximate the inverse mapping. Each of these models is spawned and updated independently, so that the overall architecture can be rapidly transformed to accommodate new input data. Meanwhile, the weighted global approximation can be optimized on the fly and kept consistent with the desired control behavior.³⁶ Such a nonparametric local learning approach can thus facilitate fast online correction of the learning model.³⁷ Therefore, the proposed framework is suitable for providing a rapid response to soft robot manipulation within constrained environments.

Workspace exploration is a prerequisite to collect pretraining data for learning the proposed controller. It is desirable to have accurate enough kinematic data to initialize the controller offline since it is impractical to carry out robot exploration in the confined transluminal workspace. We propose to use finite element analysis (FEA) to sample the kinematic data for the offline learning process. FEA has been widely used in design optimization and miniaturization of soft robots.¹³ Not only can the FEA accurately predict the highly deformable behaviors but it can also provide data for characterization of inverse kinematic relationships for control.³⁸ However, the application of FEA to robotic control has only been minimally investigated in continuum structure with small deformation.^38,39 The major contributions of this work are as follows:

• It is the first attempt to exploit an online nonparametric local learning technique with the aim to directly approximate the inverse kinematics of a redundantly actuated, fluid-driven endoscope prototype for soft robot control in 3D space (see the Methods section).

• Integration of FEA into the online learning method is implemented to initialize a reliable inverse model offline before deployment of the proposed controller in practical scenarios (see the Experiments, Results, and Discussion section).

• Experimental validation of the control performance and adaptability is conducted to demonstrate 3D trajectory tracking (mean error <2.49°) of a soft continuum robot even under dynamic external disturbance (see the Experiments, Results, and Discussion section).

Methods

Design of soft endoscope prototype

A generic, fluidic-driven soft continuum robot made of RTV (Room Temperature Vulcanization) silicone rubber (Ecoflex 0050; Smooth-On, Inc.) is designed and fabricated to evaluate the proposed framework for endoscopic navigation (Fig. 1a). The soft robot comprises three cylindrical inflatable chambers, each covered by a helical Kevlar string layer with a pitch of 1 mm. This fiber-constrained structure was first proposed by Suzumori et al.,^4,40 in which the helical constraint layer enforces axial anisotropic expansion of inflatable chambers so as to generate an effective bending moment when subject to pressure input. To enable effective endoscopic navigation, the three air chambers can be individually actuated by air or other fluid, facilitating a large panoramic workspace with a bending angle >150°. The slender robot configuration with 13-mm outer diameter and 93-mm length is also compatible with conventional endoscopes, which is of importance to dexterous manipulation inside a confined transluminal workspace.

FIG. 1.

(a) Soft robotic endoscope prototype made of silicone rubber. Its dimensions are compatible with the insertion tube of a conventional endoscope; (b) CAD/CAM model of the soft manipulator showing simulated helical strain-wrapping constraints around each actuation chamber using linear truss elements, by which the anisotropic expansion can be achieved; (c) Finite element model tessellated with 12,000 linear hexahedron elements. A total of 2,214 truss elements are defined to emulate the effect of the strain-wrapping constraint; (d) Cross-sectional area tessellated by hexahedron meshing.

Fabrication of the robot involves three major phases: (1) three cylindrical air chambers are cast with RTV silicone in inner molds; (2) Kevlar strings are wrapped densely in a single helical structure along each soft chamber; and (3) additional layers of silicone are cast to house the three inflatable chambers into one. This could fix the strings against dislocation, even after numerous bending actions.

Characterization of robot motion transition

Gradual smooth regulation of the fluidic flow rate allows steady bending of the presented soft manipulator. It also allows rapid reaching of fluid pressure equilibrium, minimizing the residual motion generated during such fluidic actuation. During endoscopic navigation within small and confined spaces (e.g., duodenum), such quasi-static motion characteristic⁴¹ can facilitate effective precise targeting of the endoscopic camera or interventional tools (e.g., biopsy forceps or brush cytology) at the surgical regions of interest, thereby avoiding inadvertent damage to delicate tissue and potential discomfort to the patient.

To mathematically describe motion transition of the soft robot, let $\mathbf{u}_k \in U$ be the fluid pressure (at equilibrium) in the actuation chambers at time step $k$, where $U$ denotes the control space. Let $\theta_k$ be the state of the robot when the chambers are filled with the pressure of $\mathbf{u}_k$ at equilibrium. This state corresponds to the distal tip position $\mathbf{p} \in \Re^3$ and orientation normal $\mathbf{n} \in \Re^3$ in the Cartesian space (Fig. 2), which are collectively represented by $\mathbf{x}_k = [\mathbf{p}, \mathbf{n}]^T \in \Re^6$. The forward transition model of the soft robot can be described by the following equation system:
\begin{align*} \left\{ \begin{matrix} \theta_{k+1} = f(\theta_k, \Delta\mathbf{u}_k) \\ \mathbf{x}_k = h(\theta_k) \end{matrix} \right. \tag{1} \end{align*}

FIG. 2.

Three robot configurations illustrating an example of localized inverse models. Assume that their tip directions $\mathbf{s}_i$ will undergo the same rotation $\Delta\mathbf{s}_{ref}$ (blue arrow) when proper pressure changes $\Delta\mathbf{u}_i$ are applied, where $i = 1, 2, 3$. In the case of configurations 1 and 2, the average of their control inputs $\Delta\mathbf{u}_1$ and $\Delta\mathbf{u}_2$ would still lead to a rotation $\Delta\mathbf{s}_{avg\ 1\&2}$ (red arrow) consistent with $\Delta\mathbf{s}_{ref}$ (blue arrow). When two configurations, such as 1 and 3, are vastly different, the average of inputs $\Delta\mathbf{u}_1$ and $\Delta\mathbf{u}_3$ may lead to a rotation $\Delta\mathbf{s}_{avg\ 1\&3}$ (green arrow) that is significantly different from $\Delta\mathbf{s}_{ref}$ (blue arrow), leading to undesired movement. Therefore, learning the inverse model directly with a global function approximator may lead to invalid solutions and unstable robot performance.

where $\Delta\mathbf{u}_k = \mathbf{u}_{k+1} - \mathbf{u}_k$ is the difference of the fluid pressure. The motion transition function $f$ is a continuous mapping that depends on the current state of the robot $\theta_k$. Compared with rigid-link robots where the robot state can be well defined by joint kinematics, it is difficult to describe the exact state of the soft robot. For example, model-based approaches approximate this robot state based on PCC^15,16,18,20–25 and non-PCC^26–30 constraints. The nonlinear function $h$ transforms robot state $\theta_k$ to Cartesian representation $\mathbf{x}_k$.

Typical endoscopic navigation requires delicate articulation of the distal tip so as to provide accurate positioning and easy access to the soft tissue lesion. A microcamera at the soft robot tip provides forward vision. Therefore, the operator can aim the distal tip at a lesion target on the luminal wall so as to guide the interventional instruments to deploy from the tip through the biopsy channel. This telemanipulated endoscopic navigation gives rise to a robot task space coordinate $\mathbf{s}_k$ defined by its viewing direction (i.e., pitch and yaw angle). The system equation in Equation (1) can hence be extended to an actuation to task space mapping $f_{\mathbf{s}}$ as follows:
\begin{align*} \mathbf{s}_{k+1} = f_{\mathbf{s}}(\theta_k, \Delta\mathbf{u}_k) \tag{2} \end{align*}
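As a minimal sketch of how such a task space coordinate could be computed, the snippet below converts a measured tip normal $\mathbf{n}$ into a pitch and yaw pair. The axis convention (unbent robot pointing along +z, yaw measured in the x-z plane, pitch as elevation toward +y) and all function names are assumptions made for illustration; the article does not specify them.

```python
import math


def viewing_direction(n):
    """Convert a tip normal n = (nx, ny, nz) into (pitch, yaw) in radians.

    Assumed convention (hypothetical, not taken from the article):
    the unbent robot points along +z, yaw is measured in the x-z plane
    from +z toward +x, and pitch is the elevation toward +y.
    """
    nx, ny, nz = n
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    if norm == 0.0:
        raise ValueError("tip normal must be non-zero")
    nx, ny, nz = nx / norm, ny / norm, nz / norm
    yaw = math.atan2(nx, nz)
    pitch = math.atan2(ny, math.hypot(nx, nz))
    return pitch, yaw


def tip_normal(pitch, yaw):
    """Inverse of viewing_direction under the same assumed convention."""
    return (
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
        math.cos(pitch) * math.cos(yaw),
    )


def task_space_displacement(s_now, s_next):
    """Element-wise change in (pitch, yaw); angle wrap-around is ignored."""
    return tuple(b - a for a, b in zip(s_now, s_next))
```

Under this convention, a straight robot maps to zero pitch and zero yaw, and the difference between two successive readings plays the role of the task space displacement used in the following subsection.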

Inverse problem for online learning of task space control

Our control objective is to enable the operator to control displacement of the robot directly in the task space coordinate $\Delta\mathbf{s}_k^*$ (i.e., the desired change in the robot tip orientation) with the use of a motion input device. The superscript “*” denotes the desired motion specified by users or other reference input. Thus, the controller is designed to approximate the inverse of the motion transition $f_{\mathbf{s}}$ in Equation (2), that is, $\Delta\mathbf{u}_k = \tilde\Psi(\Delta\mathbf{s}_k^*, \theta_k)$, to estimate the required change in control input $\Delta\mathbf{u}_k$ (as shown in Fig. 5). The inverse motion transition model $\tilde\Psi$ heavily depends on the current robot state. However, the exact state $\theta_k$ cannot be directly measured due to its hyperflexibility and the interactions with enclosed workspace inside a patient's cavity. We sought to adopt the task space coordinates $\mathbf{s}$, which would offer updated clues about the current robot state.

This approach is also of practical interest because these measurements are readily available in our control system. The task space coordinate $\mathbf{s}$ can be tracked using advanced positional tracking systems. For example, electromagnetic (EM) tracking systems are commonly used in medical application to provide submillimeter-level tracking.^42,43 Together with the actuator's input $\mathbf{u}_k$, these online acquired data are presented to the learning algorithms to update the inverse mapping $\Psi$ during robot run time.
\begin{align*} \Delta\mathbf{u}_k = \Psi(\Delta\mathbf{s}_k^*, \mathbf{s}_k, \mathbf{u}_k) \tag{3} \end{align*}
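To make the interface of Equation (3) concrete, the following sketch stores online samples of $(\Delta\mathbf{s}, \mathbf{s}, \mathbf{u}) \rightarrow \Delta\mathbf{u}$ and answers queries with a Gaussian-kernel weighted average. This is a deliberately simple stand-in and not the learner used in this study; the class name, the kernel, the single shared bandwidth, and the zero-output fallback are all illustrative assumptions.

```python
import math


class KernelInverseModel:
    """Toy stand-in for the learned inverse mapping Psi in Equation (3).

    Stores samples (delta_s, s, u) -> delta_u and answers queries with a
    Gaussian-kernel weighted average of the stored delta_u values.
    The class name, kernel, and bandwidth are assumptions for illustration.
    """

    def __init__(self, bandwidth=0.5):
        self.bandwidth = bandwidth
        self.samples = []  # list of (query_vector, delta_u)

    def add_sample(self, delta_s, s, u, delta_u):
        """Online update: record one observed transition."""
        query = tuple(delta_s) + tuple(s) + tuple(u)
        self.samples.append((query, tuple(delta_u)))

    def predict(self, delta_s_star, s, u):
        """Estimate delta_u for a desired task space displacement."""
        if not self.samples:
            raise ValueError("model has no samples yet")
        query = tuple(delta_s_star) + tuple(s) + tuple(u)
        dim = len(self.samples[0][1])
        weighted_sum = [0.0] * dim
        total_weight = 0.0
        for stored_query, delta_u in self.samples:
            sq_dist = sum((a - b) ** 2 for a, b in zip(query, stored_query))
            weight = math.exp(-sq_dist / (2.0 * self.bandwidth ** 2))
            total_weight += weight
            for i, value in enumerate(delta_u):
                weighted_sum[i] += weight * value
        if total_weight < 1e-300:
            # Query is far from all stored data: command no pressure change.
            return tuple(0.0 for _ in range(dim))
        return tuple(v / total_weight for v in weighted_sum)


# Usage with made-up numbers: two observed transitions, then one query.
model = KernelInverseModel(bandwidth=0.5)
model.add_sample((0.1, 0.0), (0.0, 0.0), (1.0, 1.0, 1.0), (0.2, -0.1, -0.1))
model.add_sample((-0.1, 0.0), (0.0, 0.0), (1.0, 1.0, 1.0), (-0.2, 0.1, 0.1))
print(model.predict((0.1, 0.0), (0.0, 0.0), (1.0, 1.0, 1.0)))
```

Because every new measurement is simply appended, the estimate near the current $(\mathbf{s}_k, \mathbf{u}_k)$ changes as soon as fresh data arrive. A practical implementation would also scale angles and pressures before computing distances, which this sketch omits.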

Note that $\Psi$ is an approximation of the true inverse mapping $\tilde{\Psi}$. If the dimensionality of the task space is smaller than that of the control space, there theoretically exist infinitely many solutions $\Delta \mathbf{u}_k$ that result in the same task space displacement $\Delta \mathbf{s}_k^*$. Learning the inverse mapping $\Psi$ is therefore an ill-posed problem.
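The redundancy argument can be illustrated with a minimal sketch. It assumes a hypothetical linearized forward map `J` for a three-chamber robot with a 2D task space; the numbers are made up and are not the robot's identified model. Adding any null-space component to a command leaves the task space displacement unchanged, so the inverse is not unique.

```python
# Hypothetical linearized forward map for a 3-chamber robot (values made up):
# rows are the two task-space components, columns the three chamber commands.
J = [[1.0, -0.5, -0.5],
     [0.0, 0.75, -0.75]]


def forward(du):
    """Task-space displacement produced by the actuation increment du."""
    return [sum(J[r][c] * du[c] for c in range(3)) for r in range(2)]


# Equal inflation of all chambers lies in the null space of this J.
null_dir = [1.0, 1.0, 1.0]

du_a = [0.2, 0.1, -0.1]
du_b = [du_a[c] + 0.5 * null_dir[c] for c in range(3)]

ds_a = forward(du_a)
ds_b = forward(du_b)
print(ds_a, ds_b)  # the two displacements agree up to rounding
```

In this toy map the null-space direction corresponds to inflating all chambers equally, which would elongate the robot without steering its tip.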

Inverse model learning with multiple local controllers

Nonparametric local learning techniques have been applied to solve the ill-posed inverse problem, aiming to control redundantly actuated robots.^31,44,45 According to Peters and Schaal,³⁶ the inverse model of a rigid-link robot can be learnt using spatially localized nonparametric learning techniques, given that the robot state is well defined by joint kinematics. In this study, spatial localization refers to the robot state $\theta_k$. Such a localization scheme is motivated by the hypothesis that the inverse problem is well defined locally.³⁶ Localization is needed because nonparametric learning techniques essentially average the sampled data, so model learning based on nonconvex training datasets would give invalid solutions.³⁶

However, in the vicinity of $(\mathbf{s}, \mathbf{u})$, the average of $\Delta \mathbf{u}$ would be consistent with the average of the task space displacement $\Delta \mathbf{s}$ (Fig. 2). Therefore, in a local region around a given $(\mathbf{s}, \mathbf{u})$, the training dataset $\{ \Delta \mathbf{u}, \Delta \mathbf{s}, \mathbf{s}, \mathbf{u} \}$ becomes a convex set, which enables learning of the inverse mapping in that region (Fig. 2). We approximate the local inverse mapping from the desired task space displacement to the actuation command as follows:

\begin{align*} \Delta \mathbf{u}_k = \Psi^i ( \Delta \mathbf{s}_k^* , \mathbf{s}_k , \mathbf{u}_k ) = [ \Delta \mathbf{s}_k^* ]^{\rm T} \beta^i \tag{4} \end{align*}
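The need for localization can be illustrated with a toy example. The sketch below assumes a scalar forward map `forward(u) = u * u`, chosen only because it has two valid commands for the same target; it is not a model of the robot. Averaging over the global (nonconvex) solution set gives an invalid command, whereas averaging within a small neighborhood stays consistent.

```python
def forward(u):
    """Toy nonlinear forward map (an assumption, not the robot model)."""
    return u * u


def mean(values):
    return sum(values) / len(values)


target = 1.0

# Global, nonconvex solution set: both commands reach the target,
# but their average does not.
global_solutions = [-1.0, 1.0]
global_error = abs(forward(mean(global_solutions)) - target)

# Local solution set around u = 1: averaging the commands stays consistent
# with averaging the resulting task-space values.
local_commands = [0.98, 1.02]
local_outputs = [forward(u) for u in local_commands]
local_error = abs(forward(mean(local_commands)) - mean(local_outputs))

print(global_error, local_error)
```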

Online learning of the global controller

To approximate the global inverse mapping, we employ a linear combination of the locally learned mappings⁴⁶:

\begin{align*} \Delta \mathbf{u}_k &= \frac{ \sum_{i = 1}^n w^i ( \mathbf{s}_k , \mathbf{u}_k ) \, \Psi^i ( \Delta \mathbf{s}_k^* , \mathbf{s}_k , \mathbf{u}_k ) }{ \sum_{i = 1}^n w^i ( \mathbf{s}_k , \mathbf{u}_k ) } \\ &= \frac{ \sum_{i = 1}^n w^i ( \mathbf{s}_k , \mathbf{u}_k ) \, [ \Delta \mathbf{s}_k^* ]^{\rm T} \beta^i }{ \sum_{i = 1}^n w^i ( \mathbf{s}_k , \mathbf{u}_k ) }. \tag{5} \end{align*}

This controller architecture allows straightforward one-iteration computation in each time step, in contrast to indirect modeling approaches.³⁴ The number of local models $n$ and the weights $w^i ( \mathbf{s}_k , \mathbf{u}_k )$, as well as the local controllers $\Psi^i ( \Delta \mathbf{s}_k^* , \mathbf{s}_k , \mathbf{u}_k )$, can be obtained in an online manner.
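The weighted combination of local controllers (Eqs. 4 and 5) can be sketched in a few lines. The function names, the two example models, and their weights are hypothetical; each `beta` is stored as a nested list with one row per task-space component and one column per actuation channel.

```python
def local_command(ds_star, beta):
    """Eq. 4: du = ds*^T beta, with beta of shape (dim_s, dim_u)."""
    dim_u = len(beta[0])
    return [sum(ds_star[r] * beta[r][c] for r in range(len(ds_star)))
            for c in range(dim_u)]


def blended_command(ds_star, weights, betas):
    """Eq. 5: normalized, weight-blended sum of the local commands."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("no local model is active at this state")
    dim_u = len(betas[0][0])
    du = [0.0] * dim_u
    for w, beta in zip(weights, betas):
        cmd = local_command(ds_star, beta)
        for c in range(dim_u):
            du[c] += w * cmd[c] / total
    return du


# Two made-up local models and their activations at the current state.
betas = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
         [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]]
weights = [0.75, 0.25]
du = blended_command([0.4, 0.2], weights, betas)
print(du)
```

Because the command is a single weighted sum, its cost per time step grows only linearly with the number of active local models.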

For this purpose, the local forward model is learnt using locally weighted projection regression (LWPR),³⁷ which offers piecewise linear function approximation while simultaneously determining the appropriate local region of each linear model. Each local forward model performs a linear mapping as follows:

\begin{align*} \Delta \mathbf{s}_k = f_{\mathbf{s}}^i ( \mathbf{s}_k , \mathbf{u}_k , \Delta \mathbf{u}_k ) = [ \Delta \mathbf{u}_k ]^{\rm T} \hat{\beta}^i \tag{6} \end{align*}

The region of validity of each local model, termed its receptive field (RF), is described by a membership function $w^i ( \mathbf{s}_k , \mathbf{u}_k )$ centered at $\mathbf{c}^i$, where $\mathbf{D}^i$ is the distance metric. Each membership function weights the corresponding locally learned inverse model in the controller (Eq. 5).
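For orientation, standard LWPR uses a Gaussian kernel as its membership function. The sketch below assumes that form with a diagonal distance metric; the exact expression used in this study is not reproduced in the text above, and the function name and example values are hypothetical.

```python
import math


def rf_weight(x, center, d_diag):
    """Gaussian membership exp(-0.5 * (x - c)^T D (x - c)), diagonal D."""
    q = sum(d * (xi - ci) ** 2 for xi, ci, d in zip(x, center, d_diag))
    return math.exp(-0.5 * q)


center = [0.5, 0.2]
w_at_center = rf_weight(center, center, [4.0, 4.0])
w_wide = rf_weight([0.9, 0.2], center, [4.0, 4.0])      # smaller D, wider RF
w_narrow = rf_weight([0.9, 0.2], center, [40.0, 40.0])  # larger D, narrower RF
print(w_at_center, w_wide, w_narrow)
```

A larger distance-metric value makes the weight decay faster away from the center, so the same query point activates a narrow RF less than a wide one.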

One advantage of LWPR is that it can automatically spawn new linear models and the corresponding RFs when new data lying outside all existing RFs are presented. Meanwhile, the center $\mathbf{c}^i$ of each RF, as well as the total number of local regions $n$, is determined from the inputs of the new data through incremental learning (Fig. 3). Each newly spawned RF is initialized with a diagonal distance metric $\mathbf{D}^i$. This $\mathbf{D}^i$ value is updated throughout the incremental learning process to improve the overall regression accuracy and convergence rate. To prevent overfitting and the allocation of an excessive number $n$ of RFs, a smaller initial $\mathbf{D}^i$ value is preferred (i.e., larger RFs). Cross-validation is also employed in determining the initial $\mathbf{D}^i$, which is important to ensure that the forward model can be accurately reflected by piecewise linear regression.
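The effect of the initial distance metric on the number of RFs can be seen in a simplified sketch of the spawning rule. It assumes a Gaussian activation and a fixed threshold `W_GEN` below which a new RF is created; the threshold value, the function names, and the 1D input stream are illustrative assumptions, and the incremental update of the distance metric is omitted.

```python
import math

W_GEN = 0.1  # assumed activation threshold for spawning a new RF


def activation(x, center, d_diag):
    """Gaussian activation of an RF with diagonal distance metric."""
    q = sum(d * (xi - ci) ** 2 for xi, ci, d in zip(x, center, d_diag))
    return math.exp(-0.5 * q)


def allocate_rfs(points, init_d):
    """Return the RF centers spawned while streaming `points` once."""
    centers = []
    for x in points:
        d_diag = [init_d] * len(x)
        # Spawn a new RF only if no existing RF is sufficiently activated.
        if all(activation(x, c, d_diag) < W_GEN for c in centers):
            centers.append(list(x))
    return centers


points = [[0.25 * j] for j in range(9)]  # 1D inputs from 0.0 to 2.0
n_wide = len(allocate_rfs(points, init_d=1.0))     # small D, wide RFs
n_narrow = len(allocate_rfs(points, init_d=25.0))  # large D, narrow RFs
print(n_wide, n_narrow)
```

With the smaller initial value a single wide RF covers the whole input range, whereas the larger value spawns several narrow RFs over the same data.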

FIG. 3.

Example set of localized linear controllers that approximate the nonlinear inverse mapping $\Psi$ of a 1D actuation $\Delta \mathbf{u}$. The valid region of each spatially localized controller is centered at $\mathbf{c}^i$ (denoted by a plus sign), with the range parameterized by $\mathbf{D}^i$ (colored ellipse) in the robot state space. The warm color depicts the actuation $\Delta \mathbf{u}$ predicted by the linear control law $\beta^i$ to achieve a particular movement $\Delta \mathbf{s}^*$ in task space.

Although each RF could fulfill the local convexity requirement, redundancy in the robotic system means that the solutions of the local controllers (Eq. 4) could be inconsistent with the desired solutions.³⁶ Although this problem could be resolved by preprocessing the training data such that it only produces one particular solution, that approach lacks generality and is difficult to apply in high-dimensional systems.³¹ Therefore, we employ another approach that reshapes the local inverse models using constrained optimization, in which the local controllers are forced to provide consistent solutions out of the infinite possibilities in the null space of the control space. We define the optimization problem as follows:

\begin{align*} \min_{ \Delta \mathbf{u}_k } C_k ( \Delta \mathbf{u}_k ) = ( \Delta \mathbf{u}_k - \Delta \mathbf{u}_{0 , k} )^{\rm T} \mathbf{N} ( \Delta \mathbf{u}_k - \Delta \mathbf{u}_{0 , k} ) \tag{8} \end{align*}

subject to $\Delta \mathbf{u}_k = \Psi ( \Delta \mathbf{s}_k^* , \mathbf{s}_k , \mathbf{u}_k )$,

where the cost function $C_k$ represents the user-defined optimality criterion scaled by a diagonal matrix $\mathbf{N}$, and $\Delta \mathbf{u}_{0 , k} = \upsilon ( \mathbf{s}_k , \mathbf{u}_k )$ is the user-defined null-space behavior. One example of null-space behavior is minimizing the elongation of the robot, which results in a smaller bending radius and facilitates dexterous motion inside an enclosed cavity. Finally, the optimization constraint $\Delta \mathbf{u}_k = \Psi ( \Delta \mathbf{s}_k^* , \mathbf{s}_k , \mathbf{u}_k )$ ensures the correctness of the inverse solution.
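A minimal sketch of the cost in Eq. 8 follows, assuming a diagonal $\mathbf{N}$ stored as a list and a zero null-space target that penalizes total inflation (a stand-in for the elongation example). The two candidate commands are made-up values that differ only by an equal inflation of all chambers.

```python
def null_space_cost(du, du0, n_diag):
    """Eq. 8 with diagonal N: sum_j N_jj * (du_j - du0_j)^2."""
    return sum(n * (a - b) ** 2 for a, b, n in zip(du, du0, n_diag))


n_diag = [1.0, 1.0, 1.0]  # assumed equal scaling of all chambers
du0 = [0.0, 0.0, 0.0]     # assumed null-space target: no extra inflation

du_a = [0.2, 0.1, -0.1]   # candidate command
du_b = [0.7, 0.6, 0.4]    # du_a plus 0.5 added to every chamber

cost_a = null_space_cost(du_a, du0, n_diag)
cost_b = null_space_cost(du_b, du0, n_diag)
print(cost_a, cost_b)
```

Under these assumptions the command with less common-mode inflation has the lower cost, so it would be the one favored by the optimization.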

The constrained optimization problem can be solved by introducing a reward function (Eq. 9) and a cost function (Eq. 10):

\begin{align*} r ( \mathbf{u}_k ) = \sigma_i \exp ( - 0.5 \, \sigma_i^2 C_k ) \tag{9} \end{align*}

\begin{align*} E_i = \sum_{k = 1}^N r ( \mathbf{u}_k ) \, w^i ( \mathbf{s}_k , \mathbf{u}_k ) \, ( \Delta \mathbf{u}_k - [ \Delta \mathbf{s}_k ]^{\rm T} \beta^i )^2 \tag{10} \end{align*}

The reward function $r ( \mathbf{u}_k )$ is scaled by the mean cost $\sigma_i$ to improve learning efficiency³⁶:

\begin{align*} \sigma_i^2 = \sum_{h = 1}^k w^i ( \mathbf{s}_h , \mathbf{u}_h ) \, C_h \Big/ \sum_{h = 1}^k w^i ( \mathbf{s}_h , \mathbf{u}_h ) \tag{11} \end{align*}
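The reward and mean-cost computations (Eqs. 9 and 11) can be sketched with running sums, so that the mean cost of one local model is updated sample by sample. The class name and the sample values are hypothetical.

```python
import math


class LocalCostStats:
    """Running weighted mean cost sigma_i^2 of one local model (Eq. 11)."""

    def __init__(self):
        self.sum_wc = 0.0  # running sum of w^i * C_h
        self.sum_w = 0.0   # running sum of w^i

    def update(self, weight, cost):
        self.sum_wc += weight * cost
        self.sum_w += weight

    def sigma_sq(self):
        if self.sum_w <= 0.0:
            raise ValueError("local model has not been activated yet")
        return self.sum_wc / self.sum_w


def reward(cost, sigma_sq):
    """Eq. 9: r = sigma * exp(-0.5 * sigma^2 * C)."""
    return math.sqrt(sigma_sq) * math.exp(-0.5 * sigma_sq * cost)


stats = LocalCostStats()
for weight, cost in [(1.0, 0.06), (0.5, 1.01)]:  # made-up (w, C) samples
    stats.update(weight, cost)

s2 = stats.sigma_sq()
r_low_cost = reward(0.06, s2)
r_high_cost = reward(1.01, s2)
print(s2, r_low_cost, r_high_cost)
```

Samples with a lower null-space cost receive a higher reward and therefore weigh more in the subsequent regression.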

The cost function is then minimized by means of reward-weighted regression, in which each local model is updated as follows:

\begin{align*} \beta_{k + 1}^i = ( \mathbf{X}^{\rm T} \mathbf{W}^i \mathbf{X} )^{ - 1} \mathbf{X}^{\rm T} \mathbf{W}^i \mathbf{Y} \tag{12} \end{align*}

Algorithm 1.

Online Learning Algorithm of Inverse Mapping

1	for each new input data sample $[ \Delta \mathbf{s}_k^* , \mathbf{s}_k , \mathbf{u}_k , \Delta \mathbf{u}_k ]$
2	Add $( \mathbf{s}_k , \mathbf{u}_k , \Delta \mathbf{u}_k ) \to \Delta \mathbf{s}_k^*$ to the forward model LWPR.
3	Update the current number of models $n$ and the localization of the forward models $w^i ( \mathbf{s} , \mathbf{u} )$ for all input data.
4	Compute the desired null-space behavior: $\Delta \mathbf{u}_{0 , k} = \upsilon ( \mathbf{s}_k , \mathbf{u}_k )$.
5	Compute the cost $C_k = ( \Delta \mathbf{u}_{1 , k} )^{\rm T} \mathbf{N} \Delta \mathbf{u}_{1 , k}$, where $\Delta \mathbf{u}_{1 , k} = \Delta \mathbf{u}_k - \Delta \mathbf{u}_{0 , k}$.
6	for each model $i = 1 , 2 , \ldots , n$
7	Update the mean cost: $\sigma_i^2 = \sum_{h = 1}^k w^i ( \mathbf{s}_h , \mathbf{u}_h ) \, C_h \big/ \sum_{h = 1}^k w^i ( \mathbf{s}_h , \mathbf{u}_h )$.
8	Compute the reward: $r ( \Delta \mathbf{u}_k ) = \sigma_i \exp ( - 0.5 \, \sigma_i^2 C_k )$.
9	Solve the following reward-weighted regression problem with steps 10–13: $E_i = \sum_{k = 1}^N r ( \mathbf{u}_k ) \, w^i ( \mathbf{s}_k , \mathbf{u}_k ) \, ( \Delta \mathbf{u}_k - [ \Delta \mathbf{s}_k ]^{\rm T} \beta^i )^2$.
10	Add the new data point to the weighted regression: $\mathbf{X}_k = [ \Delta \mathbf{s}_k^* ]$,
11	$\mathbf{Y}_k = [ \Delta \mathbf{u}_k ]$,
12	$\mathbf{W}^i = \mathrm{diag} ( r ( \mathbf{u}_1 ) w_1^i , \ldots , r ( \mathbf{u}_k ) w_k^i )$.
13	Update the weighted regression of the inverse mapping model: $\beta_{k + 1}^i = ( \mathbf{X}^{\rm T} \mathbf{W}^i \mathbf{X} )^{ - 1} \mathbf{X}^{\rm T} \mathbf{W}^i \mathbf{Y}$.
14	end
15	end
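The regression update in steps 10 to 13 can be sketched as a batch weighted least-squares solve. This is a simplified illustration, assuming a 2D task space so that the 2×2 normal matrix can be inverted in closed form; it recomputes the solution from all stored samples rather than updating incrementally. The function name and the sample data are hypothetical.

```python
def weighted_regression(X, Y, weights):
    """beta = (X^T W X)^{-1} X^T W Y for samples with two input columns."""
    dim_u = len(Y[0])
    a00 = a01 = a11 = 0.0                    # entries of A = X^T W X
    B = [[0.0] * dim_u for _ in range(2)]    # B = X^T W Y
    for x, y, w in zip(X, Y, weights):
        a00 += w * x[0] * x[0]
        a01 += w * x[0] * x[1]
        a11 += w * x[1] * x[1]
        for c in range(dim_u):
            B[0][c] += w * x[0] * y[c]
            B[1][c] += w * x[1] * y[c]
    det = a00 * a11 - a01 * a01
    if abs(det) < 1e-12:
        raise ValueError("samples do not span the task space")
    inv = [[a11 / det, -a01 / det], [-a01 / det, a00 / det]]
    return [[inv[r][0] * B[0][c] + inv[r][1] * B[1][c] for c in range(dim_u)]
            for r in range(2)]


# Made-up samples generated from a known local inverse model beta_true.
beta_true = [[1.0, 0.0, 0.5], [0.0, 2.0, -0.5]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # rows: task-space displacements
Y = [[sum(x[r] * beta_true[r][c] for r in range(2)) for c in range(3)]
     for x in X]                              # rows: actuation increments
rewards = [1.0, 1.0, 0.5]
activations = [0.9, 0.5, 0.4]
combined = [r * w for r, w in zip(rewards, activations)]  # diagonal of W^i

beta = weighted_regression(X, Y, combined)
print(beta)
```

Because the example data are noise-free, the solve recovers `beta_true` regardless of the weights; with real data, the reward and activation weights decide which samples dominate the fit.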

Experiments, Results, and Discussion

The proposed control framework is implemented on a custom-made soft robot to investigate its performance and behavior under external dynamic constraints. We also utilize FEA to simulate robot motion data for pretraining an initial control policy. This avoids the need for random exploration of the robot workspace to initialize the online learning functions. Such exploration is usually time-consuming and may not be practical, particularly for single-use purposes in surgical applications. Accuracy and stability of the proposed controller are examined through path following under various constrained environments. The interaction force with the external constraint is also measured throughout the experiments. The control block diagram of the overall robotic system, including the processing core and actuation system, is illustrated in Figure 6.

Initialization of online learning by FEA-based model

Proper initialization with pretraining data is essential to many online learning techniques. These data are dedicated to pretraining an initial control policy before the online learning begins, and they are usually acquired by driving the robot with random inputs. Instead, we propose to incorporate FEA, by which robot deformation can be simulated with a hyperelastic computation model. This simulation can generate comprehensive pretraining samples that cover the entire robot workspace at a high resolution, facilitating offline pretraining of the learning-based controller (Fig. 5).

The FEA model of the robot is constructed using ABAQUS⁴⁷ to predict the robot kinematics and workspace. RTV silicone rubber is treated as an incompressible hyperelastic material formulated by the Ogden material model.⁴⁸ It exhibits negligible volume change under hydrostatic compression and has a Poisson's ratio close to 0.5. Due to the incompressibility of silicone rubber and the large-deformation nature of the simulation, the element formulation and mesh quality strongly affect both the accuracy and the convergence of the simulation. Therefore, a hexahedral element (C3D8RH; Fig. 1c) based on the u-p hybrid formulation with hourglass control⁴⁷ is chosen over the commonly used quadratic tetrahedral elements (Fig. 1d) in the FEA of our soft robotic manipulators.

The C3D8RH element possesses eight displacement nodes and one interior pressure node. The combination of these displacement and pressure nodes is often close to optimal.⁴⁹ Such an integration scheme improves not only element efficiency but also element accuracy under bending load. However, compared with tetrahedrons, automatic mesh generation of hexahedrons is relatively ineffective, resulting in poor tessellation quality. Therefore, the presented mesh is obtained by custom-designed protrusions, with all elements initially being right prisms. With the mesh quality restored, the assemblage contains far fewer elements and converges much more robustly.

The presented manipulator model is tessellated with 12k linear hexahedral elements (C3D8RH; Fig. 1c). In addition, 2,214 linear truss elements (T3D2) are placed along each actuation chamber in a layer-by-layer arrangement (Fig. 1b). The truss elements model the helical strain-wrapping constraints that ensure the anisotropic expansion of the chambers upon pressure actuation. Actuation and gravity loads are applied to the presented FEA model. The gradual change of the stress input, which is distributed across the surface mesh along the inner chamber surface, guarantees reliable convergence, giving rise to an equilibrium solution throughout all the time steps during the FEA.

Quasi-static motion with negligible hysteresis can be achieved when the real robot prototype is manipulated while delicately regulating the inflation pressure into the chamber in high-resolution steps. It is worth noting that the deformation/bending of the FEA-modeled manipulator and that of the actual one are very similar at the same levels of inflation pressure, as shown in Figure 4. Over 1,000 simulated motion samples $\{ \Delta \mathbf{u} , \Delta \mathbf{s} , \mathbf{s} , \mathbf{u} \}$ have been obtained using the FEA, covering the entire robot workspace (Fig. 5). These simulated data are adopted to pretrain the online learning controller as described in the following sections.

FIG. 4.

FEA models (left) simulated with seven levels of inflation pressure in a single chamber. Similar deformation characteristics are exhibited in actual configurations of the soft manipulator (right) under the same corresponding pressure levels. FEA, finite element analysis.

FIG. 5.

FEA-simulated kinematic data covering the entire workspace of the soft robot. The arrows illustrate the predicted movement of the robot tip when an arbitrary pressure change $\Delta \mathbf{u}_k$ is applied. These data enable pretraining of a reasonable initial control policy before the online learning begins, without the need for undesired random movement (babbling).

Experimental setup

To evaluate the performance of the proposed controller, three motorized pneumatic units are employed to actuate the presented soft manipulator, which is integrated into our closed-loop control testing platform (Fig. 6). Each unit consists of a pneumatic cylinder coupled to a precise stepper motor through a lead screw transmission. This facilitates accurate regulation of air flow. Our soft robotic manipulator can be fully articulated in a dome-shaped workspace with a maximum curve angle of >150° in all directions.

FIG. 6.

System architecture of the proposed control framework depicting interconnections of key components. The processing core is responsible for fast computation of the inverse solution. The inverse model is also updated continuously by incorporating the online data in real time. The operator can specify the reference input $\mathbf{s}_k^{ref}$ through a motion input device for effective endoscopic navigation. In our experiments, this input is replaced by a predefined reference trajectory to evaluate the online learning performance of inverse mapping.

An EM tracking system (NDI Medical Aurora) is employed to close the robot control loop through continuous positional data feedback (Fig. 7a). This tracking system is commonly available in many image-guided intervention systems. It can track the position and orientation of tiny EM coils in real time with root mean square (RMS) accuracy of 0.7 mm and 0.2° at 40 Hz. A tiny tracking coil is embedded at the robot distal tip. The local learning algorithm updates the inverse mapping estimate $\Delta \mathbf{u}_k = \Psi ( \Delta \mathbf{s}_k^* , \mathbf{s}_k , \mathbf{u}_k )$ online at 20 Hz, where $\mathbf{s}_k$ is the measured tip direction. The positional data are also recorded throughout the robot task to evaluate overall control performance. The entire control framework is implemented in the MATLAB environment. The open-source LWPR library⁵⁰ is employed to incrementally learn the robot forward model, which determines the valid linearization of each local controller.

FIG. 7.

(a) Registration process of the predefined trajectory using an electromagnetic (EM) position tracking system. The blue line on the transparent sphere illustrates the tracking trajectory in the task space; (b) The soft manipulator is commanded to follow the desired trajectory automatically. Its end-effector position is also measured by the tracking system to close the feedback loop under the online learning control policy. A plastic rod actuated by a stepping motor pushes against the soft robot, generating external constraints. The contact force is monitored by a force/torque sensor.

A series of path-following tasks is performed under various constraint scenarios to investigate how the online learning control approach reacts to such unknown interactions. At the beginning, the robot is allowed to move freely in its workspace without any interference. This serves as the control experiment to establish the baseline of controller performance. Subsequently, the robot is gently pushed by a plastic rod to simulate an unknown dynamic interaction with the robot manipulation (Fig. 7b). The rod is actuated by a high-precision stepping motor to generate repeatable contact with the robot body; meanwhile, the contact force is monitored by a force/torque sensor (ATI Industrial Automation: F/T Nano17). The tracking error is defined as the shortest distance between the robot targeting direction $\mathbf{s}_k$ and the desired trajectory.
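
As an illustration of this error metric, the following minimal Python sketch computes the smallest angle between a measured tip direction and a sampled reference trajectory. It assumes that both are expressed as 3D direction vectors and that the trajectory is approximated by discrete samples; the function names and the circular test trajectory are hypothetical and are not taken from the MATLAB implementation used in the experiments.

```python
import math

def angle_deg(a, b):
    """Angle in degrees between two 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against rounding before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def tracking_error_deg(s_k, trajectory):
    """Shortest angular distance from direction s_k to the sampled trajectory."""
    return min(angle_deg(s_k, p) for p in trajectory)

# Hypothetical reference: a circle of directions tilted 30 deg from the z axis,
# sampled every 1 deg of azimuth.
tilt = math.radians(30.0)
trajectory = [
    (math.sin(tilt) * math.cos(math.radians(a)),
     math.sin(tilt) * math.sin(math.radians(a)),
     math.cos(tilt))
    for a in range(360)
]

# A tip direction tilted 32 deg from z (azimuth 0) lies 2 deg off the circle.
s_k = (math.sin(math.radians(32.0)), 0.0, math.cos(math.radians(32.0)))
error = tracking_error_deg(s_k, trajectory)
```

With a sampled trajectory the result is only as fine as the sampling density, so a practical implementation would likely interpolate between neighboring samples.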

Evaluation of online local learning controller

To realize accurate navigation under unknown constraints, the inverse model is adapted in the proposed learning-based controller, which has to be updated online based on the newly acquired motion data. In this study, we compared three types of data sources for the inverse model training: (1) pretrained by FEA data without using online data; (2) initialized by random exploration with online learning data; and (3) pretrained by the FEA data, and then updated by online data. These online-updated inverse models are evaluated for resolved motion rate control⁵¹ to track a predefined trajectory. Thus, the desired task space displacement $\Delta\mathbf{s}_k^*$ that tracks the reference input is obtained as follows:

$$\Delta\mathbf{s}_k^* = \Delta\mathbf{s}_k^{ref} + \mathbf{K}_p^{ref}(\mathbf{s}_k^{ref} - \mathbf{s}_k) \tag{13}$$

where $\Delta\mathbf{s}_k^{ref}$ and $\mathbf{s}_k^{ref}$ are the reference task space displacements and coordinates generated from interpolating a predefined trajectory. Note that the reference input can be replaced by manual control in an actual endoscopic navigation scenario. We employed the same proportional–derivative (PD) gain $\mathbf{K}_p^{ref} = \mathbf{I}$ for all three settings to perform tracking along a reference trajectory. Thus, the actuation input $\Delta\mathbf{u}_k$ is estimated by the online learning inverse model as depicted in Equation (4).
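
A minimal Python sketch of Equation (13) is given below for illustration. It assumes a scalar gain standing in for $\mathbf{K}_p^{ref} = \mathbf{I}$; the function name and the numerical values are hypothetical. The subsequent mapping from $\Delta\mathbf{s}_k^*$ to $\Delta\mathbf{u}_k$ is performed by the learned inverse model and is not reproduced here.

```python
def desired_displacement(delta_s_ref, s_ref, s_k, k_p_ref=1.0):
    """Eq. (13): reference step plus proportional correction of the tip error.

    k_p_ref is a scalar here, standing in for K_p^ref = k_p_ref * I.
    """
    return [d + k_p_ref * (r - s) for d, r, s in zip(delta_s_ref, s_ref, s_k)]

# Hypothetical values for one control step.
s_ref = [0.10, 0.20, 0.97]        # reference coordinate s_k^ref
delta_s_ref = [0.01, 0.00, 0.00]  # reference displacement
s_k = [0.08, 0.21, 0.97]          # measured tip coordinate s_k

delta_s_star = desired_displacement(delta_s_ref, s_ref, s_k)
```

When the measured tip coincides with the reference, the correction term vanishes and the desired displacement reduces to the interpolated reference step.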

To enforce the consistency of inverse mapping among all localized linear controllers, a standard null-space behavior $\Delta\mathbf{u}_{0,k} = \upsilon(\mathbf{s}_k, \mathbf{u}_k)$ is defined. This gives rise to an immediate reward function $r(\Delta\mathbf{u}_k)$ to weigh the training data that best imitate the desired null-space behavior (Eq. 9). For the presented soft robot, we first choose a rest configuration to be $\mathbf{u}_{rest} = [0, 0, 0]^{\mathbf{T}}$, which can minimize the overall inflation pressure as well as elongation of the manipulator. Then, the robot is attracted toward the rest configuration with a loose attractor function $\Delta\mathbf{u}_{0,k} = -\mathbf{K}_p(\mathbf{u}_k - \mathbf{u}_{rest})$, where $\mathbf{K}_p = 0.2\mathbf{I}$. We defined an identity metric $\mathbf{N} = \mathbf{I}$ as all three inflatable actuators of the robot are identical and should contribute the same in achieving the desired null-space behavior.
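
The attractor above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the example input are hypothetical, the gain is passed as a scalar standing in for $\mathbf{K}_p = 0.2\mathbf{I}$, and the reward weighting of Eq. 9 is deliberately left out because its definition lies outside this section.

```python
def null_space_attractor(u_k, u_rest=(0.0, 0.0, 0.0), gain=0.2):
    """Delta u_0 = -K_p (u_k - u_rest), with K_p = gain * I."""
    return [-gain * (u - r) for u, r in zip(u_k, u_rest)]

# Hypothetical actuation input (arbitrary units) for the three actuators.
u_k = [10.0, 5.0, 0.0]
delta_u0 = null_space_attractor(u_k)
```

If this attractor were applied on its own, each step would remove 20% of the remaining offset from the rest configuration, which is why it acts as a loose pull rather than a hard constraint.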

It is also necessary to normalize the training dataset into the same scale component-wise so that the LWPR can learn the data variance properly. Min-max normalization is a simple but effective technique commonly used⁵²:

$$\hat{q}_i = \frac{q_i - \min(q_i)}{\max(q_i) - \min(q_i)} \tag{14}$$

However, the statistical max(q_i) and min(q_i) values would be sensitive to outliers; therefore, we define the min–max values according to the physical constraints of data, including the typical robot workspace and the maximum volume of the cylinder unit.
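
The following Python sketch illustrates Equation (14) with fixed bounds in place of statistical extrema. The bound values are hypothetical placeholders, not the workspace or cylinder-volume limits used in the experiments.

```python
def normalize(q, q_min, q_max):
    """Component-wise min-max normalization (Eq. 14) with fixed bounds."""
    return [(x - lo) / (hi - lo) for x, lo, hi in zip(q, q_min, q_max)]

# Hypothetical physical limits per component (placeholders only).
Q_MIN = [0.0, 0.0, 0.0]
Q_MAX = [20.0, 20.0, 20.0]

q_hat = normalize([5.0, 10.0, 20.0], Q_MIN, Q_MAX)
```

Because the bounds are fixed in advance, a single outlying measurement cannot rescale the data already seen, although samples outside the assumed limits would fall outside the interval [0, 1].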

Pretrained by FEA without using online data

In this setting, both the forward model and control policy are pretrained solely by the FEA-simulated data (see the Initialization of online learning by FEA-based model section). The online data were not taken into account. This acts as a control experiment to depict the actual influence of external interactions. In the unconstrained experiment (Fig. 8a), it was observed that the controller could roughly follow the trajectory with a relatively large mean tracking error of ±1.79° and a maximum error of ±6.96° with the use of the feedback controller (Table 1). Despite the considerable discrepancy between the FEA-simulated and actual configuration, this experiment still demonstrates that the FEA data are capable of pretraining a reasonable inverse model for rough path following.

FIG. 8.

Tracked trajectory plotted (left) and the corresponding tracking error in time domain (right). In the control experiment, the robot is allowed to move freely without any constraint. Control performance of the online learning controllers trained by three different data sources is validated: (a) Pretrained by FEA without using online data; (b) Pure online learning initialized by random exploration; (c) Pretrained by FEA data and updated by online data. The online learning initialized by the FEA data approach (c) combines the advantage of (a, b), in which random exploration (green path in (b)) is not required, but its tracking errors converge to similar accuracy as in pure online learning.

Table 1.

Trajectory Tracking Performance Under Freely Moveable Environment

	Mean absolute error		Maximum absolute error		Error SD $\sigma$
Training mode	First cycle	After	First cycle	After	First cycle	After
Pretrained by FEA only	±1.79°	±1.82°	±6.96°	±6.56°	1.71°	1.66°
Pure online learning	±1.13°	±0.87°	±4.24°	±1.92°	0.86°	0.45°
Combined	±2.21°	±0.90°	±7.49°	±2.80°	1.81°	0.65°

FEA, finite element analysis; SD, standard deviation.
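
For reference, the three statistics reported in Table 1 could be computed from a logged error series as in the Python sketch below. It assumes that the SD is the population standard deviation of the absolute error; the exact convention used for the table is not stated, and the function name and sample values are hypothetical.

```python
import math
import statistics

def tracking_stats(errors_deg):
    """Mean absolute error, maximum absolute error, and SD of an error series."""
    abs_err = [abs(e) for e in errors_deg]
    return {
        "mean_abs": statistics.mean(abs_err),
        "max_abs": max(abs_err),
        # Assumption: population SD of the absolute error.
        "sd": statistics.pstdev(abs_err),
    }

# Hypothetical error samples in degrees.
stats = tracking_stats([1.0, -2.0, 3.0, -2.0])
```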

In the later constrained experiment (Fig. 9a), the robot maintained tracking of the trajectory with similar accuracy at the beginning. When the external interaction was engaged at 25 s, the robot was pushed further away from the desired trajectory, resulting in an increased mean tracking error of ±4.64° and a maximum error of ±14° (Table 2). This indicates that the feedback controller cannot fully compensate for the significant motion bias that is induced by external disturbance.

FIG. 9.

Tracked trajectory plotted (left) and the corresponding tracking error in time domain (right) under external interactions. Control performance is validated in three different conditions as in Figure 8. It can be observed that online learning for (b, c) is capable of compensating the external interaction with the tracking error reduced, compared with the controller without using online data (a).

Table 2.

Trajectory Tracking Performance Under Constrained Environment

Training mode	Mean absolute error	Maximum absolute error	Error SD $\sigma$	Mean contact force	Maximum contact force
Pretrained by FEA only	±4.64°	±14.00°	3.20°	0.77N	1.06N
Pure online learning	±2.35°	±18.38°	2.47°	0.78N	0.97N
Combined	±2.49°	±11.03°	1.74°	0.80N	0.96N

In the case of a conventional rigid-linked robot, this kind of error due to the interaction with the constraint is often considered as a perturbation. The error can hence be compensated by increasing the feedback control gain given that the inverse model is readily available from the kinematic chain. However, such an approach is not directly applicable to a soft robot because its mechanical compliance inevitably induces much larger positioning errors. In addition, the interaction force may also alter the force equilibrium of the robot, thereby substantially degrading the reliability of the predetermined inverse model. The following experiments demonstrate how the proposed online algorithm can accommodate the influence of a constrained environment, which is particularly demanding for the control of soft robots.

Initialized by random exploration with online learning

The random exploration of robot workspace is a typical approach³⁴ to initialize a data-driven controller before its actual deployment. This kind of arbitrary movement is necessary to provide preceding data for setting up a learning model. It involves tracking 50 random input pressure waypoints $\mathbf{u}_k$ with a PD feedback controller. The deliberately tuned PD gains can cause poor tracking of random waypoints. Such babbling movement (green path in Figs. 8b and 9b) can facilitate a faster learning rate as the robot sweeps throughout a wider neighboring workspace. Pretraining with the exploration data resulted in a forward LWPR model with 110 receptive fields (RFs), which define linearization for the piecewise linear inverse model in advance of actual deployment of the online learning.

Upon exploration, the online learning controller could follow the desired trajectory with an average error of ±1.13° in the first cycle under the constraint-free environment (Fig. 8b). The error was found to be significantly lower than that of the inverse model pretrained by FEA-simulated data. This is reasonable because the actual robot data were used. After a few cycles, the tracking error further decayed to an average of ±0.87° and maximum of ±1.92° as the online learning controller adapted to the trajectory.

Next, the feasibility of online inverse model adaptation was validated by engaging external force interactions (Fig. 9b). The online learning controller can compensate for the bias and hence reduce the error to an average of ±2.35° within 5 s upon contact with the constraint. The external constraint is moved away after 30 s of contact. It is also worth noting that the controller could quickly update the inverse mapping online and follow the trajectory with high accuracy. No control instability is observed throughout the experiment. The pure online learning approach achieves the highest average accuracy among all settings, both for constrained and unconstrained scenarios (Tables 1 and 2). However, the need for initialization by babbling motion (green path in Figs. 8b and 9b) should be avoided in clinical scenarios to prevent unnecessary interactions with patient anatomy.

Pretrained by FEA data, then updated by online data

To alleviate the need for random exploration, we attempted to pretrain the controller with FEA data and then update the inverse model by online learning. This approach combines the advantages of both aforementioned settings, in which the inverse model can be initialized with FEA data. The robot can immediately begin navigation using this pretrained model without the need for initialization through undesired babbling movement. The subsequent manipulation data are also acquired to incrementally train a more precise inverse model so as to adapt to external interactions. This feature is demonstrated in Figure 8c, in which the robot is allowed to move freely.

Although the robot begins with a relatively large tracking error of average ±2.21° and maximum of ±7.49° in the first cycle, the error is quickly compensated by the online learning and converges to an average of ±0.90° and maximum of ±2.80°. This tracking result is compared with the other two approaches in Table 1. In the first cycle, the combined approach exhibits tracking error close to pretraining with FEA only (average ±2.21° vs. ±1.79°) because both inverse models are initialized with less accurate FEA data. The learning technique then corrects the inverse model with online data so that the tracking error decreases rapidly and becomes comparable with the pure online approach (average ±0.90° vs. ±0.87°).
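
To make the comparison concrete, the relative error reductions implied by the Table 1 values can be worked out as follows. The percentages are rounded to the nearest percent; this arithmetic is added for illustration and is derived only from the tabulated figures.

```latex
\begin{align*}
\text{Combined, mean:} \quad & \frac{2.21^\circ - 0.90^\circ}{2.21^\circ} \approx 59\% \\
\text{Combined, maximum:} \quad & \frac{7.49^\circ - 2.80^\circ}{7.49^\circ} \approx 63\% \\
\text{Pure online, mean:} \quad & \frac{1.13^\circ - 0.87^\circ}{1.13^\circ} \approx 23\%
\end{align*}
```

The combined approach therefore starts from a larger error but ends within 0.03° of the mean error of the pure online approach.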

This shows that the combined approach can initialize a reasonable learning-based controller with less accurate FEA data, then further refine the inverse model while performing the tracking task. Note that the combined approach does not require random exploration (green path in Figs. 8b and 9b) to obtain pretraining data, with which it is difficult to cover the entire robot workspace with sufficient density.

This combined approach is also capable of adapting to the unknown external interaction (Fig. 9c). The controller can quickly adapt the inverse mapping upon contact with the external constraint at 36 s. It continues to follow the trajectory with a small mean absolute error of ±2.49°. The controller also remains stable and readapts after the removal of constraints. Readers could also refer to the attached Supplementary Video (Supplementary Data are available online at www.liebertpub.com/soro) for extra details about the robot behavior and the characteristics of the constraint.

Referring to the Evaluation of online local learning controller section, we presented the challenge in learning an inverse model spatially localized by the unmeasurable robot state $\theta_k$ as well as how this robot state can be retrieved indirectly from sensory measurements. These trajectory tracking experiments have shown that the inverse model could be successfully learnt by continuous updates of both the task space coordinate $\mathbf{s}_k$ and control input $\mathbf{u}_k$. Both are set as the localization parameters required in the inverse model. Therefore, the robot state $\theta_k$ could be estimated sustainably by the learning algorithm. These 3-6D positional data updates are clinically practical. The comparable position-tracking techniques designed for image-guided interventions are also under active research,⁵³ one of which would be magnetic resonance imaging-guided endoscopic retrograde cholangiopancreatography.

Conclusion and Future Work

We have proposed a model-free control framework that adopts an online nonparametric local learning technique for manipulation of a redundantly actuated, fluid-driven soft continuum robot in the presence of a dynamic external disturbance. Nonparametric techniques are capable of constructing highly nonlinear functions by measurement of data solely, which is particularly suitable for characterization of hyperelastic robot structure. To accommodate the flexibility of soft robot body, we approximate the global inverse kinematics by a linear combination of many locally learnt inverse kinematic models.
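
The idea of combining local linear models into one global approximation can be sketched in Python as follows. This is a generic illustration and not the controller used in this study: it assumes Gaussian receptive-field weights of the kind used in LWPR, a scalar distance metric per model, and a normalized weighted average; all names and numerical values are hypothetical.

```python
import math

def activation(x, center, width):
    """Gaussian receptive-field weight; width is a scalar distance metric."""
    d2 = sum((a - c) ** 2 for a, c in zip(x, center))
    return math.exp(-0.5 * width * d2)

def blend_local_models(x, models):
    """Normalized weighted average of local linear predictions.

    Each local model predicts offset + slope @ (x - center) near its center.
    """
    total_w = 0.0
    out = None
    for m in models:
        w = activation(x, m["center"], m["width"])
        dx = [a - c for a, c in zip(x, m["center"])]
        y = [b + sum(a_ij * d for a_ij, d in zip(row, dx))
             for b, row in zip(m["offset"], m["slope"])]
        if out is None:
            out = [0.0] * len(y)
        out = [o + w * yi for o, yi in zip(out, y)]
        total_w += w
    return [o / total_w for o in out]

# Two hypothetical 1D local models with different slopes.
models = [
    {"center": [0.0], "width": 1.0, "offset": [0.0], "slope": [[1.0]]},
    {"center": [2.0], "width": 1.0, "offset": [2.0], "slope": [[0.5]]},
]

# Midway between the centers both models receive equal weight.
pred = blend_local_models([1.0], models)
```

Because each model only dominates near its own center, a single model can be refitted with new data without disturbing predictions elsewhere, which is the property that makes independent online updates possible.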

Our model-free controller employs this global approximation, where the behavior of the redundant actuator can be optimized by a user-defined criterion while simultaneously fulfilling the control objective defined in task space coordinates. In addition, the controller is adaptive to changes in the environment, where each local model can be updated online independently according to newly acquired data. This equips the robot with the ability to maintain control accuracy under external dynamic disturbance. Our work is the first attempt at implementing such direct inverse modeling using an online nonparametric learning technique to control a redundantly actuated soft continuum robot.

We have also incorporated FEA into the learning control framework for proper initialization of the robot inverse model. It enables precise prediction of the hyperelastic robot deformation under various actuation pressures, without the need for the oversimplified analytical model. It can also offer adequate sample data covering the entire workspace at high resolution. This avoids the need of time-consuming random exploration to initialize the learning model, which may not be practical in many surgical applications. The proposed controller can hence be initialized offline using FEA-simulated data, ready for endoscopic navigation procedure.

The proposed novel control framework has been experimentally validated. In the constrained experiment, after FEA-based initialization of the controller, the endoscope prototype could follow a 3D trajectory with an accuracy of mean ±2.21° and maximum ±7.49° and attained almost the same tracking accuracy (mean ±2.49° and maximum ±11.03°) after 5 s upon addition/removal of external disturbance (maximum 1 N). This is also the first demonstration of realizing model-free closed-loop control of a fluid-driven soft continuum robot in 3D task space even under dynamic external disturbance.

The current form of our learning-based control method is first designed for a single segment manipulator. In our future work, we intend to extend the framework to address soft manipulation with multisegments.⁵⁴ As a cascade of multiple actuation modules, it provides enhanced manipulation flexibility for interventional tools, facilitating more complicated operations in a confined space. In this case, a generic optimization function will be developed to resolve the null-space control of hyper-redundant robot.⁵⁵ Further characterization of such multisegment soft manipulators will be investigated. To address its hyper-redundancy, it will also require additional sensory systems or algorithms to parameterize the possible motion transition of robot configuration, thus estimating the inverse model for the higher DoF robot.

Footnotes

Acknowledgments

This work is supported, in part, by the Croucher Foundation, the Research Grants Council (RGC) of Hong Kong (Ref. Nos. 27209151 and 17227616), the Innovation and Technology Fund (ITF) of Hong Kong (ITS/361/15FX), and NISI (HK) Limited.

Author Disclosure Statement

No competing financial interests exist.

References

Trivedi

, et al. Soft robotics: Biological inspiration, state of the art, and future research. Appl Bionics Biomech, 2008; 5:99–117.

Mao

, et al. Gait study and pattern generation of a starfish-like soft robot with flexible rays actuated by SMAs. J Bionic Eng, 2014; 11:400–411.

Sareh

, et al. Bio-inspired tactile sensor sleeve for surgical soft manipulators. In: IEEE International Conference on Robotics and Automation (ICRA). Hong Kong, China, May 31–June 7, 2014.

Suzumori

, Iikura

, Tanaka

. Development of flexible microactuator and its applications to robotic mechanisms. In: The 1991 IEEE International Conference on Robotics and Automation. Sacramento, CA, April 9–11, 1991.

Wang

, Iida

. Deformation in soft-matter robotics. IEEE Robot Automat Magaz, 2015; 22:125–139.

McMahan

, et al. Field trials and testing of the OctArm continuum manipulator. In: IEEE International Conference on Robotics and Automation (ICRA). Orlando, FL, May 15–19, 2006.

Runge

, et al. SpineMan: Design of a soft robotic spine-like manipulator for safe human-robot interaction. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Hamburg, Germany, September 28–Oct 2, 2015.

Maghooa

, et al. Tendon and pressure actuation for a bio-inspired manipulator based on an antagonistic principle. In: IEEE International Conference on Robotics and Automation (ICRA). Seattle, WA, May 26–30, 2015.

Cianchetti

, et al. Soft robotics technologies to address shortcomings in today's minimally invasive surgery: The STIFF-FLOP approach. Soft Robotics, 2014; 1:122–131.

10.

Lohsiriwat

. Colonoscopic perforation: incidence, risk factors, management and outcome. World J Gastroenterol, 2010; 16:425.

11.

Tumino

, et al. Endotics system vs colonoscopy for the detection of polyps. World J Gastroenterol, 2010; 16:5452–5456.

12.

Cosentino

, et al. Functional evaluation of the endotics system, a new disposable self-propelled robotic colonoscope: in vitro tests and clinical trial. Int J Art Organs, 2009; 32:517–527.

13.

Pfeffer

, et al. The Aer-O-Scope: Proof of the concept of a pneumatic, skill-independent, self-propelling, self-navigating colonoscope in a pig model. Endoscopy, 2006; 38:144–148.

14.

Fras

, et al. New STIFF-FLOP module construction idea for improved actuation and sensing. In: IEEE International Conference on Robotics and Automation (ICRA). Seattle, WA, May 26–30, 2015.

15.

Camarillo

, et al. Mechanics modeling of tendon-driven continuum manipulators. IEEE Trans Robot, 2008; 24:1262–1273.

16.

Jones

, Walker

. Kinematics for multisection continuum robots. IEEE Trans Robot, 2006; 22:43–55.

17.

Mahvash

, Dupont

. Stiffness control of surgical continuum manipulators. IEEE Trans Robot, 2011; 27:334–345.

18.

Webster RJ

III

, Jones

. Design and kinematic modeling of constant curvature continuum robots: A review. Int J Robot Res, 2010; 29:1661–1683.

19.

Ganji

, Janabi-Sharifi

. Catheter kinematics for intracardiac navigation. IEEE Trans Biomed Eng, 2009; 56:621–632.

20.

Jones

, Walker

. A New Approach to Jacobian Formulation for a Class of Multi-Section Continuum Robots. In: IEEE International Conference on Robotics and Automation (ICRA). Barcelona, Spain, April 18–22, 2005.

21.

Webster

III , et al. Closed-form differential kinematics for concentric-tube continuum robots with application to visual servoing. In: Experimental Robotics: The Eleventh International Symposium. Athens: Greece, July 13–16, 2008, pp. 485–494.

22.

Neppalli

, et al. Closed-form inverse kinematics for continuum manipulators. Adv Robot, 2009; 23:2077–2091.

23.

Marchese

, et al. Design and control of a soft and continuously deformable 2d robotic manipulation system. In: The IEEE International Conference on Robotics and Automation. Hong Kong: China, May 31–June 7, 2014, pp. 2189–2196.

24.

Marchese

, Tedrake

, Rus

. Dynamics and trajectory optimization for a soft spatial fluidic elastomer manipulator. Int J Robot Res, 2016; 35:1000–1019.

25.

Wang

, et al. Visual servo control of cable-driven soft robotic manipulator. In: The 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. Tokyo: Japan, November 3–7, 2013, pp. 57–62.

26.

Chirikjian

. Hyper-redundant manipulator dynamics: a continuum approximation. J Adv Robot, 1995; 9:217–243.

27.

Kang

, et al. Design, modeling and control of a pneumatically actuated manipulator inspired by biological continuum structures. Bioinspir Biomim, 2013; 8:036008.

28.

Yekutieli

, et al. Dynamic model of the octopus arm. I. biomechanics of the octopus reaching movement. J Neurophysiol, 2005; 94:1443–1458.

29.

Giorelli

, et al. A two dimensional inverse kinetics model of a cable driven manipulator inspired by the octopus arm. In: The 2012 IEEE International Conference on Robotics and Autonomous Systems. Saint Paul: MN, May 14–18, 2012, pp. 3819–3824.

30.

Renda

, et al. Dynamic model of a multibending soft robot arm driven by cables. IEEE Trans Robot, 2014; 30:1109–1122.

31.

Nguyen-Tuong

, Peters

. Model learning for robot control: A survey. Cogn Process, 2011; 12:319–340.

32.

Braganza

, et al. A neural network controller for continuum robots. IEEE Trans Robot, 2007; 23:1270–1277.

33.

Giorelli

, et al. Neural network and Jacobian method for solving the inverse statics of a cable-driven soft arm with nonconstant curvature. IEEE Trans Robot, 2015; 31:823–834.

34.

Yip

, Camarillo

. Model-less feedback control of continuum manipulators in unknown environments. IEEE Trans Robot, 2014; 30:880–889.

35.

Nguyen-Tuong

, Seeger

, Peters

. Model learning with local gaussian process regression. Adv Robot, 2009; 23:2015–2034.

36.

Peters

, Schaal

. Learning to control in operational space. Int J Robot Res, 2008; 27:197–212.

37.

Vijayakumar

, D'Souza

, Schaal

. Incremental online learning in high dimensions. Neural Comput, 2005; 17:2602–2634.

38.

Largilliere

, et al. Real-time control of soft-robots using asynchronous finite element modeling. In: 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, May 26–30, 2015.

39.

Duriez

. Control of elastic soft robots based on real-time finite element method. In: 2013 IEEE International Conference on Robotics and Automation (ICRA). Karlsruhe, Germany, May 6–10, 2013.

40.

Faudzi

AAM

, et al. Development of bending soft actuator with different braided angles. In: IEEE/ASME International Conference on Advanced Intelligent Mechatronics. Kachsiung, Taiwan, July 11–14, 2012.

41.

Greigarn

, Cavusoglu

. Task-space motion planning of MRI-actuated catheters for catheter ablation of atrial fibrillation. In: International Conference on Intelligent Robots and Systems (IROS 2014). Chicago, IL, September 14–18, 2014.

42.

, Frasson

, Baena

FRY

. Closed-loop planar motion control of a steerable probe with a “programmable bevel” inspired by nature. IEEE Trans Robot, 2011; 27:970–983.

43. , et al. Position control of concentric-tube continuum robots using a modified Jacobian-based approach. In: 2013 IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, May 6–10, 2013.

44. Hartmann, et al. Real-time inverse dynamics learning for musculoskeletal robots based on echo state Gaussian process regression. In: Robotics: Science and Systems. Sydney, NSW, Australia, July 9–13, 2012.

45. Sigaud, Salaün, Padois. On-line regression algorithms for learning mechanical models of robots: A survey. Robot Autonom Syst, 2011; 59:1115–1129.

46. Schaal, Atkeson, Vijayakumar. Scalable techniques from nonparametric statistics for real time robot learning. Appl Intellig, 2002; 17:49–60.

47. Simulia. ABAQUS 6.13 User's Manual. Providence, RI: Dassault Systèmes, 2013.

48. Başar, Itskov. Finite element formulation of the Ogden material model with application to rubber-like shells. Int J Numer Methods Eng, 1998; 42:1279–1305.

49. Hughes. The Finite Element Method: Linear Static and Dynamic Finite Element Analysis. New York: Dover Publications, 2012.

50. Klanke, Vijayakumar, Schaal. A library for locally weighted projection regression. J Machine Learn Res, 2008; 9:623–626.

51. Whitney. Resolved motion rate control of manipulators and human prostheses. IEEE Trans Man-Machine Syst, 1969; 10:47–53.

52. Jain, Nandakumar, Ross. Score normalization in multimodal biometric systems. Pattern Recognit, 2005; 38:2270–2285.

53. Chen, et al. Design and fabrication of MR-tracked metallic stylet for gynecologic brachytherapy. IEEE/ASME Trans Mechatron, 2015; 21:956–962.

54. Sadati, et al. Stiffness control of soft robotic manipulator for minimally invasive surgery (MIS) using scale jamming. In: Intelligent Robotics and Applications. Portsmouth, UK; Springer, August 24–27, 2015, pp. 141–151.

55. Kwok, et al. Dimensionality reduction in controlling articulated snake robot for endoscopy under dynamic active constraints. IEEE Trans Robot, 2013; 29:15–31.
