Abstract
Joint Range of Motion (JROM) development has been shown to facilitate learning motor control in human beings. This developmental strategy has been applied in robotics to improve learning performance with different outcomes: sometimes it is favourable, sometimes irrelevant, and sometimes even detrimental. The reasons that underpin this variability in the results are still not well understood. In this paper, we seek to better understand the principles underlying the application of JROM-based morphological development to make its use more straightforward. To this end, empirical studies were carried out over two representative use cases: quadruped and bipedal robot morphologies learning to walk. Different parameters of the application of JROM development (morphological configuration, JROM developmental strategy, etc.) have been evaluated to elucidate their effects on learning. The results show that there are significant connections between the reduction of the motor space induced by JROM development, the way the exploration and exploitation of the solution space is carried out by the learning algorithm, and the performance achieved. Through these connections, we have identified a set of conditions that must be satisfied for JROM development to be effective as a tool for learning improvement.
Introduction
Most of the work in developmental robotics focuses on cognitive development. That is, a robot autonomously learns complex skills by interacting with its environment, using developmental principles extracted from the developmental psychology literature but, unlike what happens in human beings, using a fixed morphology.1–6 Only a few authors have started to include the development of the morphology in ontogenetic time scales as a parameter to consider during learning to take advantage of the relevance of the morphology for learning. 7–10
Different classes of strategies can be found in the robotics literature that seek developmental variations in the morphology to improve learning. According to Naya et al., 11 these can be grouped into three categories: physical body development,12,13 sensor development,14,15 and motor development.16–18 Out of these, in this paper we concentrate on motor development.
The application of motor development is inspired by Bernstein’s studies on the “Degrees of Freedom problem”. 19 He postulates that in the early stages of motor control, the central nervous system reduces the level of involvement of some of the elements that contribute to body motion (muscles, tendons, etc.) in order to facilitate a tighter control over the remaining ones. 18 Progressively, these limitations become less restrictive through an increase in the range of motion of specific joints or by releasing them completely, until all the constraints disappear.20,21 This developmental strategy generally follows a proximodistal trend, 22 as reported by several studies in the literature on human motor control and coordination.23–26
In robotics, motor development is usually achieved in one of two ways. The first reduces the maximum Range of Motion available to the Joint (JROM) in the early stages of development and increases it up to the “adult” or “mature” range of motion as development progresses. The second drastically reduces the JROM available at the beginning of learning to 0 and then releases it abruptly at some point in time. The latter is usually called development through “Freezing and freeing Degrees of Freedom (DOF)”.25,27,28
Relatively little work in the robotics literature adapts these ideas to improve learning in robots through JROM-based development. The existing work addresses two main tasks: walking, as an example of a clearly dynamic control process, and reaching, or combined reaching and grasping, which is less critical in temporal terms.
Considering the walking task, Bongard 29 showed that gradual morphological changes in quadrupeds and hexapods outperform the learning performance of a fixed morphology when the task is challenging (e.g. learning to walk diagonally instead of in a straight line). He observed that an early limitation of the robot DOFs and their gradual release accelerates the acquisition of optimal behaviours, but it is not clear why this happens. In addition, if the morphology changes abruptly, learning performance is worse with development than without it. This suggests that the abrupt change in the controller-morphology relationship may be a distorting factor.
In a quadruped morphology, Baranes et al. 30 show how motor development plays an important role in maximizing learning efficiency. This is attributed to the simplification of the learning process caused by the reduction of the motor space, which is gradually increased when a certain level of mastery is achieved. Addressing a bipedal morphology in a swinging task, which is more dynamically challenging, Lungarella and Berthouze 17 suggested that starting with frozen DOFs helps to stabilize the system and allows finding more robust behaviours due to “a more efficient exploration of the sensorimotor space”. The authors suggest that this happens because of the physical entrainment of the body, which helps to find a high-yield area in the parameter space. Nevertheless, this more efficient exploration is not enough to find stable behaviours under system perturbations, requiring a process of alternating freezing and freeing DOFs. 28
Lapeyre et al. 31 showed how motor development “leads to a faster and safer way for learning” in a bipedal walking task helped by a trolley. This was particularly relevant in cases where the motor space was strongly constrained and the DOFs were slowly released, seemingly because in less restricted scenarios the motor space was too large and the optimization method could not find good solutions without many iterations.
Other authors considered use cases that did not contemplate walking. For instance, the work done by Qiang Shen and his colleagues in different experiments involving robotic arms and vision systems (learning to write Chinese strokes, 32 object perception and recognition, 33 or feature perception 34 ) has shown how morphological development, based on the “Lift-Constraint Act and Saturate” (LCAS) algorithm, 35 can improve learning. The LCAS algorithm imposes maturational constraints not only on the motor system but also on the sensor system and the computational one (controller or learning algorithm). These constraints are lifted once a certain level of mastery is achieved. These experiments were inspired by the developmental psychology literature and pursued the goal of creating an infant-like learning mechanism, rather than a goal of understanding the mechanisms and implications of the developmental sequences. Little information is provided about why these infant-like learning mechanisms have favoured learning. In this line, Campos-Alfaro et al. 36 propose an approach to integrate open-ended learning in modular robotics. This work emphasizes the importance of equipping robots with morphological adaptability and the ability to autonomously learn utility models specific to each morphology through a motivational system designed for open-ended learning.5,37 This contribution is relevant because it suggests that morphological adaptability not only enhances learning performance, but also aligns with the need to understand how different parameters interact in the learning process in dynamic environments.
The objective of developing an infant-like learning system was shared by Savastano and Nolfi 14 while learning to reach and grasp using an iCub robot. The authors addressed the development of the sensor and motor system separately and jointly. They found that the development of the sensor system was irrelevant, while the development of the motor system, by reducing the number of DOFs, improved learning. In this case, the robot initially starts with basic primitive movements whose complexity increases as development progresses and DOFs are released, which is aligned with Bernstein’s hypothesis about motor control: simplifying learning at the beginning through constrained motion, and once a good level of proficiency is attained, the constraints can be freed. 19
Bernstein’s hypotheses are also supported by Ivanchenko and Jacobs, 38 in a simulated three-joint arm learning specific trajectories. An early freezing of the elbow and wrist joints (the furthest ones) contributes to obtaining better results than in the no development case. The authors argue that “the knowledge gained during these early training stages would provide a useful foundation for further learning at later training stages”. In addition, they also found that when the task is simple (the reference trajectory is easy to learn) development is irrelevant, and it only makes sense when the trajectory is hard to learn (complex task). Nevertheless, the authors did not focus on identifying the parameters or variables that cause this improvement, and no clear information about that is presented.
In this line, there are articles in the literature that report positive outcomes addressing motor development in the reaching task27,39 but, as they seem to present preliminary results, they do not study the causes of this advantage in any depth. Table 1 summarizes the different authors’ claims, what they considered, and how their experiments were carried out. Based on the literature and what is shown in Table 1, we observe that motor development has favoured learning based on two factors that are related to each other: 1) A reduction in the motor space30,31,38 and 2) the greater stability provided by such reduction in the motor space.28,31 However, certain limitations are also observed in the conclusions provided: 1) Stability can affect walking tasks, but not tasks that only address reaching, where stability is intrinsic to the morphology; 2) The implementation of motor development in the literature has been quite heterogeneous and there does not seem to be a unified framework for the problem: there are different definitions for parameters and aspects relevant to the topic, different experimental conditions and, in most cases, no clear explanation of why JROM development has influenced learning except in very general terms. In addition, some experiments address motor development based on time constraints14,29,38 and others, on performance.17,30,33 All of this makes it difficult to identify which parameters are really important for the successful implementation of motor development.
Summary of the representative articles found in literature.
At this point, it is important to note that in the cases where intrinsic dynamicity was not the key aspect of the task, such as in reaching, most of the work considered the simultaneous development of the controller, the motor and the sensory system, thus making the problem even more intractable.14,15,40 This was not the case in walking tasks, where in some cases, and given the fact that open-loop solutions to walking exist, no sensory system is even considered, just an external observer to measure the final performance. This nicely decouples motor development from sensory-motor development, allowing for a more nuanced study of the problem at hand.
Hence, in this article, our goal is to understand the implications of JROM-based development, decoupled from sensing, as an aid for learning in robotic systems by exploiting the embodied characteristics of the morphology through the different developmental stages. Thus, we seek to identify those parameters or approaches that allow a finer control of the learning process when applying JROM morphological development to explore learning paths that traditional learning algorithms ignore. This objective aims to complement our previous work13,41 where we studied the influence that morphological development based on physical changes in the robot’s body has on learning, and analyzed the reasons behind it. In that work, experiments were carried out on 2, 4, 6, and 8-legged robots.
The remainder of the article is organized as follows. Section 2 provides a formalization of the morphological development we use in this article, and explains how JROM-based development is applied. Section 3 presents the methodology we have followed. It describes in detail the characteristics of the selected morphologies and the experimental framework. Section 4 provides the structure and logic of the different experiments performed. Sections 5 and 6 are devoted to the presentation of the results of the experiments that were carried out over the quadruped and biped morphologies, related to the hypotheses established in the methodology. A discussion of these results is provided in Section 7. Finally, Section 8 is concerned with the main conclusions extracted from this work.
For coherence with our previous work, we consider the formalization of general morphological development that we introduced in Naya-Varela et al. 11 and particularized for growth in Naya-Varela et al. 13 In this case, the formalization is particularized to JROM development as this is the main topic of this paper.
To formalize morphological development, it is always necessary to define what we understand by morphology. In this line, we consider that a robot is made up of a set of
Additionally, regarding robot operation, the robot functions over the time interval
From the perspective of robot operation and learning, as shown in Figure 1, the morphology is managed by a control system.42,43 This control system is defined by a set of parameters (in our experiments, by the structure and weight values of an artificial neural network, but they can be any other parameters depending on the implementation of the controller) that are optimized along the learning process to achieve the goals given to the robot using any learning algorithm (e.g., neuroevolution).

Flow control in a typical robotic learning problem. In white, the environment. In green, the fitness value that represents the performance of a given solution. In orange, the physical elements that constitute the robot morphology. In yellow, transfer functions that map a set of parameters (e.g. perceptions in the form of images to inputs to the controller). In blue, the set of elements that constitute the controller of the robot. In grey the learning algorithm.
The control system receives data (Inputs,
Limiting the range of motion of a joint implies changing the mapping between the output of the controller and the command sent to the joint, that is, changing
To round off these definitions of terms, the combined effect of the joint commands, the robot morphology, and the environment in which the robot operates, results in a value or set of values assigned by the designer/user representing the fitness
Hence, in this formalization, the solution,
In most optimization and learning problems, the
Reaching this point, the question now is how to design
Methodology
As mentioned in previous sections, we want to study the implications of implementing JROM-based morphological development during learning. This implies addressing some initial questions that arise from the work of previous authors. Looking at the literature (see Table 1 for a summary), some authors claim that “learning is simplified through a reduction of the motor space”.30,31 This hints at the fact that controlling the size of the motor space, which in our case is given by the Joint Command Space,
In addition, it must be made clear here that the final goal is to obtain a controller that, when learning is completed, controls the robot without any constraints on the JROM apart from those given by its morphology. This means that whatever reduction in the JROM is initially imposed, it must have been removed by the end of learning. However, the question that has not really been answered in the literature is how it should be removed. That is, how must the JROM change in time during learning? Should it vary progressively towards the final JROM? In one shot? Does this depend on the specific problem? In fact, if progressively were the answer, it would raise the question of how fast. In other words, is there any factor that determines this speed?
To address these questions, and guide the experimental work, we are going to establish a series of hypotheses based on information extracted from the literature and our previous work on the relevance of the application of morphological development for learning. These hypotheses are:
A limitation of the available JROM implies a reduction of the motor space. This limitation means a reduction in the available
The developmental speed influences learning: the higher the speed of development, the less relevant development is, because in these cases, the learning algorithm does not have enough time to explore the
A proper synergy among the various components of the development and learning process 41 is needed for JROM-based development to be relevant for learning. This means that it is not enough to simply reduce the motor space to improve the
To experimentally study these hypotheses, collecting statistical data to compare and analyze the results of JROM development, we chose two use cases related to walking: quadruped and bipedal walking. The main reason for this choice is that the basic walking task can be achieved without any sensory feedback, as other authors have done.30,31,41 This way, it should be possible to discriminate the effects of JROM development from, for instance, sensor development or even sensor choice for the feedback, which is the problem of tasks such as grasping or reaching. In addition, these two morphologies have a different number of joints and different stability characteristics, which makes the same task (learning to walk) more challenging for one morphology than for the other. Finally, several authors make use of these cases, and, in fact, we have studied them in a growth-based development case, and thus the results can be compared among developmental strategies, to exploit the relevance of morphological variation at the same time as learning occurs.
Taking these hypotheses into account we have designed a series of experiments in which we have compared the process of learning controllers with and without JROM development considering different parameters, with the aim of generalizing as much as possible the conclusions obtained from the results. To begin, we will start with the quadruped morphology, as it is the more stable one and, thus, theoretically, presents the lowest difficulty for learning. It allows us to test different configurations of the morphology, leading to different configurations of the motor space.
In these settings we will evaluate different developmental strategies, such as progressive JROM development, abrupt JROM change (freezing and freeing DOF), or a mixture of both. To support these experiments, a group of experiments in which different configurations of the morphology learn without morphological development has also been performed. Additionally, we present the results of two experiments with different morphological configurations but the same developmental speeds.

Representation of the quadruped with 8 DOF and an angle offset for the joints of 0° with respect to the vertical plane.
On the other hand, the experiments carried out with the biped have been designed to complement the information obtained using the quadruped morphology and thus provide a more general view of the results. To this end, JROM development experiments similar to those performed with the quadruped have been carried out (such as gradual or abrupt JROM development), with some variations (among other reasons, because the morphologies are different) that expand the use cases studied.
In the following subsections, we provide more details on the morphologies that were used and the configuration and execution of the experiments themselves.
The quadruped morphology (Figure 2) is made up of a central body and four limbs attached to it. Each limb consists of an upper link and a lower link, connected by revolute joints. The upper link measures 5

Top: Quadruped with a joint angle offset (JAO) of 60° and with a JROM of [
Maximum JROM for the upper limb and lower limb of the quadruped in each joint angle offset.
The morphology of the bipedal robot is based on a real NAO robot model created in the CoppeliaSim simulator (Figure 4). For coherence with our previous work, the NAO model has the legs modified to change its morphology during learning (allowing it to grow),13,30,44 although in this article the length of the legs will be fixed:
Upper link: The upper section consists of three joints (hip yaw-pitch, hip roll, and hip pitch) and two identical cuboids, each with dimensions of 8
Lower link: The lower leg section includes one joint (knee pitch) and two distinct cuboids. The upper cuboid measures 8
Foot: The foot model was simplified by reducing the number of cuboids from the original NAO design. Each foot is now sized at 18.4

Left: Frontal view of the bipedal (NAO) robot. In green, the default meshes of the original robot. In grey, the modified parts. Rotational and prismatic joints are indicated in red. Right: Side view of the bipedal robot with various parts labelled.
In addition, the shoulders of the NAO are also actuated, being able to move forward and backwards by means of the shoulder pitch joint. That is, there are a total of 14 joints and their ranges of motion are presented in Table 3.
Maximum available JROM values for each joint (right and left sides) of the bipedal robot.
The robots’ controller is based on a Neural Network (NN) structure using sigmoid activation functions. The inputs and outputs of the NN for each morphology, as well as other parameters of the controller are summarized in Table 4.
Summary of the NN controller parameters for each morphology.
Learning is achieved through neuroevolution using the NEAT algorithm. NEAT was selected because of its capability to simultaneously optimize both the topology and the connection weights of the NN, 45 thus reducing the influence of the human designer in the learning process.
To simplify the study of joint range of motion-based development, the complexity of the controller has been reduced to a minimum. The inputs to the NNs are sinusoidal signals, used as pattern generators,46–49 1 for the quadruped and 3 for the biped. The difference is motivated by the complexity of the morphology. A biped requires a larger number of pattern generators to produce different types of gaits (hence the phase change between each of the input signals shown in Table 4). The amplitude of the pattern generators is set to 2 to avoid normalization of the input values. Furthermore, the frequency is also different for each morphology and was obtained experimentally. The outputs are given by the number of joints available for control in each morphology (8 for the quadruped and 14 for the NAO) and they are scaled from [0, 1] to align with the specific range of motion available for each joint (
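The controller pipeline described above (sinusoidal pattern generator, sigmoid outputs, and linear scaling of each output to the joint's range of motion) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the frequency, and the joint limits are hypothetical placeholders (the actual frequencies and phases are given in Table 4), and only the amplitude of 2 and the [0, 1] output range come from the text.

```python
import math


def pattern_generator(t, amplitude=2.0, frequency_hz=1.0, phase=0.0):
    """Sinusoidal input signal; amplitude 2 follows the text, while the
    frequency and phase are placeholder values (assumptions)."""
    return amplitude * math.sin(2.0 * math.pi * frequency_hz * t + phase)


def sigmoid(x):
    """Sigmoid activation, which bounds a neuron output to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def scale_to_jrom(output, lower_deg, upper_deg):
    """Linearly map a controller output in [0, 1] to [lower_deg, upper_deg]."""
    return lower_deg + output * (upper_deg - lower_deg)


if __name__ == "__main__":
    # Hypothetical joint limits of [-30, 30] degrees, for illustration only.
    signal = pattern_generator(t=0.1)
    command = scale_to_jrom(sigmoid(signal), -30.0, 30.0)
    print(f"input={signal:.3f}  joint command={command:.2f} deg")
```

Under this mapping, narrowing `lower_deg` and `upper_deg` during development changes the command sent to the joint without touching the network itself.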

Example of the NN structure at the beginning of the learning process for the quadruped, which has 8 degrees of freedom (DOF) and includes 1 sinusoidal input plus a bias. As learning progresses, the NEAT algorithm can add extra neurons and connections (weights) to the NN.
The experiments were conducted using the CoppeliaSim simulator with the Open Dynamics Engine. For each independent run, the NEAT algorithm optimizes a population of 50 individuals for 300 generations. These parameters were deliberately set low, because JROM development makes it possible to “offload” computation from the algorithm to the morphology, helping to find an optimal solution with fewer computational resources. The simulation configuration is the default one, with a time step of 50 ms. In addition, as the movements to perform for each morphology are different, the joints of the quadruped are updated every two simulation time steps. This allows enough time to perform the movements, as they are larger than in the biped case. Finally, in the simulator, each individual is evaluated for 9 seconds, which is long enough to learn gait patterns without excessively extending the experimental phase. These parameters are summarized in Table 5.
Simulation experimental parameters.
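As a quick sanity check on the computational budget implied by these parameters, the short sketch below derives the number of evaluations per run and the number of simulation steps per evaluation. The constant names are ours (hypothetical); the values are those stated in the text.

```python
# Values stated in the text; the constant names are illustrative.
POPULATION_SIZE = 50          # individuals per generation
GENERATIONS = 300             # generations per independent run
EVALUATION_TIME_S = 9.0       # simulated seconds per individual
TIME_STEP_S = 0.05            # simulation time step (50 ms)
QUADRUPED_UPDATE_EVERY = 2    # quadruped joints updated every 2 steps

# Fitness evaluations needed for one independent run.
evaluations_per_run = POPULATION_SIZE * GENERATIONS

# Simulation steps in one evaluation; round() guards against
# floating-point error in the division.
sim_steps = round(EVALUATION_TIME_S / TIME_STEP_S)

# Joint command updates the quadruped receives in one evaluation.
quadruped_updates = sim_steps // QUADRUPED_UPDATE_EVERY

if __name__ == "__main__":
    print(evaluations_per_run, sim_steps, quadruped_updates)
```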
The fitness value of each morphology is related to the distance travelled in a straight line and the possibility of falling:
Being:
Thus, given the
In this context, we assume that the NAO has fallen when its head is at a height of less than 0.4 m.
The following experiments have been carried out over the quadruped and NAO morphology:
To study the influence of motor development, two groups of additional experiments have been designed: The first one involves experiments without variation in the number of DOF and the JROM during the learning process (i.e. the
No Development with Reduced JROM (RND): These experiments are similar to the ND one but with reduced values for the JROM. They are conducted with the aim of finding out whether a reduction in JROM alone, without development, could achieve an improvement in learning compared to the standard case of ND or whether it would be detrimental or irrelevant (and to elucidate why). The value of JROM for each joint depends on the type of experiment. For example, for the quadruped morphology, these experiments are characterized by performing multiple experimental runs with various Joint Angle Offsets (JAO) and total JROM configurations for the joints of the lower limbs of the quadruped (Table 6). The term “total JROM” is understood as the sum of the absolute values of the upper bound and lower bound (e.g., a total JROM of 100° means a JROM of [
Different JROM tested for each joint angle offset (JAO) (rows) of the lower limbs (Columns).
The other group of experiments involves variations in the JROM or in the number of DOFs available during learning. These experiments imply a modification of the parameters of function
Proximodistal JROM Development (PJD): These experiments are characterized by keeping the JROM limits of the joints closest to the body invariable whilst reducing the JROM of those farthest. Generally, the initial JROM available is 1/2 of the final one, but in some cases, it starts completely limited at generation 0 in the case of the NAO or 1/8 in the case of the quadruped. Such reduced JROMs increase linearly until they reach the experimental configuration of the no-development case. After that, learning continues as in the reference case. For the quadruped, development is applied to the joints of the lower link, while in the NAO, it is applied to the joints of the ankles, knees, and shoulders.
Freezing and Freeing DOF development (DOFD): These experiments are characterized by starting learning with the farthest joints completely locked and at generation 30 (1/10 of the total learning period), these DOFs are abruptly or gradually released up to generation 90 (quadruped case) or 150 (NAO case). After that generation, the experiment continues as a ND one. Again, for the NAO, this type of developmental strategy is applied to the joints of the ankles, knees and shoulders.
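The two developmental schedules can be illustrated with a small sketch that returns, for a given generation, the fraction of the final JROM that is available. This is a minimal sketch under stated assumptions rather than the authors' code: the function names are hypothetical, the gradual DOFD release is assumed to be linear like the PJD one, and how the fraction is applied to the lower and upper bounds of each joint is left open.

```python
def jrom_fraction(generation, start_fraction, hold_until, end_generation):
    """Fraction of the final JROM available at a given generation.

    The fraction stays at start_fraction up to hold_until, then grows
    linearly (an assumption for DOFD) and reaches 1.0 at end_generation.
    Setting hold_until == end_generation gives an abrupt release.
    """
    if generation <= hold_until:
        return start_fraction
    if generation >= end_generation:
        return 1.0
    progress = (generation - hold_until) / (end_generation - hold_until)
    return start_fraction + progress * (1.0 - start_fraction)


def pjd_fraction(generation, start_fraction=0.5, end_generation=90):
    """PJD: start at a reduced JROM (1/2 by default) and grow from generation 0."""
    return jrom_fraction(generation, start_fraction, 0, end_generation)


def dofd_fraction(generation, release_generation=30, end_generation=90):
    """DOFD: joints locked until release_generation, then gradually freed."""
    return jrom_fraction(generation, 0.0, release_generation, end_generation)


if __name__ == "__main__":
    for g in (0, 30, 60, 90, 300):
        print(g, pjd_fraction(g), dofd_fraction(g))
```

The defaults correspond to the quadruped case (development ending at generation 90); passing `end_generation=150` would mimic the NAO case described above.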
A summary of these experiments with a brief description of their characteristics is presented in Table 7. In addition, the source code of each experiment is available for the quadruped 1 and for the NAO 2 .
Summary of experiments and characteristics for quadruped and biped morphologies.
We begin our study of the influence that the application of JROM development has on learning by addressing the validity of each of the hypotheses that were established considering a quadruped morphology.
H1: A reduction of the motor space facilitates learning
To start, we first select different initial configurations of the quadruped (3 different joint angle offsets: 0°, 30°, and 60°) and evaluate the results in three cases: 1) When there is no variation in the JROM (ND); 2) When learning begins at generation 0 with a limited JROM of the lower limbs available (half of the final one) and it increases gradually until reaching the maximum JROM available at generation 90 (PJD); and 3) When the JROM of the lower limbs’ joints starts completely blocked up to generation 30, after which it is progressively released until reaching its maximum range at generation 90, and learning then continues with the adult morphology (DOFD). In all cases, ND, PJD, and DOFD, learning continues from generation 90 onwards with a fixed morphology and the JROM available is the same for all of them. The comparative results of these experiments are presented in Figure 6. The characteristics of the boxplots are the same for all the boxplots in the article. Each boxplot shows the median and the 25th and 75th percentiles of the results obtained at the end of learning over the 50 independent runs. The whiskers extend to 1.5 times the interquartile range (IQR). Single points are values that fall beyond the whiskers. The statistical analysis has been carried out using the two-tailed Mann-Whitney U test. 50 We consider a p-value of 0.05 as the significance threshold for rejecting the null hypothesis (the compared samples are equal). A Bonferroni correction 51 has been applied to the statistical analysis. For clarity, the numerical results from the statistical analysis were replaced with asterisks.
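For readers unfamiliar with this statistical procedure, the sketch below shows a simplified two-tailed Mann-Whitney U test with a Bonferroni-adjusted threshold, using only the Python standard library. It is an illustration under stated assumptions, not the analysis code used in the paper: it relies on the normal approximation without tie or continuity corrections, the function names are ours, and the number of comparisons in the example is hypothetical. In practice a statistics library (for example, SciPy's `scipy.stats.mannwhitneyu`) would normally be used.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-tailed p-value) via the normal
    approximation (no tie or continuity correction: a simplification)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u1, p_value


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


if __name__ == "__main__":
    # Made-up distances (metres), purely for illustration.
    nd = [1.0, 1.2, 0.9, 1.1, 1.3]
    pjd = [1.6, 1.8, 1.7, 1.9, 2.0]
    u, p = mann_whitney_u(nd, pjd)
    threshold = bonferroni_alpha(0.05, 3)  # 3 comparisons is hypothetical
    print(u, p, p < threshold)
```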

Results for the quadruped, considering a Joint Angle Offset (JAO) of the lower limbs’ joints of 0°, 30° and 60° for the Proximodistal JROM Development (PJD), the Freezing and Freeing Degrees of Freedom (DOFD) and the No Developmental (ND) cases.
Results of the learning process with the joints of the lower limbs starting at 0° (JAO: 0°) are shown in Figure 6-left. In it, there is no difference in learning between the ND and the developmental experiments (PJD and DOFD). When the JAO changes to 30° (Figure 6-middle), results partially change. PJD surpasses the distance achieved by the ND case (p-value of 1.3
Secondly, having observed that the configuration of morphology can affect the influence of JROM-based development on learning (in this case, the initial position of the joints of the lower limb) we now carry out a series of experiments in which we try to identify the causes of this effect. They consist in a sweep of multiple morphological configurations with a fixed morphology (no development), combining different values of the JAO and the available JROM of the joints of the lower limbs. The results are shown in Figure 7. The colour for each grid cell represents the median of the distance travelled, in meters, of 50 independent experiments for each combination of JAO and JROM displayed in Table 6. The best-valued combinations (those with a strong orange-red colour) are grouped around the JAO range between 20° and 80° and with total JROMs between 40° and 80°. Furthermore, according to Figure 7, a JAO of 30° and total JROM of 80° (leading to a JROM range of [

Joint Command Space Sweep (without development) addressing different joint angle offsets from
To further explore this hypothesis, we have carried out a series of experiments in which the motor space is initially strongly reduced (to 1/8 of the final JROM) and is progressively released until reaching the final JROM. This strong limitation has been tested over two JAO cases: one with a JAO of 30°, and another one with a JAO of 60°. Thus, on the one hand, in the case of the JAO of 30°, an initial JROM of 1/8 of the final one (an initial JROM of [

Results for initial JROM at 1/8 of the final value in Proximodistal JROM Development (PJD) experiments with Joint Angle Offset (JAO) of 30° (left) and 60° (right). The number with “PJD” indicates the generation where development ends.
Consequently, it seems that developmental speed is quite important when the optimum is not contained within the initial motor space, hinting that the exploration capabilities of the algorithm used to explore the Solution Space are related to how fast development should be.
From Figures 6 to 8, we can extract that there are morphological configurations that are more adequate for learning to walk than others, considering as morphological configuration both the JAO and the JROM available (the experiments presented in Figure 7 are a clear example of that). In addition, reducing the JROM available for each joint implies a reduction of the
This is the key point of JROM development: It helps to improve learning performance by reducing the size of the
The optimal joint commands (the area where the optimum is located in the motor space) must be available in the initial reduced The effects of the JAO can be observed in Figure 6, where JROM development is irrelevant for a JAO at 0° (far from the optimum), but beneficial at 30° and 60°, which given the initial JROM of one half the final one, both include the optimum in the initial The developmental speed for learning. As indicated before, Figure 8-right shows how considering different developmental speeds for the same JAO (60°) and initial JROM available (1/8) the results can be completely different. This may be motivated by the speed of change of the
Thus, it seems that the influence of JROM development as a developmental strategy is a combination of multiple factors, but all of them are related to the reduction of the
Videos of the best individuals obtained in some of the experiments can be found in our repository.3
Biped morphology
This section is intended to complete and complement the results obtained with the quadruped in a more taxing problem from a walking perspective. Biped walking is much more unstable and exploring the motor space becomes much harder.
H1: A reduction of the motor space facilitates learning
In this case, we started by studying the results obtained by applying different JROM development strategies and comparing them with the ND results (Figure 9). Regarding the fitness values (Figure 9-left), the majority of the developmental strategies provided better results than the no-development case (with p values below

Learning results for the NAO under Proximodistal JROM development: fully limited JROM (PJD-0), half JROM (PJD-0.5), abrupt DOF development (DOFDA), gradual DOF development (DOFDG), and No Development (ND). Left: Fitness at the end of learning (PJD ends at generation 150; DOFDA fully frees DOF at generation 30; DOFDG gradually increases JROM from generation 30 to 150). Middle: Distance travelled at the end of learning. Right: Percentage of falls of the best individuals for each type of experiment at each generation. The vertical dotted black line at generation 30 indicates the abrupt release of the DOFs
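The release schedules named in this caption can be compared side by side with a short sketch. It is a simplification under stated assumptions: every ramp is taken to be linear, PJD is assumed to start at generation 0, a frozen DOF is modelled as a fraction of 0, and the proximodistal ordering across joints is ignored (the function describes a single joint). The name `jrom_fraction` is hypothetical.

```python
def jrom_fraction(strategy, generation):
    """Fraction of the final JROM available to one joint at `generation`."""

    def ramp(start_gen, end_gen, start_frac):
        # Linear release (an assumption) from start_frac to 1.0.
        if generation <= start_gen:
            return start_frac
        if generation >= end_gen:
            return 1.0
        span = end_gen - start_gen
        return start_frac + (1.0 - start_frac) * (generation - start_gen) / span

    if strategy == "ND":        # no development: full JROM throughout
        return 1.0
    if strategy == "PJD-0":     # fully limited JROM, released until generation 150
        return ramp(0, 150, 0.0)
    if strategy == "PJD-0.5":   # half JROM, released until generation 150
        return ramp(0, 150, 0.5)
    if strategy == "DOFDA":     # DOF frozen, then fully freed at generation 30
        return 0.0 if generation < 30 else 1.0
    if strategy == "DOFDG":     # DOF frozen, then gradual release from 30 to 150
        return ramp(30, 150, 0.0)
    raise ValueError(f"unknown strategy: {strategy}")


# Available fraction of the final JROM at generation 90 for each strategy.
snapshot = {name: jrom_fraction(name, 90)
            for name in ("ND", "PJD-0", "PJD-0.5", "DOFDA", "DOFDG")}
```

Written this way, the abrupt strategy (DOFDA) is the only one whose motor space changes discontinuously, which is the property the comparison in Figure 9 turns on.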
The two parameters that make up the fitness value, travelled distance and the number of falls, are plotted in Figure 9-middle and Figure 9-right. Figure 9-middle displays the distance travelled in each type of experiment at the end of learning. The PJD experiments clearly beat the results of the ND case with a high statistical significance (p values below
Figure 9-right represents the percentage of falls that the best individuals of all independent executions suffer at each generation. At the end of learning, the ND experiment achieved 48
These results are in line with those obtained in the case of the quadruped: the relevance of JROM development is conditioned by both the reduction of the motor space, how it is reduced and how it is increased until reaching the final motor space, and by the capacity of the NEAT algorithm to find optimal solutions in the
Considering that these results show how an initial reduction of the JROM and its subsequent gradual release has improved learning to a greater or lesser degree, a question arises regarding the relevance of the motor space reduction: What would happen if the whole learning process were carried out by a non-developmental experiment whose JROM is reduced to half or a quarter (strong reduction) of the one shown in Table 3? This question aims to support the hypothesis that it is not enough to reduce the motor space to improve learning because the development of the morphology is also needed for maximizing learning performance. In addition, the proposed experiments will also allow us to address hypotheses about the relevance of JROM development in learning that have been raised in the literature, but not studied in depth by their authors, such as “it helps to find a high-yield area in the parameter space”, 17 “Strong constraints lead to a faster and safer learning”, 31 and “an initial reduction of the motor space, which simplifies exploration”. 30
Two experiments with a reduced motor space and fixed morphology, RND-0.5 and RND-0.25, which have a maximum JROM of 1/2 and 1/4 of that of ND respectively, are compared with a series of developmental experiments. These developmental experiments start learning with the same reduction of the motor space, but the motor space increases as development progresses until reaching the final one, which is the same as that of the ND case. These are the PJD-0.5 and PJD-0.25 experiments. In addition, all of them are compared with the ND case. The results of this comparison are presented in Figure 10. Figure 10-left shows how, regarding the fitness value, all experiments outperform the ND one, with p-values lower than

Comparison of No Development (ND), Proximodistal Joint Development (PJD-0.5, PJD-0.25), and fixed morphology experiments (RND-0.5, RND-0.25). Left: Fitness at the end of learning. Middle: Distance travelled at the end of learning. Right: Percentage of falls of the best individuals for each type of experiment at each generation.
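The quantity plotted in the right-hand panels (percentage of falls of the best individuals at each generation) can be computed as in the following sketch. The data layout is an assumption made for illustration: one list of booleans per independent run, with one entry per generation, and the toy values below are made up rather than taken from the experiments.

```python
def fall_percentage(fell_by_run):
    """Percentage of runs whose best individual fell, per generation.

    fell_by_run[r][g] is True if the best individual of independent
    run r fell at generation g (assumed layout, all runs equally long).
    """
    n_runs = len(fell_by_run)
    n_gens = len(fell_by_run[0])
    return [100.0 * sum(run[g] for run in fell_by_run) / n_runs
            for g in range(n_gens)]


# Toy data: 4 independent runs, 2 generations (illustrative values only).
toy_runs = [
    [True, False],
    [True, True],
    [False, False],
    [True, False],
]
percentages = fall_percentage(toy_runs)
```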
The results obtained by limiting the JROM in different ways (DOFD and RND) show how a reduction in the motor space leads to safer learning by decreasing the number of falls, a statement that has also been made by some authors in the literature on learning to walk with bipedal morphologies.31
However, it is important to remark that this does not seem to indicate that better results per se have to be obtained. On the one hand, the results of the RND-0.5 experiment seem to indicate that if the reduced
Videos of the best individuals obtained in some experiments can be found in our repository.4
Discussion
In the previous experiments, JROM-based development has been favourable or irrelevant to learning (Figures 6 to 9), but not unfavourable, except for abrupt development in the case of the NAO (DOFDA), where the fitness values were worse than in the case of ND (Figure 9-left), although the robots travelled a similar distance (Figure 9-middle). Nevertheless, obtaining favourable results by implementing JROM development is not straightforward, and some specific conditions must be fulfilled.
Positive outcomes
In our scenario, motor development favours learning based on: 1) Having a specific morphological configuration, defined by the morphology, the JAO and JROM, which determine the motor space (or the
The effectiveness behind the modification of the
On the other hand, the continuous variation of the
These conclusions are in line with our previous work13,52 on growth development, in which we found that growth development (modifying the parameters of the links
In the current article addressing joint development, the relationship between the fitness landscape and the Solution Space is also modified by means of variations in the
Another point to mention is the increase in the robot’s stability. Reducing the JROM eliminates actions that involve large movements that can destabilize the robot’s balance. Thus, small JROMs facilitate controlling the robot’s movements and increase its stability, which is especially relevant in the case of the NAO. Fewer falls during learning, especially at the beginning, produce more informative individuals (the learning processes are not interrupted and restarted), facilitating the search for optimal solutions (Figure 9-right) and reducing the so-called bootstrap problem.
Although in some cases a simple reduction of JROM without development has led to better results than ND with the full ROM, the JROM developmental process is needed because the optimal reduction of the
Finally, the implementation of JROM development in real robots presents a clear advantage: it does not modify the body of the robot. Thus, compared to other developmental strategies, such as the growth studied in our previous work,13,41 JROM development can be applied to any robot that is already on the market without requiring any hardware modification.
Limitations
On the other hand, we have seen how it is not enough to simply reduce the
Another limitation we have observed, and which is also reflected in the literature, is that morphological development based on JROM is much more favourable, among other cases, when it is slow and has an effect on the stability of the morphology, as is the case of the biped. This influence on stability can reduce the efficiency of JROM in those tasks where stability has little or no influence, as is the case of reaching.
These limitations present a challenge for the direct and effective implementation of the JROM strategy in real robots. Specifically, it is not straightforward either to predetermine the morphological configuration that optimally facilitates motor development to enhance learning or the developmental speed for a given morphology. Consequently, experimental tests are required to identify the optimal combination of parameters to provide the best results.
Conclusion
The main conclusion of the work presented here is that the influence of JROM-based morphological development is related to three main experimental parameters or conditions: 1) The reduction of the motor space, defined by the morphology (quadruped and biped) and its configuration (different initial positions of the joints and JROM available); 2) The developmental speed, responsible for maintaining an optimal relationship between the changes in the morphology that JROM development causes and the learning algorithm’s capacity to respond to these changes: if the developmental speed is too fast, the effect of morphological development becomes irrelevant for learning; and, 3) a synergy between the learning algorithm and the motor space at each developmental stage, encapsulating the previous two parameters, coupled to the type of developmental strategy. This synergy represents the capacity of the learning algorithm to establish a relationship between the reduced motor space, defined by the
The mechanisms discussed above, in essence, reduce the available Joint Command Space at the beginning of development, facilitating the search for and exploitation of optimal solutions when these are included in the reduced Joint Command Space or become included in the initial stages of development, and then provide an appropriate development path. This path involves a continuous modification of the solution space, and thus of the fitness landscape, throughout development, at a speed that allows the learning algorithms to adapt and that guides these algorithms towards the optimal controller. When this evolution of the fitness landscape takes place in a manner that does not support the above-mentioned synergies, development using JROM may be irrelevant or even harmful.
In particular, the results presented in this study have shown how the Proximodistal Joint Development (PJD) strategy has offered better learning performance than learning without morphological changes, or using the other developmental strategies tested (freezing and freeing degrees of freedom with abrupt and gradual release). Nevertheless, most of the other developmental mechanisms in which a gradual increase of JROM at the right speed was contemplated were also quite successful in achieving the desired result, attesting to the robustness of the general approach.
There is obviously still a lot of work to be carried out in this field, both in the study of JROM development strategies in different use cases, to confirm the results obtained here, and in the study of other developmental strategies such as sensor development. In our work we have started this path with growth-based and JROM-based development and will continue with sensor development and the study of the interactions that may occur when different modalities are used simultaneously.
Footnotes
Acknowledgments
This research was partially funded by the European Union’s Horizon 2020, Research and Innovation Programme, GA 101070381 (“PILLAR-Robots - Purposeful Intrinsically-motivated Lifelong Learning Autonomous Robots”), by Xunta de Galicia (EDC431C-2021/39 and M. Naya-Varela’s grant ED481B), by the Spanish Science and Education Ministry (PID2021-126220OB-I00), and the Ministry for Digital Transformation and Civil Service and Next-Generation EU/RRF (TSI-100925-2023-1), by “ERDF A way of making Europe”, Centro de Investigación de Galicia “CITIC” (ED431G 2019/01), and “Centro de Supercomputación de Galicia” (CESGA).
Conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This research was partially funded by the European Union’s Horizon 2020, Research and Innovation Programme, GA 101070381 (“PILLAR-Robots - Purposeful Intrinsically-motivated Lifelong Learning Autonomous Robots”), by Xunta de Galicia (EDC431C-2021/39 and M. Naya-Varela’s grant ED481B), by the Spanish Science and Education Ministry (PID2021-126220OB-I00), and the Ministry for Digital Transformation and Civil Service and Next-Generation EU/RRF (TSI-100925-2023-1), by “ERDF A way of making Europe”, Centro de Investigación de Galicia “CITIC” (ED431G 2019/01).
