## The Evolution of Co-op Games

Play together instead of against each other: as early as the late 1970s, some game designers followed this motto and created completely new gameplay experiences. One of the very first co-op games was Atari's arcade machine Fire Truck from 1978. In this game, programmed by Howard Delman, two players have the task of maneuvering a fire truck through the winding streets of a city together. Player A takes a seat in the driver's cab of the truck and steers and accelerates, while player B (standing and equipped with a separate steering wheel) controls the wheels of the attached ladder cart.

Each player can also use a white button on the right to make themselves heard: player A with a horn, player B with a bell. Beyond that, the rule is simple: the further the duo gets, the more points it receives. When certain sections are reached, the fuel gauge fills up again a little. If, on the other hand, the tank runs empty, the game ends.

## Egoistic punishment outcompetes altruistic punishment in the spatial public goods game

We first focus on the performance of egoistic punishment in the well-mixed population. It is well known that a peer altruistic punisher cannot survive in a well-mixed population and that a pool altruistic punisher can prevail only with the additional punishment of pure cooperators (ref. 51). However, egoistic punishers can survive without further complexity, as shown in Figures S1 and S2 in the supplementary material. Pool egoistic punishers prevail as the fine ($$\beta$$) increases or as the punishment cost ($$\alpha$$) decreases, and the system consecutively transitions from the pure $$\mathrm D$$ phase to the $$\mathrm D+P_S$$ phase, then to the $$\mathrm C+D+P_S$$ phase. Peer egoistic punishers replace defectors and are then invaded by pure cooperators as $$\beta$$ increases or as $$\alpha$$ decreases. Accordingly, the system discontinuously transitions from the pure $$\mathrm D$$ phase to the pure $$\mathrm P_B$$ phase, and then consecutively transitions to the $$\mathrm P_B+C$$ phase. In both peer and pool punishment, the phase transition boundaries in the $$(\beta , \alpha )$$ plane move left as the cooperative coefficient (r) increases. The difference is that the critical lines are linear in the pool mode, while they are nonlinear in the peer mode.

The performance of egoistic punishment stands out in the well-mixed population, and the influence of the parameters on the egoistic punishment mechanism seems obvious. In fact, the fitness of egoistic punishers depends not only on $$\alpha$$ and $$\beta$$, but also on the number of defectors. The fitness of the defector, in turn, depends on the number of punishers. It is difficult to portray and demonstrate the evolution of such complex relationships in a well-mixed homogeneous system. Therefore, it is necessary to study the evolution of cooperation and its complex dynamic processes in a structured population. By introducing a four-neighbor lattice structure, the population is split into heterogeneous interactive groups. Despite its simplicity, the spatial model exhibits highly complex behavior on different spatial and time scales.
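The four-neighbor lattice mentioned above can be sketched as follows. This is only an illustration, not the authors' code: the function names are hypothetical, and the periodic (wrap-around) boundary is an assumption that the text does not state.

```python
def four_neighbors(x, y, L):
    """Return the four nearest neighbors of site (x, y) on an L x L lattice.

    Assumption: periodic boundaries, so the lattice wraps around at its edges.
    """
    return [
        ((x - 1) % L, y),
        ((x + 1) % L, y),
        (x, (y - 1) % L),
        (x, (y + 1) % L),
    ]


def group_members(x, y, L):
    """Interaction group centered on (x, y): the focal site plus its four neighbors."""
    return [(x, y)] + four_neighbors(x, y, L)


# A corner site on a 4 x 4 lattice still has a full group of five members.
print(group_members(0, 0, 4))
```

Each group built this way has five members, and different groups see different local mixes of strategies, which is one way to read the "heterogeneous interactive groups" in the text.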

We ask which punishers win, those motivated by egoism or those motivated by altruism, by examining which mechanism achieves a higher cooperation level and how the introduced punishment strategy performs. We first explored the performance of the egoistic punishment proposed in this study and its fundamental mechanism for promoting cooperation. Then we compared the performance of egoism with that of typical altruism under the peer-punishment and pool-punishment modes in terms of the level of cooperation, the evolutionary equilibrium (EE), and the punisher’s survivability.

### Performances of egoism under peer- and pool-punishment modes

First, we illustrate the phase transitions of EI (egoistic pool) and EP (egoistic peer) punishment over the full fine-cost area for $$r=3.5$$ (cooperators cannot survive in the absence of egoistic punishment) and $$r=3.8$$ (C and D coexist in the absence of punishment) by systematic Monte Carlo (MC) simulations. In each case, we determined the stationary frequencies of the strategies while varying the fine $$\beta$$ for many fixed values of the cost $$\alpha$$. The transition points and the types of phase transitions are identified from this dataset. The phase boundaries are plotted in the full fine-cost phase diagrams shown in Fig. 1.
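The step of reading transition points off the stationary frequencies could look roughly like the sketch below. It is an assumption-laden illustration: the helper names, the tolerance `tol`, and the toy frequencies are all hypothetical and are not taken from the paper's data. It also only locates where the set of surviving strategies changes; deciding whether a transition is continuous or discontinuous would additionally require inspecting how the frequencies behave near the boundary.

```python
def classify_phase(freqs, tol=1e-3):
    """Label a stationary state by the strategies whose frequency exceeds tol.

    freqs maps strategy names ("C", "D", "P") to stationary frequencies.
    """
    present = [s for s in ("C", "D", "P") if freqs.get(s, 0.0) > tol]
    return "+".join(present)


def transition_points(betas, stationary_freqs, tol=1e-3):
    """Return (beta_before, beta_after, phase_before, phase_after) at each phase change."""
    labels = [classify_phase(f, tol) for f in stationary_freqs]
    changes = []
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:
            changes.append((betas[i - 1], betas[i], labels[i - 1], labels[i]))
    return changes


# Toy, made-up stationary frequencies for one fixed cost value (not simulation output).
betas = [0.1, 0.2, 0.3]
toy_freqs = [
    {"D": 1.0},
    {"D": 0.6, "P": 0.4},
    {"C": 0.2, "D": 0.4, "P": 0.4},
]
for change in transition_points(betas, toy_freqs):
    print(change)
```

Repeating such a scan for many fixed values of $$\alpha$$ would give one set of boundary points per cost value, which can then be joined into phase boundaries.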

Figure 1

Full fine-cost phase diagrams of egoistic punishment in the structured population. Solid (dashed) lines indicate continuous (discontinuous) phase transitions. The results show that the solutions differ significantly between the punishment modes. The phase transitions in egoistic pool punishment are more complex.

The phase diagram of EI in the structured population for $$r=3.5$$ (Fig. 1a) is similar to that in the well-mixed population. However, because the spatial structure restricts interactions, the boundary value of ($$\beta , \alpha$$) required for the egoistic punishers’ survival is lower. The system transitions from the pure D phase to the $$\mathrm D+P_S$$ phase, then to the $$\mathrm C+D+P_S$$ phase as $$\beta$$ increases. When increasing the cost $$\alpha$$ at a high value of the fine ($$\beta =0.9$$), one can observe three discontinuous transitions: from the pure P phase ($$\alpha =0$$) to the pure C phase, then to the pure D phase, and then to the $$\mathrm C+D+P_S$$ phase. In the D phase, punishers are first eliminated by pure cooperators, and then defectors invade the cooperators and prevail. Below this phase, the defector is eliminated first, and above this phase, the three strategies coexist under cyclic dominance.

The phase diagram changes considerably when $$r=3.8$$, as Fig. 1b illustrates. The $$\mathrm D+P_S$$ phase is surrounded by the $$\mathrm C+D+P_S$$ phase when the value of ($$\beta , \alpha$$) is higher, which is the result of the overlapping effects of multiple forces. The synergy factor r supports C, and the fine $$\beta$$ supports $$P_S$$ and inhibits D. In addition, a very important influence is the three strategies’ cyclic dominance (as discussed in detail below). Due to r’s support for C, $$P_S$$ cannot survive in high-cost areas. In the early stage of evolution, the increase of C provides the impetus for the evolution of D and at the same time increases the exploitation of P, so that the punishment loses its effectiveness.

Compared with the EI punishment, the phase diagrams of the EP punishment seem straightforward, as displayed in Fig. 1c and d. When increasing the fine $$\beta$$ at a low value of cost $$\alpha$$ ($$\alpha <0.3$$ for $$r=3.5$$ and $$\alpha <0.32$$ for $$r=3.8$$), $$P_B$$ gradually dominates, and the system transitions from the D (or $$\mathrm C+D$$) phase to the $$\mathrm D+P_B$$ phase and finally forms the pure $$\mathrm P_B$$ phase. When the cost value is high, the $$\mathrm P_B$$ phase replaces the D (or $$\mathrm C+D$$) phase directly. It is worth noting that in the $$\mathrm P_B$$ phase, although $$P_B$$ behaves in the same manner as C, it is essentially different from C because $$P_B$$ retains its punitive attribute; that is, once D invades, $$P_B$$ is able to eliminate it. In addition, C and $$P_B$$ coexist only when the cost is zero.

The presented fine-cost phase diagrams show clearly that $$P_B$$ has an absolute evolutionary advantage in eliminating D, while $$P_S$$ can survive in a lower-fine and higher-cost area. The three roles of r, $$\beta$$, and cyclic dominance lead to the disappearance of the evolutionary advantage of $$P_S$$ in high-cost and high-r areas. The two punishment modes differ greatly in their phase transition types and in the way the two punishers survive. This is due to cyclic dominance, which exists in the EI punishment mechanism but not in the EP punishment mechanism, as shown in Fig. 2.

Figure 2

Evolution of three strategies under the EI and EP mechanisms. Snapshots (a) to (d) show steps 1, 100, 1000, and 3000 under the EP mechanism, and snapshots (e) to (h) show steps 1, 100, 300, and 500 under the EI mechanism. Results were obtained for $$r=2.0,\alpha =0.5,\beta =1.0$$, and $$L=1000$$ with prepared initial distributions. The rock-paper-scissors phenomenon in the EI mechanism allows the egoistic pool punisher to coexist with the other two strategies. In the EP mechanism, by contrast, the defector first eliminates the pure cooperator, and the egoistic peer punisher eventually eliminates the defector.

Cyclic dominance, or multiple Nash equilibrium, is an important and common property of systems with three or more strategies. This phenomenon has been reported in previous studies, and we will not explain it in detail. Here, we are interested in why this phenomenon occurs in the EI but not in the EP mechanism, and what effect it has on the EI punishment mechanism.

To make the competition between strategies easier to observe, we show some typical snapshots of the strategies, starting from a prepared initial distribution, under the EP and EI mechanisms (Fig. 2). The results for a random initial distribution are shown in Figure S3 in the supplementary material. In the EP mechanism (Fig. 2a–d), the defector invades the cooperator faster than the punisher invades the defector. Because the egoistic peer punisher ($$P_B$$) is the same as the pure cooperator if the group has no defectors, the boundary between these two cooperative strategies is not disturbed. Thus, the defector eliminates the cooperator first, and then the punisher destroys the defector cluster. After eliminating the defector, the egoistic peer punisher becomes a pure cooperator without paying any additional cost, and the group becomes a fully cooperative group with a hidden punishment mechanism. Once a mutated defector attacks the group, the punisher reappears to resist the defector’s invasion, driven by the need to protect its cooperation benefits. In the EI mechanism, the three strategies suppress each other, as shown in Fig. 2e–h. Swirls observed at the junction of the three strategies rotate and gradually extend (Fig. 2f). In this rock-paper-scissors dynamic, defectors invade cooperators’ territory, cooperators invade egoistic pool punishers, and egoistic pool punishers defeat defectors at the border between them. The three strategies coexist when the evolution is stable.

Due to the strategies’ cyclic dominance, the influence of various parameters in the EI mechanism on cooperation is counter-intuitive. Figure 3 shows the influence of variables $$\alpha$$ and $$\beta$$ on the evolution of strategies.

Figure 3

The effect of the punishment fine [graph (a)] and the cost [graph (d)] on the evolution of strategies in the EI mechanism. Graphs (b) and (c) show the evolution of strategies on a time scale when $$r=2.0$$; graphs (e) and (f) show the evolution of strategies on a time scale when $$r=3.5$$. Results were obtained for $$\alpha =0.1$$ in the upper row and for $$\beta =0.5$$ in the lower row. Initially, the three strategies are distributed uniformly at random. Results show that $$\alpha$$ affects the level of cooperation and $$\beta$$ adjusts the proportion of the pure cooperator and the egoistic pool punisher in the coexistence phase of the three strategies.

When increasing the fine $$\beta$$ for a fixed value of cost $$\alpha$$ ($$\alpha =0.1$$) (Fig. 3a), the cooperation rate (i.e., $$\rho _C+\rho _{P_S}$$, or $$1-\rho _D$$) was largely unchanged in the three strategies’ coexistence stage, although the proportions of the pure cooperator ($$\rho _C$$) and the punisher ($$\rho _{P_S}$$) changed. Moreover, the frequency of the punisher decreases, contrary to the intuition that the fine is good for the punisher and bad for the defector. The evolutionary time scales explain this. When the fine is small ($$\beta =0.5$$) (Fig. 3b), the damage to the defector and the reinforcement of the punisher are weak. The defector first gains the evolutionary advantage when the cooperative synergy is low ($$r=2.0$$); then the punisher increases rapidly through the effect of the strategies’ cyclic dominance. In contrast, when the fine is large ($$\beta =0.9$$) (Fig. 3c), the damage to the defector and the reinforcement of the punisher are strong. The punisher first increases rapidly, whereas the defector disappears. Under the influence of cyclic dominance, the pure cooperator invades the punisher population and occupies the dominant position in the stable state. Therefore, apart from enabling the invasion of a pure defector group, the fine’s main role is to adjust the proportion of the pure cooperator and the egoistic pool punisher in the strategies’ coexistence phase.

When increasing the cost $$\alpha$$ for a fixed value of fine $$\beta$$ ($$\beta =0.5$$) (Fig. 3d), the egoistic punisher’s frequency increases slightly, which contradicts the common-sense expectation that the cost weakens the punisher. We compare the evolution of strategies on time scales under high- and low-cost conditions. When the cost is low ($$\alpha =0.1$$) (Fig. 3e), the punisher first dominates and weakens the defector. Through cyclic dominance, the pure cooperator increases rapidly in the population of punishers and dominates in the stable state. When the cost is high ($$\alpha =0.9$$) (Fig. 3f), the heavy cost first greatly weakens the punisher, so the defector is dominant. Although the punisher has an opportunity to invade the defectors’ population, the pure cooperator destroys it and the punisher ultimately does not succeed. Therefore, the ultimate impact of the cost is to adjust the overall level of cooperation by affecting the egoistic pool punisher’s initial status.

In addition, r has a non-intuitive effect in EI: the cooperation level decreases when r increases. Figure S4 in the supplementary material shows these results.

From the above results, we can see that in the EI mechanism the parameters first act on the strategies directly, and then the strategies’ cyclic dominance sets in and changes the final effect of the parameters. It is worth mentioning that the effect of cyclic dominance in the three strategies’ coexistence state is only reflected in the first-order role; that is, if the punisher temporarily dominates under the influence of the parameters, the pure cooperator will ultimately dominate; if the defector has the advantage, the punisher dominates; and if the pure cooperator is dominant, the defector will ultimately prevail.
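The first-order rule described above can be written down as a simple lookup. This is only a restatement of the verbal rule in code, with hypothetical names; it does not simulate any dynamics.

```python
# Who invades whom under cyclic dominance in the EI mechanism, as described in the text:
# defectors invade cooperators, cooperators invade pool punishers, punishers invade defectors.
INVADED_BY = {"C": "D", "P": "C", "D": "P"}


def eventual_leader(transient_leader):
    """Strategy expected to dominate in the end, given the one that leads at first."""
    return INVADED_BY[transient_leader]


for first in ("P", "D", "C"):
    print(f"{first} leads early -> {eventual_leader(first)} dominates later")
```

Applying the rule three times in a row returns the starting strategy, which is what makes the dominance cyclic.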

In general, egoistic punishment can effectively promote cooperation, whether through peer-punishment or pool-punishment methods. This result is robust to different strategy-update rules (see Figure S5 in the supplementary material for details). However, the two modes differ greatly in how strongly they promote cooperation and in the operating mechanisms behind them.

### Comparison of egoistic and altruistic punishment

The above part fully explored egoistic punishment under the two punishment modes. In this part, we focus on the comparison between egoistic and altruistic punishment from the three following aspects: the level of cooperation when the evolution is stable, the type of EE, and the punisher’s survivability.

Figure 4

The levels of cooperation as a function of punishment cost ($$\alpha$$) and punishment fine ($$\beta$$) at $$r = 2.0, 3.5, 3.8$$, and 4.0 in four punishment mechanisms. From top to bottom, the rows of graphs represent the EI, AI, EP, and AP mechanisms, respectively. From left to right, the columns represent $$r = 2.0, 3.5, 3.8$$, and 4.0. Results show that egoistic punishment performs better than altruistic punishment. EP can maintain a high level of cooperation, while EI can promote cooperation within a larger range of parameters for $$r \le 3.5$$.

First, we compared the level of cooperation. The cooperation rate is usually taken to be the ratio of cooperators (C and $$P_i$$, $$i=R, G, S,$$ or $$B$$) in the population in an evolutionarily stable state. To make the comparison more comprehensive, we studied the stationary ratio of cooperators when simultaneously changing the values of $$\alpha$$ and $$\beta$$ for different values of the synergy factor r, as shown in Fig. 4. There is a significant difference between egoism and altruism in promoting cooperation. Under the peer-punishment mode (graphs ($$c_i$$) and ($$d_i$$)), the cooperation rates of EP and AP can reach full cooperation over a large parameter area. However, egoistic punishment can promote cooperation with lower fines in high-cost areas when $$r\ge 3.5$$. Under the pool-punishment mode (graphs ($$a_i$$) and ($$b_i$$)), egoistic punishment promotes cooperation in a larger fine-cost area than altruistic punishment, especially under low-synergy-factor and high-cost conditions. Despite a small transition interval (as Fig. 1 explains), egoistic punishment has a very significant advantage. In summary, egoistic punishment performs better than altruistic punishment in promoting cooperation under the two punishment modes. Moreover, EP can maintain a high level of cooperation, while EI can promote cooperation within a larger range of parameters when $$r\le 3.5$$.
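As a small illustration of the cooperation rate defined above, the sketch below counts strategies on a lattice and reports the share of all cooperative strategies (pure cooperators plus punishers), which equals one minus the defector frequency. The function names and the toy lattice are hypothetical.

```python
from collections import Counter


def strategy_frequencies(lattice):
    """Fraction of sites held by each strategy; lattice is a list of rows of labels."""
    counts = Counter(s for row in lattice for s in row)
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items()}


def cooperation_rate(lattice, defector="D"):
    """Share of cooperative strategies (C and punishers), computed as 1 - rho_D."""
    return 1.0 - strategy_frequencies(lattice).get(defector, 0.0)


# Toy 2 x 2 configuration: one cooperator, one punisher, two defectors.
toy_lattice = [["C", "D"], ["P", "D"]]
print(strategy_frequencies(toy_lattice))
print(cooperation_rate(toy_lattice))
```

In a simulation, such a rate would be averaged over the stationary regime rather than read from a single snapshot.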

The advantages of egoistic punishment seem to disappear when $$r=2.0$$ under peer punishment, as Fig. 4 ($$c_1$$) and ($$d_1$$) show. In fact, the defector is punished much more severely under altruistic punishment than under egoistic punishment, although the egoistic punishers’ fitness is larger than that of altruistic punishers. In detail, if we suppose the number of punishers in the group is the same, a defector loses $$\beta N_{P_R}$$ when encountering altruistic punishers ($$P_R$$), whereas the defector loses $$\beta (r/5) N_{P_B}$$ when encountering egoistic punishers ($$P_B$$). If we make the losses for defectors in both games the same by setting $$\beta (P_R)=\beta (P_B)\, r/5$$, egoistic punishment reaches a steady state of full cooperation and evolves faster than altruistic punishment. This result shows that if defectors are punished to the same degree, EP promotes cooperation more efficiently than AP by increasing punishers’ fitness.
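The two loss expressions and the rescaling of the fine can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and reading the factor 5 as the group size is an assumption.

```python
def defector_loss_altruistic(beta, n_punishers):
    """Loss of a defector facing altruistic peer punishers: beta * N_PR."""
    return beta * n_punishers


def defector_loss_egoistic(beta, n_punishers, r, group_size=5):
    """Loss of a defector facing egoistic peer punishers: beta * (r / 5) * N_PB.

    Assumption: the 5 in the text's formula is the group size.
    """
    return beta * (r / group_size) * n_punishers


def equivalent_altruistic_fine(beta_egoistic, r, group_size=5):
    """Altruistic fine that gives the same defector loss: beta(P_R) = beta(P_B) * r / 5."""
    return beta_egoistic * r / group_size


r, beta, n = 2.0, 1.0, 2
print(defector_loss_altruistic(beta, n))      # loss under altruistic punishment
print(defector_loss_egoistic(beta, n, r))     # smaller loss under egoistic punishment
beta_matched = equivalent_altruistic_fine(beta, r)
print(defector_loss_altruistic(beta_matched, n))  # matches the egoistic loss
```

With $$r=2.0$$, the same nominal fine hits a defector only 40% as hard under egoistic punishment, which is why the two mechanisms are compared at matched losses.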

Promoting cooperation under both motives depends significantly on the punishment cost ($$\alpha$$) and fine ($$\beta$$). The harsher the punishment and the lower the cost, the greater the possibility of promoting cooperation, which is also what altruistic punishment requires to promote cooperation. Egoistic punishment can promote cooperation in areas with lower fines and higher costs, which shows that it places more tolerant requirements on the punishment conditions. This result provides another way to improve cooperation through punishment; that is, a punishment mechanism that compensates the punisher and tolerates the defector. On the one hand, compensation covers part of the punisher’s cost and improves its fitness; on the other hand, tolerant punishment can still weaken the defector’s fitness and promote cooperation. Results show that the latter way, the one egoistic punishment takes, performs better.

Then, we compared the survivability of the four types of punishers when the punishment cost is low ($$\alpha = 0.05$$), medium ($$\alpha = 0.5$$), and high ($$\alpha = 0.95$$), as shown in Fig. 5. When increasing the fine $$\beta$$ at a low cost value in the peer-punishment mode, altruistic punishers ($$P_R$$) are the first to prevail and occupy the population, followed by the egoistic punishers ($$P_B$$). After the peer punishers occupy the population, they behave like pure cooperators and are easily invaded by pure cooperators, so they cannot be identified when there is no defector. Therefore, we marked the first and last $$\beta$$ values at which the peer punishers occupy the population in the graphs, while omitting the intermediate values for easier observation. As the cost increases, it becomes increasingly difficult for the peer punishers to survive. As r increases, $$P_B$$ is more survivable than $$P_R$$.

Figure 5

The frequencies of punishers as a function of punishment fines in four punishment mechanisms at different costs ($$\alpha$$) and synergy factors (r). Red solid circles represent the egoistic pool punisher ($$P_S$$) and blue circles the egoistic peer punisher ($$P_B$$). Black solid squares represent the altruistic pool punisher ($$P_G$$) and black hollow squares the altruistic peer punisher ($$P_R$$). Results show that the survivability of $$P_S$$ and $$P_G$$ is stronger than that of $$P_B$$ and $$P_R$$, but their adaptability is worse. Moreover, $$P_S$$ can survive better in lower-fine areas than $$P_G$$. After the peer punishers occupy the population, they behave like pure cooperators and are easily invaded by pure cooperators, so they cannot be identified when there is no defector. Therefore, we marked the first and last $$\beta$$ values at which the peer punishers occupy the population in the graphs, while omitting the intermediate values for easier observation.

Figure 6

Evolutionary equilibrium states of four punishment mechanisms. We show the evolutionary equilibrium states of the strategies from full defection (D) to full cooperation (C). The red line indicates the pure cooperator, the blue line indicates the defector, the dark yellow line in EP indicates the egoistic peer punisher ($$P_B$$), the green line in EI is the egoistic pool punisher ($$P_S$$), the black dashed line in AP is the altruistic peer punisher ($$P_R$$), and the black solid line in AI is the altruistic pool punisher ($$P_G$$). “No EE” means that no such evolutionary equilibrium state exists under the given punishment mechanism; that is, these strategies cannot coexist. The strategy evolution under the four mechanisms is different; thus, we try to observe the evolutionarily stable state under the same parameter combination ($$r, \alpha , \beta$$) in the same EE state. D phases are obtained for (2, 0.5, 0.5); D+C phases for (3.8, 0.8, 0.2); D+P phases for (3.5, 0.1, 0.2) in AI and (3.5, 0.1, 0.1) for other mechanisms; D+C+P phases for (2, 0.1, 0.9) in AI and EI, and (2, 0.6, 0.9) in AP. The full-cooperation phases were obtained for $$r = 4$$. In AI, the P phase is obtained for (r, 0, 0.3) and the C phase for (r, 0.0001, 0.3). In EI, the P phase is obtained for (r, 0, 0.2), the C phase for (r, 0.0001, 0.2), and the P+C phase for (r, 0, 0.6). In fact, when $$\alpha = 0$$, punishers are the same as cooperators. In EP and AP, although cooperators and punishers cannot be distinguished in theory, simulation results show that all three states of C, P, and C+P can be observed in the full-cooperation area.

In the pool-punishment mode, when increasing the fine $$\beta$$ at a low cost value, the pool punishers ($$P_S$$ and $$P_G$$) rise rapidly and then decrease gradually as the punishment fine moves from tolerant to severe. The egoistic punishers ($$P_S$$) prevail first, but the frequency of the altruistic punishers ($$P_G$$) is higher than that of $$P_S$$. As the cost increases, $$P_G$$ can no longer survive, while $$P_S$$ can survive in the high-fine area. As r increases, $$P_S$$ can survive with a lower fine.

Peer punishers have strong survivability within a considerable range of parameters but are easily invaded by pure cooperators, while the parameter range in which pool punishers survive is quite limited. In terms of motivation, egoistic punishers can survive with higher costs and lower fines than altruistic punishers.

Finally, we analyzed the EE of egoism and altruism in the two punishment modes to compare the dynamic evolution processes shown in Fig. 6 (the results can also be obtained from a random initial distribution). Following Maynard Smith’s original definition, EE is the ultimate stable proportion to which an evolutionarily changing population converges. A game of three strategies therefore has no more than seven EE states, one for each non-empty combination of surviving strategies. Analyzing the EE that emerged in the four punishment mechanisms reveals three regions on the way from high-cost, lenient punishment to low-cost, severe punishment: first a defection-prosperous area, then a transitional area in which cooperative strategies emerge and coexist with defectors, and finally a full-cooperation area in which cooperation completely occupies the population. The factors influencing the evolution of cooperation differ between these regions. In the defection-prosperous area, two types of EE states exist: D and C+D. The D phase appears when $$r=2.0$$ and 3.5 (the blue region in Fig. 4), and the C+D phase emerges when $$r=3.8$$ and 4.0 (the cyan and green regions in Fig. 4), which reveals that the synergy factor r supports cooperation. In the transitional area, two equilibrium states emerge: $$\mathrm D+P_i$$ and $$(\mathrm{C+D+P}_i)_c$$ ($$i=R, S,$$ and $$G$$). In the EP mechanism, only the $$\mathrm D+P_B$$ phase exists because the pure cooperator cannot take a free ride on the egoistic peer punisher. This stage reflects the influence of punishment in promoting cooperation. Interestingly, the full-cooperation area has three EE states: P, P+C, and C, but they do not appear in all four punishment models. The P+C phase cannot be observed in the AI mechanism, which reflects the mandatory nature of institutional punishment as implemented by a government. In peer punishment (EP and AP), observing all three EEs is understandable because the individual punisher becomes the same as the pure cooperator in the absence of defectors. This stage reflects the influence of cyclic dominance in the three strategies of EE.
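The upper bound of seven EE states for a three-strategy game can be checked by listing every non-empty combination of surviving strategies. The sketch below assumes that an EE state is identified solely by which strategies survive; the function name is hypothetical.

```python
from itertools import combinations


def candidate_ee_states(strategies=("C", "D", "P")):
    """All non-empty subsets of the strategy set, written as phase labels."""
    return [
        "+".join(subset)
        for size in range(1, len(strategies) + 1)
        for subset in combinations(strategies, size)
    ]


states = candidate_ee_states()
print(states)
print(len(states))
```

For three strategies this gives $$2^3-1=7$$ candidates; which of them actually occur depends on the punishment mechanism, as the "No EE" entries in Fig. 6 indicate.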