When species richness is plotted against area, the graph follows the equation log S = log C + Z log A, where Z is the slope of the line. Z values are usually in the range of 0.1 to 0.2, but if very large areas such as entire continents are analyzed, the slope is much steeper (0.6 to 1.2).

Why is this so?

(Here S = species richness and A = the area under consideration.)

Z is a fitted constant that generally falls between 0.1 and 0.3 regardless of the region or taxonomic group, so the slope is broadly similar across studies. Among very large areas such as continents, the slope tends to be much steeper, because the larger the area, the larger the number of species (species richness); for these areas Z tends to be 0.6 to 1.2. (Note: when Z is small, a relatively small area is enough to capture most of the species.)
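As a rough illustration of how Z and C are obtained in practice, the sketch below fits a straight line to log-transformed data by ordinary least squares. The function name `fit_log_log_power` and the data are hypothetical; the richness values are generated from an assumed curve S = 10 A^0.25 so that the recovered slope can be checked, and base-10 logarithms are an assumption (any base gives the same Z).

```python
import math

def fit_log_log_power(areas, richness):
    """Least-squares fit of log10(S) = log10(C) + Z * log10(A). Returns (Z, C)."""
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(s) for s in richness]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx
    log_c = mean_y - z * mean_x
    return z, 10 ** log_c

# Hypothetical data generated from S = 10 * A**0.25 (not real survey data)
areas = [1, 10, 100, 1000, 10000]
richness = [10 * a ** 0.25 for a in areas]

z, c = fit_log_log_power(areas, richness)
print(f"Z = {z:.3f}, C = {c:.3f}")
```

With real data the points scatter around the line, so the fitted Z depends on which areas are sampled, which is one reason slopes differ between small plots and whole continents.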

## The island species–area relationship: biology and statistics

Kostas A. Triantis, Azorean Biodiversity Group, Departamento de Ciências Agrárias – CITAA, Universidade dos Açores, Angra do Heroísmo, Pico da Urze, 9700-042, Terceira, Açores, Portugal.

Azorean Biodiversity Group, Departamento de Ciências Agrárias – CITAA, Universidade dos Açores, Angra do Heroísmo, Pico da Urze, 9700-042, Terceira, Açores, Portugal

UMR CNRS-UM2-IFREMER-IRD 5119 ECOSYM, Université Montpellier 2 cc 093, 34 095 Montpellier Cedex 5, France

‘Rui Nabeiro’ Biodiversity Chair CIBIO – Universidade de Évora, Casa Cordovil, Rua Dr. Joaquim Henrique da Fonseca, 7000-890 Évora, Portugal

All authors contributed equally to this work.

Biodiversity Research Group, School of Geography and the Environment, University of Oxford, South Parks Road, Oxford OX1 3QY, UK

Center for Macroecology, Evolution and Climate, Department of Biology, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark

Department of Ecology and Taxonomy, Faculty of Biology, National and Kapodistrian University, Athens GR-15784, Greece

### Abstract

**Aim** We conducted the most extensive quantitative analysis yet undertaken of the form taken by the island species–area relationship (ISAR), among 20 models, to determine: (1) the best-fit model, (2) the best-fit model family, (3) the best-fit ISAR shape (and presence of an asymptote), (4) system properties that may explain ISAR form, and (5) parameter values and interpretation of the logarithmic implementation of the power model.

**Methods** We amassed 601 data sets from terrestrial islands and employed an information-theoretic framework to test for the best-fit ISAR model, family, and shape, and for the presence/absence of an asymptote. Two main criteria were applied: *generality* (the proportion of cases for which the model provided an adequate fit) and *efficiency* (the overall probability of a model, when adequate, being the best at explaining ISARs evaluated using the mean overall AIC_{c} weight). Multivariate analyses were used to explore the potential of island system properties to explain trends in ISAR form, and to describe variation in the parameters of the logarithmic power model.

**Results** Adequate fits were obtained for 465 data sets. The simpler models performed best, with the power model ranked first. Similar results were obtained at model family level. The ISAR form is most commonly convex upwards, without an asymptote. Island system traits had low descriptive power in relation to variation in ISAR form. However, the *z* and *c* parameters of the logarithmic power model show significant pattern in relation to island system type and taxon.

**Main conclusions** Over most scales of space, ISARs are best represented by the power model and other simple models. More complex, sigmoid models may be applicable when the spatial range exceeds three orders of magnitude. With respect to the log power model, *z*-values are indicative of the process(es) establishing species richness and composition patterns, while *c*-values are indicative of the realized carrying capacity of the system per unit area. Variation in ISAR form is biologically meaningful, but the signal is noisy, as multiple processes constrain the ecological space available within island systems and the relative importance of these processes varies with the spatial scale of the system.

**Appendix S1** References of data sources.

**Appendix S2** Properties of the data sets used.

**Appendix S3** Supplementary analyses and results.

## Earth's Biodiversity

How many species exist on Earth? This is a question that scientists have attempted to answer for centuries, beginning with Carl Linnaeus's first steps towards naming and classifying organisms in *Systema Naturae* in 1735. Linnaeus himself described over 12,000 species, and in the nearly 300 years since, scientists have formally described and studied over a million more. Have scientists already discovered all, or most, of the species that live on Earth? If not, how many more remain to be discovered? These questions are inherently difficult to answer because they require us to estimate how much we *don't* know: how many species scientists have *not* discovered. This difficulty is compounded by the fact that certain regions of Earth and certain groups of organisms have been much more heavily studied than others, so we know that the diversity of organisms currently documented is somewhat biased.

Many studies have attempted to estimate the total biodiversity of Earth and, with varying methodologies, have produced estimates anywhere from 2 million to over 100 million species. Recently, Dr. Camilo Mora (at the University of Hawaiʻi, Mānoa) and his colleagues reviewed the various estimates and tested a new methodology to narrow down this large range to a plausible estimate. In the study, Mora used a strategy similar to the species-area and rarefaction curves described above; however, the estimate of 'effort' on the x-axis was not the area or number of individuals sampled, but time. Scientists have described approximately 1.2 million species since the mid-1700s. Has the number of new species described every year begun to level off? If so, that might indicate that most species have been discovered. If not, perhaps we can use the trend in new species discovered through time to predict where the graph might begin to level off. Mora's team also considered higher taxonomic levels: we can be relatively certain that scientists have not yet discovered every single species on Earth, but might scientists have already described all the genera, families, orders, classes, or phyla? The results of Mora's analysis are shown in Fig 3. By using the pattern of discovery of new organisms at different taxonomic levels (Fig 3A-F) and the relationship between maximum diversity of each level (Fig 3G), Mora's team arrived at an overall estimate of 8.7 million species on Earth. Since scientists have currently described approximately 1.2 million species, this estimate indicates that approximately 86% of species on Earth have not yet been discovered. At our current rate of discovery, Mora's team estimated that it will take scientists another 1,200 years to identify all species on Earth.
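The 86% figure follows directly from the two totals quoted above, and a short calculation makes the arithmetic explicit. The variable names are illustrative, and the implied yearly rate is only a back-of-envelope consequence of the numbers in the text, not a value reported in it.

```python
described = 1.2e6        # species formally described (figure quoted in the text)
estimated_total = 8.7e6  # total species estimate quoted in the text
years_remaining = 1200   # time to describe the rest, as quoted in the text

undescribed = estimated_total - described
undescribed_fraction = undescribed / estimated_total

# Average rate that the quoted numbers imply (an inference, not a sourced value)
implied_rate_per_year = undescribed / years_remaining

print(f"{undescribed_fraction:.0%} of species undescribed")
print(f"about {implied_rate_per_year:.0f} species per year implied")
```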

**Figure 3:** *Discovery of new animal taxa through time. Gray lines represent accumulation of discovered groups and colored ranges are multimodel agreement; horizontal gray lines are the consensus asymptote of the models. Panel G shows the relationship between the consensus asymptote for each taxonomic level. Figure from Mora et al. 2011.*

## Species-area curve

In ecology, a **species-area curve** is a relationship between the area of a habitat, or of part of a habitat, and the number of species found within that area. Larger areas tend to contain larger numbers of species, and empirically, the relative numbers seem to follow systematic mathematical relationships. [ 1 ] The species-area relationship is usually constructed for a single type of organism, such as all vascular plants or all species of a specific trophic level within a particular site. It is rarely, if ever, constructed for all types of organisms, if only because of the prodigious data requirements. It is related to, but not identical with, the species discovery curve.

Ecologists have proposed a wide range of factors determining the slope and elevation of the species-area relationship. [ 2 ] These factors include the relative balance between immigration and extinction, [ 3 ] rate and magnitude of disturbance on small vs. large areas, [ 3 ] predator-prey dynamics, [ 4 ] and clustering of individuals of the same species as a result of dispersal limitation or habitat heterogeneity. [ 5 ] The species-area relationship has been reputed to follow from the 2nd law of thermodynamics. [ 6 ] In contrast to these "mechanistic" explanations, others assert the need to test whether the pattern is simply the result of a random sampling process. [ 7 ]

Authors have classified the species-area relationship according to the type of habitats being sampled and the census design used. Frank Preston, an early investigator of the theory of the species-area relationship, divided it into two types: samples (a census of a contiguous habitat that grows in census area, also called "mainland" species-area relationships), and isolates (a census of discontiguous habitats, such as islands, also called "island" species-area relationships). [ 1 ] Michael Rosenzweig also notes that species-area relationships for very large areas (those encompassing different biogeographic provinces or continents) behave differently from species-area relationships from islands or smaller contiguous areas. [ 2 ] It has been presumed that "island"-like species-area relationships have higher slopes (in log-log space) than "mainland" relationships, [ 2 ] but a recent meta-analysis of almost 700 species-area relationships found the former had lower slopes than the latter. [ 8 ]

Regardless of census design and habitat type, species-area relationships are often fit with a simple function. Frank Preston advocated the power function based on his investigation of the lognormal species-abundance distribution. [ 1 ] If S is the number of species, A is the habitat area, and z is the slope of the species-area relationship in log-log space, then the power function species-area relationship goes as:

$$S = cA^{z}$$

Here c is a constant which depends on the unit used for area measurement, and equals the number of species that would exist if the habitat area were confined to one square unit. The graph looks like a straight line on log-log axes. In contrast, Henry Gleason championed the semilog model:

$$S = \log(cA^{z}) = \log c + z \log A$$

which looks like a straight line on semilog axes, where area is logged and the number of species is arithmetic. In either case, the species-area relationship is almost always decelerating (has a negative second derivative) when plotted arithmetically. [ 9 ]
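The claim that both forms decelerate on arithmetic axes can be checked numerically. The sketch below assumes the usual power form S = c A^z with 0 < z < 1 and the usual semilog form S = log c + z log A, evaluates each at equally spaced areas, and confirms that every second difference (a discrete stand-in for the second derivative) is negative. The parameter values are hypothetical and chosen only for illustration.

```python
import math

def power_model(area, c=10.0, z=0.25):
    # Power form: S = c * A**z (hypothetical parameters)
    return c * area ** z

def semilog_model(area, c=10.0, z=5.0):
    # Semilog form: S = log(c) + z * log(A) (hypothetical parameters)
    return math.log10(c) + z * math.log10(area)

def second_differences(values):
    # Discrete analogue of the second derivative for equally spaced points
    return [values[i - 1] - 2 * values[i] + values[i + 1]
            for i in range(1, len(values) - 1)]

areas = [float(a) for a in range(1, 51)]  # equal steps on an arithmetic axis
decelerating = {}
for name, model in (("power", power_model), ("semilog", semilog_model)):
    richness = [model(a) for a in areas]
    decelerating[name] = all(d < 0 for d in second_differences(richness))

print(decelerating)
```

For the power form this holds only while z lies between 0 and 1; a slope above 1 would make the curve accelerate instead.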

Species-area relationships are often graphed for islands (or habitats that are otherwise isolated from one another, such as woodlots in an agricultural landscape) of different sizes. [ 3 ] Although larger islands tend to have more species, it is possible that a smaller island will have more than a larger one. In contrast, species-area relationships for contiguous habitats will always rise as area increases, provided that the sample plots are nested within one another.

The species-area relationship for mainland areas (contiguous habitats) will differ according to the census design used to construct it. [ 10 ] A common method is to use quadrats of successively larger size, so that the area enclosed by each one includes the area enclosed by the smaller one (i.e. areas are nested).

In the first part of the 20th century plant ecologists often used the species-area curve to estimate the minimum size of a quadrat necessary to adequately characterize a community. This is done by plotting the curve (usually on arithmetic axes, not log-log or semilog axes), and estimating the area after which using larger quadrats results in the addition of only a few more species. This is called the **minimal area**. A quadrat that encloses the minimal area is called a **relevé**, and using species-area curves in this way is called the relevé method. It was largely developed by the Swiss ecologist Josias Braun-Blanquet. [ 11 ]

Estimation of the minimal area from the curve is necessarily subjective, so some authors prefer to define minimal area as the area enclosing at least 95 percent (or some other large proportion) of the total species found. The problem with this is that the species area curve does not usually approach an asymptote, so it is not obvious what should be taken as the total. [ 11 ] In fact, the number of species always increases with area up to the point where the area of the entire world has been accumulated. [ 12 ]
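The proportion-based definition of minimal area can be sketched as follows. The function name `minimal_area` and the nested-quadrat data are hypothetical, and the "total" is simply the richness of the largest quadrat sampled, which is exactly the arbitrary choice the paragraph above warns about.

```python
def minimal_area(areas, cumulative_species, proportion=0.95):
    """Smallest sampled area whose species count reaches `proportion`
    of the total found in the largest quadrat."""
    total = cumulative_species[-1]
    for area, count in zip(areas, cumulative_species):
        if count >= proportion * total:
            return area
    return areas[-1]

# Hypothetical nested quadrats: area in m^2 and cumulative species count
areas = [1, 2, 4, 8, 16, 32, 64]
species = [5, 9, 14, 18, 20, 21, 21]

result = minimal_area(areas, species)
print(result)  # 16, because 20 species is the first count >= 0.95 * 21
```

Extending the survey to a larger quadrat would raise the total and could shift the answer, so the result is best read as relative to the sampled extent.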

## General Overviews

Species-area relationships were first documented and debated among plant ecologists seeking to characterize and compare plant communities. The subject later gained popularity among animal ecologists with the seminal work of Preston 1962 on species abundance distributions and with Robert MacArthur and Edward O. Wilson’s equilibrium theory of island biogeography (MacArthur and Wilson 1967, cited under Habitat Heterogeneity and Area). An excellent historical review is provided in McGuinness 1984, which connects debates over the form and function of species-area relationships with emerging ecological theory. Connor and McCoy 1979 also reviews the evidence linking species-area relationships to biological and ecological explanations, but the authors focus on the statistical validity of attempts to use the form and parameters of species-area curves to discern ecological causality. Rosenzweig 1995 explores in detail several examples of species-area curves and uses them to discuss the many factors that influence the shape of these curves, while Drakare, et al. 2006 builds on the work of Michael Rosenzweig and others through a meta-analysis of species-area relationships to show that the relationship is influenced by habitat, type of organism, sampling scheme, and spatial scale. Because of the variety of research goals inherent in studies of species-area relationships, sampling and analytical methods, as well as definitions of what constitutes a species-area relationship, often vary among studies. Scheiner 2003 defines six types of species-area curves that differ in the spatial arrangement of samples, whether larger samples are constructed in a spatially explicit fashion from adjacent smaller samples, and whether means or single values are used for a given spatial scale. 
Dengler 2009, however (and references cited therein), considers true species-area relationships to have a narrower definition, because in the author’s view area is a biologically meaningful variable only when it implies that samples are spatially contiguous.

Connor, Edward F., and Earl D. McCoy. 1979. The statistics and biology of the species-area relationship. *American Naturalist* 113.6: 791–833.

Evaluates evidence that species-area relationships are best fit by the power law and are predicted by equilibrium theory. The authors find no unique theoretical basis for any one model or ecological explanation and observe that parameter values may be influenced more by statistical characteristics than by biological drivers.

Dengler, Jürgen. 2009. Which function describes the species-area relationship best? A review and empirical evaluation. *Journal of Biogeography* 36.4: 728–744.

Reviews the literature on functional form and definitions of species-area relationships, distinguishing species-area relationships from species-sampling relationships deduced from species accumulation and rarefaction curves. The author recognizes only nested, spatially explicit, and island curves as true species-area relationships because each point in the curve is internally contiguous.

Drakare, Stina, Jack J. Lennon, and Helmut Hillebrand. 2006. The imprint of the geographical, evolutionary and ecological context on species-area relationships. *Ecology Letters* 9.2: 215–227.

A meta-analysis of 794 species-area relationships from the literature, which synthesizes how the parameter *z* from Arrhenius’s power law (see Species-Area Functions) varies across sampling designs, organisms, body sizes, habitats, and spatial scales.

McGuinness, Keith A. 1984. Equations and explanations in the study of species–area curves. *Biological Reviews* 59.3: 423–440.

A useful review of the history of the study of species-area relationships, highlighting attempts to connect explanations to the functional form of the relationship.

Preston, Frank W. 1962. The canonical distribution of commonness and rarity: Part I. *Ecology* 43.2: 185–215.

Connects species-area relationships to the lognormal distribution of species abundance under assumptions of spatial uniformity and a canonical lognormal distribution of species and individuals. Predicts a *z*-value of 0.262 for the power-law species-area relationship, and documents that many empirical *z*-values are remarkably close to this prediction and that those that are not arise from truncated lognormal distributions.

Rosenzweig, Michael L. 1995. *Species diversity in space and time*. Cambridge, UK: Cambridge Univ. Press.

Summarizes and differentiates the shapes and underlying causes of species-area relationships, including differentiating curves built from small areas within a single biota, from large areas in a single biota, from island archipelagos, and from those built across two or more biogeographic regions. Discusses the use of power law *c*-values (see Species-Area Functions) in comparing richness across areas.

Scheiner, Samuel M. 2003. Six types of species-area curves. *Global Ecology and Biogeography* 12.6: 441–447.

Summarizes what species-area curves are, and discusses the various ways they can be constructed. Has a more inclusive definition than that in Dengler 2009 but recognizes that certain types reflect phenomena similar to species accumulation or rarefaction curves.

## Results

### Model fits and the ‘best’ ISAR model

In 551 cases of the 601 data sets compiled, at least one function provided an adequate fit as determined by the use of the optimization algorithm, the Shapiro normality test and/or the Pearson product–moment correlation coefficient. However, the AIC_{c} could not be calculated for those data sets with fewer than seven islands, so our subsequent analyses were based on 465 data sets, of which 75% have a total land area of < 10,000 km² and 79% span less than four orders of magnitude in area. Each major taxon is well represented, as are continental-shelf and oceanic island systems, while there are relatively few inland data sets (Table 2).

Major taxon | No. of cases | Continental-shelf | Oceanic | Inland |
---|---|---|---|---|
(a) | | | | |
Invertebrates | 177 | 78 | 88 | 11 |
Vertebrates | 178 | 113 | 37 | 28 |
Plants | 108 | 74 | 22 | 12 |
Other | 2 | 1 | 1 | 0 |
Total | 465 | 266 | 148 | 51 |
(b) | | | | |
Invertebrates | 151 | 77 | 64 | 10 |
Vertebrates | 170 | 110 | 37 | 23 |
Plants | 126 | 91 | 23 | 12 |
Other | 2 | 1 | 1 | 0 |
Total | 449 | 279 | 125 | 45 |

In 44 cases (9% of the 465 analysed), the data set was the sum of two or more other data sets, arising either through summing distinct but related groups of islands, or by combining different taxa for a particular set of islands. Although there is a level of interdependency in these cases, sensitivity analyses showed that their inclusion did not affect the results (not shown).

Considering the single ‘best’ model per data set, as judged by the lowest AIC_{c} value, four models accounted for 73% of cases in declining order of performance – the power, linear, Kobayashi and exponential models (Fig. 2a). The *generality* criterion provided relatively small variability of values, i.e. poor discrimination between the 20 models evaluated, with proportions between 0.467 and 0.839 (mean value of 0.725 ± SD 0.122) of adequate fits among the 465 data sets, with half of the models having virtually identical success rates (Fig. 2b, Table 3). However, according to the *efficiency* criterion, which is more discriminatory, four models account for more than 50% of the overall probabilities of being the best at fitting ISARs; in declining order, they were the power, linear, Kobayashi and exponential models (Fig. 2c, Table 3).
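For readers unfamiliar with the quantities behind these comparisons, the sketch below shows one common way to compute AIC_{c} values and Akaike weights from least-squares fits. It uses the standard textbook formulas rather than the authors' own code, and the residual sums of squares, parameter counts and model labels are hypothetical. Here `k` is assumed to count every estimated parameter, including the error variance.

```python
import math

def aicc_least_squares(rss, n, k):
    """Small-sample corrected AIC for a least-squares fit.
    rss: residual sum of squares, n: number of islands,
    k: number of estimated parameters (assumed to include the error variance)."""
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)

def akaike_weights(aicc_values):
    """Convert AICc values into weights that sum to 1 across the model set."""
    best = min(aicc_values.values())
    relative = {m: math.exp(-0.5 * (v - best)) for m, v in aicc_values.items()}
    total = sum(relative.values())
    return {m: r / total for m, r in relative.items()}

# Hypothetical fits to one data set of n = 12 islands: model -> (RSS, k)
n = 12
fits = {"power": (40.0, 3), "linear": (55.0, 3), "weibull3": (38.0, 4)}

aicc = {m: aicc_least_squares(rss, n, k) for m, (rss, k) in fits.items()}
weights = akaike_weights(aicc)
best_model = max(weights, key=weights.get)
print(best_model, {m: round(w, 3) for m, w in weights.items()})
```

In this made-up case the three-parameter Weibull has the smallest residual sum of squares but loses to the power model once the penalty for its extra parameter is applied. The correction term also requires n − k − 1 to be positive, which may be related to the minimum of seven islands mentioned above, although the text does not say so explicitly. Averaging a model's weight over the data sets where it fits adequately would give a quantity in the spirit of the *efficiency* criterion.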

Comparison of the performance of the 20 island species–area relationship (ISAR) models across 465 data sets: (a) the proportion of data sets for which each model provided the lowest small-sample corrected Akaike information criterion (AIC_{c}) value, i.e. single-best model; (b) *generality*, i.e. the proportion of the data sets for which each model provided an adequate fit; and (c) *efficiency*, i.e. the average AIC_{c} weight (*w*AIC_{c}) for the cases for which the model in question provided an adequate fit. See Table 1 for details of the models. NB Screening out data sets with < 20 data points results in a pronounced decline in the performance of the linear model, but otherwise the relative performance of the models remains practically the same (see text and Appendix S3).

Model | Generality | Efficiency | Overall value | Rank |
---|---|---|---|---|
Power\* | 0.798 [9] | 0.207 [1] | 2.996 | 1 |
Koba | 0.798 [9] | 0.154 [3] | 2.081 | 2 |
expo | 0.755 [12] | 0.143 [4] | 1.533 | 3 |
linear | 0.628 [14] | 0.170 [2] | 0.956 | 4 |
P2 | 0.839 [1] | 0.057 [8] | 0.723 | 5 |
monod | 0.731 [13] | 0.106 [5] | 0.698 | 6 |
epm2 | 0.815 [7] | 0.050 [9] | 0.405 | 7 |
weibull3 | 0.834 [2] | 0.041 [11] | 0.404 | 8 |
mmf | 0.834 [2] | 0.040 [14] | 0.391 | 9 |
heleg | 0.830 [4] | 0.040 [12] | 0.360 | 10 |
asymp | 0.794 [11] | 0.043 [10] | 0.115 | 11 |
ratio | 0.802 [8] | 0.033 [16] | 0.010 | 12 |
weibull4 | 0.830 [4] | 0.010 [19] | −0.158 | 13 |
betap | 0.830 [4] | 0.009 [20] | −0.173 | 14 |
negexpo | 0.546 [18] | 0.099 [6] | −0.943 | 15 |
P1 | 0.606 [16] | 0.059 [7] | −1.149 | 16 |
power_R | 0.600 [17] | 0.030 [17] | −1.709 | 17 |
chapman | 0.615 [15] | 0.012 [18] | −1.883 | 18 |
gompertz | 0.544 [19] | 0.040 [13] | −1.986 | 19 |
epm1 | 0.467 [20] | 0.037 [15] | −2.671 | 20 |

Overall value: the sum of the standardized values of *generality* and *efficiency*; the sum of the overall values for all the models equals zero. Rank: model ranking based on the overall value index. \*Note that as per Table 1, the results reported herein are for the non-linear implementation of the power model.

The correlation between our *generality* and *efficiency* indices is low and statistically non-significant (Appendix S3.4). Hence, the overall ranking of the models (Table 3), combining standardized values of both *generality* and *efficiency* values (see Materials and Methods), synthesizes two distinctive aspects of model performance. To assess the robustness of our results we also re-ran the evaluation using the uncorrected AIC and the Bayesian information criterion (BIC). The overall rankings of the models based on the AIC_{c} were highly correlated (tau > 0.705, *P* < 0.05) with those obtained using AIC and BIC rankings (Appendix S3.4). Similarly, we found the overall rankings to be robust to the sequential removal of data sets with between seven and 19 islands, although notably the performance of the linear model declines rapidly as data sets with seven, eight and nine islands are eliminated (Appendix S3.4 & Table S12). In each case these sensitivity analyses indicate that the results of the overall model-ranking index are robust to the choice of a model selection criterion and to the inclusion of systems with comparatively small numbers of islands (the decline of the linear model in the rankings notwithstanding). The CAP analyses showed significant effects for some system traits, yet, in combination, system traits explained < 11% variability in both model selection and adequate fits profiles (Appendix S3.3).
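The overall value index can be illustrated with the *generality* and *efficiency* figures from Table 3. The sketch below standardizes each column to z-scores and adds them; using the sample standard deviation is an assumption, since the exact standardization is described in the Materials and Methods rather than here, so the computed values may differ slightly from the published ones even though the ordering logic is the same.

```python
from statistics import mean, stdev

# (generality, efficiency) per model, transcribed from Table 3
table3 = {
    "Power": (0.798, 0.207), "Koba": (0.798, 0.154), "expo": (0.755, 0.143),
    "linear": (0.628, 0.170), "P2": (0.839, 0.057), "monod": (0.731, 0.106),
    "epm2": (0.815, 0.050), "weibull3": (0.834, 0.041), "mmf": (0.834, 0.040),
    "heleg": (0.830, 0.040), "asymp": (0.794, 0.043), "ratio": (0.802, 0.033),
    "weibull4": (0.830, 0.010), "betap": (0.830, 0.009),
    "negexpo": (0.546, 0.099), "P1": (0.606, 0.059), "power_R": (0.600, 0.030),
    "chapman": (0.615, 0.012), "gompertz": (0.544, 0.040),
    "epm1": (0.467, 0.037),
}

def standardize(values):
    # z-scores using the sample standard deviation (an assumption)
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

names = list(table3)
gen_z = standardize([table3[name][0] for name in names])
eff_z = standardize([table3[name][1] for name in names])

# Overall value = standardized generality + standardized efficiency
overall = {name: g + e for name, g, e in zip(names, gen_z, eff_z)}
ranking = sorted(overall, key=overall.get, reverse=True)

print(ranking[:4])
print(round(sum(overall.values()), 9))
```

Because each column is centred before summing, the overall values necessarily sum to zero, which matches the note under Table 3.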

### Best family of ISAR

The power family [Pow(B)] was ranked first based on the *generality* and *efficiency* criteria and was thus first in the overall ranking. It was followed by the exponential family [Expo(C)], which was also ranked second according to the *efficiency* criterion. The Logis(D) family was ranked third and fourth by the *generality* and *efficiency* criteria, respectively, and was third in the overall ranking (Table 4a).

| | No. of models | Generality | Efficiency | Overall value | Rank |
|---|---|---|---|---|---|
| (a) Family | | | | | |
| Pow(B) | 6 | 0.959 [1] | 0.338 [1] | 2.990 | 1 |
| Expo(C) | 2 | 0.858 [4] | 0.269 [2] | 1.635 | 2 |
| Logis(D) | 3 | 0.890 [3] | 0.162 [4] | 0.939 | 3 |
| Weib(E) | 4 | 0.901 [2] | 0.115 [5] | 0.614 | 4 |
| Asym(F) | 1 | 0.793 [7] | 0.043 [6] | −0.821 | 5 |
| Beta(I) | 1 | 0.830 [5] | 0.009 [9] | −0.842 | 6 |
| Rat(G) | 1 | 0.802 [6] | 0.033 [8] | −0.845 | 7 |
| Lin(A) | 1 | 0.628 [8] | 0.170 [3] | −0.955 | 8 |
| Gom(H) | 1 | 0.544 [9] | 0.040 [7] | −2.715 | 9 |
| (b) Shape | | | | | |
| Convex | – | 0.989 [1] | 0.792 [1] | 2.170 | 1 |
| Sigmoid | – | 0.826 [2] | 0.114 [3] | −0.719 | 2 |
| Linear | – | 0.630 [3] | 0.191 [2] | −1.452 | 3 |
| (c) Asymptote | | | | | |
| Non-asymptotic | – | 1.000 [1] | 0.825 [1] | 1.414 | 1 |
| Asymptotic | – | 0.804 [2] | 0.218 [2] | −1.414 | 2 |

If the number of models included in each family is taken into account then the overall final ranking is significantly and highly correlated with that shown in Table 4a (Appendix S3.5). Additionally, the results of further analyses using only the overall most precise model in each family are consistent with Table 4a, with the families Pow(B), Expo(C) and Logis(D) always being the top three families. The CAP analyses for families showed significant effects for some system traits, yet in combination they explained < 10% variability in both model selection and adequate fits profiles (Appendix S3.3).

### Best shape of ISAR

Based on the algorithm used to detect linearity, convexity and inflection point(s), convex models had the highest *generality* and *efficiency* values (Table 4b). The results remain identical when a higher threshold value (0.01 instead of 0.001) was used to detect linearity, and almost identical when basing the assessment on the shape of the single best model for each data set (Appendix S3.1). Were we to follow instead the general shape assignment of Tjørve (2009), as presented in Table 1 herein, the results would remain largely similar, with convex models having the highest *generality* and *efficiency* values (Appendix S3.1). Although the sigmoid shape appears almost as often as the convex shape, its *efficiency* values are generally much lower (as Table 4b), while often the estimated inflexion point occurs outside the range of observed areas, and thus the *fitted* shape is convex in form and not sigmoid.

The CAP analysis of the *w*AIC_{c} values and the adequate fit profiles for model shape showed significant effects for some system trait variables, yet in combination the system traits explained < 13% variation in both analyses (Appendix S3.3). While the amount of variance explained in the CAP analyses is low, there are significant differences in the range of island area (i.e. the mean values of *Area*_{SCALE}) encompassed by each system between the three shape forms, with data sets of linear form having the lowest, and data sets of sigmoid form the largest, range in island areas (Fig. 3).

The distribution of the values of *Area*_{SCALE} [i.e. log(*Area*_{MAX}/*Area*_{MIN})] for the three ISAR shape categories, using the shape that summed the highest AIC_{c} weight (*w*AIC_{c}) for each data set. There are significant progressive increases of the mean *Area*_{SCALE} value from linear to convex and finally to sigmoid shape (2.185, 2.997 and 4.065, respectively; Kruskal–Wallis rank sum statistic, *n* = 465: 32.900, *P* < 0.001). Note that the values of *Area*_{SCALE} within which a sigmoid shape totalled the highest *w*AIC_{c} (18 cases) ranged from 2.243 to 6.153, with a mean value of 4.065 ± SD 0.21. The results remain the same if instead of the shape that summed the highest *w*AIC_{c} for each data set, the observed shape of the best-fitting model for each data set is considered (see Appendix S3). Furthermore, there is no differentiation of the best shape according to the total area of the island systems considered (see Appendix S3.1), indicating that the pattern is robust regardless of the total area considered, i.e. small or large island groups. Squares represent the mean value, boxes bracket the standard error of the mean (± SE) and whiskers represent 95% confidence intervals of means (± 1.96 SE).
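The *Area*_{SCALE} index is simple to compute for any archipelago. The sketch below assumes a base-10 logarithm, which the text does not state outright but which is consistent with its description of ranges in orders of magnitude; the island areas are hypothetical.

```python
import math

def area_scale(island_areas):
    """log10(Area_MAX / Area_MIN): orders of magnitude spanned by island area.
    The base-10 logarithm is an assumption, not stated in the source."""
    return math.log10(max(island_areas) / min(island_areas))

# Hypothetical archipelago, island areas in km^2
areas_km2 = [0.5, 3.2, 47.0, 610.0, 5000.0]

scale = area_scale(areas_km2)
print(scale)  # 4.0, since 5000 / 0.5 = 10,000
```

A value near 4, as here, would sit close to the mean reported above for data sets where the sigmoid shape carried the highest weight.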

### Asymptotic versus non-asymptotic ISAR form

According to the method we used to detect the presence of an asymptote within the range of the empirical data, the non-asymptotic models had the highest *generality* and *efficiency* values (Table 4c). If the shape of the best model is considered on a case by case basis, then an asymptote is detected in 62 cases (13%), with no asymptote in 403 (87%) cases. No island traits provided significant differentiation of the presence/absence of an asymptote in a logistic regression analysis. Non-asymptotic models remained predominant when classifying shape using the general classification of Tjørve (2009) (see Appendix S3.2).

### The log–log implementation of the power model

The log–log implementation of the power model resulted in a significant ISAR in 449 cases of the 601 original data sets (Table 2a). Of these 449 data sets, 84% are of island groups of < 50,000 km² and 73% have an *Area*_{SCALE} value of < 10,000 km², while the number of islands ranges from four to 213. The ratio in richness values (*S*_{SCALE}) is < 100 in 59% of cases. The *R*² for the 449 significant ISARs ranged from 0.065 to 0.993, with a mean value of 0.640 ± SD 0.204. In the multiple regression minimal adequate model explaining variation in the *R*² values, only *No. of Islands* and *S*_{MAX} were included (*R*² = 0.49, *F* = 70.92, *P* < 0.01), indicating a general tendency for *R*² values to decrease with number of islands and to increase with maximum number of species.

Previous syntheses based on the log–log power model have suggested that ISAR *z*-values typically fall within a range of around 0.2–0.4 (MacArthur & Wilson, 1967; Connor & McCoy, 1979; Rosenzweig, 1995), although Williamson (1988) reported exceptions to this generalization, ranging from 0.05 to 1.132. Our analyses produced a mean of *z* = 0.321 ± SD 0.164, and 51% of *z*-values fell between 0.2 and 0.4, while only 25% of values exceeded 0.4 and the full range was from 0.064 to 1.312. Simple regressions showed that no single explanatory variable had a coefficient of determination as high as 0.10, but the minimal adequate model included *Area*_{SCALE}, *S*_{SCALE}, *Island Type*, *S*_{VAR}, *No. of Islands* and *S*_{MAX} and explained 69% of the overall variation (*F* = 156.1, *P* < 0.01). The values of log*c* ranged from −2.197 (*c*-value: 0.006) to 2.982 (960.157) with a mean of 0.907 ± 0.788. The minimal adequate regression model included *Area*_{MAX}, *Area*_{SCALE}, *S*_{MAX}, *S*_{SCALE}, *No. of Islands*, *S*_{VAR} and *Major Taxon* and explained 84% (*F* = 276.400, *P* < 0.001) of the variation in log*c*.

There is a progressive increase in the mean *z*-value from inland systems to continental-shelf and then to oceanic archipelagos but the difference is only significant between the oceanic islands and the other two categories (Fig. 4a). Log*c* values show a progressive decrease from inland to continental-shelf and then oceanic archipelagos, with each category significantly different from the next (Fig. 4b). The *z*-values progressively increase from vertebrate to invertebrate to plant data sets, but only the difference between vertebrates and plants is significant (Fig. 4c). Furthermore, *z*-values appear to vary in relation to the range of island areas encompassed (Fig. 5). For data sets spanning just two orders of magnitude the mean value of *z* is significantly higher than for data sets spanning more orders of magnitude of island area (Fig. 5). The log*c* values increase progressively from vertebrates to invertebrates and finally to plants, with each category being statistically different (Fig. 4d).

Comparisons of *z* and log*c* values for the main taxonomic groups and island types, for the logarithmic form of the power function. (a) Comparison of *z*-values across the three main island types. The value for oceanic islands is higher than the two other categories (Kruskal–Wallis rank sum statistic, *n* = 449: 16.133, *P* = 0.0003). (b) The log*c* values by contrast show a progressive decrease from inland to continental-shelf and then oceanic archipelagos (Kruskal–Wallis rank sum statistic, *n* = 449: 32.130, *P* < 0.0001), with each category significantly different from the next. (c) The comparison of *z*-values for the main taxonomic groupings shows that plant and invertebrate data sets have higher *z*-values than vertebrates, but only the difference between plants and vertebrates is significant (Kruskal–Wallis rank sum statistic, *n* = 447: 14.104, *P* = 0.0009). (d) The log*c* values increase progressively from vertebrates to invertebrates and finally to plants, with each category being statistically different from the others (Kruskal–Wallis rank sum statistic, *n* = 447: 150.262, *P* < 0.0001). Squares represent the mean value, boxes bracket the standard error of the mean (± SE) and whiskers represent 95% confidence intervals of means (± 1.96 SE).

Comparison of the *z*-values for the main orders of magnitude of *Area*_{SCALE} included in the present study. For data sets spanning just two orders of magnitude the mean value of *z* is 0.438 ± SD 0.216, significantly higher than for all other categories, which exhibit *z*-values close to 0.3 or even lower (Kruskal–Wallis rank sum statistic, *n* = 439: 47.828, *P* < 0.0001). Note that the categories 10⁰–10¹ and 10⁷–10⁸ were not considered due to their small sample size: four and six cases, respectively. However, if category 10⁰–10¹ is merged with 10¹–10² and category 10⁷–10⁸ with 10⁶–10⁷, the results remain identical (see Appendix S3.5). The logarithm of orders of scale magnitude is presented.

## Acknowledgements

I warmly thank Xuejun Dong (since June 6, 2011), Janet Patton and Bob Patton at North Dakota State University for the helpful suggestions for the manuscript revision. I appreciate the critical comments and encouragement of Dr. James Rosindell (since October 12, 2011) at University of Leeds, Dr. Robert May at University of Oxford and anonymous reviewers. The work is supported by the Tsinghua University, Texas A&M University-Kingsville, Green Design and Planning and Institute of Plant Quarantine of Chinese Academy of Inspection and Quarantine (Pest Risk Analysis Specific Project and Young Innovative Team Program).

## Species-Area Relation

**Species-Area Relation** The number of species of a given taxonomic group within a given habitat (often an island) is a function of the area of the habitat. For islands in the West Indies, the formula

$$S(A) = 3A^{0.3}$$

approximates the number *S* of species of amphibians and reptiles on an island in terms of the island area *A* in square miles. This is an example of a *species-area relation.*

**a.** Make a table giving the value of *S* for islands ranging in area from 4000 to 40,000 square miles.

**b.** Explain in practical terms what *S* ( 4000 ) means and calculate that value.

**c.** Use functional notation to express the number of species on an island whose area is 8000 square miles, and then calculate that value.

**d.** Would you expect a graph of *S* to be concave up or concave down?
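As a rough aid for parts a to d, the formula S(A) = 3A^0.3 from the exercise can be tabulated with a few lines of Python. This is only a sketch: the function name `species_count` and the 4000-square-mile step between table rows are choices made here for illustration, not part of the exercise.

```python
def species_count(area_sq_miles: float) -> float:
    """Species-area formula from the exercise: S(A) = 3 * A**0.3."""
    return 3 * area_sq_miles ** 0.3


# Part a: tabulate S for areas from 4000 to 40,000 square miles.
# The step of 4000 square miles is an arbitrary illustrative choice.
areas = range(4000, 40001, 4000)
table = [(area, species_count(area)) for area in areas]

print(f"{'Area (sq mi)':>12}  {'S(A)':>6}")
for area, s in table:
    print(f"{area:>12}  {s:>6.1f}")

# Part d: gain in species for each additional 4000 square miles.
# If these gains shrink as area grows, the graph bends downward.
increments = [table[i + 1][1] - table[i][1] for i in range(len(table) - 1)]
print("Successive gains:", [round(gain, 2) for gain in increments])
```

Parts b and c correspond to the calls `species_count(4000)` and `species_count(8000)`. Comparing the successive gains in the last line gives a numerical hint about the concavity asked for in part d.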

## Species-area relation graph - Biology

Equilibrium Theory of Island Biogeography

Corresponding Readings in Primack, Richard B. *Essentials of Conservation Biology.* **Chapter 7: pages 163-174**

**Species-Area Relationship**

We are now moving from a discussion of genetics, populations, and species to communities and ecosystems. The next few lectures will describe concepts of major importance to conservation in terms of the effects of habitat fragmentation and maintenance of species diversity.

A great deal of conservation research has been done on islands, because they are small, replicated units of area, isolated from other habitat. They are very useful for species, community, and ecosystem studies.

Early observations of biogeography involved the examination of the geography of biodiversity around the globe. This was followed by recognition of the species-area relationship - as area increases, the number of species present (diversity) also increases. This can be represented by one of two graphs, depending on the axes used:

1) a concave, upward slope (# of species vs. area)

2) a straight, upward sloping line (log(# of species) vs. log(area)).

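If we use the second form of the graph, we find that the equation describing the line is

$$\log S = \log c + z \log A$$

where S is the number of species, A is the area, log c is the intercept, and z represents the slope.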
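To make the straight-line form concrete, the slope z and the constant c in log S = log c + z log A can be estimated by ordinary least squares on the log-transformed values. The sketch below uses only the standard library. The function name `fit_power_law` and the island data are invented for illustration and do not come from any real survey.

```python
import math


def fit_power_law(areas, species):
    """Least-squares fit of log10 S = log10 c + z * log10 A; returns (z, c)."""
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(s) for s in species]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of (x, y) divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    z = sxy / sxx
    log_c = mean_y - z * mean_x
    return z, 10 ** log_c


# Hypothetical island areas (km^2) and species counts, for illustration only.
example_areas = [1.0, 10.0, 100.0, 1000.0, 10000.0]
example_species = [5.0, 9.0, 16.0, 28.0, 50.0]

z, c = fit_power_law(example_areas, example_species)
print(f"z = {z:.3f}, c = {c:.2f}")
```

A steeper fitted line (larger z) means species number rises faster with area, which is the quantity compared between mammals and birds in these notes.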

- Climate, e.g. latitudinal gradient factors

- High average r across the community or group of species

- Isolation, e.g. distance from the mainland

- Type of species represented, e.g. mammals vs. birds

Data collected by Harris for mountaintop islands in the Great Basin show that mammals have a higher z (steeper slope on the species-area graph) than birds.

**Equilibrium Theory of Island Biogeography (ETIB)**

The ETIB describes the theoretical relationship between immigration and extinction of species to islands, depending on their size and distance from the mainland or other species source.

Consider the degree of isolation of the area under study:

Isolate (oceanic and continental islands) vs. Sample (e.g. Amazon)

Oceanic islands are usually created by volcanic activity.

Continental islands are formed when the water level rises (e.g. glaciers melt).

How do species access these islands over time?

1) On oceanic islands, the number of species present increases over time until it reaches the level of the nearest mainland (theoretically the source of the species which immigrate to the island).

2) On continental islands, the number of species present decreases over time. Species richness "relaxes" to a new equilibrium depending on the degree of isolation and the size of the island.

According to ETIB, the number of species present on an island is determined by a balance between immigration and extinction. Generally, as the number of species present increases, the immigration rate decreases and the extinction rate increases.

There are two general relationships to remember:

1) Immigration is higher on near islands than on distant islands (in relation to the mainland), hence the equilibrium number of species present will be greater on near islands.

2) Extinction is higher on small islands than on larger islands, hence the equilibrium number of species present will be greater on large islands.

The number of species on near, large islands > The number of species on distant, small islands
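The balance between immigration and extinction can be illustrated with a small numerical sketch. It assumes, purely for illustration, that both rates change linearly with the number of species present. The notes do not specify the shape of the curves, and the function name, pool size, and rate values below are hypothetical.

```python
def equilibrium_richness(pool_size, max_immigration, max_extinction):
    """Species number at which immigration equals extinction.

    Assumes linear rates (an illustrative simplification):
      immigration(S) = max_immigration * (1 - S / pool_size)
      extinction(S)  = max_extinction * S / pool_size
    Setting the two equal gives S* = pool_size * I / (I + E).
    """
    return pool_size * max_immigration / (max_immigration + max_extinction)


# Hypothetical values: a near island receives more immigrants,
# and a small island loses species faster.
POOL = 100
near_large = equilibrium_richness(POOL, max_immigration=10.0, max_extinction=2.0)
far_small = equilibrium_richness(POOL, max_immigration=2.0, max_extinction=10.0)

print(f"near, large island:    {near_large:.1f} species")
print(f"distant, small island: {far_small:.1f} species")
```

Under these assumed rates, raising immigration or lowering extinction both push the equilibrium upward, which matches the two general relationships listed above.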

Work by Simberloff and Wilson on mangrove islands in Florida has validated the ETIB:

They killed all of the organisms on various sizes of mangrove islands and different distances from the "mainland" source of species and measured recolonization rates. They found that near, large islands experienced faster recolonization than distant, small islands.

Much of ETIB, which was founded on the study of true islands, can be extended to islands in fragmented habitat. Island biogeography has become an essential component of conservation biology, particularly in the analysis of preserve design, which will be covered in the next lecture.

**Spotlight on Island Biogeography:** a good, concise summary of Island Biogeography.

## Species Area Relationship 1

The species-area relationship characterizes the relationship between the number of species observed at a site and the area being sampled. This relationship is used widely in ecology and conservation biology for tasks such as estimating the location of biodiversity hotspots to prioritize for conservation.

Unfortunately there is no consensus on the form of the equation that best describes the species-area relationship. This means that any estimate of species richness depends on the choice of model. Most of the models have roughly equivalent statistical support and we are going to be making predictions for regions where there is no data so we can’t determine the best model statistically. Instead we are going to take a consensus approach where we estimate the species richness using all possible models and then use the average prediction as our best estimate.

We are going to deal with 5 models today (*which is already kind of a lot*), but according to some authors there are as many as 20 reasonable models for the species-area relationship, so we’ll want to make our code easily extensible. The five models we will work with are those defined by Dengler and Oldeland (2010).

- Power: $S = b_0 A^{b_1}$
- Power-quadratic: $S = 10^{\,b_0 + b_1 \log(A) + b_2 \log(A)^2}$
- Logarithmic: $S = b_0 + b_1 \log(A)$
- Michaelis-Menten: $S = b_0 A / (b_1 + A)$
- Lomolino: $S = b_0 / \left(1 + b_1^{\log(b_2/A)}\right)$

All logarithms are base 10. The parameters for each model are available below, along with the areas at which we wish to predict species richness. Each sublist contains the parameters for one model in the order given above. All models contain b_{0} and b_{1}, but only the Power-quadratic and Lomolino models contain the third parameter b_{2}.

These can be cut and pasted into your code. Alternatively, if you’re looking for a more realistic challenge, you can import the related csv files for the parameters and the areas directly from the web. Extracting the data you need from a standard csv import will be a little challenging, but you’ll learn a lot. You can always solve the main problem first and then go back and solve the import step later, which might well be what an experienced programmer would do in this situation.

Write a script that calculates the richness predicted by each model for each area, and exports the results to a csv file with the first column containing the area for the prediction and the second column containing the mean predicted richness for that area. To make this easily extensible you will want to write a function that defines each of the different species-area models (5 functions total) and then use higher order functions to call those functions. Depending on how you solve the problem you may find zip and Python’s use of asterisks handy.
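One possible shape for such a script is sketched below. It is not a reference solution: the parameter values and areas are placeholders invented here (the real ones are not reproduced in this text), and the function names are illustrative. Each model is a plain function, and `zip` with argument unpacking (`*`) pairs every model with its parameter list.

```python
import csv
import math
import sys


def power(area, b0, b1):
    return b0 * area ** b1


def power_quadratic(area, b0, b1, b2):
    log_a = math.log10(area)
    return 10 ** (b0 + b1 * log_a + b2 * log_a ** 2)


def logarithmic(area, b0, b1):
    return b0 + b1 * math.log10(area)


def michaelis_menten(area, b0, b1):
    return b0 * area / (b1 + area)


def lomolino(area, b0, b1, b2):
    return b0 / (1 + b1 ** math.log10(b2 / area))


# Models in the same order as the parameter sublists.
MODELS = [power, power_quadratic, logarithmic, michaelis_menten, lomolino]

# PLACEHOLDER parameters and areas, invented for illustration only.
# Replace them with the values supplied for the exercise.
PARAMS = [[20.0, 0.2], [1.3, 0.2, 0.0], [20.0, 10.0], [100.0, 50.0],
          [100.0, 2.0, 50.0]]
AREAS = [1.0, 10.0, 100.0]


def mean_richness(area, models=MODELS, params=PARAMS):
    """Average the predictions of all models for one area."""
    predictions = [model(area, *p) for model, p in zip(models, params)]
    return sum(predictions) / len(predictions)


def write_predictions(handle, areas=AREAS):
    """Write rows of (area, mean predicted richness) as csv to an open file."""
    writer = csv.writer(handle)
    writer.writerows([area, mean_richness(area)] for area in areas)


# Demonstration: print the csv rows to standard output.
write_predictions(sys.stdout)
```

To export to a file instead, pass a handle opened with `open("predictions.csv", "w", newline="")` (the file name is arbitrary). Adding a sixth model then only requires writing one more function and appending it, with its parameters, to the two lists.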