How does the choice of blood draw site influence the possible specificity of a serological test?


The news has reported that a new serological test for the presence of anti-SARS-CoV-2 antibodies has received an emergency use authorization from the FDA, and notably has a higher specificity than other tests (99.8%).

The new test requires blood to be drawn from a vein, rather than a finger prick, and a quote from the test's manufacturer reads, "If you take blood from a finger prick, you will never be able to achieve the same level of specificity that you will achieve… when you take blood from the vein" (non-scientific news outlet source). Although the question is prompted by news related to COVID-19, I'm interested in the general case as well.

It is unclear to me why this would be the case, since antibodies should be present in capillary blood taken from the fingertip. I would suspect one of two reasons, but I don't have enough of a biology background to verify whether they make sense:

1) The volume of blood which can be drawn from a vein is much larger. This would make sense if it improved the sensitivity to a low-concentration antibody of interest, since there may not be enough antibodies in a fingertip sample to bind to antigens in the test. However, it's not clear why having a larger sample would help with specificity.

2) Fingertip blood samples are contaminated with lymph, matter from the skin surface, and so on. I'm not sure how this would affect the specificity of the test. In my naive understanding of serology tests, the reagent would consist of a synthetic antigen of interest, plus either anti-human Ig or a second monoclonal antibody against the antigen, used to detect any bound antibodies from the sample. It's not clear to me how these sorts of contamination could cause false positives for this sort of binding.


This reference, https://academic.oup.com/ajcp/article/144/6/885/1761216, shows that drops of finger-prick blood vary more than venous blood.

The main reason for using venous blood is licensing: a test is licensed only for an exact procedure and set of materials, and if you vary any of those conditions the license is invalid.

However, I would not expect a test for antibodies (present vs. absent) to show much variation except for detecting the lowest levels of antibodies. (Sorry I cannot find a ref for this).


Interpreting Laboratory Tests

The most sensitive method for confirming a diagnosis of varicella is the use of polymerase chain reaction (PCR) to detect VZV in skin lesions (vesicles, scabs, maculopapular lesions). Vesicular lesions or scabs, if present, are the best for sampling. Adequate collection of specimens from maculopapular lesions in vaccinated people can be challenging. However, one study (Evaluation of Laboratory Methods for Diagnosis of Varicella) comparing a variety of specimens from the same patients vaccinated with one dose suggests that maculopapular lesions collected with proper technique can be a highly reliable specimen type for detecting VZV. Other sources such as nasopharyngeal secretions, saliva, blood, urine, bronchial washings and cerebrospinal fluid are less likely to provide an adequate sample and can often lead to false-negative results.

Other viral isolation techniques for confirming varicella are direct fluorescent antibody assay (DFA) and viral culture. However, these techniques are generally not recommended because they are less sensitive than PCR and, in the case of viral culture, will take longer to generate results.

IgM serologic testing is considerably less sensitive than PCR testing of skin lesions. IgM serology can provide evidence for a recent active VZV infection, but cannot discriminate between a primary infection and reinfection or reactivation from latency since specific IgM antibodies are transiently produced on each exposure to VZV. IgM tests are also inherently prone to poor specificity.

Paired acute and convalescent sera showing a four-fold rise in IgG antibodies have excellent specificity for varicella, but are not as sensitive as PCR of skin lesions for diagnosing varicella. People with a prior history of vaccination or disease may have very high baseline titers and may not achieve a four-fold increase in the convalescent serum. The usefulness of this method for diagnosing varicella is further limited in that it requires two office visits. A single positive IgG ELISA result cannot be used to confirm a varicella case.
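As a back-of-the-envelope illustration of the four-fold criterion (a minimal sketch; the titer values below are hypothetical, with titers expressed as reciprocal dilutions):

```python
def fourfold_rise(acute_titer: int, convalescent_titer: int) -> bool:
    """Check paired sera for a >= 4-fold rise in IgG titer.

    Titers are reciprocal dilutions, e.g. 8 stands for a 1:8 titer.
    """
    return convalescent_titer >= 4 * acute_titer

# Hypothetical pair: acute 1:8, convalescent 1:32 -> exactly a four-fold rise.
print(fourfold_rise(8, 32))     # True: consistent with recent infection
# A previously vaccinated patient with a high baseline titer may rise only
# two-fold and be missed, as the text notes.
print(fourfold_rise(256, 512))  # False, despite a genuine antibody response
```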


What Are False Positives and False Negatives?

While many of today's medical tests are accurate, false negatives and positives do occur. What causes these erroneous results?

A false negative is a test result that indicates a person does not have a disease or condition when the person actually does have it, according to the National Institutes of Health (NIH). False negative test results can occur in many different medical tests, from tests for pregnancy, tuberculosis or Lyme disease to tests for the presence of drugs or alcohol in the body.

Correspondingly, a false-positive test result indicates that a person has a specific disease or condition when the person actually does not have it. An example of a false positive is when a particular test designed to detect melanoma, a type of skin cancer, tests positive for the disease even though the person does not have cancer.

Double-checking

Because tests differ, the reason behind an inaccurate result and the rate at which they happen depend on the test and on the follow-up protocol used to double-check test results.

An example of how testing protocols are designed to catch false readings and double-check test results can be seen in HIV testing. HIV testing is done using two different types of tests: screening and confirmatory, according to the Centers for Disease Control and Prevention (CDC).

The first test is a screening test called the enzyme-linked immunosorbent assay (ELISA), which determines a person's status based on the presence of HIV antibodies in their blood. If the initial ELISA test is positive, the lab usually repeats the test using the same sample, according to the CDC.

If both ELISA test results are positive, a confirmatory test (using different laboratory techniques, such as a western blot or an immunofluorescence assay) is conducted. Both initial and confirmatory tests must have reactive, or positive, results in order for a person to be given a positive result.
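A rough sketch of why this layered protocol suppresses false positives: if the tests' errors were statistically independent (an idealizing assumption; errors of a repeat ELISA on the same sample are in practice correlated), the per-stage false-positive rates would multiply. All numbers below are hypothetical.

```python
# Hypothetical per-test specificities; real assays differ.
spec_elisa = 0.995     # screening ELISA
spec_confirm = 0.999   # confirmatory test (e.g. western blot)

# A final positive requires two positive ELISAs plus a positive confirmatory
# test. Under the idealized independence assumption, false-positive rates
# multiply across the stages.
fp_single = 1 - spec_elisa
fp_combined = fp_single ** 2 * (1 - spec_confirm)
print(f"single screening test: {fp_single:.4%} false positives")
print(f"full protocol:         {fp_combined:.6%} false positives")
```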

What causes false positives

Depending on what a person is being tested for, false positives can occur for several reasons. For example, with tests used to diagnose syphilis (such as the Rapid Plasma Reagin or VDRL antigen tests), common causes of false positives include acute viral and bacterial illness, pregnancy and drug addiction, according to the State of Alaska Health Care Services.

Some vaccinations (such as flu shots) can occasionally cause a person to test positive for the flu when they do not actually have it, but when the test is repeated, the result is negative, according to the CDC.

For example, a study conducted at a Denver emergency department and published in the July 21 issue of the Journal of the American Medical Association (JAMA) showed that 41.7 percent of HIV-negative people who had participated in clinical trials of HIV vaccines tested positive on routine HIV tests, even though they were not actually infected. Those rates differed depending on the type of vaccine administered, ranging from 6.3 percent to 86.7 percent.



Morphology

To identify S. pyogenes in clinical samples, blood agar plates are screened for the presence of β-hemolytic colonies. The typical appearance of S. pyogenes colonies after 24 hours of incubation at 35–37°C is dome-shaped with a smooth or moist surface and clear margins. They display a white-greyish color, have a diameter of >0.5 mm and are surrounded by a zone of β-hemolysis that is often two to four times as large as the colony diameter. Microscopically, S. pyogenes appears as Gram-positive cocci arranged in chains (Figure 1).

Figure 1:

Typical appearance of S. pyogenes on sheep-blood agar plates, following 24 hour incubation under aerobic conditions.


Indirect Serological Tests

The detection of antibodies to HSV allows for diagnosis when other virological methods cannot be performed or yield negative results (39). It is particularly useful in identifying the asymptomatic carrier of infection because, as discussed above, the majority of transmission occurs while the person is asymptomatic. Thus far, the use of these tests has largely been confined to seroepidemiological studies and case management for HSV, while specific clinical uses for serological testing remain a much-debated topic. Table 1 outlines some of the current and proposed uses of serological tests for HSV. Table 2 shows the interpretation of serological testing for herpes (2).

TABLE 1

Potential uses of herpes simplex virus (HSV) type-specific antibody assays

Seroepidemiological studies:
- Seroprevalence studies
- Seroincidence studies
- Sexual transmission studies

Current and potential clinical uses:
- Patients with apparent first-episode and recurrent genital herpes, especially pregnant women
- Clinically discordant couples, particularly where the man is positive and the woman is negative and of child-bearing potential
- Women of child-bearing potential with a history of lesions suspicious for genital herpes where repeated direct testing for HSV has been negative
- Sexually transmitted infection screening, especially of those at risk of acquiring HIV infection
- Diagnosis of genital herpes when lesions tested using direct tests are negative on at least two occasions
- Screening of all HIV-infected individuals at the time of initial diagnosis with HIV, with a view to providing suppressive HSV antiviral therapy in those found to be HSV-2 antibody-positive

TABLE 2

Clinical, virological and serological classification of infection with genital herpes simplex virus (HSV)

Clinical designation | Type of virus isolated | HSV antibodies: acute phase serum | HSV antibodies: convalescent phase serum | Classification of infection
First episode | HSV-2 | None | HSV-2 | Primary HSV-2
First episode | HSV-1 | None | HSV-1 | Primary HSV-1
First episode | HSV-2 | HSV-1 | HSV-1 and HSV-2 | Nonprimary HSV-2
First episode | HSV-1 | HSV-2 | HSV-1 and HSV-2 | Nonprimary HSV-1
First episode | HSV-2 | HSV-2 with or without HSV-1 | HSV-2 with or without HSV-1 | First symptoms of prior HSV-2 infection (recurrent HSV-2)
Recurrent | HSV-2 | HSV-2 with or without HSV-1 | HSV-2 with or without HSV-1 | Recurrent HSV-2
Recurrent | HSV-1 | HSV-1 with or without HSV-2 | HSV-1 with or without HSV-2 | Recurrent HSV-1

Reproduced with permission from reference 2

Although a number of tests can identify HSV antibodies, few available tests are able to differentiate between HSV-1 and HSV-2 (40). Serological assays that are not type-specific have limited clinical utility. In addition, no serological test is able to differentiate between oral and genital infection with HSV. Although there is a very close serological relationship between HSV-1 and HSV-2, they each encode a serologically distinct glycoprotein G (gG-1 and gG-2). This difference has been exploited in developing type-specific serological tests. A recent review describes the new HSV type-specific antibody tests (41). Finally, it appears that seroreversion or waning of immune response to gG-2 occurs with time, raising concerns about the long-term reliability of these tests (41).

Western blot

Western blot (WB) is the gold standard for the detection of antibodies to HSV (41). These tests have a high sensitivity and the ability to discriminate between HSV-1 and HSV-2 antibodies. Sera are reacted against separated, fixed protein arrays ('blots') from either HSV-1 or HSV-2 infected cell lysates. The patterns of antibody binding bands are highly predictive of infection with either HSV-1 or HSV-2. This test is expensive, time consuming and requires skilled interpretation. When initial results are indeterminate or atypical, adsorption of sera with type-specific antigen and reblotting can sometimes 'clean up' the blot and improve interpretation. The WB for HSV is not currently commercially available.

Commercial gG-based type-specific tests

Although most of the available literature evaluating the performance of type-specific tests was based on kits developed by Gull Laboratories (USA), these tests have now been withdrawn from the market.

Presently, two companies produce four kits for the diagnosis of HSV type-specific antibodies. Focus Technologies (USA), formerly MRL Diagnostics, has three tests: HSV-1 and HSV-2 enzyme-linked immunosorbent assays, and an immunoblot test for both HSV-1 and HSV-2. The dual enzyme immunoassay test (HerpeSelect HSV-1 and HSV-2 enzyme-linked immunosorbent assay) has reported 97% to 100% sensitivity and 98% specificity for HSV-1 and HSV-2 (41). This test also reports a more rapid time to seroconversion as compared with WB, showing a median interval of 25 days from the onset of symptoms to seroconversion as determined by HerpeSelect HSV-1 versus 33 days by WB, and 21 days by HerpeSelect HSV-2 versus 40 days by WB in individuals not previously positive for HSV-1 (42).


Sputum smear microscopy as a test for TB

A test for TB, a sputum smear stained using fluorescent acid fast stain © CDC/R W Smithwick

Smear microscopy of sputum is often the first test to be used in countries with a high rate of TB infection. Sputum is a thick fluid that is produced in the lungs and the airways leading to the lungs. A sample of sputum is usually collected by having the person cough, and several samples are normally collected ("Sputum Culture", WebMD, www.webmd.com/lung/sputum-culture). In 2012 it was suggested that two specimens can be collected on the same day without any loss of accuracy (Davis, J Lucian, "Diagnostic accuracy of same-day microscopy versus standard microscopy for pulmonary tuberculosis: a systematic review and meta-analysis", The Lancet Infectious Diseases, 23 October 2012, www.thelancet.com/; Kirwan, Daniela E, "Same-day diagnosis and treatment of tuberculosis", The Lancet Infectious Diseases, 23 October 2012, www.thelancet.com/).

To do the test, a very thin layer of the sample is placed on a glass slide; this is called a smear. A series of special stains are then applied to the sample, and the stained slide is examined under a microscope for signs of the TB bacteria ("Sputum Gram stain - Overview", University of Maryland Medical Center, www.umm.edu/ency/article/).

Sputum smear microscopy is inexpensive and simple, and people can be trained to do it relatively quickly and easily. In addition, the results are available within hours. The sensitivity, though, is only about 50-60% (Siddiqi, Kamran, "Clinical diagnosis of smear-negative pulmonary tuberculosis in low-income countries: the current evidence", The Lancet Infectious Diseases, Vol 3, May 2003, 288, www.thelancet.com/journals/). In countries with a high prevalence of both pulmonary TB and HIV infection, the detection rate can be even lower, as many people with HIV and TB co-infection have very low levels of TB bacteria in their sputum and are therefore recorded as sputum negative.

In some countries sputum smear microscopy is being phased out, and is being replaced by molecular tests.

Fluorescent microscopy

The use of fluorescent microscopy is a way of making sputum tests more accurate. With a fluorescent microscope the smear is illuminated with a quartz halogen or high pressure mercury vapour lamp, allowing a much larger area of the smear to be seen and resulting in more rapid examination of the specimen.

One disadvantage, though, is that a mercury vapour lamp is expensive and lasts a very short time. Such lamps also take a while to warm up, they consume significant amounts of electricity, and electricity supply problems can significantly shorten their life span. One way of overcoming these problems is the use of light-emitting diodes (LEDs), which switch on extremely quickly, have an extremely long life and do not explode ("TB diagnosis: Improving the yield with fluorescence microscopy", 2007, www.aidsmap.com/TB-diagnosis-Improving-the-yield-with-fluorescence-microscopy/).

In 2011 the World Health Organisation issued a policy statement recommending that conventional fluorescence microscopy be replaced by LED microscopy, and that, in a phased way, LED microscopy should replace conventional Ziehl-Neelsen light microscopy ("Fluorescent light-emitting diode (LED) microscopy for diagnosis of tuberculosis", WHO, 2011, www.who.int/tb/areas-of-work/laboratory/policy_statements/en/).

A man receives a chest X-Ray during the admission process at a hospital in India. © David Rochkind


INTRODUCTION

Since the outbreak of COVID-19 in Wuhan in December 2019, the virus has, as of October 10th 2020, spread globally, with 36 616 555 confirmed cases and 1 063 429 deaths worldwide (World Health Organization Coronavirus disease 2020). Following the release of the viral genome sequences of SARS-CoV-2 in January (Zhang 2020), molecular detection kits for real-time RT-PCR were soon developed and became the gold standard for diagnosing COVID-19 by confirming the presence of SARS-CoV-2 RNA. The tests have high specificities but varying sensitivities, mostly due to sampling difficulties, including the choice of specimen and the timing of peak viral load, which can lead to false-negative results. Before long, however, companies, institutions and research laboratories started flooding the market with serological kits for detection of past (or present) SARS-CoV-2 infection. As of 10th of October 2020, the Foundation for Innovative New Diagnostics lists 342 commercial immunoassays for detecting antibodies (Foundation for Innovative New Diagnostics SARS-CoV-2 diagnostic pipeline 2020), but only 49 have currently been granted an Emergency Use Authorization by the FDA (FDA 2020). The majority of these tests fall within two categories: either a qualitative, rapid immunochromatographic assay (15–20 min), or a slower semi-quantitative enzyme-linked immunoassay (ELISA)/chemiluminescent immunoassay (CLIA) (a few hours). Most commonly they detect IgM, IgG or both, but some detect total antibody or IgA.

Thorough validation is needed to facilitate the potential of serology testing

Serology testing is a powerful way to monitor the progression of the pandemic through seroprevalence studies and as a tool in diagnostics. For accurate diagnosis of COVID-19, serology can be a great supplement to molecular detection. Serology is powerful further into the course of the disease, when the virus has been eliminated or exists in small numbers, as suggested in a number of publications indicating that antibody testing surpasses PCR sensitivity 5–8 days after symptom onset (Guo et al. 2020; Yong et al. 2020; Zhao et al. 2020). However, in order to accurately use serology for diagnostics or for estimates of the spread of infection in society, extensive validation is needed. Many of the available tests are of dubious quality, and their low specificity is of particular concern.

Many manufacturers have not made their test validation available, and there are no standards to employ that would make it possible to compare performance across tests and to make the tests fully quantitative. Immunoassays vary not only in which antibody they measure but also in the antigen used, the source of the antigens, the specimen type and the secondary antibody conjugate, all of which influence test performance (Haselmann et al. 2020; Kontou et al. 2020; Schnurra et al. 2020). The need for test harmonization is highlighted by the increasing number of published studies that compare the head-to-head performance of immunoassays (GeurtsvanKessel et al. 2020; Harritshoej et al. 2020; Jääskeläinen et al. 2020; Lassaunière et al. 2020; Schnurra et al. 2020; Whitman et al. 2020), often showing some discrepancy. Those studies have used pre-pandemic sera, some of which were samples from patients with respiratory virus infections, as it is essential to be able to discriminate between e.g. the 'common cold' coronaviruses and SARS-CoV-2 to avoid false positives.

An additional concern is the potential batch-to-batch variation between tests, which creates a need for repeated validation of each batch used. In Denmark, a study of seroprevalence among blood donors had to be halted when a new batch of the IgM/IgG Antibody to SARS-CoV-2 lateral flow test from Livzon Diagnostics showed markedly lower sensitivity than previous batches (Leverance af antistoftest 2020).

What do sensitivity and specificity tell us? Interpreting an individual test result

Each of the many antibody tests on the market has its own sensitivity and specificity. A highly sensitive test should capture all true positives, whereas a highly specific test should correctly rule out all true negatives. In reality, no test is both 100% sensitive and 100% specific, hence the importance of validating a test before use to establish its characteristics. The results of a population-based serology survey can then be adjusted for the imperfect test quality. One concern related to validation is what kind of samples were used as positive controls: do they reflect the population being surveyed? If not, we might underestimate the seroprevalence. The positive control samples are from PCR-confirmed COVID-19 patients, but they might not represent the full clinical spectrum or all age groups. It is still not known whether children in general have a different pattern of antibody generation compared with adults with COVID-19, and in addition the severity of disease affects the antibody response, so samples from asymptomatic individuals ought to be included in the validation.
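A standard way to perform this adjustment is the Rogan–Gladen estimator; the sketch below uses made-up survey numbers and is only meant to illustrate the formula:

```python
def rogan_gladen(apparent_prev: float, sensitivity: float, specificity: float) -> float:
    """Adjust an observed seropositive fraction for imperfect test quality.

    true_prev = (apparent_prev + specificity - 1) / (sensitivity + specificity - 1)
    The estimate is clipped to [0, 1], since sampling noise can push it outside.
    """
    est = (apparent_prev + specificity - 1) / (sensitivity + specificity - 1)
    return min(max(est, 0.0), 1.0)

# Hypothetical survey: 2.5% of samples react in an assay with 82.6% sensitivity
# and 99.5% specificity.
print(f"{rogan_gladen(0.025, 0.826, 0.995):.2%}")  # ~2.44% adjusted prevalence
```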

A general trend seems to be that rapid tests have lower sensitivity than the semi-quantitative tests (Kontou et al. 2020), thus underestimating the true rate of seroconversion in those tested. An advantage of the rapid tests is their speed and ease of use, which does not require a laboratory. However, they do depend on the operator to interpret whether they are positive or not, typically by the visualization of a red line, which can result in borderline cases.

Despite the wide spread of SARS-CoV-2, most areas around the world still have an overall low seroprevalence, which exacerbates the problem of false positives when deploying antibody tests. Even in a hard-hit country like Spain, findings from perhaps the most extensive population-based sero-epidemiological study to date suggest that only 5% of the population had antibodies against SARS-CoV-2 (Pollán et al. 2020).

But how does seroprevalence affect the interpretation of an individual test result? Let us take an example. In Denmark, a study among 20 640 blood donors showed an adjusted seroprevalence of 1.9% (Erikstrup et al. 2020). The sensitivity of the test was estimated at 82.6% and the specificity at 99.5%. These figures yield a negative predictive value of 99.7% and a positive predictive value of 76.2%. Given a negative result as a blood donor, the probability that the result is right is almost 100%. If your test result is positive, the probability that it is correct is only about three-quarters. Are you, as an individual, much better off knowing your antibody status than before? Probably not. If you live in an area with low seroprevalence and you feel healthy, the chance that you have had COVID-19 was small anyway, whereas a positive result has an almost 25% chance of being false. On top of that, we still do not know whether a positive antibody test is associated with protection from future COVID-19 infection, nor how long the antibodies last, so in fact you should not act any differently than if you had a negative result.
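These predictive values follow directly from Bayes' theorem; the short sketch below reproduces the Danish figures from the quoted prevalence, sensitivity and specificity:

```python
def predictive_values(prevalence: float, sensitivity: float, specificity: float):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    p, sens, spec = prevalence, sensitivity, specificity
    ppv = p * sens / (p * sens + (1 - p) * (1 - spec))
    npv = (1 - p) * spec / ((1 - p) * spec + p * (1 - sens))
    return ppv, npv

# Danish blood-donor study: prevalence 1.9%, sensitivity 82.6%, specificity 99.5%.
ppv, npv = predictive_values(0.019, 0.826, 0.995)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")  # PPV = 76.2%, NPV = 99.7%
```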

The issue of a low positive predictive value grows as seroprevalence falls, and thus underscores the challenge of accurately assessing one's antibody status in areas so far spared from big outbreaks of SARS-CoV-2, despite using a test with a seemingly high specificity. An alternative approach to increase the positive predictive value is to focus testing on individuals with an elevated likelihood of previous exposure to SARS-CoV-2, e.g. a history of COVID-19-like illness, or to employ a second test with different design characteristics (e.g. antibody format or antigen) if the first test was positive (CDC Information for Laboratories about Coronavirus (COVID-19) 2020; Hicks et al. 2020). As previously mentioned, further into the course of the disease serology testing is likely more sensitive than molecular methods, and integration of different testing methods could help ensure correct and timely diagnosis of COVID-19. In certain areas without access to advanced laboratories, rapid antigen testing, although typically less sensitive than RT-PCR, could also be a relevant alternative, e.g. for screening (CDC Information for Laboratories about Coronavirus (COVID-19) 2020).
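To get a feel for how much a second, orthogonal test could raise the positive predictive value, one can reuse the posterior from the first test as the prior for the second. This is a sketch under an idealized independence assumption, and the second test's error rates are made up:

```python
# Posterior probability of true infection after the first positive result
# (the 76.2% PPV from the Danish example above) becomes the new prior.
prior = 0.762
sens2, spec2 = 0.90, 0.98  # hypothetical second test with a different antigen

ppv_after_two = prior * sens2 / (prior * sens2 + (1 - prior) * (1 - spec2))
print(f"PPV after two positives: {ppv_after_two:.1%}")  # ~99.3%
```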

Kinetics of SARS-CoV-2 antibody response

Testing COVID-19 cases with an ELISA or other semi-quantitative tests can potentially reveal something about the kinetics of the antibody response. Despite often being considered a marker of acute infection, IgM does not consistently appear before its IgG counterpart, which hinders its use as a marker of acute or recent infection. A similar trend for IgM was found among studies of SARS-CoV (Meyer, Drosten and Müller 2014), but it could partly be due to differences in testing sensitivities. The median seroconversion time reported in several studies falls between 9 and 14 days post symptom onset (Grzelak et al. 2020; Long et al. 2020; Lou et al. 2020; Qu et al. 2020; Zhao et al. 2020), which emphasizes the importance of timing when testing for antibodies. One important point is the variability in the antibody response, with some patients seroconverting within a few days post symptom onset and others taking weeks to do so; thus testing too early will miss some cases. Testing for total antibodies appears to be more sensitive, and antibodies thus become detectable a little earlier than with IgM or IgG alone (Harritshoej et al. 2020; Lassaunière et al. 2020; Lou et al. 2020; Zhao et al. 2020). IgA-specific tests are rare, but some studies report a potential use of IgA as an early diagnostic marker (Dahlke et al. 2020; Ma et al. 2020).

Two large studies found that IgG antibodies persisted for at least three to four months after symptom onset (Gudbjartsson et al. 2020; Iyer et al. 2020), although other studies have observed a gradual decline within the first couple of months (Long et al. 2020; Perreault et al. 2020; Wang et al. 2020). Quantitative measurement of antibody titers also makes it possible to look for correlations with the severity status of (PCR-confirmed) COVID-19 patients, and a range of studies have found higher titers among severe cases (Liu et al. 2020; Long et al. 2020; Qu et al. 2020; Salazar et al. 2020); however, the causality is still unclear. Is it because of higher viral load? Is it because the virus has successfully invaded and colonized the host? Or is the immune response detrimental?

One overall problem, though, is the lack of proper longitudinal studies, although with time passing since the outbreak of the pandemic more studies are surfacing (Iyer et al. 2020; Perreault et al. 2020; Wang et al. 2020). Most serology studies to date are retrospective or cross-sectional, and those of longitudinal character often have few patients and/or few sequential samples, which limits their use for accurately answering outstanding questions regarding antibody kinetics.

Comparisons of molecular testing followed by antibody testing show that most individuals with symptoms seroconvert and that PCR testing can remain positive for up to a month after symptom recovery (Wajnberg et al. 2020). However, neither the clinical features nor the immune responses of asymptomatic cases have been well described yet. So far, most studies have focused on hospitalized, PCR-confirmed COVID-19 patients and their antibody response, but some people may fail to mount a detectable antibody response altogether. A small study found that asymptomatic cases may have a weaker immune response to the virus and that their antibodies may diminish sooner than those of symptomatic cases, with a reduction in neutralizing antibodies after eight weeks (Long et al. 2020).

It is still unclear what antigen(s) are preferred in antibody assays

An important aspect to discuss is the impact of antigens on serological testing. By far the most common antigens used are the structural nucleocapsid (N) and spike (S) proteins, which are also the most immunogenic. The N protein is the most abundant viral protein; it is small and can readily be expressed in e.g. E. coli. The trimeric spike protein, on the other hand, protrudes from the surface, and its S1 subunit mediates receptor binding through the individually folded receptor binding domain (RBD), which is likely a primary target for neutralizing antibodies (Wrapp et al. 2020). The S protein is heavily glycosylated and is therefore typically expressed in mammalian cells. Many antibody kits make use of only one antigen, which opens up the possibility that some individuals might not have a strong antibody response towards that particular antigen. Other kits use only part of the S protein, e.g. the RBD, again possibly introducing a selection bias. The use of recombinant antigens lowers the biosafety requirements and is more standardized, and cross-reactivity can perhaps be avoided if only specific epitopes on the viral proteins are used.

A few research groups have developed peptide or protein microarrays, which could help establish the level of cross-reactivity between antigens and which antigens elicit the strongest response (Jiang et al. 2020). However, peptide microarrays come with a risk of false negatives if the antibodies recognize only conformational rather than linear epitopes. The N protein, like the S2 subunit of the spike, is more conserved across coronaviruses, which may increase the risk of cross-reactivity. One study found that seroconversion occurred on average two days earlier for assays detecting total Ig or IgG anti-N than for IgG anti-S (Van Elslande et al. 2020); however, another study found that more patients had earlier seropositivity for anti-RBD (To et al. 2020). Studies comparing the use of different antigens point in different directions, with some concluding that N is preferred, others that the S1 subunit or RBD is the more specific and sensitive choice (GeurtsvanKessel et al. 2020; Jiang et al. 2020; Liu et al. 2020; Ma et al. 2020; Schnurra et al. 2020; To et al. 2020).

Correlates of protection—are we any wiser 10 months into the pandemic?

Often when the detection of an antibody response towards SARS-CoV-2 is discussed, it is assumed that reactivity correlates with neutralization and that neutralization equals immunity (or confers some level of protection), which, as the WHO warned in April, is too early to say. Different assays are commonly employed when testing for neutralization. The plaque reduction neutralization test (PRNT) is considered the gold standard; however, like the cytopathic-effect-based microneutralization (MN) assay, it makes use of cultivated live virus that requires a biosafety level 3 laboratory (BSL-3). Instead, many researchers make use of pseudotyped virus neutralization assays, which can be handled in a BSL-2 lab. Pseudotyped virus neutralization assays have been used for many types of viruses; however, few SARS-CoV-2 studies have examined their correlation with other neutralization assays like PRNT or MN (Grzelak et al. 2020). A series of studies have reported a correlation between detected antibodies or antibody titers and neutralizing ability (GeurtsvanKessel et al. 2020; Grzelak et al. 2020; Jääskeläinen et al. 2020; Salazar et al. 2020; To et al. 2020; Wu et al. 2020), but binding is not always predictive of neutralization (Criscuolo et al. 2020; Manenti et al. 2020).

Despite the uncertainty about the role of neutralizing antibodies and the waning of protection, we can probably draw on our knowledge of other viral infections. We can likely expect either that we are immune against reinfection for months or perhaps even a couple of years, or that having encountered the virus before will at least help clear the virus faster the next time around, with possibly fewer symptoms. From the SARS epidemic back in 2003 we know that high antibody levels are maintained for at least 16 months before declining significantly (Liu et al. 2006), and one study found that some patients still had detectable neutralizing antibodies 17 years later (Anderson et al. 2020). The humoral response is not the only level of protection, so studies on cellular immunity are also warranted. Braun et al. found that 83% of COVID-19 patients, as well as 34% of healthy donors, had SARS-CoV-2 spike protein-reactive CD4+ T-cells, albeit at lower frequencies among the healthy donors (Braun et al. 2020). It was speculated that this might confer some protection in people who have had a common cold caused by coronaviruses. Other studies have similarly observed T-cell reactivity against SARS-CoV-2 in unexposed people, but the source and clinical relevance remain unknown (Sette and Crotty 2020). Single-cell transcriptomic analysis has helped shed light on the remarkable heterogeneity of the SARS-CoV-2-reactive CD4+ T-cell response among patients, with subsets of T-cells correlating with disease severity and antibody levels (Meckiff et al. 2020).

Recently, confirmed cases of reinfection have been reported in various countries (Gupta et al. 2020; Tillett et al. 2020; To et al. 2020); however, this is not necessarily a big concern, nor is it unexpected. Waning antibody levels, a poorly developed immune response to SARS-CoV-2 after the first infection or genetic changes in the viral surface antigens could be the explanation. These reinfection cases may be outliers, or reinfection may be more common for other infections than we realize, owing to the lesser scrutiny they receive compared with SARS-CoV-2. It is important to note, though, that a decline in antibody levels a few months after symptom onset is normal and does not rule out long-lasting protection, as protection is also conferred by memory cells.

Literature on COVID-19 is exploding, but warrants a word of caution

On a final note, it is challenging to stay on top of all the literature relating to SARS-CoV-2 serology. The number of publications is exploding, and preprints are being released at an unprecedented speed. As of October 11th 2020, the preprint servers medRxiv and bioRxiv contained 9456 articles related to COVID-19/SARS-CoV-2 (medRxiv COVID-19 2020). On the one hand, such a unified response by the scientific community is remarkable; on the other hand, it is becoming increasingly difficult for proper science to stand out, and some studies and manuscripts are likely rushed.

Additionally, in the eagerness to make preliminary results readily accessible, the outcomes of many of the earliest seroprevalence studies were reported in the press before a scientific (albeit not necessarily peer-reviewed) article was released. That is problematic, since such results might influence public policy and public opinion before the scientific community has had the chance to scrutinize the results and methods. Often these studies were based on convenience sampling of a selected group and/or had a low participation rate, and thus they were not representative of the general population. However, results from large serological studies like the Spanish ENE-COVID are now appearing (Pollán et al. 2020).


Results

The samples obtained from the SLE patients and the healthy subjects were tested for antibody binding to the arrayed antigens and analysed as described above. Table 1 summarizes clinical and demographic data. Table 2 shows the prevalence of high anti-dsDNA antibodies, low serum C3 and low serum C4 in the different groups, all defined as values outside the normal range in each institution at the time of the blood draw. Information regarding usage of immunosuppressant medications, corticosteroids and anti-malarial drugs is provided in Table 3.

                         High anti-dsDNA   Low C3   Low C4
Group 1: T1 and T2 ≤ 10 years
  N                      47                74       73
  Positive               40                22       32
  Negative               45                57       44
Group 2: T1 < 10, T2 > 10
  N                      12                16       16
  Positive               8*                13       19
  Negative               75                69       63
Group 3: T1 and T2 > 10 years
  N                      38                66       66
  Positive               16*               12       15**
  Negative               68                74       76
Group 2 + 3 (combined)
  N                      50                82       82
  Positive               14**              12       16**
  Negative               70                73       73

Values are a percentage unless otherwise stated. N: number of pairs with data available at both time points. Positive: patients who were positive at both time points. Negative: patients who were negative at both time points. All comparisons are vs group 1 using Fisher's exact test; significant comparisons are marked with asterisks.

Group 1: T1 and T2 ≤ 10 years (N = 82)
  Immunosuppressants   26 (32)
  Corticosteroids      52 (63)
  Anti-malarials       21 (26)
Group 2: T1 < 10, T2 > 10 (N = 16)
  Immunosuppressants   4 (25)
  Corticosteroids      4 (25)**
  Anti-malarials       1 (6)
Group 3: T1 and T2 > 10 years (N = 77)
  Immunosuppressants   27 (35)
  Corticosteroids      31 (40)**
  Anti-malarials       24 (31)
Group 2 + 3 (combined) (N = 93)
  Immunosuppressants   31 (33)
  Corticosteroids      35 (38)***
  Anti-malarials       25 (27)

Values are n (%) unless otherwise stated. N: total number of pairs in the group. Values are for patient pairs receiving medication at T1 and T2. Immunosuppressants: CYC, AZA, ciclosporin, tacrolimus, MTX, rituximab. Corticosteroids: prednisone or methylprednisolone. Anti-malarials: HCQ or quinacrine. All comparisons are vs group 1 using Fisher's exact test; significant comparisons are marked with asterisks.

Persistence of the SLE-key signature over time

To determine whether the SLE-key signature of SLE patients varied with the time elapsed since diagnosis, we stratified the cohort of SLE patients into those tested within 3 years of diagnosis (n = 116), those tested between 3 and 10 years after diagnosis (n = 117) and those tested 10 or more years after diagnosis (n = 178), and determined the percentage of subjects Ruled-Out in each group. Figure 1 shows that within 3 years of diagnosis, only 8.6% of the SLE patients were designated as Ruled-Out; this result is similar to that observed with the original SLE-key Rule-Out test validation cohort [19]. The percentage of patients Ruled-Out increased slightly in patients 3–10 years after diagnosis (10.3%), although the increase was not statistically significant (P = 0.91) (Fig. 1).

SLE-key Rule-Out test results over time

Results of the SLE-key Rule-Out test on serum samples from patients obtained at three time points after diagnosis: up to 3 years, from 3 to 10 years and >10 years. The solid line indicates the percentage Ruled-Out. The dashed lines indicate the 95% CI.

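For orientation, a 95% CI like the dashed lines in Fig. 1 can be approximated for a proportion such as 8.6% of the 116 patients tested within 3 years of diagnosis. The sketch below uses the Wilson score interval; the paper does not state which method was used, so this choice is only an assumption:

```python
import math

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2))
    return centre - half, centre + half

# 8.6% Ruled-Out among 116 patients corresponds to 10 of 116.
low, high = wilson_ci(10, 116)
print(f"{low:.1%} to {high:.1%}")  # roughly 4.8% to 15.1%
```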

In contrast to the results of the SLE-key Rule-Out test within 10 years of diagnosis, at ⩾10 years after diagnosis we saw a significant increase, to 30.9% (P = 2.3 × 10⁻⁷), in the number of SLE patients manifesting a Ruled-Out status (Fig. 1). This group also includes subjects with an initially high (i.e. disease not excluded) SLE-key Rule-Out test score.

The change in the frequency of a Ruled-Out designation among SLE patients could not be attributed to time of serum storage [4.04 (3.46), 5.61 (4.05) and 6.07 (3.44) years, respectively, for the three groups]. The ages of patients at diagnosis were also similar [35 (14), 32 (14) and 29 (12) years, respectively] indicating that the decrease in SLE-key score could not be explained by a late onset of disease. Moreover, there were no significant differences in ethnicity among the groups (P = 0.47). A site effect, however, could not be excluded (P = 0.032): for samples that were drawn ⩽10 years after diagnosis, no more than 33% of the samples came from any one of the five clinical sites. However, the samples that were drawn >10 years after diagnosis came from only four of the five sites, and almost 70% of these samples came from two clinical sites.

The SLE-key test is based on an integrated analysis of IgG and IgM antibodies binding to a set of classifier antigens [19]: ssDNA (IgG), U1snRNP (IgG and IgM), histone 3S (IgM), Sm (IgG) and a proprietary synthetic oligonucleotide (IgM). The decrease in the SLE-key Rule-Out score after 10 years could not be attributed to a change in reactivity to any individual antigen among the six classifier antigens (data not shown); this finding highlights the importance of the integrated multiplex signature, which takes into consideration the reactivities against all six antigens to determine Ruled-Out and not-Ruled-Out status.

Disease activity does not account for differences in the SLE-key signature

SLE disease activity at the time of serum sampling could not explain a significant portion of the observed decrease in SLE-key Rule-Out tests over time. During the first 10 years following diagnosis, a time when the SLE-key Rule-Out test identifies >90% of SLE patients as not Ruled-Out ( Fig. 1), the patients exhibited a wide range of SLEDAI scores—between 0 and 19. Likewise, the range of SLEDAI scores was between 1 and 18 in patients who remained not Ruled-Out at ⩾10 years after diagnosis.

Figure 2 shows the percentage of clinically asymptomatic patients (SLEDAI score 0 at the time of serum draw) who were ruled out by the SLE-key test. Despite SLEDAI scores of 0, only ∼10% of the patients with samples obtained within 3 years of diagnosis, or between 3 and 10 years, exhibited positive SLE-key Ruled-Out test designations. Similar to the results shown in Fig. 1, the percentage of subjects with designations of SLE Ruled-Out increased to about 35% after ⩾10 years.

SLE-key Rule-Out test results over time in asymptomatic lupus patients

The results of the SLE-key Rule-Out test over time in serum samples from clinically asymptomatic patients (SLEDAI = 0). The solid line indicates percentage Ruled-Out. The dashed lines indicate the 95% CI.


The poor correlation between the SLE-key signature and the SLEDAI scores suggests that the six autoantibody reactivities measured in the SLE-key test are not directly involved in the pathogenesis of target tissue inflammation/damage. Rather, the SLE-key test signature is more likely to reflect an underlying autoantibody profile that distinguishes the immune systems of SLE subjects from those of healthy individuals.

There were several patients in our study with discrepant SLE-key Rule-Out and SLEDAI scores deserving of further study. We focused our analysis on those patients who had a positive Rule-Out score at any time point but that concurrently had active disease as defined by a SLEDAI > 6. However, we found no significant differences between these patients and the rest of the cohort in the time elapsed since blood draw, age at sampling, years since diagnosis, use of prednisone or immune suppressants, or frequency of serological abnormalities (data not shown).

The SLE-key test score wanes late in disease

We found that at ⩾10 years after diagnosis there was an increase in the frequency with which previously diagnosed SLE patients achieved an SLE-key designation of SLE Ruled-Out (Figs 1 and 2). Figure 3A shows the shift in numerical SLE-key signature scores in the patient subsets categorized according to the time since SLE diagnosis. The median numerical scores of 0.89 [interquartile range (IQR) = 0.51] and 0.83 (IQR = 0.5) at <3 years and 3–10 years of disease, respectively, fell to a median of <0.44 (IQR = 0.78) at ⩾10 years after diagnosis (P = 1.3 × 10⁻⁹). Thus, there is both an increase in the number of subjects developing an SLE-key Ruled-Out designation and a general decrease in the mean SLE-key test scores after ⩾10 years.

SLE-key Rule-Out score distribution in individual lupus patients

(A) SLE-key Rule-Out score distribution of individual samples, grouped by the time after diagnosis, relative to healthy controls (HCs). (B) SLE-key Rule-Out score distribution of SLEDAI = 0 patients, grouped by the time after diagnosis, relative to the HCs.


To dissociate the change in immune profile from potential variations in disease activity, we separately examined patients with low disease activity. Figure 3B shows a waning of SLE-key Rule-Out test scores in asymptomatic subjects manifesting SLEDAI scores of 0; after 10 years, the mean numerical score of asymptomatic SLE patients approached that of healthy individuals.

Consistent with our observation that the immunological profile of lupus patients can change 10 years after the original diagnosis of the disease, we found that patients in group 1 (where both samples in the longitudinal study were obtained within 10 years of diagnosis) manifested a significantly higher prevalence of abnormal anti-dsDNA antibodies and serum C4 complement levels (Table 2). We further analysed a possible relationship between medication use and the SLE-key score. We did not observe an increased incidence of a positive Rule-Out score (an excluded lupus diagnosis) with higher usage of immunosuppression or immunomodulation; indeed, the opposite was observed. In patient pairs in which at least one of the samples was obtained >10 years after the diagnosis (groups 2 and 3), the prevalence of corticosteroid use was significantly decreased (Table 3); these patients apparently could be managed with less corticosteroid.

Taken together, these results indicate that an autoimmune signature characteristic of SLE may evolve in some patients over time to a signature score closer to that observed in healthy individuals.


A related point is that, for ease of exposition, I generally discuss what it is for a causal relationship linking a (single) factor C to an effect E to be stable, specific etc. But my discussion should be understood as applying also to the stability, specificity etc. of relationships linking combinations of causal factors C1, C2 etc. to effects; these too can be more or less stable. In particular, it should be kept in mind that even if the individual relationships between C1 and E and between C2 and E are by themselves relatively unstable, non-specific etc., it is entirely possible for relationships linking different combinations of values of C1 and C2 to E to be much more stable and specific.

Another way of describing the project is in terms of the development of a vocabulary and framework for describing features of causal relationships that are often of biological interest, a framework that (I would claim) is more nuanced and illuminating than more traditional treatments of causation in terms of laws, necessary and sufficient conditions and so on.

Philosophers often focus on causal claims relating types of events. We can represent this with a framework employing variables, by thinking of X and Y as two-valued, with the values in question corresponding to the presence or absence of instances of the event types.

A more precise and detailed characterization of this notion is given in Woodward (2003, p. 98).

For more detailed discussion, see Woodward (2006).

Relatedly, it is no part of my argument that relatively stable gene → gross phenotypical traits relationships are common. Arguably (e.g., Greenspan 2001) they are not, but if so, we still require the notions of stability/instability to express this fact.

This second condition is not redundant: even if each individual link in the chain satisfies M, there may be no overall counterfactual dependence between Xn and X1. See Woodward (2003, pp. 57ff).

As Kendler has pointed out to me, this is essentially the logic behind looking for so-called endophenotypes in psychiatric genetics, when these are construed as common pathway variables that are causally intermediate between genotype and phenotype—see, e.g., Gottesman and Gould (2003). Ideally, relationships between endophenotype and phenotype will be more stable than genotype—phenotype relationships and also perhaps more causally specific in the 1–1 sense described in Sect. 5.

Some macro-level relationships may be highly stable (under, say, some range of changes in features of their components) and may better satisfy other conditions like proportionality described below. Relationships among thermodynamic variables provide examples. Whether stable relationships are to be found at more micro or more macro levels is thus always an empirical question.

With respect to a set of variables like […], the relationship between the second and third variables will be "direct" or "proximal". With respect to an expanded, more fine-grained set of variables, the relationship between firing and death is mediated or distal. But the overall stability of the firing → death relationship does not depend on whether we employ a representation with these intermediate variables.

Suppose one has a network of interacting causal structures or units, with, e.g., C1 causing C2, C2 in turn influencing both C3 and C4, and so on. I have elsewhere (Hausman and Woodward 1999; Woodward 1999, 2003) characterized such a structure as modular to the extent that various of these causal relationships can be changed or disrupted while leaving others intact; that is, a relatively modular structure is one in which, e.g., it is possible to change the causal relationship between C1 and C2 while leaving the causal relationship between C2 and C3 intact. When modularity is so understood, it is one kind or aspect of stability: it involves stability of one causal relationship under changes in other causal relationships (which we can think of as one kind of background condition). Like stability, modularity comes in degrees, and relative modularity is a feature of some sets of causal relationships, not all (as recognized in Woodward 1999). Hausman and Woodward (1999) contains some mistaken assertions to the contrary, appropriately criticized in Mitchell (2009). Notions of modularity figure importantly in recent discussions of genetic regulatory networks and other structures involved in development and in evolutionary change; see, e.g., Davidson (2001). Obviously, it is an empirical question to what extent any particular example of such a structure is modular (see Mitchell 2009 for additional discussion). My claim is simply that modularity (and its absence), like stability more generally, is a feature of causal relationships and their representation that is of considerable biological interest.

That is, there is a change in the condition cited in (3.1) (from scarlet to non-red) which is associated with a change in pecking, so that M judges that (3.1) is true; hence M requires revision if (3.1) is false.

Another way of understanding proportionality is in terms of employing variables that allow for the parsimonious maximization of predictive accuracy. When P fails, there will either be a characterization of the cause such that variation in it could be exploited for predictive purposes but is not so used, or else "superfluous" variation in the cause which does not add to the predictability of the effect.

A point recognized by many writers. Greenspan (2001) writes, “specificity has been the shibboleth of modern biology” (383) and Sarkar (2005) that “specificity was one of the major themes of twentieth century biology” (263).

Waters speaks in this passage of DNA as "the" causally specific actual difference maker for RNA molecules "first synthesized" in eukaryotic cells (i.e., presumably pre-mRNA), but he goes on to note that in eukaryotes different varieties of RNA polymerase and different splicing agents are involved in the synthesis of mature RNA, with different splicing agents also acting as causally specific actual difference makers for this mature RNA. Thus, according to Waters, while DNA is a causally specific actual difference maker for mature RNA in eukaryotes, it is not the only such causally specific agent. As previously emphasized, this will not affect my discussion below, which focuses on what it might mean to say that DNA is causally specific with respect to RNA and not on whether other causes are also present that act in a causally specific way. Also, the DNA that acts as a causally specific actual difference maker is of course activated DNA.

A mapping $F$ from $X$ to $Y$ is a function iff $F(x_1) = y_1$ and $F(x_1) = y_2$ imply $y_1 = y_2$. A function $F$ is 1–1 iff $F(x_1) = F(x_2)$ implies $x_1 = x_2$. $F$ is onto iff for every $y$ in $Y$ there exists an $x$ in $X$ such that $F(x) = y$. This characterization may be compared with the characterizations in Weber (2006) and in Sarkar (2005), which I discovered only after formulating the ideas above. I believe that Sarkar's intent is to capture notions that are very similar to mine, but I have some difficulty in understanding how the mechanics of his definitions work. In particular, his use of "equivalence classes" seems to make his condition on "differential specificity" redundant: satisfaction of this condition is ensured just by the assumption that different elements in the domain of the mapping, $a$ and $a'$, belong to different equivalence classes. In other respects there is close parallelism: Sarkar's condition (ii) that $B$ be "exhausted" is (I assume) just the assumption that $F$ is onto, and the intent of his "reverse differential specificity" condition seems to be captured by the assumption that $F$ is 1–1.

Weber (2006) suggests that “causal specificity is nothing but the obtaining of a Woodward-invariance for two sets of discrete variables”. Weber’s paper is highly illuminating about the role of specificity in Crick’s central dogma, but his characterization of specificity is very different from mine: a functional relationship might be invariant and involve discrete variables but not be 1–1 or onto, might relate only two-valued variables (in violation of the “many different states” requirement in INF), and might violate the one-cause-one-effect condition described below. Weber’s condition seems to me to have more to do with stability than specificity.
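
To separate Weber's condition from INF, consider a mapping between discrete variables that could be perfectly stable under interventions yet is many-one and has a merely two-valued effect variable (the values below are illustrative assumptions of mine):

```python
# A discrete mapping that could be perfectly stable/invariant but is
# many-one, with a two-valued effect variable: it can satisfy an
# invariance condition while failing INF. Values are illustrative.
G = {"c1": "e1", "c2": "e1", "c3": "e2"}     # c1 and c2 collapse onto e1

is_one_one = len(set(G.values())) == len(G)  # 1-1?
exhausts_E = set(G.values()) == {"e1", "e2"} # onto?
print(is_one_one, exhausts_E)                # False True
```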

This way of formulating matters makes it clear that Proportionality and specificity in the sense of INF are related notions. To the extent that, e.g., there are states of E that cannot be reached by realizing states of C, there will be a failure of proportionality.

This one-cause-one-effect notion of specificity is also closely intertwined with the notion of an intervention, as discussed in Woodward (2003). One wants the relationship between an intervention I and the variable C intervened on to be “targeted” or surgical in the sense that I affects C but does not indiscriminately affect other variables—in particular, those that may affect the candidate effect E via a route that does not go through C. A manipulation lacking this feature is not properly regarded as an intervention on C with respect to E. Thus, to use an example from Campbell (2006), derived originally from Locke, pounding an almond into paste is not a good candidate for an intervention on its color because this operation alters so many other properties of the almond. Often, as this example illustrates, the most causally significant variables in a system will be those we can manipulate specifically. Moreover, in many cases, these will be “mechanical” variables like position, density, etc.
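
The contrast between surgical and fat-handed manipulations can be put in the same structural-equation style as the earlier sketch (again a toy of my own; the functional forms are assumptions):

```python
# Toy contrast between a surgical intervention on C and a fat-handed
# manipulation that also disturbs another cause of E. Functional
# forms are illustrative assumptions.

def effect(c, b):
    return c + 10 * b      # E depends on C and, via a separate route, on B

b = 1

# Surgical intervention: set C only. The change in E isolates the
# C -> E relationship.
print(effect(0, b), effect(5, b))        # 10 15

# Fat-handed ("pounding") manipulation: setting C also changes B, so
# the resulting change in E no longer isolates C's contribution.
print(effect(0, b), effect(5, b + 3))    # 10 45
```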

Referring back to Kendler’s discussion, recall that he describes muteness as a “nonspecific consequence” of the hypothetical gene X (which causes mental retardation) in the first of his scenarios. Prima facie, this may seem puzzling. After all, muteness seems, if anything, more specific in the sense of being less abstract and a “narrower” category than mental retardation. The sense in which muteness is, in comparison with mental retardation, a non-specific consequence of X seems to be that muteness is one of many effects of X, in contravention of the one-cause-one-effect ideal of specificity.

Compare Crick’s sequence hypothesis: “the specificity of a piece of nucleic acid is expressed solely by the sequence of its bases, and […] this sequence is a (simple) code for the amino acid sequence of a particular protein” and his association, in his statement of the Central Dogma, of both specificity and “information” with the precise determination of sequence, either of bases in the nucleic acid or of amino acid residues in the protein” (Crick 1958, 152, 153). The ideas of causal specificity and information are obviously closely linked as this example illustrates, biologists tend to think of structures as carrying information when they are involved in causally specific relationships. I regret that I lack the space to explore this connection in more detail.

Here, though, we should keep in mind the caveat in footnote 1: it may be that specific, stable control is achieved through the interaction of a number of different agents which, taken individually, have a much less stable and specific effect on the outcome of interest.

As a precautionary move, let me try to head off some possible misunderstandings of this argument. When the issue is control by a human agent, whether a relationship is useful or not for that agent of course depends on (among other considerations) the agent’s purposes and values. In some cases, potential manipulators may not care that some cause has non-specific effects on many other variables (because they regard those effects as neutral) or may even think of this as making the cause a particularly good target for intervention, as when these various non-specific effects are all regarded as undesirable and the cause provides a handle for affecting all of them. For example, smoking and childhood sexual abuse have many non-specific effects, virtually all of which are bad, and this provides strong reason for trying to intervene to reduce the incidence of both causes. My discussion above is not intended to deny this obvious point. Rather, my claim is simply that causal relationships that are stable, specific, etc. have control-related features that distinguish them from relationships that are unstable, non-specific, etc. Second, and relatedly, I emphasize that my aim has been the modest one of suggesting some reasons why the distinctions between stable and unstable relationships, specific and non-specific relationships, and so on are biologically significant. Obviously nature contains (or at least our representations represent nature as containing) both stable, specific relationships and unstable, non-specific ones. I do not claim that the former are always more “important”, fundamental, valuable, or more worthwhile targets of research than the latter. One can coherently claim that the distinctions I have described are real and have biological significance without endorsing such contentions about importance. Thanks to Ken Kendler for helpful discussion of this point.

I don’t claim that these are the only considerations relevant to the classification of a factor as an enabler.


Conclusions

In conclusion, OF was increased in the majority of dogs with IMHA and in dogs with hyperlipidemia, but not in dogs with microcytosis, lymphoma or an infection. Although more detailed information about the OF was obtained by using the COFT, the COFT and ROFT gave similar results. The ROFT does not require specialized equipment, is rapid and easy to perform, and can be used easily in daily practice. Although the ROFT cannot replace other diagnostic tests, it may be a valuable additional tool to diagnose IMHA. Further studies are needed to explain the reason for the increased OF in dogs with IMHA; the degree of spherocytosis likely contributes, but other factors may be involved. Finally, studies with larger numbers of dogs are needed to establish the ideal test conditions and test performance. Given these limitations, the authors conclude that only anaemic dogs with a positive ROFT, characterized by a clear supernatant that is colourless in the first tube and red in the second tube, are highly likely to have IMHA.