
Scientific papers can be difficult to understand in general, even for those trained to read them. Study authors tend to assume familiarity with the subject matter, and may not fully explain terminology or background knowledge that those in their field typically already know. Additionally, English can be ambiguous, and mathematics can be dry. Add an emotionally charged topic, like miscarriage, and some hormones, and it's easy to see how misinterpretations occur.

The purpose of this article is to help you learn how to approach scientific studies so that you can get the most out of them.

Do Not Rely on the Abstract Alone

The first thing you need to know is not to rely on the abstract alone. Abstracts are limited in length. As a result, they often lack the specificity needed to fully understand the results. Any conclusion based on a paper's abstract alone is at best incomplete, and at worst incorrect.

Example: One often misquoted paper is Tong et al [4]. The abstract states:

"The risk of miscarriage among the entire cohort was 11 of 696 (1.6%). The risk fell rapidly with advancing gestation; 9.4% at 6 (completed) weeks of gestation, 4.2% at 7 weeks, 1.5% at 8 weeks, 0.5% at 9 weeks and 0.7% at 10 weeks (chi(2); test for trend P=.001)."

At first glance those two sentences seem to contradict. How can the overall risk be 1.6%, yet the risk at 6 weeks be 9.4%? Intuitively, the overall risk should be higher than the risk at any one point. To make sense of these findings we need to turn to the full paper.

Women were included in the study after their first prenatal visit, which varied between participants from 6 to 11 weeks of gestation. Only 32 of the 696 participants had their first prenatal visit in the 6th week of pregnancy. Three of those women miscarried, for an incidence rate of 3/32, or 9.4%, in that group. Overall, only 11 of the 696 women miscarried, for an incidence rate of 1.6% among all study participants. Neither number is intended to reflect the probability of miscarriage in general. In fact, the 1.6% is only meant to describe this particular group of study participants. It carries no meaning outside the context of this study.
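As a quick sanity check, the arithmetic behind those two figures can be reproduced directly from the counts quoted above (a minimal sketch; the numbers come from the passage itself, not from any additional data):

```python
# Incidence rates reconstructed from the counts quoted above.
week6_miscarriages, week6_group = 3, 32   # first prenatal visit in week 6
total_miscarriages, cohort = 11, 696      # the entire cohort

print(f"week-6 group:  {week6_miscarriages / week6_group:.1%}")  # 9.4%
print(f"entire cohort: {total_miscarriages / cohort:.1%}")       # 1.6%
```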

Abstracts should be thought of as movie trailers. They're teasers. The purpose of an abstract is to convince a potential reader that the paper is worth reading, just as a movie trailer is designed to convince a potential moviegoer that the movie is worth seeing. It will cover key points, but it isn't intended to be a substitute for the paper¹.

Getting Access to the Paper

Many papers these days are posted online for free. There are multiple academic search engines you can use, such as Google Scholar or Microsoft Academic, to find the full text of a paper. In Scholar, search for the paper you're interested in. When you find it, click the "All Versions" link and scroll through to find a result with either [pdf] or [ps] as the first word of the title. This refers to the file format, and indicates the paper is available as either a Portable Document Format (pdf) or PostScript (ps) file. If you cannot find such a link, try the rest of the results sequentially. Some may contain the full version of the paper embedded in the body of the page. In Academic, if the paper is available, a "source" link will appear under the search result.

In some cases the full text of the paper exists only behind a paywall. Your local library may have access. Another approach is to email the author of the paper and ask for a copy. Most authors are more than happy to share their work. Your best bet is the first person listed, also referred to as the lead author. They're more likely to have a copy of the paper on hand, and have the most to gain from sharing it. If you do get a paper in either manner, please respect the copyright and do not post it online.

Approaching the Paper

Reading more than just the abstract doesn't necessarily mean reading the entire paper, or reading it in order. You certainly can if you wish, but even most academics don't². If we read every paper possibly related to our work front to back, we'd never get our own work done. (When I was in graduate school my advisor insisted her students read a minimum of 100 papers a day!) Our approach to reading papers out of order is designed to support constant filtering. At each step we're asking ourselves, "Is this paper still worth my time?" Since this out-of-order approach is fairly universal, papers are often written assuming they may be read out of order.

The most common out-of-order strategy is to jump to the most important part of the paper - the Discussion & Results - right after reading the abstract. After all, if the results aren't compelling, it doesn't really matter how they were achieved, does it? Next we read the Methods or Methodology section. It's important to read this section carefully to ensure we're not making any assumptions about the study. Assumptions about how a study was conducted can influence our interpretation of the results. Finally, if a paper is going to be influential to our work, we'll generally go back and reread it in order.

A Visual Guide to How Your Knowledge Grows as You Read a Paper Out of Order

[Figure: a tiny dot]

What you know after reading the abstract.

Very little. The abstract is just the teaser to the paper!
[Figure: a giant, shapeless blob of information]

What you know after reading the Results/Discussion/Conclusion sections.

There's a ton of information in these sections to take in, but it might not be clear how it fits together, or how it fits in with other papers. At this point the paper may just feel like a giant blob of data.
[Figure: a puzzle piece - you know a lot, and now there is structure to what you know]

What you know after reading the Methods/Methodology.

At this point you're starting to understand the caveats. You'll learn who was included in the study, and how it was conducted. You'll find out how generally applicable the results you just learned about are. The information in the paper is starting to take shape, kind of like a puzzle piece.
[Figure: the puzzle pieces from multiple papers fit together]

Putting it all together

Once the paper has taken shape, you can see how it fits in with the general body of research. Perhaps the blue paper estimated a higher overall probability of miscarriage than the pink paper, but its participants also had more risk factors. Differences in results generally reflect differences in study methodology or participant groups.

Understanding "The Who" & "The How" of the Paper

If you're going to do any kind of interpretation of the results (such as trying to assess what they mean for your own risk of miscarriage), or engage in discussion about them, you're going to need to understand the "who" and the "how." This means understanding who the participants were, and how the study was conducted. This information is usually found in the Methods or Methodology section.

What you're looking to understand here are the limits of the research. Differences in participant demographics may keep the results from being generally applicable. This is also the section that usually explains any caveats, limitations the study design imposes, and approximations or assumptions the authors made. The goal isn't to diminish the paper - if the paper is peer reviewed you can generally trust the results - but to better understand how it might fit into the field as a whole. Effectively, you're trying to understand the shape of the puzzle piece.

Example: One major way miscarriage studies differ is in how women are recruited into the study. The earlier in a pregnancy women are recruited, the higher the study's incidence rate of miscarriage will be. Wilcox et al [5] used newspaper advertisements to recruit women who were planning to become pregnant. Daily urine samples were collected from participants to detect pregnancy as early as possible. Twenty-two percent of the pregnancies the researchers detected ended before the pregnancy was clinically recognized by a doctor or conventional pregnancy test, and thus presumably before the woman herself knew she was pregnant. In contrast, Tong et al [4] included participants whose first prenatal appointment occurred between 6 and 11 weeks, and whose viable pregnancies could be verified on ultrasound. Further, Tong et al specifically studied asymptomatic women, excluding those with a threatened miscarriage.

Whenever you see discrepancies in reported results between two papers, ask yourself: are they measuring the same thing? The answer is almost universally "no." That's because the peer review process is designed to weed out papers that do not offer substantial new research. That new research can take the form of a new hypothesis, new study demographics, new potential risk factors, etc. If you're unsure why the results from two papers differ, a good strategy is to read the "Related Work" or "Background"/"Introduction" section of the more recently published paper. This section is designed to show how the paper fits in with the current body of research. The more similar the papers appear to be, the greater the likelihood the more recent paper will mention the older one in one of these sections, along with how the two differ. Often you'll find they're measuring two similar-sounding, yet different, events.

Understanding "The What" of the paper

The most important parts of the paper are the Results, Discussion and Conclusion sections. Depending on where the study was published, it will likely include only two of those sections. You'll want to read each of them to get a full understanding of what real-world phenomenon the authors are trying to model or measure. You may be tempted to just say "miscarriage," but it is much more nuanced than that.

One key difference is between the probability of miscarriage at a specific gestational age, P(X = x) (see footnote 3), and the probability of miscarriage from a specific gestational age onward, P(X ≥ x). The former measures the probability of an event at a specific point in time only. The latter measures the probability of an event occurring at that point in time or any point afterward. In a news article or literary magazine the preposition generally doesn't carry much meaning. Here, it makes all the difference.
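To make the distinction concrete, here's a toy example (the weekly probabilities are made up purely for illustration; they are not from any of the papers discussed here):

```python
# Toy distribution: made-up P(X = x) values for weeks 6-10 (not real data).
p_at = {6: 0.10, 7: 0.04, 8: 0.02, 9: 0.01, 10: 0.01}

# P(X >= x) sums the probability at week x and at every week after it.
p_from = {x: sum(p for week, p in p_at.items() if week >= x) for x in p_at}

print(f"P(X = 8)  = {p_at[8]:.0%}")    # 2% -> loss AT week 8 only
print(f"P(X >= 8) = {p_from[8]:.0%}")  # 4% -> loss AT OR AFTER week 8
```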

Example: Returning to Tong et al [4], the abstract doesn't make clear which event it's measuring. Most online sources citing the abstract seem to assume it's modeling P(X = x), the probability of miscarriage at a specific point in time. The fact that the reported percentage at 9 weeks is lower than at 10 weeks seems to support this assumption. After all, it wouldn't make sense for the probability of a pregnancy ending in miscarriage at or after 9 weeks to be lower than the probability at or after 10 weeks. Each day that a pregnancy progresses is one less day on which a miscarriage can occur. The probability of a pregnancy ending at or after 9 weeks is equal to the probability of it ending at 9 weeks, plus the probability of it ending at or after 10 weeks. Mathematically, that's expressed as P(X ≥ 9) = P(X = 9) + P(X ≥ 10). Logically, we would expect P(X ≥ 9) ≥ P(X ≥ 10).

In actuality, they're modeling P(X ≥ x). How, then, could P(X ≥ 9) < P(X ≥ 10)? Turning again to the full paper, we see that the participants are stratified based on the gestational age of the fetus at the time of the first ultrasound, rather than gestational age at the time of the miscarriage. Each group is, in essence, a completely different sample. It is therefore possible for the "10 week" sample group to have a higher incidence rate than the "9 week" sample group purely by chance⁴.
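To see how easily that can happen, here's a minimal simulation sketch. The group sizes and true risks are hypothetical, chosen only for illustration; they are not Tong et al's figures:

```python
import random

random.seed(1)

# Hypothetical true risks and group sizes (illustration only, not Tong et al's data).
# The true risk at 9 weeks is genuinely >= the true risk at 10 weeks.
true_risk  = {9: 0.006, 10: 0.005}
group_size = {9: 150, 10: 150}

trials, inversions = 10_000, 0
for _ in range(trials):
    observed = {
        week: sum(random.random() < risk for _ in range(group_size[week])) / group_size[week]
        for week, risk in true_risk.items()
    }
    # How often does the 10-week group *look* riskier purely by chance?
    if observed[10] > observed[9]:
        inversions += 1

print(f"Observed rates inverted in {inversions / trials:.0%} of simulated studies")
```

With samples this small and events this rare, the inversion happens in a sizable fraction of simulated studies, which is exactly why footnote 4's confidence intervals matter.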

How much of a difference does the preposition make? The probability of miscarriage occurring at or after 6 weeks is equal to the probability of miscarriage at 6 weeks, plus the probability that miscarriage didn't occur at 6 weeks but did at 7 weeks, and so on. Mathematically, P(X ≥ 6) = P(X = 6) + (1 − P(X = 6)) × P(X = 7) + (1 − P(X = 6)) × (1 − P(X = 7)) × P(X = 8) + .... If we incorrectly assume Tong et al [4] is modeling P(X = x), then we would calculate P(X ≥ 6) as 15.5%. While incorrect, it seems plausible. Since the risk of miscarriage drops rapidly from week to week, the value of P(X ≥ x) is often close enough to P(X = x) that either interpretation seems plausible, which can lead to one being easily mistaken for the other.
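Here's that same calculation as a short script, using the weekly figures quoted from the abstract above (a sketch of the arithmetic under the incorrect P(X = x) reading):

```python
# Weekly rates quoted in Tong et al's abstract, (mis)read as P(X = x).
weekly = {6: 0.094, 7: 0.042, 8: 0.015, 9: 0.005, 10: 0.007}

surviving = 1.0   # probability the pregnancy hasn't yet ended in miscarriage
total = 0.0       # running value of P(X >= 6) under this reading
for week in sorted(weekly):
    total += surviving * weekly[week]
    surviving *= 1 - weekly[week]

print(f"P(X >= 6) under the P(X = x) reading: {total:.1%}")  # ~15.5%
```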

When reading these sections, be sure to spend time with each of the figures and tables. Read their captions, and re-read where they're discussed in the paper, to be certain of what is being modeled.

Afterthoughts About Miscarriage Statistics on the Web

In my work here at data·yze I've come across a number of authoritative-looking, and sometimes prominent, sites that have gotten the statistics wrong. Often their facts go unattributed, or are attributed to the wrong source. For example, there's a pervasive belief that some 70-75%⁵ of pregnancies end in miscarriage. Online I've seen this statistic attributed to Wilcox et al [5], yet Wilcox et al actually found just 31% of the pregnancies in their study ended in miscarriage.

One explanation for the 75% is that these sites are including instances where a fertilized egg did not implant. For example, Chard [1] estimated that 30% of embryos fail to implant, and that 70% of conceptions do not result in a live birth. It might seem overly pedantic to distinguish between conceptions and implanted pregnancies, but conception and pregnancy are no more interchangeable than P(X = x) and P(X ≥ x). If we're not careful in defining our terms, that 70% of conceptions gets translated to 70% of implanted pregnancies, which is then sometimes assumed to mean 70% of pregnancies confirmed by a home pregnancy test. That could lead an already nervous woman with a positive pregnancy test to think her chance of miscarriage is 70%, when in reality it is much lower.
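A back-of-the-envelope calculation shows how much the definition matters. Assuming the failed implantations are included in Chard's 70% figure (my assumption, not a claim from the paper), the loss rate among pregnancies that did implant works out quite differently:

```python
# Chard's estimates, as quoted above.
fail_to_implant = 0.30   # fraction of conceptions that never implant
no_live_birth   = 0.70   # fraction of conceptions that don't end in a live birth

# Assumption (mine): the 70% includes the 30% that never implanted.
implanted          = 1 - fail_to_implant               # 70% of conceptions implant
lost_after_implant = no_live_birth - fail_to_implant   # 40% of conceptions

print(f"loss rate among implanted pregnancies: {lost_after_implant / implanted:.0%}")  # ~57%
```

Under that reading, roughly 57% of implanted pregnancies end in loss - already well below 70% - and the risk after a positive home pregnancy test would be lower still.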

When in doubt, turn to the paper. If a fact can be supported by the data, you should be able to find a quote or passage from the paper that states it. Publishing is not just a way of life for researchers, it's a career necessity⁶. The best way to get our peers to read our papers is to make the results and data as compelling as possible, which includes making the biggest and strongest claims the data can support. If a claim can be made from the data, it's a safe bet it'll be stated in the paper.

References and Footnotes

Footnotes

  1. Academics who read many papers a day often develop shortcuts to read them more efficiently. Often the abstract is used to weed out papers that are clearly not relevant to the academic's research focus.
  2. If the academic thinks the paper is worthwhile, he or she will typically read at least the discussion or conclusion and all figures in the paper in addition to the abstract.
  3. P(X = x) is standard statistical notation meaning the probability that a random event, denoted capital X, takes on a specific value, denoted lowercase x. A more typical way to express the probability of miscarriage, X, given that a pregnancy is y weeks along might be P(X | Y = y), which would express miscarriage and pregnancy progression as two separate events, with one conditional on the other. We're using P(X = x) as shorthand for P(X | Y = y) to keep this article approachable to a general audience.
  4. The probabilities associated with 9 weeks and 10 weeks are both within each other's 95% confidence interval. The difference is not statistically significant, and it should not be assumed that one is higher than the other.
  5. There is some support for this 70-75% estimate in the literature. Roberts & Lowe [3] is perhaps the most cited, however their approach is not without criticism. Jarvis [2] is a good literature review that includes a summary of both the criticisms and the supporting arguments for Roberts & Lowe, as well as discussions of similar articles and findings.
  6. How often academics publish, and how influential our publications are, affects who wants to work with us and how likely we are to get grant money to continue our work. It is so important that we joke about our Erdős numbers, and keep track of our citation counts as bragging rights.

References

*Disclaimer: My PhD is in computer science. I have written and read my share of papers in computer science, and am using the same principles to read these medical studies, but I have no medical training.