Scientific papers can be difficult to understand in general, especially for those who don't read them on a regular basis or are unfamiliar with the domain. That's partially because English can be ambiguous, and mathematics can be dry. Study authors also tend to assume a familiarity with the subject matter and they may not fully explain key points. Add an emotionally charged topic, like miscarriage, and some hormones and it's easy to see how misinterpretations occur.
The purpose of this article is to help you learn how to approach scientific studies so that you might get the most possible from them.
The first thing you need to know is not to rely on the abstract alone. Abstracts are limited in length. As a result, they often lack the specificity needed to fully understand the results. Any conclusion based on a paper's abstract alone is at best incomplete, and at worst incorrect.
Example: One often misquoted paper is Tong et al. The abstract states: "The risk of miscarriage among the entire cohort was 11 of 696 (1.6%). The risk fell rapidly with advancing gestation; 9.4% at 6 (completed) weeks of gestation, 4.2% at 7 weeks, 1.5% at 8 weeks, 0.5% at 9 weeks and 0.7% at 10 weeks (χ² test for trend P=.001)."
At first glance those two sentences seem to contradict each other. How can the overall risk be 1.6%, yet the risk at 6 weeks be 9.4%? Intuitively, the overall risk should be higher than the risk at any one point. To make sense of these findings we need to turn to the full paper.
Women were included in the study after their first prenatal visit, which took place anywhere from 6 to 11 weeks depending on the participant. Only 32 of the 696 participants had their first prenatal visit in the 6th week of pregnancy. Three of these women miscarried, for an incidence rate of 3/32 or 9.4% in that group. Overall only 11 of the 696 women miscarried, for an incidence rate of 1.6% among all study participants. Neither number is intended to reflect the probability of miscarriage in general. In fact, the 1.6% is just meant to describe this particular group of study participants. It carries no meaning outside the context of this study.
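The arithmetic behind those two percentages can be sketched in a few lines of Python. The counts come from the paragraph above; the function and variable names are only illustrative and are not taken from the paper.

```python
# Counts as quoted above from Tong et al.
enrolled_at_6_weeks = 32        # participants whose first visit fell in week 6
miscarried_in_6_week_group = 3
all_participants = 696
all_miscarriages = 11


def incidence_rate(events, group_size):
    """Return the share of a group that experienced the event, as a percentage."""
    return 100 * events / group_size


week_6_rate = incidence_rate(miscarried_in_6_week_group, enrolled_at_6_weeks)
overall_rate = incidence_rate(all_miscarriages, all_participants)

print(f"6-week group: {week_6_rate:.1f}%")   # 9.4%
print(f"Whole cohort: {overall_rate:.1f}%")  # 1.6%
```

The two figures have different denominators (32 women versus 696 women), which is why one can be larger than the other without any contradiction.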
Abstracts should be thought of as movie trailers. They're teasers. The purpose of an abstract is to convince a potential reader that the paper is worth reading. It will cover key points, but it isn't intended to be a substitute for the paper [1]. Just as you wouldn't judge a movie based on its trailer, you shouldn't draw conclusions from a paper's abstract.
Many papers these days are posted online for free. There are multiple academic search engines, such as Google Scholar or Microsoft Academic, that you can use to find the full text of a paper. In Scholar click "Versions" and scroll through to find a link with either [pdf] or [ps] as the first word of the title. This refers to the file format, and indicates that the paper is available as either a Portable Document Format (pdf) or PostScript (ps) file. If you cannot find such a link, try the rest of the results one by one. Some may contain the full version of the paper embedded in the body of the page. In Academic, if the paper is available, a "source" link will appear.
In some cases the full text of the paper only exists behind a paywall. Your local library may have access. Another approach is to email the author of the paper and ask for a copy. Most authors are more than happy to share their work. Your best bet is the first person listed, also referred to as the lead author. They're more likely to have a copy of the paper on hand, and have the most to gain from sharing it. If you do get a paper in either manner, please respect the copyright and do not post it online.
Reading more than just the abstract doesn't necessarily mean reading the entirety of the paper, or reading the paper in order. You certainly can if you wish, but even most academics don't [2]. If we read every paper possibly related to our work front to back, we'd never get our own work done. (When I was in graduate school my advisor insisted her students read a minimum of 100 papers a day!) Our approach to reading papers is a way of constantly filtering. At each step we're asking ourselves, "is this paper still worth my time?" Since this out-of-order approach tends to be fairly universal, papers are often written assuming they may be read out of order.
The most common strategy of reading the paper out of order is to jump to the most important part of the paper - the Discussion & Results - after reading the abstract. After all, if the results aren't compelling, it doesn't really matter how they were achieved, does it? Next we read the Methods and Methodology sections. It's important to read these sections carefully to ensure we're not making any assumptions about the study. Assumptions about how a study was conducted can influence our interpretation of the results. Finally, if a paper is going to be influential to our work, we'll generally go back and reread the paper in order.
If you're going to do any kind of interpretation of the results (such as trying to assess what they mean to your own risk of miscarriage) or engage in discussion about the results you're going to need to understand the "who" and the "how." This means understanding who the participants are, and how the study was conducted. This information is usually found in the Methods or Methodology section.
What you're looking to understand here is the limits of the research. Different demographics in study participants may keep the results from being generally applicable. This is also the section that usually explains any caveats, limits the study design imposes, and approximations or assumptions the authors made. The goal isn't to diminish the paper - if the paper is peer reviewed you can generally trust the results - but to better understand how it might fit into the field as a whole. Effectively, you're trying to understand the shape of the puzzle piece.
Example: One major way studies about miscarriage differ is in how women are recruited into the study. The earlier in a pregnancy women are recruited into a study, the higher the study's incidence rate of miscarriage will be. Wilcox et al used newspaper advertisements to recruit women who were planning to become pregnant. Daily urine samples were collected from participants to detect pregnancy as early as possible. Twenty-two percent of the pregnancies the researchers detected ended before the pregnancy was clinically recognized by a doctor or conventional pregnancy test, and thus presumably before the woman herself knew she was pregnant. In contrast, Tong et al included participants whose first prenatal appointment occurred between 6 and 11 weeks, and whose viable pregnancies could be verified on ultrasound. Further, Tong et al were specifically studying asymptomatic women, excluding those with a threatened miscarriage.
The peer review process is designed to weed out papers that do not offer substantial new research. That new research can take the form of a new hypothesis, new study demographics, new potential risk factors, etc. Often papers will include a discussion of previous work, and of how their work differs, in the Abstract, Introduction or Related Work section in order to emphasize how their work advances the scientific understanding of a topic. Whenever you see discrepancies between two reported results you should ask yourself: are they measuring the same thing? Often you'll find they're measuring two similar-sounding, yet different things.
If you're still unsure why the results from two different papers differ, a good strategy is to read the "Related Work" or "Background"/"Introduction" section of the more recently published paper. This section is designed to show how the paper fits in with the current body of research. The more similar the paper topics sound, the greater the likelihood that the more recent paper will mention the older paper in one of these sections, along with how the two papers differ.
The most important parts of the paper are the Results, Discussion and Conclusions sections. Depending on where the study was published, it will likely include only two of those sections. You will want to read each of them to get a full understanding of what real-world phenomenon the authors are trying to model or measure. You may be tempted to just say "miscarriage", but it is much more nuanced than that.
One key difference is between the probability of miscarriage at a specific gestational age, P(X = x) [3], and the probability of miscarriage from a specific gestational age, P(X ≥ x). The former measures the probability of an event at a specific point in time only. The latter measures not only the probability of an event occurring at that specific point in time, but also going forward. In a news article or literary magazine the preposition generally doesn't carry much meaning. Here, it makes all the difference.
Example: Returning to Tong et al, the abstract doesn't make it clear which event it's measuring. Most online sources citing the abstract seem to assume it's modelling P(X = x), the probability of miscarriage at a specific point in time. The fact that there's a lower percentage of miscarriage at 9 weeks than at 10 weeks seems to support this assumption. After all, it wouldn't make sense for the probability of a pregnancy ending in miscarriage at or after 9 weeks to be lower than the probability at or after 10 weeks. Each day that a pregnancy progresses is one less day on which a miscarriage can occur. The probability of a pregnancy ending at or after 9 weeks is equal to the probability of a pregnancy ending at 9 weeks, plus the probability of a pregnancy ending at or after 10 weeks. Mathematically that's expressed as P(X ≥ 9) = P(X = 9) + P(X ≥ 10). Logically we would expect P(X ≥ 9) ≥ P(X ≥ 10).
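The identity P(X ≥ 9) = P(X = 9) + P(X ≥ 10) can be checked numerically for a single group followed over time. The weekly values below are made up purely for illustration and are not taken from any study; the names are likewise hypothetical.

```python
# Hypothetical per-week probabilities P(X = x) for ONE cohort followed over time.
# These numbers are invented for illustration only.
p_at_week = {8: 0.020, 9: 0.010, 10: 0.005, 11: 0.002}


def p_at_or_after(week):
    """P(X >= week): the sum of P(X = x) over every tracked week x >= week."""
    return sum(p for x, p in p_at_week.items() if x >= week)


# P(X >= 9) = P(X = 9) + P(X >= 10)
assert abs(p_at_or_after(9) - (p_at_week[9] + p_at_or_after(10))) < 1e-12

# P(X = 9) can never be negative, so P(X >= 9) >= P(X >= 10).
assert p_at_or_after(9) >= p_at_or_after(10)

for week in sorted(p_at_week):
    print(week, round(p_at_or_after(week), 3))
```

Within one cohort the "at or after" probability can only stay level or shrink as the week increases, whatever values are plugged in.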
In actuality they're modelling P(X ≥ x). Then how could P(X ≥ 9) < P(X ≥ 10)? Turning again to the full paper, we see that they're stratifying the participants in the study based on the gestational age of the fetus at the time of first ultrasound. The participants included in the "9 weeks" group are separate from those in the "10 weeks" group. Participants remain in the group to which they are initially assigned. Since the "10 weeks" group is not a subset of the "9 weeks" group, they are in essence two different samples. It is therefore possible for the "10 weeks" sample group to have a higher incidence rate than the "9 weeks" sample group by chance [4].
How much of a difference does the preposition make? The probability of miscarriage occurring at or after 6 weeks is equal to the probability of miscarriage at 6 weeks, plus the probability that miscarriage didn't occur at 6 weeks and instead occurred at 7 weeks, etc. Mathematically, P(X ≥ 6) = P(X = 6) + P(X ≠ 6) × P(X = 7) + P(X ≠ 6) × P(X ≠ 7) × P(X = 8) + .... If we incorrectly assume Tong et al is modelling P(X = x), then we would calculate P(X ≥ 6) as 15.5%. While incorrect, it seems plausible. Since the risk of miscarriage drops rapidly from week to week, the value of P(X ≥ x) is often close enough to P(X = x) that either interpretation of the probability seems plausible, which can lead to one being easily mistaken for the other.
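Here is a minimal sketch of how the 15.5% figure comes out of that formula. It takes the weekly percentages quoted in the abstract, deliberately misreads them as P(X = x), and stops at week 10 because those are the only figures quoted. The function and variable names are hypothetical.

```python
# Weekly percentages quoted in the abstract, deliberately (and incorrectly)
# treated as P(X = x) to reproduce the mistaken calculation.
misread_weekly_risk = {6: 0.094, 7: 0.042, 8: 0.015, 9: 0.005, 10: 0.007}


def cumulative_risk(weekly_risk, start_week):
    """Sum P(X = w) times the chance that no miscarriage occurred in earlier weeks."""
    total = 0.0
    no_miscarriage_yet = 1.0  # running product of P(X != w) for earlier weeks
    for week in sorted(weekly_risk):
        if week < start_week:
            continue
        total += no_miscarriage_yet * weekly_risk[week]
        no_miscarriage_yet *= 1 - weekly_risk[week]
    return total


mistaken_estimate = cumulative_risk(misread_weekly_risk, 6)
print(f"{100 * mistaken_estimate:.1f}%")  # 15.5%
```

The result sits well above the 9.4% that the paper actually reports for the 6-week group, which shows how much the choice of interpretation changes the number.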
When reading these sections be sure to spend time with each of the figures and tables. Read their captions and re-read where they're discussed in the paper to be certain of what is being modelled.
In my work here at data·yze I've come across a number of authoritative-looking, and sometimes prominent, sites that have gotten the statistics wrong. Often their facts go unattributed, or are attributed to the wrong source. For example, there's a pervasive belief that some 70-75% [5] of pregnancies end in miscarriage. Online I've seen this statistic attributed to Wilcox et al, yet Wilcox et al actually found that just 31% of the pregnancies in their study resulted in miscarriage.
One explanation for the 75% is that the sites are including instances where a fertilized egg did not implant. For example, Chard estimated that 30% of embryos failed to implant, and that 70% of conceptions did not result in a live birth. It might seem overly pedantic to distinguish between conceptions and implanted pregnancies, but conception and pregnancy are no more interchangeable than P(X = x) and P(X ≥ x). If we're not careful in defining our terms, that 70% of conceptions gets translated to 70% of implanted pregnancies, which is then sometimes assumed to mean 70% of pregnancies confirmed by a home pregnancy test. That could lead an already nervous woman with a positive pregnancy test to think her chance of miscarriage is 70% when in reality it is much lower.
When in doubt, turn to the paper. If a fact can be supported by the data, you should be able to find a quote or passage from the paper that states it. Publishing is not just a way of life for researchers, it's a career necessity. How often we publish, and how influential our publications are, affects who wants to work with us and how likely we are to get grant money to continue our work [6]. The best way to get our peers to read our papers is to make the results and data as compelling as possible, which includes making the biggest and strongest claims that are supported by the data. If a claim can be made from the data, it's a safe bet it'll be stated in the paper.
*Disclaimer: My PhD is in computer science. I have written and read my share of papers in computer science, and am using the same principles to read these medical studies, but I have no medical training.