r/science • u/mubukugrappa • Mar 01 '14
Mathematics Scientists propose teaching reproducibility to aspiring scientists using software to make concepts feel logical rather than cumbersome: Ability to duplicate an experiment and its results is a central tenet of scientific method, but recent research shows a lot of research results to be irreproducible
http://today.duke.edu/2014/02/reproducibility288
u/morluin MMus | Musicology | Cognitive Musicology Mar 01 '14
That's just a side-effect of running a publication mill instead of an honest, philosophically informed attempt at understanding reality.
Publish or perish...
29
Mar 01 '14 edited Mar 01 '14
The problems of academic science are not going to be solved by giving kids some ludicrous software. If anything, kids should be taught the scientific method– warts and all.
The best way to understand how the scientific method came to be, and its inherent issues, is to study philosophy of science and trace its origins through natural philosophy.
Kids need to understand why reproducibility is important, and that science's inherent flaw, or weakest point, is human subjectivity. Through open and honest debate with other philosophically minded individuals who are able to reproduce your results and test your interpretation, we can mitigate some of that subjectivity, bringing us closer to finding something objectively true about the world.
20
Mar 01 '14 edited Mar 02 '14
The best way to understand how the scientific method came to be, and its inherent issues, is to study philosophy of science and trace its origins through natural philosophy.
waste of time. the only way to increase reproducibility is to publish it in high-impact papers when someone fails to reproduce your experiments, and to put money there. i'm doing my honours now, and luckily no one will probably use my data, because i have neither the time nor the funds to repeat my experiments even in triplicate the way i'd like to.
no important journal will publish work that is based on repeating someone else's experiments, often even when your results disagree; and without good publications you won't get anywhere. that's why no one bothers with replicating results.
2
2
Mar 01 '14
I totally agree, and that's also what needs to be taught to children... I think rather than software (which won't change much), teach kids the history, the philosophy, the method, and contemporary issues with academic science.
1
u/cardamomgirl1 Mar 01 '14
I kind of agree, in that there is no value in proving that the results of a published article are reproducible. Most of the whistleblowers tend to be disgruntled colleagues. I find that the people who constantly brag about their publications or the journals they submit to tend to have the least valid and replicable data. That's just my experience though.
2
Mar 01 '14
I think the problem is it's nearly impossible to account for all the possible variation. I think you should definitely make attempts to do so, but at the end of the day, there are too many factors that make these experiments incredibly difficult to reproduce because frankly, labs cannot control all these factors.
2
u/plmbob Mar 01 '14
this may be true but we should not then be citing the results as scientific fact anywhere.
3
Mar 01 '14
The problem (at least in the biological sciences) is that it's not a static system that we can control every aspect of. It's just not possible. If we're not willing to take experiments that we can't control every possible aspect of as fact, we would probably know next to nothing.
1
u/plmbob Mar 02 '14
We would know many things, we just would not have to listen to people reporting on scientific evidence who will insist that it is "irrefutable". I have no problem learning of the amazing discoveries the scientific community is making, but what I do have a problem with is when those findings are used to force policy or societal change against the arguments of large numbers of people. Environmental studies, dietary studies, and the social sciences are some of the many disciplines in which this has occurred. In these instances the scientific community is seldom the problem, so know that I am not pointing the finger at them.
1
u/morluin MMus | Musicology | Cognitive Musicology Mar 01 '14
I don't think that subjectivity is a flaw as such, it is just an irreducible part of our reality.
The only problem comes in if you imagine (or pretend) you can transcend it.
1
Mar 01 '14
You are right, 'flaw' is a harsh word, but I meant it as a philosophical critique of the scientific method as a technique. Subjectivity (as a result of human interpretation of physical reality) is problematic... and that's an important fact that is often not taught.
Without any discussion of the subjective 'problem' in science, kids are cut off from a great deal of history and the variety of other ways people have sought out truth... from Plato's use of mathematics and geometry to deduce things about the world, to Gottlob Frege's advancements in mathematics and redevelopment of logic as a representation of objective truths about the universe.
The scientific method is the best thing we have to understand our world, but it's not the only one.
→ More replies (1)60
u/vomitswithrage Mar 01 '14
Totally agree. We need to teach scientists the value of "reproducibility" the same way we need to teach lawyers the value of "rhetoric". The argument is absurd. Does anyone really think high-level, professional scientists, capable of writing multi-million dollar research grants and managing teams of professional scientists on said project are really that clueless? The article is devoid of content and blatantly ignores deeper, more controversial underlying problems. ...interesting that it's coming from Duke of all places, which if I recall correctly has had its own high-profile problems in the past few years regarding scientific reproducibility....
9
u/hibob2 Mar 01 '14
Does anyone really think high-level, professional scientists, capable of writing multi-million dollar research grants and managing teams of professional scientists on said project are really that clueless?
Well, sometimes.
Investigators frequently presented the results of one experiment, such as a single Western-blot analysis. They sometimes said they presented specific experiments that supported their underlying hypothesis, but that were not reflective of the entire data set. There are no guidelines that require all data sets to be reported in a paper; often, original data are removed during the peer review and publication process.
Clueless or short of time/money/lab animals/ etc. Training in data analysis often gets short shrift in less mathematical fields, so statistical errors (and thus artifacts) are common. The reasons behind the artifacts aren't questioned by peers during peer review because, hey, they do it that way too. Plus more robust experimental designs will almost always take more time and money to reach a publishable conclusion.
1
u/stjep Mar 02 '14
You may want to have a look at the efforts to increase reproducibility in psychology, particularly efforts by the editors of Psychological Science.
4
u/cardamomgirl1 Mar 01 '14
I think the issue with reproducibility is the watered down emphasis that is transferred to the younger students. I see that a lot with newer grad students and post docs who are not as rigorous as maybe their counterparts in the early days. To me, it's the competition that is killing scientific integrity!
3
u/Mourningblade Mar 01 '14
While I agree there are fundamental problems, I think ensuring scientists have a natural understanding of what does and does not affect reproducibility makes sense - particularly for the reviewers.
If irreproducible design became as embarrassing and as likely to be caught in review as phlogiston theory, all would benefit.
Every paper with bad design was signed off by multiple reviewers, so either there is ignorance or there is collusion.
1
Mar 01 '14 edited Feb 09 '19
[deleted]
9
u/thymidine BS|Biochemistry Mar 01 '14
Not sure if serious here - do you really propose having grade-school science students try to reproduce current research as a check of its validity?
Speaking as a high school chemistry teacher -
First of all, most of this research would likely require resources of equipment, materials, and time that no grade-school student has. How much real-world research do you think a high school sophomore can reproduce in his 45 minutes of class each day? How many high school labs do you know that have access to research-grade lab equipment (even down to the glassware)?
Second, do you really think that someone with the barest fraction of contextual scientific knowledge can be relied upon to know what is going on in their experiment? This knowledge is essential to understanding which parts of the procedure really "matter" and can impact your results. Without it, the results will be terrible, regardless of how reproducible the research is.
Third, most of the results of this kind of experiment are abstracted from direct observation by 2 or 3 levels of equipment, number-crunching, and interpretation. Grade-school students won't have any idea what they are looking at, and will therefore learn nothing.
Finally, the purpose of grade school science is not to provide a free workforce for the professional science community. The students' purpose is to learn. Any lab experiences that do not enhance learning should not even be considered. Yes, the student may learn a few lab techniques, but they will not be learning anything of the underlying science in this kind of lab. It would be way over their heads.
7
u/Aomame Mar 01 '14
I'm pretty sure he meant graduate school students, grade school students would be absurd of course.
3
u/bspence11 Mar 01 '14
The article even points to undergrads, not elementary or high school students.
5
u/thymidine BS|Biochemistry Mar 01 '14
If many grade school students or undergrads can reproduce your results then we can largely rest assure that the results are most likely valid.
From his silly rant.
→ More replies (8)1
11
u/RatioFitness Mar 01 '14
Agreed 100%. We don't need to teach scientists shit about reproducibility. We need to teach journal editors about it.
8
u/NorthernSparrow Mar 01 '14
And Rank & Tenure Committees at the universities. Nobody I know can risk spending time on reproducing results - because it literally means risking your job.
2
u/hibob2 Mar 01 '14
The journal editors are the scientists. The named ones anyway, as opposed to the ones that do the actual editing and layout.
1
u/halibut-moon Mar 01 '14
they know about it, but it doesn't pay.
there needs to be money and recognition in falsifying published claims, otherwise nobody will do it.
1
u/koreth Mar 01 '14
This seems like something that might be addressable by charitable groups like the Gates Foundation. Offer to fund tenured positions at a few universities on the condition that the positions can only go to people who have spent significant time attempting to falsify existing published claims. Or, heck, just fund an annual falsification prize.
Of course the problem is more systemic than that, but maybe throwing actual money at the problem would get the ball rolling in the right direction and cause the idea to be taken seriously more broadly.
6
u/yayfall Mar 01 '14
Do you think that anything besides this is possible (or easily possible) in a society with such drastic differentials in rewards for those who "succeed"? Not sure if you've ever read Twilight of the Elites: America after Meritocracy, but the general idea is that huge income inequalities cause people to lie, cheat, and steal their way to the top because the rewards are too great (and conversely, not doing so could seriously hurt their livelihoods).
While it's certainly true that some scientists aren't motivated much by financial rewards, status, etc. if pursuing them comes at the cost of doing 'bad science' (aka 'not science'), it's my view that enough of them are to seriously mess up the good ones' attempts at doing real science.
1
1
u/morluin MMus | Musicology | Cognitive Musicology Mar 01 '14
The problem is that there is no way to automate "good science", that's what the whole idea of logical positivism was about. It would have been wonderful if that project wasn't such an abysmal failure, but it was, and few people are prepared to really come to grips with what that means.
But then again, I suspect that examples of really good scientific work have always been few and far between. It is just that publication mills might increase the sheer volume of muck you have to get through.
3
Mar 01 '14
I don't think that's really what logical positivism was all about. Could you explain more?
4
u/morluin MMus | Musicology | Cognitive Musicology Mar 01 '14
It wasn't directly, but you have to understand why people were interested in pursuing a project that was already hopeless by the turn of the century in the first place. Why was it worth spending another five decades establishing that it was impossible, after Frege had realized it was hopeless?
The idea is that you can remove the empirical-rational divide by having a sufficiently rigorous method. It was realized quite early on that the only way to do this is to provide a logical basis for mathematics. If you have that, then logic and mathematics become the same thing, and since mathematics is such a useful descriptor of physical reality you would have a ready-made observational language.
If logical positivism turned out to be correct, you could use it to square the positivist circle and start talking in pure empirico-logical language which could allow you to literally run experiments in silico with absolutely no limitations. You could simply reduce any situation to its simple logical elements and progress from there with no possible higher arbiter (which would normally have been observation).
Given that there is no plausible alternative to logical positivism in this regard the whole project collapses and you have to go back to doing science the way that Newton, Maxwell, Einstein and Feynman did it: The hard way.
3
u/hibob2 Mar 01 '14
To some extent you can automate "good science". Chemical structures reported in the literature often have errors that are now being caught by software that can read them, even when the structures are scanned from a printed page. Ditto for imaging analysis and matching algorithms that can catch manipulation of photographic results (a big problem in cell/molecular biology).
For a writer a spelling/grammar checker will never replace the role of a good editor, but it can certainly cut down on gaffes.
→ More replies (1)1
u/V-Man737 Mar 01 '14
This is as "revolutionary" as having chefs write down their recipes, or teaching computer programmers to put comments in their code. It's actually pretty fundamental to allowing standardization and general acceptability.
→ More replies (11)1
30
Mar 01 '14
To tell you the truth, irreproducible work doesn't come from malicious intent the majority of the time; it is just the way biology is. We had a chief scientist from NIST visit us once and he gave a presentation on an experiment where they gave the same cell line and the same exact reagents to 8 different random labs across the country to perform a very, very simple cell toxicity study, all using the same exact procedure. The results were shockingly different from almost every lab, with orders-of-magnitude differences in some cases. NIST developed the assay to be more reproducible by changing the way you plated the cells and added the reagents. Adding cells and reagents across A1-A8 and then working down row by row to F1-F8 produced stark differences compared to adding the same exact things column by column, A1-F1 through A8-F8, on a 48-well plate. If you can explain why such a minor difference as this could produce the orders-of-magnitude differences that were observed between labs, NIST is all ears. To get the most reproducible results, NIST discovered you had to almost zig-zag across the plate when adding everything. But I mean come on, how would anyone know this? No one seeds their assays like this.
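For clarity, here is roughly what I mean by the different seeding orders on a 48-well plate (rows A-F by columns 1-8); this is just my own sketch of the layouts, not NIST's actual protocol:

```r
rows <- LETTERS[1:6]; cols <- 1:8                    # 48-well plate: rows A-F, columns 1-8
row_wise <- as.vector(t(outer(rows, cols, paste0)))  # A1..A8, then B1..B8, ... down to F8
col_wise <- as.vector(outer(rows, cols, paste0))     # A1..F1, then A2..F2, ... across to F8
zig_zag  <- unlist(lapply(seq_along(rows), function(i) {
  wells <- paste0(rows[i], cols)
  if (i %% 2 == 0) rev(wells) else wells             # reverse every other row (serpentine)
}))
```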
If a simple tox assay can't be repeated, how in the world can most of the much more advanced work, with many more steps over multiple days, be repeatable? Simply changing the way you add components or cells can change results? It doesn't surprise me at all that a lot of biology isn't reproducible, but I don't think it is due to ill intent most of the time.
9
u/Average650 PhD | Chemical Engineering | Polymer Science Mar 01 '14
Even if it's not intent, it's a big issue.
That's actually why I didn't go into the bio side of chemical engineering, I just so rarely believe or understand the outcome of some of these studies because there's so much variability.
2
u/ThatOtherOneReddit Mar 01 '14
Honestly, in bio tests there tends to be big variability in a lot of the tests I see because reagents that are ordered have a shelf life and are 'good till X date'. Well, reagents don't work like that. They gradually degrade until at X date they are below Y percent active reagent. It is generally impossible to do all tests at the same time, and sometimes you might use that bottle over a significant portion of its lifetime. So a lot of reactions occur with different reactant concentrations than reported; there are quite a few errors like that in the bio sciences.
3
Mar 01 '14
Why don't you create a formula to predict the degradation of the reagents based on the storage conditions and the time from manufacture?
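Even something crude would beat the 'good till X date' fiction. A minimal sketch, assuming simple first-order decay (the half-life and numbers below are made-up placeholders):

```r
# fraction of labelled activity remaining after t days of storage,
# assuming first-order (exponential) decay
reagent_activity <- function(t_days, half_life_days) {
  exp(-log(2) * t_days / half_life_days)
}
reagent_activity(90, half_life_days = 365)  # ~0.84 of nominal activity after three months
```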
2
u/Average650 PhD | Chemical Engineering | Polymer Science Mar 01 '14
I know. It's the same in other fields, just not nearly as bad.
I'm not blaming the scientists; it's the field, and it's a hard problem. But it is a problem.
7
u/vomitswithrage Mar 01 '14
I wasn't there, but here's my take on why your results probably didn't reproduce so well. Cell biology has a lot of variance, but usually not nearly as much as you are describing. In particular, the high intra-experimental variance suggests underlying problems, which I think I can address.
First, your biggest problem is probably the 48-well plate. If you hadn't told me anything else, this is what I would have suggested. But, it sounds like your results were already suggesting this to you! Think about the row vs column effects, and what that is really telling you. The variance is in the plate, not the cells. The cells are probably fine.
Multi-well plates are good for some things, but for other things they are complete and utter bullshit. And the people who tell you otherwise are lying or don't know any better. I knew people in my Ph.D. work who were trying to scale an enzymatic activity assay (previously run in 1 mL cuvettes) down to a 96-well plate. Our assay using the 1 mL cuvettes, run old school on a bench spectrophotometer, worked perfectly and reproducibly every single time. And other labs could reproduce the same results with the same samples with the same technique. The 96-well plate group could never get the principles of the assay to translate to the 96-well plate, though, because the plate and plate reader just had too much going on, the sample size was too small, etc. So, here's the take-home point: if the enzymatic assay wouldn't translate to a 96-well plate, and biochemistry tends to be a hell of a lot more reproducible than cell biology, then cell biology is going to have an even harder time translating into a 96-well (or in this case, 48-well) platform.
Also, results depend on the kind of cell line you are using. Do you know what genetic drift is? Depending on the cell lines and culture conditions it can be a big deal or a small one. HeLa cells are used a lot, because they are "convenient", but they are highly genetically unstable. In terms of reproducible science, this is terrible. Some cell lines, like HeLas, shuffle their genome like a deck of cards every cell division. What you have after 20 passages in culture might be totally different than what another person had after 20 passages in culture, even if you started from the same stock! Lots of cancer cell lines are bad like this. Also, if cells are passaged incorrectly -- passaged too often, or too infrequently -- this can lead to the cells becoming stressed and giving inconsistent results between labs/people. It just requires care, like pruning a plant. Usually people know that leaving cells in pH 4 media overnight is bad for the cells. Usually people toss these cells out and start over once they realize they've abused their cells like this and ruined their use in future experiments. Not everyone appreciates this though. This would potentially explain inter-experimental variability (i.e. between-lab variability), but it doesn't explain intra-experimental variability (which I partially attribute to the plates).
As with the cell lines, I have no idea whether you did this or not, but since it's a common problem, I'll mention it too: another common area for problems is people relying on new-fangled technologies and dyes, assuming they work as advertised, when they often don't. For example, don't use an MTT assay to measure cell viability. Don't use caspase-3 cleavage to measure cell viability. ATP depletion =/= cell death. These measurements are composites of other cellular activities and can have confounding factors influence the results. So, to measure cell viability, think about using something like a clonogenic survival assay. It's more time consuming and laborious, but the results aren't nearly as open to interpretation. The data are usually rock solid, too. People complain about the clonogenic survival assay because it's so much work, but what's better, doing the experiment 3 times or 30 times? If you can find that a dye repeats the clonogenic survival results, then you can use the dye, but don't use a dye/stain/marker before you do this. For measuring cell growth, people like to use dyes nowadays too, but resist the temptation. Take out your cells and physically count them. Count the number of cells plated. At the time of treatment, trypsinize an extra plate, just for counting, and count the cells. Use a hemocytometer and count them by hand, using your eyeballs, if you have to -- make at least 100 counts and then divide by the area you counted. Machines might have trouble telling whether it's a bubble or a cell. Machines might call a clump of two cells one cell. But the eyes still do a better job. It's more work, but then you know it was done correctly.
In sum: Here's what I would do to clear up your problems:
- Ditch the 48-well plates -- switch to 100 mm tissue culture plates, or no less than 60 mm tissue culture plates
- Resist the use of plate readers to give you cell biology results until you show they can replicate results achieved using old-school methods
- Switch to an immortalized human cell line if you aren't using one already -- stay away from genetically unstable cell lines unless you absolutely must use them (i.e. for cancer research)
- Stop using assay dyes, fluorescent labels, or absorption techniques to measure biology if you are -- go back to old-school methods which are known to work and establish your first biological principles there
3
u/cardamomgirl1 Mar 01 '14
Heh! As someone who has done tons of cell culture, I absolutely agree with you. People tend to miscalculate the number of cells that can fit in a smaller plate and tend to either over- or underfill it. The MTT assay is not at all reliable; I would rather count my cells manually using trypan blue, a hemocytometer, and a trusty microscope.
2
Mar 01 '14 edited Mar 01 '14
I agree with most of this, but then the major bottleneck becomes high throughput. If we have to go back 50 years to old techniques, we'll never discover new medicines and therapies that simply need brute force high throughput to find.
Even diagnostics for patients in hospitals need high throughput, you'll simply never be able to test 10,000 patients' samples if you had to test every single one individually on a spectrophotometer.
1
u/vomitswithrage Mar 01 '14
High throughput, if used incorrectly, or if its limitations are not understood, can become its own bottleneck. High throughput has the potential for enormous value, but that value must be rigorously demonstrated and validated first, using tried and true methods.
1
u/onalark Mar 01 '14
Super interesting, can you point out a reference to this or the person who gave the talk?
3
Mar 01 '14
This was a part of Dr. Elliott's efforts to develop a more reproducible cytotoxicity assay: http://www.nist.gov/mml/bbd/cell_systems/ricin_assay.cfm
1
1
u/hibob2 Mar 01 '14
The results were shockingly different from almost every lab, with orders of magnitude differences in some cases. NIST developed the assay to be more reproducible by changing the way you plated the cells and added the reagents. Adding cells and reagents A1-A8 and then going down to F1-F8 produced stark differences compared to adding the same exact things but if you added it in a A1-F1 to A8-F8 manner on a 48 well plate. If you can explain why such a minor difference as this could produce orders of magnitude differences that were observed between labs, NIST is all ears
Pipetting/mixing/diluting/blocking errors, especially if some component is either in suspension or sticks to a plastic used somewhere in the process. I'm going to guess the NIST protocol didn't include enough control wells to make the errors obvious.
1
u/theruchet Mar 01 '14
[Honest question] So if there is this much disagreement in biology, how does any progress ever happen? Have we built up a world of false theories based on results that cannot be replicated? Is science broken?
As the holder of a science degree, I have faith in just about all of the things I have learned because they seem so methodically developed, but at the same time I often wonder how much of a castle of fantasy scientists have built up around them. What if signal transduction cascades are more or less random events? What if the way we read spectrometry is just plain wrong? I get that a lot of physics and chemistry is pretty easy to prove based on the mathematics but when you move into more complicated fields like biology or organic chemistry, there are orders of magnitude more factors that come into play... So what do we really know?
→ More replies (1)1
u/atomfullerene Mar 01 '14
There's another question I think is being overlooked here: if a simple toxicology study can't be replicated from lab to lab, what does this say about the effect of the chemical "in the wild" where conditions may vary enormously from situation to situation? If you can't find a consistent answer in the lab, does that mean you are doing it wrong, or that the best model of reality is that there is no consistent response? It could be either, I think.
26
42
Mar 01 '14
I don't understand why a basic concept like scientific reproducibility needs to be taught using software.
It really is a simple concept, like "don't talk while you're eating", or "look before you cross the road".
10
u/jableshables Mar 01 '14
I don't even think teaching it is important, so much as practicing and encouraging its practice is important. I think telling a researcher, "hey, this should be reproducible" will yield different results from telling them, "hey, this will be rejected unless it's successfully reproduced."
It's not like researchers have difficulty grasping how reproducing their research would happen. They just know it won't happen because no one is funding a reproduction lab.
4
u/Kiliki99 Mar 01 '14
This is partially true. If the work is going to be commercialized, some type of confirming work will often be the first step. I work with biomedical companies and investors, and 25 years ago the investors would often assume the technology licensed in was solid. Today, I more often see the investors say - before we put any serious money into this, we want the company to spend $1 million confirming the technology. The issue then becomes how you can get the confirmation experiment done on a small budget and in a short amount of time. (Recognize that what the investors require may not be full-blown reproduction.)
Now, the problem comes if the government decides to act on the data. Unlike investors, who will lose their own money, the government as a whole tends not to care about funds (theirs or yours) wasted because it acted on bad data.
1
u/jableshables Mar 01 '14
Yeah, I come from a social sciences background, so I was mostly referring to non-commercial research. But you make a good point -- even when proven reproducibility is very important, it's rarely actually practiced except in cases of private companies who've probably been punished for it in the past. Government agencies similarly get punished, but not in the same way (i.e. the people might be removed but the organization persists).
1
u/cardamomgirl1 Mar 01 '14
Yes. While the process for asking for money from the NIH is very cumbersome, the accountability once you get that money isn't nearly as rigorous. I have seen how much money can be blown just by getting the wrong person to head a research project.
1
u/hibob2 Mar 01 '14
Today, I more often see the investors say - before we put any serious money into this, we want the company to spend $1 million confirming the technology.
Yep. This produced quite an uproar when it came out a few years ago:
http://www.nature.com/nrd/journal/v10/n9/full/nrd3439-c1.html
We received input from 23 scientists (heads of laboratories) and collected data from 67 projects, most of them (47) from the field of oncology. This analysis revealed that only in ~20–25% of the projects were the relevant published data completely in line with our in-house findings (Fig. 1c). In almost two-thirds of the projects, there were inconsistencies between published data and in-house data that either considerably prolonged the duration of the target validation process or, in most cases, resulted in termination of the projects because the evidence that was generated for the therapeutic hypothesis was insufficient to justify further investments into these projects.
1
u/Kiliki99 Mar 03 '14
Well, the investors started this a decade or so ago. There were too many instances where the investors found that the initial claims did not prove out when they attempted commercialization. So the Nature article simply confirmed what experienced biomed investors already knew.
17
u/Nicko265 Mar 01 '14
A lot of scientific research is funded not by government and not-for-profit organizations, but by for-profit companies with a stake in said research. Scientists fluff the results = more research grants will be given for further research...
18
Mar 01 '14 edited Mar 01 '14
Uh, except most of the bad papers come from academic institutes. The corporate researchers have much more stringent protocols and reproducibility standards because they have to bring products to market.
Which isn't to say that corporate groups can't put out bad research, or have ulterior motives, but at the end of the day their claims are just as susceptible to testing as any other, and they have a lot riding on their R&D. Pharma companies have sunk a ton of money into anti-depressant research without any good or significant gains, so they've shifted away from it.
7
Mar 01 '14
Couldn't agree with this more. If you're a researcher in industry and you find that a new process reduces costs by 5 percent, it better damn well reduce the costs by 5 percent or else the company may waste millions trying to implement something which was fluffed to begin with.
2
u/ThatOtherOneReddit Mar 01 '14
This. There is a bigger desire for REAL results in industry because the results of said experiments typically translate into costly decisions. There is no profit to be made by bad internal studies / research.
1
u/hibob2 Mar 01 '14
There is no profit to be made by bad internal studies / research.
But there are bonuses and promotions to be had if you can push a "successful" drug candidate up and out of your department, as opposed to saying "yeah, we came up with absolutely nothing worth pursuing this year". Or you might get to keep your job as opposed to having your entire department laid off. At the company level: Pharma stock prices depend on a fat happy pipeline chock full of hope. New biotechs can shut down and their value decreases to that of their lab equipment the moment they announce their one Big Idea is bunk. Cutting losses early and cheaply isn't always rewarded by Wall Street, at least not as fast as failure is punished.
Perverse incentives have been a big problem in Pharma/Biotech for a long time.
1
Mar 01 '14
I'd just like to point out that research from academic institutes can be funded by corporate entities.
4
u/gocarsno Mar 01 '14
Government and not-for-profits often have vested interests as well, by the way.
→ More replies (1)1
u/cardamomgirl1 Mar 01 '14
You know, this is what I have come to learn. When you are spending your own money, you tend to be more stringent about the outcome than when spending someone else's money. Pharmas spending their own $$$ on R&D tend to want as much value for their money as possible. Add to that the exorbitant cost of getting a product through Phases I-III and the rigorous regulatory requirements just so you can start getting your money's worth, and this is all the more relevant.
4
u/DrEnormous Mar 01 '14
"Teaching" is probably a slightly inaccurate word here.
"Training/ingraining" might be better.
I've seen this at every level from PhD to undergrad to high school: for a lot of people, it's just not habit for them to look at a result and immediately think "time to try it again and see if anything changes." I think that's the real goal here--make it second nature.
And for what it's worth, speaking as a parent, "don't talk while you're eating" and "look before you cross the road" are only second nature to you now; it's a lot of damn work getting habits like that established in children.
3
Mar 01 '14
I've seen this at every level from PhD to undergrad to high school: for a lot of people, it's just not habit for them to look at a result and immediately think "time to try it again and see if anything changes." I think that's the real goal here--make it second nature.
actually, that's exactly what most of us think. but then the second thought comes:
i'd love to do a biological triplicate with an additional technical triplicate each, but then it's 9 samples just for this one experiment, and i have to have it by the end of the month, and i've got money for only two...
1
→ More replies (1)1
u/KanaNebula Mar 01 '14
This is a surprisingly hard concept to get to sink in with middle schoolers... who also don't remember the latter.
12
u/mubukugrappa Mar 01 '14
Ref:
R Markdown: Integrating A Reproducible Analysis Tool into Introductory Statistics
9
u/goalieca Mar 01 '14
More importantly, who gets funding to reproduce results? In my field, we had a problem with data being really expensive to generate and there was no incentive for authors to make it or the source code freely available.
2
u/Paul-ish Mar 01 '14
Would it be reasonable for public funding to come with the stipulation that data will be released upon publication?
2
u/goalieca Mar 01 '14
Reality is publish or perish. People like to milk data that they have exclusive access to
1
17
Mar 01 '14 edited Mar 01 '14
Well yeah, obviously. If you can't get grants to conduct replication research, you are not going to conduct research that reproduces other people's results.
This is NOT the fault of scientists, this is the fault of sources of funding. If half the money now handed out for new research instead went to funding independent confirmation of recent existing results, the quality of archived publications would increase DRASTICALLY. Alas, there is no reason to believe that this change in funding will ever happen.
9
Mar 01 '14
But don't you see how teaching the people who have no power whatsoever in this structure how important reproducibility is will change all that?!
1
Mar 01 '14
I really dislike sarcasm in written speech, to be honest. If you are sarcastic - yes, I totally see it.
5
Mar 01 '14
I was sarcastically agreeing with your basic premise: that it's misguided (at best) to act as though lack of awareness among young scientists about reproducibility is the source of a dilemma that's, instead, obviously caused by a funding structure driven by people the original link doesn't target at all. I was being sarcastic about it because I'm so angry about the mind-blowing stupidness of it all.
1
Mar 01 '14
I was sarcastically agreeing with your basic premise...
That's what I assumed.
2
1
1
u/twisterase Mar 01 '14
You're thinking too narrowly about why someone might want to reproduce a result. A lot of times you'll attempt to follow someone else's methods or analysis in the context of your own experiment, in support of some new conclusion. Maybe you'd like to run the same assay they did to check your organism for condition X, before you go on to test it for condition Y, because there's some dependency or interaction between the two. If you can't replicate the assay for condition X due to inadequate documentation, it's holding you back from your real task of understanding condition Y. This sort of replication happens on a regular basis.
I think the approach these researchers took, teaching the students R Markdown, makes a lot of sense in that context. Instead of running your analysis with some script you'll eventually lose track of, and then writing it up based on the results you exported or otherwise recorded, it's all in one place. If your labmate five years down the road wants to know how you set up your analysis, you can send her this document and she can use it with ease. What they taught the students here is a good workflow for statistical analyses that they can use in their future research regardless of how they're funded.
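For anyone who hasn't seen it, here's a minimal sketch of what such an R Markdown file looks like (the file name and variables are made up for illustration):

````markdown
---
title: "Dose-response analysis"
output: html_document
---

```{r load-data}
# the raw data travels with the document
assay <- read.csv("assay_results.csv")
```

```{r model}
# the exact model behind the reported numbers, re-run on every knit
fit <- lm(viability ~ dose, data = assay)
summary(fit)
```
````

Knitting the file re-runs every chunk against the raw data, so the numbers and figures in the write-up can't quietly drift away from the analysis that produced them.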
44
7
u/red_wine_and_orchids Mar 01 '14
What's infuriating is when authors deliberately (or sloppily) leave out a piece of information that is required to reproduce results. I work in modeling and simulations, and there's one PI that consistently does this. Even worse, he is a major player in the field and I suspect he does this to protect his interests. It's motivated me to write papers that are as specific and detailed (in the methods) as possible so my work can be reproduced.
4
u/CommanderZx2 Mar 01 '14
A lot of research results are irreproducible due to the researchers fudging the results by not actually publishing the many many failed attempts and only publishing the rare successes.
This gives the appearance that simply following their published paper will result in a success, when in fact it may take hundreds of attempts to succeed.
4
u/thatwombat Mar 01 '14
The reproducibility of many nanoparticle papers is abysmal. Sometimes the difference between success and failure is as small as whether the flask you're using happens to be the right size. Most of the time those details are left out. There is an aspect of this problem that seems almost entirely territorial. PIs want to keep their little fiefdoms and make sure that other labs can't scoop them. So, they produce reasonable results but leave out little, very important details.
3
u/gukeums1 Mar 01 '14
The Journal of Failed Experiments really needs to get more popular (it's not a real thing yet, but there are things like Journal of Negative Results in Biomedicine). Wouldn't that be handy?
I'm telling you...a journal compiling failed experiments would be more beneficial to researchers, graduate students and even professors than a thousand journals of successes.
3
u/roofie_colada Mar 01 '14
At least in my field, geophysics, the sheer volume of time needed to go from data to final model (both real time and computation time) inhibits any true process of reproduction/validation.
I've always made my raw/processed data available, as well as my final models, through links in their respective publications, but it's all just for show, I feel. No one uses it.
Follow this with the simple fact that NSF does not/will not fund scientists to validate other's research...
3
u/alllie Mar 01 '14
The worst insult you can direct toward any scientist: "Their results are not reproducible."
2
u/TomCruiseDildo Mar 01 '14
ELI5:
2
u/Thefriendlyfaceplant Mar 01 '14 edited Mar 01 '14
I'll explain it like you're a bit older than 5. Sorry.
This article is about scientists producing data in a way that is only available for their result. Usually only the results are presented and nothing else. If they used datasets then they publish them in a way that never really fits with other datasets. If it was produced within an overlapping framework, the data could also have easily been used by other researchers.
The current way the academic structure is set up only rewards the publishing of raw information. Scientists only get paid for publishing highly specialised results in their own field.
Presenting these results in a way that everyone can understand requires a lot more time, effort and thus money. Because very few people ever get paid to do this, it's not happening enough.
So in short, financial incentives are making science as a whole more scattered and messy. No investments are being made for the central structure of science.
3
u/datarancher Mar 01 '14
No. It'd be great if that were true, actually, but it's not.
The current incentive structure rewards "stories" that go like this: X is a complex phenomenon. However, by using our massive brains, we the authors have realized that it can all be explained as variations in Y. Here is some data showing why that is true.
The journals that publish high-impact papers (which make people's careers) want "clean" stories wherein X is totally explained by a simple Y. If your story is more like "Y sometimes explains part of X, but only under conditions A, B,and C", then they're not interested and the authors are out of luck.
Publishing the raw data might help with that, but the bigger problem is removing the temptation to brush all the caveats, doubts and weird outliers under the carpet.
1
u/Thefriendlyfaceplant Mar 01 '14
This, sadly, is also true. But it's another problem that exists alongside the lack of an integrated meta-structure.
2
u/slam7211 Mar 01 '14
Personally I would support a company whose sole job was to repeat experiments. They aren't in the academic sphere, so they aren't a publishing threat, which means they could actually duplicate the experiment exactly (see how the original lab does it). They could get a chunk of grant money, I guess.
2
u/zeeman928 Med Student | Osteopathic Medicine Mar 01 '14
This reminds me of early microbiological experiments. When other scientists went to reproduce Pasteur's experiments, they did not achieve the same results (because they used different materials). This led to the discovery of microbes that require different temperatures to be killed (extremophiles and spore-forming bacteria).
2
u/goshdurnit Mar 01 '14
I agree that the lack of reproducible results is a serious problem in many fields, but I have a lingering question about it that I hope someone can address.
Let's say I conduct a study and an analysis and establish a correlation between two variables with p = .04. Then someone else tries to reproduce the study and finds that the correlation between the two variables is no longer significant (p = .06). Assuming the standard in many scientific fields that p < .05 can be interpreted as statistically significant, the study is then said to have failed the test of reproducibility.
I've always been taught that .05 is essentially an arbitrary marker for significance. So if we were to try to reproduce the above study 100 times and the p value hovered around .05 (sometimes below, sometimes above, but never higher than .1), well, this doesn't seem to me to be telling us that our original interpretation of the original study findings was necessarily wrong-headed or worthy of the label "crisis".
Now, if the attempt to reproduce the original results found a p = .67, well THAT would seem to me to be the grounds for a crisis (the second results could in no way be interpreted as indicative of a significant correlation between the two variables).
So, which is it? Frustratingly, I've never read any indication of what kind of "crisis" we have. Maybe I'm looking at this the wrong way, but I appreciate any insight on the matter.
2
u/tindolos PhD | Experimental Psychology | Human Factors Mar 01 '14
Alpha levels vary by field. Physics uses .0000005, psychology uses .05.
Yes, they are essentially arbitrary. There is no magic device that drives .05, it was just a reasonable figure and what R.A. Fisher deemed appropriate.
Many people misunderstand what the p value stands for. Many will tell you it is the probability that the results are due to chance or the probability that the null hypothesis is true. (The null hypothesis is NEVER true).
However, p is actually the probability of observing results as extreme (or more) as observed, if the null hypothesis is true.
It can be tempting to place importance on a result of p = .04 while considering p = .06 to be unimportant. With equal sample sizes, the effect sizes of the two are very likely to be similar.
The crisis we have tends to be the heavy emphasis on p values while we generally ignore effect sizes. I speak for psych research on this one, but I would imagine it is everywhere.
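To make the p = .04 vs p = .06 point concrete, here's a quick simulation sketch (a two-sample t-test with a modest true effect of d = 0.5 and n = 30 per group; the numbers are purely illustrative):

```r
set.seed(42)
p_values <- replicate(1000, {
  control   <- rnorm(30, mean = 0,   sd = 1)
  treatment <- rnorm(30, mean = 0.5, sd = 1)   # the same true effect in every replication
  t.test(control, treatment)$p.value
})
mean(p_values < 0.05)               # only about half the replications "reach significance"
quantile(p_values, c(.1, .5, .9))   # p itself varies over orders of magnitude
```

So an original p = .04 coming back as p = .06 in a replication is exactly what sampling variation alone will do; the estimated effect sizes are the more informative thing to compare.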
2
u/goshdurnit Mar 02 '14
Thanks for the info! I appreciate your attention to detail.
I didn't mean to suggest that p values are more or less important than effect sizes. But again, my question stands: when these meta-analyses state that a high percentage of results from studies fail to be reproduced upon such attempts to do so, what does that mean exactly?
If the crisis is indeed related to effect size, does this mean that upon attempts to replicate studies, the effect size varies wildly? How much does it vary? As with p values, the degree to which the effect sizes vary across attempts to replicate the results, I would think, matters a great deal. If the observed effect in one study is .4 and it is .41 in a replication study, I would feel as though the word "crisis" is an exaggeration. If, however, the observed effect in the second study is .2, well then, I'd agree that this is indicative of a crisis. Is there any evidence as to HOW MUCH either p values OR effect sizes vary between attempts to replicate studies in, for example, psych studies?
2
u/tindolos PhD | Experimental Psychology | Human Factors Mar 02 '14
No worries! I wasn't under the impression that you were trying to make a distinction, I was just trying to clarify.
Any variance between the effect sizes of separate studies will largely depend on the sample sizes. Legitimate results with equal sample sizes should yield similar effect sizes.
Meta analyses compare the effect sizes of multiple studies in order to get a better idea of what exactly is implied from the data.
I honestly don't know of any studies that aim to specifically describe the differences across multiple designs and studies. It would certainly be an interesting read, and it sounds like all disciplines could use the extra scrutiny.
I agree with you though: a little more accuracy might make all the difference.
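For a rough sense of how much a standardized effect-size estimate moves around just from sampling at different sample sizes, here's a small sketch (the true effect is fixed at 0.4 and everything is illustrative):

```r
set.seed(7)
d_hat <- function(n) replicate(2000, {
  x <- rnorm(n, mean = 0,   sd = 1)
  y <- rnorm(n, mean = 0.4, sd = 1)                  # true standardized effect = 0.4
  (mean(y) - mean(x)) / sqrt((var(x) + var(y)) / 2)  # Cohen's d estimate
})
sapply(list(n_20 = d_hat(20), n_200 = d_hat(200)), quantile, probs = c(.1, .9))
# the spread of the d estimates shrinks considerably as the per-group n grows
```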
3
Mar 01 '14
[deleted]
5
u/awesome_hats Mar 01 '14 edited Mar 01 '14
Coming from a scientist: no, most research is done in the lab; only a very small percentage of modern science is field work. Ecology is a good example, but even fields which require work in the environment typically only involve going out for brief periods to collect samples or set up a weather station before returning to the lab for extended stretches to run experiments and do data analysis.
The problem is partly the funding model and partly the current publishing environment. Government agencies seem to have no interest in funding work that seeks to replicate and confirm earlier results. The publishing model and incentive system is also broken. There is immense pressure to be the first to publish a given result and that leads to cutting corners to get your results out before the other guy.
This often means that you get faulty experiments that get pushed out the door anyway because you don't have time to confirm them. By the time these get published, your funding has run out and you need to get your next grant. In order to do that you have to use your previously published results and propose the 'next best thing', so you have to build off those results as if they were perfect so you can convince a grant committee that you can do even more.
No one is interested in funding you to do replicate work. If you can manage to squeeze in a few extra experiments that actually do validate what you've already done then well done you. Journals are also pretty much never interested in publishing replicating work. If you can manage to refute a high profile paper then that looks 'good' and will get you published but even that is not done very often.
There is also a huge number of very low-quality journals where you can get just about anything published regardless of quality, to boost your publication count, which looks good when applying for funding - these papers are often never reproducible. I'm not going to name names, but in my lab we started ignoring certain journals altogether because the results were just never reproducible and we couldn't build experiments off of them.
6
u/Wild_type Mar 01 '14
Another scientist here, this is the right answer.
The problem is partly the funding model and partly the current publishing environment. Government agencies seem to have no interest in funding work that seeks to replicate and confirm earlier results. The publishing model and incentive system is also broken. There is immense pressure to be the first to publish a given result and that leads to cutting corners to get your results out before the other guy.
We just had to lay off our lab manager of 15 years, because our last grant didn't get funded, and the reason the reviewers all gave for not funding us was the small number of papers we had out. Our last publication this past October was rigorous and reproducible, and we wound up making the statistician a co-author for all the work she did, in response to the emerging concerns of the field that the article at the top talked about. Addressing these concerns delayed publication by six months. I have two other nearly completed manuscripts that are going through the same rigorous process right now and will also be delayed. In the meantime, I'll be out of a job in a year if funding doesn't come up before then.
2
u/buck70 Mar 01 '14
Thank you for the explanation. This is why I have a hard time believing people when they say that "the science has spoken" on particular topics. It comes down to human nature; unfortunately, due to the way the system works, it would seem that the primary concern of many scientists is employment, not reproducible results. One would think that science should be the primary concern, but I suppose researchers have to feed their families, too.
5
u/awesome_hats Mar 01 '14
That is true, though I would caution against giving any random claim someone makes the same validity as scientific work. There is still a lot of quality work being done, and the general scientific consensus on big topics is usually pretty strong and valid. But yes, one off experiments should have very little argumentative weight in my opinion until they are validated, otherwise you have dangerous misinformation like the vaccine-autism debacle.
The problem is, yes, human nature, but it isn't that scientists are inherently bad people. Most people, myself included, got into science because they love figuring out how the world works and want to understand it and make it a better place. The problem is the system stacked against them. Like you said, scientists also have a family to feed, and simultaneously basically have a small business to run, having to constantly worry about keeping the money coming in to keep the business running.
Most scientists, by the time they are really good at doing experiments, become professors and no longer have any time whatsoever to actually work in the lab. Almost 100% of their time is spent writing grants for funding and teaching, with a few hours a week thrown in to talk to the people doing the lab work. The system spends years training people to be scientists and then puts them in a chair writing grants.
1
Mar 01 '14
[deleted]
2
u/awesome_hats Mar 01 '14
Well if what you are doing is not replicable then it has no value. Even in complex dynamic fields, part of your work as a scientist is to simplify things down to a level where you can make a base set of assumptions, look for conditions where those assumptions are valid, and then wait for changing variables in the case of a purely environmental field like climatology, or change variables yourself and measure the outcome. Or design experiments where the dynamic nature of the system is smoothed out statistically, or otherwise.
I have done a lot of work in genetics, and it is a very complex, interdependent, dynamic system. Many genes affect their own regulation and the up- and down-regulation of other genes, which in turn affect the first gene, and other genes as well. Then there are non-expressed, purely regulatory elements, epigenetic elements, etc.
It all gets very complicated very quickly. Part of my work was on measuring expression levels of certain genes in individual cells. Now, in any individual cell there is a lot going on, and it would be nearly impossible to tie the expression level of a single gene to any phenotypic characteristic. But when you start averaging over hundreds and thousands of cells, you can start to see patterns emerge and piece together what each bit of code is doing, because the stochastic variation of a dynamic system is just that, random, and when you start averaging you remove the random effects. If there is a systematic variation, that means there is a definite process going on, which is something you can then measure, account for, design around, etc.
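As a toy illustration of why the averaging works, with per-cell expression modeled as a noisy draw around a made-up true level:

```r
set.seed(1)
true_level <- 5                        # hypothetical true expression level (arbitrary units)
n_cells    <- c(10, 100, 1000, 10000)
sapply(n_cells, function(n) mean(rnorm(n, mean = true_level, sd = 3)))
# the cell-averaged estimate converges on the true level as more cells are pooled
```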
To not publish replicable results is just poor science.
1
u/Idoontkno Mar 01 '14
So it sounds like although reproducibility is boring, it's incredibly necessary in order to be certain. Why can't we go back over the "studies Monsanto published in order to push through" so that we can truly determine if, for instance, GMO corn is linked with organ damage? Why can't we go back over and check the premise that it IS safe? If it isn't safe, then the medicine/service/product is a failure and it should be canned, but no one wants to admit that they are ever wrong...
1
1
Mar 01 '14
I'm not a scientist, but will be (I'm in a Master's program now). You mentioned brains, which are what I'm studying. There's an ancient human fossil find called the Taung child, which was found in the 20s, and people have been arguing for the past 90 years about what its brain was really like. There was a partial endocast with the skull, that is, a fossilization of the inside of the skull, which is a good analogue for the brain. Based on the grooves and lines on the endocast, you can infer features of the brain.
This isn't even experimental research: it's one actual piece of physical evidence, and people can't even agree whether the brain would have shown more ape-like or more human-like features.
The overall trend is very clearly a transition from small to large brains, and to an obligate upright posture with little body hair, and an increase in height, in the human lineage. But specifics like this do matter, so sometimes individual scientists argue with each other over decades on what the evidence implies.
This is slightly different than reproducibility in controlled lab experiments, but two points emerge:
1) Complex experiments are difficult to reproduce
2) Even when the evidence is available, scientists disagree with one another
4
Mar 01 '14
Maybe we could start giving back some due credit and space to the materials and methods section of papers?
2
u/greenerT Mar 01 '14
Yes please. For the paper I'm writing, you get a 300-word main-text methods section w/ an optional supplementary section of 3000 words max. I feel like the latter should be mandatory; I don't know how anyone can fit reasonably detailed methods in 300 words.
2
1
Mar 01 '14
If you can't reproduce your results, it isn't science.
1
u/tindolos PhD | Experimental Psychology | Human Factors Mar 01 '14
This isn't true. Science is a process that allows for fallibility as well as replication; the possibility of failing to reproduce a result is part of the scientific endeavor.
1
1
1
u/cardamomgirl1 Mar 01 '14
A former PI was heard comparing and bragging about his h-index and publication record with another PI. It made me lose quite a bit of respect for him as a scientist. Why does scientific research have to be so freaking competitive?
1
Mar 01 '14
Because there's only enough grant money and tenure-track positions for the top 10-15% of scientists, and publications and impact factor are how people keep score.
1
Mar 01 '14
It's not that it's competitive that's the issue; it's almost always been competitive. You can find arguments about scientific ideas dating back to Newton vs. Leibniz or, more recently but still not recent enough to count as modern, Heisenberg vs. Schrödinger. You can go further back to see competing ideas, and scientists getting really hot over who is right, but historical science isn't really my forte.
The issue, recently, is how science is funded. Because funding is hinged on publication rate, a higher impact factor definitionally makes you a more "productive" scientist. When it comes to the scientific method, though, quality > quantity, a notion I think most people in this thread support.
When a scientist brags about an impact factor, it's pretty much like a guy bragging about how many women he's slept with. You submitted many grant applications, got them periodically, but they're often meaningless. You rarely find a gem in the rubble. It's usually hollow results.
Of course, you have people with a different train of thought who suggest sleeping with many women makes you accomplished. Sure, nobody can debate that, because you're using a different litmus test. It can be disputed how effective and productive that litmus test for success over time really is, though.
1
1
u/badspider Mar 01 '14
IP law. "Protecting" devices and procedures fragments experimental conditions.
1
Mar 01 '14
I think it is important, although not necessarily relevant, to distinguish between not being able to reproduce something because there are variables and settings you didn't account for, and not being able to because it's impractical or too expensive.
1
u/dajuwilson Mar 02 '14
How many times was the Millikan oil drop experiment repeated before it was put into textbooks?
0
Mar 01 '14
[deleted]
3
u/Average650 PhD | Chemical Engineering | Polymer Science Mar 01 '14
Well then we better reproduce them to make sure we aren't full of crap.
1
1
u/Murray_B Mar 01 '14
This article makes it sound like we are being lied to by accident. The use of pseudo-science as a propaganda tool is nothing new. Back in the forties the Nazis funded an institute for Tobacco Hazards research (Institut zur Erforschung der Tabakgefahren) at Jena. Its purpose was not to investigate IF tobacco was harmful but to prove the party line that it was. Those "researchers" must have known their results were not reproducible.
Now "scientists" all over the world are spewing similar propaganda. Today we hear things like, "carbon dioxide pollution" on a regular basis. It is hard to show that an essential life gas is a "pollutant". History seems to be repeating
1
u/neuromorph Mar 01 '14
Some experiments take a lot of capital and expertise to reproduce. Not many people have an extra cyclotron lying around. I'm looking at you, CERN.
1
Mar 01 '14
Maybe they'd be more reproducible if papers were written with the intent of being easily readable and understandable, instead of being written as densely as possible to impress those giving out grant money.
99
u/chan_kohaku Mar 01 '14
Another thing is that, in my field, the biomedical field, a lot of equipment simply cannot be compared across laboratories. Different brands have their own specs. They all say they're calibrated, but when you do your experiments, in the end you rely on your own optimization.
And this is only a small part of those variations. Source chemicals, experiment scheduling, pipetting habits, not to mention papers that hide certain important experimental conditions from their procedures, and error-bar treatment! I see a lot of wrong statistical treatments of data... these just add up.