It is clear that most if not all of these experiments have difficulties that are unrelated to SR. In some cases the anomalous experiment has been carefully repeated and been shown to be in error (e.g. Miller, Kantor, Munera); in others the experimental result is so outrageous that any serious attempt to reproduce it is unlikely (e.g. Esclangon); in still other cases there are great uncertainties and/or unknowns involved (e.g. Marinov, Silvertooth, Munera, Cahill, Mirabel), and some are based on major conceptual errors (e.g. Marinov, Thimm, Silvertooth). In any case, at present no reproducible and generally accepted experiment is inconsistent with SR, within its domain of applicability. In the case of some anomalous experiments there is an aspect of this being a self-fulfilling prophecy (being inconsistent with SR may be considered to be an indication that the experiment is not acceptable). Note also that few if any standard references or textbooks even mention the possibility that some experiments might be inconsistent with SR, and there are also aspects of publication bias in the literature—many of these papers appear in obscure journals. Many of these papers exhibit various levels of incompetence, which explains their authors' difficulty in being published in mainstream peer-reviewed physics journals; the presence of major peer-reviewed journals here shows it is not impossible for a competently performed anomalous experiment to get published in them.
There is a common thread among most of these experiments: the experimenters make no attempt to measure and quantify the systematic effects that could affect or mimic the signal they claim to observe. And none of them perform a comprehensive error analysis, which is necessary for any experiment to be believable today—especially ones that purport to overturn the foundations of modern physics. For Esclangon and Miller this is understandable, as during their lifetimes the use of error bars and quantitative error analyses was not the norm; the modern authors have no such excuse. In several cases (Esclangon, Miller, Marinov, Torr and Kolen, Cahill) it is possible to perform an error analysis which shows that the experiment is not inconsistent with SR after all.
Another common thread among many of these experiments is the claim of “agreement with Miller's result” (Kantor, Marinov, Silvertooth, Torr and Kolen, Munera, Cahill). Miller was the first to claim to have measured the “absolute motion of the Earth”, and his result has achieved a sort of “cult status” among people who doubt the validity of SR. The paper referenced below in the discussion of Miller's results shows conclusively that his result is wrong, and explains why in detail. So claims of “agreement with Miller” generate doubts about the validity of experiments making such claims (how likely is it that a valid result would “agree” with a demonstrably bogus result?).
A key point is: if one is performing an experiment and claiming that it completely overthrows the foundations of modern physics, one must make it bulletproof or it will not be believed or accepted. At a minimum this means that a comprehensive error analysis must be included, direct measurements of important systematic errors must be performed, and whatever “signal” is found must be statistically significant. None of these experiments come anywhere close to making a convincing case that they are valid and refute SR. This is based on an elementary analysis of the experimenters' technique, not on the mere fact that they disagree with the predictions of SR. Most of these experiments are shown to be invalid (or at least not inconsistent with SR) by a simple application of the elementary error analysis or other arguments relating to error bars, showing how important error bars are to the believability of a result; the authors merely found patterns:
Amateurs look for patterns, professionals look at error bars.
All that being said, I repeat: as of this writing there are no reproducible and generally accepted experiments that are inconsistent with SR, within its domain of applicability.
- A.A. Michelson and E.W. Morley, “On the Relative Motion of the Earth and the Luminiferous Ether”, Am. J. Sci. (3rd series) 34 pg 333–345 (1887).: Some people claim to see a “signal” in this iconic experiment. Indeed there does appear to be a sinusoidal variation in their plots, with period ½ turn, just as any real signal would be. But an elementary error analysis shows these variations are not statistically significant. Appendix I of arXiv:physics/0608238 discusses this experiment, including the error analysis, and Section III of that paper shows why their noise appears to be a sinusoid with period ½ turn. There is no justification for claims of a real “signal” here.
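As a toy illustration only (a sketch, not the actual analysis of arXiv:physics/0608238): any data reduction that folds together readings taken half a turn apart produces output that repeats with period ½ turn by construction, so even pure noise comes out looking like a ½-turn “signal”. The 16 stopping points per turn assumed here match the usual procedure for such interferometers:

```python
import random

def fold_half_turns(turn):
    """Average each reading with the reading taken half a turn later.

    turn: 16 readings at the 16 stopping points of one full rotation.
    Folding position k with position k+8 forces the result to repeat
    with period 1/2 turn, whatever the input data look like.
    """
    return [(turn[k] + turn[k + 8]) / 2.0 for k in range(8)]

random.seed(3)
noise = [random.gauss(0.0, 1.0) for _ in range(16)]  # pure noise, no signal
folded = fold_half_turns(noise)
# Unfolding reproduces a full turn whose "signal" has period exactly
# 1/2 turn -- even though the input was pure noise.
half_turn_signal = folded + folded
```

Whether such a pattern is a real signal or just folded noise can only be decided by the error analysis discussed below.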
- Esclangon, C.R.A.S. 185 no. 26 (1927), pg 1593.: He observed a systematic variation in the position of an optical image, correlated with sidereal time. This result is inconsistent not only with SR, but is also inconsistent with the hypothesis that space is Euclidean and light travels in straight lines.
- His “signal” is actually composed of points that are an average of several thousand measurements each, and the magnitude of the signal is about 25 times smaller than the resolution of the individual measurements; an elementary error analysis shows that his result is not significantly different from no variation (the prediction of both SR and Euclidean geometry). So there is no reason to believe this result is inconsistent with SR. Also see Experimenter's Bias below—this is a clear example of over-averaging.
- Miller, Rev. Mod. Phys. 5 (1933), pg 203.: This is a laborious repetition of the Michelson-Morley experiment (MMX), with observations taken over a decade. He reports a net aether drift of about 10 km/s, and describes the variation in velocity and direction in terms of the motions of the sun and the Earth combined with a net aether drift.
- This experiment was re-analyzed in: R.S. Shankland, S.W. McCuskey, F.C. Leone and G. Kuerti, “New Analysis of the Interferometric Observations of Dayton C. Miller”, Rev. Mod. Phys. 27 pg 167–178 (1955). They re-examined Miller's original data logs and explained his non-null result as partly due to statistical fluctuations and partly due to local temperature conditions. Their re-analysis is consistent with a null result at all epochs during a year. They gave no justification for any correlation with sidereal time such as Miller reported.
- Remarkably, the raw data of this experiment have survived (copies can be ordered from the C.W.R.U. archives). They were also re-analyzed in: T.J. Roberts, “An Explanation of Dayton Miller's Anomalous 'Ether Drift' Result”, arXiv:physics/0608238. This paper explains in detail how and why Miller was fooled (using digital signal processing techniques), and performs an error analysis showing his results are not statistically significant. It also presents a new analysis that models his systematic drift and obtains a zero result with an upper bound on “aether drift” of 6 km/s (90% confidence). In short, this is every experimenter's nightmare: Miller was unknowingly looking at statistically insignificant patterns in his systematic drift that precisely mimicked the appearance of a real signal. While Miller himself could not have known this, there is no reason to believe or accept his anomalous result today.
- There are dozens of other papers that discuss and/or attempt to “re-analyze” Miller's data, all of which claim to find some real signal in his data. They are all worthless as they do not perform the elementary error analysis of his raw data (see Section II of arXiv:physics/0608238). Miller's anomalous result comes from averaging data—the elementary error analysis is indisputable and shows that his result is not statistically significant. Some modern authors even perform a complicated statistical analysis on plots of his run results vs. sidereal time, proclaiming there is a “significant signal”—they forgot to look at the raw data and compute the statistical significance of each run's result: those are not significant, which destroys their house of cards.
- There is also an aspect of experimenter's bias in Miller's original result (and in the modern “re-interpretations” that find a “signal”). He clearly over-averaged his data, and the “signal” he (and others) found is an order of magnitude smaller than the resolution with which his raw data points were recorded. It is a fact of arithmetic that when averaging data one will obtain an answer, but an error analysis is required to determine whether or not it is statistically significant. People unfamiliar with modern experimental physics can impose their personal desires onto Miller's plots and find a “signal” by ignoring the huge scatter of the individual runs and just looking at the averages. The quantitative error analysis shows this approach is woefully inadequate and the “signal” found this way is not significant.
- So there is no reason to believe or accept Miller's anomalous result today.
- Kantor, J.O.S.A. 52 (1962), pg 978.: Criticized in: Burcev, Phys. Lett. 5 no. 1 (1963), pg 44.
Repeated by: Babcock and Bergman, J.O.S.A. 54 (1964), pg 147.
Repeated by: Rotz, Phys. Lett. 7 no. 4 (1963), pg 252.
Repeated by: Waddoups et al., J.O.S.A. 55 (1965), pg 142.
The consensus is now that Kantor's non-null result was due to his rotating mirrors dragging the air; repetitions in vacuum yield a null result consistent with SR.
- Marinov, “Measurement of the Laboratory's Absolute Velocity”, Gen. Rel. and Grav. 12 no. 1 (1980), pg 57; Marinov, Czech J. Phys. B 24 (1974), pg 965; Marinov, J. Phys. A: Math. Gen. 16 (1983), pg 1885–1888; Marinov, Progr. in Physics 1 (2007), pg 31 (posthumous reprint from “Deutscher Physik”, 1992).: This is a series of experiments using mechanically rotating mirrors and apertures that claim to measure a local anisotropy in the one-way speed of light.
- Marinov thinks his rotating mirrors and apertures provide an “absolute synchronization” which can be used to measure the one-way speed of light; this is not so, and is a major conceptual error in his design: they merely provide synchronization in the rest frame of his lab. He is also conspicuously bad about ignoring errors and resolutions, to the point of being ridiculous. Simple estimates based on his apertures and rotation rates show that his apparatus is incapable of measuring what he claims, by a factor of 1,000 or more. His apparatus inherently averages over several microseconds (or more), and he completely ignores this basic fact and claims to be measuring the speed of light over a distance of 1.4 meters (!). And he does not bother to monitor various environmental factors (temperature, humidity, barometric pressure) that could easily induce the variations he observes. There is no reason to believe his experiments have any value at all.
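The factor-of-1,000 estimate is a one-line calculation (the 1.4 m path length is from the text above; taking “several microseconds” as 5 µs is an assumption for illustration):

```python
# Back-of-the-envelope check of the mismatch between the apparatus's
# averaging window and the light transit time Marinov claims to resolve.
c = 299_792_458.0          # speed of light in vacuum, m/s
transit = 1.4 / c          # one-way transit time over 1.4 m, about 4.7 ns
averaging_window = 5e-6    # assumed "several microseconds", in seconds
ratio = averaging_window / transit   # roughly 1000
```

Any claimed measurement of the transit time is thus swamped, by about three orders of magnitude, by the inherent averaging of the apparatus.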
- Silvertooth and Jacobs, Applied Optics 22 no. 9 (1983), pg 1274; Silvertooth, Spec. Sci. and Tech. 10 no. 1 (1986), pg 3; Silvertooth and Whitney, Physics Essays 5 no. 1 (1992), pg 82.: This is a series of experiments using variations on a novel interferometer in which Silvertooth claims to have observed the aether. The first paper is simply a description of the special phototube and its usage in measuring Wiener fringes. The second and third present different variations of Silvertooth's basic double-interferometer; both claim to observe the aether.
- The experiments are marred by two clear instrumentation effects: there is feedback into the laser, and the multi-mode lasers employed could mimic the effect seen due to the interrelationships among the different modes. And the apparatus is excessively finicky—an attempt to repeat the measurement using his apparatus failed to see any effect at all (unpublished, see Publication Bias). In addition, the analysis presented is downright wrong—the anisotropy in the speed of light postulated in the 2nd and 3rd papers is completely unable to account for the observations (two different erroneous analyses are presented in the last two papers, making the same elementary mistake both times: not considering the entire light path). Indeed, their postulated transforms belong to the class of theories that are experimentally indistinguishable from SR (see Test Theories above).
- Kolen and Torr, Found. Phys. 12 no. 4 (1982), pg 401 (proposal); Torr and Kolen, NBS Special Publication 617 (1984), pg 675–679 (results).: This is an experiment using two atomic clocks separated by 500 meters, connected by an underground coaxial cable, which looks for sidereal variations in the phase between them. Variations in that phase are interpreted as variations in the one-way speed of propagation in the cable. This experiment is quite similar to those of Krisher et al. and Cialdea referenced above (both of whom reported null results).
- It is not clear why some people interpret this result as inconsistent with SR—certainly the authors themselves do not do so. Their monitoring of systematic effects such as temperature and barometric pressure (both of which affect the propagation speed of their cable) is woefully inadequate, and such uncontrolled and unmonitored environmental effects could easily explain the tiny variations they observe. Those variations are about a factor of 100 smaller than the relative drift of the clocks for zero separation (which they tried to subtract in their analysis by assuming it is linear, an assumption that is unlikely to hold to better than 1% as they require). Their data analysis method is also inadequate, as it involves averaging over 23 days (averaging data like this is almost never justified). Moreover, they omitted an error analysis related to that averaging, implying they do not understand the terrible implications of such averaging (see arXiv:physics/0608238 for an example of how disastrous it was for Miller to perform such averaging). They also have some days during which no variation was detected—that is consistent with an unmonitored environmental effect, and is inconsistent with any sort of cosmic effect. Because of the great variability in their diurnal variations, and the inadequacy of their monitoring and analysis, there is no reason to believe this result is inconsistent with SR; the experimenters themselves considered their result “preliminary”.
- Pearson et al., “Superluminal Expansion of Quasar 3C273”, Nature 290 (1981), pg 365; Mirabel and Rodriguez, “A Superluminal Source in the Galaxy”, Nature 371 no. 1 (1994), pg 46.: The simple observation of a visibly superluminal expansion or motion of a distant object does not necessarily imply that anything actually exceeds c locally. See for instance: Gabuzda, Am. J. Phys. 55 no. 3 (1987), pg 214. If a high-gamma (subluminal) object is moving at a small angle w.r.t. our line-of-sight it can appear to be going faster than light, but is not. This is different from any uncertainties in distance scales.
- Thimm, “Absence of the Relativistic Transverse Doppler Shift at Microwave Frequencies”, IEEE Trans. Instrum. and Meas., 52 no 5 (2003), pg 1660.: This experiment uses antennas mounted on a rotating disk inside a pair of metal enclosures to attempt to measure the transverse Doppler effect. Unfortunately the author fails to realize that he has merely constructed two closed RF cavities with a rotating coupler between them. That is, the “antennas” he mounted on the rotating disk are not free, and reflections from the surrounding walls completely dominate the RF pattern inside his apparatus, setting up wave patterns in what amounts to a coupled pair of untuned RF cavities. As the input and output of the enclosure have no relative motion, no frequency shift is predicted, in agreement with his measurement. This experiment does not actually test transverse Doppler at all, and it is fully consistent with SR.
- H. A. Múnera, D. Hernández-Deckers, G. Arenas, and J. E. Alfonso, “Observation during 2004 of periodic fringe-shifts in an adialeiptometric stationary Michelson-Morley experiment”, Electromagnetic Phenomena 6, No. 1 (16) pg 70–92 (2006).: This experiment is a repeat of the Kennedy-Thorndike experiment, but with equal-length arms at 90°. The interferometer is fixed to the Earth at a latitude quite close to the equator.
- When Kennedy and Thorndike performed a similar experiment more than 70 years ago, they realized they would be unable to distinguish between temperature effects and orientation effects, so they took great pains to keep the apparatus temperature constant to 0.001°C. In contrast, Múnera et al. did not make any attempt to control temperature or humidity effects (both are quite large in their room). They measured the temperature with a resolution of just 0.2°C, and attempted to correct for the large temperature and humidity changes—a simple estimate shows that an unmeasurable drift of 0.2°C between the two arms can cause a fringe shift comparable to their “signal”. Humidity differences can generate similarly large fringe shifts. Because they did not insulate each arm's light path from the room or from each other, it is clear that such variations did occur between the two arms (variations in the room itself were much larger). Because of inadequate environmental monitoring and control, there is no reason to believe this measurement is inconsistent with SR.
- R.T. Cahill, “A New Light-Speed Anisotropy Experiment: Absolute Motion and Gravitational Waves Detected”, Progr. in Physics, 4 (2006), pg 73.: This experiment measures the round-trip delay of RF signals that go out via an optical fiber and come back via coaxial cable, minus the delay from signals that go out via coaxial cable and come back via optical fiber. The apparatus has 5-meter cables, and is fixed to the Earth with the cables aligned north-south at a latitude of 38° S. He claims that optical fiber is insensitive to “absolute motion” but coax cable can observe it, and this combination maximizes the “signal”.
- Cahill made a reasonable effort to minimize the effects of temperature variations on his apparatus, but then assumed that no systematic error remained, and did not monitor the environmental factors (temperature, barometric pressure) that can affect his apparatus. By temporarily arranging the cables to form a circle, he used a test setup that eliminates any real signal and permits him to directly measure his systematic errors, which is an important thing to do. But then he inexplicably ignores that measurement (the last 4 hours of his Fig. 14). The apparatus drifts up and down by 8 ps during this signal-canceling configuration, which is roughly half the magnitude of his “signal”. The presence of such a large, unmonitored, and unknown drift completely invalidates his conclusions. For instance, there are several 24-hour periods in his data plots during which the variations are less than the 8 ps during that short measurement of the systematic drift—that is consistent with the entire “signal” being this unknown systematic drift.
- Cahill has a novel way of dealing with the clear and obvious variations in his data at a given orientation (i.e. data points 24 hours apart): he calls this “gravitational waves” and claims they are an additional part of his “signal” (these “gravitational waves” are from his theory, not GR). The presence of comparable variations when the cables were configured in a circle invalidates this claim, as that cancels any real signal. For this apparatus any real signal corresponding to “absolute motion” must have a period of 24 hours, and it is clear from his plots that there is no significant signal present (in his Fig. 15, the variance of the differences between points separated by 24 hours is comparable to the variance of the entire plot and of the signal-canceling configuration; this is just an application of the elementary error analysis). Calling the variations at a given orientation “gravitational waves” does not change the fact that these variations are comparable to the orientation dependence, which is therefore not statistically significant. In order to separate an orientation-dependent signal from the “gravitational waves” he claims, it is necessary to perform an experiment that can actually separate them, and this one cannot possibly do so. That requires an apparatus capable of separating systematic environmental effects from the data, and also capable of varying its orientation on a time scale smaller than that of the “gravitational waves” (remember these are not the gravitational waves of GR).
- Cahill emphasizes that his experiment agrees with Miller's results and with an unpublished experiment by de Witte. But his comparisons are without error bars and are therefore worthless. Error bars for Miller's data can be computed, and error bars for de Witte's and Cahill's data can be estimated, and each of the three results is not significantly different from null (for “absolute motion”; one must ignore his “gravitational waves” for this comparison). So they actually are consistent with each other, in a way completely unanticipated by Cahill: all three are consistent with a null “absolute motion” result! Do not be deluded by his Fig. 18, as Cahill does not display any error bars, and NONE of those variations are statistically significant; like Miller he is unknowingly looking at insignificant patterns in systematic drifts.
- In short, there is no reason to believe this result because: a) the systematic effects cannot be separated from the “signal” while taking data, b) the brief measurement of systematic drift (in a signal-canceling configuration) is comparable to the “signal”, c) the data for “absolute motion” are not significantly different from zero, and d) the “agreement” with other experiments is not at all what he thinks it is.
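The 24-hour-period test mentioned above (comparing points separated by 24 hours against the overall scatter of the data) can be sketched as follows; the function and the data are illustrative, not Cahill's:

```python
import math
import random
import statistics

def variance_ratio_24h(series, samples_per_day):
    """Variance of differences between readings taken exactly one day
    apart, divided by the variance of the series itself.

    A genuine signal with a 24-hour period makes the day-apart
    differences much quieter than the series; a ratio of order 1 or
    larger means the apparent orientation dependence is not
    distinguishable from drift and noise.
    """
    diffs = [series[i + samples_per_day] - series[i]
             for i in range(len(series) - samples_per_day)]
    return statistics.pvariance(diffs) / statistics.pvariance(series)

# Five days of hourly readings, in the two extreme cases:
per_day = 24
pure_signal = [math.sin(2 * math.pi * i / per_day) for i in range(5 * per_day)]
random.seed(4)
pure_noise = [random.gauss(0.0, 1.0) for _ in range(5 * per_day)]
# variance_ratio_24h(pure_signal, per_day) is essentially 0
# variance_ratio_24h(pure_noise, per_day) is about 2 (no 24 h signal)
```

Applying this test to plots like his Fig. 15 is what shows the orientation dependence to be insignificant.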
- C. E. Navia, C. R. A. Augusto, D. F. Franceschini, M. B. Robba and K. H. Tsui, “Search for Anisotropic Light Propagation as a Function of Laser Beam Alignment Relative to the Earth's Velocity Vector”, Progr. in Physics 1 (2007), pg 53; also arXiv:astro-ph/0604145v1.: This experiment looks for a variation in the transverse position of the laser light diffracted by a grating as the orientation of the apparatus is changed.
- The authors apparently think their laser came out of a textbook and provides a perfectly coherent monochromatic light source. Real lasers are not so perfect, and the difference is important. As they did not describe their laser other than saying it is He-Ne, I'll use generic values for a typical classroom or laboratory He-Ne laser: such a laser has a line width of about 1.5 parts per million, including 3–5 longitudinal modes among which the power is shared, with the fraction in each mode varying widely during operation. Such a laser also has a beam divergence of about 2 milliradians, and a pointing accuracy of about 0.1 milliradians. These effects generate systematic uncertainties in the diffraction peak position comparable to their “signal”; they were neither controlled nor monitored by the experimenters.
- The authors provided no error analysis, despite the fact that averaging is central to their analysis. From their plot of raw data (their Fig. 3), it appears to me that the data are quantized quite coarsely, and are then over-averaged to obtain their “result”. The averaged data have unexplained variations (noise) unrelated to orientation, which are about half the size of the variations of their “signal”. One can obtain an estimate of the variance of their data from their Fig. 7, and their data plots have variations only about double that variance. So their “signal” is of marginal significance at best.
- Without a careful measurement of their obvious systematic errors, and a comprehensive error analysis, there is no reason to believe the variations they observe are significant.
Elementary Error Analysis of an Average
When multiple measurements of a single quantity are made, their mean provides the best estimate for the actual value of the quantity being measured. But this value is not perfect, and there is still uncertainty in the estimate. A histogram of the original measurements can provide an error estimate for the mean: the best estimate for the error bar on the mean comes from the r.m.s. spread of the histogram (i.e. its σ). If the original values are all statistically independent, and there are N of them, then the best estimate of the error bar on the mean is σ/√N (this comes from the central limit theorem of statistics). That is a lower bound for the error bar on the mean. But if the original measurements are not truly independent, such as when some systematic effect is present, then the error bar on the mean will be larger. For a purely systematic error, the error bar on the mean is σ (independent of N), because one does not know which of the original measurements is correct. This is not necessarily an upper bound on the error bar because additional errors could be present, such as calibration errors of the instrumentation.
It is a fact of arithmetic that when averaging data one will obtain an answer, but the above error analysis is required to know whether or not it is significant. As a rule of thumb, a signal that is 5σ (or more) from zero is difficult or impossible to ignore; a “signal” that is less than 3σ from zero is unconvincing at best. The challenge is usually in determining what the σ actually is; but for an average, σ/√N gives a lower bound that is indisputable.
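A minimal sketch of this elementary error analysis, assuming statistically independent measurements (the function names here are mine, not from any particular paper):

```python
import math
import random

def mean_with_error(values):
    """Mean of a sample and the lower-bound error bar on that mean.

    sigma is the r.m.s. spread of the individual values; sigma/sqrt(N)
    is the error bar on the mean, valid only if the values are
    statistically independent.  For a purely systematic error the error
    bar would remain sigma, independent of N.
    """
    n = len(values)
    m = sum(values) / n
    sigma = math.sqrt(sum((v - m) ** 2 for v in values) / n)
    return m, sigma, sigma / math.sqrt(n)

# 100 measurements of a quantity whose true value is zero: averaging
# always yields *some* number, but it lies within a few error bars of 0.
random.seed(1)
m, sigma, err = mean_with_error([random.gauss(0.0, 1.0) for _ in range(100)])
n_sigma = abs(m) / err  # below 3: unconvincing; 5 or more: hard to ignore
```

The point is that the average alone is meaningless; only the ratio of the average to its error bar says whether a “signal” is present.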
Usually the averaging of data is unwarranted, and in most cases one can apply an analysis to the entire data sequence—one should normally fit a parameterized theoretical expression to such a data sequence. So if an experiment measures a series of fringe positions as the apparatus is rotated, the theoretical fringe position should be parameterized as a function of orientation, and the parameters fit to the entire measurement sequence. A parameterization of backgrounds and/or systematic errors should be included. Such a fit inherently provides error bars on the resulting parameter values. This is vastly better than averaging the data taken at each orientation and looking for patterns in the averages, because such averaging introduces artifacts, because averaging cannot distinguish between orientation dependence and systematic drifts, and because the fit inherently accounts for correlations in the parameters that averaging ignores. See arXiv:physics/0608238 for examples of both the artifacts introduced by averaging (Section III), and an analysis performed without such averaging (Section IV).
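As a sketch of such a fit, under the simplifying assumption that the fringe readings are sampled uniformly over whole turns (so a ½-turn sinusoid plus a constant can be fit by simple projections; a real analysis would also parameterize drift and systematics, as in Section IV of arXiv:physics/0608238):

```python
import math
import random

def fit_half_turn_component(angles, readings):
    """Least-squares fit of  y = C + A*cos(2t) + B*sin(2t)  to a fringe
    sequence sampled uniformly over whole turns.

    With uniform sampling the basis functions are orthogonal, so the fit
    reduces to projections, and it supplies error bars on A and B from
    the residuals -- something averaging alone can never provide.
    """
    n = len(angles)
    c0 = sum(readings) / n
    a = 2.0 / n * sum(y * math.cos(2 * t) for t, y in zip(angles, readings))
    b = 2.0 / n * sum(y * math.sin(2 * t) for t, y in zip(angles, readings))
    resid = [y - (c0 + a * math.cos(2 * t) + b * math.sin(2 * t))
             for t, y in zip(angles, readings)]
    sigma = math.sqrt(sum(r * r for r in resid) / n)
    return a, b, sigma * math.sqrt(2.0 / n)  # error bar on A and on B

# Pure noise: the fit returns some amplitude, but it is comparable to
# its own error bar, i.e. not significant.
random.seed(2)
angles = [2 * math.pi * k / 360 for k in range(720)]  # two full turns
readings = [random.gauss(0.0, 0.05) for _ in range(720)]
a, b, err = fit_half_turn_component(angles, readings)
amplitude = math.hypot(a, b)
```

Because the fit is applied to the whole sequence, systematic drift can be added as an extra fitted term instead of being silently folded into the averages.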
Error bars have become such an important part of modern experimental physics that it is not uncommon to make multiple measurements of a quantity, or to split one sequence of measurements into multiple smaller sequences, specifically so the error bar on the result can be estimated.
Note that the word “error” here is standard terminology, and is used in the sense of “uncertainty” rather than “mistake”. For well-designed experiments, care is taken to minimize the backgrounds and systematic errors, and major systematic errors are measured; then a comprehensive error analysis is performed and used to quantify the resolutions and significance of the results. For most experiments in this section the authors simply did not do this. Prior to 1950 or so that was common and accepted practice; today it is not acceptable at all.
Experimenter's bias is a phenomenon caused by the inability of human participants in an experiment to remain completely objective, in which the human experimenter directly influences the experiment's outcome based upon his or her personal desires or expectations. It is most commonly a concern in medical and sociological experiments, in which “single-blind” and “double-blind” protocols are usually required. But some physical experiments in which a human observer is required to round-off measurements can also be subject to it. In the experiments here the conditions for this are basically the combination of a signal smaller than the actual measurement resolution and an over-averaging of the data used to extract the “signal” from the measurements.
In principle, if a measurement has a resolution of R, then if the experimenter averages N independent measurements the average will have a resolution of R/√N (this is just an application of the error analysis above). This is an important experimental technique used to reduce the impact of randomness on an experiment's outcome. But note that this requires that the measurements be statistically independent, and there are several reasons why that may not be true—if so then the average may not actually be a better measurement but may merely reflect the correlations among the individual measurements and their non-independent nature.
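A quick simulation of both cases, with assumed unit-noise readings (illustrative only):

```python
import random
import statistics

def spread_of_averages(n, trials, systematic=0.0, seed=5):
    """R.m.s. error of the average of n readings, estimated over many
    repeated trials.

    Each reading carries unit random noise; `systematic` sets the size
    of a common offset shared by every reading within a trial (re-drawn
    each trial), i.e. a fully correlated error.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        offset = rng.gauss(0.0, systematic) if systematic else 0.0
        readings = [rng.gauss(0.0, 1.0) + offset for _ in range(n)]
        errors.append(sum(readings) / n)
    return statistics.pstdev(errors)

# Independent noise: averaging 100 readings shrinks the spread ~10x.
indep = spread_of_averages(100, 2000)            # about 0.1
# Fully correlated unit systematic: averaging does not help at all.
corr = spread_of_averages(100, 2000, systematic=1.0)  # about 1
```

This is exactly why a systematic drift cannot be averaged away, no matter how many readings are taken.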
The most common cause of non-independence is systematic errors (errors affecting all measurements equally, causing the different measurements to be highly correlated, so the average is no better than any single measurement). But another cause can be due to the inability of a human observer to round off measurements in a truly random manner. If an experiment is searching for a sidereal variation of some measurement, and if the measurement is rounded off by a human who knows the sidereal time of the measurement, and if hundreds of measurements are averaged to extract a “signal” that is smaller than the apparatus's actual resolution, then it should be clear that this “signal” can come from the non-random round-off, and not from the apparatus itself. In such cases a single-blind experimental protocol is required; if the human observer does not know the sidereal time of the measurements, then even though the round-off is non-random it cannot introduce a spurious sidereal variation.
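This mechanism is easy to simulate (a sketch with assumed numbers: unit read-out resolution, a round-off bias of 0.2, and thousands of null measurements):

```python
import math
import random

def biased_round(value, sidereal_phase):
    """An un-blinded observer who knows the sidereal time and expects a
    signal: the round-off drifts by +/-0.2, far below the unit
    resolution of any single reading."""
    return round(value + 0.2 * math.cos(sidereal_phase))

def blind_round(value, sidereal_phase):
    """A blinded observer: the round-off cannot depend on sidereal time
    (the phase argument is deliberately unused), so it cannot fake a
    sidereal variation."""
    return round(value)

def sidereal_signal(rounder, n=4000, seed=6):
    """Difference of the averages of n null measurements (true value 0,
    unit noise, unit read-out resolution) at two opposite sidereal
    phases.  Any nonzero result is manufactured by the round-off."""
    rng = random.Random(seed)
    def avg(phase):
        return sum(rounder(rng.gauss(0.0, 1.0), phase)
                   for _ in range(n)) / n
    return avg(0.0) - avg(math.pi)

# sidereal_signal(biased_round) gives a spurious "signal" of roughly
# 0.4, well below the instrument's resolution of 1, extracted purely by
# averaging; sidereal_signal(blind_round) is consistent with zero.
```

Note that blinding works even though the blinded observer's round-off is still imperfect; what matters is that it cannot correlate with sidereal time.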
Note that modern electronic and/or computerized data acquisition techniques have greatly reduced the likelihood of such bias, but it can still be introduced by a poorly designed analysis technique. Experimenter's bias was not well recognized until the 1950s and 1960s, and then it was primarily in medical experiments and studies. Its effects on experiments in the physical sciences have not always been fully recognized. Several experiments referenced above were clearly affected by it.
There are two very different aspects of publication bias:
- Unpopular or unexpected results may not be published because either the original experimenters or some journal referees have misgivings or reservations about the results, based on the results themselves and not any independent evaluation of experimental procedures or technique.
- Expected experimental results may not be published because either the original experimenters or some journal referees do not consider them interesting enough to merit publication.
In both cases the experimental record in the literature does not fully and accurately reflect the actual experiments that have been performed. Both of these effects clearly affect the literature on experimental tests of SR. This second aspect is one reason why this list of experiments is incomplete; there have probably been many hundreds of unpublished experiments that agree with SR.
Note that this does not include papers that are rejected for other reasons, such as: inappropriate subject or style, major internal inconsistencies, or downright incompetence on the part of authors or experimenters. Such rejections are not bias, they are the proper functioning of a peer-reviewed journal.